LoginCreate an Account

Documentation for Dynamic Progamming

We have a very detail documentation for Dynamic Progamming with 3 sections, 14 sub sections, 9 paragraphs and 4261 words

Introduction

What is Dynamic Programming?

If you have a complex problem but you can find a way to break it down to overlapping sub-problems which could be broken down into even smaller problems in the same way. That way is Dynamic Programming (DP) where you can find an optimal solution for the original problem from optimal solutions of smaller versions. An optimal solution is the solution that is calculated from all possible solutions.

Ex: to find the nth Fibonacci number: F(n), you need to know the F(n-1) and F(n-2) because F(n) = F(n-1) + F(n-2) so this problem can be broken down into smaller problems which can be continued breaking down into even smaller problems as the same way because F(n-1) = F(n-2) + F(n-3).

As you can see, both F(n) and F(n-1) need F(n-2) to solve the their problem so the sub-problems are overlapping when sub-problems have same sub-problems as their parent's. These features make this problem DP.

DP algorithms are different than divide and conquer algorithms where the sub-problems are not overlapping.

How to Implement DP?

There are two ways to implement a DP algorithm: Bottom-up (tabulation) and top-down (memoization)

Bottom-up (tabulation)

Using base cases and iteration to calculate bigger problems.

Ex: using base cases of Fibonacci sequence: F[0] = 0 and F[1] = 1, we can use iteration to calculate F[2] = F[1] + F[0], then F[3] = F[2]+F[1] ..

Top-down (memoization)

Using recursion (stops when reaching base cases) and memoizing results to improve efficiency.

Ex: in Fibonacci sequence, F(n) = F(n-1) + F(n-2). So using recursion, we have F(n-1) = (F(n-2) + F(n-3)) and F(n-2) = (F(n-3) + F(n-4)), then repeating until we reach F(0) or F(1).

The problem of this approach is that some problems will need to be solved multiple times. In our example, to find F(n), we need to execute F(n-3) two times, and when we break F(n-3) to smaller problems, the number of this repetition will be even larger and there will be a ton of unnecessary repeated computation.

To solve the problem, we use hash map to save results for repeating calculations so we don't have to do the calculation again but just returning the memoized result.

Here is the pseudocode to find F(n) using top-down method with memoization:

memo = hashmap
Function F(integer i):
    // reaching base cases
    if i is 0 or 1: 
        return i

    // memoizing
    if i doesn't exist in memo:
        memo[i] = F(i - 1) + F(i - 2)

    // always return memoized results to save time
    return memo[i]

Here is the comparison when finding F(5)

 

Top-down method may take more time than bottom-up method but it is easier to write.

When to Use DP?

The purpose of DP is to find the optimal solution of a problem so if you see a problem related to find best cases, ex: max/min/largest/smallest/longest/shortest ... You can consider using DP.

DP is also about considering to all possible solutions so if a problem wants you to find how many ways / possible outcomes for something so that problem may be a DP problem.

Furthermore, because a DP problem can be broken into overlapping sub-problems so if a problem has "future decisions" depend on "earlier decisions", it may be a DP problem.

Ex: determine the length of the longest integer subsequence from [1, 2, 6, 3, 5]. It's easy for us to take 1 and 2 at beginning but will we take 6 when we don't know about 3 and 5? The only way to solve that is to consider results of both taking and not taking 6 so we have to use DP.

Strategies

Framework

If a stair has N steps so how many distinct ways you can climb to the top of the stair if you can only take 1 or 2 steps at a time? This is Climbing Stairs problem.

To solve a DP problem like this, we need to combine 4 steps:

Step 1. Determine what is the state

State is a set of variables which describe the state that you are in. We call those variables are State Variables. In Climbing Stairs problem, the current step that you are on is a state variable because it defines where you are. Let's name it i.

We only choose the variables that are related to the outcome. In this case, to figure out how many ways to climb to top, the only variable that is matter is the current step that you are on.

Step 2. Design your function / loop

Generally, finding total ways to reach the top of a stair that has N steps is the same problem with finding total ways to climb to any certain given step in a stair. That means if you input i as the step you want to climb and your function/loop will return the total ways to climb to that step.

i in this case is the state variable which will be our function argument or the index of our result array in our loop.

Step 3. Find a recurrence relation between states

A recurrence relation is an equation that relates different states with each other. Ex: to climb to the 30th step, we arrived from either the 29th or the 28th step because you can take only 1 or 2 steps at a time. 

That means the total ways to reach the 30th step = the total ways to reach the 29th + the total ways to reach the 28th step. Using this logic, you can realize that dp(n) = dp(n-1) +dp(n-2) which is actually the Fibonacci sequence.

Always think about the edge cases or the ending / beginning cases can think about how to reduce that edge cases to smaller problems. Ex: if you have problem dp(i), just think how it can be related to dp(i-1) or dp(i+1), or if you have problem dp(i,j), you need to think about the relation to the dp(i-1,j), dp(i,j-1) or dp(i-1,j-1)

Step 4. Find base cases

Base cases are used to determine when we stop our recursion in top-down approach or be used as start points in bottom-up approach. In Climbing Stairs, there are only 2 ways to climb a 2 step stair, or 1 way to climb a 1 step stair so you can use dp(2) = 2 and dp(1) = 1 as base cases.

Put all things together, you will have a completed function / loop. Here is the pseudocode for the DP function of Climbing Stair problem

// using i, the state variable, as function's parameter
Function dp(interger i) 

    // stop function using base cases
    if i < 3 :
        return i // i = 2 => 2, i = 1 => 1

    // add the recurrence relation between states
    return dp(i - 1) + dp(i - 2);

In the real code, you will need to use memoisation as well. Check Climbing Stairs for the detail code of both approaches for DP algorithm.

Practice Example

We will use House Robber problem to practice DP strategy.

Step 1. Decide state variables

In House Robber problem, the variable that can describe your current state is the index of the house that you are currently at, let's name it i.

If the problem allows you to rob "up to k houses only", then would be another necessary state variable because with 3 robberies left is different than with 5 robberies left. However, in the problem actually asks to find "the maximum amount of money you can rob" so i is the only state variable.

Step 2. Design function / loop

The problem is asking for "the maximum amount of money you can rob". Therefore, our function our our array dp[] returns / represents the maximum amount of money we can rob at any given state i. That means if someone give me i, we will return the answer with our dp() or d[] so that's why i will be our function argument or array index: dp(i) or dp[i].

To find the answer for the nth house, just return dp(n-1) or dp[n-1].

Step 3. Find recurrence relation

It is a good place to think about a general state. Ex: in House Robber problem, let's say you are at the ith house then you have two options:

  1. Not rob this house: then we don't gain money so the money we will have after skipping this house is the total money we robbed till the (i-1)th house: dp(i-1)
  2. Rob this house: we gain nums[i] money but this means we did not rob the previous house due to the security system or the current total money we have before robbing this house is the total money we robbed till the (i-2)th house (two houses ago). So after robbing this house, the total amount of money we will have is nums[i]+dp(i-2).

Because the problem requires to find "the maximum amount of money you can rob" so we have to use this to make our decision. In other words, we should choose the options with amount of money after making the our decision. Put all things together, we can have our recurrence relation: dp(i) = max(dp(i-1), nums[i] + dp(i-2). 

Step 4. Find base cases

It is clear that if we have only one house, then the most money we can make is by robbing the house. This means dp(0) = nums[0]

If we have two houses, the most money we can make is by robbing the richer house: dp(1) = max(nums[0], nums[1])

Combine all things together, we can have a pseudocode for the top-down approach like below (including memoisation):

memo = hashmap
// our dp function
Function dp(i) 
   // Base cases
   if i == 0: return nums[0] 
   if i == 1: return max(nums[0], nums[1]) 

   // recurrence relation
   if i not in memo:
      memo[i] = max(dp(i - 1), dp(i - 2) + nums[i])

   return memo[i]

// our main function
Function rob(nums)
    return dp(len(nums) - 1)

For real detail code using two approaches, please check: House Robber.

You can also practice DP with the following problems:

Multidimensional DP

The number of state variables of a DP problem = its DP dimension. Ex: a DP problem has 3 state variables => it is a 3D DP problem.

The following are common things that are usually related to a state variable:

  • An index along some input. Ex: nums = [0, 1, 2, 3, 4, 5, 6] so dp(4) represents the answer of nums = [0, 1, 2, 3]
  • A second index along some input. Ex: nums = [0, 1, 2, 3, 4, 5, 6] so dp(1, 3) represents the answer of nums = [1, 2, 3]
  • Explicit number constraints given in the problem. Ex: "You are allowed to pick only k elements"
  • Variables / data structure (a tuple or bitmask) that describe status in a given state. Ex: "true if currently holding a key, false if not" or bitmask is a mask where ith bit indicates if the ith city has been visited.

    Time and Space Complexity

    The complexity of a DP algorithm is equal to the total possible states. If you have 3 state variables, just find total possible cases for each and multiply them all to get the complexity of your algorithm.

    For example, if you have i, k and holding as state variables of your problem:

    • With i can run from 0, to nums.length - 1.
    • k is can run from 1 to M
    • and holding is a Boolean variable which has 2 states.

    If each state takes O(1) to compute so the complexity in this case is O(nums.length * M * 2)

    Convert Top-Down to Bottom-Up

    Because Bottom-Up approach does not have to deal with memoization so the memory and the complexity is much smaller so you may want to convert all your DP to bottom-up approach.

    1. Replace recursive function name with array name

    For example, in MaxScore from Multiplication Operations problem, the recurrence relation in the recursive function is:

    memo[i][left] = Math.max(
        multipliers[i] * nums[left] + multiply(left+1, i+1), 
        multipliers[i] * nums[right] + multiply(left, i+1)
    );

    Then the recurrence relation in our loop is

    memo[i][left] = Math.max(
        multipliers[i] * nums[left] + memo[i+1][left+1],
        multipliers[i] * nums[right] + memo[i+1][left]
    );

    We just need to replace the recursive function with name of the array that we use to save the values for states.

    2. Find the start and end points for our state variables.

    Start points should start from some where around the base cases and covered by the base cases and your loop also needs to cover all cases for each state. You can check the MaxScore from Multiplication Operations Explanation, base case is i == m so we can start from m - 1 and run to 0. Then left can run from i to 0 because you can take 0 to i left elements to cover all cases for this state.

    Furthermore, memo[i][left] needs value from mem[i+1][left+1] whereas i and left can be m - 1, so we also need to assign memo[m][m] as 0 as well.

    You can also practice converting between the approaches with the following 2D DP problems:

    Common Patterns in DP

    Iteration in the Recurrence Relation

    In the Min Cost Climbing Stairs, the recurrence relation is:

    dp(i)=min(dp(i - 1) + cost[i - 1], dp(i - 2) + cost[i - 2])

    This is a static recurrence relation because we are allowed to climb on 1 or 2 steps at a time. But if we are allowed to climb up to k step at a time, so our recurrence relation should be

    min(dp(i - 1) + cost[i - 1], dp(i - 2) + cost[i - 2], ... , dp(i - k) + cost[i - k])
    OR
    dp(i)=min(dp(j) + cost[j]) with (i-k) <= j < i

    This is the case when we would need iteration in our recurrence relation.

    Here are some counting DP problems that you can practice:

    State Transition by Inaction

    In some DP problems, sometimes, we may see the same value for two different states. For example, in House Robber problem, sometimes we don't rob two adjacent houses in order to rob the maximum amount of money so that two states have same value.

    This is when we "do nothing" or use Inaction to gain benefits. Think about this would be useful because it will be a part of a decision making due to some restriction from the problem.

    Check the following problems:

    State Reduction

    In DP, more state variables means more time and space complexity so we should consider to reduce the number of state variable by finding ways to generate some variables from others.

    For example, in MaxScore from Multiplication Operations, we need 3 state variables: left, right, and index, however, we can generate right from left and index.

    Counting DP

    Some problems will not only ask to find the max, min but total distinct ways to do something. This is when we need to use Counting DP which has recurrence relation related to sum rather than min / max. Ex: dp[i][j] = dp[i][j-1] + dp[i-1][j]

    Counting DP also has different base cases which are never been equal to zero (most of cases are equal to 1) or we will always add 0 to DP function itself.

    Sometimes, Counting DP can be combined with other patterns of DP. Here are some counting DP problems that you can practice:

    Kadane's Algorithm

    If you have to find the contiguous of a given array (nums) that has the maximum sum in O(n) time and O(1) space complexity so that's a case for using Kadane's Algorithm.

    In this algorithm, we use current variable to save the current sum of the current contiguous part. And in the ith state, we can pick nums[i] for our contiguous part or not.

    • If we pick nums[i] => current = current + nums[i]
    • If we don't pick, means we have to start a new contiguous part with nums[i] => current = nums[i]

    Because we need to find the maximum sum, so our recurrence relation should be

    current = max(current + nums[i], nums[i])

    And we need the best variable to save the best sum we had in the past because the current is only reflecting the max sum between the current contiguous part with the new contiguous part.

    Here is the pseudocode:

    // Given an input array of numbers "nums",
    best = negative infinity
    current = 0
    for num in nums:
        current = Max(current + num, num)
        best = Max(best, current)
    
    return best

    Maximum Sub Array + EXPLANATION is a good example for this algorithm

    DP for Path in Matrix

    If you have to move from A to B in a 2D matrix with strict rules (ex: move down and right only at a time or has obstacles) so you can use Path in Matrix pattern with dp[i][j] = dps from previous allowed moves.

    If there are no moving rules, so that problem is a graph or BFS, not DP.

    Here are some DP problems for this pattern:

    More Practice Problems