• Data Science
  • James Le
  • NOV 05, 2018

Greedy Algorithm and Dynamic Programming

Ready to learn Data Science? Browse Data Science Training and Certification courses developed by industry thought leaders and Experfy in Harvard Innovation Lab.

In an algorithm design there is no one 'silver bullet' that is a cure for all computation problems. Different problems require the use of different kinds of techniques. A good programmer uses all these techniques based on the type of problem. In this blog post, I am going to cover 2 fundamental algorithm design principles: greedy algorithms and dynamic programming.


A greedy algorithm, as the name suggests, always makes the choice that seems to be the best at that moment. This means that it makes a locally-optimal choice in the hope that this choice will lead to a globally-optimal solution.

How do you decide which choice is optimal?

Assume that you have an objective function that needs to be optimized (either maximized or minimized) at a given point. A Greedy algorithm makes greedy choices at each step to ensure that the objective function is optimized. The Greedy algorithm has only one shot to compute the optimal solution so that it never goes back and reverses the decision.

Greedy algorithms have some advantages and disadvantages:

  1. It is quite easy to come up with a greedy algorithm (or even multiple greedy algorithms) for a problem.

  2. Analyzing the run time for greedy algorithms will generally be much easier than for other techniques (like Divide and conquer). For the Divide and conquer technique, it is not clear whether the technique is fast or slow. This is because at each level of recursion the size of gets smaller and the number of sub-problems increases.

  3. The difficult part is that for greedy algorithms you have to work much harder to understand correctness issues. Even with the correct algorithm, it is hard to prove why it is correct. Proving that a greedy algorithm is correct is more of an art than a science. It involves a lot of creativity.



Let us say that we have a machine, and to determine its state at time t, we have certain quantities called state variables. There will be certain times when we have to make a decision which affects the state of the system, which may or may not be known to us in advance. These decisions or changes are equivalent to transformations of state variables. The results of the previous decisions help us in choosing the future ones.

What do we conclude from this? We need to break up a problem into a series of overlapping sub-problems, and build up solutions to larger and larger sub-problems. If you are given a problem, which can be broken down into smaller sub-problems, and these smaller sub-problems can still be broken into smaller ones - and if you manage to find out that there are some over-lappping sub-problems, then you’ve encountered a DP problem.

The core idea of Dynamic Programming is to avoid repeated work by remembering partial results and this concept finds it application in a lot of real life situations.

In programming, Dynamic Programming is a powerful technique that allows one to solve different types of problems in time O(n2) or O(n3) for which a naive approach would take exponential time.


Let’s go over a couple of well-known optimization problems that use either of these algorithmic design approaches:


In this problem, our input is a set of time-intervals and our output is a subset of non-overlapping intervals. Our objective is to maximize the number of selected intervals in a given timeframe. The greedy approach is to consider jobs in increasing order of finish time, and then teak each job provided it’s compatible with the ones already taken. Here’s the O(n^2) implementation pseudocode:

Interval-Scheduling( (s_{1}, f_{1}), …, (s_{n}, f_{n}) ):
1 - Remain = {1, …, n}
2 - Selected = {}
3 - While (|Remain| > 0) {
4 -     k in Remain is such that f_{k} = min of f_{i} (i in Remain)
5 -     Selected = Selected and {k}
6 -     Remain = Remain - {k}
7 -     for every i in Remain {
8 -         if (s_{i} < f_{k} then Remain = Remain - {i}
9 -     }
10 - }
11 - return Selected

This greedy algorithm is optimal, but we can also use dynamic programming to solve this problem. After sorting the interval by finishing time, we let S[k] = max(S[k - 1], 1 + S[j]):

  • Where k represents the intervals order by finish time.

  • Where j < k is the largest index such that the finish time of item j does not overlap the start time of item k.


In this problem, our input is a set of time-intervals and our output is a partition of the intervals, each part of the partition consists of non-overlapping intervals. Our objective is to minimize the number of parts in the partition. The greedy approach is to consider intervals in increasing order of start time and then assign them to any compatible part. Here’s the O(n logn) implementation pseudocode:

Schedule-All-Intervals((s_{0},f_{0}), …, (s_{n-1},f_{n-1})):
1 - Sort intervals by their starting time.
2 - For j = 0 to n - 1 do:
3 -     Consider  = {1, …, depth}
4 -     For every i < j that overlaps with j do:
5 -         Consider = Consider - { Label[i] }
6 -     If |Consider| > 0 then:
7 -         Label[j] = anything from Consider
8 -     Else:
9 -         Label[j] = nothing
10 - Return Label[]

The algorithm is optimal and it never schedules two incompatible intervals in the same parts.


In this problem, our input is a set of time-intervals, where each interval has a cost. Our output is a subset of non-overlapping intervals. Our objective is to maximize the sum of the costs in the subset. A simple brute force approach can be viewed below:

Weighted-Scheduling-Attempt ((s_{1}, f_{1}, c_{1}), …, (s_{n}, f_{n}, c_{n})):
1 - Sort intervals by their finish time.
2 - Return Weighted-Scheduling-Recursive (n)
Weighted-Scheduling-Recursive (j):
1 - If (j = 0) then Return 0
2 - k = j
3 - While (interval k and j overlap) do k—-
4 - Return max(c_{j} + Weighted-Scheduling-Recursive (k), Weighted-Scheduling-Recursive (j - 1))

Although this approach works, it fails spectacularly because of redundant sub-problems, which leads to exponential running time. To improve time complexity, we can try a top-down dynamic programming method known as memoization. Below you can see an O(n^2) pseudocode:

Weighted-Sched ((s_{1}, f_{1}, c_{1}), …, (s_{n}, f_{n}, c_{n})):
1 - Sort intervals by their finish time.
2 - Define S[0] = 0
3 - For j = 1 to n do:
4 -     k = j
5 -     While (intervals k and j overlap) do k—-
6 -     S[j] = max(S[j - 1], c_{j} + S[k])
7 - Return S[n]


For this problem given an input as a sequence of numbers, we want an output of an increasing subsequence. Our goal is to maximize the length of that subsequence. Using dynamic programming again, an O(n^2) algorithm follows:

Longest-Incr-Subseq(a_{1}, …, a_{n}):
1 - For j = 1 to n do:
2 -     S[j] = 1
3 -     For k = 1 to j - 1 do:
4 -         If a_{k} < a_{j} and S[j] < S[k] + 1 then:
5 -             S[j] = S[k] + 1
6 - Return max_{j} S[j]

S[j] is the maximum length of an increasing subsequence of the first j numbers ending with the j-th. Note that S[j] = 1 + maximum S[k] where k < j and a_{k} < a_{j}.


We can make whatever choice seems best at the moment and then solve the subproblems that arise later. The choice made by a greedy algorithm may depend on choices made so far but not on future choices or all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one. In other words, a greedy algorithm never reconsiders its choices.

This is the main difference from dynamic programming, which is exhaustive and is guaranteed to find the solution. After every stage, dynamic programming makes decisions based on all the decisions made in the previous stage, and may reconsider the previous stage's algorithmic path to solution.

For example, let's say that you have to get from point A to point B as fast as possible, in a given city, during rush hour. A dynamic programming algorithm will look into the entire traffic report, looking into all possible combinations of roads you might take, and will only then tell you which way is the fastest. Of course, you might have to wait for a while until the algorithm finishes, and only then can you start driving. The path you will take will be the fastest one (assuming that nothing changed in the external environment).

On the other hand, a greedy algorithm will start you driving immediately and will pick the road that looks the fastest at every intersection. As you can imagine, this strategy might not lead to the fastest arrival time, since you might take some "easy" streets and then find yourself hopelessly stuck in a traffic jam.

The Harvard Innovation Lab

Made in Boston @

The Harvard Innovation Lab


Matching Providers

Matching providers 2
comments powered by Disqus.