Recently I bumped into the knapsack problem and the more I got into it, the more I started appreciating it. It is a well-known NP Hard problem existing in the Optimization world. Problem is framed as the following:

Given the choices of items with the weight and value of each, the idea is to pick and fill the knapsack with the best valuable item combination. The only challenge is to make sure the system doesn’t cross the constraint of weight for the knapsack. Problem can be described mathematically as:

The problem description is simple enough as described above. The objective is to fill the bag or knapsack with maximum possible valuables combined. The main challenge lies in the combinations of the solutions in front of us. Consider an example of three items and if we derive the number of possible solutions, it would be 2^{3} = 8 possible solutions with (0,0,0) being one solution in which none of the item gets selected and (1,1,1) being another solution where all the items are selected. The algorithm will eliminate a few solutions which are violating the capacity constraints. For simple problem with a few items such as 10-30 items, it can work fine, since the possible solutions are in the range of thousands to a billion. Beyond 50 items it becomes challenging to even explore the scale of problem. This is where computer engineer starts appreciating the power of optimization approaches and derive intelligent ways to explore the solution space without having to scan through all possible solutions and still get optimized solutions. Think about writing a code to solve this problem with 10,000 items and getting the optimized solution for the same. Good news is that we have techniques such as branch and bound with which one can write the code and a modest machine can find the answer within a few minutes with this approach. Definitely one can always debate if you have the world’ s worst case data set in your hand, and in that case the above statement might not be true and one can end up waiting for hours (even one can exit the program in such situation within some maximum iterations).

Knapsack problem is one of the useful known problems and can be applied in various scenarios or functions. Let us look at the two examples:

Allocation of funds for projects:

Consider a company with intent to invest into the product strategy with the total available investment of $1 billion. Company gets many proposals which can be executed across industry. Each proposal has two important information, return on investment (market opportunity) and requires budget (funds required to execute the plan). Company cannot do partial allocation, so in this case either a proposal is selected or rejected. The quest is to find the overall investment which a company is going to make, in such a way that it maximizes the total return on investment. We have one to one mapping of weights from funds required for execution to the return on investment and Total investment budget to the capacity constraint of the knapsack. This problem is exact match to Knapsack problem described above and solution can be deduced with the approach of knapsack algorithm.

Selection of activities:

Consider a case where we have 20 activities in hand which are all important and we have only a day (8 hours) to complete. Each activity has information like efforts (in minutes) and priority / importance. In this example the problem is to find out the tasks which an individual should select and complete in the given 8 hours of his working. Once again this can be mapped to knapsack problem and system can find the solution. Number of activities is mapped to the number of items, weights are mapped to efforts (in minutes), priority or importance is mapped to value and total capacity is mapped to a day (8 hours).

Dealing with large scale selection problem:

The brute force way to enumerate solution is easy but in actual sense companies have huge number of proposals (numbers in 100’s), in such a case we are looking into selecting the best possible returns solution from 2^100 = 1.26765E+30, which might take us hundreds or thousands of centuries to iterate over and find the best answer.

The problem really becomes complex and one cannot rely on brute force approaches of listing all the possible solutions. The smarter techniques of reducing the number of solutions to check are handy.

This is where various optimization techniques play a big role, such as branch and bound techniques as they help in finding the near to optimal answer in reasonable time.

- Sort items with its density (value/weight), with high value per unit weight being at top.
- From the sorted item list start selecting the item to fill the bag.
- After adding each item to basket make sure it does not exceed the capacity of bag or knapsack.
- If it exactly fills the capacity, system is lucky and it has found the best solution in the first iteration.
- In case little capacity is remaining and system cannot fit the next item, multiple iterations are run by removing the last item and trying out other items. Primarily exploring more feasible solutions.
- In an iterative way, utilizing the known best solution, system can quickly find out the best combination.
- In each step of item selection, system can find out by checking the total value of solution with the known best solution up to that iteration if it makes sense to try further combination with the existing selected items, as it is known that going down further in the sorted list of items would decrease the total value of combined items.

Branch and bound can be implemented in Java or any other language and one can solve the knapsack problem of scale or at least have near to good solution with the existing the algorithm within limited iteration. Another interesting measure one can find quickly is the upper bound or best solution using the linear relaxation technique. Steps to find out the best or upper bound of the problem:

- Sort items with its density (value/weight), with higher value per unit weight at top.
- Keep selecting an item until no more item can be added as the remaining capacity is not sufficient for the next item.
- Relax the selection variable to take the fractional value to fill the remaining capacity. Compute the fractional value of the last item and add it to the total value of this solution.

The above method provides the upper bound for a solution which can be used by branch and bound method for termination once the solution matches the linear relaxation solution or near to it with many iteration or after running for several minutes.

Optimization is an interesting subject and once you are into it, you start noticing lot of practical problems around you and start wondering, is it the best solution we have! Such as making investment decision or train or bus schedules, waiting in traffic light, location of hospital or police station, shortest path, etc .

Share your experiences and scenarios with such optimization problem and let us make the world run better with good optimized solutions of the problems in industries and around us.

Abinash MohapatraNice article. You talked abt either rejection/acceptance use cases which is otherwise called 0/1 knapsack (if allowed maximum once) and the other case is Unbounded knapsack. I also encountered a similar situation (not exactly knapsack) of dynamic programming during my first few days @SAP. The problem was, given an invoice with maximum value amount and a set of items with their amounts. Find the set of items which need to be put in an invoice so that the number of invoices generated is minimum.

Sudhir VermaPost authorThanks for sharing the minimum invoice problem. Yes in this blog I mainly covered 0/1 knapsack problem of large scale.

Girikanth AvadhanulaThe load building problem in Transportation Management is quite similar. A set of deliveries to be picked out of the considerations set subject to the maximum weight or volume of the transportation vehicle. The VSR optimizer reads the problem and solves it.

Sudhir VermaPost authorThanks for sharing another optimization problem from Transport Management.In VSR does it also optimizes the total distance (route). Generally in VRP (Vehicle Routing Problems) one need to do knapsack and tsp (travel sales person) problem together.