Image source: https://xkcd.com/1831/
When dealing with optimization problems, you might have heard phrases like “this is really hard to solve” or “this is an NP-hard problem.” But what is an NP-hard problem, why is it hard to solve, and what can we do about it if we need to find the solution to one of these problems? I’ll lay these questions to rest for you, share some insight into exact and heuristic algorithmic solutions to these problems, and tips on how to decide which approach is best for you.
Let’s start with a simple explanation of the difference between NP-hard and non-NP-hard problems:
NP-hard problems are fundamentally more difficult than non-NP-hard problems because the computation time required to solve them increases much faster as the size of the problem increases. As an example, you can look back on an earlier post where I demonstrated how quickly the number of required calculations for solving NP-hard problems grows. (Interested readers can also refer here to learn more about the complexity classes and the differences between them.) Some of the well-known NP-hard problems are the Knapsack Problem, Traveling Salesman Problem (TSP), and Facility Location Problem while the Shortest Path Problem, Minimum Spanning Tree (MST), and Assignment Problem are a few non-NP-hard problems.
But how do you solve them??
In these cases, you can use two specific types of algorithms to your advantage: exact and heuristic. For example, in TSP, a simple heuristic would be a nearest neighbor algorithm where one can start from a random city or node and sequentially visit the nearest city until all have been visited; exact algorithms, on the other hand, use sophisticated mathematical optimization methods such as branch-and-cut. Exact algorithms are guaranteed to find the optimal solution for the problem. But keep in mind, the larger the problem, the more complex the solution space. This, in turn, can make the exact algorithms become slower. Heuristics, on the other hand, don’t guarantee optimality, but that’s a price you might be willing to pay in order to obtain a fair solution in a shorter execution time than that of the exact algorithm.
To determine which algorithms best fit your needs, consider the checklist below:
- How large is your problem? As mentioned above, NP-hard problems are computationally expensive, but due to advancements in our computer powers and the algorithms handling these problems, we are able to solve larger problems, faster. So, what you consider large, may not be large after all. As an example, 10 years ago we could solve a TSP instance with 1000 nodes in about 7 minutes on a desktop (e.g. dsj1000 problem here). But today the Concorde TSP iPhone app can often solve instances with 1000+ nodes exactly, with all computations carried out locally on your iPhone/iPad.
- How quickly do you need the results? Surely, getting the results faster is better but what is the payoff? Do you really need to get the results in a matter of milliseconds and seconds or is one hour also an acceptable time? If you’ll only be solving this problem every few years, then how about you take a few days to solve the problem to optimality rather than solving it in a few minutes using a heuristic?
- How well do you know the structure of your problem? One algorithm applied on two different problems in the same family may result in completely different execution times, one in minutes and one in hours. Do you know which category your problem belongs to? On the same note, are you sure your current model and formulation is the best model to represent your problem? One problem might be formulated in two different ways, while one is hard to solve the other can obtain the solution faster. For example, consider how modeling a problem using Integer Programming (IP) versus Constraint Programming (CP) would affect the solutions execution time.
- How constrained is your problem? Try formulating your problem and solve a very small sample to get an understanding of its real complexities and execution time. Sometimes there are features and constraints that one may think are restricting. But they may turn out very helpful in finding the solution because they cut down the solution space, leaving fewer options to evaluate. A very simple illustrative example of this is when one decides on opening two new warehouses among 10 potential sites and realizes that one of the warehouses should be chosen from a group of 3 sites. This restriction will reduce the number of options to consider tremendously.
- How precise does the final solution need to be? Surely, the best possible solution is your goal, but how good is good enough? Is 1% worse than the best solution also acceptable? How about 5%? What about 10%? What is the range or the gap that you are comfortable accepting for a solution to be business-optimal? Are you comparing the solution from the algorithm with an existing system or solution to measure the quality? If so, maybe even a 20% gap is a lot better than what you already have and can do the job perfectly! In dealing with the NP-hard problems, the difference between execution times–and of course memory usage–for obtaining the optimal solution or the solution with a 5% gap might be very considerable (sometimes hours or days). When you are deciding on the optimal level of a chemical additive in a drug, there should probably be no room for error and the solution should be optimal. But if it’s about the cost of an investment, $5000 difference in half a million may not seem that big of a deal. If the latter is closer to your problem, then maybe you can sacrifice some quality for a better execution time; however, it goes without saying that you shouldn’t jump too quickly into using heuristics until you have made all of the proper considerations.
- How much time or resources can you invest on development? Someone with experience can build an optimization model in a few hours, whereas the same person might require days to design and develop a custom heuristic. On the flip side, there are off-the-shelf implementations of some heuristics that a non-technical person (someone with no background in operations research or alike) with coding skills might be able to manage easily. Although, knowing about the fine tuning of a heuristic’s parameters is still critical.
As a rule of thumb, consider exact algorithms first. Advancements in the computational power and algorithms that can solve the problems to optimality make this a worthy investment. Consider the above checklist. If, after answering all these questions, you realize that getting the acceptable balance between the time and quality of the solution is still something that cannot be achieved through the means of exact algorithms, then consider using heuristic approaches.
But remember, these are still very hard questions to answer unless you have the relevant background or you are experienced in working with optimization problems. For those who aren’t, our team at Opex Analytics can provide the expertise and resources to work through these questions and develop appropriate solutions for you.
Stay tuned: in the next blog we will go over a couple of examples and work through the above list in order to decide the best solution approach for each.