Traveling Salesman Problem
The Traveling Salesman Problem (TSP) asks the following question: given a list of cities and the distances between each pair, what is the shortest possible route that visits every city exactly once and returns to the starting city?
This is one of the most studied problems in combinatorial optimization. It is NP-hard, meaning that no efficient algorithm is known for solving it exactly in all cases. In practice, heuristics are used to find good (though not necessarily optimal) solutions.
This page is based on work I did in 2018, which you can find in this repository.
Construction heuristics
Construction heuristics build a tour from scratch by adding one city at a time. They are fast and produce reasonable solutions, which can then be improved with optimization techniques. The four algorithms below differ in how they choose which city to add next, and where to insert it in the partial tour.
Nearest Neighbor
The simplest construction heuristic. It is greedy: at each step, it travels to the closest unvisited city.
- Start from a random city.
- Travel to the nearest unvisited city.
- Repeat until every city has been visited.
This algorithm is fast ($$O(n^2)$$) but tends to produce tours with crossing edges, especially toward the end when few cities remain and the nearest one may be far away.
Nearest Insertion
Instead of building a path one edge at a time, insertion heuristics maintain a partial tour (a closed loop) and repeatedly insert a new city into it. Nearest insertion selects the unvisited city that is closest to any city already in the tour.
- Start with a tour consisting of a random city and its nearest neighbor.
- Selection: find the unvisited city k that minimizes $$d(k, j)$$ for some city j already in the tour.
- Insertion: find the edge $$(i, j)$$ in the tour such that inserting k between i and j causes the smallest increase in tour length, i.e. minimizes $$d(i, k) + d(k, j) - d(i, j)$$.
- Repeat until every city has been visited.
Cheapest Insertion
Cheapest insertion considers all unvisited cities and all possible positions at once. It picks the (city, position) pair that results in the smallest increase in tour length.
- Start with a tour consisting of a random city and its nearest neighbor.
- Among all unvisited cities k and all edges $$(i, j)$$ in the tour, find the combination that minimizes $$d(i, k) + d(k, j) - d(i, j)$$.
- Insert k between i and j.
- Repeat until every city has been visited.
This is slower than nearest insertion ($$O(n^3)$$ vs $$O(n^2)$$), but generally produces shorter tours because it makes globally cheaper choices at each step.
Farthest Insertion
Farthest insertion works like nearest insertion, but with the opposite selection strategy: it picks the city that is farthest from any city in the tour. The intuition is that by incorporating distant outliers first, the tour quickly takes on a shape resembling the convex hull of the point set, and later insertions only fill in the interior.
- Start with a tour consisting of a random city and its nearest neighbor.
- Selection: find the unvisited city k that maximizes $$\min_{j \in \text{tour}} d(k, j)$$.
- Insertion: insert k at the position in the tour that causes the smallest increase in length.
- Repeat until every city has been visited.
Despite its counterintuitive selection rule, farthest insertion is one of the best construction heuristics. It tends to avoid the crossing edges that plague nearest neighbor and often outperforms the other insertion variants.
Optimization heuristics
Construction heuristics produce a complete tour, but it is usually far from optimal. Optimization heuristics start from an existing tour and iteratively improve it by rearranging the order of cities. They repeat until no further improvement can be found (a local optimum).
Node Insertion
Node insertion tries to improve a tour by removing a single city from its current position and reinserting it at the best possible position elsewhere in the tour.
- Start with a random tour.
- For each city k in the tour, remove it and try reinserting it at every other position. If any reinsertion reduces the tour length, apply it.
- Repeat until a full pass over all cities produces no improvement.
Each improvement removes crossing edges or tightens detours. The complexity per pass is $$O(n^2)$$ and the algorithm typically converges in a small number of passes. Because it only moves one city at a time, it can get stuck in local optima that require moving two or more cities simultaneously to escape.
Python implementation
The original implementation of these algorithms was written in Python. The code below shows the core logic for each heuristic.
Helper functions
def closest_neighbor(distances, tour, node, in_tour=False, farthest=False):
"""Find the closest (or farthest) neighbor to a node,
among cities in the tour or not yet visited.
"""
neighbors = distances[node]
candidates = [
(city, dist)
for city, dist in neighbors.items()
if (city in tour if in_tour else city not in tour)
]
return sorted(candidates, key=lambda x: x[1])[-1 if farthest else 0]
def insertion_cost(distances, i, j, k):
"""Cost of inserting node k between nodes i and j."""
return distances[i][k] + distances[k][j] - distances[i][j]
Nearest Neighbor
def nearest_neighbor(cities, distances):
city = randrange(1, len(cities))
current, tour, tour_length = city, [city], 0
while len(tour) != len(cities):
nearest, edge_length = closest_neighbor(distances, tour, current)
tour_length += edge_length
tour.append(nearest)
current = nearest
# close the tour by adding the last edge
tour_length += distances[current][city]
return tour, tour_length
Nearest Insertion
def nearest_insertion(cities, distances, farthest=False):
city = randrange(1, len(cities))
tour = [city]
select = max if farthest else min
# min_dist[c] = distance from c to its nearest tour city
min_dist = {node: distances[node][city] for node in cities if node != city}
while min_dist:
# selection: pick closest (or farthest) city to the tour
best = select(min_dist, key=min_dist.get)
# insertion: find the position that minimizes tour length increase
best_position, best_cost = None, float('inf')
for i in range(len(tour)):
j = (i + 1) % len(tour)
cost = insertion_cost(distances, tour[i], tour[j], best)
if cost < best_cost:
best_position, best_cost = i, cost
tour.insert(best_position + 1, best)
del min_dist[best]
# update distances with the newly added city
for node in min_dist:
min_dist[node] = min(min_dist[node], distances[node][best])
return tour
Cheapest Insertion
def cheapest_insertion(cities, distances):
city = randrange(1, len(cities))
tour = [city]
# start with the nearest neighbor
neighbor, length = closest_neighbor(distances, tour, city)
tour.append(neighbor)
while len(tour) != len(cities):
# find the (city, position) pair whose insertion
# causes the smallest increase in tour length
best_dist, new_tour = float('inf'), None
for candidate in cities:
if candidate in tour:
continue
for i in range(len(tour)):
j = (i + 1) % len(tour)
dist = insertion_cost(distances, tour[i], tour[j], candidate)
if dist < best_dist:
best_dist = dist
new_tour = tour[:i + 1] + [candidate] + tour[i + 1:]
tour = new_tour
return tour
Farthest Insertion
def farthest_insertion(cities, distances):
# same as nearest insertion, but selecting the farthest city
return nearest_insertion(cities, distances, farthest=True)
Node and Edge Insertion
def substring_insertion(cities, distances, k):
solution = generate_initial_tour(cities)
stable, best = False, compute_length(solution, distances)
while not stable:
stable = True
for i in range(len(solution) - k):
for j in range(len(solution)):
substring = solution[i:(i + k)]
candidate = solution[:i] + solution[(i + k):]
candidate = candidate[:j] + substring + candidate[j:]
tour_length = compute_length(candidate, distances)
if best > tour_length:
stable, solution, best = False, candidate, tour_length
return solution
# node insertion: move one city at a time
node_insertion = partial(substring_insertion, k=1)
# edge insertion: move two consecutive cities at a time
edge_insertion = partial(substring_insertion, k=2)