Monday, January 2, 2012

The Knights Tour Problem - Multi-Processor Algorithmic Parallelism with Python

Resources
Wikipedia Entry.
Stefan Behnel's page [What the Chess Knight is Doing] features some handy info on the solution space.

A Brief Introduction to the Knight's Tour Problem

A Knight's Tour is a traversal of an 8x8 chess-board by the Knight / Horse piece, where the Knight touches each and every square once and only once during the tour, and makes only its characteristic L-shaped legal moves.

Open Knights Tour on 5x5 Board
[Graphic Linked from Wikipedia Entry]

The problem can be further generalised by varying board parameters - to produce arbitrarily sized square NxN grids, and also arbitrarily-sized rectangular NxM grids.

There is also the closely-related problem of finding closed tours - where the Knight must be in a position on the final square of its open tour to theoretically return again to the first square of that tour by a legal move [closing the tour].

Closed Knights Tour on 8x8 Board
[Graphic Linked from Wikipedia Entry]

The Knights Tour Problem is a specific instance of the more general problem of finding a Hamiltonian Path (i.e. a connected path that touches each vertex once and only once) [open tour problem], and a finding a Hamiltonian cycle [closed tour problem].

The Solution Space


NxN Square Board - Count of Open Undirected Tours

4x4 => 0
5x5 => 1728
6x6 => 6,637,920
7x7 => 165,575,218,320
8x8 => 19,591,828,170,979,904


Brute Force Solution Approach

A brute force approach is typically only practical in the case of the 5x5 board, and is not generally useful for the higher dimensional problems, due to the fact the the solution space explodes with increasing problem dimensions.  From the Wikipedia entry:  "For a regular 8x8 chess board, there are approximately 4×10^51 possible move sequences, and it would take an unfathomable amount of time to iterate through such a large number of moves."

The ration of solutions to possible move sequences is small.

A Useful Heuristic - Warnsdorff's Rule [Wikipedia]

Warnsdorff's rule is a heuristic method for finding a knight's tour. Briefly, one moves the knight according to the following rule: always proceed to the square from which the knight will have the fewest onward moves.  It is, of course, possible to have two or more choices for which the number of onward moves is equal; there are various methods for breaking such ties, including one devised by Pohl and another by Squirrel and Cull.  This rule may also more generally be applied to any graph. In graph-theoretic terms, each move is made to the adjacent vertex with the least degree. Although the Hamiltonian path problem is NP-hard in general, on many graphs that occur in practice this heuristic is able to successfully locate a solution in linear time.  The knight's tour is a special case.  The heuristic was first described in "Des Rösselsprungs einfachste und allgemeinste Lösung" by H. C. von Warnsdorff in 1823.


Brute-Force / Depth First Search
Finding of All Open Tours
in 5 x 5 Case

Solution Description

A back-tracker / depth-first searcher was developed in Python 2.7.

A Definition of Back-Tracking from the Wikipedia:

Backtracking is a general algorithm for finding all (or some) solutions to some computational problem, that incrementally builds candidates to the solutions, and abandons each partial candidate c ("backtracks") as soon as it determines that c cannot possibly be completed to a valid solution.

A Definition of State Space Search from the Wikipedia:


State space search is a process used in the field of computer science, including artificial intelligence (AI), in which successive configurations or states of an instance are considered, with the goal of finding a goal state with a desired property.

Problems are often modelled as a state space, a set of states that a problem can be in. The set of states forms a graph where two states are connected if there is an operation that can be performed to transform the first state into the second.

State space search often differs from traditional computer science search methods because the state space is implicit: the typical state space graph is much too large to generate and store in memory. Instead, nodes are generated as they are explored, and typically discarded thereafter. A solution to a combinatorial search instance may consist of the goal state itself, or of a path from some initial state to the goal state.


A General Description of the Specific Back-Tracker Implementation

Essentially the tracker successively explores the state space, collecting solutions as it goes.

Considering the 5x5 grid case, and starting from a blank board:

It is first noted that there are 25 possible first moves than can be made, since the knight can be placed on any of the 5 x 5 = 25 squares.

From each of these 25 possible first moves, uniquely different (although sometimes symmetrical) second moves can be made.  Now, the number of possible second moves is not the same for every possible first move, for example first moves to edge squares will have less possible second moves than first moves to the centre square (or centre squares, in the case of even-dimensioned square boards).

Then for each of the possible second moves, there are a number of possible third moves, and so on, until either the knight gets stuck or successfully untangles a Hamiltonian path.

Thus the solution space can be thought of as a tree structure, with the blank board as the root node, and each of the 25 possible first choices as nodes adjacent to the root node.  Each of the possible second choices are then nodes adjacent to their respective first-choice nodes, and so on.  Knight's tours will then be Hamiltonian paths of length 25 [= N^2 = NxN], and aborted attempts will correspond to shorter paths.

The back-tracker dynamically and successively generates this tree structure - exploring a particular branch, until a dead-end is reached, then back-tracking until the first unexplored branch is encountered, pruning the fully examined branch off the tree and discarding it to minimize persistence (memory/disc) utilisation, before continuing forward again to explore another remaining unexamined branch.

Parallelisation with Python's Multiprocessing Module

The Knights tour problem just screams parallelisation, but how to do this given the restrictions imposed by Python's crummy Global Interpreter Lock (GIL) on the language's support for true multi-threading ?  With the standard library's multiprocessing module:

multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.

So, what multiprocessing actually gives you is a set of independent operating-system level processes, each with it's own Python VM, and some infrastructure for both passing objects and (save us) directly sharing memory between these distinct processes.

Design Pattern

The design I've chosen utilises a single co-ordinating process, which spawns agent processes that are responsible for the execution of the actual business of exploring the solution space.

In the simplest case each agent process is assigned a different first choice, and it generates, examines and prunes the branch associated with that specific first choice, communicating solutions found back to the co-ordinating process, and finally notifying it of completion (total exploration of assigned solution space).

When the co-ordinating process is informed of successfully termination of an agent process, it will then spawn a new agent and assign them to examine a different branch of the tree.

When all branches have been examined, the solution space can be considered exhausted and all solutions found, and the coordinating process will report to file, and terminate.

Communication between the coordinating processes and its agents is achieved via use of the duplex pipe object available in the multiprocessing module.

Test

The Test
Exhaustively searching for and identifying all 1728 Knight's Tours on the 5x5 board.

Test Machine
I5-4x2.3GHz-8MB = Intel i5-2410M 2.3GHz with 8 GB
Test Machine Syn features an Intel i5 array of 4 cores at 2.3 GHz, with 8 GB memory available, running 64bit Windows 7, and 32 bit Python 2.7.

The Results

Single Process, No Threading - Effectively 1 Core

First off, the performance of the basic unthreaded, singleton-process back-tracker.

Total Time = 1hr46mins = 106 mins = 6360 seconds
Total Solution Count = 1728
Average Rate of Solution Finding = 0.27 solutions per second
Average Time to Find Next Solution = 3.68 seconds per solution

1 Controller Core + 3 x Agent Cores


Total Time = 41 mins = 2460 seconds
Total Solution Count = 1728
Average Rate of Solution Finding = 0.70 solutions per second
Average Time to Find Next Solution = 1.42 seconds per solution


Comparison - Singleton Process versus Multi-Process [1 Controller, 3 Agents]



So, the question is, from a quantitative perspective, what gains have been achieved by parallelising the algorithm ?

Since the test machine has 4 cores available, and the algorithm uses one of them for co-ordination, and farms out the work to 3 agent processes, we expect to see about a 3-fold gain in speed / decrease in total time.

Actually Observed Performance Ratio = 6360 / 2460 ~  2.59
Theoretical Maximum Performance Ratio = 3 [going from 1 to 3 worker cores]

So, overall our increase in performance is not too bad, we didn't quite achieve the maximum ratio of 3, but we're not too far off at 2.59.  Quantitatively we have achieved ~ 86% of the maximum theoretically possible increase in performance.

POST SCRIPT

"So, overall our increase in performance is not too bad" - Unfortunately, this is a blatant lie, see the next post on this topic for damning details.



2 comments:

  1. Hi David! I think there is an error. 6,637,920 is possible move sequences count on 6x6 board. Not on 7x7 board.

    ReplyDelete

comment: