Solving the Knight's Tour Problem with HANA Graph

former_member191806 · ‎10-29-2018

Last week during a Code Jam at TechEd Barcelona, vitaliy.rudnytskiy threw the glove at me by starting a challenge to solve the "Knight's Tour" problem using the Graph capabilities of HANA.

The problem is best summarised as the following: "find a succession of moves of a knight piece on a chess board such that the knight visits each square exactly once".

Graph Theory

We can model the chess board like a graph with the squares as the vertices and all the possible moves as the edges.

By using this abstraction, the problem actually boils down to a special case of the more general Hamiltonian Path problem, which is basically the search of a path traversing all vertices of a graph exactly once.

Graph Engine

HANA has built-in support for processing graphs. As of now, it has the following main features:

Storing and representing graphs (basically you need a vertex and an edge table and then you can create a so-called "workspace" out of them).

Running some simple algorithms on graphs (neighbourhood, shortest path, connected components, etc).

Performing queries based on the openCypher query language.

Writing stored procedures using a dedicated GraphScript language.

To solve this problem, I used the stored procedures together with the graph representation.

Chess Board

The first step would be to generate the chess board as a graph. We will label the rows from 1 to 8 and the columns from A to H:

To store the graph, I defined the data model as follows:

CREATE TABLE nodes (

	label VARCHAR(2),

	row_number INT NOT NULL,

	col_number INT NOT NULL,

	PRIMARY KEY (label)

);



CREATE TABLE edges(

	label VARCHAR(5) NOT NULL,

	from_node VARCHAR(2) NOT NULL,

	to_node VARCHAR(2) NOT NULL,

	PRIMARY KEY (label)

);



CREATE GRAPH WORKSPACE board

  EDGE TABLE edges SOURCE COLUMN from_node 

    TARGET COLUMN to_node KEY COLUMN label

  VERTEX TABLE nodes KEY COLUMN label;

I did not want to manually define the board, so I wrote a procedure for generating it for me based on an input size:

CREATE OR REPLACE PROCEDURE generate_board_for_knights(

    IN size INT, 

    OUT nodes TABLE(

        label VARCHAR(2),

        row_number INT,

        col_number INT

    ),

    OUT edges TABLE(

        label VARCHAR(5),

        from_node VARCHAR(2),

        to_node VARCHAR(2)

    )

) LANGUAGE SQLSCRIPT READS SQL DATA AS BEGIN

    nodes = SELECT col_label || row_label AS label, 

                row_number, col_number

        FROM (

           -- generate row numbers from 1 to 8

           SELECT GENERATED_PERIOD_START AS row_number, 

              TO_VARCHAR(GENERATED_PERIOD_START) AS row_label 

              FROM SERIES_GENERATE_INTEGER(1, 1, 9)

        ) AS row_table

        CROSS JOIN (

           -- generate column numbers from 1 to 8

           -- and column labels from A to H

           SELECT GENERATED_PERIOD_START AS col_number, 

              NCHAR(ASCII('A') + GENERATED_PERIOD_START - 1) AS col_label 

              FROM SERIES_GENERATE_INTEGER(1, 1, 9)

        ) AS col_table; 

           -- the edge label is simply the node labels

           -- joined by a hyphen (e.g. A1-C2)

    edges = SELECT A.label || '-' || B.label AS label, 

              A.label AS from_node, B.label AS to_node 

              FROM :nodes AS A CROSS JOIN :nodes AS B

            WHERE  (ABS(A.row_number - B.row_number) = 2 

              AND ABS(A.col_number - B.col_number) = 1)

              OR (ABS(A.row_number - B.row_number) = 1 

              AND ABS(A.col_number - B.col_number) = 2);

END;

After creating the procedure, I was able to simply call it to fill my tables:

CALL "GENERATE_BOARD_FOR_KNIGHTS"(8, nodes, edges) WITH OVERVIEW;

WebIDE has a built-in graph viewer which provides a pretty nice visualisation of the chess board:

Lastly, I defined a table type for representing paths:

CREATE TYPE tt_path AS TABLE(label VARCHAR(2));

Naive Approach

The first idea that came to my mind was to try to solve it via backtracking. I decided to not do it, because of the fact that it would be simply too slow.

Vanilla backtracking has an exponential asymptotic time complexity. In our graph each node has potentially 8 neighbours and we have to find a 64-node long path, so it would need to do (slightly less than) 8 ^ 64 steps to finish the whole algorithm.

Of course, if we would stop when we find the first solution it will be faster, but I still decided that there must be a better solution.

Greedy Algorithm

After thinking on it a while, I decided that it should be possible to find a solution using a greedy algorithm with an appropriate selection heuristic.

The first heuristic that I tried was to always try to go towards the bottom of the board and slowly fill in from there. After fighting a little with the GraphScript language, I wrote the following procedure:

CREATE OR REPLACE PROCEDURE "FIND_HAMILTONIAN_PATH" (

  IN iv_origin VARCHAR(2),

  OUT ot_path tt_path

) LANGUAGE GRAPH READS SQL DATA AS

BEGIN

  Graph lo_graph = Graph("BOARD");

  BigInt lv_size = Count(Vertices(:lo_graph));



  -- retrieve the start point from the graph based on the origin label

  Vertex lo_current = Vertex(:lo_graph, :iv_origin);

  Sequence<Vertex> lt_nodes = [:lo_current];

  ALTER lo_graph ADD TEMPORARY VERTEX ATTRIBUTE(Boolean visited = false);

  

  WHILE (COUNT(:lt_nodes) <= :lv_size) {

    lo_current."VISITED" = true;

    Boolean lf_found = false;

    Int lv_row = -1;

    Int lv_col = -1;

    

    FOREACH lo_neighbor IN NEIGHBORS(:lo_graph, { :lo_current }, 1, 1) {

      -- first try to go on lower rows

      -- if the row is the same, go towards the left corner

      IF (NOT :lo_neighbor."VISITED" 

          AND (:lv_row < :lo_neighbor."ROW_NUMBER" 

          OR (:lv_row == :lo_neighbor."ROW_NUMBER" 

          AND :lv_col < :lo_neighbor."COL_NUMBER"))) {

        lv_row = :lo_neighbor."ROW_NUMBER";

        lv_col = :lo_neighbor."COL_NUMBER";

        lf_found = true;

        lo_current = :lo_neighbor;

      }

    }

    

    -- stop the loop if we did not find a next node 

    -- this avoids infinite loops which do not do anything

    IF (NOT :lf_found) {

      BREAK;

    }

    

    -- append the next node to the path

    lt_nodes = :lt_nodes || [:lo_current];

  }

  

  ot_path = SELECT :lo_node.label FOREACH lo_node IN :lt_nodes;

END;

Unfortunately, this approach failed miserably, yielding mediocre results:

The problem looked to me like once the bottom half of the board was almost full, the knight got into a dead end. To prevent this, I tried applying a different heuristic: to always go as close as possible to the center of the board.

CREATE OR REPLACE PROCEDURE "FIND_HAMILTONIAN_PATH" (

  IN iv_origin VARCHAR(2), 

  OUT ot_path tt_path

) LANGUAGE GRAPH READS SQL DATA AS 

BEGIN

  Graph lo_graph = Graph("BOARD");



  -- get the size of the graph and use it throughout the algorithm

  BigInt lv_size = Count(Vertices(:lo_graph));



  -- retrieve the start point from the graph based on the origin label

  Vertex lo_current = Vertex(:lo_graph, :iv_origin);

  Sequence<Vertex> lt_nodes = [:lo_current];

  ALTER lo_graph ADD TEMPORARY VERTEX ATTRIBUTE(Boolean visited = false);

  

  WHILE (COUNT(:lt_nodes) <= :lv_size) {

    lo_current."VISITED" = true;

    Boolean lf_found = false;

    BigInt lv_score = 4L * :lv_size;

    

    FOREACH lo_neighbor IN NEIGHBORS(:lo_graph, { :lo_current }, 1, 1) {

      IF (NOT :lo_neighbor."VISITED") {

        -- compute the delta (4.5 would be the middle of the board, but we

        -- multiply everything by 2 to be able to use integers)

        BigInt lv_row_delta = 2L * BIGINT(:lo_neighbor."ROW_NUMBER") - 9L;



        -- do an ABS, because GraphScript does not support math functions

        IF (:lv_row_delta < 0L) {

          lv_row_delta = :lv_row_delta * -1L;

        }



        BigInt lv_col_delta = 2L * BIGINT(:lo_neighbor."COL_NUMBER") - 9L;

        IF (:lv_col_delta < 0L) {

          lv_col_delta = :lv_col_delta * -1L;

        }



        IF (:lv_score > :lv_col_delta + :lv_row_delta) {

          lv_score = :lv_col_delta + :lv_row_delta;

          lo_current = :lo_neighbor;

          lf_found = true;

        }

      }

    }

    

    IF (NOT :lf_found) {

      BREAK;

    }

    

    -- append the next node to the path

    lt_nodes = :lt_nodes || [:lo_current];

  }

  

  ot_path = SELECT :lo_node.label FOREACH lo_node IN :lt_nodes;

END;

I envisioned that the knight will spin around the center of the board. What I did not realise is that this approach would basically cut off the corners of the board leading the knight to be stuck in one of the corners:

Seeing that my heuristics are not working, I decided to read a little literature on the topic.

Warnsdorff Rule

Source: Squirrel, Douglas, and Paul Çull. "A Warnsdorff-Rule Algorithm for Knight's Tours." (1996).

After reading some papers, I found out that I was very close to a correct solution. The heuristic that I was searching for was to always select the next node to be the one with the smallest degree. That is, we select the node which has the smallest amount of not-yet-visited neighbours.

CREATE OR REPLACE PROCEDURE "FIND_HAMILTONIAN_PATH_VIA_WARNSDORFF" (

  IN iv_origin VARCHAR(2), 

  OUT ot_path tt_path

) LANGUAGE GRAPH READS SQL DATA AS 

BEGIN

  Graph lo_graph = Graph("BOARD");



  -- get the size of the graph and use it throughout the algorithm

  BigInt lv_size = Count(Vertices(:lo_graph));



  -- retrieve the start point from the graph based on the origin label

  Vertex lo_current = Vertex(:lo_graph, :iv_origin);



  Sequence<Vertex> lt_nodes = [:lo_current];

  ALTER lo_graph ADD TEMPORARY VERTEX ATTRIBUTE(Boolean visited = false);

  

  WHILE (COUNT(:lt_nodes) <= :lv_size) {

    lo_current.visited = true;

    Boolean lf_found = false;

    BigInt lv_min = :lv_size;

    

    FOREACH lo_first IN NEIGHBORS(:lo_graph, { :lo_current }, 1, 1) {

      IF (NOT :lo_first.visited) {

        BigInt lv_first_count = 0L;



        -- compute the degree of each neighbour

        FOREACH lo_second IN NEIGHBORS(:lo_graph, { :lo_first }, 1, 1) {

          IF (NOT :lo_second.visited AND :lo_second != :lo_current) {

            lv_first_count = :lv_first_count + 1L;

          }

        }

        

        -- go for the minimal degree

        IF (:lv_first_count < :lv_min) {

          lo_current = :lo_first;

          lv_min = :lv_first_count;

          lf_found = true;

        }

      }

    }

    

    IF (NOT :lf_found) {

      BREAK;

    }

    

    -- append the next node to the path

    lt_nodes = :lt_nodes || [:lo_current];

  }

  

  ot_path = SELECT :lo_node.label FOREACH lo_node IN :lt_nodes;

END;

Basically, this would yield the exact opposite result as the image above: first visit the margins of the board, then the center:

I finally got a solution! But what I don't like at this heuristic is the fact that, when we have to decide on the next node and there are two nodes with the exact same score, we always take the first one. In the literature they usually take one at random, but there is no RANDOM function in GraphScript.

Sometimes this heuristic may not be able to yield a result (although I never encountered this situation), so I wanted to improve it further.

Pohl's Tie Breaking

Source: Pohl, Ira. "A method for finding Hamilton paths and Knight's tours." (1967).

One possible way of breaking ties would be to repeatedly apply the selection mechanism on the candidate nodes. Because I cannot really write a recursive call using GraphScript, I decided to go for a single level tie-breaking:

First find all candidate nodes based on the Warnsdorff rule.

If there is only one candidate, select that one.

Otherwise, compute the minimal neighbour degree for each candidate and select the one with the smallest value. At this point, we could again reach a tie, so we just select the first one (this would be solved via recursion - except for the perfectly symmetrical arrangements).

CREATE OR REPLACE PROCEDURE "FIND_HAMILTONIAN_PATH_VIA_WARNSDORFF_AND_POHL" (

  IN iv_origin VARCHAR(2), 

  OUT ot_path tt_path

) LANGUAGE GRAPH READS SQL DATA AS 

BEGIN

  Graph lo_graph = Graph("BOARD");



  -- get the size of the graph and use it throughout the algorithm

  BigInt lv_size = Count(Vertices(:lo_graph));



  -- retrieve the start point from the graph based on the origin label

  Vertex lo_current = Vertex(:lo_graph, :iv_origin);



  Sequence<Vertex> lt_nodes = [:lo_current];

  ALTER lo_graph ADD TEMPORARY VERTEX ATTRIBUTE(Boolean visited = false);

  

  WHILE (COUNT(:lt_nodes) <= :lv_size) {

    lo_current.visited = true;

    BigInt lv_min = :lv_size;

    

    -- we will use this sequence to store candidate nodes

    Sequence<Vertex> lt_candidates = Sequence<Vertex>(:lo_graph);

    

    FOREACH lo_first IN NEIGHBORS(:lo_graph, { :lo_current }, 1, 1) {

      IF (NOT :lo_first.visited) {

        BigInt lv_first_count = 0L;



        -- compute the degree of each candidate

        FOREACH lo_second IN NEIGHBORS(:lo_graph, { :lo_first }, 1, 1) {

          IF (NOT :lo_second.visited) {

            lv_first_count = :lv_first_count + 1L;

          }

        }



        -- add the candidate to the list if needed

        IF (:lv_first_count < :lv_min) {

          lv_min = :lv_first_count;

          lt_candidates = [ :lo_first ];

        } ELSE {

          IF (:lv_first_count == :lv_min) {

            lt_candidates = :lt_candidates || [ :lo_first ];

          }

        }

      }

    }

    

    -- terminate the algorithm if no candidates were found

    IF (COUNT(:lt_candidates) == 0L) {

      BREAK;

    } ELSE {



      -- select the single candidate if only one was found

      IF (COUNT(:lt_candidates) == 1L) {

        lo_current = :lt_candidates[1L];



      -- otherwise, apply the tie-breaking heuristic

      } ELSE {

        lv_min = :lv_size;

        FOREACH lo_candidate IN :lt_candidates {

          BigInt lv_candidate_min = :lv_size;

          

          -- find the minimum neighbour degree for each candidate

          FOREACH lo_first IN NEIGHBORS(:lo_graph, {:lo_candidate}, 1, 1) {

            BigInt lv_first_count = 0L;

            IF (NOT :lo_first.visited) {

              FOREACH lo_second IN NEIGHBORS(:lo_graph, {:lo_first}, 1, 1) {

                IF (NOT :lo_second.visited AND :lo_second != :lo_candidate) {

                  lv_first_count = :lv_first_count + 1L;

                }

              }

            }

            IF (:lv_first_count < :lv_candidate_min) {

              lv_candidate_min = :lv_first_count;

            }

          }



          -- select the candidate with the smallest minimum neighbour degree

          IF (:lv_candidate_min <= :lv_min) {

            lo_current = :lo_candidate;

            lv_min = :lv_candidate_min;

          }

        }

      }

    }

    

    -- add the next node to the path

    lt_nodes = :lt_nodes || [:lo_current];

  }

  

  ot_path = SELECT :lo_node.label FOREACH lo_node IN :lt_nodes;

END;

Running it seems to generate the expected result:

I still don't like this solution, because even for asymmetrical arrangements, we still have to take a random choice (due to the lack of recursion).

Roth's Tie Breaking

Source: Ganzfried, Sam. "A Simple Algorithm for Knight’s Tours." (2004).

Another idea for the tie breaking is to select from the candidate nodes based on the distance from the center of the board (take the node with the maximal distance to the center):

CREATE OR REPLACE PROCEDURE "FIND_HAMILTONIAN_PATH_VIA_WARNSDORFF_AND_ROTH" (

  IN iv_origin VARCHAR(2), 

  OUT ot_path tt_path

) LANGUAGE GRAPH READS SQL DATA AS 

BEGIN

  Graph lo_graph = Graph("BOARD");



  -- get the size of the graph and use it throughout the algorithm

  BigInt lv_size = Count(Vertices(:lo_graph));



  -- retrieve the start point from the graph based on the origin label

  Vertex lo_current = Vertex(:lo_graph, :iv_origin);



  Sequence<Vertex> lt_nodes = [:lo_current];

  ALTER lo_graph ADD TEMPORARY VERTEX ATTRIBUTE(Boolean visited = false);

  

  WHILE (COUNT(:lt_nodes) <= :lv_size) {

    lo_current.visited = true;

    BigInt lv_min = :lv_size;



    -- we will use this sequence to store candidate nodes

    Sequence<Vertex> lt_candidates = Sequence<Vertex>(:lo_graph);

    

    FOREACH lo_first IN NEIGHBORS(:lo_graph, { :lo_current }, 1, 1) {

      IF (NOT :lo_first.visited) {

        BigInt lv_first_count = 0L;



        -- compute the degree of each candidate

        FOREACH lo_second IN NEIGHBORS(:lo_graph, { :lo_first }, 1, 1) {

          IF (NOT :lo_second.visited) {

            lv_first_count = :lv_first_count + 1L;

          }

        }



        -- add the candidate to the list if needed

        IF (:lv_first_count < :lv_min) {

          lv_min = :lv_first_count;

          lt_candidates = [ :lo_first ];

        } ELSE {

          IF (:lv_first_count == :lv_min) {

            lt_candidates = :lt_candidates || [ :lo_first ];

          }

        }

      }

    }

    

    -- terminate the algorithm if no candidates were found

    IF (COUNT(:lt_candidates) == 0L) {

      BREAK;

    } ELSE {



      -- select the single candidate if only one was found

      IF (COUNT(:lt_candidates) == 1L) {

        lo_current = :lt_candidates[1L];



      -- otherwise, apply the tie-breaking heuristic

      } ELSE {

        Double lv_max_distance = -1.0;

        FOREACH lo_candidate IN :lt_candidates {

          -- compute the distance between the vertex and the board center

          Double lv_distance = (DOUBLE(:lo_candidate."ROW_NUMBER") - 4.5) 

             * (DOUBLE(:lo_candidate."ROW_NUMBER") - 4.5) 

             + (DOUBLE(:lo_candidate."COL_NUMBER") - 4.5) 

             * (DOUBLE(:lo_candidate."COL_NUMBER") - 4.5);

          -- select the candidate with maximal distance

          IF (:lv_distance > :lv_max_distance ) {

            lo_current = :lo_candidate;

            lv_max_distance = :lv_distance;

          }

        }

      }

    }

    

    -- add the next node to the path

    lt_nodes = :lt_nodes || [:lo_current];

  }

  

  ot_path = SELECT :lo_node.label FOREACH lo_node IN :lt_nodes;

END;

Surprisingly, running this for the same starting point generated the same result as Pohl's tie-breaking (most of the time).

Bloopers

I don't want to give a false impression and make people think that everything went smoothly. I actually encountered a few bugs and limitations which made me waste a few hours of my time:

HANA Express is too large for my computer. I had to clean up 40GB of my laptop to be able to run it. I wanted to use HXE to have access to the latest service pack, but I really started thinking to use an older HANA to which I have access...

I am not very happy with the fact that GraphScript is a completely different language than SQLScript, especially because of the fact that most functions available in SQLScript are not available in GraphScript. For example, there is no ABS in GraphScript.

I did not find a way to pass a Graph object through the procedure interface (e.g. to create a recursive procedure).

The WebIDE is always showing a ton of errors as if all the code is completely broken. To be honest, I would prefer to not have any kind of syntax check and just have syntax highlighting.

Once I introduced the first IF clause, the WebIDE "Run" button would not anymore execute my "CREATE OR REPLACE PROCEDURE" statement. I had to resort to manually sending the code via an jQuery.ajax call from the development console.

jQuery.ajax({

   method: "POST", 

   url: "/sap/hana/cst/api/v2/sessions(...)/connections(...)/sqlExecute()", 

   headers: {"X-CSRF-Token": "..."}, 

   contentType: "application/json", 

   data: JSON.stringify({

      "statements": [{"statement": code, "type": "UPDATE"}],

      "maxResultSize": 1000, 

      "limitLOBColumnSize": 1024, 

      "includePosColumn": true,

      "bExeWithoutPrepare": false,

      "bStopOnError": true,

      "bRunAsBackgroundActivity": false

   })

});

Conclusion

Overall, I think that working with the Graph Engine is fun, but it is still not a fully mature technology. Actually, this might apply to the whole XSA and not only to the Graph capabilities. Coming from the Java ecosystem, where you can have an excellent developer experience, developing on HANA using the WebIDE is somewhat lacklustre.