continua (next page) Index of UNOFFICIAL Circumvesuviana Home Page Linux pages main index About twenty years ago a comics magazine presented a little game called "100 game" (at least here in Italy). In a 10×10 matrix the player had to fill the single cells with a number, incremented at every step, starting from the first cell (top-left) and moving around in the eight possile directions (horizontal or vertical by skipping two cells, or diagonal by skipping one cell), provided that the destination cell is not already filled.

A solution is found when the player can fill the entire matrix without getting to a "dead way". After some months, I got it (by paper and pen and a lot of patience), while wondering "If I've had a computational machine...". At that time, I wasn't aware of the P/NP (non-polynomial-completeness) question, neither I had decent knowledge about exponential-growth order algorythms. My first computer, in 1984, was a Sinclair ZX Spectrum, featuring a Z80A 8 bit processor at a stunning 3.5 MHz clock speed and 48k RAM.

Beh, sciroppatevela in inglese: m'shfàstaréo a tradurla pure in italiano...
Now I have a degree in Computer Science, some knowledge of P/NP, master a bunch of programming languages, hardware, etc; I have also a roaring P3/750 notebook with 192Mb RAM (well, I later switched to an Apple Powerbook G4/1.67 with 1Gb RAM), but I got back to that Very Young Boy Dream to get a greedy algorythm solution for it.

The first attempt was the listing 1 below, almost entirely resembling my first Spectrum Basic attempt (that never run because was never completed: I didn't have recursion, there!) in late 1984.

Well, my friend Giosi got a 100×100 solution using a nice trick and Java...!

The algorythm reads as "for every possible move, see if you can continue from there. If yes, then try from there the same way. If the program returns from there, it means that it wasn't a solution". So, it's an O(exp(7,n²)) (and -ouch!- the proposed comics game assumed n=10), because every node of the tree has seven children (the eight possible directions except the one where you came).

The tree is not complete but the average case is quite near to worst case. Finding a solution means finding one path from the root to a leaf (one of the "solution" leaves) at depth , no matter where you go - this means also that the best case is shown only if you don't apply the same strategy while descending the tree: the solution for a non-polynomial algorythm has to be searched in an adaptive strategy while selecting "which node I will descend now": using a fixed strategy you will descend a lot of useless leaves (not at depth ). A good case is when you find a path that quickly resolves in a "no way", but with little vales of n this gets really rare.

Well, a solution in reasonable time (some minutes) for n<=10 can be found even with a greedy approach.

//   cento-standard.c   Feb'2002

#include <stdio.h>

#ifndef SIZE
#define SIZE 10

int m[SIZE][SIZE], ctr=0, maxz=0;

#define bound(a,b)    ((a)>=0 && (a)<SIZE && (b)>=0 && (b)<SIZE)

void show()
  int a,b;
  for(a=0; a<SIZE; a++)
    for(b=0; b<SIZE; b++)
        printf("%-3d", m[a][b]);
        printf("   ");
  printf("\nmax N = %d  --  recursions: %d\n", maxz, ctr);

int vai(int x, int y, int z)
  if(!bound(x,y)) return 1;
  if(m[x][y]) return 1;
  if(maxz<z) maxz=z;



  if(vai(x+3, y, z) && vai(x+2, y+2, z) && vai(x, y+3, z) && vai(x-2, y+2, z)
  && vai(x-3, y, z) && vai(x-2, y+2, z) && vai(x, y-3, z) && vai(x+2, y-2, z))

  return 1; // ouch!

int main(int argc, char **argv)
  if(vai(0,0,1)) printf("impossible\n");
  return 1;

This kind of approach gives about 200 thousands recursion calls per second on my P3/750 machine, compiled with gcc 2.96 with standard optimizations ("-O2"). It finds quickly (a matter of seconds) the solution for n=5 and n=6 (for n<5 a solution does not exist). But, when you try with a SIZE of 7 or above, you will see countless recursions and countless hours of work...

How can be optimized it, without changing the irrational greedy approach with fixed strategy?

Well... oops... er... YES, use a bit of assembly programming for the first time in more than eight years of Linux programming! =:-)

Some hints:

Well, all these lead to a stunning speed of over 83 millions of recursions per seconds on my 750 MHz machine (yes, nine clock cycles per recursion call!!): the 10×10 matrix is completed in less than 150 seconds of CPU usage (while running X/Free, Linux 2.4.17, etc). I think one can do even better if knows the internals of Pentium-III cycle optimizations, branch prediction, etc, but the assembler listing below shows how assembly reengineering of a critical function can boost speeds of two orders of magnitude.

First, the C listing that prepares the matrix and does the output:

//      cento.c         -- Feb'2002
//      greedy approach to get a recursive solution for the Little "100" Game
//      compile with paranoid optimizations:
//          gcc -O6 -Wall -fomit-frame-pointer cento.c centro.o -o cento
//          strip -s -R .comment -R .note cento

// --- configuration ---

#define  SECS          3    // timeout for refreshing screen table

// --- library ---

#include <stdio.h>     // printf
#include <signal.h>    // signal
#include <unistd.h>    // alarm
#include <stdlib.h>    // exit
#include <string.h>    // strcpy

extern int do_calc(char *matrix_ptr, int starting_number, int maximum_value);

// --- local variables ---

char m[128][128];       // the full matrix (for up to 120x120)

char output[2+120*120][5];    // pre-formatted values

int n=10;               // matrix order

// --- the show applet ---

void showmatrix()
{                              //  non necesse optimizare hic
  int a,b, amax=n, bmax=n;
  printf("\033[;H\n");         //  reset cursor position

  if(amax>24) amax=24;         //  because of terminal window limits
  if(bmax>24) bmax=24;

  for(a=0; a<amax; a++)
    for(b=0; b<bmax; b++) printf("%s", output[1+m[a+4][b+4]]);

void show(int sig)             //  shows the current table to screen
  signal(SIGALRM, show);

// --- initialization/startup ---

int main(int argc, char **argv)
  int a, b;

  if(argc>1) sscanf(argv[1], "%d", &n);  // presume n<=120

  printf("\033[2J");  // clear terminal screen

  for(a=0; a<4; a++)
    for(b=0; b<128; b++)
      m[a][b]=-1;     // keep out some rows, to save a bound-checking

  for(a=n+4; a<128; a++)
    for(b=0; b<128; b++)
      m[a][b]=-1;     // save another bound-checking (rows below)

  for(a=4; a<n+4; a++)
    for(b=0; b<4; b++)
      m[a][b]=-1;     // columns to the left
    for(b=n+4; b<128; b++)
      m[a][b]=-1;     // columns to the right

  strcpy(output[0], "***");           // precompile strings
  strcpy(output[1], "__ ");
  for(a=1; a<=n*n; a++) sprintf(output[a+1], n>10 ? "%-3x" : "%-3d", a);

  show(SIGALRM);                      // simulates the first ALARM/show

  a=do_calc(&m[4][4], 1, n*n);        // go!

  alarm(0);                           // stop screen update...
  showmatrix();                       // ...but be sure the solution is shown!

  printf("\n[%s]\n", a ? "impossible" : "OK!!!");
  return a;

And now, the assembler listing of the recursive call.

;       centro.asm               
;       assembler implementation of the Little 100 Game recursive calculus
;       compile with:  nasm -felf centro.asm
;       then link the centro.o object file to the calling C program

; --- strategies (select one) ---


%define  CLOCKWISE          movearound  E, SE, S, SW, W, NW, N, NE
%define  CLOCK_UNWISE       movearound  S, SE, E, NE, N, NW, W, SW
%define  HOLDYOURBACK       movearound  NW, W, N, SW, NE, S, SE, E
%define  PINGPONG           movearound  W, E, N, S, SW, NE, SE, NW
%define  REGULAR            movearound  W, N, S, E, NW, SW, SE, NE
%define  NONREGULAR         movearound  NW, SW, SE, NE, W, N, S, E

; --- definitions ---

%define  N   (-384)
%define  S   (384)
%define  E   (3)
%define  W   (-3)
%define  NE  (-254)
%define  NW  (-258)
%define  SE  (258)
%define  SW  (254)

%macro go 2
        add ebx, dword (%2-%1)
        call calc

%macro movearound 8
        go  0, %1
        go %1, %2
        go %2, %3
        go %3, %4
        go %4, %5
        go %5, %6
        go %6, %7
        go %7, %8
        add ebx, dword (-%8)      ; this is to adjust original ebx value

; --- the function called only once (to start the recursion) ---

global do_calc  ; this is the function name

do_calc ; int do_calc(char *matrix_ptr, int starting_number, int max_number)
        enter 0,0
          push ebx
          push ecx
          push edx
          push esi
            mov esi, esp
            xor eax, eax
            mov ebx, [ebp+8]        ; matrix_ptr
            mov ecx, [ebp+12]       ; starting_number
            mov edx, [ebp+16]       ; max_number
            call calc
ret_here    setnz al                ; return value is zero only if the entire
            movzx eax, al           ;   matrix was filled
            mov esp, esi
          pop esi
          pop edx
          pop ecx
          pop ebx
        align 16       ; some paranoia alignment...

; --- the recursive function ---
; we get here with:
;    ah  -- zero
;    ebx -- points to the address of the next cell to be filled
;    ecx -- next number to place, always >= 1
;    edx -- the last value to place in the matrix

calc:   mov al, [ebx]           ; the recursion begins exactly here!
        or al, al
        jz able                 ; test cell and return with nz flag if we
        ret                     ;   cannot continue from that point
        align 16

able:   mov [ebx], cl           ; so, try to fill next cell
        inc ecx
        cmp ecx, edx            ; check if we really did it
        jbe fill
        xor eax, eax            ; go out if filled, leaving z flag by xoring
        jmp short ret_here
        align 16

fill:   STRATEGY                ; apply currently defined strategy

        mov [ebx], ah           ; if we get here then it was a closed way, so
        dec ecx                 ;   free the current cell and go back one move
        ret                     ;     and return with nz because always ecx!=1

; ---

That's all, folks! I already hear lots of Programming Purists crying for my bad-bad-BAD-BAD greedy approach, but I got what I was searching for. Maybe one day I'll rework on it to get a solution of up to 120×120 in reasonable time (the 128×128 matrix was born because of this), but my intent was only to fiddle with assembly in Linux.

YES, if you try to get a 9×9 solution (theoretically simpler), you'll have to wait three hours. So, the 10×10 was not the worst case, but I'm happy I found it in less than three minutes. Here is it for you (notice the horizontal/vertical moves preferred to diagonal moves):

14829 847309463316
7869947970 9586719687
36196435186334 176245
249 2875431105532 15
77689380 659285729788
372057402356 41246144
3 5027653261191 8414
7667100 81669982739889
3821583922 5942256043
451755527412 908313

Last note: my friend Giosuè was working on the same problem, but using Java instead of C and assembler.

Click here for index page.

send e-mail - continua (next page)