flat assembler
Message board for the users of flat assembler.
Index
> Examples and Tutorials > Sudoku Goto page 1, 2 Next 
Author 

bitRAKE
How hard would this be in assembler? Little more than double the lines of code, but I didn't use any fancy NUMPY to print matrix. Two additional algorithmic improvements are present in the assembler version: linearization of the matrix and advanced scanning on recursion.
https://www.youtube.com/watch?v=G_UYXzGuqvM Code: include 'format/format.inc' FORMAT PE64 CONSOLE 6.0 at $10000 on "NUL" SECTION '.flat' CODE READABLE WRITEABLE EXECUTABLE POSTPONE msvcrt_name db 'msvcrt',0 _exit db 0,0,'exit',0 __printf_p db 0,0,'_printf_p',0 align 8 DATA IMPORT dd 0,0,0,RVA msvcrt_name,RVA msvcrt_table dd 0,0,0,0,0 align 8 msvcrt_table: exit dq RVA _exit _printf_p dq RVA __printf_p dq 0 END DATA END POSTPONE ;############################################################################### Puzzle000 db \ 5,3,0,0,7,0,0,0,0,\ 6,0,0,1,9,5,0,0,0,\ 0,9,8,0,0,0,0,6,0,\ 8,0,0,0,6,0,0,0,3,\ 4,0,0,8,0,3,0,0,1,\ 7,0,0,0,2,0,0,0,6,\ 0,6,0,0,0,0,2,8,0,\ 0,0,0,4,1,9,0,0,5,\ 0,0,0,0,8,0,0,7,9 ; Each location has a linearized form [0,80]. Additionally, a location belongs ; to three groups: row, column, square. The values here convert the linearized ; index into the member groups. align 64 ; cache friendly TriQuantize: repeat 9, j:0 repeat 9, i:0 db i ; delta to current row start db j*9 ; delta to current column start ; delta to current square start db ((i mod 3) + 9*(j mod 3)) end repeat end repeat _format db "Looking for solutions for:",13,10 _floormat db "%c %c %c %c %c %c %c %c %c",13,10,0 Sudoku__PrettyPrint: ; nothing fancy, just replace zero with period push rsi mov rsi,rbp push rdi push rbp mov rbp,rsp sub rsp,10*8 and rsp,16 mov edi,9 .LLL: mov edx,1 xor eax,eax .LL: mov qword[rsp+rdx*8],'0' lodsb test al,al jnz .num add qword[rsp+rdx*8],'.'  '0' .num: add qword[rsp+rdx*8],rax add edx,1 cmp edx,10 jc .LL mov rdx,[rsp+8] mov r8,[rsp+16] mov r9,[rsp+24] call [_printf_p] lea rcx,[_floormat] sub edi,1 jnz .LLL mov rsp,rbp pop rbp pop rdi pop rsi retn ; RBP : global puzzle pointer ; RDI : pointer of current trial + 1 align 64 ; cache friendly Validate_XYN: ; RSI : linearized index of current trial neg rbp lea rsi,[rbp+rdi1] neg rbp ; check current row movsx rdx,byte[TriQuantize+rsi*2+rsi] repeat 9, i:0 cmp [(rdi1)+rdx+i],cl jz .false end repeat ; check current column movsx rdx,byte[TriQuantize+rsi*2+rsi+1] repeat 9, i:0 cmp [(rdi1)+rdx+9*i],cl jz .false end repeat ; check current square movsx rdx,byte[TriQuantize+rsi*2+rsi+2] repeat 3, j:0 repeat 3, i:0 cmp [(rdi1)+rdx+9*j+i],cl jz .false end repeat end repeat .false: retn ; Sudoku solver : TryStart: mov rdi,rbp ; from start .Try: push rdi mov al,0 neg rdi lea rcx,[81+rdi+rbp] neg rdi repnz scasb jnz .end ; no zeroes to fix mov ecx,9 .A: call Validate_XYN jz .B mov [rdi1],cl ; attempt call .Try ; recurse movzx ecx,byte [rdi1] ; restore counter mov byte[rdi1],0 ; clear attempt .B: loop .A pop rdi retn .end: pop rdi lea rcx,[_floormat1] call Sudoku__PrettyPrint retn ;############################################################################### ENTRY $ pop rax ; this doesn't return lea rbp,[Puzzle000] lea rcx,[_format] call Sudoku__PrettyPrint call TryStart call [exit] int3 

18 Feb 2020, 06:14 

bitRAKE
If one wanted to solve Sudoku fast, it translates easily to SAT problem. Then just use existing tools. I wanted to implement the algorithm from the video  which struggles with 16x16 games.


18 Feb 2020, 07:33 

Ali.Z
true its sudoku post, but what can be even tougher is 3 by 3 rubik's cube solver that can tell you what to move what side and how many times.
https://rubikscubesolver.com/ _________________ Asm For Wise Humans 

29 Mar 2020, 18:53 

bitRAKE
This page is in interesting read regarding Rubik's Cube:
https://tomas.rokicki.com/cubecontest/ Most of those solvers are under 20ms in 2004. God's Algorithm is the what they call an optimal move generator: https://www.speedsolving.com/wiki/index.php/God%27s_Algorithm There's a new contest up at: http://azspcs.com/ (Efficient solutions require several layers of discovery about the problem.) 

30 Mar 2020, 01:31 

pelaillo
This old entry of the Hugi Size Coding Competition for a sudoku solver is one of the most amazing optimizations that I have ever seen: the winner (G3) did it in 67 bytes!!
But after the contest, 3 of the participants joined efforts and the result is this beauty: the full sudoku (it reads the puzzle, solves it and then prints the result) all in 62 bytes Code: ;===============================================; ; **** Hugi Compo 25 **** ; ; Sudoku Solver  62 bytes ; ;; ; G3 / tgm80 (at) mail (dot) ru ; ; TFx / tfx (at) bitmaster (dot) it ; ; Digimind / digimind (at) aha (dot) ru ; ;===============================================; use16 org 100h mov ah, 3fh ;ah = 3f > input mov dx, bp ;bp = buffer InOutExit: int 21h ;input/output/exit xchg ax, cx ;cx = length inc bx ;stdout handle Solve: mov [bp+si], al ;digit is ok Empty: dec si ;find empty cell js InOutExit ;if not found output solution, then exit mov ax,4031h ;ah = 40 > output/inc ax ChkDgt: cmp byte [bp+si], '.' ;is empty? jne Empty ;if empty check it else continue mov di, cx Next: dec di ;find digit pusha js Solve ;if no more digits set and recurse cmp [bp+di], al ;else search jne SetOF xchg ax, di ;get digit position Split: aam 19 ;split row and column imul ax, 11 ;find 3x3 group xchg ax, si ;get empty cell position inc bx jpo Split ;split empty cell position xor ax, si ;compare cell positions test ax, 0e0c0h ;same 3x3 group? jle NOk SetOF: mul ah ;same column or same row? NOk: popa jo Next ;next position mov byte [bp+si], '.' ;empty cell cmp al, 39h ;no more digits? je NOk jmp ChkDgt1 This is the original contest: http://www.hugi.scene.org/compo/compoold.htm#compo25 Code: ;Sudoku Puzzle Solver for Hugi Compo 25 ;Coded by G3 (tgm80@mail.ru) ;Compile: fasm.exe entry.asm entry.com ;Usage: entry.com <filename.puz ;The input consists of nine lines. Each line contains only the ASCII ;characters 1 through 9, "." and space; and is terminated with a carriage ;return and line feed (in that order). The "." indicates an empty cell. ;One space character separates each printable character. ;For example, the above puzzle would have this input: ;4 3 . . . . . 8 .<cr><lf> ;9 7 8 . . . . 6 5<cr><lf> ;. . . 8 6 9 . . 3<cr><lf> ;. . 5 . . 4 6 . .<cr><lf> ;. . 1 9 5 8 3 . .<cr><lf> ;. . 3 2 . . 9 . .<cr><lf> ;5 . . 1 7 3 . . .<cr><lf> ;8 6 . . . . 5 3 1<cr><lf> ;. 1 . . . . . 4 2<cr><lf> org 100h ;assume bx=0, ch=0, si=100h mov ah,3Fh ;ah=3Fh  read mov dx,2EFFh ;dh='.', dl=1 mov bp,dx ;bp=dx  offset of puzzle array L10: mov cl,171 ;number of bytes to read/write int 21h ;read/write/exit inc bx ;bx=1  handle to stdout L20: mov [si+bp],al ;make trial entry with digit pusha ;save digit and index of cell L30: dec si ;next cell js L10 ;jump if puzzle solved cmp [si+bp],dh ;[si+2EFF]='.' ? jnz L30 ;loop while cell is not empty mov ax,4031h ;ah=40h  write, al='1' / 40h  inc ax L40: cmp al,3Ah jz L70 ;jump if al>'9' mov cl,171 ;size of puzzle array mov di,dx ;di=dx  offset of puzzle array L50: ;search for cell contains the digit in al with repnz scasb ;identical row or column or 3x3 region coordinates jnz L20 ;jump if digit in al is not already used pusha ;save digit and index of cell xchg ax,si ;al=si&00FF  index of first cell L60: aam 19 ;ah=al/19  row, al=al%19  column imul bp,ax,11 ;ah/3=(ah*11)>>5, al/6=(al*11)>>6 or bp,9F3Fh ;bp=1rr11111cc111111 where rr=row/3, cc=column/6 xor bx,bp ;compare 3x3 region coordinates of two cells xchg ax,di ;al=di&00FF  index of second cell js L60 ;loop 2 times xor ax,di ;compare row and column of two cells mul ah ;set al=0 if al=0 or ah=0 cmp al,1 ;set cf=1 if al=0 (rows or columns are equal) dec bx ;set zf=1 if bx=1 (3x3 region coordinates are equal) L70: popa ;restore digit and index of cell jnbe L50 ;jump if cf=0 and zf=0 mov [si+bp],dh ;undo trial entry jmp L401 ;next digit ;P.S. Best regards to all competitors and organizers and my girlfriend Nadya! 

01 Apr 2020, 00:22 

bitRAKE
Here is a wolframscript (wls) general implementation using binary constraints. Presently, it solves 25x25 puzzles in a couple of seconds (singlethreaded)  over 500k constraints.
Wolframscript will run on anything from a Rasberry PI to the cloud, and is free for personal use. If I create some larger puzzles I'll post them. There are many open problems in the sudoku puzzles. What makes the binary constraint method so appealing is it's flexibility. Nice survey of sudoku variations and their adaptation to SAT solvers: https://yurichev.com/writings/SAT_SMT_by_example.pdf
Last edited by bitRAKE on 25 Apr 2020, 21:53; edited 1 time in total 

23 Apr 2020, 03:47 

bitRAKE
I've used similar code to verify the 16x16 puzzle of minimum guesses published thus far. It is not known if 55 clues are the least needed.


24 Apr 2020, 20:38 

bitRAKE
If we watch an expert player, it seems clear to me that Knuth's Dancing Links algorithm closely mimics the grouping and (bulk) exclusion of possibilities. Next, I want to enumerate known highlevel techniques to discover which ones are not covered by Dancing Links; or if a higher dimensional variant of Dancing Links could be beneficial.
https://youtu.be/jU_W53M5aMQ 

26 Apr 2020, 04:12 

bitRAKE
If my Swedish (code in Java) through Google translate is correct, this paper gives some hope that a higherlevel technique could be advantageous, and also help to enumerate techniques used. The paper has a number of errors, but that doesn't mean it isn't useful. For example, the timing measurements don't have an equivalent basis: this can be seen in the constant time for DL algorithm  this would only happen if puzzles were exactly the same complexity, or DL algorithm was searching for all solutions. No other algorithm is searching for all solutions.
So, the author misses that DL algorithm is actually faster than all other methods, and is equivalent to many of the highlevel techniques. 

26 Apr 2020, 19:15 

revolution
So where is the crossover point where brute force becomes slower than more advanced methods?


26 Apr 2020, 22:52 

bitRAKE
I'm not done coding my DLX algorithm, but it should be faster in all cases.


27 Apr 2020, 00:17 

Ali.Z
bitRAKE wrote: If we watch an expert player, it seems clear to me that Knuth's Dancing Links algorithm closely mimics the grouping and (bulk) exclusion of possibilities. but that depends on the actual 17 numbers and their placements, last time i played 17 i took an hour or two so im no expert; but theoretically thinking i believe that the numbers and their placement does matter. also unique sudoku matrix tend to be more difficult i guess. _________________ Asm For Wise Humans 

27 Apr 2020, 18:13 

bitRAKE
It is unclear what you mean by number placement and unique. The puzzle has a great deal of symmetry, and repeated use of the following (base*) transformations can produce all equivalent variations. In that regard, there is no difference in placement. This also means there are no unique puzzles, per se.
Code: ;the permutations that leave sudoku unchanged are: ; (a) rotate the grid 90 degrees ; (b) swap the top two rows ; (c) swap the top two bands ; (d) swap any two cell values for all cells with those values Most likely, you are saying that the problem complexity does not directly correlate to the number of clues. Which is absolutely correct. It's possible to construct a graph (DAG) of discovery mechanics needed to solve a puzzle and that graph can be unique within the (symmetry divided) problem space. 

28 Apr 2020, 18:10 

bitRAKE
Herbert Kociemba, has already produced massive 400x400 grids by reverse engineering humanlike solving methods. The best existing software for general sudoku solving (Delphi source code), imho.
Qiu Yanzhe, the 2017 bronze champion has commented on a sudoku forum regarding the minimum clues, demonstrating an algorithm that works generally. For the NxN grid, it producing solutions with Floor(N²/4) clues. This more modern writeup suggests another avenue, using propositional logic to greatly reduce guessing. Adapting this method to the general problem seems like it would be advantageous. 

29 Apr 2020, 11:29 

guignol
revolution wrote: So where is the crossover point where brute force becomes slower than more advanced methods? 

29 Apr 2020, 16:27 

guignol
revolution wrote: I think there could be some kind of mega=sudoku. The normal 9x9 is too easy for a computer. Even with the most naive trial and backtrack bruteforcing algorithm it is less than a millisecond and it is done. 

29 Apr 2020, 16:29 

bitRAKE
A wonderful related problem ...
https://www.youtube.com/watch?v=qu04xLNrk94 

09 May 2020, 00:44 

Ali.Z
its so weird, im not mathematician i know +  * / for very small numbers.
but believe or not i solved these grids at least and at minimum 5 years ago up to 10x10. if i remember correctly then i think i solved 6x6 too, im not sure but i think so. so weird nonmath people can solve things that requires complex math, but maybe because i got used to do it for a very long time. even with increased complexity, by mean of each row and column must have their sum equal to each other so does the diagonal. so easy, so weird. _________________ Asm For Wise Humans 

09 May 2020, 16:55 

bitRAKE
I'm not a carpenter, but I built my desk  without plans or measurements. It's a very good desk, but if I lack the language to describe it you will have great difficulty building it.
If we can mentally understand a problemspace how can we communicate that? It can be useful to have a language that covers the domain with reduced ambiguity. Maths become useful when one wants to explain "how" because it has a very specific language  both of form and of process. Of course, the thing is never the description of it, but we may get closer with less effort. Or at the very least advance our communication with common understanding. x86, DNA, etc. these are languages common in our daily life. 

09 May 2020, 20:30 

Goto page 1, 2 Next < Last Thread  Next Thread > 
Forum Rules:

Copyright © 19992020, Tomasz Grysztar.
Powered by rwasa.