flat assembler
Message board for the users of flat assembler.
Index
> Main > Bresenham's linedrawing algorithm in ASM [SOLVED] Goto page 1, 2 Next 
Author 

Madis731
BACKGROUND:
For a long time I've been trying to make some reasonably fast and simple (few registers) linedrawing algorithm in ASM. The first tries were based on the algo found on: http://en.wikipedia.org/wiki/Bresenham's_line_algorithm Look at the optimization part. Its no secret that MenuetOS64 uses this in its linedrawing part. The problems with this approach are that there are many XCHG commands and if you are going to XCHG some memory then you better not. Also the painful part is the end where in the tightest loop you have to DECIDE if it is X,Y or Y,X you are drawing. If you can't imagine what is done then this might help: the line is mirrored on another line specified by y=x (its a diagonal). I managed to make it work, but the loop is too complicated. If it wasn't for trianglefilling algorithm I would have left it there...but imagine sorting triangle's coordinates when you don't know their position (even if X is X and Y is Y) The second approach I made was from a code found from: http://www.gamedev.net/reference/articles/article1275.asp This one is great because it inits all the possible variables it needs and in the inner loop it only increments by them (they can be negative too...). My implementation of this algorithm (that I think is correct) draws everything correctly but the 7th octant. This is strange because everything should be symmetrical. In the 7th octant the line is drawn in exactly the opposite direction THE CODE: This is where I need your hands to help me Code: ;ORIGINAL  Cstyle syntax ;================== deltax = abs(x2  x1); // The difference between the x's deltay = abs(y2  y1); // The difference between the y's x = x1; // Start x off at the first pixel y = y1; // Start y off at the first pixel if (x2 >= x1) // The xvalues are increasing { xinc1 = 1; xinc2 = 1; } else // The xvalues are decreasing { xinc1 = 1; xinc2 = 1 } if (y2 >= y1) // The yvalues are increasing { yinc1 = 1; yinc2 = 1; } else // The yvalues are decreasing { yinc1 = 1; yinc2 = 1; } if (deltax >= deltay) // There is at least one xvalue for every yvalue { xinc1 = 0; // Don't change the x when numerator >= denominator yinc2 = 0; // Don't change the y for every iteration den = deltax; num = deltax / 2; numadd = deltay; numpixels = deltax; // There are more xvalues than yvalues } else // There is at least one yvalue for every xvalue { xinc2 = 0; // Don't change the x for every iteration yinc1 = 0; // Don't change the y when numerator >= denominator den = deltay; num = deltay / 2; numadd = deltax; numpixels = deltay; // There are more yvalues than xvalues } for (curpixel = 0; curpixel <= numpixels; curpixel++) { PutPixel(x, y); // Draw the current pixel num += numadd; // Increase the numerator by the top of the fraction if (num >= den) // Check if numerator >= denominator { num = den; // Calculate the new numerator value x += xinc1; // Change the x as appropriate y += yinc1; // Change the y as appropriate } x += xinc2; // Change the x as appropriate y += yinc2; // Change the y as appropriate } Code: ;MY UNDERSTANDING: ;=============== ; Let us make a line from Axy to Bxy so ; registers rax rbx rcx rdx are used mov rex,rcx ;rbx=x2 sub rex,rax ;rax=x1 sbb rhx,rhx xor rex,rhx sub rex,rhx ;rex=abs(rcxrax) mov rfx,rdx ;rdx=y2 sub rfx,rbx ;rcx=y1 sbb rhx,rhx xor rfx,rhx sub rfx,rhx ;rfx=abs(rdxrbx) or rsi,1 or rdi,1 cmp rcx,rax jc @f neg rsi neg rdi @@: or rix,1 or rjx,1 cmp rdx,rbx jc @f neg rix neg rjx @@: cmp rex,rfx jc @f xor rsi,rsi ; xinc1 = 0 xor rjx,rjx ; yinc2 = 0 mov rkx,rex ; den = deltax mov rlx,rex shr rlx,1 ; num = deltax/2 mov rbp,rfx ; numadd = deltay mov rdx,rex ; numpixels = deltax jmp .endloc @@: xor rdi,rdi ; xinc2 = 0 xor rix,rix ; yinc1 = 0 mov rkx,rfx ; den=deltay mov rlx,rfx shr rlx,1 ; num=deltay/2 mov rbp,rex ; numadd = deltax mov rdx,rfx ; numpixels = deltay .endloc: .linedraw: push rax rbx rcx rdx rex rfx mov rex,rbx add rex,1 ; INT60 eax=38 is a MenuetOS function for lines mov rdx,rax ; I chose to draw a thick "pixel" to see better mov rcx,rbx ; This can also be some PutPixel(rax,rbx) mov rbx,rax mov rax,38 mov rfx,255 int 60h pop rfx rex rdx rcx rbx rax add rlx,rbp ; num += numadd cmp rlx,rkx ; num >= den jc @f sub rlx,rkx ; num = den add rax,rsi add rbx,rdi @@: add rax,rix add rbx,rjx sub rdx,1 jnc .linedraw I've worked days on this and didn't figure out what's wrong  maybe a rested sharpeye can see where the problem is Thanks! Last edited by Madis731 on 03 Aug 2006, 12:04; edited 1 time in total 

02 Aug 2006, 12:27 

vid
use selfmodyfing code, and you have VERY small line algorithm, without all those IFs. even source will be shorter than C's


02 Aug 2006, 13:15 

zhak
2 shorick: it's Jourdain's book. The great book!


02 Aug 2006, 13:16 

Madis731
Vid  I would like to do all kinds of tricks on the code, but FIRST I need it do draw correct lines at every angle
Simply put: I need to know what to change in the code so that IF 1<slope<0 THEN NEG(x increments) andNEG(y increments). Even if its a bit more code  I can continue from there and compress it somehow in the algorithm. I think that something is missing from the Gamedev's article, but I just can't put my finger on it 

02 Aug 2006, 13:18 

vid
then first draw correct line at one angle, like 0 to 45 degrees...


02 Aug 2006, 13:23 

shoorick
use selfmodyfing code > what if code section will be readonly?


02 Aug 2006, 13:28 

Madis731
Selfmodyfying code is not so good on modern CPUs  Intel has warned about the penalties  I don't know how much it is exactly, but copypasting the code and jumping to modifications tends to be faster. Afterall AMD64 and EM64T CPUs ARE ALL NEW CPUs They can't read ahead and they just couldn't reorder the instructions if someone surprisingly modifys some of them :S
If it will be very much faster  I'll consider. Someone at intel once said this also: "If you are jumping, think about what are you doing wrong!" This is why I used no jump instructions in the ABS() parts of the code. Then its all predictable and takes the same amount of clocks. Jumps will never give you the same answer (clock count), neither would selfmodifying code. 

02 Aug 2006, 13:37 

vid
Quote: what if code section will be readonly? hmmm... if he is writing all code by himself (i suppose he is), then he can define it as writeable well, if you generate ideal algortihm with SMC, you get the penalty ONCE, but loops goes so many times that result is better. idea tested with my 2D blit vs. directdraw's blit 

02 Aug 2006, 13:48 

Madis731
Hmm, maybe your right  at least you will get it to shrink and yes I'm writing the code myself, but I though it was bad practise to do smc but well.... ... maybe I can find an even smaller and vaster trick without smc.
Thanks for the ideas, but most appreciated would be a hint to a line in the code to look at. I know that I just can't flip something like 1 <=> 1 because I don't want the result to be opposite  only under some condition. 

02 Aug 2006, 14:01 

Vasilev Vjacheslav
maybe this could help, it's my code for drawing line without approximation in 32bit dib buffer, coded in masm
Code: _drawhline proc uses ebx esi edi ppvBits:dword,x1:dword,x2:dword,y:dword,wdth:dword,col:dword mov ecx,[x2] mov eax,[x1] cmp ecx,eax jge @F mov edx,eax mov eax,ecx mov ecx,edx @@: mov edx,[y] imul edx,[wdth] mov esi,[ppvBits] add edx,eax cmp eax,ecx lea edi,[esi+edx*4] je @F sub ecx,eax mov eax,[col] rep stosd @@: ret _drawhline endp _drawvline proc ppvBits:dword,x:dword,y1:dword,y2:dword,wdth:dword,col:dword mov eax,[y2] mov esi,[y1] cmp esi,eax jle @F mov ecx,esi mov esi,eax mov eax,ecx cmp esi,eax @@: je @@ret mov ecx,[wdth] mov edi,esi imul edi,ecx add edi,[x] mov edx,ecx mov ecx,[ppvBits] shl edx,2 lea ecx,[ecx+edi*4] sub eax,esi @@: mov esi,[col] mov [ecx],esi add ecx,edx dec eax jne @B @@ret: ret _drawvline endp _drawline proc uses ebx esi edi ppvBits:dword,x1:dword,y1:dword,x2:dword,y2:dword,wdth:dword,col:dword local htmp:dword mov eax,[x1] mov ecx,[y1] mov esi,[x2] mov edi,[y2] mov edx,edi sub esi,eax sub edx,ecx mov [htmp],edx mov [y2],edx jne @@a test esi,esi jne @@b imul ecx,[wdth] mov edx,[ppvBits] add ecx,eax mov eax,[col] mov [edx+ecx*4],eax jmp @@ret @@b: invoke _drawhline,[ppvBits],eax,[x2],ecx,[wdth],[col] jmp @@ret @@a: test esi,esi jne @@c invoke _drawvline,[ppvBits],eax,ecx,edi,[wdth],[col] jmp @@ret @@c: mov eax,[y2] cdq mov ebx,eax xor ebx,edx sub ebx,edx mov eax,esi cdq xor eax,edx sub eax,edx cmp eax,ebx jle @@d cmp [y2],0 jle @F xor eax,eax inc eax jmp @@e @@: or eax,1 neg [y2] @@e: mov edx,[y2] sar edx,1 test esi,esi jl @@f mov edi,[x1] mov ebx,edi cmp ebx,[x2] mov [y1],edi je @@ret imul ecx,[wdth] imul eax,[wdth] mov [wdth],eax @@lp: add edx,[y2] mov ebx,[ppvBits] cmp edx,esi lea eax,[ecx+edi] mov edi,[col] mov [ebx+eax*4],edi jle @F add ecx,[wdth] sub edx,esi @@: mov edi,[y1] inc edi cmp edi,[x2] mov [y1],edi jne @@lp jmp @@ret @@f: mov ecx,[x2] neg [y2] neg esi cmp ecx,[x1] mov [y1],ecx je @@ret imul edi,[wdth] imul eax,[wdth] mov [wdth],eax @@lp1: mov eax,[y1] sub edx,[y2] mov ecx,[col] mov ebx,[ppvBits] add eax,edi cmp edx,esi mov [ebx+eax*4],ecx jle @F sub edi,[wdth] sub edx,esi @@: inc [y1] mov eax,[y1] cmp eax,[x1] jne @@lp1 jmp @@ret @@d: test esi,esi jle @@g mov [y1],1 jmp @@h @@g: or [y1],1 neg esi @@h: mov edx,esi sar edx,1 cmp [y2],0 jl @@j cmp ecx,edi mov eax,[x1] mov [x1],eax je @@ret mov eax,[htmp] imul ecx,[wdth] mov [x2],eax @@lp2: mov eax,[x1] mov edi,[col] mov ebx,[ppvBits] add eax,ecx add edx,esi cmp edx,[y2] mov [ebx+eax*4],edi jle @F mov eax,[y1] add [x1],eax sub edx,[y2] @@: add ecx,[wdth] dec [x2] jne @@lp2 jmp @@ret @@j: mov eax,[x2] neg [y2] neg esi cmp edi,ecx mov [x1],eax je @@ret mov eax,edi imul eax,[wdth] sub ecx,edi mov [x2],ecx @@lp3: mov ecx,[x1] mov edi,[col] mov ebx,[ppvBits] add ecx,eax sub edx,esi cmp edx,[y2] mov [ebx+ecx*4],edi jle @F mov ecx,[y1] sub [x1],ecx sub edx,[y2] @@: add eax,[wdth] dec [x2] jne @@lp3 @@ret: ret _drawline endp 

02 Aug 2006, 18:54 

f0dder
If you're going for quality and not just speed, have a look at http://en.wikipedia.org/wiki/Xiaolin_Wu%27s_line_algorithm .


02 Aug 2006, 19:37 

shoorick
sub rex,rax ;rax=x1
sbb rhx,rhx xor rex,rhx is it really abs?  "xor rex,1" means "not rex", should be "neg rex", or "not rex/inc rex" 

03 Aug 2006, 06:26 

shoorick
xor ecx,ecx
sub eax,edx sets cl sbb ebx,ebx xor eax,eax add eax,ecx  if it has sense 

03 Aug 2006, 06:33 

Madis731
Code: mov rex,rcx ;rbx=x2 sub rex,rax ;rax=x1 sbb rhx,rhx xor rex,rhx ; This isn't ABS() but NOT() as you mentioned sub rex,rhx ;...but this line will increase it by one if necessary This is tested and works exactly the same as: Code: mov rex,rcx ;rbx=x2 sub rex,rax ;rax=x1 jnc @f neg rex @@: @Fodder: Yeah, I already researched this. It will be fun to try that out in the future and I already have a plan to FSAA with integeronly method by taking different colours as input (if I'm at the edge of colour transition). 

03 Aug 2006, 07:51 

shoorick
got it. then next question  what these "ors" mean:
or rsi,1 or rdi,1 cmp rcx,rax (i do not have 64bit cpu, nor familiar with 64bit modes) == no, got it  it is conditional 1/1 assuming (what advantage against mov reg,1?) 

03 Aug 2006, 08:11 

Madis731
or rsi,1 is shorter and again :$ tested with other conditional methods and
works the same way. OR RSI,1 is shorter 5 vs. 7 bytes (MOV RSI,1)... ...hmm now when I think of it there IS NO advantage I'm still stuck with 32bit there it was 3 vs. 5 bytes. Gain has dropped from 40% to 29%. 

03 Aug 2006, 09:19 

shoorick
i can not switch to this task fully to got it in whole (busy  hehe)
but next: i think, your asm equivalent does not match csource exactly. it seems to me there is this difference: if (x2 > x1) // but in originall csource >= { xinc1 = 1; xinc2 = 1; } else // The xvalues are decreasing { xinc1 = 1; xinc2 = 1 } === not sure i'm right, but also not sure about your comments (where which x and y stored in real): 1. mov rex,rcx ;rbx=x2 sub rex,rax ;rax=x1 2. if (x2 >= x1) // The xvalues are increasing .. sub rfx,rbx ;rcx=y1 .. cmp rcx,rax ; is it x2 to x1 comparing? 

03 Aug 2006, 09:42 

Madis731
It goes like this:
if (x2 >= x1) will translate to "cmp x2,x1 \ jc else" pair because if x2 is greater or equal, then x2x1 won't result in a carry. If x2 is less, then it results in a carry and will continue with that "else" which is the increments remain the same. Otherwise they will be negated. I want to go through my theory  if the problem WOULD be in these conditions then the line would look wrong in the near 0 or near infinity steeps which is not the case. Because its doing something wrong in the 1 to 0 region its has got to be hiding somewhere in the "There are more xvalues than yvalues" or the other similar part (dy>dx)... @Vasilev: I started munching through your code, but I can't figure it out is it Bresenham or some modified version of it. There seems to be a lot of jumps and code, but I think its because of the memory access. I have some advantage to use 15 registers. EDIT: Theory2 Because the steeps are okay, but only the direction is off and in the drawing part it only increments nominator and denominator then I think it can be fixed when starting point is fixed. swap(x1,x2), swap(y1,y2) 

03 Aug 2006, 10:01 

shoorick
ok, agree. then had not you mixed y1 and x2:
mov rex,rcx ;rbx=x2 ... sub rfx,rbx ;rcx=y1 ? maybe, more tests will show you more problems? 

03 Aug 2006, 10:13 

Goto page 1, 2 Next < Last Thread  Next Thread > 
Forum Rules:

Copyright © 19992020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.