flat assembler
Message board for the users of flat assembler.
![]() Goto page Previous 1, 2, 3 Next |
Author |
|
LocoDelAssembly 10 Oct 2008, 01:48
bitRAKE, that one seems to be the concurrent queue from the article I posted above implemented in x86, cool
![]() Thanks again bitRAKE!! |
|||
![]() |
|
r22 10 Oct 2008, 20:29
It took two years, but this thread is turning out well.
(First Post = 23 Sep 2006, 00:20) When you need to consider concurrent Removes and Adds at the same time it certainly heightens the difficulty. Of course using a mutex is the generally anointed solution and it seems to work well. But because this is an ASM forum some of us try to search for the fastest/most optimized solution. But when you work towards optimization you always end up at the 'usage cases'. If you know how something will be used you can improve it's efficiency. So will a mutex be faster in some cases (or all cases) or will a lock free implementation be superior. |
|||
![]() |
|
revolution 10 Oct 2008, 21:36
I prefer to use semaphores when I implement queues.
|
|||
![]() |
|
LocoDelAssembly 10 Oct 2008, 22:10
Quote:
Why not mutexes or critical sections? |
|||
![]() |
|
revolution 10 Oct 2008, 22:19
Semaphores have a built in the counter so that after the wait returns the code uses a critical section to lock the queue and then insert the data into the queue. I found the mutex to require an extra step once we get the mutex we have to check to see of the queue has a free slot before inserting the data.
|
|||
![]() |
|
LocoDelAssembly 10 Oct 2008, 22:46
Ah, it wasn't a semaphores-only solution then. The extra step is sometimes prefered anyway revolution, because if the semaphore reaches zero, the next enqueuer will get blocked waiting an indefinite time, something not desirable as a general solution. If there is too much work enqueued perhaps it is time to report the problem to user rather than get hung till someone else dequeue something
![]() It is a matter of problem domain I suppose, but in average I think that it is preferable to get notified of a full queue situation instead of get blocked waiting for space. The handling of this situation could be either reporting the error in whatever form or reallocating the queue to a bigger size, for example. |
|||
![]() |
|
revolution 10 Oct 2008, 22:59
The wait functions have an optional timeout, so you can simply wait for 0ms and get an instant success/failure.
|
|||
![]() |
|
LocoDelAssembly 10 Oct 2008, 23:12
Is that the way you use it? Before using a semaphore that way isn't more appropriate to do the count-checking yourself? (where appropriate means "more efficient in both CPU time and system resources usage").
|
|||
![]() |
|
revolution 10 Oct 2008, 23:15
I doubt you can make it more efficient. After getting a mutex you have to check the queue status and subsequently release the mutex if the queue is full and either wait again or report a failure. A semaphore is just one call and, if you wait more than 0ms, the OS will automatically put the thread in an efficient wait state until there is an available slot.
|
|||
![]() |
|
LocoDelAssembly 10 Oct 2008, 23:31
Quote:
We are talking about the 0 ms of timeout case. It is not wrong the semaphore if it is suitable in your domain anyway (in case that saying that it is not a general purpose solution was considered as "it can't be applied to nothing" ![]() |
|||
![]() |
|
revolution 10 Oct 2008, 23:39
LocoDelAssembly wrote: It is not wrong the semaphore if it is suitable in your domain anyway (in case that saying that it is not a general purpose solution was considered as "it can't be applied to nothing" ![]() |
|||
![]() |
|
LocoDelAssembly 10 Oct 2008, 23:51
Just that I'm not saying it is wrong passing through a semaphore first, BUT that it is not always a good idea to use it if the thread could use its time doing something else than just waiting.
|
|||
![]() |
|
bitRAKE 11 Oct 2008, 00:51
Keeping threads busy is easier if the tasks can be completed in any order:
Code: GetTask: mov edx,[tasks] mov ecx,[task_count] ; task_count > 0 xor eax,eax .scan: xchg [edx+ecx*4-4],eax test eax,eax jne .okay dec ecx jne .scan ; fail with EAX/ECX zero, ZF=1 .okay: retn Last edited by bitRAKE on 12 Oct 2008, 03:28; edited 1 time in total |
|||
![]() |
|
r22 12 Oct 2008, 02:43
@bitRAKE
That's a novel concurrent stack POP snippet. But if task_count is originally 0 it'll fail. |
|||
![]() |
|
bitRAKE 12 Oct 2008, 07:15
r22 wrote: @bitRAKE Anyone tried to code the complementing PUSH? It's a little tricky. |
|||
![]() |
|
LocoDelAssembly 12 Oct 2008, 15:41
Here my (untested) attempt
Code: PutTask: ; >EDX = task ; <EAX = 0 on FAIL push ebx mov ecx, [task_count] ; Which means array size I suppose mov ebx, [tasks] lea ebx, [ebx+ecx*4] neg ecx .scan: xor eax, eax lock cmpxchg [ebx+ecx*4], edx je .exit inc ecx jnz .scan .exit: mov eax, ecx pop ebx ret |
|||
![]() |
|
r22 12 Oct 2008, 17:02
Concurrent Array Stack v0.5
If you PUSH 0 to this stack you'll be sorry when you try to POP it ![]() Code: .data _STACK_PTR dd 0 _STACK_MAX = 32 _STACK rd 32 .code STACK_PUSH: ;;esp+4 = item to add ;; eax = 0 FAIL ;; eax != 0 SUCCESS mov eax,[_STACK_PTR] mov edx,[esp+4] ;;new item ;;update the pointer .update: lea ecx,[eax + 4] ;;test overflow cmp ecx,_STACK_MAX ja .overflow lock cmpxchg [_STACK_PTR],ecx jnz .update ;;add the new element mov [_STACK + eax],edx inc eax ;;success ret 4 .overflow: xor eax,eax ;;overflow ret 4 STACK_POP: mov eax,[_STACK_PTR] .update: lea ecx,[eax-4] ;;test underflow cmp ecx,-4 je .underflow lock cmpxchg [_STACK_PTR],ecx jnz .update ;;get stack element xor ecx,ecx ;;get element wait for PUSH to complete .again: xchg [_STACK + eax - 4],ecx test ecx,ecx jz .again mov eax,ecx ret 0 .underflow: xor eax,eax ret 0 |
|||
![]() |
|
bitRAKE 14 Oct 2008, 05:47
Code: SetTask: mov edx,[tasks] mov ecx,[edx] ; array length > 0 .scan: xchg [edx+ecx*4],eax test eax,eax je .okay dec ecx jne .scan ; fail with EAX non-zero, some task unqueued! .okay: retn GetTask: mov edx,[tasks] mov ecx,[edx] ; array length > 0 .scan: xchg [edx+ecx*4],eax test eax,eax jne .okay dec ecx jne .scan ; fail with EAX zero ; success for EAX non-zero .okay: retn ![]() |
|||
![]() |
|
r22 15 Oct 2008, 01:25
Concurrency with just XCHG, I like bitRAKEs task list.
Because its O(n) on the Set and Get it makes me wonder at what point (what task list size) will the task list become slower than the concurrent stack 'v0.5'. Because of the latency on the LOCKed CMPXCHG opcode in the concurrent stack I think the task list would probably outperform it until around 16+ entries. Which means as long as you have more GETters than SETters you could probably get away with a small fast implementation like the task list. But again it comes full circle back to use cases. As an aside, WE should probably develop a library (in FASM) with concurrent ADTs (i.e. list, stack, queue, tree, hash map). I suspect concurrency and threading will only become more popular with Intel's new architecture 6x and more cores coming down the line. And although it's fun to reinvent the wheel, reinventing the MULTI-threaded wheel is an order of magnitude more difficult (as this thread and others have highlighted). |
|||
![]() |
|
Goto page Previous 1, 2, 3 Next < Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.