flat assembler
Message board for the users of flat assembler.

the best optimized threading method in win32/64

sleepsleep



Joined: 05 Oct 2006
Posts: 13176
sleepsleep 07 Aug 2009, 22:49
oh yeah..
i've been in a coding mood lately,

wanna get your ideas on which threading api/method is the best and most optimized on a win32/64 system.

assume i'm gonna handle a userbase like google / yahoo / bing. (sort of)
asmcoder



Joined: 02 Jun 2008
Posts: 784
asmcoder 07 Aug 2009, 23:04
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:47; edited 1 time in total
r22



Joined: 27 Dec 2004
Posts: 805
r22 07 Aug 2009, 23:28
The best approach for multi-threading depends heavily on the situation.

Two things hinder threads
- Resources
A processor with one core wouldn't get any work done if it had 100,000 threads running on it, because the context switches between the threads would eat up all the processing time.
- Concurrency
If you need serial access to an object, threads will have to block/wait for their turn. Having most of your threads waiting all the time is not very efficient.
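
To illustrate the concurrency point, here is a minimal sketch (in Python purely for brevity; `parallel_sum` and its names are made up for this example). Each thread does its work on private data and enters the serial section only once, to merge its result, so threads spend very little time waiting for their turn.

```python
import threading

def parallel_sum(chunks):
    """Sum all numbers in `chunks`, using one thread per chunk."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)      # private to this thread: no waiting here
        with lock:                # the only serial section, entered once per thread
            total += partial

    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

If every single addition took the lock instead, the threads would mostly be queuing behind each other, which is the inefficient case described above.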

The most widely used method for multi-threading is Thread Pooling. In thread pooling you would have a queue of jobs/requests and a list of threads. Depending on load and memory (resources) your list of threads could dynamically re-size itself. Usually the jobs/requests are self-contained (they don't rely on serial access, such as writing to the same memory location).

Each thread in the list would dequeue a job/request and process it. There would obviously be a thread dedicated to adding work to the queue, or even dispatching the job/request directly to a currently free thread.
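
A minimal sketch of such a pool, assuming a fixed number of threads (the dynamic re-sizing mentioned above is left out) and with the calling thread playing the dispatcher. It is written in Python only to keep it short; `run_pool` and its parameters are hypothetical names. Note that the global interpreter lock in standard CPython keeps CPU-bound Python code from truly running in parallel, so this shows the structure rather than the speedup.

```python
import queue
import threading

def run_pool(jobs, num_threads=4):
    """Run each callable in `jobs` on a fixed pool of worker threads."""
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = work.get()          # blocks until a job (or a sentinel) arrives
            if job is None:           # sentinel: no more work for this thread
                break
            value = job()             # jobs are self-contained, no shared state
            with results_lock:        # only the result list needs serial access
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for job in jobs:                  # the calling thread acts as the dispatcher
        work.put(job)
    for _ in threads:                 # one sentinel per worker
        work.put(None)
    for t in threads:
        t.join()
    return results
```

Results come back in completion order, not submission order, so sort or tag them if the order matters.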
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 08 Aug 2009, 15:07
Best optimized method: design a tree with "jobs", a job being a pointer to a function.

Every job on the same level can be done in parallel. Create as many threads as there are cores/processors on your PC (there are APIs to query that). When a job finishes and there is another job that hasn't been started yet, use the current thread (the one that just finished) to start that new job.
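
As a rough sketch of that idea for a single level of independent jobs (Python for brevity; `run_on_all_cores` is a made-up name): one thread per core, and each thread that finishes a job claims the next unstarted one. `os.cpu_count()` reports logical processors; on Win32 itself the same number is available from `GetSystemInfo` (`dwNumberOfProcessors`).

```python
import os
import threading

def run_on_all_cores(jobs):
    """Run a list of independent callables using one thread per core."""
    num_threads = os.cpu_count() or 1     # cpu_count() may return None
    next_index = 0                        # index of the next unstarted job
    lock = threading.Lock()
    results = [None] * len(jobs)

    def worker():
        nonlocal next_index
        while True:
            with lock:                    # claim the next unstarted job
                if next_index >= len(jobs):
                    return                # nothing left to start
                i = next_index
                next_index += 1
            results[i] = jobs[i]()        # run it outside the lock

    threads = [threading.Thread(target=worker)
               for _ in range(min(num_threads, len(jobs)))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The job list itself never changes with the machine; only `num_threads` does.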

Then there are jobs deeper in the tree which depend on the previous jobs. For instance, a diagram:

Code:
job0
    \
     job20
     job21
     job22
    /
job1
job2
    \
     job30
     job31
job3
    \
     job40    
This is to be read as follows.
job0, job1, job2 and job3 are all on the first level. They are independent of one another, so they can be executed at the same time. With 4 cores, start them all at once. With 2 cores, start 2 of them, and as each core finishes its job, it starts one of the others. For instance, if job1 takes 3 times as long as job0, then the core executing job0 will start job2, and probably job3 too (if both are short), before the second core even finishes job1.

Naturally, for best results you should split the jobs so that each of them takes approximately the same time, unless their subjobs don't depend on each other.

job20, job21 and job22 are dependent on job0 and job1, so they can't be executed unless job0 and job1 are finished.
job30 and job31 depend on job2 only, and job40 only on job3 (in this case it's not even worth making job40 a separate job: since it's alone, you can put it directly in job3 ;) )
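
The dependency rules above can be sketched like this (Python for brevity; `DEPS` is a transcription of the diagram, while `run_job_tree` and everything inside it are hypothetical names, and the sketch assumes the dependencies contain no cycles). A job is started only once everything it depends on has finished; a thread with nothing runnable waits until some job completes.

```python
import os
import threading

# Transcribed from the diagram: job name -> jobs that must finish first.
DEPS = {
    "job0": [], "job1": [], "job2": [], "job3": [],
    "job20": ["job0", "job1"],
    "job21": ["job0", "job1"],
    "job22": ["job0", "job1"],
    "job30": ["job2"],
    "job31": ["job2"],
    "job40": ["job3"],
}

def run_job_tree(jobs, deps, num_threads=None):
    """jobs: name -> callable. Returns the order in which jobs finished."""
    num_threads = num_threads or os.cpu_count() or 1
    cond = threading.Condition()
    pending = set(jobs)                   # jobs not started yet
    finished = set()
    order = []

    def take_ready():                     # caller must hold `cond`
        for name in sorted(pending):
            if all(d in finished for d in deps.get(name, ())):
                pending.remove(name)
                return name
        return None

    def worker():
        while True:
            with cond:
                while True:
                    if not pending:
                        return            # every job has been started
                    name = take_ready()
                    if name is not None:
                        break
                    cond.wait()           # nothing runnable: wait for a finish
            jobs[name]()                  # run the job outside the lock
            with cond:
                finished.add(name)
                order.append(name)
                cond.notify_all()         # dependents may be runnable now

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order
```

Called as `run_job_tree({name: some_function for name in DEPS}, DEPS)`, it runs the same job table on any number of threads.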

The jobs themselves do NOT change depending on the computer specs. What changes is just the number of threads. That's what makes this approach so universal. :)

You can do this with an orderly tree of pointers to functions. Each "job" is just a pointer to the respective function in the data structure.

Good luck optimizing it :)


EDIT: REAL WORLD EXAMPLE FOR A SIMPLE GAME (in software rendering, no GPU)

Code:
graphics
        \
         sorting------------------------------------------------------------*postprocessing
         splitscreen                                                       /
                    \                                                     /
                     quarter1\                                           /
                     quarter2 *-----------------------------------------*
                     quarter3/                                         /
                     ...                                              /
                     quarterx (number of threads you want to assign) /

gameplay
        \
         time prediction
         input
         physics and collisions
         other stuff

audio
     \
      sound sources
                   \
                    source1\
                    source2 *--*--*--*postprocessing
                    source3/  /  /  /
                    ...      /  /  /
                    sourcex /  /  /
      ambient-----------------/  /
      music---------------------/    
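
One way to turn a diagram like this into "what can run in parallel right now" is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) on the graphics branch only. The `GRAPHICS` table is one reading of the diagram, with three quarters assumed for illustration, and `parallel_waves` is a made-up helper name.

```python
from graphlib import TopologicalSorter

# One reading of the graphics branch: job name -> jobs that must finish first.
# Three quarters are an assumption; the diagram leaves the count open.
GRAPHICS = {
    "sorting": ["graphics"],
    "splitscreen": ["graphics"],
    "quarter1": ["splitscreen"],
    "quarter2": ["splitscreen"],
    "quarter3": ["splitscreen"],
    "postprocessing": ["sorting", "quarter1", "quarter2", "quarter3"],
}

def parallel_waves(deps):
    """Group jobs into waves; all jobs in one wave can run at the same time."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()                      # raises CycleError on circular deps
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)               # pretend the whole wave has finished
    return waves

for wave in parallel_waves(GRAPHICS):
    print(wave)
```

Waves are a simplification: a real scheduler would start postprocessing as soon as its own inputs are done, without waiting for unrelated jobs that happen to sit in the same wave.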

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.