Message board for the users of flat assembler.
the best optimized threading method in win32/64
I'm in a coding mood lately,
and I'd like your ideas on which threading API/method is best and most optimized on win32/64 systems.
Assume I'm going to handle a userbase like Google / Yahoo / Bing (sort of).
07 Aug 2009, 22:49
Last edited by asmcoder on 14 Aug 2009, 14:47; edited 1 time in total
07 Aug 2009, 23:04
The best approach for multi-threading depends heavily on the situation.
Two things hinder threads:
1. Context switching. A processor with one core wouldn't get any work done if it had 100,000 threads running on it, because the context switches between the threads would eat up all the processing time.
2. Serialized access. If you need serial access to an object, threads will have to block/wait for their turn. Having most of your threads waiting most of the time is not very efficient.
One of the most widely used methods for multi-threading is thread pooling. In thread pooling you have a queue of jobs/requests and a list of threads. Depending on load and memory (resources), the list of threads can dynamically resize itself. Usually the jobs/requests are self-contained (they don't rely on serial access, such as writing to the same memory location).
Each thread in the list dequeues a job/request and processes it. There is also a thread dedicated to adding work to the queue, or even dispatching the job/request directly to a currently free thread.
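To make the queue-plus-workers idea concrete, here is a minimal sketch of a thread pool. It is only an illustration under some assumptions: it uses portable C++ (std::thread) rather than the Win32 thread API, the pool has a fixed size (no dynamic resizing), and the names ThreadPool, enqueue and squares_with_pool are made up for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal fixed-size thread pool (illustrative names only).
class ThreadPool {
public:
    explicit ThreadPool(std::size_t thread_count) {
        assert(thread_count > 0);
        for (std::size_t i = 0; i < thread_count; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }

    // Lets the workers drain the remaining jobs, then joins them.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : workers_) t.join();
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    // Producer side: add one self-contained job to the queue.
    void enqueue(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    // Each worker repeatedly dequeues a job and processes it.
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run the job outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Usage sketch: every job writes only to its own slot, so no job needs
// serial access to shared data.
std::vector<int> squares_with_pool(int n, std::size_t thread_count) {
    std::vector<int> out(static_cast<std::size_t>(n), 0);
    {
        ThreadPool pool(thread_count);
        for (int i = 0; i < n; ++i)
            pool.enqueue([&out, i] { out[static_cast<std::size_t>(i)] = i * i; });
    }  // the destructor returns only after all queued jobs have run
    return out;
}
```

Calling squares_with_pool(100, 4) would queue 100 small jobs on 4 worker threads. The jobs are deliberately independent of each other, which is the case where a pool like this works best.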
07 Aug 2009, 23:28
Best optimized method: design a tree of "jobs", a job being a pointer to a function.
Every job on the same level can be done in parallel. Create as many threads as there are cores/processors on your PC (there are APIs to query that, e.g. GetSystemInfo on Windows). When a thread finishes a job and there is another job that hasn't been started yet, reuse that thread to start the new job.
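As a rough sketch of that scheduling rule for a single level of the tree: one thread per core, and whichever thread finishes first grabs the next job that hasn't been started. This is an assumption-laden illustration, not a definitive implementation. It uses portable C++ (std::thread::hardware_concurrency) instead of a Windows API, and the names Job, JobEntry, run_level and double_in_place are invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// A job is a pointer to a function; the argument pointer is an assumption
// added here so that each job can carry its own data.
using Job = void (*)(void*);

struct JobEntry {
    Job fn;
    void* arg;
};

// Runs all jobs of one level. thread_count == 0 means "one thread per core".
void run_level(const std::vector<JobEntry>& jobs, unsigned thread_count = 0) {
    if (thread_count == 0) thread_count = std::thread::hardware_concurrency();
    if (thread_count == 0) thread_count = 1;  // the query is allowed to return 0

    std::atomic<std::size_t> next{0};  // index of the next job not yet started

    auto worker = [&jobs, &next] {
        for (;;) {
            const std::size_t i = next.fetch_add(1);  // claim one job
            if (i >= jobs.size()) return;             // nothing left on this level
            jobs[i].fn(jobs[i].arg);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 1; t < thread_count; ++t) threads.emplace_back(worker);
    worker();  // the calling thread takes part as well
    for (std::thread& t : threads) t.join();
}

// Hypothetical demo job: doubles the int it is pointed at.
void double_in_place(void* p) {
    int* value = static_cast<int*>(p);
    *value *= 2;
}
```

With this scheme a thread that gets a short job simply comes back for another one, so uneven job lengths balance out on their own as long as there are more jobs than threads.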
Then there are jobs deeper in the tree which depend on the previous jobs. For instance, a diagram:

job0
    \
     job20 job21 job22
    /
job1

job2
    \
     job30 job31

job3
    \
     job40

This is to be read as follows.
job0, job1, job2 and job3 are all on the first level. They are independent of one another, so they can be executed at the same time. If you have 4 cores, start them all at once. If you have 2 cores, start 2 of them, and as each core finishes its job, it starts one of the others. For instance, if job1 takes 3 times as long as job0, then the core executing job0 will start job2 and probably job3 (if they are both short) before the second core even finishes job1.
Naturally, for best results you should split the jobs so that each of them takes approximately the same time, unless their subjobs don't depend on each other.
job20, job21 and job22 are dependent on job0 and job1, so they can't be executed until both job0 and job1 have finished.
job30 and job31 are dependent on job2 only, and job40 only on job3 (in this case, it's not even worth making job40 a separate job, since it's alone and you can put it directly in job3).
The jobs themselves do NOT change depending on the computer specs. What changes is just the number of threads. That's what makes this approach so universal.
You can do this with an orderly tree of pointers to functions. Each "job" is just a pointer to the respective function in the data structure.
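Here is one possible sketch of such a structure, assuming the jobs form a graph without cycles. Each node holds a function pointer, the list of jobs that wait on it, and a count of prerequisites; a job becomes ready when its count drops to zero. Everything named here (JobGraph, JobNode, depend, record, run_forum_example, ran_before) is hypothetical, and portable C++ threads stand in for the Win32 API.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

using JobFn = void (*)();  // a job is just a pointer to a function

struct JobNode {
    JobFn fn;
    std::vector<std::size_t> dependents;  // deeper jobs that wait on this one
    std::size_t prerequisites;            // how many jobs this one waits on
};

class JobGraph {
public:
    std::size_t add(JobFn fn) {
        nodes_.push_back(JobNode{fn, {}, 0});
        return nodes_.size() - 1;
    }
    // Declares that `job` may only start after `prerequisite` has finished.
    void depend(std::size_t job, std::size_t prerequisite) {
        nodes_[prerequisite].dependents.push_back(job);
        ++nodes_[job].prerequisites;
    }
    // Runs every job on `thread_count` threads. Assumes there are no cycles.
    void run(unsigned thread_count) const {
        assert(thread_count > 0);
        std::mutex m;
        std::condition_variable cv;
        std::queue<std::size_t> ready;     // jobs whose prerequisites are done
        std::vector<std::size_t> pending;  // unfinished prerequisites per job
        std::size_t remaining = nodes_.size();
        for (std::size_t i = 0; i < nodes_.size(); ++i) {
            pending.push_back(nodes_[i].prerequisites);
            if (pending[i] == 0) ready.push(i);
        }
        auto worker = [&] {
            std::unique_lock<std::mutex> lock(m);
            for (;;) {
                cv.wait(lock, [&] { return remaining == 0 || !ready.empty(); });
                if (remaining == 0) return;
                const std::size_t id = ready.front();
                ready.pop();
                lock.unlock();
                nodes_[id].fn();  // run the job without holding the lock
                lock.lock();
                --remaining;
                for (std::size_t d : nodes_[id].dependents)
                    if (--pending[d] == 0) ready.push(d);
                cv.notify_all();
            }
        };
        std::vector<std::thread> threads;
        for (unsigned t = 0; t < thread_count; ++t) threads.emplace_back(worker);
        for (std::thread& t : threads) t.join();
    }

private:
    std::vector<JobNode> nodes_;
};

// Demo: the jobs only record the order in which they ran.
std::mutex g_log_mutex;
std::vector<int> g_log;
template <int Id> void record() {
    std::lock_guard<std::mutex> lock(g_log_mutex);
    g_log.push_back(Id);
}

// Builds the job0..job40 example from the diagram and runs it.
std::vector<int> run_forum_example(unsigned thread_count) {
    g_log.clear();
    JobGraph g;
    const std::size_t j0 = g.add(record<0>), j1 = g.add(record<1>);
    const std::size_t j2 = g.add(record<2>), j3 = g.add(record<3>);
    const std::size_t deep[] = {g.add(record<20>), g.add(record<21>), g.add(record<22>)};
    for (std::size_t j : deep) { g.depend(j, j0); g.depend(j, j1); }
    g.depend(g.add(record<30>), j2);
    g.depend(g.add(record<31>), j2);
    g.depend(g.add(record<40>), j3);
    g.run(thread_count);
    return g_log;
}

bool ran_before(const std::vector<int>& log, int first, int second) {
    return std::find(log.begin(), log.end(), first) <
           std::find(log.begin(), log.end(), second);
}
```

The graph itself never changes with the machine; only the thread_count passed to run does. A single mutex around the ready queue is the simplest choice here and would likely become a bottleneck with many tiny jobs, so treat it as a starting point rather than the optimized version.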
Good luck optimizing it
EDIT: REAL WORLD EXAMPLE FOR A SIMPLE GAME (in software rendering, no GPU)
graphics
    \
     sorting----------------------------------------------------*postprocessing
     splitscreen                                               /
        \                                                     /
         quarter1\                                           /
         quarter2 *-----------------------------------------*
         quarter3/                                         /
         ...                                              /
         quarterx (number of threads you want to assign) /

gameplay
    \
     time prediction
     input
     physics and collisions
     other stuff

audio
    \
     sound sources
        \
         source1\
         source2 *--*--*--*postprocessing
         source3/  /  /  /
         ...      /  /  /
         sourcex /  /  /
     ambient-------/  /
     music-----------/
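The graphics branch above is a fork-join: split the screen into parts, render each part on its own thread, wait for all of them at the '*' join point, then run postprocessing once. Below is a small sketch of just that pattern. The pixel function shade, the inversion used as "postprocessing" and the name render_frame are placeholders invented for illustration, and portable C++ threads are used in place of Win32 calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Placeholder per-pixel work standing in for the real software renderer.
static std::uint8_t shade(std::size_t x, std::size_t y) {
    return static_cast<std::uint8_t>((x + y) & 0xFF);
}

// Renders a width*height frame split into `parts` horizontal strips,
// one thread per strip, followed by a single postprocessing pass.
std::vector<std::uint8_t> render_frame(std::size_t width, std::size_t height,
                                       unsigned parts) {
    assert(parts > 0);
    std::vector<std::uint8_t> frame(width * height);

    std::vector<std::thread> threads;
    for (unsigned p = 0; p < parts; ++p) {
        // Rows [y0, y1) belong to strip p; the strips never overlap.
        const std::size_t y0 = height * p / parts;
        const std::size_t y1 = height * (p + 1) / parts;
        threads.emplace_back([&frame, width, y0, y1] {
            for (std::size_t y = y0; y < y1; ++y)
                for (std::size_t x = 0; x < width; ++x)
                    frame[y * width + x] = shade(x, y);
        });
    }

    // Join point: postprocessing must not start before every strip is done.
    for (std::thread& t : threads) t.join();

    // Placeholder postprocessing: invert every pixel.
    for (std::uint8_t& px : frame) px = static_cast<std::uint8_t>(255 - px);
    return frame;
}
```

Because each strip writes to its own rows, the threads need no locking while rendering, and the result is the same for any value of parts. In a real frame loop you would probably reuse long-lived threads instead of creating new ones every frame.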
08 Aug 2009, 15:07