flat assembler
Message board for the users of flat assembler.

Rewrite GDI32 function

Kazyaka 21 Nov 2011, 16:42
Hello,
I have been studying bitmaps for some time. I'm using GDI32.dll, and I wonder whether it is possible to rewrite one function from this library (written in C), along with a few of its subfunctions, directly in my program. I expect that would speed it up a lot! I searched for the GDI32 source code and found this: http://source.winehq.org/WineAPI/gdi32.html
The function I want to rewrite is SetDIBitsToDevice.

What do you think of my idea? Maybe someone knows a faster method of displaying a bitmap on screen from a bit array?
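
For reference, here is a minimal sketch in C of what displaying a bit array with SetDIBitsToDevice can look like. It is not code from this thread: the helper names fill_pixels and show_pixels, the 32-bit 0x00RRGGBB pixel layout, and the top-down row order are all assumptions made for illustration. The Windows-only part is guarded by _WIN32 so the buffer-filling part compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: fill a top-down 32-bpp buffer, one 0x00RRGGBB
   value per pixel. The pattern (red from x, green from y) is arbitrary
   test data. */
static void fill_pixels(uint32_t *pixels, int width, int height)
{
    assert(pixels != NULL && width > 0 && height > 0);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            pixels[(size_t)y * width + x] =
                ((uint32_t)(x & 0xFF) << 16) | ((uint32_t)(y & 0xFF) << 8);
}

#ifdef _WIN32
/* Hypothetical helper: copy the whole buffer to the top-left corner of a
   device context. Returns the number of scan lines set, 0 on failure. */
static int show_pixels(HDC hdc, const uint32_t *pixels, int width, int height)
{
    BITMAPINFO bmi;
    ZeroMemory(&bmi, sizeof bmi);
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = -height;  /* negative: rows run top-down */
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;
    bmi.bmiHeader.biCompression = BI_RGB;

    return SetDIBitsToDevice(hdc,
                             0, 0,              /* destination x, y     */
                             (DWORD)width, (DWORD)height,
                             0, 0,              /* source x, y          */
                             0, (UINT)height,   /* first scan, count    */
                             pixels, &bmi, DIB_RGB_COLORS);
}
#endif
```

On Windows, show_pixels would typically be called from a WM_PAINT handler with the HDC returned by BeginPaint. Everything the call needs is the pixel array and a filled-in BITMAPINFO, so there is little work left on the application side to remove.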

AsmGuru62 21 Nov 2011, 17:09
I think GDI takes advantage of hardware acceleration. How could you be faster than that?

Kazyaka 21 Nov 2011, 19:48
This is how it works now:
My app -> GDI32.DLL -> hardware acceleration
My idea is:
My app -> hardware acceleration

Is it possible? Can you explain to me how libraries use the hardware?

#Edit:

I looked for a simple function and analyzed Sleep (Kernel32). Calling it looks like this:
My app -> Sleep (Kernel32) -> SleepEx (Kernel32) -> NtDelayExecution (Ntdll) -> ...
The path is long, so I think we can speed it up.

AsmGuru62 22 Nov 2011, 14:44
It is most likely possible, but I've never done it, so I can't advise on that.
Personally, I find the simple BitBlt() API fast enough.

Realistically, how many CPU cycles can you save by going to the hardware without GDI32?

Kazyaka 22 Nov 2011, 14:54
@AsmGuru62

I need to run a speed test to find out how many cycles I can save. I think I'll need help from someone experienced.
I prefer SetDIBitsToDevice. It's the fastest method of displaying a bitmap (using GDI32).

pabloreda 22 Nov 2011, 18:45
I use SetDIBitsToDevice and it is really fast, BUT if you find a better approach, count on me for testing.
Under WINE (Linux) I have a speed problem, but I'm not sure whether this call is the cause.
If you'd like to go deeper, take a look at http://www.directfb.org/index.php

f0dder 22 Nov 2011, 21:53
Kazyaka wrote:
I looked for a simple function and analyzed Sleep (Kernel32). Calling it looks like this:
My app -> Sleep (Kernel32) -> SleepEx (Kernel32) -> NtDelayExecution (Ntdll) -> ...
The path is long, so I think we can speed it up.
You're looking to shave off a few instructions (saving, at most, some microseconds) when making a call to suspend your thread, with a resolution in milliseconds? I hope you can see the absurdity of that :)

Instead of approaching optimization with "Oh, this theoretically seems to have a lot of overhead, let me try optimizing right away!", you should measure where your performance bottlenecks are and start optimizing those. If it turns out you spend any substantial time in SetDIBitsToDevice(), you probably aren't doing much at all in your own code.
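
A rough sketch of the "measure first" step in portable C, assuming nothing about the original poster's program: process_frame is a hypothetical stand-in for the application's own per-frame work, and average_seconds is an illustrative helper built on the standard clock() function. The same harness could be pointed at a wrapper around the drawing call to compare the two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for "your own code": inverts the colour of
   every 0x00RRGGBB pixel in a frame buffer. */
static void process_frame(uint32_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        pixels[i] = ~pixels[i] & 0x00FFFFFFu;
}

/* Call fn `runs` times and return the average time per call in seconds,
   as reported by clock(). Averaging over many runs smooths out the
   coarse resolution of the timer. */
static double average_seconds(void (*fn)(uint32_t *, size_t),
                              uint32_t *pixels, size_t count, int runs)
{
    assert(fn != NULL && runs > 0);
    clock_t start = clock();
    for (int i = 0; i < runs; ++i)
        fn(pixels, count);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / runs;
}
```

If the time measured for the application's own work dwarfs the time measured for the drawing call, rewriting the drawing call cannot buy much.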

Kazyaka 23 Nov 2011, 15:09
Thanks for your comments. They gave me a lot to think about.

@f0dder
About the delay with micro- and milliseconds: it was only a simple example. I won't use it.

If anyone has something to add, feel free to post.

f0dder 23 Nov 2011, 16:08
Kazyaka wrote:
About the delay with micro- and milliseconds: it was only a simple example. I won't use it.
:)

It's also relevant for other situations, though, where it might seem more reasonable to optimize code; for instance, *A versus *W API calls. When you call an *A API, it converts your ASCII to UTF-16, calls the *W version, and converts the results (if any) back to ASCII. This seems pretty wasteful, but in real-life situations you're unlikely to be able to benchmark any noticeable speed difference.

So if you have working ASCII code, don't spend time rewriting it to Unicode for speed benefits - at least not without benchmarking your code and making sure that the A->W->A overhead is measurable and large enough to warrant the time spent. (But do consider doing it anyway for internationalization reasons, if applicable :)).
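
To give a feel for the kind of work the A->W step involves, here is a small portable C sketch that widens a 7-bit ASCII string to UTF-16 code units. It is only an illustration under stated assumptions: ascii_to_utf16 is a hypothetical helper, not the routine Windows uses internally, and it handles pure ASCII only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen a 7-bit ASCII string to UTF-16 code units.
   Returns the number of units written (excluding the terminator), or
   (size_t)-1 if the input is not pure ASCII or does not fit in dst. */
static size_t ascii_to_utf16(const char *src, uint16_t *dst, size_t dst_cap)
{
    if (dst_cap == 0)
        return (size_t)-1;

    size_t n = 0;
    for (; src[n] != '\0'; ++n) {
        unsigned char c = (unsigned char)src[n];
        if (c > 0x7F || n + 1 >= dst_cap)   /* keep room for terminator */
            return (size_t)-1;
        dst[n] = c;                         /* ASCII maps 1:1 to UTF-16 */
    }
    dst[n] = 0;
    return n;
}
```

The cost is one pass over the string per call, which is why it tends to disappear next to whatever the API call itself does.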

