flat assembler
Message board for the users of flat assembler.

Windows PROC at 7FFE0300h

f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
f0dder
Azu: in other words, you can't come up with any APIs where even modeswitching overhead is a speed problem? Smile

_________________
Image - carpe noctem
Post 06 Nov 2009, 17:50
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
Er.. all of them except the IO ones? Laughing

_________________
Post 06 Nov 2009, 17:56
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
f0dder
Azu wrote:
Er.. all of them except the IO ones? Laughing
Care to list just, say, a handful of the ones you'd say are the worst? I'm not expecting any benchmarking, even guesswork would be fine Smile

_________________
Image - carpe noctem
Post 06 Nov 2009, 18:15
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
GetTickCount (why the hell is this even a function)
memcpy (why the hell is this even a function)
Sleep
ReadProcessMemory

_________________
Post 06 Nov 2009, 18:18
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
f0dder
GetTickCount() doesn't incur a modeswitch, Sleep() has millisecond granularity (whereas modeswitch overhead should be measured in, what, microseconds, even on the slowest CPUs), and memcpy doesn't do modeswitching either.

Why shouldn't GetTickCount() be a function? It's not replaceable by RDTSC, for a number of reasons... I guess you could have made a system tick counter with finer granularity and range and put it in globally mapped read-only memory at a fixed location, but considering how many applications would have use for that... it would have been overkill Smile
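
To make the "counter in globally mapped read-only memory" idea concrete, here is a minimal sketch in C. It is a toy model, not the real Windows layout: `shared_page_t`, its field names, and the 8.24 fixed-point multiplier are assumptions made for illustration. The point is only that a user-mode read of such a page costs two loads, a multiply and a shift, with no ring transition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a read-only page the OS maps into every
   process. Names and layout are illustrative assumptions. */
typedef struct {
    volatile uint32_t tick_count;      /* timer interrupts since boot */
    volatile uint32_t tick_multiplier; /* ms per tick, 8.24 fixed point */
} shared_page_t;

/* Convert a tick period of num/den milliseconds to 8.24 fixed point. */
static uint32_t make_multiplier(uint32_t num, uint32_t den)
{
    assert(den != 0);
    return (uint32_t)(((uint64_t)num << 24) / den);
}

/* User-mode read: two loads, one 64-bit multiply, one shift. */
static uint32_t tick_count_ms(const shared_page_t *page)
{
    uint64_t product = (uint64_t)page->tick_count * page->tick_multiplier;
    return (uint32_t)(product >> 24);
}
```

With a 10 ms tick the multiplier is `make_multiplier(10, 1)`, and 100 ticks read back as 1000 ms. The resolution is still one tick, which is why this kind of counter is cheap but coarse.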

Also, why shouldn't memcpy be a function? While this isn't being done right now, it does allow for the system to decide on optimal routine at runtime depending on CPU features, alignment of buffers and size. For most code, it doesn't matter anyway, and if you have specific needs, you aren't going to be using a generic memcpy anyway.
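
A rough sketch of what "decide on the optimal routine at runtime" could look like in C. Everything here is hypothetical: `my_memcpy`, `pick_copy`, the two routines and the 32-byte threshold are made up for illustration, and a real implementation would probably also look at CPU features (CPUID) rather than just size and alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Plain byte loop: good enough for tiny or unaligned copies. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) *d++ = *s++;
    return dst;
}

/* Moves 8 bytes per iteration, then finishes the tail bytewise.
   The fixed-size memcpy calls avoid aliasing problems; compilers
   typically turn them into a single load and store. */
static void *copy_words(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n >= sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }
    while (n--) *d++ = *s++;
    return dst;
}

/* Choose a routine from buffer alignment and size.
   The threshold of 32 bytes is an arbitrary assumption. */
static copy_fn pick_copy(const void *dst, const void *src, size_t n)
{
    int aligned =
        (((uintptr_t)dst | (uintptr_t)src) % sizeof(uint64_t)) == 0;
    return (n >= 32 && aligned) ? copy_words : copy_bytes;
}

/* The single entry point callers see; the choice is hidden behind it. */
static void *my_memcpy(void *dst, const void *src, size_t n)
{
    return pick_copy(dst, src, n)(dst, src, n);
}
```

Because callers only ever see `my_memcpy`, the selection logic can change without touching them, which is the argument for keeping the copy behind a function in the first place.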

ReadProcessMemory, ho humm... page remapping and TLB overhead et cetera, I doubt modeswitch overhead has a lot of runtime cost. Besides, when would you be running RPM in time-critical code? Smile
Post 06 Nov 2009, 18:23
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
f0dder wrote:
GetTickCount() doesn't incur a modeswitch
I'm not sure what you mean by modeswitch. I was talking about the overhead of reading a variable from some far away memory location, waiting for that read to complete, calling the result, and then returning. If you meant something else please explain.

f0dder wrote:
Sleep() has millisecond granularity
Epic fail right there. Yuck!

f0dder wrote:
memcpy doesn't do modeswitching either.
See above response to GetTickCount()

f0dder wrote:
WriteProcessMemory, ho humm... page remapping and TLB overhead et cetera, I doubt modeswitch overhead has a lot of runtime cost. Besides, when would you be running WPM in time-critical code? Smile
Why do we even have page tables and translation tables? Total waste of silicon. What's wrong with just making code address independent and ditching all that useless bloat???

_________________
Post 06 Nov 2009, 18:30
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17716
Location: In your JS exploiting you and your system
revolution
f0dder wrote:
Sleep() has millisecond granularity
On my XPSP2 box it is set for 10ms timing steps.
Post 06 Nov 2009, 18:37
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
revolution wrote:
f0dder wrote:
Sleep() has millisecond granularity
On my XPSP2 box it is set for 10ms timing steps.



omfg


Evil or Very Mad

_________________
Post 06 Nov 2009, 18:38
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
f0dder
Azu wrote:
I'm not sure what you mean by modeswitch. I was talking about the overhead of reading a variable from some far away memory location, waiting for that read to complete, calling the result, and then returning. If you meant something else please explain.
Ring3->Ring0->Ring3 merry-go-round. The reason I focus on this is that the "call through 0x7FFE0300 pointer indirection" overhead is completely dwarfed by modeswitch overhead, and is thus pretty irrelevant Smile

Azu wrote:
f0dder wrote:
Sleep() has millisecond granularity
Epic fail right there. Yuck!
If Windows had been designed as a realtime OS, I would agree - but it wasn't.

Azu wrote:
f0dder wrote:
memcpy doesn't do modeswitching either.
See above response to GetTickCount()
And see my reply Smile - note that I haven't seen (m)any people use ntdll.memcpy(), HLLs tend to either link to their own (more or less) optimized versions, or inline the code when heuristics deem it profitable.

Azu wrote:
Why do we even have page tables and translation tables? Total waste of silicon. What's wrong with just making code address independent and ditching all that useless bloat???
Used to be primarily for disk paging, today it's primarily useful for process separation and per-page protection bits... this is useful enough, imho Smile
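
For what "process separation and per-page protection bits" buy you, a toy model in C may help. It is deliberately simplified: one flat table instead of x86's multi-level structures, a 64 KiB address space, and no CR0.WP subtleties. The flag bit positions follow the x86 PTE layout (present, writable, user); `translate`, `pte_t` and the result names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12u                      /* 4 KiB pages */
#define PAGE_MASK    ((1u << PAGE_SHIFT) - 1u)
#define NUM_PAGES    16u                      /* toy 64 KiB address space */

#define PTE_PRESENT  0x1u                     /* page is mapped */
#define PTE_WRITABLE 0x2u                     /* writes allowed */
#define PTE_USER     0x4u                     /* ring 3 may touch it */

typedef struct {
    uint32_t frame;                           /* physical frame number */
    uint32_t flags;
} pte_t;

typedef enum { ACCESS_OK, FAULT_NOT_PRESENT, FAULT_PROTECTION } xlate_result;

/* Translate a virtual address through one process's page table,
   applying the checks the MMU would do on every access. */
static xlate_result translate(const pte_t *table, uint32_t vaddr,
                              int is_write, int is_user, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;       /* virtual page number */
    if (vpn >= NUM_PAGES)
        return FAULT_NOT_PRESENT;
    const pte_t *pte = &table[vpn];
    if (!(pte->flags & PTE_PRESENT))
        return FAULT_NOT_PRESENT;
    if (is_user && !(pte->flags & PTE_USER))
        return FAULT_PROTECTION;
    if (is_write && !(pte->flags & PTE_WRITABLE))
        return FAULT_PROTECTION;
    *paddr = (pte->frame << PAGE_SHIFT) | (vaddr & PAGE_MASK);
    return ACCESS_OK;
}
```

Give two processes their own tables and the same virtual address 0x1234 can land in different physical frames, or be writable in one and read-only in the other. Position-independent code alone gives you neither property.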
Post 06 Nov 2009, 18:47
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
f0dder wrote:
Azu wrote:
I'm not sure what you mean by modeswitch. I was talking about the overhead of reading a variable from some far away memory location, waiting for that read to complete, calling the result, and then returning. If you meant something else please explain.
Ring3->Ring0->Ring3 merry-go-round. The reason I focus on this is that the "call through 0x7FFE0300 pointer indirection" overhead is completely dwarfed by modeswitch overhead, and is thus pretty irrelevant Smile
Why should it cost so much to change rings? Does Windows somehow emulate them rather than using the CPU implementation?

f0dder wrote:
Azu wrote:
f0dder wrote:
Sleep() has millisecond granularity
Epic fail right there. Yuck!
If Windows had been designed as a realtime OS, I would agree - but it wasn't.
Why not? Computers are supposed to do things in real time.. their speed is their greatest strength!

f0dder wrote:
Azu wrote:
f0dder wrote:
memcpy doesn't do modeswitching either.
See above response to GetTickCount()
And see my reply Smile - note that I haven't seen (m)any people use ntdll.memcpy(), HLLs tend to either link to their own (more or less) optimized versions, or inline the code when heuristics deem it profitable.
What's it there for then? Useless functions bloating the namespace FTL...

f0dder wrote:
Azu wrote:
Why do we even have page tables and translation tables? Total waste of silicon. What's wrong with just making code address independent and ditching all that useless bloat???
Used to be primarily for disk paging, today it's primarily useful for process separation and per-page protection bits... this is useful enough, imho Smile
Can't they find a way to do that without all this bloated abstraction? If not, they should find a way to do without it, IMHO.. like maybe write secure code that isn't full of buffer overflows.

_________________
Post 06 Nov 2009, 18:53
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Borsuc
@Azu: are you talking about virtual memory? Without it we wouldn't be able to multi-task.


And yeah Sleep sucks so bad with imprecision, I cannot imagine why they chose so though.
Post 06 Nov 2009, 19:22
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
f0dder
Azu wrote:
f0dder wrote:
If Windows had been designed as a realtime OS, I would agree - but it wasn't.
Why not? Computers are supposed to do things in real time.. their speed is their greatest strength!
Making a real-time OS has some trade-offs and complexities. Not needed for what, say, 99.9% of Windows installations are being used for, so why bother? If you need a realtime OS, go QNX or (if you're brave) linux with a realtime scheduler.

You can get better Sleep() precision by doing timeBeginPeriod()... but if you need that, you should probably be using the multimedia timers instead. Understand your platform, and use the APIs that fit your needs Smile
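
As a back-of-the-envelope model of why the timer period matters, assume a sleeping thread is only woken on a timer tick, so its wake-up is rounded up to the next tick boundary after the requested deadline. That is a simplification of what the scheduler really does, and `wake_time_ms` / `actual_sleep_ms` are hypothetical helpers, not Windows APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Time of the first tick boundary at or after the requested deadline. */
static uint32_t wake_time_ms(uint32_t now_ms, uint32_t request_ms,
                             uint32_t tick_ms)
{
    assert(tick_ms > 0);
    uint32_t deadline = now_ms + request_ms;
    uint32_t ticks = (deadline + tick_ms - 1) / tick_ms; /* ceiling */
    return ticks * tick_ms;
}

/* How long the caller was actually away under this model. */
static uint32_t actual_sleep_ms(uint32_t now_ms, uint32_t request_ms,
                                uint32_t tick_ms)
{
    return wake_time_ms(now_ms, request_ms, tick_ms) - now_ms;
}
```

Under this model, a 1 ms request made at t = 3 ms with a 10 ms tick comes back after 7 ms, while the same request with a 1 ms tick (the kind of period timeBeginPeriod() is used to ask for) comes back after 1 ms.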

Azu wrote:
f0dder wrote:
And see my reply Smile - note that I haven't seen (m)any people use ntdll.memcpy(), HLLs tend to either link to their own (more or less) optimized versions, or inline the code when heuristics deem it profitable.
What's it there for then? Useless functions bloating the namespace FTL...
Convenience? Also, they might be used internally in NTDLL, and somebody thought it was a nice gesture to export them to the world?

Ring switching is (relatively) slow because of all the access checks incurred... but you aren't going to be doing ring switching just to execute a few instructions in kernel-mode unless you're a really bad OS designer, or just plain out of your mind.

Azu wrote:
Can't they find a way to do that without all this bloated abstraction? If not, they should find a way to do without it, IMHO.. like maybe write secure code that isn't full of buffer overflows.
People are imperfect... and especially while insisting on writing in low-level languages (like C++ and lower Smile) that allow you to make a lot of really bad decisions, you really should appreciate a lot of hardware assistance.
Post 06 Nov 2009, 19:31
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Borsuc
You say that Windows isn't a real-time system, but is there any reason why the hell it is so imprecise? Other than just arbitrarily so?

_________________
Previously known as The_Grey_Beast
Post 06 Nov 2009, 19:35
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
Borsuc wrote:
@Azu: are you talking about virtual memory? Without it we wouldn't be able to multi-task.
Why not? If you make your code address independent it should run without that bloat.


Borsuc wrote:
And yeah Sleep sucks so bad with imprecision, I cannot imagine why they chose so though.
The syscall version is better, but they change around all the syscall ordinals with each release so it's a pain to use.. Sad
I plan to eventually make a JIT header for all my programs that sniffs those out at runtime and substitutes them throughout my code, but I really shouldn't have to do that just to have decent performance.

f0dder wrote:
Azu wrote:
f0dder wrote:
If Windows had been designed as a realtime OS, I would agree - but it wasn't.
Why not? Computers are supposed to do things in real time.. their speed is their greatest strength!
Making a real-time OS has some trade-offs and complexities. Not needed for what, say, 99.9% of Windows installations are being used for, so why bother? If you need a realtime OS, go QNX or (if you're brave) linux with a realtime scheduler.
Huh? Everyone can appreciate faster/less-bloated code. That's what you mean by realtime, right? Because that's all I was talking about.

f0dder wrote:
You can get better Sleep() precision by doing timeBeginPeriod()... but if you need that, you should probably be using the multimedia timers instead. Understand your platform, and use the APIs that fit your needs Smile

Not according to MSDN. Their documentation for it says "timer resolution, in milliseconds"!!


f0dder wrote:
Azu wrote:
f0dder wrote:
And see my reply Smile - note that I haven't seen (m)any people use ntdll.memcpy(), HLLs tend to either link to their own (more or less) optimized versions, or inline the code when heuristics deem it profitable.
What's it there for then? Useless functions bloating the namespace FTL...
Convenience? Also, they might be used internally in NTDLL, and somebody thought it was a nice gesture to export them to the world?
If the kernel of the OS itself is shit, so shall be everything running under it!

f0dder wrote:
Ring switching is (relatively) slow because of all the access checks incurred... but you aren't going to be doing ring switching just to execute a few instructions in kernel-mode unless you're a really bad OS designer, or just plain out of your mind.
Or using Windows.

f0dder wrote:
Azu wrote:
Can't they find a way to do that without all this bloated abstraction? If not, they should find a way to do without it, IMHO.. like maybe write secure code that isn't full of buffer overflows.
People are imperfect... and especially while insisting on writing in low-level languages (like C++ and lower Smile) that allow you to make a lot of really bad decisions, you really should appreciate a lot of hardware assistance.
C++ is one of the most high level programming languages out there, so what are you talking about? Scripts? Rolling Eyes

Even x86 is pretty high level. We should be able to programmatically redefine which micro-operations an opcode performs.


Borsuc wrote:
You say that Windows isn't a real-time system, but is there any reason why the hell it is so imprecise? Other than just arbitrarily so?

Because to 99.99999% of everyone it doesn't matter how much it sucks as long as new $5000 computers can run it at all, apparently.

_________________
Post 06 Nov 2009, 19:56
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Borsuc
Are you joking? How can C++ be one of the most high-level languages? Rolling Eyes

I'm assuming you don't actually know truly HLL languages, as C is one of the most low-level available. C++ obviously is not as low-level, but it's built on the same base and CAN be (no one forces you to use the object-oriented abstractions).

_________________
Previously known as The_Grey_Beast
Post 07 Nov 2009, 01:22
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
Borsuc wrote:
Are you joking? How can C++ be one of the most high-level languages? Rolling Eyes
Because it's even higher than C.. it's full of object oriented stuff and is completely abstracted from the actual code..

Borsuc wrote:
I'm assuming you don't actually know truly HLL languages, as C is one of the most low-level available. C++ obviously is not as low-level, but it's built on the same base and CAN be (no one forces you to use the object-oriented abstractions).
You don't have to, no.. but if you don't, you aren't really using C++.

_________________
Post 07 Nov 2009, 01:26
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
f0dder
Borsuc wrote:
You say that Windows isn't a real-time system, but is there any reason why the hell it is so imprecise? Other than just arbitrarily so?
Probably made a lot of sense when NT was originally implemented - and then didn't make a lot of sense to refine? Apart from hard realtime needs, I personally haven't found this a limiting factor.

Azu wrote:
Why not? If you make your code address independent it should run without that bloat.
So you'd rather take up a full x86 register to point to loadaddr? Even if doing that, you don't get the nice protection features of x86 paging.

Azu wrote:

The syscall version is better, but they change around all the syscall ordinals with each release so it's a pain to use.. Sad I plan to eventually make a JIT header for all my programs that sniffs those out at runtime and substitute them throughout my code, but I really shouldn't have to do that just to have decent performance.
Stupid. You're going to rewrite your memory image (causing dirty pages and possibly page-to-disk) for some not-even-benchmarked "speed improvement"? I'm all for optimizing stuff, even if's some little penis improvement you won't really feel the effect of at runtime... but please don't penalize. And "decent performance", really... if you had a PutPixel function that did a 3->0->3 roundtrip, then PERHAPS that could have been a candidate for optimizing this way... but everybody would tell you to stop being retarded and not use PutPixel in an inner loop Smile

Azu wrote:
f0dder wrote:
Making a real-time OS has some trade-offs and complexities. Not needed for what, say, 99.9% of Windows installations are being used for, so why bother? If you need a realtime OS, go QNX or (if you're brave) linux with a realtime scheduler.
Huh? Everyone can appreciate faster/less-bloated code. That's what you mean by realtime, right? Because that's all I was talking about.
Take a look at OS design and OS classification. "realtime" has some pretty hard limits... that don't make much sense for general-available OSes.

Azu wrote:
f0dder wrote:
Convenience? Also, they might be used internally in NTDLL, and somebody thought it was a nice gesture to export them to the world?
If the kernel of the OS itself is shit, so shall be everything running under it!
If you think NTDLL is part of the kernel, that kinda shows how much you know about Windows Smile

Azu wrote:
f0dder wrote:
People are imperfect... and especially while insisting on writing in low-level languages (like C++ and lower Smile) that allow you to make a lot of really bad decisions, you really should appreciate a lot of hardware assistance.
C++ is one of the most high level programming languages out there, so what are you talking about? Scripts? :rolleyes:
By today's standards, even C++ is a pretty low-level programming language. And don't get me wrong, I enjoy C++ and even lower level languages, but I do find that they're not the most suitable languages for a whole lot of development tasks these days. Especially not given the attitudes of a lot of developers...

Azu wrote:
Even x86 is pretty high level. We should be able to programmatically redefine which micro-operations an opcode performs.
Why? This would be very fun to toy around with, but think outside your own little perfect hobbyist world, and consider the real world with really really malicious people, nasty time-to-market requirements that imply "we can't write perfect code", imperfect humans that might produce errors when pushed to 60h/week working hours, et cetera.

Azu wrote:
borsuc wrote:
You say that Windows isn't a real-time system, but is there any reason why the hell it is so imprecise? Other than just arbitrarily so?
Because to 99.99999% of everyone it doesn't matter how much it sucks as long as new $5000 computers can run it at all, apparently.
Again, how many times have you needed higher precision? The only example I can think of right now, that isn't more suitable for running on a realtime OS, is games... and even something as old as Quake2 ran just fine without higher-precision timing.

Now, it's all fine living in an ivory tower and claiming that everything should be perfect... but in the real world, less than perfect works, and works pretty well.
Post 07 Nov 2009, 01:26
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
f0dder wrote:
Borsuc wrote:
You say that Windows isn't a real-time system, but is there any reason why the hell it is so imprecise? Other than just arbitrarily so?
Probably made a lot of sense when NT was originally implemented - and then didn't make a lot of sense to refine? Apart from hard realtime needs, I personally haven't found this a limiting factor.
What makes you think it would have been hard to do it right to begin with? And what makes you think it would be hard to make it right now? The scheduler is technically capable already.. it's just you have to use syscalls to do it and it's a pain to detect all the different Windows versions at startup to choose the syscall..

f0dder wrote:
Azu wrote:
Why not? If you make your code address independent it should run without that bloat.
So you'd rather take up a full x86 register to point to loadaddr? Even if doing that, you don't get the nice protection features of x86 paging.
What's that? I looked it up on Google and it seems to only apply to the ELF file format.

f0dder wrote:
Azu wrote:

The syscall version is better, but they change around all the syscall ordinals with each release so it's a pain to use.. Sad I plan to eventually make a JIT header for all my programs that sniffs those out at runtime and substitute them throughout my code, but I really shouldn't have to do that just to have decent performance.
Stupid. You're going to rewrite your memory image (causing dirty pages and possibly page-to-disk) for some not-even-benchmarked "speed improvement"? I'm all for optimizing stuff, even if's some little penis improvement you won't really feel the effect of at runtime... but please don't penalize. And "decent performance", really... if you had a PutPixel function that did a 3->0->3 roundtrip, then PERHAPS that could have been a candidate for optimizing this way... but everybody would tell you to stop being retarded and not use PutPixel in an inner loop Smile
Not sure what you're on about. I meant once at startup obviously, not during runtime.
Anyways the point is an OS should be made in a way that it's efficient to begin with rather than programmers having to do hacky stuff to make it work right.

f0dder wrote:
Azu wrote:
f0dder wrote:
Making a real-time OS has some trade-offs and complexities. Not needed for what, say, 99.9% of Windows installations are being used for, so why bother? If you need a realtime OS, go QNX or (if you're brave) linux with a realtime scheduler.
Huh? Everyone can appreciate faster/less-bloated code. That's what you mean by realtime, right? Because that's all I was talking about.
Take a look at OS design and OS classification. "realtime" has some pretty hard limits... that don't make much sense for general-available OSes.
Okay. Stop going on about it then. You're the one who brought it up, not me.

f0dder wrote:
Azu wrote:
f0dder wrote:
Convenience? Also, they might be used internally in NTDLL, and somebody thought it was a nice gesture to export them to the world?
If the kernel of the OS itself is shit, so shall be everything running under it!
If you think NTDLL is part of the kernel, that kinda shows how much you know about Windows Smile
ntdll/ntkernel, whatever. Same difference. The point is all programs must import it.

f0dder wrote:
Azu wrote:
f0dder wrote:
People are imperfect... and especially while insisting on writing in low-level languages (like C++ and lower Smile) that allow you to make a lot of really bad decisions, you really should appreciate a lot of hardware assistance.
C++ is one of the most high level programming languages out there, so what are you talking about? Scripts? :rolleyes:
By today's standards, even C++ is a pretty low-level programming language. And don't get me wrong, I enjoy C++ and even lower level languages, but I do find that they're not the most suitable languages for a whole lot of development tasks these days. Especially not given the attitudes of a lot of developers...
Compared to scripts, like javascript and lua, sure, it might seem low-level by comparison. But for a programming language it isn't.

f0dder wrote:
Azu wrote:
Even x86 is pretty high level. We should be able to programmatically redefine which micro-operations an opcode performs.
Why? This would be very fun to toy around with, but think outside your own little perfect hobbyist world, and consider the real world with really really malicious people, nasty time-to-market requirements that imply "we can't write perfect code", imperfect humans that might produce errors when pushed to 60h/week working hours, et cetera.
I said "should be able to redefine" not "should be blank by default". So for everyone who doesn't want to make their code super fast and super small, there would be no difference.

f0dder wrote:
Azu wrote:
borsuc wrote:
You say that Windows isn't a real-time system, but is there any reason why the hell it is so imprecise? Other than just arbitrarily so?
Because to 99.99999% of everyone it doesn't matter how much it sucks as long as new $5000 computers can run it at all, apparently.
Again, how many times have you needed higher precision? The only example I can think of right now, that isn't more suitable for running on a realtime OS, is games... and even something as old as Quake2 ran just fine without higher-precision timing.
Pretty much any time I've wanted to use Sleep rather than a blocking call.

f0dder wrote:
Now, it's all fine living in an ivory tower and claiming that everything should be perfect... but in the real world, less than perfect works, and works pretty well.
Not perfect, just less shitty.

_________________
Post 07 Nov 2009, 01:36
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Borsuc
Azu wrote:
Because it's even higher than C..
"even higher" than C, you talk as if C is high already.

Name one language lower than C (which is abstracted from the processor obviously). C-- doesn't count. Razz

_________________
Previously known as The_Grey_Beast
Post 07 Nov 2009, 01:48
Azu



Joined: 16 Dec 2008
Posts: 1159
Azu
Borsuc wrote:
Azu wrote:
Because it's even higher than C..
"even higher" than C, you talk as if C is high already.
High as a hippy on crack.

Borsuc wrote:
Name one language lower than C (which is abstracted from the processor obviously). C-- doesn't count. Razz
FASM.

And below that, machine code (x86), which is still abstracted from the processor (processor is really RISC and runs uops).

_________________
Post 07 Nov 2009, 01:53
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
