flat assembler
Message board for the users of flat assembler.

Profiling

ASM-Man



Joined: 11 Jan 2013
Posts: 64
ASM-Man 10 Feb 2013, 02:13
What tool do you use for profiling assembly code? I'm looking for a tool so that I don't have to insert timing calls by hand at the start and end of a routine, subtract the start time from the current time, and print the result. I want real profiling that shows how long each construct took to run.
Something much like gprof. I'm trying to see which of two routines is faster, and so on.

_________________
I'm not a native speaker of the English language. So, if you find any mistakes in what I have written, feel free to fix them or tell me. :)
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 10 Feb 2013, 02:55
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 21:35; edited 1 time in total
Asm++



Joined: 04 Feb 2013
Posts: 24
Location: On a Chip!
Asm++ 12 Feb 2013, 04:08

_________________
Binary is nice, but Assembly is better!
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 12 Feb 2013, 14:30
If you can find a way to generate .PDB debug information, there's the free Very Sleepy.
Bob++



Joined: 12 Feb 2013
Posts: 92
Bob++ 13 Feb 2013, 01:12
HaHaAnonymous wrote:
ASM-Man wrote:
What tool do you use to do profiling in assembly?

I use some ordinary tools I made (they work for me). I only do that occasionally when I have nothing to do (e.g., at midnight). :P



Nice. But how does it really work? Does it measure how much CPU time the whole program uses, or does it measure instruction by instruction? What time functions (if any) do you use? The UNIX time program is useful sometimes.
Also, have you implemented a GUI? :P
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 13 Feb 2013, 01:44
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 21:32; edited 1 time in total
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20454
Location: In your JS exploiting you and your system
revolution 13 Feb 2013, 01:53
I had always assumed that profiling was to determine where in the code the CPU spends most of its time. Thus enabling the programmer to direct their optimising efforts in the most productive places.

Simply timing a loop or two is not profiling and is of no use if the loop only executes once or twice. And later when the programmer discovers that some other piece of code or hardware (other than the little piece of code they were working on) is the major bottleneck they commit ritual suicide after the depressing thought that all that effort was wasted on fruitless pursuits.
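
To make the distinction concrete, here is a minimal sketch of what "finding where the time goes" means, as opposed to timing one loop. It is written in C rather than fasm only so that it compiles anywhere. The region names, `busy_work` and `profile_region` are hypothetical helpers invented for this illustration; real profilers such as gprof or Very Sleepy gather this kind of data automatically and need no manual wrapping.

```c
#include <time.h>

/* Hypothetical region ids; a real program would have many more. */
enum { REGION_PARSE, REGION_RENDER, REGION_COUNT };

static clock_t       region_ticks[REGION_COUNT];  /* CPU time charged to each region */
static unsigned long region_calls[REGION_COUNT];  /* how many times each region ran  */

/* Stand-in workload for this sketch: spins for n iterations. */
static volatile unsigned long sink;
static void busy_work(unsigned long n)
{
    for (unsigned long i = 0; i < n; i++)
        sink += i;
}

/* Run fn(arg) and charge the elapsed CPU time to `region`. */
static void profile_region(int region, void (*fn)(unsigned long), unsigned long arg)
{
    clock_t start = clock();
    fn(arg);
    region_ticks[region] += clock() - start;
    region_calls[region]++;
}

/* Index of the region with the most accumulated time (the hot spot). */
static int hottest_region(void)
{
    int best = 0;
    for (int i = 1; i < REGION_COUNT; i++)
        if (region_ticks[i] > region_ticks[best])
            best = i;
    return best;
}
```

Wrapping every major phase of a run with `profile_region` and then asking `hottest_region()` ranks the whole program by measurement, instead of relying on whichever loop one happened to time.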
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 13 Feb 2013, 02:08
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 21:32; edited 1 time in total
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20454
Location: In your JS exploiting you and your system
revolution 13 Feb 2013, 02:16
HaHaAnonymous wrote:
Quote:
I had always assumed that profiling was to determine where in the code the CPU spends most of its time.

I believe this is not so hard to guess without any tools.
For small programs this is usually true, but sometimes surprises happen. For larger programs it can become very difficult to predict this type of thing. Which is why profilers were invented: to remove expectation biases and show the real situation.
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 13 Feb 2013, 15:16
HaHaAnonymous wrote:
Quote:
I had always assumed that profiling was to determine where in the code the CPU spends most of its time.

I believe this is not so hard to guess without any tools.

Anyway, I may be wrong.

You are wrong indeed - for any "realistic" sized projects :)

Here's a couple of blog entries that deal with profiling & optimizing a codebase, from a pretty interesting series that looks a bit at some Intel raytracing code:
Write combining is not your friend
A string processing rant
Cores don’t like to share
Fixing cache issues, the lazy way

(Before anybody makes a remark along the lines of "that's a stupid HLL coder on a HLL project", you might want to google who the guy is first. ;) )

_________________
carpe noctem
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 13 Feb 2013, 15:42
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 21:31; edited 1 time in total
AsmGuru62



Joined: 28 Jan 2004
Posts: 1671
Location: Toronto, Canada
AsmGuru62 13 Feb 2013, 16:07
@f0dder: great links!

I was recently coding a text editor for my IDE and I decided to "stress" it.
The question was: how big a source file can I load without too much of
a visible delay? I tried increasingly large text files -- loading was OK, but
editing (inserting lines, typing text) was not very fast at all on a very large
text file.

So, the lesson here is: always "stress" your code on a large data set.
That is where optimization is needed the most.

Also, if you are coding something with a scrollbar - WM_xSCROLL messages have
a position limit of 32K, so watch out for that -- use the GetScrollInfo API.
nmake



Joined: 13 Sep 2012
Posts: 192
nmake 13 Feb 2013, 18:00
Profiling is not always about finding a hot spot. Some (or many) C++ programmers will tell you that you need to find the hot spot. But little do they know that poisonous code can be a chain reaction, and the tiny trigger that sets off the whole chain reaction may be located outside of critical code. That is why I favor all-round good code.

Half of the work in optimizing is layout. Here is an example of a small kind of layout from the fasmw editor source code:
Code:
        cmp     [wmsg],WM_CREATE
        je      wmcreate
        cmp     [wmsg],WM_COPYDATA
        je      wmcopydata
        cmp     [wmsg],WM_GETMINMAXINFO
        je      wmgetminmaxinfo
        cmp     [wmsg],WM_SIZE
        je      wmsize
        cmp     [wmsg],WM_SETFOCUS
        je      wmsetfocus
        cmp     [wmsg],FM_NEW
        je      fmnew
        cmp     [wmsg],FM_OPEN
        je      fmopen
        cmp     [wmsg],FM_SAVE
        je      fmsave
        cmp     [wmsg],FM_COMPILE
        je      fmcompile
        cmp     [wmsg],FM_SELECT
        je      fmselect
        cmp     [wmsg],FM_ASSIGN
        je      fmassign
        cmp     [wmsg],FM_GETSELECTED
        je      fmgetselected
        cmp     [wmsg],FM_GETASSIGNED
        je      fmgetassigned
        cmp     [wmsg],FM_GETHANDLE
        je      fmgethandle
        cmp     [wmsg],WM_INITMENU
        je      wminitmenu
        cmp     [wmsg],WM_COMMAND
        je      wmcommand
        cmp     [wmsg],WM_NOTIFY
        je      wmnotify
        cmp     [wmsg],WM_DROPFILES
        je      wmdropfiles
        cmp     [wmsg],WM_CLOSE
        je      wmclose
        cmp     [wmsg],WM_DESTROY
        je      wmdestroy
    

This kind of layout walks through the whole list before it reaches the entries at the bottom. You should subdivide into sections to narrow down the number of compare instructions; the approach is similar to quicksort. You split each section in half.

If you need to check number 1-100 you make subgroups, group 1 can be 1-50, group 2 can be 51-100.

Then you subdivide the next two groups, to narrow down amount of compares. This is a typical layout optimization.
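
A minimal sketch of the subdivision idea, assuming the ids are kept in a sorted table. It is in C rather than fasm so it can be compiled and checked anywhere; `find_linear` and `find_halving` are hypothetical names for this illustration and are not taken from the fasmw source. Each function also reports how many compares it performed, so the two layouts can be compared directly.

```c
#include <stddef.h>

/* Linear chain: test each id in turn, like a run of cmp/je pairs.
   Stores the number of compares in *compares and returns the index
   of key, or -1 if it is absent. */
static int find_linear(const int *ids, size_t n, int key, int *compares)
{
    *compares = 0;
    for (size_t i = 0; i < n; i++) {
        (*compares)++;
        if (ids[i] == key)
            return (int)i;
    }
    return -1;
}

/* Subdivided layout: ids must be sorted ascending. Each step halves
   the remaining range, so roughly log2(n) steps are enough. */
static int find_halving(const int *ids, size_t n, int key, int *compares)
{
    size_t lo = 0, hi = n;              /* search range is [lo, hi) */
    *compares = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        (*compares)++;                  /* one three-way compare per step */
        if (ids[mid] == key)
            return (int)mid;
        if (ids[mid] < key)
            lo = mid + 1;               /* key is in the upper half */
        else
            hi = mid;                   /* key is in the lower half */
    }
    return -1;
}
```

For the numbers 1-100, the linear chain needs 100 compares to reach the last entry, while the halving version never needs more than 7.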
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 13 Feb 2013, 19:11
nmake wrote:
You should subdivide into sections to narrow down the number of compare instructions; the approach is similar to quicksort. You split each section in half.

More like a binary search - and it's a technique you'll find C++ compilers applying to switch statements.

But with the 20 or so window messages FASMW handles, does this matter, or are you committing the sin of premature optimization? Have you profiled the code to verify? If not, you'd be wasting time for no good reason, complicating clear and easy-to-read code.

(And the binary-search layout isn't always the most optimal one, anyway - like if you have a heavily skewed frequency distribution.)
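
To illustrate the skewed case with numbers, here is a rough C sketch that computes the expected number of compares per lookup for both layouts. The function names and the probabilities used below are assumptions made up for this example, and the model counts compares only, ignoring branch prediction and cache effects.

```c
#include <stddef.h>

/* Expected compares for a linear chain when entry i (0-based) is hit
   with probability p[i]: reaching entry i costs i + 1 compares. */
static double expected_linear(const double *p, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += p[i] * (double)(i + 1);
    return sum;
}

/* Compares a halving search needs to reach slot `target` among n slots. */
static int halving_depth(size_t n, size_t target)
{
    size_t lo = 0, hi = n;
    int steps = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        steps++;
        if (mid == target)
            return steps;
        if (mid < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return steps;
}

/* Expected compares for the halving layout under the same probabilities. */
static double expected_halving(const double *p, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += p[i] * (double)halving_depth(n, i);
    return sum;
}
```

With 8 entries where the first is hit 90% of the time and the rest share the remaining 10%, the linear chain with the hot entry on top averages about 1.4 compares, while the halving layout averages about 3.8. With a uniform distribution the result flips: 4.5 for the chain against about 2.6 for halving.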

Anyway, did you intend your chosen code snippet as an example of "triggering a chain reaction"? I don't see how it would be an example of that, nor how it could be an example of something that makes profiling less useful.

_________________
carpe noctem
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 13 Feb 2013, 19:19
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 21:30; edited 1 time in total
AsmGuru62



Joined: 28 Jan 2004
Posts: 1671
Location: Toronto, Canada
AsmGuru62 13 Feb 2013, 19:45
@nmake - you picked the worst place to optimize!
If you're optimizing the WndProc, at least pick the messages that arrive frequently, like mouse movement or cursor settings.

Optimizing WM_CREATE or WM_DESTROY is not really necessary.

The most optimization I would do here is load [wmsg] into a register and then compare against the register.

Then again, some code can send a series of WM_COMMAND messages designed to do some macro-repeating, but then
the design is wrong. Another (custom) message is needed, with
WPARAM indexing into a table of functions which perform the actions of
macro-repeating.
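
The table-of-functions idea can be sketched roughly as follows. This is portable C with no Windows headers, so `run_action` only stands in for the handler of the custom message and its `wparam` argument for the real WPARAM; the action names and the log buffer are hypothetical and exist only so the example can be checked.

```c
#include <stddef.h>

/* Hypothetical actions; each one just records its id so the order of
   execution can be inspected afterwards. */
static int    action_log[16];
static size_t action_log_len;

static void record(int id)
{
    if (action_log_len < sizeof action_log / sizeof action_log[0])
        action_log[action_log_len++] = id;
}

static void action_insert_line(void) { record(0); }
static void action_delete_line(void) { record(1); }
static void action_indent(void)      { record(2); }

typedef void (*action_fn)(void);

/* The table the message parameter indexes into. */
static const action_fn action_table[] = {
    action_insert_line,
    action_delete_line,
    action_indent,
};
#define ACTION_COUNT (sizeof action_table / sizeof action_table[0])

/* Stand-in for handling one custom message: wparam selects the action.
   Returns 1 if an action ran, 0 if the index was out of range. */
static int run_action(size_t wparam)
{
    if (wparam >= ACTION_COUNT)
        return 0;
    action_table[wparam]();
    return 1;
}
```

A macro replay then becomes a sequence of small indices, each costing one bounds check and one indirect call, instead of a walk through the full compare chain for every step.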
nmake



Joined: 13 Sep 2012
Posts: 192
nmake 13 Feb 2013, 20:11
Both Tomasz and I are optimists; that is why he put WM_CREATE at the top and WM_CLOSE and WM_DESTROY at the bottom. This small example signals what you should do in bigger sets of code. It is the mentality that counts in the end: once you learn to think about optimization from the beginning and it becomes a habit, it takes no effort and time is saved when it matters. It is not premature, it is an investment. Those who steer clear of premature optimization are the ones who are mostly unfamiliar with optimization, and they are the ones most likely to abandon a complete project because it was so badly planned from the beginning that there is no room to improve it. Thinking like Tomasz does here, from the very beginning, even in code that doesn't matter much, is the best approach.

I have to say, I really do hate pessimists; they don't understand much. They focus on the things they think matter, but in the end it's too late anyway.

It is like the silly man who said, "Detail X isn't necessary, I won't do it." Then he goes on to say "Detail Y isn't necessary", then "Detail Z isn't necessary", and so on. When it comes to the end, he lost because he had no practice at any detail at any level, and the details brought him down. Basically he was a detail-less man, and so he couldn't win either way. When you sum up details, they add up. There is the sum of bad, the sum of carelessness, and the sum of excellence.

I don't know how many times C++ coders have said to my face, "Why do you spend so much time optimizing code that doesn't matter much?"

Then I go on and say to him, "I am not optimizing it, it is routine, a habit I have, I spend no extra time on it, it is just a habit"

Why would you want to practice bad technique and then implement excellence with effort, when you can practice excellence and save yourself the time spent implementing bad technique?
AsmGuru62



Joined: 28 Jan 2004
Posts: 1671
Location: Toronto, Canada
AsmGuru62 13 Feb 2013, 20:24
Good points.
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 13 Feb 2013, 22:07
I'm sorry, nmake, but you're not being very coherent.

First you pick a snippet from FASMW as an example of "bad layout", and now you're using it as an example of "the best approach"?

nmake wrote:
Then I go on and say to him, "I am not optimizing it, it is routine, a habit I have, I spend no extra time on it, it is just a habit"

Right, so it takes you absolutely no more time organizing your long list of compares as a binary search? You always flesh out your algorithms in optimal SSE code from the beginning, with no intermediary steps? You leave no part of the code "sloppy"? ...right.

Not doing premature optimization doesn't mean doing premature pessimization. It's about not wasting your time where it doesn't matter, but instead concentrating your effort where it (quantifiably) gives bang for the buck.

(Sure, there are poor programmers using the sentence as a mantra to excuse always writing piss-poor code. And then there's the rest of us.)
nmake



Joined: 13 Sep 2012
Posts: 192
nmake 13 Feb 2013, 22:36
f0dder wrote:

First you pick a snippet from FASMW as an example of "bad layout"

It is not bad layout, it is a template for unoptimized layout. It hasn't been worked on.
f0dder wrote:

, and now you're using it as an example of "the best approach"?

The layout isn't the best approach, it is his way of thinking that is the best approach, his programming mentality, his optimism.
f0dder wrote:

Right, so it takes you absolutely no more time organizing your long list of compares as a binary search? You always flesh out your algorithms in optimal SSE code from the beginning, with no intermediary steps? You leave no part of the code "sloppy"? ...right.

If you need SSE you are in a critical place of your code, optimization would be needed for both groups, so there is no imbalance of development-time.
f0dder wrote:

Not doing premature optimization doesn't mean doing premature pessimization. It's about not wasting your time where it doesn't matter, but instead concentrating your effort where it (quantifiably) gives bang for the buck.

A developer has to focus on the second best thing; it's his job to meet a deadline. Developers are not specialists in assembly. They are taught how to produce solutions using various sets of tools, but they are not often experts in any particular field; some might be experts in some fields beforehand, but most are not. So a developer has got to focus on the second best thing. He is not a technician, he is a developer, and there are no alternatives. His job is to do things quickly, his profession is to know a little bit about everything, and his tools are designed to output the second best thing at a speed that can compete on the market. He has no choice; he can't choose perfection, there is no such thing for a developer. He is a moneymaking machine, and all he has got to do is out-compete the next second-best developer. Basically it's a competition of who can develop the best-worst program. If you can output better garbage than the garbage next door, you have become a successful developer. Perfection doesn't even come to mind here. Of course in some places there are extraordinary quality demands, but usually there aren't.
f0dder wrote:

(Sure, there are poor programmers using the sentence as a mantra to excuse always writing piss-poor code. And then there's the rest of us.)

If you take it too far with any HLL developer, he will use the excuse too, even if he hasn't used it before. It's about his well-being; he has to protect his ways. Nobody is safe from that: he will begin to complain if you press the question for long enough, and he will start throwing apples and oranges at you. :D

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
