flat assembler
Message board for the users of flat assembler.
Profiling
HaHaAnonymous 10 Feb 2013, 02:55
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 21:35; edited 1 time in total |
Asm++ 12 Feb 2013, 04:08
Well, I also use my own profiling tools, but if you want some ready-made tools, here are some links:
AQtime Pro: http://smartbear.com/products/qa-tools/application-performance-profiling
AMD CodeAnalyst: http://developer.amd.com/tools/heterogeneous-computing/amd-codeanalyst-performance-analyzer/
Intel VTune: http://software.intel.com/en-us/intel-vtune-amplifier-xe
_________________
Binary is nice, but Assembly is better!
f0dder 12 Feb 2013, 14:30
If you can find a way to generate .PDB debug information, there's the free Very Sleepy.
Bob++ 13 Feb 2013, 01:12
HaHaAnonymous wrote:
Nice. But how does it really work? Does it measure how much CPU time the whole program uses, or does it measure instruction by instruction? Which timing functions (if any) do you use? The UNIX time program is useful sometimes. Also, have you implemented a GUI?
HaHaAnonymous 13 Feb 2013, 01:44
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 21:32; edited 1 time in total |
revolution 13 Feb 2013, 01:53
I had always assumed that profiling was meant to determine where in the code the CPU spends most of its time, thus enabling the programmer to direct their optimising efforts to the most productive places.
Simply timing a loop or two is not profiling, and is of no use if the loop only executes once or twice. And later, when the programmer discovers that some other piece of code or hardware (other than the little piece of code they were working on) is the major bottleneck, they commit ritual suicide after the depressing thought that all that effort was wasted on fruitless pursuits.
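To make the distinction concrete, here is a minimal sketch in C of whole-program region accounting, as opposed to timing one loop in isolation. It only assumes the standard `clock()` function; the region names, `profile_call` and `hottest_region` are hypothetical helpers invented for this illustration, not part of any tool mentioned in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical region IDs; a real program would name its own phases. */
enum { REGION_PARSE, REGION_SORT, REGION_COUNT };

static clock_t region_ticks[REGION_COUNT];        /* CPU time charged to each region */
static unsigned long region_calls[REGION_COUNT];  /* how often each region was entered */

/* Run fn(arg) and charge the CPU time it consumed to the given region. */
static void profile_call(int region, void (*fn)(void *), void *arg)
{
    assert(region >= 0 && region < REGION_COUNT);
    clock_t start = clock();
    fn(arg);
    region_ticks[region] += clock() - start;
    region_calls[region]++;
}

/* Return the region with the largest accumulated time (first one wins ties). */
static int hottest_region(void)
{
    int best = 0;
    for (int i = 1; i < REGION_COUNT; i++)
        if (region_ticks[i] > region_ticks[best])
            best = i;
    return best;
}

/* Print one line per region: call count and accumulated seconds. */
static void report(void)
{
    for (int i = 0; i < REGION_COUNT; i++)
        printf("region %d: %lu calls, %.3f s\n", i, region_calls[i],
               (double)region_ticks[i] / CLOCKS_PER_SEC);
}

/* Stand-in workload: spins for *arg iterations. */
static void busy_work(void *arg)
{
    volatile unsigned long sink = 0;
    unsigned long n = *(unsigned long *)arg;
    for (unsigned long i = 0; i < n; i++)
        sink += i;
}
```

Wrapping every major phase of a run this way and then calling `report()` shows which phase dominates, which is the information needed before deciding where to spend optimising effort. The resolution of `clock()` is coarse, so this is only suitable for fairly long-running regions.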
HaHaAnonymous 13 Feb 2013, 02:08
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 21:32; edited 1 time in total |
revolution 13 Feb 2013, 02:16
HaHaAnonymous wrote:
f0dder 13 Feb 2013, 15:16
HaHaAnonymous wrote:
You are wrong indeed - for any "realistic" sized projects
Here's a couple of blog entries that deal with profiling & optimizing a codebase, from a pretty interesting series that looks a bit at some Intel raytracing code:
Write combining is not your friend
A string processing rant
Cores don’t like to share
Fixing cache issues, the lazy way
(Before anybody makes a remark along the lines of "that's a stupid HLL coder on a HLL project", you might want to google who the guy is first.)
_________________
- carpe noctem
HaHaAnonymous 13 Feb 2013, 15:42
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 21:31; edited 1 time in total |
AsmGuru62 13 Feb 2013, 16:07
@f0dder: great links!
I was recently coding a text editor for my IDE and I decided to "stress" it. The question was: how big a source file can I load without too much of a visible delay? I tried increasingly large text files -- loading was OK, but editing (inserting lines, typing text) was not very fast at all on a very large text file. So, the lesson here is to always "stress" your code on a large data set. That is where optimizing is needed the most. Also, if you are coding something with a scrollbar - WM_xSCROLL messages carry only a 16-bit scroll position, so watch out for that -- use the GetScrollInfo API.
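The scrollbar pitfall can be shown without any Windows headers. The sketch below, in portable C, mimics how a scroll message packs the request code into the low word and the thumb position into the high word of a 32-bit parameter; `pack_scroll_param` and `high_word` are hypothetical stand-ins written for this illustration, not Win32 functions.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a request code (low word) and a position (high word) into one
   32-bit value. Only the low 16 bits of the position survive. */
static uint32_t pack_scroll_param(uint16_t request, uint32_t position)
{
    return ((position & 0xFFFFu) << 16) | request;
}

/* Extract the high word, as a message handler would. */
static uint32_t high_word(uint32_t value)
{
    return value >> 16;
}

/* A position that fits in 16 bits round-trips; a larger one wraps. */
static void demo(void)
{
    assert(high_word(pack_scroll_param(5, 1000)) == 1000);
    assert(high_word(pack_scroll_param(5, 100000)) == 100000 % 65536);
}
```

A document with more lines than fit in the high word would therefore report a wrapped thumb position, which is why querying the full 32-bit position separately is the safer route.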
nmake 13 Feb 2013, 18:00
Profiling is not always about finding a hot spot. Some (or many) C++ programmers will tell you that you need to find the hotspot. But little do they know that poisonous code can set off a chain reaction, and the tiny trigger that starts the whole chain reaction may be located outside of the critical code. That is why I favor all-round good code.
Half of the work in optimizing is layout. Here is an example of a small kind of layout from the fasmw editor source code:
Code:
        cmp     [wmsg],WM_CREATE
        je      wmcreate
        cmp     [wmsg],WM_COPYDATA
        je      wmcopydata
        cmp     [wmsg],WM_GETMINMAXINFO
        je      wmgetminmaxinfo
        cmp     [wmsg],WM_SIZE
        je      wmsize
        cmp     [wmsg],WM_SETFOCUS
        je      wmsetfocus
        cmp     [wmsg],FM_NEW
        je      fmnew
        cmp     [wmsg],FM_OPEN
        je      fmopen
        cmp     [wmsg],FM_SAVE
        je      fmsave
        cmp     [wmsg],FM_COMPILE
        je      fmcompile
        cmp     [wmsg],FM_SELECT
        je      fmselect
        cmp     [wmsg],FM_ASSIGN
        je      fmassign
        cmp     [wmsg],FM_GETSELECTED
        je      fmgetselected
        cmp     [wmsg],FM_GETASSIGNED
        je      fmgetassigned
        cmp     [wmsg],FM_GETHANDLE
        je      fmgethandle
        cmp     [wmsg],WM_INITMENU
        je      wminitmenu
        cmp     [wmsg],WM_COMMAND
        je      wmcommand
        cmp     [wmsg],WM_NOTIFY
        je      wmnotify
        cmp     [wmsg],WM_DROPFILES
        je      wmdropfiles
        cmp     [wmsg],WM_CLOSE
        je      wmclose
        cmp     [wmsg],WM_DESTROY
        je      wmdestroy
This kind of layout iterates through the whole list before it checks the lowest parts. You should subdivide into sections, to narrow down amount of compare instructions, the approach is similar to quicksort. You split it into half each. If you need to check number 1-100 you make subgroups, group 1 can be 1-50, group 2 can be 51-100. Then you subdivide the next two groups, to narrow down the amount of compares. This is a typical layout optimization.
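The subdivision idea can be sketched in C by counting how many table entries each strategy has to look at. The ID values below are hypothetical placeholders (not the real WM_* or FM_* constants), and `find_linear` and `find_binary` are names made up for this illustration; each counted probe would correspond to one `cmp` followed by conditional jumps in assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message IDs in ascending order; placeholders only. */
static const unsigned ids[] = { 1, 2, 5, 7, 16, 17, 36, 78, 273, 279,
                                563, 1025, 1026, 1027, 1028, 1029, 1030,
                                1031, 1032, 1033 };
enum { ID_COUNT = sizeof ids / sizeof ids[0] };

/* Straight compare chain: returns the index of msg or -1,
   and stores the number of entries examined in *compares. */
static int find_linear(unsigned msg, int *compares)
{
    assert(compares != NULL);
    *compares = 0;
    for (int i = 0; i < ID_COUNT; i++) {
        ++*compares;
        if (ids[i] == msg)
            return i;
    }
    return -1;
}

/* Halving strategy: each probe discards half of the remaining range. */
static int find_binary(unsigned msg, int *probes)
{
    assert(probes != NULL);
    int lo = 0, hi = ID_COUNT - 1;
    *probes = 0;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        ++*probes;
        if (ids[mid] == msg)
            return mid;
        if (ids[mid] < msg)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}
```

With 20 entries, the last entry costs 20 compares in the straight chain but at most 5 probes with halving. Whether that difference is measurable in a window procedure is a separate question that only a profile of the real program can answer.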
f0dder 13 Feb 2013, 19:11
nmake wrote:
You should subdivide into sections, to narrow down amount of compare instructions, the approach is similar to quicksort. You split it into half each.
More like a binary search - and it's a technique you'll find C++ compilers applying to switch statements. But with the 20 or so window messages FASMW handles, does this matter, or are you committing the sin of premature optimization? Have you profiled the code to verify? If not, you'd be wasting time for no good reason, complicating clear and easy-to-read code.
(And the binary-search layout isn't always the optimal one, anyway - like if you have a heavily skewed frequency distribution.)
Anyway, did you intend your chosen code snippet as an example of "triggering a chain reaction"? I don't see how it would be an example of that, nor how it could be an example of something that makes profiling less useful.
_________________
- carpe noctem
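The remark about skewed distributions can be checked with a little arithmetic. The C sketch below uses made-up hit counts per 100 dispatched messages and assumes, purely for illustration, that the most frequent message also happens to have the lowest ID; `linear_cost`, `probes_for` and `binary_cost` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 8 };

/* Compare chain ordered most-frequent-first: entry i costs i + 1 compares.
   Returns the total compares for the given hit counts. */
static unsigned linear_cost(const unsigned weight[N])
{
    assert(weight != NULL);
    unsigned total = 0;
    for (int i = 0; i < N; i++)
        total += weight[i] * (unsigned)(i + 1);
    return total;
}

/* Number of probes a binary search over N sorted slots needs to hit `target`. */
static unsigned probes_for(int target)
{
    assert(target >= 0 && target < N);
    int lo = 0, hi = N - 1;
    unsigned probes = 0;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        probes++;
        if (mid == target)
            break;
        if (mid < target)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return probes;
}

/* Total probes when message i sits in sorted slot i (an assumption). */
static unsigned binary_cost(const unsigned weight[N])
{
    assert(weight != NULL);
    unsigned total = 0;
    for (int i = 0; i < N; i++)
        total += weight[i] * probes_for(i);
    return total;
}
```

With hit counts of 90, 4, 1, 1, 1, 1, 1, 1 the frequency-ordered chain needs 131 compares per 100 messages against 294 probes for the halving layout, while with uniform counts the halving layout wins (21 against 36 per 8 messages). The outcome depends on where the hot message lands in sorted order, so these figures illustrate the point rather than measure any real program.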
HaHaAnonymous 13 Feb 2013, 19:19
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 21:30; edited 1 time in total |
AsmGuru62 13 Feb 2013, 19:45
@nmake - you picked the worst place to optimize!
If you're optimizing the WndProc - at least pick the messages which will be coming in fast, like mouse movement or cursor settings. Optimizing WM_CREATE or WM_DESTROY is not really necessary. The most optimization I would do here is to load [wmsg] into a register and then compare against the register. Then again, some code can send WM_COMMANDs in a series of messages designed to do some macro-repeating, but then the design of it is wrong. Another (custom) message is needed, with WPARAM indexing into a table of functions which perform the actions of macro-repeating.
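A table of functions indexed by a message parameter could look roughly like the following C sketch. The three actions, the `state` integer they operate on, and the names `run_action` and `replay` are all hypothetical; in a real window procedure the index would arrive as the WPARAM of the custom message.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*action_fn)(int state);

/* Hypothetical editor actions operating on a single integer of state. */
static int action_indent(int s)   { return s + 4; }
static int action_unindent(int s) { return s >= 4 ? s - 4 : 0; }
static int action_reset(int s)    { (void)s; return 0; }

static const action_fn actions[] = { action_indent, action_unindent, action_reset };
enum { ACTION_COUNT = sizeof actions / sizeof actions[0] };

/* Handler body for one custom message: `index` plays the role of WPARAM.
   An out-of-range index is ignored and the state is returned unchanged. */
static int run_action(size_t index, int state)
{
    if (index >= ACTION_COUNT)
        return state;
    return actions[index](state);
}

/* Replay a recorded macro: a sequence of action indices applied in order. */
static int replay(const size_t *indices, size_t count, int state)
{
    assert(indices != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        state = run_action(indices[i], state);
    return state;
}
```

The lookup costs one bounds check and one indirect call regardless of how many actions exist, so a long macro does not pay for a compare chain on every step.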
nmake 13 Feb 2013, 20:11
Both Tomasz and I are optimists; that is why he put WM_CREATE on top and WM_CLOSE and WM_DESTROY at the bottom. This small example signals what you should do in bigger sets of code. It is the mentality that counts in the end: once you learn to think about optimization from the beginning and it becomes a habit, it takes no effort to do, and time is saved when it matters. It is not premature, it is an investment. Those who steer clear of premature optimization are the ones who are mostly unfamiliar with optimization, and they are the ones who will most likely abandon the complete project because it was so badly planned from the beginning that there is no room to improve it. Thinking like Tomasz does here, from the very beginning, even in code that doesn't matter much, is the best approach.
I have to say, I really do hate pessimists. They don't understand much; they focus on the things they think matter, but in the end it's too late anyway. It is like the silly man who said, "Detail X isn't necessary, I won't do it." Then he goes on to say "Detail Y isn't necessary", then "Detail Z isn't necessary". When it comes to the end, he has lost, because he had no practice at any detail at any level, and the details brought him down in the end. Basically he was a detail-less man, and so he couldn't win either way. When you sum up details, it adds up to a sum. There is the sum of bad, the sum of carelessness, and the sum of excellence. I don't know how many times C++ coders have said to my face, "Why do you spend so much time optimizing code that doesn't matter much?" Then I go on and say to him, "I am not optimizing it, it is routine, a habit I have, I spend no extra time on it, it is just a habit". Why would you want to practice bad technique and then implement excellence with effort, when you can practice excellence and save yourself the time spent implementing bad technique?
AsmGuru62 13 Feb 2013, 20:24
Good points.
|
f0dder 13 Feb 2013, 22:07
I'm sorry, nmake, but you're not being very coherent.
First you pick a snippet from FASMW as an example of "bad layout", and now you're using it as an example of "the best approach"?
nmake wrote:
Then I go on and say to him, "I am not optimizing it, it is routine, a habit I have, I spend no extra time on it, it is just a habit"
Right, so it takes you absolutely no more time organizing your long list of compares as a binary search? You always flesh out your algorithms in optimal SSE code from the beginning, with no intermediary steps? You leave no part of the code "sloppy"? ...right.
Not doing premature optimization doesn't mean doing premature pessimization. It's about not wasting your time where it doesn't matter, but instead concentrating your effort where it (quantifiably) gives bang for the buck.
(Sure, there are poor programmers using the sentence as a mantra to excuse always writing piss-poor code. And then there's the rest of us.)
nmake 13 Feb 2013, 22:36
f0dder wrote:
It is not bad layout, it is a template for unoptimized layout. It hasn't been worked on.
f0dder wrote:
The layout isn't the best approach; it is his way of thinking that is the best approach, his programming mentality, his optimism.
f0dder wrote:
If you need SSE you are in a critical place of your code; optimization would be needed for both groups, so there is no imbalance of development time.
f0dder wrote:
A developer has to focus on the second best thing; it's his job to meet a deadline. Developers are not specialists in assembly. They are taught how to produce solutions using various sets of tools, but they are not often experts in any particular field, although some might be experts in some fields beforehand. So a developer has got to focus on the second best thing. He is not a technician, he is a developer; there are no alternatives. His job is to do things quickly, his profession is to know a little bit about everything, and his tools are designed to output the second best thing at a speed that can compete on the market. He has no choice, he can't choose perfection; there is no such thing for a developer. He is a moneymaking machine, and all he has got to do is out-compete the next second-best developer. Basically it's a competition of who can develop the best-worst program. If you can output better garbage than the garbage next door, you have become a successful developer. Perfection doesn't even come to mind here. Of course in some places there are extraordinary quality demands, but usually there aren't.
f0dder wrote:
If you take it too far with any HLL developer, he will use the excuse too, even if he hasn't used the excuse before. It's about his well-being; he has to protect his ways. Nobody is safe from that. They will begin to complain if you press the question for long enough, and he will start throwing apples and oranges at you.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.