flat assembler
Message board for the users of flat assembler.
  
| No code profiler for FASM people. We are so poor! |
| Author | 
 | 
| revolution 05 Dec 2014, 12:53 I've written a few macros to inject code into procedures at compile time. The code is mostly just simple RDTSC instructions and the appropriate add/sub/store arithmetic for counters and accumulators. I'm sure many others here have done the same. It is no big thing but, as you say, they are hard to use and set up properly.
 Instrumenting code changes its behaviour, so the user must be aware of what they are doing or else the results can be meaningless. How the measurements should be made is also application specific. Each application will have its own requirements and quirks that need programmer attention. Executive summary: Profiling is not just a simple plug-and-play thing. You've got to know what you are doing. | |||
|  05 Dec 2014, 12:53 | 
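The counter-and-accumulator arithmetic described above can be sketched as follows. This is only an illustration in Python, not revolution's macros: `time.perf_counter_ns` stands in for RDTSC, and `RegionProbe` and `probe_overhead_ns` are hypothetical names. The probe subtracts the start reading and adds the stop reading, which leaves the elapsed time in the accumulator. Timing an empty region gives a rough idea of how much the instrumentation itself adds to every reading.

```python
import time


class RegionProbe:
    """Counter plus accumulator for one instrumented region (hypothetical helper)."""

    def __init__(self):
        self.calls = 0      # how many times the region was executed
        self.total_ns = 0   # accumulated time spent inside the region

    def enter(self):
        # Subtract the start reading now; adding the stop reading later
        # leaves (stop - start) in the accumulator without a temporary.
        self.total_ns -= time.perf_counter_ns()

    def leave(self):
        self.total_ns += time.perf_counter_ns()
        self.calls += 1


def probe_overhead_ns(rounds=10_000):
    """Average reading for an empty region, i.e. what the probe itself adds."""
    empty = RegionProbe()
    for _ in range(rounds):
        empty.enter()
        empty.leave()
    return empty.total_ns / empty.calls


work = RegionProbe()
for _ in range(1_000):
    work.enter()
    sum(range(100))         # the code being measured
    work.leave()

overhead = probe_overhead_ns()
average = work.total_ns / work.calls
print(f"average {average:.0f} ns per pass, about {overhead:.0f} ns of that is the probe")
```

For a region this short the probe cost is a large share of the reading, which is one concrete way instrumented results become misleading.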
 | 
| system error 05 Dec 2014, 12:57 Or probably someone could just teach me the easy way to use AMD's CodeXL to measure my code performance, in assembly not in C. I don't understand most of the terms used by CodeXL. It's overkill. I just need a simple elapsed-time test from point A to point B of my code. Nothing fancy. I could use RDTSC at both ends but I think it's more complicated than that. | |||
|  05 Dec 2014, 12:57 | 
 | 
| system error 05 Dec 2014, 13:00 revolution wrote: I've written a few macros to inject code into procedures at compile time. The code is mostly just simple RDTSC instructions and the appropriate add/sub/store arithmetic for counters and accumulators. I'm sure many others here have done the same. It is no big thing but, as you say, they are hard to use and set up properly. | |||
|  05 Dec 2014, 13:00 | 
 | 
| system error 05 Dec 2014, 13:05 It would be nice and fun to have this. Even a basic one should do, so that we can bitch-slap each other over whose code is the fastest or the shortest. | |||
|  05 Dec 2014, 13:05 | 
 | 
| revolution 05 Dec 2014, 13:14 system error wrote: Where can I get that [profiler]? system error wrote: Do you know how to use codeXL? system error wrote: I could use RDTSC at both ends but I think it's more complicated than that. system error wrote: It would be nice and fun to have this. Even a basic one should do, so that we can bitch-slap each other over whose code is the fastest or the shortest. | |||
|  05 Dec 2014, 13:14 | 
 | 
| Tomasz Grysztar 05 Dec 2014, 13:21 For probably the least intrusive measurements the statistical profiling is a nice concept to play with. Though I have only ever used it in DOS environment. | |||
|  05 Dec 2014, 13:21 | 
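Statistical (sampling) profiling can be sketched roughly like this. It is a Python illustration of the idea only, not the DOS technique mentioned above and not how perf or CodeXL are implemented. A background thread wakes up at intervals and records which function the measured thread is currently in, which plays the role of reading the instruction pointer. `sys._current_frames` is a CPython-specific call; `sample_thread`, `profile`, `hot_loop` and `workload` are hypothetical names.

```python
import collections
import sys
import threading
import time


def sample_thread(thread_id, interval_s, stop_event, hits):
    """Periodically record which function the target thread is executing."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            hits[frame.f_code.co_name] += 1
        time.sleep(interval_s)


def profile(func, interval_s=0.002):
    """Run func while a background thread samples the calling thread."""
    hits = collections.Counter()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_thread,
        args=(threading.get_ident(), interval_s, stop_event, hits),
    )
    sampler.start()
    try:
        func()
    finally:
        stop_event.set()
        sampler.join()
    return hits


def hot_loop():
    """Busy work that runs long enough to collect a useful number of samples."""
    deadline = time.perf_counter() + 0.3
    total = 0
    while time.perf_counter() < deadline:
        total += 1
    return total


def workload():
    hot_loop()


hits = profile(workload)
for name, count in hits.most_common():
    print(f"{name:12s} {count} samples")
```

The measured code is not modified at all, which is why this approach is so unintrusive. The price is that the result is only a histogram, and it needs enough samples to mean anything.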
 | 
| system error 05 Dec 2014, 13:25 revolution wrote: Speed of execution is not a constant. Different systems do things in different ways, some things are faster and other things are slower, so comparisons are mostly pointless for people with disparate systems.
 But it could be useful for code comparison, like seeing the effect of unrolling my loops, things like that. Without it, I am completely clueless about code performance. Nothing. Nada. I just write code for the sake of completion. | |||
|  05 Dec 2014, 13:25 | 
 | 
| system error 05 Dec 2014, 13:28 Tomasz Grysztar wrote: For probably the least intrusive measurements the statistical profiling is a nice concept to play with. Though I have only ever used it in DOS environment. | |||
|  05 Dec 2014, 13:28 | 
 | 
| revolution 05 Dec 2014, 13:34 system error wrote: But it could be useful for code comparison, like seeing the effect of unrolling my loops, things like that. Without it, I am completely clueless about code performance. Nothing. Nada. I just write code for the sake of completion.
 I actually think that a custom-written profiler for each app is really the only sensible and reliable method. Some form of universal profiler is not going to make the grade if one is really serious about performance. But if you are just dabbling for "bitch-slapping" reasons then I guess any profiler will do, even if it is wrong. | |||
|  05 Dec 2014, 13:34 | 
 | 
| gens 05 Dec 2014, 14:07 sampling profiling
 codexl does it
 for linux there is the perf framework https://perf.wiki.kernel.org/index.php/Main_Page
 program has to have debug symbols
 ofc, you can diy it http://en.wikipedia.org/wiki/Hardware_performance_counter
 guess that is what Tomasz did
 rdtsc is fine but you should account for it changing the alignment
 so rdtsc stuff - align 16 - loop measured
 intel cpu's would probably slow down on crossing the page border, so aligning to page length
 and ofc make the loop measured run for long enough to increase the precision of the measurement | |||
|  05 Dec 2014, 14:07 | 
 | 
| revolution 05 Dec 2014, 14:18 RDTSC has more problems than just that. Thread migration and interruption are the most damning events; they will make the readings meaningless. | |||
|  05 Dec 2014, 14:18 | 
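One common way to reduce the damage from migration and interruption, sketched here under assumptions, is to pin the measuring code to a single CPU and to repeat the measurement, keeping the smallest reading on the reasoning that an interruption can only add time. Nobody in the thread proposes this; it is an illustration. `pin_to_one_cpu` and `best_of` are hypothetical names, `os.sched_setaffinity` exists only on some platforms (Linux among them), and `time.perf_counter_ns` stands in for RDTSC.

```python
import os
import time


def pin_to_one_cpu():
    """Restrict the caller to a single CPU so it cannot migrate.

    Returns the chosen CPU number, or None where pinning is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None                          # platform has no affinity API
    try:
        cpu = min(os.sched_getaffinity(0))   # any CPU we are allowed to use
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return None                          # not permitted in this environment
    return cpu


def best_of(func, trials=20):
    """Time func several times; return the smallest reading and all readings."""
    readings = []
    for _ in range(trials):
        start = time.perf_counter_ns()
        func()
        readings.append(time.perf_counter_ns() - start)
    return min(readings), readings


cpu = pin_to_one_cpu()
fastest, readings = best_of(lambda: sum(range(10_000)))
print(f"pinned to CPU {cpu}; fastest of {len(readings)} runs: {fastest} ns")
print(f"slowest run was {max(readings)} ns")
```

The gap between the fastest and slowest readings gives a feel for how much a single unprotected measurement can be off.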
 | 
| system error 05 Dec 2014, 14:23 gens wrote: sampling profiling
 Code:
 perf stat -B ./low64
 0.00324449999999999
 Performance counter stats for './low64':
       0.151585    task-clock (msec)         #    0.231 CPUs utilized
              2    context-switches          #    0.013 M/sec
              0    cpu-migrations            #    0.000 K/sec
              3    page-faults               #    0.020 M/sec
        196,858    cycles                    #    1.299 GHz
         79,876    stalled-cycles-frontend   #   40.58% frontend cycles idle
  <not counted>    stalled-cycles-backend
  <not counted>    instructions
  <not counted>    branches
  <not counted>    branch-misses
 0.000655774 seconds time elapsed
 It's a statistical program I made some time ago. What does the result look like to you? Bad or bad? | |||
|  05 Dec 2014, 14:23 | 
 | 
| revolution 05 Dec 2014, 14:26 It looks very bad. You are not running it long enough to gather enough long term data.
 Usually profilers are run on long running tasks that need to run faster to save money or something. For a program that runs within a single tick of 16ms you won't see anything meaningful. | |||
|  05 Dec 2014, 14:26 | 
 | 
| system error 05 Dec 2014, 14:29 revolution wrote: 
 revo, we can't just skip code performance measures just because it is not easy to implement. It is not healthy for FASM programmers. | |||
|  05 Dec 2014, 14:29 | 
 | 
| system error 05 Dec 2014, 14:33 revolution wrote: It looks very bad.
 HAHAHA! I know it's bad, but I don't know how bad. I can't tell which one is which from the generated information. But it's a large program though. Compiles to 930 bytes in size. Math-intensive. | |||
|  05 Dec 2014, 14:33 | 
 | 
| revolution 05 Dec 2014, 14:38 system error wrote: revo, we can't just skip code performance measures just because it is not easy to implement. It is not healthy for FASM programmers. | |||
|  05 Dec 2014, 14:38 | 
 | 
| system error 05 Dec 2014, 14:59 revolution wrote: 
 I know what you mean, revo. But the hard route needs tools. At least a basic tool. That's why I am asking, because I haven't seen one. I've seen many in MASM circles and they are making good use of them. In the end what you'll see is FASM write-to-completion coders vs MASM performance-aware coders. It is not healthy and we should not make it a habit. | |||
|  05 Dec 2014, 14:59 | 
 | 
| gens 05 Dec 2014, 15:23 run your loop 100 000 times then divide the run length by 100 000
 sampling profilers work by reading the instruction pointer every once in a while, amongst other things
 so the loop HAS to run for a while if you want usable results
 tools like perf, codexl and such can show you where the program spends its time, the whole program
 and only if it runs for a longer time
 for a loop you would be better to measure it yourself
 you have to know about things like alignment, instruction and data caches and so on
 or you can get misleading results
 oh ye, and put your cpu to performance mode
 forgot to say, also symbols under linux | |||
|  05 Dec 2014, 15:23 | 
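The "run it 100 000 times and divide" advice can be sketched like this. It is a Python illustration with `time.perf_counter_ns` in place of RDTSC; `time_per_iteration_ns` and the two loop variants are hypothetical names, chosen to mirror the loop-unrolling comparison asked about earlier in the thread.

```python
import time


def time_per_iteration_ns(body, iterations=100_000):
    """Run body many times and return the average time per run in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        body()
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


DATA = list(range(16))


def rolled():
    """Plain loop over the data."""
    total = 0
    for value in DATA:
        total += value
    return total


def unrolled():
    """Same sum with the loop body written out four elements at a time."""
    total = 0
    for i in range(0, 16, 4):
        total += DATA[i] + DATA[i + 1] + DATA[i + 2] + DATA[i + 3]
    return total


print(f"rolled:   {time_per_iteration_ns(rolled):.1f} ns per run")
print(f"unrolled: {time_per_iteration_ns(unrolled):.1f} ns per run")
```

The two timer reads are paid once per 100 000 runs instead of once per run, so their cost and their resolution stop mattering. The figures are only comparable between variants run on the same machine under the same conditions.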
 | 
| AsmGuru62 05 Dec 2014, 15:53 I am writing a code generation utility for FASM (and it's been a long time).
 It will have a profiling ability besides the OOP code generation. But there will be no CPU clocks, just counters for every function:
 1. The # of times called
 2. The total amount of time spent inside (using the QueryPerformanceCounter API)
 I am aiming this at large projects, so you can build the code with probes and then get the statistics. Then switch it off with a checkbox and build the release version. | |||
|  05 Dec 2014, 15:53 | 
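The two per-function counters and the on/off switch described above might look roughly like this. It is a Python sketch of the idea only, not AsmGuru62's utility: `PROFILING_ENABLED` is a hypothetical stand-in for the checkbox, `STATS`, `probe`, `checksum` and `process` are made-up names, and `time.perf_counter_ns` takes the place of QueryPerformanceCounter.

```python
import functools
import time

PROFILING_ENABLED = True   # hypothetical build switch, standing in for the checkbox
STATS = {}                 # function name -> [call count, total nanoseconds]


def probe(func):
    """Wrap func with a call counter and a time accumulator when profiling is on."""
    if not PROFILING_ENABLED:
        return func        # "release build": the function is left untouched

    entry = STATS.setdefault(func.__name__, [0, 0])

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            return func(*args, **kwargs)
        finally:
            entry[0] += 1                               # 1. number of times called
            entry[1] += time.perf_counter_ns() - start  # 2. total time spent inside

    return wrapper


@probe
def checksum(data):
    return sum(data) & 0xFFFF


@probe
def process(blocks):
    return [checksum(block) for block in blocks]


process([bytes(range(256))] * 50)

for name, (calls, total_ns) in STATS.items():
    print(f"{name:10s} calls={calls:4d} total={total_ns} ns")
```

The time recorded for `process` includes the time spent in the `checksum` calls it makes, so the totals here are inclusive. Whether inclusive or exclusive time is wanted is one of the application-specific decisions raised earlier in the thread.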
 | 
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.