flat assembler
Message board for the users of flat assembler.

UEFI questions

MrFox



Joined: 17 Aug 2016
Posts: 52
Location: Russia
MrFox 26 Aug 2016, 09:22
Thanks! I'm reading Agner Fog's manuals on optimizing x86 assembly and on the particulars of Intel microarchitectures (the link was in a recent topic here on the board), so it was only today that I learned what you are talking about.
DimonSoft



Joined: 03 Mar 2010
Posts: 1228
Location: Belarus
DimonSoft 26 Aug 2016, 18:13
MrFox wrote:
Afterthought: maybe that's done this way to prevent local overheating of the EAX CPU circuitry?

Like revolution said, overheating is not a problem. But one thing you might be interested in, and should keep in mind, is the possibility of pipeline stalls caused by register accesses. Although modern CPUs do register renaming and even try to execute instructions out of order when they detect that the instructions do not depend on each other, avoiding access to the same register by subsequent instructions (very simplified explanation!) may let you gain something in terms of performance.

And, although some people say that premature optimization is bad, having a habit of writing code with these simple optimization techniques in mind doesn’t make things worse.

MrFox 26 Aug 2016, 18:50
Thanks Dimon!
DimonSoft wrote:
avoiding access to the same register by subsequent instructions (very simplified explanation!) may let you gain something in terms of performance.
Can you give a more comprehensive explanation or drop a link where I can read more about it?
(Either in English or in Russian)
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 26 Aug 2016, 19:17
DimonSoft wrote:
... modern CPUs do register renaming ... avoiding access to the same register by subsequent instructions (very simplified explanation!) may let you gain something in terms of performance.
Erm, are you talking about dependency chain length? Because manually reordering instructions to try to interleave dependency chains is tricky and can be confusing. In most cases, letting the out-of-order (OOO) engine do its job will be just fine. And, of course, testing in the final app will tell you whether it makes any difference at runtime.

As usual it all depends upon what you are doing. Different situations will yield different results.

DimonSoft 27 Aug 2016, 12:30
revolution wrote:
DimonSoft wrote:
... modern CPUs do register renaming ... avoiding access to the same register by subsequent instructions (very simplified explanation!) may let you gain something in terms of performance.
Erm, are you talking about dependency chain length? Because manually reordering instructions to try to interleave dependency chains is tricky and can be confusing. In most cases, letting the out-of-order (OOO) engine do its job will be just fine. And, of course, testing in the final app will tell you whether it makes any difference at runtime.

As usual it all depends upon what you are doing. Different situations will yield different results.

What I actually meant is that, in spite of register renaming and similar features, writing code that uses a single register ten instructions in a row is not really a good thing. In most cases there’s a pretty straightforward way to split calculations (or, more commonly, data moves) so that there are gaps of a few instructions between any two instructions that access the same register. I could think of a better example but, I think, this completely abstract one will be enough to explain my point:

Code:
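; Case 1: the ECX chain and the EDX chain are interleaved
mov     ecx, [Value1]
mov     edx, [Value2]
inc     ecx
dec     edx
add     ecx, [Value3]
imul    edx, [Value4]
add     edx, ecx            ; the two chains join here
mov     [Result1], ecx
mov     [Result2], edx

; Case 2: same results, but each chain runs back to back
mov     ecx, [Value1]
inc     ecx
add     ecx, [Value3]
mov     edx, [Value2]
dec     edx
imul    edx, [Value4]
add     edx, ecx            ; the two chains join here
mov     [Result1], ecx
mov     [Result2], edx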

What I wanted to say in my previous comment is that the first piece of code might cause fewer stalls while, in most cases, remaining perfectly readable. There are quite a lot of cases where you do very similar things to several pieces of data, and sticking to a single register, like in MrFox’s example with EAX, is not a good idea anyway.

But, once again, it’s not really important until the code becomes a performance bottleneck.
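
A hedged sketch of the dependency-chain idea discussed above (not code from the thread): summing an array of dwords with two accumulators, so that consecutive ADDs do not have to wait for each other. The label names and register roles are assumptions made for illustration, and whether this is actually faster has to be measured on the target CPU.

```asm
; Hypothetical fasm (use32) routine, for illustration only.
; In:  esi = pointer to an array of dwords
;      ecx = number of dwords (assumed even and non-zero)
; Out: eax = sum of all elements (wraps modulo 2^32)
; Clobbers: ecx, edx, esi
sum_dwords:
        xor     eax, eax            ; accumulator for even-indexed elements
        xor     edx, edx            ; accumulator for odd-indexed elements
.next_pair:
        add     eax, [esi]          ; chain A depends only on eax
        add     edx, [esi + 4]      ; chain B depends only on edx
        add     esi, 8              ; advance to the next pair
        sub     ecx, 2
        jnz     .next_pair
        add     eax, edx            ; join the two chains once, at the end
        ret
```

With a single accumulator every ADD would depend on the previous one; here the two chains only meet after the loop.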
bitRAKE



Joined: 21 Jul 2003
Posts: 4072
Location: vpcmpistri
bitRAKE 27 Aug 2016, 17:00
MrFox wrote:
Thanks Dimon!
DimonSoft wrote:
avoiding access to the same register by subsequent instructions (very simplified explanation!) may let you gain something in terms of performance.
Can you give a more comprehensive explanation or drop a link where I can read more about it?
(Either in English or in Russian)
I couldn't find a Russian-language version, but a good read through the optimization manual would be worthwhile.

http://www.intel.ru/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf

(More specifically, the introduction to Appendix C.)

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup

MrFox 27 Aug 2016, 17:43
Thanks bitRAKE!
As always, Intel doesn't bother to translate stuff into other languages. Actually, I'm fine with English, so I'll definitely give it a go.

Thanks Dimon. I'll keep that in mind.
zhak



Joined: 12 Apr 2005
Posts: 501
Location: Belarus
zhak 29 Aug 2016, 16:02
MrFox wrote:

Question #2
I'm racking my brains over http://wiki.osdev.org/Uefi.inc and can't figure out why on earth the guy needed all those bells and whistles with redefining data types and structures

This is because (if you read the UEFI spec carefully) data in structures should be naturally aligned: dwords should start at 4-byte boundaries, qwords at 8-byte boundaries, etc. So new aligned data types were introduced.

Example of naturally aligning data in a struct:

Code:
struct EFI_SYSTEM_TABLE
  Hdr                   EFI_TABLE_HEADER
  FirmwareVendor        rq 1
  FirmwareRevision      rd 1
                        rd 1  ; alignment for the next qword
  ConsoleInHandle       rq 1
  ConIn                 rq 1
  ConsoleOutHandle      rq 1
  ConOut                rq 1
  StandardErrorHandle   rq 1
  StdErr                rq 1
  RuntimeServices       rq 1
  BootServices          rq 1
  NumberOfTableEntries  rq 1
  ConfigurationTable    rq 1
ends    
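
For comparison, a sketch of the same table on 32-bit (IA32) firmware, where the UEFI spec makes pointers, handles and UINTN 4 bytes wide. The name EFI_SYSTEM_TABLE32 and the plain `virtual` block (instead of the `struct` macro used above) are assumptions for illustration. Every field after the 24-byte header is a dword, so natural alignment needs no padding here.

```asm
; Sketch of the IA32 layout; EFI_SYSTEM_TABLE32 is a hypothetical name.
; The virtual block only computes offsets, no bytes are emitted.
virtual at 0
  EFI_SYSTEM_TABLE32.Hdr                  rb 24   ; EFI_TABLE_HEADER: 8+4+4+4+4 bytes
  EFI_SYSTEM_TABLE32.FirmwareVendor       rd 1    ; offset 24
  EFI_SYSTEM_TABLE32.FirmwareRevision     rd 1    ; offset 28
  EFI_SYSTEM_TABLE32.ConsoleInHandle      rd 1    ; offset 32, already 4-byte aligned
  EFI_SYSTEM_TABLE32.ConIn                rd 1    ; offset 36
  EFI_SYSTEM_TABLE32.ConsoleOutHandle     rd 1    ; offset 40
  EFI_SYSTEM_TABLE32.ConOut               rd 1    ; offset 44
  EFI_SYSTEM_TABLE32.StandardErrorHandle  rd 1    ; offset 48
  EFI_SYSTEM_TABLE32.StdErr               rd 1    ; offset 52
  EFI_SYSTEM_TABLE32.RuntimeServices      rd 1    ; offset 56
  EFI_SYSTEM_TABLE32.BootServices         rd 1    ; offset 60
  EFI_SYSTEM_TABLE32.NumberOfTableEntries rd 1    ; offset 64
  EFI_SYSTEM_TABLE32.ConfigurationTable   rd 1    ; offset 68
  EFI_SYSTEM_TABLE32.size = $                     ; 72 bytes in total
end virtual
```

In the 64-bit layout the extra `rd 1` is needed because FirmwareRevision ends at offset 36, while the next qword must start at offset 40.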

MrFox 01 Sep 2016, 08:15
Thanks zhak

I'm reading the spec as carefully as I can but it's going to be a long-long read so I may be missing some parts of it.

Another question that I have is whether the addresses of SystemTable and its subtables are fixed during the execution of my program or can some of them change suddenly?

The thing is, I'm going to save frequently used pointers to procedures such as SetCursorPosition, OutputString, etc, directly to local memory variables or even in registers so I don't have to refetch them each time I need them from SystemTable entries.

This approach works and saves me a lot of code size and CPU time, but it's safe only if we assume that the addresses of SystemTable, ConOut, ConIn and other things stay the same during the whole program run. Do they?

zhak 01 Sep 2016, 15:09
Yes, they're fixed once you obtain them, and you can safely store shortcuts in regs/memory
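
A minimal sketch of caching such shortcuts once at startup, assuming a 32-bit (IA32) application that stays within boot services (i.e. before ExitBootServices). The label names are hypothetical; the offsets 44 and 4 assume the IA32 layout with 4-byte pointers (ConOut follows the 24-byte header and five dword fields, and OutputString is the second pointer in the protocol).

```asm
; Hypothetical IA32 entry point: efi_main(ImageHandle, SystemTable), cdecl.
; On entry [esp + 4] = ImageHandle and [esp + 8] = SystemTable.
efi_main:
        mov     eax, [esp + 8]          ; SystemTable (second argument)
        mov     [SystemTable], eax
        mov     eax, [eax + 44]         ; SystemTable->ConOut (IA32 offset)
        mov     [ConOut], eax
        mov     eax, [eax + 4]          ; ConOut->OutputString (IA32 offset)
        mov     [OutputString], eax
        xor     eax, eax                ; return EFI_SUCCESS
        ret

; Cached shortcuts, filled in once by efi_main
SystemTable     dd ?
ConOut          dd ?
OutputString    dd ?
```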

MrFox 02 Sep 2016, 17:54
Okies, Thanks!
Yet another question: Is there any way I can use the original tables in my code or do I always have to store them locally to use directly?

I'd like to use things like:
Code:
; Code

;->> Copying Actual tables SystemTable and SimpleTextOutput to local memory (gBS and tXT respectively)

; Something-something

lea eax,[string_to_print]
push eax
push [gBS.ConOut]
call [tXT.OutputString]
add esp, 8

; Data
gBS  EFI_SYSTEM_TABLE
tXT SIMPLE_TEXT_OUTPUT
    

I can do so by declaring e.g. a local memory variable 'gBS' of type/struct 'EFI_SYSTEM_TABLE' and then copying the actual System Table contents to my local copy. Is there any way I can directly reference the original tables residing somewhere in memory for 'gBS' without copying?

zhak 03 Sep 2016, 11:42
I don't understand why you want to copy EFI_SYSTEM_TABLE or any other interface to local memory. It doesn't have any pros, actually. You already have a pointer to the structure. If you want to create system call wrappers (e.g. to push all arguments on the stack, as you described), you can reuse the pointer to the struct in your wrapper proc. But I myself don't like wrapper procs. Although they bring some convenience and make writing code easier, they also cost additional, unnecessary cycles at runtime.

Using a pointer to a pointer, like
Code:
  mov rcx, [r15 + EFI_SYSTEM_TABLE.ConOut]
  call [rcx + EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL.OutputString]
    

is not so much overhead that it is worth copying structures here and there, IMO

MrFox 04 Sep 2016, 07:58
Thanks, that's exactly what I wanted to know!
I'm writing 32-bit code (because my EFI is 32-bit), so I'm not allowed to use rcx, r15, etc.
As for wrapper procs, all UEFI procs use the CDECL convention, so they expect the parameters to be pushed in a certain order before the call and force me to clean the stack afterwards, otherwise they'll hang the PC. Internally (for my own subroutines), I use FASTCALL, i.e. I simply put the values to be passed to a subroutine in registers and call the proc. This is indeed much simpler.
But it works only with a limited number of params, and it occupies registers, which I constantly have to keep in mind so as not to overwrite a register I will need later.
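
A hedged sketch of how the two conventions could meet in 32-bit code: a wrapper that takes its argument in a register and performs the CDECL call through the original tables, without copying them. The names print_string, ST_ConOut and STO_OutputString are hypothetical, the offsets assume the IA32 layout with 4-byte pointers, and [SystemTable] is assumed to have been stored at entry.

```asm
; Hypothetical IA32 wrapper: register argument in, cdecl call out.
; Assumes boot services are still active.
ST_ConOut        = 44       ; offset of ConOut in the IA32 EFI_SYSTEM_TABLE
STO_OutputString = 4        ; offset of OutputString (second pointer in the protocol)

; In:  eax = pointer to a null-terminated UTF-16 string
; Out: eax = EFI_STATUS returned by OutputString
; Clobbers: ecx, edx (the callee may change them under cdecl)
print_string:
        mov     ecx, [SystemTable]
        mov     ecx, [ecx + ST_ConOut]          ; ecx = ConOut ("This")
        push    eax                             ; argument 2: String
        push    ecx                             ; argument 1: This
        call    dword [ecx + STO_OutputString]
        add     esp, 8                          ; cdecl: the caller pops the arguments
        ret

SystemTable     dd ?        ; assumed to be filled in at program entry
```

Arguments are pushed right to left, so String goes on the stack before This.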

zhak 04 Sep 2016, 09:37
Oh wow, nice. Where did you get a 32-bit EFI? What hardware do you have? What firmware version? I've never had a chance to see one.

MrFox 04 Sep 2016, 15:10
Well, I guess 32-bit EFIs are a dime a dozen on cheap laptops/tablets. At least my Irbis TW36 seems to be one of them. It runs on a 4-core 64-bit Atom CPU but still, all the software is 32-bit, namely the EFI and the preinstalled Windows 10. My laptop-transformer has 1G RAM + 32G HDD.

Well, it's just a $100 nice little toy that's just good enough for what I want from it.

It says it has UEFI v2.40 (INSYDE Corp.). The built-in EFI Shell v2.1 is one of the default boot options (although I usually choose 'Boot From File' to test my EFI procs).

MrFox 20 Sep 2016, 06:23
Hi again!
I'm now trying to decide whether to write my own UEFI console UI engine or use the existing one. But I'm still unable to get a grasp of it from the spec, as it lacks code examples and detailed explanations.
Where can I find working examples (written in C or ASM) that use the built-in "Forms" feature of UEFI?

zhak 23 Sep 2016, 13:11
I haven't worked with HII myself yet, so I can't give any examples. But you could try starting with http://uefi.blogspot.com/2009/09/uefi-hii-part-1.html

MrFox 23 Sep 2016, 19:34
Thanks, I'll try to give it a go.

zhak 25 Sep 2016, 22:04
Here I found a HII Training by Intel - https://www.youtube.com/watch?v=11PIctg6pz8
Getting myself prepared to dive into HII as well. I'm just going down the spec trying everything, but it is sooo huge that sometimes it looks like it'll take forever to go through all of it.

MrFox 28 Sep 2016, 07:52
Yup, same here. That's why I decided not to rack my brains over this topic and to write my own simple 'Forms' engine for my simple project.

But still, the topic is of great interest and I would like to ask you to kindly share your experience if you run into something HII-useful and quick to learn and implement.

Some people say "How to eat an elephant? -- Piece by piece". If you find a self-sufficient 'piece' of code or knowledge, please let me know.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
