flat assembler
Message board for the users of flat assembler.
bitRAKE 14 Jul 2013, 11:31
Code:
    use32
    mov eax,$80000005
    @@: dec eax
    push eax
    cpuid
    xchg edx,[esp]
    push ecx ebx eax
    xchg eax,edx
    cmp al,2
    jnz @B
TightCoderEx 14 Jul 2013, 16:40
Original 16 bit version.
Code:
    00  66B804000080  mov eax,0x80000004
    06  6650          push eax
    08  0FA2          cpuid
    0A  6766871424    xchg edx,[esp]
    0F  6651          push ecx
    11  6653          push ebx
    13  6650          push eax
    15  80FA02        cmp dl,0x2
    18  7407          jz 0x21
    1A  6689D0        mov eax,edx
    1D  FEC8          dec al
    1F  EBE5          jmp short 0x6
    21 = 33 Bytes

Improved 16 bit version. In my case I'll need to use this version, because the boot loader has just passed control to here and the CPU is still in REAL mode.
Code:
    00  66B805000080  mov eax,0x80000005
    06  6648          dec eax
    08  6650          push eax
    0A  0FA2          cpuid
    0C  6766871424    xchg edx,[esp]
    11  6651          push ecx
    13  6653          push ebx
    15  6650          push eax
    17  6692          xchg eax,edx
    19  3C02          cmp al,0x2
    1B  75E9          jnz 0x6
    1D = 30 Bytes

32 bit version as per bitRAKE's example.
Code:
    00  B805000080    mov eax,0x80000005
    05  48            dec eax
    06  50            push eax
    07  0FA2          cpuid
    09  871424        xchg edx,[esp]
    0C  51            push ecx
    0D  53            push ebx
    0E  50            push eax
    0F  92            xchg eax,edx
    10  3C02          cmp al,0x2
    12  75F1          jnz 0x5
    14 = 20 Bytes

and 64 bit version
Code:
    00  B805000080    mov eax,0x80000005
    05  FFC8          dec eax
    07  50            push rax
    08  0FA2          cpuid
    0A  67871424      xchg edx,[esp]
    0E  51            push rcx
    0F  53            push rbx
    10  50            push rax
    11  92            xchg eax,edx
    12  3C02          cmp al,0x2
    14  75EF          jnz 0x5
    16 = 22 Bytes
bitRAKE 14 Jul 2013, 23:07
The default address size for use64 is 64-bit, so a byte is saved with "xchg edx,[rsp]". Also, the string is no longer contiguous in memory. Maybe something like,
Code:
    use64
    mov eax,$80000005
    @@: dec eax
    push rax
    cpuid
    xchg ecx,[rsp]
    mov [rsp+4],edx
    push rax
    mov [rsp+4],ebx
    xchg eax,ecx
    cmp al,2
    jnz @B

Code:
    use16
    mov ax,(4+1)*2 + 1
    @@: dec ax
    dec ax ; sub al,2
    push ax
    cwde
    ror eax,1
    cpuid
    pop bp
    pushad
    xchg ax,bp
    cmp al,2
    jnz @B

Oddly, in 16-bit mode, "dec ax" can be used without affecting the upper word, and is one byte shorter. Also, the addressing mode is 16-bit -- "xchg edx,[sp]" saves a byte.
TightCoderEx 15 Jul 2013, 02:22
Yes, I did realize my error for 64 bit, even before I read your message.
I tried one simple mod that added 4 to RSP after each push, but that failed miserably. I don't know if this is the most efficient way, but it works.
Code:
    mov eax, 0x80000005
    @@: dec eax
    push rax
    cpuid
    shl rdx, 32
    add rdx, rcx
    xchg [rsp], rdx
    shl rbx, 32
    add rbx, rax
    push rbx
    xchg rdx, rax
    cmp al, 2
    jnz @B
TightCoderEx 15 Jul 2013, 03:39
64 Bit version of yours @ 27 bytes worked great.
bitRAKE wrote: 21 bytes, and shrinking...
Interesting concept, and I particularly like the preamble. It demonstrates how an acute knowledge of the architecture can lead to creative ideas. Unfortunately, it didn't work.
Code:
    mov ax,(4+1)*2 + 1
    @@: dec ax
    dec ax ; sub al,2
    push ax
    cwde
    ror eax, 1
    cpuid
    pop bp
    pushad
    add sp, 16 ; Need to omit extraneous registers
    xchg ax, bp
    cmp al,5 ; AX saved before ROR.
    jnz @B

Even this version still doesn't produce the desired result, due to the order in which pushad saves registers.
l(R)e(TM CorInte CPU ) i5 @ GHz2.67 750
when it is supposed to look like this
Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz
This code yields the desired result, and saves 2 more bytes compared with my best previous 16 bit version, which was 30 bytes.
Code:
    mov ax,(4+1)*2 + 1
    @@: dec ax
    dec ax ; sub al,2
    push ax
    cwde
    ror eax, 1
    cpuid
    pop di
    push edx
    push ecx
    push ebx
    push eax
    push di
    pop ax
    cmp al,5 ; AX saved before ROR.
    jnz @B

NOTE: The difference in strings, 16 vs 64, is that Bochs is used for 16 bit testing and FDBG is used for 64, hence emulated versus real.
Last edited by TightCoderEx on 15 Jul 2013, 03:48; edited 1 time in total
BAiC 15 Jul 2013, 03:40
you don't need to use the stack for the loop variable. simply use rsi, rdi, or even rbp as the loop variable (or any of the number registers):
Code:
    mov esi, 0x80000005
    @@: dec si
    mov eax, esi
    cpuid
    shl rdx, 32
    shl rbx, 32
    add rdx, rcx
    add rbx, rax
    push rdx
    push rbx
    cmp si, 2
    jnz @B
_________________
byte me.
bitRAKE 15 Jul 2013, 03:56
TightCoderEx wrote: Unfortunately, it didn't work.
XCHG AX,DI is only one byte.
TightCoderEx 15 Jul 2013, 05:14
bitRAKE wrote: I was quite explicit about the display routine needed to account for the lack of organization in the data.
I guess I should have been explicit, but the failure would have come from the processor throwing an exception when 7FFFFFF was passed to CPUID. This is what would have happened on the three extra iterations caused by comparing AL with 2 rather than 5. I never tested the exception theory: as valuable a tool as Bochs is, it's still an emulator and may not mimic real hardware, or at least the hardware I have.
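The counter arithmetic under discussion can be traced by hand. The sketch below is a stripped-down version of the 16 bit loop, in fasm syntax, annotated with the values on each pass. It is only an illustration: the label name `next` is made up, and the CPUID results are discarded instead of being saved.

```asm
; Trace of the (4+1)*2 + 1 counter trick. Illustrative only:
; the cpuid output is not stored, and "next" is a hypothetical label.
use16
        mov     ax,(4+1)*2 + 1  ; ax = 11
next:   dec     ax
        dec     ax              ; ax = 9, 7, 5 on passes 1, 2, 3 (always odd)
        push    ax              ; keep the 16-bit counter across cpuid
        cwde                    ; eax = ax sign-extended: 9, 7, 5
        ror     eax,1           ; bit 0 moves to bit 31:
                                ;   0x80000004, 0x80000003, 0x80000002
        cpuid                   ; overwrites eax, ebx, ecx, edx
        pop     ax              ; restore the counter
        cmp     al,5            ; leaf 0x80000002 was the last one wanted
        jnz     next
```

Because the counter only ever holds odd values, the exit test has to compare against an odd value such as 5; a comparison with 2 never matches.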
bitRAKE 15 Jul 2013, 05:43
That's what I get for not testing the code. DOSBox is the only thing I have on this machine, atm. Thank you for explaining the error further.
baldr 15 Jul 2013, 23:11
Since there's no clearly defined specification for the snippet, I've made it using esi:
Code:
    _cpuid:
    mov esi, 0x80000003 ; 5
    .prev:
    lea eax, [esi+1] ; 3
    cpuid ; 2
    push edx ; 1
    push ecx ; 1
    push ebx ; 1
    push eax ; 1
    dec esi ; 1
    jpo .prev ; 2

Code:
    _squeeze:
    mov esi, esp ; 2
    mov edi, esp ; 2
    xor ecx, ecx ; 2; reset flag: don't copy
    .next:
    lodsb ; 1
    test al, al ; 2
    jz .done ; 2; almost done if NUL
    cmp al, ' ' ; 2
    jecxz .check ; 2; don't copy?
    .copy:
    stosb ; 1; copy
    jne .next ; 2; repeat if not ' '
    not ecx ; 2; ' ' copied, reset flag
    .check:
    je .next ; 2; either via 'jecxz', then ZF indicates al==' '; continue to skip ' ' if so
    ; or ecx was ==-1 (now ==0), then ZF==1; proceed to skip ' '
    not ecx ; 2; set flag (copy); we can get here only if ecx was ==0 && al!=' '
    jmp .copy ; 2; proceed to copy non-' '; note that ZF==0, 'jne .next' uses it
    .done:
    cmp esp, edi ; 2; CF==edi>esp; this accounts for brand string of all ' 's
    sbb edi, ecx ; 2
    ; ecx==0:
    ;   CF==1: edi>esp => skipping after copying => trailing space, decrement
    ;   CF==0: edi==esp => string contains only ' 's => no trailing space, no decrement
    ; ecx==-1: edi>esp since we were copying => CF==1 => no decrement
    stosb ; 1; store NUL
TightCoderEx 16 Jul 2013, 02:32
baldr wrote: Since there's no clearly defined specification for the snippet
Quite true, but I think it's safe to assume that, at minimum, most of us try to get as much computing done as possible with the fewest instructions. This does result in saving space and hopefully time too, but it seems that's not always guaranteed. Until now it was undetermined exactly where I wanted to put this, but all things considered, and thanks to everyone's contributions, the specification is as follows: a 16 bit routine that returns a pointer in ES:DI to the processor's brand string, excluding leading spaces, and its length in CX, minus the terminator.
Code:
    00  66BE03000080  mov esi,0x80000003
    06  67668D4601    lea eax,[esi+0x1]
    0B  0FA2          cpuid
    0D  6652          push edx
    0F  6651          push ecx
    11  6653          push ebx
    13  6650          push eax
    15  664E          dec esi
    17  7BED          jpo 0x6
    19  16            push ss
    1A  07            pop es
    1B  89E7          mov di,sp
    1D  83C9FF        or cx,-1
    20  B020          mov al,' '
    22  F3AE          repe scasb
    24  4F            dec di
    25  83C131        add cx, 31H

NOTE: As this is going to be used immediately after my boot loader is finished, all registers are volatile, except those being used in the process. Protected mode is the most efficient space-wise, but as I'm going directly into long mode from real mode, it's really not an option in this case.
In conclusion, more functionality packed into 8 bytes fewer than my original posting.
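The length arithmetic in the trimming step is easy to misread, so here is a hedged walk-through of just that part in fasm syntax. It assumes SS:SP already points at the 48-byte brand string left on the stack by the CPUID loop, and that the string is 47 characters followed by a single NUL; `k` in the comments stands for the number of leading spaces.

```asm
; Trim leading spaces and compute the length. Sketch only, under the
; assumptions above (47 characters + one NUL, k leading spaces).
use16
        push    ss
        pop     es              ; ES:DI will address the string on the stack
        mov     di,sp
        or      cx,-1           ; cx = 0xFFFF, i.e. -1
        mov     al,' '
        repe    scasb           ; compares k spaces plus the first non-space:
                                ;   cx = -1 - (k+1) = -(k+2), di = sp + k + 1
        dec     di              ; di = sp + k, the first non-space character
        add     cx,49           ; cx = 49 - (k+2) = 47 - k
```

With 7 leading spaces, for example, this leaves CX = 40. If a brand string is padded with more than one trailing NUL, CX would count the extra NULs as well.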
baldr 16 Jul 2013, 04:43
TightCoderEx,
If you're not against self-modifying code, 2 bytes can be shaved off as follows:
Code:
    use16
    _cpuid:
    mov eax, 0x80000003
    label .al byte at $-4
    inc ax
    cpuid
    push edx ecx ebx eax
    dec [.al]
    jnz _cpuid

By the way, on my netbook cpuid shows this spacing:
Code:
    47 65 6E 75 69 6E 65 20-49 6E 74 65 6C 28 52 29  Genuine Intel(R)
    20 43 50 55 20 20 20 20-20 20 20 20 20 20 20 55   CPU           U
    32 33 30 30 20 20 40 20-31 2E 32 30 47 48 7A 00  2300  @ 1.20GHz.

P.S. Long/IA-32e mode is only reachable via protected mode, isn't it? So why don't you take the opportunity to execute this code somewhere between RM and LM, with the shorter encoding? Another 5 bytes.
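For readers unfamiliar with the trick, the following annotated sketch traces how the patched immediate drives the loop. It is an illustration in fasm syntax, not a replacement for the code above: the label name `.leaf` is hypothetical, and it assumes the code sits in writable memory with DS addressing the same segment as CS.

```asm
; Self-modifying loop counter, annotated. Assumes DS = CS and writable code.
use16
_cpuid: mov     eax,0x80000003  ; encoded as 66 B8 03 00 00 80
label   .leaf byte at $-4       ; names the low byte of that imm32
        inc     ax              ; leaf = immediate + 1:
                                ;   0x80000004, 0x80000003, 0x80000002
        cpuid
        push    edx ecx ebx eax ; edx ends up highest, eax lowest in memory
        dec     [.leaf]         ; immediate byte: 03 -> 02 -> 01 -> 00
        jnz     _cpuid          ; falls through once the byte reaches 0
```

After the loop the immediate reads 0x80000000, so the routine cannot simply be run a second time without restoring the byte.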
TightCoderEx 16 Jul 2013, 06:46
On my Asus P8 Z-77V LK, there are 7 leading spaces and as this is just a one time thing, I'm just going to design my screen for 48 bytes.
Self-modifying code is something I used to use a lot on Z80 CP/M machines. I changed your code slightly to my style and to fortify in my mind what "label" does. I notice most programmers initialize segments to known values, and in 16 bit a lot of times DS = CS. In my Boot Loader DS points to 5 segments below CS, thus only saving one byte.
Code:
    label Value word at $ + 2
    @@: mov eax, 0x80000003
    inc ax
    cpuid
    push edx
    push ecx
    push ebx
    push eax
    dec [cs:Value]
    jnz @B
    push ss
    pop es
    mov di, sp
    or cx, -1
    mov al, ' '
    repz scasb
    dec di
    add cx, 49

I used "word" instead of "byte" just to see if there was any change in code.
TightCoderEx 16 Jul 2013, 15:34
baldr wrote: Long/IA-32e mode is only reachable via protected mode, isn't it?
Long Mode Directly
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.