flat assembler
Message board for the users of flat assembler.

Index > Main > Advice needed: data structures

Author
Thread Post new topic Reply to topic
bzdashek



Joined: 15 Feb 2012
Posts: 147
Location: Tolstokvashino, Russia
bzdashek
Hey!

I'm writing a small tool, which parses a file of a fixed structure (see below), and finally, it will be shown in a explorer-like treeview.

The file structure is following: Header part (which is irrelevant) and Data part.
Data part consists of Key string and value description:
Code:
[/]
   Type=empty
[/parent1]
   Type=empty
[/parent1/child1]
  Type=string
  Value="This is value"
  Length=26
[/parent1/child1/definition]
  Type=empty
[/parent1/child1/definition/log]
  Persistence=full
  Type=integer
  Value=5
[/parent1/child2]
...
[/parentN/child2/.../any/random/stuff]
...
[/parentN/childN/.../another/random/stuff]
...
    


Maximum number of entries in key string is known, let's say 16.

The way it needed to be shown to user:
Code:
/ (root node)
|
+-Parent1
| |
| +Child1
| | |
| | +Definition
| |  |
| |  +Log
| |   |
| |   +Random stuff
| +Child2
|
+-ParentN
    


And when the user clicks on, let's say, "Log", he sees in the right plane that it's of type string, and the according value, but that's not of concern.

I believe the best structure to use will be "Tree", but I never wrote something like that on assembler.

Which tree to use? Binary or M-ary? How to store the parent-child relations? How to allocate memory for members? What is the best practice for such tasks?

I have already implemented parsing function for key string:
Code:
; parseKey - parses siebel ns key string (i.e. [/enterprises/waffle/stomper])
; Parameters:
;       lpBuffer - out: pointer to receiving buffer
;       lpStr    - in : pointer to input key string
;       bStrLen  - in : input key string length (BYTE)
;       keyNum   - in : index of key to extract (starting with 1)
; Returns:
;       lpBuffer - contains the extracted key
;       EAX:     - key length - if everything ok
;                -  0         - if something went wrong or key index out of bounds
;                - -1         - if key is root key ([/])
proc parseKey uses esi edi, lpBuffer:DWORD,lpStr:DWORD,bStrLen:BYTE,keyNum:BYTE
        xor     eax,eax
     ; Checking for zero pointers. We don't need those
  cmp     [lpBuffer],eax
      je      .finish
     cmp     [lpStr],eax
 je      .finish
     cmp     [keyNum],al
 je      .finish
     cmp     [bStrLen],al
        je      .finish

 ; Checking for [/] entry
    mov     esi,[lpStr] ;[/]
    mov     eax,[esi]
   and     eax,00FFFFFFh
       cmp     eax,5D2F5Bh
 je      .root

   ; Extracting the key
        movzx   ecx,[bStrLen]
       movzx   edx,[keyNum]
        mov     edi,[lpStr]
 mov     al,'/'                ;This is used as delimiter in key string
      @@:
   repne   scasb
       or      ecx,ecx
     jz      .error          ;We hit an end of string before we could reach the desired key number
       dec     edx
 jnz     @b              ;the beatings will continue till we reach the desired key

       mov     esi,edi         ;when we reach the desired key
      mov     edi,[lpBuffer]  ;prepare to copy it to destination buffer

       xor     ecx,ecx
  .copy_loop:
        lodsb
       cmp     al,'/'
    je      .end_copy
   cmp     al,']'
        je      .end_copy
   cmp     al,0ah
      je      .error
      or      al,al
       je      .error
      stosb
       inc     ecx
 jmp     .copy_loop
  .end_copy:
      mov     byte[edi],0
 mov     eax,ecx
  .finish:
   ret

  .error:
    xor     eax,eax
     jmp     .finish
  .root:
     mov     eax,-1
      jmp     .finish
endp
    


Searching in stream for next Key String:
Code:
; findNextKeyString: searches for next key string in NS file
; Parameters:
;       nsPtr      - in: current position pointer in NS buffer
;       dwBufferSz - in: number of bytes till the end of buffer
; Returns:
;       EAX     - pointer to next key
proc findNextKeyString uses esi, nsPtr:DWORD,dwBufferSz:DWORD
    mov     esi,[nsPtr]
 mov     ecx,[dwBufferSz]
    xor     eax,eax
      @@:
    mov     ah,al
       mov     al,byte[esi]
        inc     esi
 dec     ecx
 jz      .error
      cmp     al,5Bh
      jne     @b
  cmp     ah,0ah
      jne     @b
  lea     eax,[esi-1]
 jmp     .finish
  .error:
    xor     eax,eax
  .finish:
   ret
endp
    


Counting number of entries:
Code:
proc getNumEntries uses edx esi ebx, nsPtr:DWORD,file_size:DWORD,dwHdrSize:DWORD
virtual at ebx
  .fsize LARGE_INTEGER
end virtual
    mov     ebx,[file_size]
     mov     ecx,[.fsize.LowPart]            ;just the low part for now
  sub     ecx,[dwHdrSize]                 ;minus header size
  xor     edx,edx
     mov     esi,[nsPtr]
 mov     al,0ah
  .next:
      mov     ah,al                           ;save previous byte (mov is faster than shl/shr)
    mov     al,byte[esi]                    ;this is faster than "lodsb"
      inc     esi
 cmp     al,5bh
      jz      .fffound
      @@:       
    dec     ecx
 jz      .end_loop
   jmp     .next
  .fffound:
    cmp     ah,0ah
      jnz     @b
  inc     edx
 jmp     @b
  
  .end_loop:
        mov     eax,edx
     ret
endp
    


and hash:
Code:
proc peggeHash512 uses esi ecx, lpStr:DWORD,strLen:DWORD
;http://www.jose.it-berater.org/smfforum/index.php?topic=2388.0
      xor     eax,eax
     mov     esi,[lpStr]
 mov     ecx,[strLen]
        mov     ah,cl
       mov     al,byte[esi]
        cmp     cl,1
        jz      .xhash
      shl     eax,3
       xor     al,byte[esi+1]
      shl     eax,3
  .nexh:
       rol     eax,1
       xor     al,byte[esi]
        inc     esi
 dec     ecx
 jg      .nexh
  .xhash:
      ret
endp
    


But I'm not sure how to put the stuff together.

Please advice!
Post 03 May 2012, 06:08
View user's profile Send private message Reply with quote
JohnFound



Joined: 16 Jun 2003
Posts: 3500
Location: Bulgaria
JohnFound
There are different approaches to the trees.
At first, you can use field ".ptrChildren" in your data record, that to contain pointer to array with the children records.

The second approach is that the tree can be represented by simple linear array. You only need some field to contain the level information. In your case the elements of the array will have:

Parent1: .level = 1
Child1: .level = 2
Definition: .level = 3
Log: .level = 4
Random stuff: .level = 5
Child2: .level = 2
ParentN: .level = 1
Post 03 May 2012, 06:52
View user's profile Send private message Visit poster's website ICQ Number Reply with quote
typedef



Joined: 25 Jul 2010
Posts: 2913
Location: 0x77760000
typedef
you could do:

+Node
|
|__[x] data_size (omit if fixed size)
|
|__[x] data
|
|--+ Child (location = Parent's data.size)
Post 03 May 2012, 07:54
View user's profile Send private message Reply with quote
typedef



Joined: 25 Jul 2010
Posts: 2913
Location: 0x77760000
typedef
Using hash: (random locations, MR Big-O will complain)

Node
--header--
.hash
.data_size (again, omit if fixed size)
.parent (by hash)
.child
--end header--
.data

when reading you'd read the header first then read .data_size bytes to advance.
Post 03 May 2012, 07:58
View user's profile Send private message Reply with quote
bzdashek



Joined: 15 Feb 2012
Posts: 147
Location: Tolstokvashino, Russia
bzdashek
Thanks, JohnFound!

About the second approach, how do children connect to parents?

Also, is there a best method to find a record in a tree without having to scan through the whole tree, if using the first approach, where parent record contains a pointer to the array of child records?

UPD:
Thanks, typedef!

I'll try to implement it.

Maybe it's a good idea to keep a parallel lookup structure containing (hash,address) values?
Post 03 May 2012, 08:30
View user's profile Send private message Reply with quote
typedef



Joined: 25 Jul 2010
Posts: 2913
Location: 0x77760000
typedef
you can also use the indexed "Natural Array mapping". Binary Tree Approach

Node
{
index
sizeof_data
data
}
the first record would have to be unused with index 0

Parent location = index/2
left child = 2index + 1
right child = 2index

but then again it'll depend on the way you'll save the data to the file since that will be your sorting/mapping time.

Also windows controls provide LPARAM pointer. With this pointer you can use to store your items. Actually you need not to sort them you just traverse the TreeView and filling the structure then save to file. Quite easy
Post 03 May 2012, 09:01
View user's profile Send private message Reply with quote
bzdashek



Joined: 15 Feb 2012
Posts: 147
Location: Tolstokvashino, Russia
bzdashek
I managed to build the tree using SysTreeView32 and the following structure:
Code:
struct SIEBNSENTRY              ;size= 32
        itemHash        dd ?    ;4 peggeHash512 hash
        itemId          dd ?    ;4 itemId in treeContol
        parAddr         dd ?    ;4
        keyName         dd ?    ;4 the address of the key name in keyData storage (keyData dd ? - should be dynamically alocated)
        keyLength       dd ?    ;4 length of the key name in keyData storage
        kPersistence    db ?    ;1
        kType           db ?    ;1
        kValue          dd ?    ;4 if "string" - pointer to value, else - value itself
        kLen            dw ?    ;2 Value length * 2
        reserved        dd ?    ;4
ends
    


Where:
itemHash - is the hash of the string, i.e. 'enterprise';
itemId - itemID returned by AddItemin the SysTreeView32
parAddr - parent item index in the array
...
program-specific fields
...
reserved - 32 byte padding

Then I count the number of items to be added, i.e. 65000, then allocate 65000*32+4 bytes of memory.
Additional 4 bytes are placed at the start of the array, and stores the number of items currently stored in this array.

The search starts comparing at the last item (having maximum index), then goes to the parent record,each time comparing hashes,
until it hits the first item in the chain (the root of the tree). If item not found in array, it's being added to the end of array,
and parAddr is updated with the index of the parent record.
Post 14 May 2012, 20:12
View user's profile Send private message Reply with quote
shutdownall



Joined: 02 Apr 2010
Posts: 518
Location: Munich
shutdownall
This is a bit similar to XML structures.
There are many tools for handling XML.
Even every browser can display.

So it would maybe spare time of development to use XML I think. Wink
Post 15 May 2012, 11:51
View user's profile Send private message Send e-mail Reply with quote
bzdashek



Joined: 15 Feb 2012
Posts: 147
Location: Tolstokvashino, Russia
bzdashek
shutdownall wrote:
This is a bit similar to XML structures.
There are many tools for handling XML.
Even every browser can display.

So it would maybe spare time of development to use XML I think. Wink

Unfortunately, these files are not something I invented, but I do have to deal with them - they are part on Siebel CRM system - they contain description of every component in the system, + log levels, etc.
When migrating the configuration from one server to another - it's really painful to go through all keys and fix them manually (changing server names, enterprise name, number of multiservers etc.), so that would be a tool with find/replace features, automatic key length maintaining, and checksum fixer.
Post 15 May 2012, 12:59
View user's profile Send private message Reply with quote
typedef



Joined: 25 Jul 2010
Posts: 2913
Location: 0x77760000
typedef
If time is a choke, just do it in JAVA or C++ and get over it with. Very Happy
Post 15 May 2012, 19:35
View user's profile Send private message Reply with quote
bzdashek



Joined: 15 Feb 2012
Posts: 147
Location: Tolstokvashino, Russia
bzdashek
typedef wrote:
If time is a choke, just do it in JAVA or C++ and get over it with. Very Happy

Very Happy

Fortunately, it is just an ad-hoc task to make mine and my colleagues life easier
Post 15 May 2012, 20:49
View user's profile Send private message Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  


< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2020, Tomasz Grysztar.

Powered by rwasa.