flat assembler
Message board for the users of flat assembler.
Index
> Main > Usefulness of assembly in modern projects Goto page Previous 1, 2 |
Author |
|
DimonSoft 31 Aug 2019, 06:45
st wrote: I expect we would distinguish proving correctness from testing. The former means a mathematical proof and is hard (NP-Hard?) for imperative programming languages like assembly (could be solved with a brute-force algorithm sometimes). That is why functional languages exist. I did mean exactly proving the correctness, not testing. Structured programming has been invented quite long ago and if you feel its three basic structures (sequence, binary decision and generic loop), you can easily evaluate possible value sets for all the variables used in this particular piece of code and make sure that for these values the documented behaviour of language constructs or processor instructions leads to the desired result. Something like abstract interpretation that happens in one’s mind. And, of course, if such evaluation becomes hard it absolutely means the code should be refactored to decrease its complexity. P.S. You definitely can’t use such proving for arbitrary assembly code. But spaghetti machine code is quite rare these days. Modern processors tend to execute structured code faster, so outside of obfuscation (performed over a provable piece of code with methods that are proved to produce correct results) and extreme size optimizations (the same note here), non-structured machine code doesn’t seem to be widely used. Well, definitely, you should be able to recognize the basic structures to prove correctness but for a particular compiler it’s quite easy. |
|||
31 Aug 2019, 06:45 |
|
st 01 Sep 2019, 06:29
DimonSoft wrote: you can easily evaluate possible value sets for all the variables used in this particular piece of code |
|||
01 Sep 2019, 06:29 |
|
DimonSoft 01 Sep 2019, 09:32
st wrote:
“Abstract interpretation” is your search request. Modern compilers have been doing it for ages to ensure as many bugs as possible are found at compile time. A good programmer does the same but is not limited to a particular data structure to represent the information in their mind. Are you trying to tell me that people write code and then just hope it will work? |
|||
01 Sep 2019, 09:32 |
|
st 02 Sep 2019, 06:25
DimonSoft wrote: *** is your search request. DimonSoft wrote: people write code and then just hope it will work This is true, except of rare (particular and elementary) cases[*]. Quote: In computability theory, Rice's theorem states that all non-trivial, semantic properties of programs are undecidable. […] A property is non-trivial if it is neither true for every computable function, nor false for every computable function. [*] Just an example related to assembly: Quote: EverCrypt is a formally verified modern cryptographic provider that provides […] |
|||
02 Sep 2019, 06:25 |
|
Tomasz Grysztar 02 Sep 2019, 09:12
st wrote:
And abstracting out what your registers and relevant memory locations contain in order to prove that at the end of a given function their states are as defined by design/specification is done pretty routinely by assembly programmers. Of course, this is often not formalized enough and mistakes happen. st wrote: This is true, except of rare (particular and elementary) cases[*]. Quote: In computability theory, Rice's theorem states that all non-trivial, semantic properties of programs are undecidable. […] A property is non-trivial if it is neither true for every computable function, nor false for every computable function. |
|||
02 Sep 2019, 09:12 |
|
st 03 Sep 2019, 07:36
Tomasz Grysztar wrote: it definitely can be formalized and used to obtain "mathematical proofs". Tomasz Grysztar wrote: Of course, this is often not formalized enough and mistakes happen. |
|||
03 Sep 2019, 07:36 |
|
Tomasz Grysztar 03 Sep 2019, 08:20
st wrote:
On a simplest example, an instruction like AND EAX,1 leaves only two possibilities for the value of EAX afterwards. So when analyzing the code that later depends on it, you have to trace only two cases instead of 2^32. For a simple real-life case, one could analyze a routine like my uint128_to_float80. The only branches it has are forward ones, so there are no loops and to assess all possible outcomes we just need to determine what the values of registers might be. At every step we can tell what is the possible range of values for any of the registers used (including the bit-span), so we can abstract this out to check just the borderline cases and one generic case for the values in between (with inequalities to express ranges). An experiences assembly programmer does this while coding, skipping most of the formalism, but this is quite similar to the way mathematicians work, often hand-waving parts of the proof that are familiar and do not seem to hide any traps. Of course, both mathematicians and programmers may sometimes make mistakes. But my point is that much of what assembly programmers routinely do is to prove correctness of the code they write, in a manner quite similar to the work of a mathematician (and I say this from a perspective of someone that worked in both roles). [edit] Oh, I actually have an example of an existing assembly routine backed with a formalized mathematical proof - you could take a look at my div10 function. BTW, fasmg uses it now. [/edit] st wrote:
|
|||
03 Sep 2019, 08:20 |
|
DimonSoft 03 Sep 2019, 09:22
st wrote: So, where it *is*? Existing static-analizers for C do partial analysis with false positives and/or require annotation, introducing new semantics (this way we can end up with rewriting everything in Rust ). It’s funny when people start discussing spherical things in vacuum absolutely loosing the whole point of discussion. Proofs are not valueable by themselves, I can proof an DFJDKG (some object in an imaginary parallel universe) is always twice as large as FVJFGK (another object in an imaginary parallel universe), but nobody cares until particular characteristics of the objects are used some way. You make “false positives” sound like it a bad thing. It’s not if you’re looking for a bug and your method tells it’s there. In fact, if you’re looking for bugs and get a false positive from some method it often means you have a code smell that doesn’t cause problems right now but may come into play at some moment in the future. Adding newly introduced annotations into the mix, you will have to do that if you don’t want false positives. Programming languages don’t even pretend to express the solution idea in its entirety. If some analyzer has to work with a program written with only programming language constructs, it has enough information to have, say, 5 false positives out of 100. You just need more information to distinguish false positive case from real bug. You might have this information in your mind but you still need to formalize this idea to make it available for the tool. That’s where annotations come into play. |
|||
03 Sep 2019, 09:22 |
|
revolution 03 Sep 2019, 11:33
It is also possible to verify the the input space if it isn't too large with a brute force test function.
For example: https://board.flatassembler.net/topic.php?p=208281#208281 While the mathematics approach gives us the proof of method, the brute force force approach gives us the proof of implementation. So I think both are useful. |
|||
03 Sep 2019, 11:33 |
|
st 04 Sep 2019, 11:19
Tomasz Grysztar wrote: I actually have an example of an existing assembly routine backed with a formalized mathematical proof - you could take a look at my div10 function. From this point of view, C gives us no advantage (except it is cross-platform). In the same time there are languages designed with Rice's theorem in mind. They prohibit hard to prove cases (non-pure functions). As a side effect they provide no easy way to write low-level code. So, we could consider two variants: 1. assembly language which allows to code effective and correct small routines. 2. some very high level languages that allow to implement complex and correct software. Both have strong and weak sides and complement each other. Actually any macro-assembler is some simple combination of above. fasm g did step forward (and even potentially allows to write cross-platform code). Now we have two points and can imagine vector of Evolution. DimonSoft wrote: It’s funny when people start discussing spherical things in vacuum absolutely loosing the whole point of discussion. bitRAKE wrote: In fact, research continues on "super optimizers" which produce machine code no programmer would have thought of. And here is another point I've repeated a few times : DimonSoft wrote: compiler [...] doesn’t (and cannot) perform complete analysis revolution wrote: It is also possible to verify the the input space if it isn't too large with a brute force test function. Yes, just in simple cases. |
|||
04 Sep 2019, 11:19 |
|
Tomasz Grysztar 04 Sep 2019, 12:47
st wrote: However we may not generalise these particular cases and induce the rule "all hypothetical routines are provable to be correct". No matter are they written in asm or in C. I also think that assembly gives a unique perspective into this problem, since the concept of "non-pure functions" is not even well-applicable in case of machine code. We can have function that operates only on registers and then it has just input in registers and output in registers, with no so-called "side effects". If you take a look at my coding habits (as in fasmg source), you may notice that I define inputs and outputs (including "trashed" registers) in such manner for the functions I design and sometimes a function has additional inputs and/or outputs in memory locations, not only due to the scarcity of registers, but to provide additional options, etc. But this also demonstrates how using register in such role is really no different from using a global variable, as long as it is documented. In fact, every function could be viewed as having entire memory and CPU state (including registers) as its input and output, and then we just abstract out the parts that never get touched by it or are irrelevant. Of course, properly documenting every location that function may read or modify is sometimes a hard task by itself (especially if code also calls other functions, every one introducing its own sets of "touched" locations). But having all interfaces properly defined like that makes it possible to see a way of proving correctness of such function (even if only manually) and the distinction between "pure" and "non-pure" function does not even make much sense in this context. Please forgive me my digression here, I just saw a good opportunity to point out some interesting aspects of assembly programming as I see it. |
|||
04 Sep 2019, 12:47 |
|
Mike Gonta 04 Sep 2019, 17:21
Ideom is a concept oriented language. The basic root concepts (routines) can only be described in Ideom as axioms.
All other Ideom routines are higher level concepts based on these axioms. For example: Code: To add a number to another number:
Add the number to the other number. These axioms are implemented in Ideom as x86 statements: Code: To add a number to another number: x86 $8B7C2404. \ mov edi, [esp+4] ; the number x86 $8B4C2408. \ mov ecx, [esp+8] ; the other number x86 $8B07. \ mov eax, [edi] x86 $0101. \ add [ecx], eax The Ideom also assembles x86 machine code using Ideom code such as: Code: To assemble (mov [ebp+) an offset (], edx) to a hex string: Append $8955 to the hex string. If the offset is short, append the offset as a byte to the hex string. Otherwise add 64 to the hex string's last's target; append the offset to the hex string. \ x86 $895501. \ mov [ebp+1], edx \ x86 $8995FE000000. \ mov [ebp+0xFE], edx |
|||
04 Sep 2019, 17:21 |
|
st 06 Sep 2019, 09:25
Tomasz Grysztar wrote: We can have function that operates only on registers and then it has just input in registers and output in registers, with no so-called "side effects". [...] C does not provide an easy way for such approach (may be with help of some 'context' structure, that is passed as an argument to all functions). |
|||
06 Sep 2019, 09:25 |
|
Tomasz Grysztar 06 Sep 2019, 09:42
st wrote: I think there is a difference: registers are safer. Register is accessible only with corresponding machine instruction. Global variable could be changed via an arbitrary pointer. |
|||
06 Sep 2019, 09:42 |
|
macgub 06 Sep 2019, 10:34
Tomasz Grysztar wrote:
Nice joke at the begginig of the new school year, |
|||
06 Sep 2019, 10:34 |
|
Goto page Previous 1, 2 < Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.