flat assembler
Message board for the users of flat assembler.

Why C++ programs are so big?

vivik



Joined: 29 Oct 2016
Posts: 671
vivik 09 Feb 2018, 19:12
Compare printf with std::cout: the latter is so much bigger. I never could get into C++ just because of that. Something is wrong in the headers.

EASTL may or may not improve the situation a bit; I don't understand how to use it yet.

I wonder if it's possible to write libraries specifically with the goal of reducing size. The obvious way is adding a bunch of #ifdefs, and the linker can often drop the obviously unused functions. Code that heavily uses switch statements can't be properly "compressed" by the linker; I wonder if OOP has it better (though OOP has the potential to increase compile time).
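A minimal sketch of the #ifdef approach, assuming a hypothetical library with optional checksum algorithms (WITH_XOR, WITH_FLETCHER, sum_kind and checksum are made-up names for illustration). The switch refers to every handler it contains, so the linker cannot drop any of them; only the preprocessor removes a disabled case together with the function behind it.

```c
#include <stddef.h>

/* Hypothetical feature switches, normally passed as -DWITH_XOR on the
   command line. WITH_FLETCHER is left undefined on purpose, so that
   code is never compiled at all. */
#define WITH_XOR 1

enum sum_kind { SUM_ADD, SUM_XOR, SUM_FLETCHER };

static unsigned sum_add(const unsigned char *p, size_t n)
{
    unsigned s = 0;
    while (n--)
        s += *p++;
    return s & 0xFFu;
}

#ifdef WITH_XOR
static unsigned sum_xor(const unsigned char *p, size_t n)
{
    unsigned s = 0;
    while (n--)
        s ^= *p++;
    return s;
}
#endif

#ifdef WITH_FLETCHER
static unsigned sum_fletcher(const unsigned char *p, size_t n)
{
    unsigned a = 0, b = 0;
    while (n--) {
        a = (a + *p++) % 255u;
        b = (b + a) % 255u;
    }
    return (b << 8) | a;
}
#endif

/* Stores the checksum in *out and returns 1, or returns 0 when the
   requested algorithm was compiled out. */
int checksum(enum sum_kind kind, const unsigned char *p, size_t n, unsigned *out)
{
    switch (kind) {
    case SUM_ADD:
        *out = sum_add(p, n);
        return 1;
#ifdef WITH_XOR
    case SUM_XOR:
        *out = sum_xor(p, n);
        return 1;
#endif
#ifdef WITH_FLETCHER
    case SUM_FLETCHER:
        *out = sum_fletcher(p, n);
        return 1;
#endif
    default:
        return 0;
    }
}
```

The cost of this design is that every caller has to handle the "compiled out" result, which is part of why libraries tend to ship everything instead.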

Another concern is SDL-like cross-platform libraries. Using the native API should result in smaller programs, because the compiler can't make the cross-platform layer fully transparent; some logic will stay no matter what. For example, an extra matrix multiplication on each render, because OpenGL and Direct3D use different coordinate conventions (left-handed vs. right-handed): how can the compiler possibly optimize this away?
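The compiler probably can't remove that work, but the library can sometimes fold it in ahead of time. A rough sketch, under the simplifying assumption that the two conventions differ only in the sign of the Z axis (real APIs differ in more than that, e.g. depth range): negating one column of a row-major 4x4 matrix is the same as multiplying it on the right by diag(1, 1, -1, 1). The name flip_z is hypothetical, and which side to multiply on depends on the vector convention in use.

```c
/* 4x4 matrix in row-major order: element (row, col) is m[row * 4 + col].
   Negates the Z column in place, i.e. computes m * diag(1, 1, -1, 1)
   without a full 4x4 multiplication. */
void flip_z(float m[16])
{
    int row;
    for (row = 0; row < 4; row++)
        m[row * 4 + 2] = -m[row * 4 + 2];
}
```

If this is applied once when the projection matrix is set, rather than on every draw call, the per-frame cost of the abstraction is close to zero, though the code for it still ends up in the binary.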

Nothing new, I guess. Who cares. I'm just a bit paranoid about watermarks in my programs, which GCC, for example, does put in them. I'm lucky I even noticed them, but I can't remove them unless I recompile GCC myself or run an extra command after compilation.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20290
Location: In your JS exploiting you and your system
revolution 09 Feb 2018, 19:19
I guess anything that supports cross-platform outputs is going to be large. Everything has to be converted to the lowest common denominator and then executed natively. It's the price of convenience.


Last edited by revolution on 10 Feb 2018, 17:10; edited 1 time in total
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 09 Feb 2018, 21:31
Look at Project Oberon (2013): Wirth fit an entire OS (OOP with garbage collector, GUI, compiler) into less than 1 MB on a Xilinx Spartan FPGA.

Or look at Free Pascal (smart linker), output is relatively small (esp. compared to others).

It seems MSVC claims to remove unused functions, which is good. GCC nowadays supposedly can too (although COFF support for that is "experimental", whereas ELF has worked fine for years).
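As a sketch of what that looks like in practice: the flags in the comment below are the usual ones for GCC/Clang and MSVC, while the file and function names are placeholders. Each function goes into its own section, and the linker then discards sections that nothing references.

```c
/* sections.c: one function that is reachable and one that never is.
 *
 * GCC/Clang (ELF):
 *   gcc -Os -ffunction-sections -fdata-sections -Wl,--gc-sections sections.c
 * MSVC:
 *   cl /O1 /Gy sections.c /link /OPT:REF
 *
 * Without these flags the whole object file is linked as one unit,
 * so cube() stays in the executable even though nobody calls it.
 */

unsigned square(unsigned x)
{
    return x * x;
}

/* Never referenced by the rest of the program: a candidate for removal. */
unsigned cube(unsigned x)
{
    return x * x * x;
}

/* The only function the (hypothetical) main program calls. */
unsigned area_of_square(unsigned side)
{
    return square(side);
}
```

Comparing the symbol table of the result with and without the flags (for example with nm) should show whether cube() survived.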

Watermarks? Eh? There are several GCC builds for Windows, so I'm not sure which one(s) you find problematic.

Honestly, if you're that worried about bloat, use an interpreter! Or is the script itself larger than your compiled programs??
vivik



Joined: 29 Oct 2016
Posts: 671
vivik 10 Feb 2018, 08:02
>Honestly, if you're that worried about bloat, use an interpreter!
But what about the size of the interpreter itself?

>Project Oberon
Any good tricks I can steal from there? I'm a fan of KolibriOS, and one of my (will never be done) projects is making a language specifically with the requirement of small size. Just so these poor souls don't torture themselves and me with assembly.

>garbage collector
no

>Free Pascal
how good is it? Probably doesn't support SEH?

>Watermarks? Eh? There are several GCC builds for Windows, so I'm not sure which one(s) you find problematic.
The msys2 one. I'm sure they are all the same.
yeohhs



Joined: 19 Jan 2004
Posts: 195
Location: N 5.43564° E 100.3091°
yeohhs 10 Feb 2018, 10:15
I don't like HLLs either. Besides size, there are also issues with library/package/module versions and binary incompatibility. I've played with more than ten high-level languages, and trying to get third-party libraries to work together seamlessly is a real problem. I'm glad I don't need to code for a living. :)

With fasm or other assemblers, it's simple. Just call the OS API or call the functions in the third-party DLL file. :D
vivik



Joined: 29 Oct 2016
Posts: 671
vivik 10 Feb 2018, 10:34
Yes. If only asm were more readable, or C compilers not this stupid. Actually, the compilers themselves are usually OK, but the runtime and headers are a mess.

What does C lack to replace assembly completely? Probably access to the overflow flag, plus the ability to use custom calling conventions (choosing which register/flag each function argument uses), plus the ability to return multiple values at once. Anything else?

I'm on the path of the dinosaurs. If I care about stuff that nobody cares about, I will be left behind. 10% of the work brings 90% of the profit, but somebody has to do the other 90% of the work as well.
vivik



Joined: 29 Oct 2016
Posts: 671
vivik 10 Feb 2018, 12:34
Even printf increases program size more than it should. The compiler isn't smart enough to guess what parameters this function will be called with, so it includes it in its entirety, even if 90% of it is never used.
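For the simple cases there is a way around the formatter. A minimal sketch, where ulong_to_dec and print_labeled are hypothetical helper names: the number is converted by hand and written with fputs, so no format string is parsed at run time. (For what it's worth, GCC commonly rewrites a printf call whose format is a plain constant string into puts, so the compiler does look at the arguments in the easy cases.)

```c
#include <stdio.h>
#include <stddef.h>

/* Converts v to decimal text in buf and NUL-terminates it.
   buf must hold at least 21 bytes (20 digits for a 64-bit value + NUL).
   Returns the number of digits written. */
size_t ulong_to_dec(char *buf, unsigned long v)
{
    char tmp[24];
    size_t n = 0, i;

    do {                                  /* digits come out least significant first */
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    for (i = 0; i < n; i++)               /* reverse into the caller's buffer */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}

/* Writes "label: value\n" using only fputs. Returns 0 on success, -1 on error. */
int print_labeled(FILE *out, const char *label, unsigned long v)
{
    char digits[24];

    ulong_to_dec(digits, v);
    if (fputs(label, out) == EOF) return -1;
    if (fputs(": ", out) == EOF) return -1;
    if (fputs(digits, out) == EOF) return -1;
    return fputs("\n", out) == EOF ? -1 : 0;
}
```

This still depends on the C library's FILE machinery, so how much it saves depends on the runtime; with a statically linked libc the difference is usually more visible than with a shared one.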
TmX



Joined: 02 Mar 2006
Posts: 841
Location: Jakarta, Indonesia
TmX 10 Feb 2018, 14:54
vivik wrote:
Even printf increases programs more than it should


Maybe you'd like to write your own printf(), then.
Or maybe, in another case, you find malloc() too bloated.
And so on.

Eventually, you write your own C runtime library. :)
vivik



Joined: 29 Oct 2016
Posts: 671
vivik 10 Feb 2018, 17:06
The one that comes with msys2 is absolutely terrible anyway, so yeah, I'm going in that direction.

clib is quite terrible in its own right, because for it, portability > speed. Like any unix/linux shit. It helped it go viral when there were 60 different architectures competing, but now everyone just uses the Intel architecture, so there's not much reason to be super portable anymore. xD
vivik



Joined: 29 Oct 2016
Posts: 671
vivik 10 Feb 2018, 17:08
Eh, what do i care. Fuck. Fuck everything. Do whatever, it's none of my business.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20290
Location: In your JS exploiting you and your system
revolution 10 Feb 2018, 17:15
The libraries try to do everything for everyone. It is what people want from an HLL, right? Else you write all your own stuff, so might as well use assembly. :P
vivik wrote:
... but now everyone just uses intel architecture ...
And ARM. And GPUs. And PIC.
vivik



Joined: 29 Oct 2016
Posts: 671
vivik 10 Feb 2018, 21:11
Sometimes it's simpler to delete the code you don't need than to write everything from the ground up. Sometimes it's easier to write your own code than to figure out the existing one. Eventually you'll have to go in all directions, and then choose the one that works best.

On the theme of writing your own stuff:

I always envied how people could understand programs just by disassembling them, while I couldn't even understand what is in my own programs. Why do I need a runtime if it works without one?
vivik



Joined: 29 Oct 2016
Posts: 671
vivik 11 Feb 2018, 11:35
The fact that I can't compile most of the programs by myself is a bigger problem though.

Look, for example, at https://forum.palemoon.org/viewtopic.php?f=19&t=13556

"Don't try to build on an average laptop. If you are even considering doing this, stop. You need a development-class computer to build Pale Moon. A 64-bit operating system is required!"

This is quite problematic. I can't edit the programs I use in everyday life. Even if I have the source, I can't do anything with it. I don't understand why compiling programs is so difficult: if you distribute your program with debug information and only change one or two functions, you should be able to get away without a full recompilation. I'm pretty sure those requirements are C++'s fault; C is braindead-simple to link together.

If I can just edit the source code, there is no need for plugins or scripting languages, and no limitations either.

I wish I just had money for a bit newer computer, so many problems would go away. And new problems will appear.

On reducing size: one of the makers of .kkrieger said they made a primitive parser of C++ code to find the pieces of code that are never executed. https://fgiesen.wordpress.com/2012/04/08/metaprogramming-for-madmen/ They didn't really use this tool afterwards; parsing C++ is a difficult task. If somebody is going to make a language, it's a cool idea to keep in mind.
TmX



Joined: 02 Mar 2006
Posts: 841
Location: Jakarta, Indonesia
TmX 11 Feb 2018, 14:16
vivik wrote:
I'm pretty sure that those requirements are C++ fault, C is braindead-simple to link together.


Template metaprogramming, deeply nested include files, and so on.
Complex C++ projects can indeed take a long time to compile and require a lot of RAM.

:)
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 11 Feb 2018, 17:15
vivik wrote:
>Honestly, if you're that worried about bloat, use an interpreter!
but the size of interpreter itself?


Depends on the language (and host OS/implementation/compiler).

But if your interpreter is bigger than all your programs separately compiled, then just use a compiler.

Most people recommend Python, but I'd suggest REXX (or maybe AWK).

Quote:

>Project Oberon
good tricks i can steal from there? I'm a fan of KolibriOS, and one of my (will never be done) projects is making a language specifically with the requirement of small size. Just so these poor souls don't torture themselves and me with assembly.


Since he gives away sources to the compiler and OS (compiled by itself), then yes, presumably there's "something" you can learn from there.

Quote:

>garbage collector
no


Oberon is fairly minimal, and while I'm not sure which variant of garbage collection it uses (in various implementations), it's neither bloated nor slow.

Quote:

>Free Pascal
how good is it? Probably doesn't support SEH?


Very good (esp. the Turbo and Delphi dialects). It compiles itself and has a very good IDE (Lazarus is the GUI one, although a classic text-mode IDE also exists).

A quick search says that 3.0.x does indeed support SEH (which I'm not familiar with).

Quote:

>Watermarks? Eh? There are several GCC builds for Windows, so I'm not sure which one(s) you find problematic.
the msys2 one. I'm sure they all are same.


So what? Why is that a problem?
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 11 Feb 2018, 17:20
vivik wrote:
Yes. If only asm was more readable, or c compilers not this stupid. Actually, compilers themselves are usually ok, but the runtime and headers are a mess.


So use a language without headers, e.g. Delphi (FPC) or Oberon or ....

Quote:

What does C lack to replace assembly completely? Probably access to the overflow flag, plus the ability to use custom calling conventions (choosing which register/flag each function argument uses), plus the ability to return multiple values at once. Anything else?


ANSI C can return structured items, e.g. structs. Not quite multiple return (like Go ??) but close enough. Even classic Pascal can quasi-return multiple values in a VAR parameter. Others can handle structured returns too (FPC's default "fpc" dialect, ISO Modula-2, Extended Pascal).
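A small sketch of that idea in plain C, which also covers the overflow question for the unsigned case. The names add_result and add_with_carry are made up for illustration. Unsigned addition wraps by definition, so the carry can be recovered by comparing the result with an operand, and compilers can often turn that comparison into a use of the CPU's carry flag.

```c
/* Two results returned at once: the wrapped sum and the carry out. */
struct add_result {
    unsigned sum;   /* a + b modulo 2^N, N being the width of unsigned */
    int carry;      /* 1 if the true sum did not fit, otherwise 0 */
};

struct add_result add_with_carry(unsigned a, unsigned b)
{
    struct add_result r;

    r.sum = a + b;          /* well defined: unsigned arithmetic wraps */
    r.carry = r.sum < a;    /* the sum wrapped iff it is smaller than an operand */
    return r;
}
```

GCC and Clang also provide __builtin_add_overflow as a non-standard extension, which handles signed types as well; the version above sticks to standard C.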

Quote:

Even printf increases program size more than it should. The compiler isn't smart enough to guess what parameters this function will be called with, so it includes it in its entirety, even if 90% of it is never used.


Probably because most modern OSes dynamically link in the C lib. So they don't duplicate it much.

You can always write your own (or pare down an existing implementation, e.g. from newlib). Try also looking at TinyStdio.
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 11 Feb 2018, 17:24
vivik wrote:
The one that comes with msys2 is absolutely terrible anyway, so yeah, i'm going in that direction.

clib is quite terrible in its own right, because for it, portability > speed. Like any unix/linux shit. It helped it go viral when there were 60 different architectures competing, but now everyone just uses the Intel architecture, so there's not much reason to be super portable anymore. xD


printf has increased a lot (in size and functionality) since classic C89. It's almost unavoidable.

Not much reason to be super portable? Some people still support IA-32 (or even IA-16!). Don't forget that x64 differs in some details across OSes (LP64 vs. LLP64), too.

And yes, there are still lots of (various) ARM devices, upcoming alternatives like RISC-V, and a bunch of others (PPC64) still used elsewhere. But I'll agree that many other arches have died out.
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 11 Feb 2018, 17:45
vivik wrote:
The fact that I can't compile most of the programs by myself is a bigger problem though.

Look for example https://forum.palemoon.org/viewtopic.php?f=19&t=13556

"Don't try to build on an average laptop. If you are even considering doing this, stop. You need a development-class computer to build Pale Moon. A 64-bit operating system is required!"

This is quite problematic. I can't edit programs I'm using in everyday life. Even if I have the source, I can't do anything with it.


It's not something most people, even developers, want to fiddle with in their spare time. It's not crucial enough (or at least not enough to constantly rebuild ... but see Gentoo!).

If you absolutely insist on rebuilding a web browser, I'd suggest a simpler alternative like Dillo or Links2.

Quote:

I don't understand why compiling programs is so difficult,


OSes, APIs, cpus, standards, dialects, libraries, licenses, experience, time, ....

It's only as simple as you make it. And it takes effort to simplify things. A famous quote (by Einstein, often re-quoted by Wirth for Oberon) is "make things as simple as possible ... but not simpler!" ("Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy" -- Alan Perlis.) Most people don't want to stick to simple, ol' Brainf***. Laughing

Developers only have so much time to do things, so sometimes things are suboptimal. Of course even they can't know literally everything, so it will always be somewhat incomplete, inefficient, buggy, etc. Sure, I agree it should be simpler, smaller, more portable, etc., but that doesn't come for free. It takes a lot of effort, hard work, and lots of testing to get things to where they aren't totally horrible. As is, we're probably doing fairly well at managing complexity. (It could be much worse, that's all I'm saying. Many people have made improvements to many things.)

Quote:

if you distribute your program with the debug information, and only change one or two functions, you should be able to get away without the full recompilation. I'm pretty sure that those requirements are C++ fault, C is braindead-simple to link together.


What do you want, Xerox Smalltalk? You could change many things at runtime, but overall it was too slow.

Anyways, in modern C or C++ projects, "full" recompilation is only done once. After that, the makefile will only build what's needed. Then again, linking also takes a lot of time (and memory, disk space, etc).

Quote:

If I can just edit the source code, there is no need for plugins or scripting languages, and no limitations either.

I wish I just had money for a bit newer computer, so many problems would go away. And new problems will appear.


Even with modern hardware, interpreters are too slow (e.g. Python). Modern hardware will speed things up, but only partially. The rest just needs better algorithms, more profiling, refactoring, better libs, etc.

But languages (and OSes) like Oberon are very fast. Sure, GCC is slow, but that's not an easy egg to crack (or they would've already done it). Hence why Clang is popular, it tried to be faster (though these days dunno if that's still true). Also, see TinyC, faster builds for less optimizations.

Quote:

If somebody is to make a language, it's a cool idea to keep in mind.


C++ originally had to run in less than 1 MB. But it has grown and added lots since then. Even C and Pascal were designed to be single-pass. But C made the mistake of #include and preprocessor whereas things like Modula-2 had true modules and didn't need kludgy makefiles at all. Even C++ is (mostly) trying to invent its own module system to speed up builds, but it's not quite finalized yet (last I heard).

I haven't really tried lately, and it's comparing apples to oranges, but Free Pascal is probably faster to rebuild itself than GCC (which is now implemented in C++). Of course, it doesn't reparse headers either (although units aren't quite as good as proper modules but close enough).

Sometimes there is no easy answer. But often the reason is known but nobody has found the time to do it. It's always a tradeoff, many considerations, so nothing is as simple as it sounds.

EDIT: Somebody once tried to make a simpler LALR grammar for C++ (called SPECS), but it didn't achieve much popularity, dunno why. I've heard people naively say that "compilers are already fast enough", but that's ridiculous. The lazy way out is to throw more GHz (or cores) into the mix. I think we still have a lot to learn from Wirth (see his _A Plea for Lean Software_, circa 1995!).
FlierMate



Joined: 21 Jan 2021
Posts: 219
FlierMate 11 Apr 2021, 11:23
Even Pascal is huge too!

For example, just displaying a "Hello World" text string on screen gives me a 49,815-byte EXE.

When I disassemble the code section, it is just this (or look at the screenshot below):
Code:
Disassembly:
0:  55                      push   ebp
1:  89 e5                   mov    ebp,esp
3:  c6 05 20 82 40 00 01    mov    BYTE PTR ds:0x408220,0x1
a:  68 d0 b2 40 00          push   0x40b2d0
f:  6a f6                   push   0xfffffff6
11: e8 da fa ff ff          call   0xfffffaf0
16: 50                      push   eax
17: e8 e4 fa ff ff          call   0xfffffb00
1c: b9 00 c0 40 00          mov    ecx,0x40c000
21: ba 04 c0 40 00          mov    edx,0x40c004
26: b8 50 80 40 00          mov    eax,0x408050
2b: e8 30 ff ff ff          call   0xffffff60
30: e8 3b ff ff ff          call   0xffffff70
35: b8 10 80 40 00          mov    eax,0x408010
3a: e8 91 5c 00 00          call   0x5cd0
3f: 89 ec                   mov    esp,ebp
41: 5d                      pop    ebp
42: c3                      ret    


The rest of the EXE is packed with TLS, IAT...
Maybe someone can explain to me what each of the sections does (even the .text code section is huge!):

Code:
Section Header
+-+-+-+-+-+-+-
####  Name      VSize       RVA         RSize       Offset      Flags
----  ----      -----       ---         -----       ------      -----
   1  .text     0x00006B50  0x00001000  0x00006C00  0x00000400  0x60000020
   2  .data     0x00000514  0x00008000  0x00000600  0x00007000  0xC0000040
   3  .rdata    0x000000F0  0x00009000  0x00000200  0x00007600  0x40000040
   4  .bss      0x00001B34  0x0000A000  0x00000000  0x00000000  0xC0000080
   5  .CRT      0x0000000C  0x0000C000  0x00000200  0x00007800  0xC0000040
   6  .idata    0x0000077D  0x0000D000  0x00000800  0x00007A00  0xC0000040
   7  .stab     0x00000090  0x0000E000  0x00000200  0x00008200  0x42000040
   8  .stabstr  0x00000052  0x0000F000  0x00000200  0x00008400  0x42000040    


I recall someone on here saying: "Free Pascal policy is EXE size does not matter".

I would like to compress these EXEs compiled by Free Pascal, but it looks more complicated than I thought.


[Attachment: firstapp.PNG. Disassembly of "Hello World" EXE compiled using FPC 3.2.0 (for i386-Win32)]
Furs



Joined: 04 Mar 2016
Posts: 2493
Furs 11 Apr 2021, 12:43
Just compile with -nostdlib and import functions manually just like in asm.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
