Size of "Hello world"

Alf P. Steinbach

The typical size of a machine code executable "Hello world", generated by a high
level language compiler, has increased steadily over the years.

Is this increase exponential, linear, or what?

Assuming that C++ is still around, when will the typical size for a C++ "Hello
world" executable have exceeded 1 GiB?




- Alf (wondering)
 
werasm

The typical size of a machine code executable "Hello world", generated by a high
level language compiler, has increased steadily over the years.

Is this the case when one omits symbols during compilation too?

When not omitting symbols, templates tend to cause long symbol names
that increase the size. I've found that for applications that I've
written the symbols may make up 80 to 100 MB, whereas the actual meat
is perhaps only 3 to 5 MB. Hello World without symbols is pretty
small (using gcc).

Kind Regards,

Werner
 
Marek Borowski

Is this the case when one omits symbols during compilation too?

When not omitting symbols, templates tend to cause long symbol names
that increase the size. I've found that for application that I've
written the symbols may make up 80 -> 100 MB, where as the actual meat
is perhaps only 3 -> 5 MB. Hello World without symbols are pretty
small (using gcc).
3 MB for hello world is small? You are joking, aren't you?
Using pure kernel calls it can be done in under 3 KB, and even that
size is mostly due to the headers of the executable format.


Regards

Marek
 
werasm

3 MB for hello world is small ????? You are joking aren't you ?
Using pure kernel calls it can be done < 3KB. It will be so big due to
headers of executable format.

You aren't reading what I've said - I've said that applications
that have a size of 100MB reduce to 3MB when omitting symbols.
Obviously this is not referring to "Hello World" - obvious to
me at least.

Kind Regards,

Werner
 
MiB

The typical size of a machine code executable "Hello world", generated by a high
level language compiler, has increased steadily over the years.

Where did you get this?
Using Visual C++ 2010, creating native x86 code in default release
settings (i.e. no fancy optimization tricks), a C++ program printing
"hello world" to a console window is 8k.
I expect similar results from any current C++ compiler.

MiB.
 
Alf P. Steinbach

* MiB:
Where did you get this?

Experience and general knowledge.

E.g. the size shown below wouldn't even fit in a ZX80, much less the KIM-1.

Although the ZX80 possibly didn't have a C++ compiler.

Using Visual C++ 2010, creating native x86 code in default release
settings (i.e. no fancy optimization tricks), a C++ program printing
"hello world" to a console window is 8k.

No doubt it can be reduced to that. In fact I have no trouble reducing it to 4
KiB in Windows (and less if I ain't afraid of letting the loader fix up things,
but there's a 4 KiB pagesize), and in *nix it can be just a few hundred bytes
IIRC. Which doesn't say anything, really -- I was talking about typical size.


<example>
C:\test> cedit x.cpp

C:\test> msvc --version
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.6030 for 80x86
Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

C:\test> msvc x.cpp
x.cpp

C:\test> dir | find "x.exe"
21.04.2010 16:43 73 728 x.exe

C:\test> gnuc --version
g++ (GCC) 3.4.5 (mingw-vista special r3)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


C:\test> gnuc x.cpp -o x

C:\test> dir | find "x.exe"
21.04.2010 16:44 488 517 x.exe

C:\test> gnuc x.cpp -s -o x

C:\test> dir | find "x.exe"
21.04.2010 16:44 276 480 x.exe

C:\test> type x.cpp
#include <iostream>
int main(){ std::cout << "Hello, world!" << std::endl; }

C:\test> "c:\Program Files\Microsoft Visual Studio 9.0\Common7\Tools\vsvars32.bat"
Setting environment for using Microsoft Visual Studio 2008 x86 tools.

C:\test> msvc --version
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

C:\test> msvc x.cpp
x.cpp

C:\test> dir | find "x.exe"
21.04.2010 16:46 98 816 x.exe

C:\test> _
</example>


I expect similar results of any concurrent C++ compiler.

Oh?

C above. ;-)


Cheers,

- Alf
 
Puppet_Sock

The typical size of a machine code executable
[snip]

Sure does not seem to be language related. Is it? After all,
the example you spoke of, Hello World!, isn't changing. So
this sure seems to belong on other forums.
Socks
 
Jonathan Lee

The typical size of a machine code executable "Hello world", generated by a high
level language compiler, has increased steadily over the years.

Just some figures for Linux 64-bit machine, gcc 4.4.3

C        a.out        4496 bytes
C++      a.out        4824 bytes (using printf)
C++      a.out        5536 bytes (using cout)
C#       hello.exe    3584 bytes (size of mono not included)
haskell  hello      436496 bytes (seriously)
fasm     hello         229 bytes

Remembering that C#, Java, and such require an interpreter
I'd say I'm with you on this.
Is this increase exponential, linear, or what?

I would guess exponential-ish, following Moore's Law. IMO
high level languages grow as average computer resources
grow. Why can Haskell make a 400kb hello world program
when you only need 230 bytes to do it? 'Cause it can.
Assuming that C++ is still around, when will the typical size for a C++ "Hello
world" executable have exceeded 1 GiB?

As for C++ specifically, it doesn't seem to be following
that trend. I mean, at 5kb today, how low could it have
been in the past? Since C++ tends to follow the "don't
pay for what you don't use" idea, I would guess any
increase in size is from the larger instruction set on
my 64-bit machine; linking to a few extra libraries
than the C version; and changes in the standard.

These kinds of things I expect to grow linearly (without
any evidence, of course). Consequently, I think C++ will
follow.

--Jonathan
 
Michael Oswald

Just some figures for Linux 64-bit machine, gcc 4.4.3

C a.out 4496 bytes
C++ a.out 4824 bytes (using printf)
C++ a.out 5536 bytes (using cout)
C# hello.exe 3584 bytes (size of mono not included)
haskell hello 436496 bytes (seriously)
fasm hello 229 bytes

Why can Haskell make a 400kb hello world program
when you only need 230 bytes to do it? 'Cause it can.

Well, it's more likely the Haskell runtime, which is a bit more
complicated than the runtime from C++.

Still, if you include the Java VM and Mono, it's comparatively small, but
nothing compared to C/C++.


lg,
Michael
 
Jonathan Lee

Well, it's more likely the Haskell runtime, which is a bit more
complicated than the runtime from C++.

Sure, but my point is that as computer resources increase,
we will see languages fill in the space. With what, exactly,
is just incidental.

--Jonathan
 
cpp4ever

The typical size of a machine code executable "Hello world", generated
by a high level language compiler, has increased steadily over the years.

Is this increase exponential, linear, or what?

Assuming that C++ is still around, when will the typical size for a C++
"Hello world" executable have exceeded 1 GiB?




- Alf (wondering)

At that time you can add a function called "Goodbye memory".

JB
 
Balog Pal

Alf P. Steinbach said:
No doubt it can be reduced to that.

It is not "can be reduced" but it *is* 8k. I really miss the point of
this thread.
In fact I have no trouble reducing it to 4 KiB in Windows (and less if I
ain't afraid of letting the loader fix up things, but there's a 4 KiB
pagesize), and in *nix it can be just a few hundred bytes IIRC. Which
doesn't say anything, really -- I was talking about typical size.

Is "typical" supposed to mean tweaked specially to show an increase?
<example>
C:\test> cedit x.cpp

C:\test> msvc --version
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.6030 for
80x86
Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

C:\test> msvc x.cpp
x.cpp

C:\test> dir | find "x.exe"
21.04.2010 16:43 73 728 x.exe
C:\test> type x.cpp
#include <iostream>
int main(){ std::cout << "Hello, world!" << std::endl; }

Dunno what you do. I launched Visual Studio 2008 (SP1), created a new project
for a Win32 console, then copied in your code (by myself I'd certainly have
used puts; streams are pure overhead...):

#include "stdafx.h"
#include <iostream>

int _tmain(int argc, _TCHAR* argv[])
{ std::cout << "Hello, world!" << std::endl; }

Switch the target to Release, and the resulting exe has size 9216. Not anywhere
near 73k. Sure, it is still quite fat; dumpbin reveals that it has a
manifest, a bunch of locking in streambuf, security cookies, etc.


Your command-line trials seem to lack any request for optimization options. Of
course you can get an arbitrary amount of stuff into the executable, but that
proves nothing except that the tools were never supposed to be used that way.

In the old days the runtime library was split into an extreme number of object
files, as the linker could include only whole objects. That practice
was dropped once function-level linking was implemented. That is certainly a
thing you do want to use for a 'release'.

By default it is not on, I guess either to keep tradition or to save on
compile time -- after all, during development we build release far less often.
 
Ian Collins

It is not "can be reduded" but it *is* 8k. I really miss your point of
this thread.

It is a bit on the whimsical side!

One thing that's often overlooked is in a hosted environment, the size
of an executable doesn't include the size of the library functions used
(unless static linking is used). So in C or C++ terms, "the size of
hello world" is pretty meaningless.

To prove the point, compare the size of

int main(int argc, char* argv[]) {
    return 0;
}

and

#include <stdio.h>

int main(int argc, char* argv[]) {
    puts( "Hello, world!\n" );
    return 0;
}

The difference is a function call. The executable size is mainly the
invisible start and finish code.
Dunno what you do. I launched Visual Studio 2008(SP1), create new
project for WIN32 console, then copy in your code ( by myself I'd
certainly used puts, streams is pure overhead...):

#include "stdafx.h"

#include <iostream>

int _tmain(int argc, _TCHAR* argv[])

{ std::cout << "Hello, world!" << std::endl; }

IS that C++??
switch taregt to Release, and the resulting exe has size 9216. Not
anywhere near 73k. Sure it is still quite fat, dumpbin reveals that it
has a manifest, a bunch of locking in streambuf, security cookies, etc.

To go further (if your platform supports it), remove symbols:

ls -l a.out : 9624

strip a.out

ls -l a.out : 6784

Utterly pointless I know!
 
Balog Pal

Ian Collins said:
#include "stdafx.h"

#include <iostream>

int _tmain(int argc, _TCHAR* argv[])

{ std::cout << "Hello, world!" << std::endl; }

IS that C++??

Why not? The MS toolchain "supposedly" helps to use a character type as either
an 8-bit thing (ASCII or some codepage) or 16-bit UTF-16. TCHAR and plenty of
t-containing things will be either char or wchar_t depending on project
settings (forcing #define UNICODE or something like that).

Or did you mean that one is supposed to also #include <ostream> to have
operator << for sure and not just by luck? ;-)
 
Jerry Coffin

The typical size of a machine code executable "Hello world",
generated by a high level language compiler, has increased steadily
over the years.

I don't think it's steady -- it happens in steps. Virtually every new
OS has a new executable format that increases overhead compared to
its predecessors. Just for example, a "Hello world" for MS-DOS as a
.COM file (written in assembly language) could be around 20 bytes --
but that had essentially no overhead; if you added up the size of the
string and the size of the code, you got the size of the file -- to
the byte.

If you did the same in a .exe file, you got overhead. Still using
assembly language, the file came to something like 200 bytes or so.

In 16-bit Windows, the bare minimum file size grew again -- to 512
bytes, and a hello world program ended up something like 5120 bytes.

In 32-bit Windows that went up to 720 bytes.

In 64-bit Windows, it's risen again -- for a 32-bit program, it's now
2560 bytes, and for a 64-bit program it's 3072 bytes.

That's a bit misleading though -- even in the 64-bit executable, the
actual machine code inside that executable is a mere 33 bytes. That's
larger than (for example) the 16-bit Windows version, but most of the
change is simply because addresses are larger -- 64-bits apiece
instead of 16. At least IMO, the expanded addressing capability makes
that minuscule increase in size *entirely* worthwhile.

The rest of the difference is entirely in the overhead of the
executable file format itself, not in the machine code in that
executable. The .com file had zero overhead, but carried quite a few
limitations with it. In a multitasking system, it would be
essentially impossible to share images loaded from such files between
processes, so the savings in disk space would come at the expense of
consuming substantially more memory in operation. That doesn't strike
me as a good tradeoff.
 
Alf P. Steinbach

* Balog Pal:
Alf P. Steinbach said:
No doubt it can be reduced to that.

It is not "can be reduded" but it *is* 8k. I really miss your point of
this thread.
In fact I have no trouble reducing it to 4 KiB in Windows (and less if
I ain't afraid of letting the loader fix up things, but there's a 4
KiB pagesize), and in *nix it can be just a few hundred bytes IIRC.
Which doesn't say anything, really -- I was talking about typical size.

Typical supposed to mean tweaked specially to show increase?
<example>
C:\test> cedit x.cpp

C:\test> msvc --version
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.6030 for
80x86
Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

C:\test> msvc x.cpp
x.cpp

C:\test> dir | find "x.exe"
21.04.2010 16:43 73 728 x.exe
C:\test> type x.cpp
#include <iostream>
int main(){ std::cout << "Hello, world!" << std::endl; }

Dunno what you do. I launched Visual Studio 2008(SP1), create new
project for WIN32 console, then copy in your code ( by myself I'd
certainly used puts, streams is pure overhead...):

#include "stdafx.h"

#include <iostream>

int _tmain(int argc, _TCHAR* argv[])

{ std::cout << "Hello, world!" << std::endl; }

switch taregt to Release, and the resulting exe has size 9216. Not
anywhere near 73k. Sure it is still quite fat, dumpbin reveals that it
has a manifest, a bunch of locking in streambuf, security cookies, etc.

Have you tried this 9K program on a computer without Visual Studio?

How much did you have to copy to make it work?

Note: I don't know the answer to that question. With the examples I
presented the programs could be copied freely, with the stated sizes. I suspect
that your Visual Studio default-settings program may be larger than you think...


Cheers & hth.,

- Alf
 
Jerry Coffin

[ ... ]
In 16-bit Windows, the bare minimum file size grew again -- to 512
bytes, and a hello world program ended up something like 5120 bytes.

Oops - that should be "520 bytes". My apologies.
 
