Android—Why Dalvik?

  • Thread starter Lawrence D'Oliveiro

BGB

Licensing issues are the biggest one. There is also the fact that it
defeats the purpose of the plugin architecture: it's not like Firefox
bundles Flash, the other You Need This plugin.

Actually, as an aside, Firefox used to have a deep integration via OJI,
and until Firefox 3 or so, it was theoretically possible to write your
extension in Java (JavaXPCOM has since been broken, I think). Nowadays,
Mozilla-Java integration is mostly limited to the standard NPAPI (i.e.,
plugin) architecture capabilities.


Judging from wget, the size of Firefox right now is 12 MB. While
download size is not a huge priority, a 150% regression is definitely
not going to be accepted. And keep in mind that not everyone has access
to high-speed internet; e.g., Africa.

and, even in the US, some of us in non-urban areas (IOW, where the
county doesn't bother to pave any of the roads) have poor-quality DSL,
where an added 18MB could add maybe an extra hour or more to the
download time...
 

Lawrence D'Oliveiro

On 05/31/2011 11:05 PM, Lawrence D'Oliveiro wrote:

Funny. I thought C/C++ was supposed to be portable.

C certainly is. “Write once, run everywhere” is more true of C than it is of
Java; a portable compiler like GCC means C is the most portable language in
the world.
With Java, it doesn't matter which compiler I use to link the binary; they
all do the same thing. Even if I don't program my code in Java. ;-)

Java has extreme ABI portability--any compiler, any OS, any arch.

At the cost of putting the burden on the recipient of your code to figure
out how to run a .jar file on their system.
 

Michal Kleczek

Lawrence said:
The Android Builders Summit
<http://free-electrons.com/blog/abs-2011-videos/> had some interesting
presentations, in particular Karim Yaghmour’s delving into the internals
of Android, and Aleksander Gargenta’s “A Walk Through The Android Stack”.

From 48:00 onwards, Gargenta explains why Android uses the Dalvik VM
instead of the Java VM.

* Why not Java SE? Too bloated, not suitable for low-power applications.
* Why not Java ME? Too expensive, everything runs in one VM => lousy
security. And you don’t get the necessary hardware access.

There is also something called Java SE Embedded. Together with LWUIT it
could be a great platform for mobile devices. The only real problem
compared to Android is not technical but simply licensing.
Dalvik is purpose-built from the ground up; its .dex code is, even
uncompressed, slightly smaller than a compressed .jar file. This
simplifies class loading—a .apk file can be opened and mmap’d, and the
code is ready for execution. (This is why zipalign is so important when
building an Android app.)

Dalvik is also register-based, not stack-based, for higher performance.

See:
http://blogs.oracle.com/javaseembed...performance_stack_up_against_java_se_embedded
 

Lawrence D'Oliveiro

The only real problem with [Java SE Embedded] compared to Android is not
technical but simply licensing.

Like the only real problem with pigs flying is simply gravity.
 

BGB

To run a .jar file on my system I double click on the file. I can run
a C program using the GCC compiler as well but it is a lot more
trouble than a double click.

On my system a .jar file is immediately runnable while a C source file
isn't. C is not "write once, run everywhere"; it is "write once,
compile and run everywhere." Java removes the compile step from the
user's end.

To me the .jar is more portable.

yes, but to be fair, it is a little less convenient in some cases than a
native binary would be; for example (AFAIK), on Linux it would be
necessary to use a shell script to wrap the call to 'java' to make it
behave more like a native program from the shell (GNOME does file
associations, so it will probably handle this case).

say, "myprogram.sh":
#!/bin/sh
exec java -jar myprogram.jar "$@"


I just did a little experiment here, and it seems on modern Windows
(Win7) one can essentially launch files (jars included) directly from
the command-line (presumably arguments would be passed to the JVM and be
parsed correctly, but I didn't test this...). I suspect this is because
the CMD shell now checks for associations and launches the program.

AFAICT, if one goes and adds JAR to the PATHEXT environment variable,
then it is no longer necessary to type the .jar file extension when
launching JARs.


one could do similar with C files, but alas, this would hinder their
more useful behavior:
double-click to launch one's favorite text editor...

well, and as well, C programs are not typically self-contained in a
single source file, so one would more likely need a Makefile-like
launcher script of some sort...

then this made me wonder about something... apparently it seems EXE
files do have file associations in the registry... odd...


then again, I do remember an instance of someone going and messing with
file associations to break Windows in a relatively amusing way (pretty
much nothing could be done on the computer, because nearly all actions
resulted in Notepad windows filled with binary garbage).


it does bring up an idle mystery though as to how much of a central role
the OS's binary format really needs in an OS, or if it could be largely
reduced to "just another file format" as far as the kernel is concerned
(all files launched by associations, including program binaries, just
with a little bit of a hack for the "main program loader" or similar,
with the OS possibly allowing secondary loaders with a behavior
analogous to that of the main loader).


or such...
 

BGB

The only real problem with [Java SE Embedded] compared to Android is not
technical but simply licensing.

Like the only real problem with pigs flying is simply gravity.

note that Java SE Embedded was outperforming Dalvik on the linked-to
benchmark...

granted, yes, it would also be nice to see a comparison of things like
memory footprint and similar, which could be a relevant factor, but this
was not provided in the linked benchmark.
 

Joshua Cranmer

yes, but to be fair, it is a little less convenient in some cases than a
native binary would be; for example (AFAIK), on Linux it would be
necessary to use a shell script to wrap the call to 'java' to make it
behave more like a native program from the shell (GNOME does file
associations, so it will probably handle this case).

say, "myprogram.sh":
#!/bin/sh
exec java -jar myprogram.jar "$@"

If that really pisses you off to type in `java -jar', then just chmod +x
the jar file and then do ./myprogram.jar. My Linux distribution at least
comes with a utility that works out based on the binary file type which
program it should call to execute files.
 

BGB

If that really pisses you off to type in `java -jar', then just chmod +x
the jar file and then do ./myprogram.jar. My Linux distribution at least
comes with a utility that works out based on the binary file type which
program it should call to execute files.

fair enough...

but, the point is more that there may be cases where launching the jar
is slightly less convenient than, say, using a native binary.


but, now my recent explorations are suggesting that raw OS binaries are
not strictly necessary...

this is convenient to know, partly for my own VM projects as well, since
I more just need compiled image files and a fairly unique file
extension, rather than necessarily needing to produce native OS binaries
(such as ELF or PE/COFF images).


or such...
 

Lawrence D'Oliveiro

Nope. Batch files and shell scripts work just fine, and it's easy enough
to include one with your app. Or you can create a native executable
stub.

Sounds like you’re reinventing the work done by GNU Autoconf, only now it’s
happening on every execution, instead of once at build time.
 

Lawrence D'Oliveiro

You're right, but if I create an app for someone and give them a file
they can execute at the command line, or double-click in the GUI, who
the hell cares?

Don’t they also have to download and install Java first?
 

Lawrence D'Oliveiro

installs a JVM by default, and at least one major PC manufacturer
includes Java on their desktop computers (that'd be the 2nd largest -
Dell - I am not sure about the others).

Notable omission from that list...
 

Michael Wojcik

BGB said:
yeah, and it doesn't exactly tend to work well for non-Linux operating
systems (such as Windows...).

Used properly, autoconf works just fine on Windows - or, at any rate,
as well as it works anywhere. (Like Joshua I am not particularly
impressed with autoconf, though it's not quite as thoroughly
brain-damaged as some of its fellow GNU build tools, such as libtool.)
Wireshark uses it, for example.

Once again, the real problem is that systems like autoconf only help
with C code that is written to be portable with the help of
conditional compilation. The vast majority of C code is poorly written
(spend some time on comp.lang.c if you don't understand how or why)
and a good portion of that is unportable assumptions.

Some of the classic non-portable assumptions in C code are becoming
rarer. The prevalence of two's-complement machines over
one's-complement and sign-magnitude (the other two "pure binary
representations" allowed for C integer types) has largely eliminated
one source of bit-twiddling errors, for example; and the popularity of
I32LP64 architectures has made more programmers aware of the problems
of casting between pointer and integer types.

But we still see a lot of code with character set assumptions, or
assuming CHAR_BIT==8, or assuming huge auto-class objects are fine.
Those are safe assumptions on many platforms, but they limit
portability. So do endianness assumptions, etc.

And we still see a lot of buffer overflows, integer overflows, unsafe
or erroneous memory allocation. Failures to check for and handle error
returns from library and system calls. TOCTOU races and other forms of
unsafe file handling. Interpositioning vulnerabilities (a huge issue
with Windows right now; not specific to C, but mitigated by runtime
systems that use more-sophisticated dynamic loaders). And so on.

Autoconf does not do *a damn thing* to address any of this.
 

Michael Wojcik

Lawrence said:
Using MSVC brings its own share of problems. I remember on the Python group,
if you wanted to build a C/C++ extension for Python, you had to compile it
with the exact same version of MSVC as was used for that version of Python,
otherwise it wouldn’t work.

There's no "C/C++" language. C and C++ are very different languages.[1]

Requiring the same version of MSVC, for a binary compiled from C code,
indicates improper use of the C runtime by either Python or the
extension. Mixing C runtimes is fine as long as you follow the
guidelines Microsoft publishes. In particular, resources allocated by
one module shouldn't be freed by another; and some resources (notably
FILE* objects) shouldn't be shared between modules. None of this is
difficult to achieve.

There may be other issues with Microsoft C++, particularly if the
versions are very far apart, but on the whole, with properly-designed
APIs, there's no reason for this to be a problem.

There are certainly many infelicities with MSVC. But most of the
problems people have with it are due to sloppy design and coding, and
a failure to read and follow the documentation.



[1] Yes, I'm well aware that Stroustrup thinks otherwise. I've had
that argument (and he participated) on Usenet. I find his argument
fundamentally flawed.
 

Michael Wojcik

BGB said:
yeah... sadly, the C ABI is a little bit prone to variations

There is no "C ABI". If you believe otherwise, please cite the
appropriate language from ISO/IEC 9899. I'll accept any version.
 

Michael Wojcik

Ah, argument by repeated assertion. This is still bullshit, insofar as
it's anything at all. ("the most portable language in the world" is a
vapid claim, since "portable" is not well-defined and there's no
metric for "most".)
Well, technically, you still have to have a gcc compiler available for
your platform, but gcc *is* available on every popular platform *I* can
think of. More importantly, you have to rebuild from source each time
you deploy your app to a new platform. You don't, with Java.

Recompiling arbitrary C source on a new platform only produces a
working binary if the C source doesn't make any invalid assumptions
about the implementation. It's rare to find non-trivial C source that
doesn't make assumptions about the implementation: CHAR_BIT,
endianness, character set, etc.
 

Michael Wojcik

BGB said:
of course, C# is currently up there as well, so it is mostly a battle
between C, C++, Java, and C# for the title of "most widely used
language...".

That's a meaningless title unless you define it. Used by the most
programmers? Used by the most "applications" (however that would be
defined)? Most SLOC or function points or some other dubious code
metric? Ever? In the last year, month, week?

TIOBE's rankings are suspect, as is their methodology, but at least
they have a method - they're not just pulling a list out of their
collective ass.

FYI, the most recent short-term TIOBE rankings are Java, C, C++, C#,
PHP, Objective-C, Python, "(Visual) Basic" (a dubious entry), Perl,
and Ruby, in that order.[1] That's for May 2011. (RPG has risen to
#20, by the way, from #25 last year. Time for everyone to refresh
those RPG skills!)

Their long-term data shows Java and C securely holding the top two
spots for the past decade. C++ briefly beat C for the #2 spot a couple
of times, but it didn't last.

But as I noted, the TIOBE rankings are suspect. They're based on
things like advertised positions and classes, so they mostly measure
demand or perceived demand in various markets.

And simplistic interpretations of their data are likely to be
misleading. For example, they rank COBOL at #37, well below, say, Logo
(#24). (Time to brush up on those Logo skills!) But there are a few
billion lines of COBOL application source code still under
maintenance. They're rarely touched (indeed, businesses are
tremendously wary of touching them), because they encode business
rules. But they still exist and the programs compiled from them are
still used. Does that mean COBOL is under-ranked? Only if you
interpret the TIOBE rankings to mean something other than what they mean.

Similarly, we see TIOBE ranks Alice (a language in the ML family) at
#35, and PL/I at #42. Alice is free, and comes with a free IDE. The
major PL/I implementations - IBM's and ours - are expensive. But we
still sell a goodly number of PL/I licenses, and all evidence suggests
IBM does too. We don't see PL/I customers rushing to switch to Alice.
Or even, say, C, which is more like PL/I and is the #2 language.


[1] http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
 

Joshua Cranmer

TIOBE's rankings are suspect, as is their methodology, but at least
they have a method - they're not just pulling a list out of their
collective ass.

At the very least, TIOBE explains their method and admits that it may
not be the most useful gauge of programming language popularity. They
especially admit that rankings below around #25 or so (and probably it
ought to be higher) are pretty much complete bullshit.
FYI, the most recent short-term TIOBE rankings are Java, C, C++, C#,
PHP, Objective-C, Python, "(Visual) Basic" (a dubious entry), Perl,
and Ruby, in that order.[1] That's for May 2011. (RPG has risen to
#20, by the way, from #25 last year. Time for everyone to refresh
those RPG skills!)

That those are the top 10 languages in some order is probably
reasonable, if you include the use of Basic in Office macros and other
light programming, and you accept that what is being measured is the
interest in people with those language skills. Java, C, C++, and C# I
don't think anyone would disagree with; Objective-C is basically
programming on Mac, PHP and Ruby are the most significant
web-programming languages (you might also include ASP, but that seems to
be falling out of favor, even by Microsoft). Python and Perl are of
course the premier scripting languages, and Visual Basic is the crown of
crappy macro stuff and programming for idiots.

I had a discussion a few months ago about what a ranking of the
languages as measured by most lines of code (normalized to account for
expressiveness) in use would be. At the very least, Java, C, Fortran,
and COBOL would be near the top of the list; I don't know much more to
give a fuller list...
 

BGB

There is no "C ABI". If you believe otherwise, please cite the
appropriate language from ISO/IEC 9899. I'll accept any version.

more like "there is no universal, standardized C ABI...", and nothing is
stated in the C standards to this effect, but there ARE C ABIs...

http://en.wikipedia.org/wiki/X86_calling_conventions

note also:
http://agner.org/optimize/calling_conventions.pdf


much like there is a C++ ABI (or more correctly, many C++ ABIs...).


but, the "standardized" (in a largely de-facto sense) ABI for C on
32-bit x86 is commonly known as cdecl, though it has other names.

in its common form:
arguments are pushed right to left on the stack;
the caller removes the arguments from the stack after the call;
structs are returned via a hidden pointer argument giving the return
location on the stack;
generally, on non-ELF targets, all names have a prepended '_';
....


now, the issue is that not all compilers exactly agree on all the
details, and these subtle differences can at times break code linked
together from different compilers...

these issues can also pop up sometimes when mixing a DLL compiled in one
compiler with a main program compiled in another (such as using
MinGW-compiled DLLs with a MSVC-compiled app, ...).


granted, most of the time all this works without too much issue though,
and careful handling can largely avoid many of these issues...

example:
use caution when passing/returning structs or SIMD types (preferably,
don't directly pass SIMD types, such as "__m128", only structs
containing floats or similar);
don't use long-double in API calls;
....
 

BGB

Used properly, autoconf works just fine on Windows - or, at any rate,
as well as it works anywhere. (Like Joshua I am not particularly
impressed with autoconf, though it's not quite as thoroughly
brain-damaged as some of its fellow GNU build tools, such as libtool.)
Wireshark uses it, for example.

Once again, the real problem is that systems like autoconf only help
with C code that is written to be portable with the help of
conditional compilation. The vast majority of C code is poorly written
(spend some time on comp.lang.c if you don't understand how or why)
and a good portion of that is unportable assumptions.

Some of the classic non-portable assumptions in C code are becoming
rarer. The prevalence of two's-complement machines over
one's-complement and sign-magnitude (the other two "pure binary
representations" allowed for C integer types) has largely eliminated
one source of bit-twiddling errors, for example; and the popularity of
I32LP64 architectures has made more programmers aware of the problems
of casting between pointer and integer types.

But we still see a lot of code with character set assumptions, or
assuming CHAR_BIT==8, or assuming huge auto-class objects are fine.
Those are safe assumptions on many platforms, but they limit
portability. So do endianness assumptions, etc.

CHAR_BIT==8 is AFAIK more acceptable, since nearly all major/common
hardware at this point (and likely in the near future) has this property.

endianness matters if one thinks the code may have a chance of migrating
between different sorts of targets, such as between x86 and PPC. usually
I handle endianness in my own code though.

And we still see a lot of buffer overflows, integer overflows, unsafe
or erroneous memory allocation. Failures to check for and handle error
returns from library and system calls. TOCTOU races and other forms of
unsafe file handling. Interpositioning vulnerabilities (a huge issue
with Windows right now; not specific to C, but mitigated by runtime
systems that use more-sophisticated dynamic loaders). And so on.

Autoconf does not do *a damn thing* to address any of this.

yes, ok.


oddly, Mozilla uses Autoconf+MSVC in their Windows build setup (and
apparently also a 'special' version of Autoconf).

IMO it looks like a bit of a kludge.

more impressive though was that it actually worked, and when one follows
their directions (downloading/using MozillaBuild, ...), one can rebuild
Firefox/... from source on Windows.


I am currently using MSVC, but mostly this was because in 2009 I had
been building for Win64, and GCC's Win64 support was not impressive
(mostly in the sense that it was occasionally producing broken code,
apparently mixing together parts of the Win64 and AMD64 ABIs...).

I am left wondering if GCC's Win64 support has improved, or if it really
matters. I could probably switch back, but it would involve having to
mess around some with the makefiles, which is inconvenient.


or such...
 
