Discussion in 'C Programming' started by Rick C. Hodgin, Jan 29, 2014.

  1. (snip)
    Yes. Well, some might have a different definition, but there are many
    languages that can only be interpreted. TeX, for example, lets you
    change the category codes of characters, which changes, among other
    things, which characters count as letters and so are allowed in
    multi-character names.

    Languages that allow dynamic typing of variables, such as MATLAB and
    Octave, are mostly meant to be interpreted. (Though many now use
    just-in-time compilation to speed things up.)

    It is pretty much always possible to interpret a compiled language,
    but not always the other way around.

    -- glen
    glen herrmannsfeldt, Feb 9, 2014
  2. Or you *might* want a type for which the compiler emulates
    2's-complement in software -- but you're not going to get that unless
    you pay for it somehow. C doesn't require compiler implementers
    to do that work.

    It might be useful to have a standard typedef that's guaranteed to refer
    to a 32-bit signed integer type without specifying which of the three
    permitted representations it uses, but I think non-two's-complement
    systems are rare enough that it wasn't thought to be worth adding it to
    the standard.
    Keith Thompson, Feb 9, 2014
  3. The comment reads:

    "Note: For testing this in Visual Studio, see the #define _TEST_ME
    line. It is not normally un-commented so the sha1.cpp can be
    directly included in other projects.
    However, for testing it must be uncommented to expose the main()
    function and related at the end of this source code file."


    "[_TEST_ME] should only be defined when testing from within the
    sha1.sln project."

    It indicates three things:

    (1) That for stand-alone testing, _TEST_ME needs to be defined.
    (2) That it is included in other libsf projects as an include file.
    (3) That it is designed for Visual Studio (sha1.sln).
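
    For anyone unfamiliar with the pattern the comment describes, here is
    a minimal sketch of it, not the actual sha1.cpp. The names TEST_ME and
    checksum8 are hypothetical stand-ins: the real file uses _TEST_ME (a
    spelling that C reserves for the implementation, since it starts with
    an underscore and an uppercase letter) and guards a SHA-1 self-test
    rather than this toy checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the library code: sum of bytes modulo 256. */
unsigned checksum8(const unsigned char *data, size_t len)
{
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (sum + data[i]) & 0xFFu;
    return sum;
}

/* Leave TEST_ME undefined when this file is pulled into another project;
   define it (for example with -DTEST_ME) to build the stand-alone test. */
#ifdef TEST_ME
int main(void)
{
    const unsigned char msg[] = { 97, 98, 99 };
    assert(checksum8(msg, sizeof msg) == 38u);  /* 294 mod 256 */
    puts("self-test passed");
    return 0;
}
#endif
```

    With the macro undefined the file contributes only checksum8(), so a
    project that includes or links it can supply its own main().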

    Since it uses non-standard naming for the fixed-size types, porting may
    be an issue.
    I don't know. It's not a great concern for me to know. There are
    undoubtedly several. Choose several of those and there's your list. :)

    Best regards,
    Rick C. Hodgin
    Rick C. Hodgin, Feb 9, 2014
  4. David, you don't need to quote the entire parent article. Just delete
    any quoted text that you're not replying to. You can mark deletions
    with something like "[...]" or "[snip]" if you like. Apart from saving
    bandwidth, snipping avoids making readers scroll down to the bottom of
    your article to see if you've added anything else.
    Keith Thompson, Feb 9, 2014
  5. Fair enough -- except of course that adding new keywords like u8 et al
    would break existing code.
    Just use a typedef; that's what it's for. (Windows does have a
    regrettable tendency to pollute the user's namespace with its own type
    C as it's currently defined carefully allows for conforming
    implementations on systems that don't have the most common integer sizes
    (8, 16, 32, 64). You can either continue that (which means that u8 et
    al. have to be optional), or you can change it (which means that no
    future system that doesn't conform to your new requirements can have a
    conforming C implementation).
    Changing source code to conform to a new standard shouldn't be terribly
    difficult -- but it only solves a fairly small part of the problem. How
    sure can you be that that conversion tool is 100% reliable?

    If you force a change to millions of lines of source code, it all has to
    be retested, rebuilt, and rereleased, and that process will not go as
    smoothly as you expect it to. If a new C standard broke existing code
    like that, most large projects would probably just continue to use
    compilers that only support the older standards. It would probably
    permanently fragment the language.
    char, signed char, and unsigned char are all distinct types. Keeping
    that rule while making u8 and c8 aliases for unsigned char (or for char)
    and for signed char (or for char) would IMHO be too confusing. And
    there's already a mechanism for creating new type names for existing
    Keith Thompson, Feb 9, 2014
  6. You mean Oak?
    Well, interpreters have often used various tricks to speed things up.

    Languages meant to be user commands, such as Unix shells and
    Windows CMD, are likely 100% interpreted.

    I remember the HP TSB2000 BASIC systems would convert numeric constants
    to internal form, such that when you list the program it comes out
    different than you typed it in.

    But you should always be able to interpret a compiled language,
    though that might mean an initial pass to find variable names and
    statement labels.

    -- glen
    glen herrmannsfeldt, Feb 10, 2014
  7. (snip, someone wrote)
    And soon you get to the question about microprogrammed processors
    "interpreting" the machine code. In most cases, though, the
    underlying hardware is specifically designed to make the instruction
    set easier to implement in microcode. And the analogy definitely
    doesn't hold for horizontal microcode, which might execute only one
    microinstruction per host instruction.
    No idea about Oak. It was originally designed for set-top boxes,
    and may have been somewhat different than now.
    -- glen
    glen herrmannsfeldt, Feb 10, 2014
  8. Eh? It's not commented at all, it's "commented" by #ifdef #endif.
    Yes, I do know this stuff and I can read. I am letting you know that it
    does not compile as a stand-alone program due to errors unrelated to the
    You took a perfectly normal translation unit and turned it into a file
    whose purpose is to be included with #include? That seems... odd.

    Note that no one would normally deduce "as an include file" from "can be
    directly included in other projects".
    Does that make it not C anymore? I.e. is there some stuff related to VS
    that makes it compile there when it won't just by giving the file to a

    | I've never had another language where fundamental data types are of
    | variable size. From assembly through Java, they are a known size.
    | Only in the land of C, the home of the faster integer for those
    | crucial "for (i=0; i<10; i++)" loops, do we find them varying in
    | size.

    Nothing I can do can make "only in the land of C" be correct, nor make
    "from assembly through Java, they are a known size" look any less
    Ben Bacarisse, Feb 10, 2014
  9. Out of curiosity, what is your day job? We know you are not qualified
    to program in C, and you don't know or use C++, so what do you do with
    VS2008 at work? I gather VS supports other languages (like C#, F#,
    etc.), so maybe you work with one of them? In which case, why don't you
    use that for RDC?
    David Brown, Feb 10, 2014
  10. Just to be clear on a point here (especially since most of my posts have
    been somewhat negative), I think it is a very good thing for developers
    to have assembly language experience - precisely because it helps them
    understand what is going on "under the hood". This is particularly true
    for lower level languages such as C. I don't mean that it is a good
    idea to do /much/ programming in assembly, especially with overly
    complex architectures like x86, but understanding how the generated code
    works lets you get a better feel for some types of C programming. For
    embedded systems with small cpus, it is particularly important, so that
    you have a reasonable idea of the size and speed you can expect from a
    given section of C code.

    However, don't get carried away - there are lots of things that are done
    in assembly programming that don't match well in higher level code (less
    so if your assembly programming is structured, modular, and
    maintainable). And there are some things that can be done in assembly
    that are almost impossible in most high level languages (such as
    co-routines, multiple entry points, etc.). And of course, on a modern
    cpu, well-written C compiled with a good compiler will outclass
    hand-written assembly on most tasks - so there is seldom reason to write
    assembly for real code.

    And one should certainly /never/ try to learn assembly for the Itanium,
    unless someone is paying you lots of money to write a C compiler for it.
    It's a dead-end architecture, so all the effort will be wasted, and
    you'll quickly drive yourself insane as you try to track all these
    registers and manual instruction scheduling while being unable to resist
    the temptation to squeeze another instruction into the same cycle count.
    David Brown, Feb 10, 2014
  11. "Interpreted" means reading the source code and handling it directly.
    "Compiled" means a tool reads the source code and generates an
    executable that is later run.

    I don't think the meaning of "interpreted" has changed - it is simply
    that languages are no longer easily divided into "interpreted" and
    "compiled". In particular, many languages are "bytecode compiled" and
    run on virtual machines, and sometimes these are "Just In Time" compiled
    to machine code. And the boundaries between "compiled", "bytecoded" and
    "interpreted" have become more blurred.

    This is not actually a new thing - it is just that bytecoding has become
    a lot more relevant for modern languages. There have been bytecoded
    languages for decades (such as the "P-Code" Pascal system I used briefly
    some thirty years ago).
    David Brown, Feb 10, 2014
  12. On 10/02/14 00:48, Keith Thompson wrote:
    Yes, I know - sorry. My excuse is the post was late at night and I was
    David Brown, Feb 10, 2014
  13. You use "int32_t" when you want /exactly/ 32 bits - not "at least
    32 bits". For that, you can use "int_least32_t". When you are
    communicating outside the current program (files on a disk, network
    packets, access to hardware, etc.), then exact sizes are often important.

    I agree that such systems are too rare to make it worth having to
    specify integer representations in standard typedefs (and who wants to
    write "int_twoscomp32_t" ? Certainly not those that already complain
    about "int32_t" !). It might be nice to have some standardised predefined
    macros, however, so that one could write:

    #ifndef __TWOS_COMPLEMENT
    #error This code assumes two's complement signed integers
    #endif
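
    In the absence of such a macro, one workaround is to test the
    representation directly. The sketch below relies on -1 & 3 being 3
    under two's complement, 2 under ones' complement, and 1 under
    sign-and-magnitude. is_twos_complement is a hypothetical helper name,
    and the #if form assumes the preprocessor evaluates the expression
    with the target's integer representation.

```c
#include <assert.h>

/* -1 & 3 is 3 on two's complement, 2 on ones' complement,
   and 1 on sign-and-magnitude machines. */
#if (-1 & 3) != 3
#error "This code assumes two's complement signed integers"
#endif

/* Run-time form of the same test (hypothetical helper name). */
int is_twos_complement(void)
{
    return (-1 & 3) == 3;
}
```

    A program that prefers a run-time check could call
    assert(is_twos_complement()) once at start-up instead.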

    One feature I would love to see in the standards - which would require
    more work from the compiler and not just a typedef - is to have defined
    integer types that specify explicitly big-endian or little-endian
    layout. Non-two's-complement systems are rare enough to be relegated to
    history, but there are lots of big-endian and little-endian systems, and
    lots of data formats with each type of layout. I have used a compiler
    with this as an extension feature, and it was very useful.
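
    Without such an extension, the usual portable approach is to build
    the value byte by byte with shifts, which gives the same result on
    any host regardless of its native byte order. A minimal sketch; the
    function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Store v into buf[0..3], most significant byte first. */
void put_be32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFFu);
    buf[1] = (unsigned char)((v >> 16) & 0xFFu);
    buf[2] = (unsigned char)((v >> 8) & 0xFFu);
    buf[3] = (unsigned char)(v & 0xFFu);
}

/* Read a big-endian 32-bit value from buf[0..3]. */
uint32_t get_be32(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16)
         | ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Store v into buf[0..3], least significant byte first. */
void put_le32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v & 0xFFu);
    buf[1] = (unsigned char)((v >> 8) & 0xFFu);
    buf[2] = (unsigned char)((v >> 16) & 0xFFu);
    buf[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Read a little-endian 32-bit value from buf[0..3]. */
uint32_t get_le32(const unsigned char *buf)
{
    return ((uint32_t)buf[3] << 24) | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[1] << 8)  |  (uint32_t)buf[0];
}
```

    A built-in endian-qualified type would let the compiler generate
    these conversions on every access instead of requiring explicit calls.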
    Ranged integer types would be nice - some other languages (like Pascal
    and Ada) have them. They let you be explicit about the ranges in your
    code, and give the compiler improved opportunity for compile-time checking.

    Of course, it gets complicated trying to specify behaviour of how these
    types should interact with other integral types.
    Yes. There is probably a C++ template library that covers all this :)
    David Brown, Feb 10, 2014
  14. Except that it often doesn't mean exactly that.

    It probably does in the case of command languages, as often the
    program execution logic is just spliced into the command processor,
    and most often speed isn't all that important.

    But note that the original statement was "interpreted language"
    not just "interpreter".

    For one, it is often desirable that the interpreter do a syntax
    check first, as it is surprising to users to have a syntax error
    detected much later. If the language allows for GOTO and labels,
    the first pass may also recognize the position of labels for faster
    reference. Maybe also put the symbols into a symbol table, and
    allocate space for variables. All features that an "interpreter"
    shouldn't have, but aren't much extra work.

    As I mentioned previously, some BASIC interpreters convert constants
    to internal form on input, and convert back when generating a listing.
    (Funny, because a different value might come back.) It is also
    simple to convert keywords to a single character (byte) token,
    especially if keywords are reserved. The result is sometimes
    called an incremental compiler, but the result isn't so different
    from the usual interpreter.
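
    The keyword-to-token step can be sketched roughly as follows. The
    keyword list, the 0x80 token base, and the function names are all
    hypothetical and not taken from any particular BASIC:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword table; a keyword's token is TOKEN_BASE + index. */
static const char *const keywords[] = { "PRINT", "GOTO", "LET", "IF", "THEN" };
enum { NKEYWORDS = sizeof keywords / sizeof keywords[0], TOKEN_BASE = 0x80 };

/* Return the one-byte token for word, or 0 if it is not a keyword. */
unsigned char keyword_to_token(const char *word)
{
    for (size_t i = 0; i < (size_t)NKEYWORDS; i++)
        if (strcmp(word, keywords[i]) == 0)
            return (unsigned char)(TOKEN_BASE + i);
    return 0;
}

/* Inverse mapping, used when listing the program; NULL if not a token. */
const char *token_to_keyword(unsigned char token)
{
    if (token < TOKEN_BASE || token >= TOKEN_BASE + NKEYWORDS)
        return NULL;
    return keywords[token - TOKEN_BASE];
}
```

    Storing the token instead of the text means the run-time loop can
    branch on one byte rather than comparing strings on every statement.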

    Just to add more confusion, consider the in-core compiler. Just
    like usual compilers, it generates actual machine instructions,
    but doesn't write them to a file. That avoids much overhead of I/O,
    and in addition simplifies the fixup problem on forward branches.
    You have to have enough memory for both the program and compiler
    at the same time, but that is often the case for smaller programs.
    The OS/360 WATFOR and WATFIV are favorite examples for this case.

    For another case, consider exactly the same code as might be a
    bytecode (except that the size might be other than bytes) and
    instead for each one place a subroutine call instruction to the
    routine that processes that code. (Or maybe indirectly to that
    processing routine.) On many machines, the result looks exactly
    the same, except for a machine-specific operation code instead of
    zeros before each operation. Yet now the result is executable code,
    as usually generated by compilers.
    I am sure it goes back much farther than that.

    -- glen
    glen herrmannsfeldt, Feb 10, 2014
  15. If you run more than one compiler on the same machine, there is
    always the possibility of overlap in environment variable usage.

    It is way too common to use LIB and INCLUDE as environment
    variables for the corresponding directories. Some compilers
    have other variables that they will check for ahead of those,
    to allow them to coexist.

    -- glen
    glen herrmannsfeldt, Feb 10, 2014
  16. That's what I keep hearing, however...

    The following are timings, in seconds, for an interpreter written in C,
    running a set of simple benchmarks.

    GCC      A     B     C     D

    -O3     79   130   152   176
    -O0     87   284   304   297

    A, B, C, D represent different bytecode dispatch methods; 'C' and 'D' use
    standard C, while 'B' uses a GCC extension.
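
    For readers who have not seen bytecode dispatch written out, here is
    a minimal sketch of two techniques that need only standard C: a
    switch inside the loop, and a table of handler functions. Whether
    these correspond to methods 'C' and 'D' above is an assumption; the
    opcodes, the VM struct, and the function names are hypothetical, and
    there is no stack bounds checking.

```c
#include <assert.h>

enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };  /* hypothetical opcodes */

/* Technique 1: one switch inside the dispatch loop. */
int run_switch(const int *code)
{
    int stack[16], sp = 0;
    for (const int *pc = code;;) {
        switch (*pc++) {
        case OP_PUSH: stack[sp++] = *pc++; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        default:      return -1;  /* unknown opcode */
        }
    }
}

/* Technique 2: a table of handler functions indexed by opcode. */
typedef struct {
    const int *pc;   /* next bytecode to execute */
    int stack[16];
    int sp;          /* number of values on the stack */
    int running;
} VM;

static void op_push(VM *vm) { vm->stack[vm->sp++] = *vm->pc++; }
static void op_add(VM *vm)  { vm->sp--; vm->stack[vm->sp - 1] += vm->stack[vm->sp]; }
static void op_mul(VM *vm)  { vm->sp--; vm->stack[vm->sp - 1] *= vm->stack[vm->sp]; }
static void op_halt(VM *vm) { vm->running = 0; }

static void (*const handlers[])(VM *) = { op_push, op_add, op_mul, op_halt };

int run_table(const int *code)
{
    VM vm = { .pc = code, .sp = 0, .running = 1 };
    while (vm.running)
        handlers[*vm.pc++](&vm);  /* fetch opcode, call its handler */
    return vm.stack[vm.sp - 1];
}
```

    Both functions run the same program format, so the dispatch cost can
    be compared in isolation by timing them on identical bytecode.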

    'A' however uses a dispatch loop in x86 assembler (handling the simpler

    The difference is not huge: barely twice as fast as the 'C' method, and when
    executing real programs the difference narrows considerably. But it still
    seems worth having as an option. (And with other compilers which are not as
    aggressive at optimising as GCC, it might be more worthwhile.)

    In general however you are probably right; interpreters are a specialised
    application, as the assembler code is only written once, not for each
    program it will run, and there are issues with maintenance, portability and
    reliability. (And even with interpreters, there are cleverer ways of getting
    them up to speed than this brute-force method.)
    BartC, Feb 10, 2014
  17. (snip)
    I agree, but ...
    In addition, it sometimes results in a tendency to write for specific
    machine code when there is no need to do that. That is, the old saying
    "Premature optimization is the root of all evil".
    Yes. And also not to try to write C code to "help" the compiler
    along when it isn't needed. (But don't forget how to when it is.)
    -- glen
    glen herrmannsfeldt, Feb 10, 2014
  18. (snip)
    Some languages allow you to specify approximately the needed range.
    PL/I allows one to specify the number of decimal digits or binary
    bits needed, such that the compiler can supply at least that many.

    Fortran now has SELECTED_INT_KIND() and SELECTED_REAL_KIND(), which
    allow one to specify the needed number of decimal digits.

    Still, I remember compiling Metafont on a Pascal compiler that didn't
    use a single byte for a 0..255 integer, so the output files had a
    null byte before each actual byte. To fix that, I wrote one of the
    simplest C programs doing both input and output:

    while(getchar() != EOF) putchar(getchar());
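
    That one-liner writes a stray byte if the input has an odd length,
    because the result of the second getchar() is never checked against
    EOF. A slightly more careful sketch of the same idea follows; the
    function name is hypothetical, and it takes explicit streams only so
    that it can be tested:

```c
#include <assert.h>
#include <stdio.h>

/* Copy in to out, dropping the first byte of every pair.
   Returns the number of bytes written. */
long strip_pad_bytes(FILE *in, FILE *out)
{
    long written = 0;
    int pad, c;
    while ((pad = getc(in)) != EOF) {  /* padding byte, discarded */
        if ((c = getc(in)) == EOF)     /* odd-length input: stop cleanly */
            break;
        putc(c, out);
        written++;
    }
    return written;
}
```

    Calling strip_pad_bytes(stdin, stdout) gives the same filter as the
    original for well-formed input.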
    Not to mention floating point types.

    -- glen
    glen herrmannsfeldt, Feb 10, 2014
  19. By that definition then Python is not interpreted, since it is translated
    into bytecode. (Usually on-the-fly, but it can also be pre-compiled.)
    There seem to be dozens of ways of executing Java bytecode. But if that
    involves repeatedly re-interpreting the same codes, then most people would
    say that it is being interpreted (and you will notice the difference because
    it might be an order of magnitude slower).

    But this makes Java 'fuzzier', unless you pin down exactly what happens
    to the bytecode.
    So? In theory you can create a machine to run intermediate code. And Java
    bytecode seems particularly simple, since it is statically typed (I haven't
    looked into it in detail, but I don't understand why a load-time conversion
    to native code isn't just done anyway, the result of which can be cached.)
    BartC, Feb 10, 2014
  20. Remember the complaints he made about certain aspects of C that turned
    out to be due to his using a C++ compiler to compile programs written in
    files with names ending in *.cpp? His understanding of the difference
    between C and C++ is even poorer than his understanding of C - but he
    does use C++, if only by accident.
    James Kuyper, Feb 10, 2014
