Need an XML schema to store information in a language-neutral format

Discussion in 'XML' started by AViS, Aug 16, 2006.

  1. AViS

    AViS Guest

    Hi,
    I am building a language translator that must convert input from
    source languages to a language-neutral format in XML. This XML must be
    read by the target language translator, which produces the output in
    the target language. I am thinking of using a hash map to handle
    translations, but I am having trouble deciding on the schema in which
    the XML must be stored.


    The application must work as follows:

    {c translator}     <--->  | X M L          |  <--->  {vb translator}
    int i;                      stored in                Dim i as Integer
    printf("%d",i);             neutral format           Print i

    Proposed XML format:
    <translate>
    <action index="1">i</action>
    <action index="2">i</action>
    </translate>

    the index attribute of the XML tag action will refer to a hash table
    that will aid in translations thus
    | index | c                    | vb                     |
    |=======|======================|========================|
    | 1     | int $token           | Dim $token as Integer  |
    | 2     | printf("%d",$token)  | print $token           |


    Are the XML format and translation method I propose sufficient? Please
    assume that the conversion is 100% possible (meaning my translator
    excludes C's asm, pointers, etc.).
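For illustration, the lookup-table method proposed above can be sketched in a few lines of Python. This is only a sketch under assumptions: the `TEMPLATES` table, the `render` function, and the trailing semicolons in the C templates are hypothetical additions, and the table handles only the two statements from the example.

```python
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical lookup table mirroring the one in the post:
# index -> one template per target language.
TEMPLATES = {
    "1": {"c": Template("int $token;"),
          "vb": Template("Dim $token As Integer")},
    "2": {"c": Template('printf("%d", $token);'),
          "vb": Template("Print $token")},
}

# The proposed neutral document, with attribute values quoted so that
# it is well-formed XML.
NEUTRAL_XML = """
<translate>
  <action index="1">i</action>
  <action index="2">i</action>
</translate>
"""


def render(xml_text: str, target: str) -> list[str]:
    """Expand every <action> element into one line of the target language."""
    root = ET.fromstring(xml_text)
    lines = []
    for action in root.findall("action"):
        template = TEMPLATES[action.get("index")][target]
        lines.append(template.substitute(token=action.text))
    return lines


print("\n".join(render(NEUTRAL_XML, "c")))
print("\n".join(render(NEUTRAL_XML, "vb")))
```

A flat table like this works only while every statement is a single token substituted into a fixed pattern; nested expressions or blocks would need a tree-shaped representation instead.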
    AViS, Aug 16, 2006
    #1

  2. AViS

    Stefan Ram Guest

    "AViS" <> writes:
    >I am building a language translator, that must convert input from
    >source languages to a language neutral format in XML.


    There is no language neutral format. Or - in other words:
    A "language neutral format" is just another language.

    >I am thinking of using a hashed map to handle translations


    Mentioning a low-level implementation detail such as hashing when
    talking about a very high-level task seems inappropriate.

    >Is the XML format and translation method I propose sufficient.


    Even XML is a low-level implementation detail when in fact you
    should be talking about annotated trees or similar structures.

    It will be possible for you to translate a small restricted
    and controlled subset of both languages. More might be beyond
    the capabilities of most individuals, though very gifted
    programmers or organizations might be able to translate a
    large part of both languages: But I expect this to be a huge
    effort.
    Stefan Ram, Aug 16, 2006
    #2

  3. AViS

    AViS Guest

    > There is no language neutral format. Or - in other words:
    > A "language neutral format" is just another language.

    "Language neutral" was meant in the sense that I do not want the
    source language to be recognizable by looking at the XML. In other
    words, if 'printf' in C is translated to 'X' in the XML, then an
    equivalent 'cout' in C++ must also be translated to the same 'X'.
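A rough sketch of that requirement, assuming two deliberately tiny and hypothetical front ends (`parse_c` and `parse_cpp`) that each recognize a single statement shape and that `i` is an int in both languages:

```python
import re

# Hypothetical recognizers for one statement shape per language.
# A real front end would use a proper parser, not regular expressions.
C_PRINT = re.compile(r'printf\(\s*"%d"\s*,\s*(\w+)\s*\)\s*;')
CPP_PRINT = re.compile(r'(?:std::)?cout\s*<<\s*(\w+)\s*;')


def parse_c(line):
    """Map a C printf of one int variable to a neutral tuple, else None."""
    m = C_PRINT.fullmatch(line.strip())
    return ("print_int", m.group(1)) if m else None


def parse_cpp(line):
    """Map a C++ cout of one variable to the same neutral tuple, else None."""
    m = CPP_PRINT.fullmatch(line.strip())
    return ("print_int", m.group(1)) if m else None


# Both source forms collapse to the same neutral node.
print(parse_c('printf("%d", i);'))
print(parse_cpp('cout << i;'))
```

Once both front ends produce the same node, nothing in the neutral form reveals which language the statement came from.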

    > Mentioning a low-level implementation detail such as hashing when
    > talking about a very high-level task seems inappropriate.

    Sorry about the hash map; I just thought it would make more sense to
    explain the index attribute of the XML <action> tag along with the hash
    map.


    > Even XML is a low-level implementation detail when in fact you
    > should be talking about annotated trees or similar structures.

    Can you elaborate on "annotated trees or similar structures"? I
    am not able to find much info on the net. Even a few suggested
    sites would go a long way.

    > It will be possible for you to translate a small restricted
    > and controlled subset of both languages...

    It is enough if it translates only a subset. However, I need to find
    a way to store in the XML the functions held by a C++ class; when
    translating from XML to C, the functions will be removed and the
    class converted to a typedef struct.
    AViS, Aug 16, 2006
    #3
  4. AViS

    Stefan Ram Guest

    "AViS" <> writes:
    >"Language Neutral" was meant to be in the sense that I did not
    >want the source language to be recognized by looking at the XML,


    This is easy: If any code, such as

    printf( "%d", i );

    is given, and I tell you that it was translated from a language
    X, there is no way for you to find out what X is. So /every/
    representation will fulfil this requirement.

    >in other words if 'printf' in c is translated to 'X' in the
    >XML then an equivalent 'cout' in c++ must also be translated
    >to the same 'X'


    In general, equivalence between two programs is undecidable.

    See »Equivalence Problem« in

    http://www.cs.rochester.edu/u/nelson/courses/csc_173/computability/undecidable.html

    However, for a restricted domain you might indeed succeed
    in finding such a representation. One possibility would be
    to translate the C++ into C as early C++ compilers did.

    >>Even XML is a low-level implementation detail when in fact you
    >>should be talking about annotated trees or similar structures.

    >Can you explicate more about "annotated trees or similar structures". I
    >am not able to find much info on the net. Even if you can suggest some
    >sites, that'll go a long way


    Starting points might be

    http://en.wikipedia.org/wiki/Abstract_syntax_tree
    http://www.cse.iitk.ac.in/users/raj/cs335/WebNotes/lec17.html

    An annotated tree is a tree with annotations, which might be
    represented as attributes in XML. While "annotated tree" means
    the information structure itself, an XML document is one way
    to represent such an information structure using a text
    document.
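As a minimal sketch of that distinction, the tree below is built in memory first and only then serialized to XML. The element and attribute names (`program`, `declare`, `print`, `variable`, `name`, `type`) are hypothetical and do not come from any standard vocabulary.

```python
import xml.etree.ElementTree as ET

# The tree is the information structure; attributes carry the annotations.
program = ET.Element("program")
ET.SubElement(program, "declare", {"name": "i", "type": "int"})
call = ET.SubElement(program, "print")
ET.SubElement(call, "variable", {"name": "i", "type": "int"})

# Serializing to XML is just one textual representation of that tree.
xml_text = ET.tostring(program, encoding="unicode")
print(xml_text)
```

The same tree could equally be written out as JSON or S-expressions, which is why the choice of XML says little about the design of the tree itself.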

    Maybe, to ask in this XML newsgroup, you should try to isolate
    that part of your question that is directly related to the
    language XML from the rest that deals with your algorithm, but
    has nothing to do with XML.
    Stefan Ram, Aug 16, 2006
    #4
  5. It sounds like you're talking about an XML representation of an
    Intermediate Language general enough to cover multiple source languages.
    Your first step, therefore, is to find or design that IL; from there,
    writing an XML rendering of it is straightforward.

    I'd recommend reading any of the standard reference works on compiler
    design as a starting point for picking your IL. Note that its required
    characteristics are going to depend heavily on exactly what operations
    you're going to want to perform against that representation.
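To make the idea concrete, here is a very small sketch of an intermediate language with two back ends. Everything in it is an assumption for illustration: the node classes `Declare` and `PrintInt`, the type tables, and the `emit_c` and `emit_vb` functions cover only the two statements discussed in this thread.

```python
from dataclasses import dataclass


@dataclass
class Declare:
    """IL node: declare a variable of a neutral type."""
    name: str
    type: str


@dataclass
class PrintInt:
    """IL node: print the value of an integer variable."""
    name: str


# Hypothetical mappings from neutral type names to each target language.
C_TYPES = {"int": "int"}
VB_TYPES = {"int": "Integer"}


def emit_c(node):
    """Render one IL node as a C statement."""
    if isinstance(node, Declare):
        return f"{C_TYPES[node.type]} {node.name};"
    if isinstance(node, PrintInt):
        return f'printf("%d", {node.name});'
    raise TypeError(f"unsupported node: {node!r}")


def emit_vb(node):
    """Render one IL node as a Visual Basic statement."""
    if isinstance(node, Declare):
        return f"Dim {node.name} As {VB_TYPES[node.type]}"
    if isinstance(node, PrintInt):
        return f"Print {node.name}"
    raise TypeError(f"unsupported node: {node!r}")


program = [Declare("i", "int"), PrintInt("i")]
print("\n".join(emit_c(n) for n in program))
print("\n".join(emit_vb(n) for n in program))
```

With the node types fixed, an XML rendering is mostly a matter of mapping each class to an element and each field to an attribute.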


    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
    Joseph Kesselman, Aug 16, 2006
    #5
  6. AViS

    AViS Guest

    Thanks Stefan and Joseph.
    The IL in XML was intended to be my proof of concept for a much bigger
    initiative.
    I shall try to keep this thread updated with the latest.


    Thanks again.
    AViS, Aug 17, 2006
    #6
  7. Stefan Ram wrote:
    > "AViS" <> writes:
    > >"Language Neutral" was meant to be in the sense that I did not
    > >want the source language to be recognized by looking at the XML,

    >
    > This is easy: If any code, such as
    >
    > printf( "%d", i );
    >
    > is given, and I tell you that it was translated from a language
    > X, there is no way for you, to find out what X is. So /every/
    > representation will fulfil this requirement.
    >
    > >in other words if 'printf' in c is translated to 'X' in the
    > >XML then an equivalent 'cout' in c++ must also be translated
    > >to the same 'X'


    The idea of using an intermediate language might not be the best way to
    go about it, but so what? Let the guy explore; he might find
    interesting things and he surely will learn a lot.

    > In general, equivalence between two programs is undecidable.


    This whole thing seems fishy to me. If 2 languages are Turing complete,
    then they can both represent everything that is representable by a
    Turing machine which is everything that is computable. This means that
    any program representation in the first language DOES have an
    equivalent representation in the second language.

    Knowing whether 2 given programs written in 2 different languages are
    indeed functionally equivalent, if both languages are Turing complete, is
    far from being a trivial problem, but it is possible.

    >
    > See »Equivalence Problem« in
    >
    > http://www.cs.rochester.edu/u/nelson/courses/csc_173/computability/undecidable.html


    Baloney. If the input and output subsets of each program are known for
    both programs, then they can be compared to evaluate whether they are
    functionally equivalent. The Equivalence Problem speaks of equivalence
    in general terms (whatever that means (nothing in context if you ask
    me)). The difficulty resides in our inability to track very complex
    problems. They are not impossible to solve; they are simply too complex
    to apprehend when taken as a whole upfront.
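A minimal sketch of the kind of comparison described here, assuming the set of inputs is finite and known in advance. The helper `first_difference` and the three sample functions are hypothetical; agreement on the listed inputs says nothing about inputs outside that list.

```python
def first_difference(f, g, inputs):
    """Return the first input where f and g disagree, or None if they agree on all."""
    for x in inputs:
        if f(x) != g(x):
            return x
    return None


def double_by_add(n):
    return n + n


def double_by_shift(n):
    return n << 1


def square(n):
    return n * n


# Same outputs on every input in the chosen finite domain.
print(first_difference(double_by_add, double_by_shift, range(-100, 101)))

# Different outputs: the check reports a counterexample.
print(first_difference(double_by_add, square, range(0, 5)))
```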

    Of course the proof makes sure not to mention any specific languages.
    The proof applies to a program that would compute equivalence for ANY 2
    programs. No such program can exist in the first place. The guy isn't
    trying to translate anything to everything else; he's writing a
    translator that goes from one language to another. Quite challenging,
    but not impossible.

    [snip]

    Regards
    Jean-Francois Michaud
    Jean-François Michaud, Aug 17, 2006
    #7
