LaTeX-Like Parsing in C

Discussion in 'C Programming' started by nedelm@po-box.mcgill.ca, Jul 26, 2007.

  1. Guest

    My problem's with parsing. I have this (arbitrary, from a file) string,
    let's say:

    "Directory: /file{File:/filename(/size) }"

    I would like it to behave similarly to LaTeX. I parse it, and then I
    write it out for different variables, like:

    "Directory: File:.(0) File:..(0) File:a.out(12) File:foo(1) "

    But I keep getting into a mess of complication. I'm using C (of course).
    How do I parse it? strpbrk(,"/{}") (what then?) How can I get the string
    into a data structure that I could write out? Algorithms?

    -Neil
     
    , Jul 26, 2007
    #1

  2. Richard Heathfield Guest

    nedelm@po-box.mcgill.ca said:

    > My problem's with parsing. I have this (arbitrary, from a file) string,
    > let's say:
    >
    > "Directory: /file{File:/filename(/size) }"
    >
    > I would like it to behave similarly to LaTeX. I parse it, and then I
    > write it out for different variables, like:
    >
    > "Directory: File:.(0) File:..(0) File:a.out(12) File:foo(1) "
    >
    > But I keep getting into a mess of complication. I'm using C (of course).
    > How do I parse it? strpbrk(,"/{}") (what then?) How can I get the string
    > into a data structure that I could write out? Algorithms?


    Start with a lexing stage, where you simply break the input into lexical
    tokens, doing your best to identify them as you go but not worrying too
    much about odd cases. Store your lexical tokens in some kind of dynamic
    data structure such as a linked list. Yes, strpbrk will work for this,
    or even strtok if your input is writeable.
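
A minimal sketch of such a lexing stage, under assumptions the thread does not confirm: a variable name is a '/' followed by letters, '{' and '}' are tokens of their own, and everything else is literal text. The names tok_kind, token, lex and free_tokens are made up for illustration. It uses strcspn, the length-returning sibling of strpbrk, and copies each token so the input does not need to be writeable.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token kinds for the example format. */
enum tok_kind { TOK_TEXT, TOK_NAME, TOK_OPEN, TOK_CLOSE };

struct token {
    enum tok_kind kind;
    char *text;            /* owned copy: literal text, name without '/', or the brace */
    struct token *next;
};

void free_tokens(struct token *t)
{
    while (t != NULL) {
        struct token *next = t->next;
        free(t->text);
        free(t);
        t = next;
    }
}

/* Append a copy of s[0..len) at *tail; return the new tail, or NULL if out of memory. */
static struct token **push(struct token **tail, enum tok_kind kind,
                           const char *s, size_t len)
{
    struct token *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->text = malloc(len + 1);
    if (t->text == NULL) {
        free(t);
        return NULL;
    }
    memcpy(t->text, s, len);
    t->text[len] = '\0';
    t->kind = kind;
    t->next = NULL;
    *tail = t;
    return &t->next;
}

/* Break s into a linked list of tokens.
 * Returns NULL for empty input or on allocation failure. */
struct token *lex(const char *s)
{
    struct token *head = NULL, **tail = &head;

    assert(s != NULL);
    while (*s != '\0') {
        size_t n = 1;          /* number of input characters consumed */

        if (*s == '{' || *s == '}') {
            tail = push(tail, *s == '{' ? TOK_OPEN : TOK_CLOSE, s, 1);
        } else if (*s == '/' && isalpha((unsigned char)s[1])) {
            while (isalpha((unsigned char)s[n]))
                n++;
            tail = push(tail, TOK_NAME, s + 1, n - 1);
        } else {
            /* Literal text runs to the next special character;
             * a '/' not followed by a letter stays part of the text. */
            n += strcspn(s + 1, "/{}");
            tail = push(tail, TOK_TEXT, s, n);
        }
        if (tail == NULL) {
            free_tokens(head);
            return NULL;
        }
        s += n;
    }
    return head;
}
```

With these rules the sample string from the question breaks into nine tokens: text "Directory: ", name "file", an opening brace, text "File:", name "filename", text "(", name "size", text ") " and a closing brace. Whether '(' and ')' deserve their own token kinds depends on the real format.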

    That will massively reduce the complexity of the parsing stage, since
    you won't have to worry about tokenisation (because each token is
    simply the next node on the linked list), and so you can focus purely
    on the grammar that you are trying to implement.
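
To illustrate how small the parsing stage can become once tokens arrive as list nodes, here is a sketch of a recursive-descent parser that turns a token list into a tree. The grammar is only a guess from the single example in the question: a template is a sequence of text, variable names, and names followed by a braced sub-template (treated here as a repeated group). All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed token shape: a singly linked list ending in NULL. */
enum tok_kind { TOK_TEXT, TOK_NAME, TOK_OPEN, TOK_CLOSE };
struct token {
    enum tok_kind kind;
    const char *text;
    const struct token *next;
};

enum node_kind { NODE_TEXT, NODE_VAR, NODE_LOOP };
struct node {
    enum node_kind kind;
    const char *text;      /* borrowed from the token: literal text or name */
    struct node *body;     /* NODE_LOOP only: first node of the braced part */
    struct node *next;     /* next sibling */
};

void free_nodes(struct node *n)
{
    while (n != NULL) {
        struct node *next = n->next;
        free_nodes(n->body);
        free(n);
        n = next;
    }
}

static int at(const struct token *t, enum tok_kind kind)
{
    return t != NULL && t->kind == kind;
}

/* items := { TEXT | NAME | NAME OPEN items CLOSE }
 * Stops at the first token it cannot use and leaves *pos there.
 * Clears *ok on a missing '}' or on allocation failure. */
static struct node *parse_items(const struct token **pos, int *ok)
{
    struct node *head = NULL, **tail = &head;

    while (*ok && (at(*pos, TOK_TEXT) || at(*pos, TOK_NAME))) {
        struct node *n = malloc(sizeof *n);
        if (n == NULL) {
            *ok = 0;
            break;
        }
        n->kind = at(*pos, TOK_TEXT) ? NODE_TEXT : NODE_VAR;
        n->text = (*pos)->text;
        n->body = NULL;
        n->next = NULL;
        *tail = n;
        tail = &n->next;
        *pos = (*pos)->next;

        if (n->kind == NODE_VAR && at(*pos, TOK_OPEN)) {
            n->kind = NODE_LOOP;
            *pos = (*pos)->next;
            n->body = parse_items(pos, ok);
            if (*ok && at(*pos, TOK_CLOSE))
                *pos = (*pos)->next;
            else
                *ok = 0;
        }
    }
    return head;
}

/* Returns 1 and stores the tree in *out, or returns 0 and stores NULL. */
int parse(const struct token *toks, struct node **out)
{
    int ok = 1;
    struct node *tree;

    assert(out != NULL);
    tree = parse_items(&toks, &ok);
    if (!ok || toks != NULL) {     /* leftover token: stray '{' or '}' */
        free_nodes(tree);
        *out = NULL;
        return 0;
    }
    *out = tree;
    return 1;
}
```

Writing the output would then be a walk over this tree: print NODE_TEXT as is, look up NODE_VAR in whatever holds the current values, and repeat the body of a NODE_LOOP once per item (per file, in the question's example).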

    --
    Richard Heathfield <http://www.cpax.org.uk>
    Email: -www. +rjh@
    Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
    "Usenet is a strange place" - dmr 29 July 1999
     
    Richard Heathfield, Jul 26, 2007
    #2

  3. Chris Dollin Guest

    Richard Heathfield wrote:

    > nedelm@po-box.mcgill.ca said:
    >
    >> My problem's with parsing. I have this (arbitrary, from a file) string,
    >> let's say:
    >>
    >> "Directory: /file{File:/filename(/size) }"
    >>
    >> I would like it to behave similarly to LaTeX. I parse it, and then I
    >> write it out for different variables, like:
    >>
    >> "Directory: File:.(0) File:..(0) File:a.out(12) File:foo(1) "
    >>
    >> But I keep getting into a mess of complication. I'm using C (of course).
    >> How do I parse it? strpbrk(,"/{}") (what then?) How can I get the string
    >> into a data structure that I could write out? Algorithms?

    >
    > Start with a lexing stage, where you simply break the input into lexical
    > tokens, doing your best to identify them as you go but not worrying too
    > much about odd cases. Store your lexical tokens in some kind of dynamic
    > data structure such as a linked list. Yes, strpbrk will work for this,
    > or even strtok if your input is writeable.


    And if your tokenisation rules are sufficiently bizarre [1], you can
    resort to tools such as [f]lex, which [typically|can] generate C
    code/tables for you.

    > That will massively reduce the complexity of the parsing stage, since
    > you won't have to worry about tokenisation (because each token is
    > simply the next node on the linked list), and so you can focus purely
    > on the grammar that you are trying to implement.


    And again, if you end up with a sufficiently complex grammar [1again],
    there are tools that will help. But if you're in control of the grammar,
    such complexity may be a grammar smell ...

    (Also helpful: existing books. And writing unit tests.)

    [1] What counts as "sufficiently" is variable.

    --
    Far-Fetched Hedgehog
    "It took a very long time, much longer than the most generous estimates."
    - James White, /Sector General/
     
    Chris Dollin, Jul 27, 2007
    #3