parsing text

Discussion in 'C Programming' started by yang__lee@ausi.com, Apr 11, 2007.

  1. Guest

    Hi,

    I hope you may help me.

    Please check the attached text file.
    Actually it's a report file with some header information,
    and then the report is in tabular format. I want to parse each row
    and get the values.

    I think strtok won't work here.

    There are no tabs between the column values; they are separated by spaces.
    The column values themselves contain spaces, but only single spaces.
    There are at least two spaces between column values.

    Can you suggest an algorithm in C, or any other method, to get each
    column value?

    In Excel I tried to create a delimited file, but it was unsuccessful.

    Thanks,

    lee


    The following content is in a text file. Copy it into a text file
    and keep word wrap off.
    -------------------------------------------------------------------------------




    DATE: 04/07/2007

    TMK

    MTC_PROC REPORT

    Area: KTO
    WRKPCK: KTO


    PROCESS  PNAME        FEANAME        CEB  MCO  DPNUM   NM_ITM      ERROR                                             TXT_GUI                                                       ERROID
    _______  ___________  _____________  ___  ___  ______  __________  ________________________________________________  ____________________________________________________________  ______

    RECKKON  206          DEVICE         007  997  532533  532533      Invalid source. BREAK needed.                     Count: 1501-1505 Source: -98                                  0201

    RECKKON  U206         DEVICE         007  997  532533  532533      Invalid source. BREAK needed.                     Count: 1421,1726-1730 Source: -98                             0201

    RECKKON  F77          CROSS CNNCT    009  997  520624  520624      Feeder feature.                                   Count: AM992,651-900 Source: 520619                           7310

    RECKKON  F1727        X-MIC          009  997  521184  521184      Sourced Out.                                      Count: SA5211,1-1800 Source: 0                                0206

    RECKKON  F115         CROSS CNNCT    009  997  522306  522306      provide feed.                                     Count: 1400,1001-1100 Source: 522333                          7310
    , Apr 11, 2007
    #1

  2. Guest

    On 11 Apr, 11:44, wrote:
    > Hi,
    >
    > I hope you may help me.
    >
    > Please check the attached text file.
    > Actually its a report file with some headers information
    > and them report is in tabular format. I want to parse each row
    > and get the values.
    >
    > I think strtok won't work here.
    >
    > There are no tabs in between the column values they are spaces.
    > Column values them selves contain spaces but they are single spaces.
    > Minimum two spaces are there between column values.
    >
    > can you suggest some algorithm in C or any other method to get each
    > column value.


    You know what data is in what character positions, so what's wrong
    with extracting the data on that basis?

    strncpy() would do for starters. More complex code could wrap this to
    strip trailing spaces, convert to numeric format, etc...
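
    As a rough sketch of that idea, the helper below copies one
    fixed-position field and trims the padding. The names extract_field
    and field_to_long, and any offsets used to call them, are assumptions
    for illustration only; the real column positions have to be read off
    the report. It uses memcpy with an explicit terminator rather than
    strncpy(), because strncpy() does not null-terminate when the source
    fills the whole width.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copy the field occupying columns [start, start + width) of `line` into
 * `out`, dropping surrounding spaces and any line ending. `out` must have
 * room for at least width + 1 bytes. Returns the trimmed length; a line
 * shorter than `start` yields an empty field. */
size_t extract_field(const char *line, size_t start, size_t width, char *out)
{
    size_t len = strlen(line);
    size_t n = 0;
    size_t lead = 0;

    if (start < len) {
        n = len - start;
        if (n > width)
            n = width;
        memcpy(out, line + start, n);
    }

    /* strip trailing padding and line-ending characters */
    while (n > 0 && (out[n - 1] == ' ' || out[n - 1] == '\n' ||
                     out[n - 1] == '\r'))
        n--;
    out[n] = '\0';

    /* strip leading padding */
    while (out[lead] == ' ')
        lead++;
    if (lead > 0) {
        memmove(out, out + lead, n - lead + 1);  /* +1 moves the '\0' too */
        n -= lead;
    }
    return n;
}

/* Convert a trimmed field to a long. Returns 1 on success, 0 if the field
 * is empty or contains anything other than a decimal number.
 * (Range checking via errno is left out of this sketch.) */
int field_to_long(const char *field, long *value)
{
    char *end;

    if (*field == '\0')
        return 0;
    *value = strtol(field, &end, 10);
    return *end == '\0';
}
```

    If a column really did start at offset 9 and span 11 characters, then
    extract_field(line, 9, 11, buf) would pull it out. Note that
    field_to_long turns "007" into 7, so codes that need their leading
    zeros are better kept as strings.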
    , Apr 11, 2007
    #2

  3. wrote:
    > Hi,
    >
    > I hope you may help me.
    >
    > Please check the attached text file.


    No, thank you. Do not post attachments to text newsgroups.
    Martin Ambuhl, Apr 11, 2007
    #3
  4. In article <>,
    Martin Ambuhl <> wrote:

    >> Please check the attached text file.


    >No, thank you. Do not post attachments to text newsgroups.


    In fact he didn't post an attachment. He just appended the text to
    his message.

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
    Richard Tobin, Apr 11, 2007
    #4
  5. Richard Tobin wrote:
    > In article <>,
    > Martin Ambuhl <> wrote:
    >
    >>> Please check the attached text file.

    >
    >> No, thank you. Do not post attachments to text newsgroups.

    >
    > In fact he didn't post an attachment. He just appended the text to
    > his message.


    As silly as it may be, I tend to believe posters unless this has already
    been shown to be an error. At the point that (an
    attribution you unaccountably removed) wrote that there was an attached
    file, there was no reason to doubt his word. Silly me.

    In the same way, if I encountered your post without known earlier
    context, I might think that I had written
    >>> Please check the attached text file.

    I would wonder when I did it, and why there were extra '>'s, but you
    claimed that I wrote, so I must have.
    Martin Ambuhl, Apr 11, 2007
    #5
  6. user923005 Guest

    You have two formats, one for the header and one for the body.

    Write a function to parse the header and another function to parse the
    body.

    Do not write one function to do both jobs. It is also easy to
    recognize the transition between header and body so write a
    controlling function that sees the header and calls the header parser
    and then sees the body and calls the body parser.
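
    One way this split might look in C, as a sketch only. It assumes the
    header ends at the first line made up solely of underscores, that
    header values appear as "NAME: value", and that body columns are
    separated by two or more spaces, as the original poster describes.
    All function names (split_on_gaps, parse_header_line, parse_report)
    and the buffer sizes are made up for the example.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FIELDS 16    /* assumed upper bound on columns per row */
#define MAX_LINE   512   /* assumed upper bound on line length */

/* Body parser: split `line` in place on runs of two or more spaces.
 * Single spaces stay inside a field. Stores pointers to at most `max`
 * fields in `fields` and returns how many were stored. */
size_t split_on_gaps(char *line, char *fields[], size_t max)
{
    size_t count = 0;
    char *p = line;

    line[strcspn(line, "\r\n")] = '\0';      /* drop the line ending */

    while (count < max) {
        char *gap;

        while (*p == ' ')                    /* skip the gap before a field */
            p++;
        if (*p == '\0')
            break;
        fields[count++] = p;

        gap = strstr(p, "  ");               /* next run of 2+ spaces */
        if (gap == NULL) {
            size_t n = strlen(p);            /* last field: trim one stray */
            while (n > 0 && p[n - 1] == ' ') /* trailing space, if any */
                p[--n] = '\0';
            break;
        }
        *gap = '\0';                         /* terminate this field */
        p = gap + 1;
    }
    return count;
}

/* Header parser: split a "NAME: value" line in place. Returns 1 and sets
 * *key and *value if the line has that shape, 0 otherwise (for example a
 * title line or the row of column names). */
int parse_header_line(char *line, char **key, char **value)
{
    char *colon = strchr(line, ':');

    if (colon == NULL)
        return 0;
    *colon = '\0';
    *key = line + strspn(line, " \t");
    *value = colon + 1 + strspn(colon + 1, " \t");
    (*value)[strcspn(*value, "\r\n")] = '\0';
    return 1;
}

/* Controlling function: lines go to the header parser until a rule line
 * of underscores is seen; after that every non-blank line goes to the
 * body parser. Results are printed to `out`. Returns the number of rows. */
size_t parse_report(FILE *in, FILE *out)
{
    char line[MAX_LINE];
    char *fields[MAX_FIELDS];
    char *key, *value;
    int in_body = 0;
    size_t rows = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (line[strspn(line, " \t\r\n")] == '\0')
            continue;                        /* blank line */
        if (line[strspn(line, "_ \t\r\n")] == '\0') {
            in_body = 1;                     /* rule line ends the header */
            continue;
        }
        if (!in_body) {
            if (parse_header_line(line, &key, &value))
                fprintf(out, "header %s = %s\n", key, value);
        } else {
            size_t n = split_on_gaps(line, fields, MAX_FIELDS);
            size_t i;

            rows++;
            for (i = 0; i < n; i++)
                fprintf(out, "row %lu field %lu: %s\n",
                        (unsigned long)rows, (unsigned long)i, fields[i]);
        }
    }
    return rows;
}
```

    Calling parse_report(fp, stdout) on the unwrapped report should print
    the DATE, Area and WRKPCK values and then one group of fields per row.
    A row with an empty column would come out one field short under this
    scheme, since the two gaps around it merge; if that can happen, fixed
    character positions are the safer choice.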

    Your question is really more appropriate for news:comp.programming
    since you do not have a C question but a programming one.

    Personally, I would parse the header and then throw the body into a
    text file and use an ODBC text file connection. But that's neither
    here nor there and it's not even topical on news:comp.lang.c

    bcnu
    user923005, Apr 11, 2007
    #6
  7. In article <>,
    Martin Ambuhl <> wrote:

    >>>> Please check the attached text file.


    >>> No, thank you. Do not post attachments to text newsgroups.


    >> In fact he didn't post an attachment. He just appended the text to
    >> his message.


    >As silly as it may be, I tend to believe posters unless this has already
    >been shown to be an error. At the point that (an
    >attribution you unaccountably removed) wrote that there was an attached
    >file, there was no reason to doubt his word. Silly me.


    "Attached" doesn't only mean "MIME attachment". The phrase "the
    attached X" has been around for a long time.

    >In the same way, if I encountered your post without known earlier
    >context, I might think that I had written
    > >>> Please check the attached text file.

    >I would wonder when I did it, and why there were extra '>'s, but you
    >claimed that I wrote, so I must have.


    When you become familiar with Usenet quoting conventions, you will not
    have this problem.

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
    Richard Tobin, Apr 11, 2007
    #7
  8. Richard Tobin wrote:
    > In article <>,
    > Martin Ambuhl <> wrote:


    >> In the same way, if I encountered your post without known earlier
    >> context, I might think that I had written
    >>>>> Please check the attached text file.

    >> I would wonder when I did it, and why there were extra '>'s, but you
    >> claimed that I wrote, so I must have.

    >
    > When you become familiar with Usenet quoting conventions, you will not
    > have this problem.


    When you become familiar with Usenet quoting conventions, you will no
    longer snip attributions away from text you are quoting. Your
    practice is at least cavalier and sloppy; it is possibly dishonest as well.
    Martin Ambuhl, Apr 11, 2007
    #8
  9. Ian Collins Guest

    Richard Tobin wrote:
    >
    > When you become familiar with Usenet quoting conventions, you will not
    > have this problem.
    >

    Your conventions, or everyone else's?

    --
    Ian Collins.
    Ian Collins, Apr 11, 2007
    #9
  10. Default User Guest

    user923005 wrote:

    > You have two formats, one for the header and one for the body.


    I do?

    > Write a function to parse the header and another function to parse the
    > body.


    Why would I want to do that?

    > Do not write one function to do both jobs. It is also easy to
    > recognize the transition between header and body so write a
    > controlling function that sees the header and calls the header parser
    > and then sees the body and calls the body parser.


    If you say so!

    > Your question is really more appropriate for news:comp.programming
    > since you do not have a C question but a programming one.


    I didn't know I had a question.

    > Personally, I would parse the header and then throw the body into a
    > text file and use an ODBC text file connection. But that's neither
    > here nor there and it's not even topical on news:comp.lang.c


    You're the one who brought it up.


    Unless, perhaps, this was meant as a reply to some OTHER person. As
    there are no quotes or attributions, that's rank speculation on my part.




    Brian
    Default User, Apr 11, 2007
    #10
  11. In article <>,
    Ian Collins <> wrote:

    >> When you become familiar with Usenet quoting conventions, you will not
    >> have this problem.


    >Your conventions, or everyone else's?


    It should be obvious that there are various conventions for quoting
    used on Usenet. Regular readers become familiar with the most
    commonly used ones.

    I have no plans to continue this discussion, since I doubt any of us
    will change our views.

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
    Richard Tobin, Apr 12, 2007
    #11
  12. Richard Tobin said:

    > In article <>,
    > Ian Collins <> wrote:
    >
    >>> When you become familiar with Usenet quoting conventions, you will
    >>> not have this problem.

    >
    >>Your conventions, or everyone else's?

    >
    > It should be obvious that there are various conventions for quoting
    > used on Usenet. Regular readers become familiar with the mostly
    > commonly used ones.


    Come on, Richard - you did snip an attrib, now, didn't you?

    Surely you and Martin are both above this sort of bickering.

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, Apr 12, 2007
    #12
  13. Default User said:

    <snip>
    >
    > Unless, perhaps, this was meant as a reply to some OTHER person. As
    > there's no quotes are attributions, that's rank speculation on my
    > part.


    It wasn't intended for you, as we can deduce quickly and easily from the
    available information.

    Firstly, it was written in English, so it's not intended for Pedro
    Garcia, a Spaniard living in Barcelona who has no English. Secondly, it
    was posted in comp.lang.c, so it is not intended for Her Majesty Queen
    Elizabeth II, who - much to her regret - doesn't have time to read
    comp.lang.c. Thirdly, it was recommending a modular approach to
    programming, so it was not intended for me, because the author knows I
    already favour such an approach so it would have been pointless to try
    to persuade me.

    So if it's not intended for Pedro, Elizabeth or myself, it must have
    been intended for someone else. And if it was intended for someone
    else, it can't have been intended for you.

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, Apr 12, 2007
    #13
  14. Richard Bos Guest

    (Richard Tobin) wrote:

    > In article <>,
    > Ian Collins <> wrote:


    [ No, he didn't; the above attribution is a lie. ]

    > >> When you become familiar with Usenet quoting conventions, you will not
    > >> have this problem.

    >
    > >Your conventions, or everyone else's?

    >
    > It should be obvious that there are various conventions for quoting
    > used on Usenet. Regular readers become familiar with the mostly
    > commonly used ones.


    However, _the_ convention is not to snip attribution lines when material
    from that poster is still in the post. Honest posters abide by this
    convention.

    Richard
    Richard Bos, Apr 12, 2007
    #14
  15. In article <>,
    Richard Heathfield <> wrote:

    >> It should be obvious that there are various conventions for quoting
    >> used on Usenet. Regular readers become familiar with the mostly
    >> commonly used ones.


    >Come on, Richard - you did snip an attrib, now, didn't you?


    Of course. I usually only include the attribution of the article I'm
    replying to. And that is a common Usenet convention that regular
    users should be used to. It's obvious from the angle brackets when
    the quoted text contains older quoted text, and I consider it a waste
    of space - and a reduction in readability - to include attributions
    all the way back. If someone wants to trace the history of a
    discussion, they can look at the parent article in their newsreader or
    on Google. If that doesn't work, or is too much trouble, well it's
    only Usenet isn't it?

    I've been following this convention for 20 years, and no-one has
    complained about it except a few posters in comp.lang.c in the last
    year.

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
    Richard Tobin, Apr 12, 2007
    #15
  16. Richard Tobin said:

    > In article <>,
    > Richard Heathfield <> wrote:
    >
    >>> It should be obvious that there are various conventions for quoting
    >>> used on Usenet. Regular readers become familiar with the mostly
    >>> commonly used ones.

    >
    >>Come on, Richard - you did snip an attrib, now, didn't you?

    >
    > Of course. I usually only include the attribution of the article I'm
    > replying to. And that is a common Usenet convention that regular
    > users should be used to.


    First I've heard of it (as a convention, anyway). I must confess that
    I've often been tempted to do likewise, but always refrained (as far as
    I can recall) because it is not my desire to flout what I perceive to
    be useful and meaningful conventions.

    > I've been following this convention for 20 years, and no-one has
    > complained about it except a few posters in comp.lang.c in the lasat
    > year.


    It's not something I generally complain about myself, on the whole, but
    it did seem to me on this occasion that Martin had a point. I also
    think the whole thing is a

    | *** ** |
    | ***** |------+
    | ****** |----+ |
    | \ \ \ \ | | |
    | \ \ \ \ | | |
    | \ \ \ |----+ |
    | \ \ \ |------+
    +---------+

    Don't you? :)

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, Apr 12, 2007
    #16
  17. Default User Guest

    Richard Heathfield wrote:

    > Default User said:
    >
    > <snip>
    > >
    > > Unless, perhaps, this was meant as a reply to some OTHER person. As
    > > there's no quotes are attributions, that's rank speculation on my
    > > part.

    >
    > It wasn't intended for you, as we can deduce quickly and easily from
    > the available information.


    Ah, is that so? I saw little or no information that indicated it wasn't
    directed towards me. If not me, the person reading it, then who? There
    was no information along that line.

    Seriously, I could have given Mr. Corbit the standard post that we give
    rank newbies that violate netiquette by failing to quote, but I know
    and you know that he knows better.





    Brian
    Default User, Apr 12, 2007
    #17
  18. user923005 Guest

    On Apr 11, 11:16 pm, Richard Heathfield <> wrote:
    > Default User said:
    >
    > <snip>
    >
    >
    >
    > > Unless, perhaps, this was meant as a reply to some OTHER person. As
    > > there's no quotes are attributions, that's rank speculation on my
    > > part.

    >
    > It wasn't intended for you, as we can deduce quickly and easily from the
    > available information.
    >
    > Firstly, it was written in English, so it's not intended for Pedro
    > Garcia, a Spaniard living in Barcelona who has no English. Secondly, it
    > was posted in comp.lang.c, so it is not intended for Her Majesty Queen
    > Elizabeth II, who - much to her regret - doesn't have time to read
    > comp.lang.c. Thirdly, it was recommending a modular approach to
    > programming, so it was not intended for me, because the author knows I
    > already favour such an approach so it would have been pointless to try
    > to persuade me.
    >
    > So if it's not intended for Pedro, Elizabeth or myself, it must have
    > been intended for someone else. And if it was intended for someone
    > else, it can't have been intended for you.


    My format could have included some context. Some day, in the
    hypothetical future, we may lose all threading context to the
    news:comp.lang.c archives. Some future net denizen may download my
    post and (lacking sufficient context) find it rather puzzling (though
    no more puzzling than the initial response). At any rate, I consider
    it a reminder to include enough context so that the post may be read
    on its own. Sometimes, I am only thinking of solving the problem of
    the O.P. and perhaps that is short-sighted.
    user923005, Apr 12, 2007
    #18
  19. Default User Guest

    user923005 wrote:

    > On Apr 11, 11:16 pm, Richard Heathfield <> wrote:
    > > Default User said:
    > >
    > > <snip>
    > >
    > >
    > >
    > > > Unless, perhaps, this was meant as a reply to some OTHER person.
    > > > As there's no quotes are attributions, that's rank speculation on
    > > > my part.

    > >
    > > It wasn't intended for you, as we can deduce quickly and easily
    > > from the available information.


    > My format could have included some context. Some day, in the
    > hypothetical future, we may lose all threading context to the
    > news:comp.lang.c archives. Some future net denizen may download my
    > post and (lacking sufficient context) find it rather puzzling (though
    > no more puzzling than the initial response). At any rate, I consider
    > it a reminder to include enough context so that the post may be read
    > on its own.


    While my post had a somewhat mocking tone, that was mainly because I
    felt you deserved more than "quote context dummy!!!"

    In fact, I really didn't know who it was addressed towards. I run my
    newsreader set to display only unread messages, and the original was no
    longer in my view, having been read in some previous session. Now, I
    could have clicked up and switched the view to show all messages, and
    found the original, but chose not to.

    Some newsreaders have the ability to traverse the message tree even if
    they aren't currently displayed. As far as I can tell, mine does not do
    that.

    > Sometimes, I am only thinking of solving the problem of
    > the O.P. and perhaps that is short-sighted.


    That can happen.




    Brian
    Default User, Apr 12, 2007
    #19
  20. On 12 Apr 2007 09:51:51 GMT, in comp.lang.c ,
    (Richard Tobin) wrote:

    >In article <>,
    >Richard Heathfield <> wrote:
    >
    >>> It should be obvious that there are various conventions for quoting
    >>> used on Usenet. Regular readers become familiar with the mostly
    >>> commonly used ones.

    >
    >>Come on, Richard - you did snip an attrib, now, didn't you?

    >
    >Of course. I usually only include the attribution of the article I'm
    >replying to. And that is a common Usenet convention that regular
    >users should be used to.


    I'm a regular user of this and many other groups, and I've never heard
    of this so-called common convention before.

    >and I consider it a waste
    >of space - and a reduction in readability - to include attributions
    >all the way back.


    Fair enough, if the context and attributions aren't germane.

    >If someone wants to trace the history of a
    >discussion, they can look at the parent article in their newsreader or
    >on Google.


    I strongly disagree with this. You have no way to know the retention
    of your readers' newsreaders or servers, and parent articles may be
    long gone. And expecting me to fire up another app and run another
    search is absurd.
    --
    Mark McIntyre

    "Debugging is twice as hard as writing the code in the first place.
    Therefore, if you write the code as cleverly as possible, you are,
    by definition, not smart enough to debug it."
    --Brian Kernighan
    Mark McIntyre, Apr 13, 2007
    #20