ASCII file parser - to read between brackets ()

olson_ord · Feb 15, 2006

Hi,
My ASCII file is not exactly a comma-separated file. The following is
a small but complete example of such a file. (This is the ISCAS circuit
file format that I need to read in.)

----------- Example c17.bench ----------------
INPUT(1)
INPUT(2)
INPUT(3)
INPUT(6)
INPUT(7)

OUTPUT(22)
OUTPUT(23)

10 = NAND(1, 3)
11 = NAND(3, 6)
16 = NAND(2, 11)
19 = NAND(11, 7)
22 = NAND(10, 16)
23 = NAND(16, 19)
----------------End of Example ----------------

Note that the numbers can also be replaced by other symbols, so they
are to be treated as strings and not as numbers.
E.g. "1" can also be "N_1" or "Node_1", or any other string
representation.

Now I would like to know what should be my approach to reading in this
file, i.e. the algorithm.

Off the top of my head I think I would just have to read in each line
as a string. Then I would search the string for various keywords. On
finding a keyword I would then find the location of the two brackets ()
- and then parse the values between them.
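
A minimal sketch of the line-by-line approach described above, written in C++ since that is the group this was posted to. It locates the brackets with std::string::find and splits what lies between them on commas. The BenchLine struct and the trim and parse_line names are made up for illustration, and it assumes one statement per line and names that contain no commas, brackets or '='.

```cpp
#include <string>
#include <vector>

// One parsed line of a .bench file (hypothetical structure, for illustration).
struct BenchLine {
    std::string output;             // name left of '=', empty for INPUT/OUTPUT
    std::string keyword;            // e.g. "INPUT", "OUTPUT", "NAND"
    std::vector<std::string> args;  // names found between the brackets
};

// Strip leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Returns false for blank lines or lines without a "( ... )" pair.
bool parse_line(const std::string& line, BenchLine& out) {
    const std::string::size_type open = line.find('(');
    const std::string::size_type close = line.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return false;

    out = BenchLine();
    std::string head = line.substr(0, open);
    const std::string::size_type eq = head.find('=');
    if (eq != std::string::npos) {
        out.output = trim(head.substr(0, eq));  // "10" in "10 = NAND(1, 3)"
        head = head.substr(eq + 1);
    }
    out.keyword = trim(head);

    // Split the text between the brackets on commas.
    const std::string inside = line.substr(open + 1, close - open - 1);
    std::string::size_type start = 0;
    while (start <= inside.size()) {
        std::string::size_type comma = inside.find(',', start);
        if (comma == std::string::npos) comma = inside.size();
        const std::string name = trim(inside.substr(start, comma - start));
        if (!name.empty()) out.args.push_back(name);
        start = comma + 1;
    }
    return true;
}
```

Each line read from the file with std::getline can be passed to parse_line. For the line "10 = NAND(1, 3)" it fills in output "10", keyword "NAND" and the two args "1" and "3"; because everything is kept as a string, names like "Node_1" need no special handling.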

I am wondering if this approach is the right way to go.

Thanks a lot guys,
O.O.

Victor Bazarov · Feb 15, 2006

> Hi,
> My ascii file is not exactly a comma separated file. [...]
>
> Now I would like to know what should be my approach to reading in this
> file, i.e. the algorithm.
>
> Off the top of my head I think I would just have to read in each line
> as a string. Then I would search the string for various keywords. On
> finding a keyword I would then find the location of the two brackets ()
> - and then parse the values between them.
>
> I am wondering if this approach is the right way to go.

Sounds fine. I don't see any C++ relation, however. Please don't just
say that you're "writing it in C++". The algorithm you've described can
just as easily be written in almost any other language. Did you mean to
post it to 'comp.programming'?

V

TB · Feb 15, 2006

(e-mail address removed) said:

> [...]
> Off the top of my head I think I would just have to read in each line
> as a string. Then I would search the string for various keywords. On
> finding a keyword I would then find the location of the two brackets ()
> - and then parse the values between them.

Tokenize the input before parsing.
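
A rough sketch of what tokenizing a line could look like in C++. The token definition here is an assumption, not something given in the thread: each of ( ) , = is a token by itself, and any other run of non-whitespace characters is a name. The function name tokenize is made up for illustration.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split one line into tokens. Assumed token definition: each of ( ) , =
// is a token on its own; any other run of non-whitespace characters is
// a single name token.
std::vector<std::string> tokenize(const std::string& line) {
    const std::string punctuation = "(),=";
    std::vector<std::string> tokens;
    std::string current;  // name token being accumulated

    for (char ch : line) {
        const bool is_space = std::isspace(static_cast<unsigned char>(ch)) != 0;
        const bool is_punct = punctuation.find(ch) != std::string::npos;
        if (is_space || is_punct) {
            // A separator ends the name that was being accumulated, if any.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
            if (is_punct) tokens.push_back(std::string(1, ch));
        } else {
            current += ch;
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

With this definition, "10 = NAND(1, 3)" becomes the eight tokens 10, =, NAND, (, 1, the comma, 3 and ). The parsing step then only has to look at the token sequence and no longer cares about spacing.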

Ivan Vecerina · Feb 15, 2006

: [...]
: Now I would like to know what should be my approach to reading in this
: file, i.e. the algorithm.
: [...]
: I am wondering if this approach is the right way to go.

There are several ways in which this can be accomplished.
But because I don't know the complete 'grammar' of the file,
I am not sure which would be the most appropriate
(e.g. I assume there is not only NAND but also XOR, etc.
Can more complex expressions be used? A unary NOT?)

In any case, rather than parsing each line manually, you
could use one of the existing lexers or parser generators,
such as flex (with or without bison; a bit old-fashioned,
but it works: http://www.gnu.org/software/flex/)
or boost::spirit (http://www.boost.org/libs/spirit/index.html).

If the files are simple enough, a regular-expressions package
might be an alternative for extracting needed identifiers from
each line (e.g. http://www.boost.org/libs/regex/doc/index.html)
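
To illustrate the regular-expression route, here is a small sketch for the gate lines only. It uses std::regex, which entered the standard library with C++11 (after this thread), rather than the Boost package linked above; the idea is the same. The Gate struct, the match_gate name and the pattern itself are assumptions made for illustration, including the guess that names contain no whitespace, commas, brackets or '='.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type for a line such as "16 = NAND(2, 11)".
struct Gate {
    std::string output;
    std::string type;
    std::vector<std::string> inputs;
};

// Returns true and fills 'gate' if the whole line looks like
//   <name> = <TYPE>(<name>, <name>, ...)
bool match_gate(const std::string& line, Gate& gate) {
    // Group 1: output name, group 2: gate type, group 3: text in brackets.
    static const std::regex gate_re(
        R"re(\s*([^\s=(),]+)\s*=\s*([A-Za-z_]\w*)\s*\(([^()]*)\)\s*)re");
    // A name inside the brackets: a run of anything but space or comma.
    static const std::regex name_re(R"re([^\s,]+)re");

    std::smatch m;
    if (!std::regex_match(line, m, gate_re)) return false;

    gate = Gate();
    gate.output = m[1].str();
    gate.type = m[2].str();

    // Keep the argument text in a named string so the iterators stay valid.
    const std::string args = m[3].str();
    for (std::sregex_iterator it(args.begin(), args.end(), name_re), end;
         it != end; ++it) {
        gate.inputs.push_back(it->str());
    }
    return true;
}
```

A second, similar pattern would be needed for the INPUT(...) and OUTPUT(...) lines, which this one deliberately rejects because they contain no '='.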

These are among a number of other options...
hth-Ivan

olson_ord · Feb 15, 2006

Dear Victor,
Thanks for responding. I forgot to mention in my post that I am
dealing with C++. I know that my algorithm was general, but sometimes a
certain language may have some features to handle this situation
differently. E.g. I had heard of regular expressions in Perl, and I thought I
could not use them in C++. I also did not know what "tokenize" means;
I only learnt that after TB suggested it.
That's what I was looking for.
Thanks,
O.O.

olson_ord · Feb 15, 2006

Thanks TB. I think this is what I was looking for. I think my file is
simple enough that I don't need to use regular expressions. I have found some
code in the example at http://www.codeproject.com/cpp/stringtok.asp
that I think will be useful to me.
Regards,
O.O.

Victor Bazarov · Feb 15, 2006

> Thanks for responding. I forgot to mention in my post that I am
> dealing with C++.

That's what I was afraid of...

> I know that my algorithm was general, but sometimes a
> certain language may have some features to handle this situation
> differently. E.g. I had heard of RegEx's in perl, and I thought I
> could not use them in C++.

They are not part of the language yet. As soon as you see the TR1
implemented, you could try using <regex> and whatever it is going to
contain. Until then, alas, no language mechanism to help you except some
very simple ones, like 'string', 'fstream', and others of which you are
probably already aware.

> Also I did not know of what Tokenize means
> which I learnt only after TB suggested it.

"Tokenize" usually means "identify and split the input stream into tokens"
and it can mean _whatever_you_make_it_to_mean_ because it depends entirely
on your definition of "a token".

V

olson_ord · Feb 15, 2006

Thanks Ivan. I have heard of regular expressions, but I have not used them
much. I think I will start with a string tokenizer, and if that becomes
too complicated I will attempt this method.
O.O.

pillbug · Feb 16, 2006

> Thanks Ivan. I have heard of RegEx's - but I have not used them
> much. I think I would start with string tokenizer and if that becomes
> too complicated I would attempt this method.
> O.O.

most implementations of sscanf will handle this for you no problem.

if you don't like sscanf you can use regex.

if you don't like regex you can use lex (probably don't need a parser,
just the scanner should suffice).

if you don't like lex/yacc you can write your own scanner, the grammar
you have there isn't too complex.

if you don't want to write your own scanner you can..

actually that's sort of the problem people have with c++. your options
when dealing with any particular problem are quite literally, endless.

perl kind of herds you into trying to approach everything with regex's
and hashes, while vb/.net will get you to buy some prebuilt item. i'm
guessing that's why you came to a c++ group, to find out what approach c++
lends itself most easily to. unfortunately, c++ lends itself to just
about every solution.
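
for the sscanf option at the top of that list, a hedged sketch of how a two-input gate line might be pulled apart with scansets. the TwoInputGate struct and the scan_two_input_gate name are invented for illustration, the 63-character name limit is an arbitrary assumption, and the fixed number of inputs is a real limitation of doing it this way.

```cpp
#include <cstdio>
#include <string>

// Hypothetical result type for a line such as "10 = NAND(1, 3)".
struct TwoInputGate {
    std::string output, type, in1, in2;
};

// Returns true only if the line has the shape  name = TYPE(name, name)
// Names longer than 63 characters are not handled (assumed limit).
bool scan_two_input_gate(const std::string& line, TwoInputGate& gate) {
    char output[64], type[64], in1[64], in2[64];
    int consumed = -1;  // set by %n only if the closing ')' was matched

    // A space in the format skips any amount of whitespace.
    // %63[^...] reads up to 63 characters that are NOT in the listed set.
    const int fields = std::sscanf(
        line.c_str(),
        " %63[^ \t=] = %63[^ \t(] ( %63[^ \t,)] , %63[^ \t,)] )%n",
        output, type, in1, in2, &consumed);

    if (fields != 4 || consumed < 0) return false;

    gate.output = output;
    gate.type = type;
    gate.in1 = in1;
    gate.in2 = in2;
    return true;
}
```

the %n at the end is there because the return value only counts the four converted fields, so without it a missing closing bracket would go unnoticed. INPUT(...) and OUTPUT(...) lines would need their own, shorter format string.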

olson_ord · Feb 16, 2006

Thanks pillbug. I did not have this insight before.
O.O.
