Efficient way to process 100 KB of data in C

sam_cit

Hi everyone,

I'm currently working in an embedded environment, so resources such
as memory are constrained. I need to pass an entire text file to a
parser (implemented in lex/yacc). The maximum size of the text file
is 100 KB, and I can't allocate a buffer to hold the complete 100 KB
of data in memory for the parser to read from.

I realise that another solution is to store the data in a temp file
and pass the file to the parser. However, in my case I want to avoid
the file approach because it is costly in performance.

I know this is more of a general programming question than a C
question, but I decided to post here as I'm implementing the logic in
C. Can anyone suggest other solutions?

Thanks in advance!
 
Laurent Deniau

> Hi everyone,
>
> I'm currently working in an embedded environment, so resources such
> as memory are constrained. I need to pass an entire text file to a
> parser (implemented in lex/yacc). The maximum size of the text file
> is 100 KB, and I can't allocate a buffer to hold the complete 100 KB
> of data in memory for the parser to read from.

Why do you want to store the entire file in memory? lex reads its
input in buffered mode, which means that only a small part of the file
is in memory at any time. If your yacc grammar has recursive rules,
ensure that they are left-recursive to minimize memory usage.
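
As a rough illustration of the buffered reading described above, here is a minimal C sketch that feeds a stream to a consumer one fixed-size chunk at a time, so only `CHUNK_SIZE` bytes are ever held in memory. This is not lex's actual implementation; `CHUNK_SIZE`, `chunk_fn`, `count_newlines`, and `process_in_chunks` are hypothetical names chosen for the example, and the consumer merely counts newlines as a stand-in for real scanning work.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical chunk size; tune it to the memory you can spare. */
#define CHUNK_SIZE 512

/* Hypothetical callback type: receives one chunk of input at a time. */
typedef void (*chunk_fn)(const char *data, size_t len, void *ctx);

/* Example consumer: counts newline characters seen so far.
   ctx must point to a size_t counter. */
void count_newlines(const char *data, size_t len, void *ctx)
{
    size_t *count = ctx;
    for (size_t i = 0; i < len; i++)
        if (data[i] == '\n')
            (*count)++;
}

/* Read fp to the end in CHUNK_SIZE pieces, handing each piece to
   consume(). Returns the total number of bytes read. */
size_t process_in_chunks(FILE *fp, chunk_fn consume, void *ctx)
{
    char buf[CHUNK_SIZE];
    size_t total = 0;
    size_t n;

    assert(fp != NULL && consume != NULL);
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        consume(buf, n, ctx);
        total += n;
    }
    return total;
}
```

A caller would open the input with `fopen`, pass the `FILE *` together with a consumer and its context pointer, and check `ferror` afterwards if it needs to distinguish a read error from end of file.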

> I realise that another solution is to store the data in a temp file
> and pass the file to the parser. However, in my case I want to avoid
> the file approach because it is costly in performance.
>
> I know this is more of a general programming question than a C
> question, but I decided to post here as I'm implementing the logic in
> C. Can anyone suggest other solutions?

Use parser combinators, as in functional languages. You can read

http://www.math.chalmers.se/~koen/ParserComboC/parser-combo-c.html

for an introduction to this topic in C. It can be less efficient in
speed (depending on your grammar) but more efficient in memory usage.

a+, ld.
 
Richard Tobin

> I'm currently working in an embedded environment, so resources such
> as memory are constrained. I need to pass an entire text file to a
> parser (implemented in lex/yacc). The maximum size of the text file
> is 100 KB, and I can't allocate a buffer to hold the complete 100 KB
> of data in memory for the parser to read from.
>
> I realise that another solution is to store the data in a temp file
> and pass the file to the parser. However, in my case I want to avoid
> the file approach because it is costly in performance.

If the data is already in a file, why would you need to put it into
a temporary file? Why can't you parse it from the original file?

-- Richard
 
CBFalconer

> I'm currently working in an embedded environment, so resources such
> as memory are constrained. I need to pass an entire text file to a
> parser (implemented in lex/yacc). The maximum size of the text file
> is 100 KB, and I can't allocate a buffer to hold the complete 100 KB
> of data in memory for the parser to read from.
>
> I realise that another solution is to store the data in a temp file
> and pass the file to the parser. However, in my case I want to avoid
> the file approach because it is costly in performance.
>
> I know this is more of a general programming question than a C
> question, but I decided to post here as I'm implementing the logic in
> C. Can anyone suggest other solutions?

Why any fuss? Just pass the file name, or even the opened file, as
needed. I see no reason the parser can't open and read the file.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
 
sam_cit

> If the data is already in a file, why would you need to put it into
> a temporary file? Why can't you parse it from the original file?

Sorry, everybody, for not giving the complete picture.
The parser library is third-party code, and its current implementation
reads the input from a buffer, not from a file (they have redefined
the input function to read from a buffer instead of a file).

That is why I want to avoid the temp file approach.
Given this, can you suggest any other idea or approach?

Thanks in advance!
 
Chris Dollin

> Sorry, everybody, for not giving the complete picture.
> The parser library is third-party code, and its current implementation
> reads the input from a buffer, not from a file (they have redefined
> the input function to read from a buffer instead of a file).
>
> That is why I want to avoid the temp file approach.
> Given this, can you suggest any other idea or approach?

It seems to me you're stuffed: the parser /demands/ a buffer with the
complete source, and /you don't have/ enough memory for that buffer.

It doesn't matter what the performance using a file would be, since
you can't use a file.

The obvious course of action is to go to the third party and have
them fix their parser so it doesn't demand the whole file in one
lump.

Not that any of this so far has anything to do with C, of course.
 
sam_cit

> The obvious course of action is to go to the third party and have
> them fix their parser so it doesn't demand the whole file in one
> lump.

Are you indicating that there is no other possible approach to solve
the problem?
 
mark_bluemel

> Sorry, everybody, for not giving the complete picture.
> The parser library is third-party code, and its current implementation
> reads the input from a buffer, not from a file (they have redefined
> the input function to read from a buffer instead of a file).

What do you mean by a buffer? That you have to pass the address of the
string of data into the parser?

Then you have a problem. You have to provide the parser with 100K of
data in memory, due to the parser design, but you don't have room for
that 100K.

Short of implementing your own virtual memory system, or modifying the
parser interface, I don't see how you expect to resolve this issue.
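
To make the "modifying the parser interface" option concrete, here is a minimal C sketch of a windowed accessor: it keeps only `WINDOW_SIZE` bytes of the file in memory and reloads the window whenever a byte outside it is requested. It only helps if the parser can be changed to fetch input through such a function instead of a raw pointer, which the third-party parser described in this thread does not do. `WINDOW_SIZE`, `struct window`, `window_init`, and `window_get` are hypothetical names for this example, not part of any existing library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical window size; the only input memory the reader needs. */
#define WINDOW_SIZE 256

struct window {
    FILE *fp;                       /* backing file, opened by the caller */
    long base;                      /* file offset of buf[0]; -1 if empty */
    size_t len;                     /* number of valid bytes in buf */
    unsigned char buf[WINDOW_SIZE];
};

void window_init(struct window *w, FILE *fp)
{
    w->fp = fp;
    w->base = -1;
    w->len = 0;
}

/* Return the byte at the given file offset, or EOF if the offset is
   past the end of the file or the seek fails. */
int window_get(struct window *w, long offset)
{
    assert(w != NULL && w->fp != NULL && offset >= 0);

    if (w->base < 0 || offset < w->base ||
        offset >= w->base + (long)w->len) {
        /* Miss: load the aligned window that contains offset. */
        long base = offset - offset % WINDOW_SIZE;
        if (fseek(w->fp, base, SEEK_SET) != 0)
            return EOF;
        w->len = fread(w->buf, 1, WINDOW_SIZE, w->fp);
        w->base = base;
        if (offset >= base + (long)w->len)
            return EOF;
    }
    return w->buf[offset - w->base];
}
```

Sequential access costs one `fread` per `WINDOW_SIZE` bytes; a parser that jumps back and forth across window boundaries would reload on every jump, which is the price paid for the small memory footprint.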
 
sam_cit

> What do you mean by a buffer? That you have to pass the address of the
> string of data into the parser?

I meant a chunk of memory holding 100 KB of data for the parser to
start decoding...

> Then you have a problem. You have to provide the parser with 100K of
> data in memory, due to the parser design, but you don't have room for
> that 100K.

Yes, exactly, that's the problem here...

> Short of implementing your own virtual memory system, or modifying the
> parser interface, I don't see how you expect to resolve this issue.

Can you give a link? I think you are suggesting a kind of mechanism
at the application level instead of the operating system.
I would appreciate it if you could post a link to, say, a tutorial or
a similar kind of implementation...
 
mark_bluemel

> Can you give a link? I think you are suggesting a kind of mechanism
> at the application level instead of the operating system.

I'm afraid my somewhat ironic tone has prevented you taking my point:
you'd have to change the operating system to provide a virtual memory
solution.
 
