read data from a file algorithm?


james.burrows

I'm looking for a quick way to read lines of data from a text file into
memory. Each line will contain e.g. a book entry: -

id,name,price,category,author,description

I could have over 20000 of these lines, and I'd like a function that
reads in the list of books grabbing only the id and name fields. Then
I'd like another function that when passed an id will grab all the
details from that entry.

So my dilemma is how to read in this information as efficiently as
possible. So far the best I can think of is a random access file with
each field (and therefore each record) taking up X bytes. I can then
use a file pointer and get to a specific line very quickly. The two
reasons I don't want to do this are (a) I have to give limits on field
lengths, and (b) I can't really edit the entries in the file by hand
(which would be desirable).

Anyone have any ideas on another way I can accomplish an efficient data
retrieval like this from a file?

Many thanks,
James
 

Victor Bazarov

Use 'std::getline', collect all lines into a 'std::vector<std::string>',
then process them in memory as needed. Parsing should be done after you
read all of it, not during. Finish reading ASAP, close the file and
then muck with your data at your leisure.

Parsing each line can be done with the help of the 'find' member of the
'std::string' class.

V
 

Alf P. Steinbach

If I understand it correctly, your problem is that you don't want 20 MiB
of memory used just for this, you want to minimize memory usage and,
within that constraint, also access time, by using some preprocessing.

Then I suggest preprocessing the file, generating a better suited binary
file -- which is probably easiest to do using a simple database (e.g.
MySQL).

I've set follow ups to [comp.programming], since AFAICS this doesn't
have anything to do with C++.
 

Jerry Coffin

Create an index of the file. When your program starts up,
have it step through the file one line at a time, reading
in the ID and name of each entry. Create two maps, one
with each of those as the key. The data you'll associate
with each is the file offset where that record can be
found.

When you need to find something by ID, you look it up in
the index, seek the file to the specified offset, and
read in the record (line).

If the file were extremely large, you'd probably want to
maintain the index between uses, but indexing 20000
entries should be fast enough that it's unlikely to
matter much (unless the entries are _quite_ large).
Rebuilding the index each time also means that (as you
mentioned was desirable) you can hand-edit the file
without causing a problem (you just can't do it while the
program is running).
 
