Help needed using ifstream::seekg with Windows text file

wtnt

Hello.
I've searched all over and haven't seen another thread with this
problem. Please bear with me as I try to explain. Thanks. :)

I have some programs that need to be cross-platform compatible (Unix
and Windows XP). The first program parses a text file and records where
each snippet is: where it begins (char offset from the beginning of
the file) and its length (number of chars).

One can almost use "byte" and "char" interchangeably here, given that
sizeof(char) is 1, however it doesn't quite work that way.

The second program tries to retrieve the snippets from this text file
using the information collected by program1. program2 uses seekg and
read, something like:

char* snippet = new char[length + 1];   // +1 for the terminating '\0'
ifstream read(file); // OR ifstream read(file, ios::binary);
read.seekg(offset, ios::beg);           // offset recorded by program1
read.read(snippet, length);             // length recorded by program1
read.close();
snippet[length] = '\0';

The problem is that when the code runs as above in text mode, read
reads in the requested number of characters, but seekg seeks by number
of bytes. So if I seek to 0, I read in exactly what I need. If I seek
to any offset > 0, it doesn't land on the right spot: it comes up
short, and the number of characters it is short by appears to match
the number of newlines before that point in the file. (read still
reads in the correct number of characters from that point.)

I know that Windows and Unix treat newlines differently, but I can't
quite understand the behavior on Windows well enough to get it to do
what I want. It behaves as if a newline is 1 char with a size of 2
bytes (although if I go through it character by character and print
out sizeof(*c), I never get 2; it's always 1).
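One way to check the "1 char of 2 bytes" guess is to look at the raw bytes in binary mode. The sketch below is only an illustration: writeRaw and readRaw are hypothetical helper names, not part of program1 or program2. It assumes the usual Windows convention that a newline is stored as two separate one-byte chars, '\r' followed by '\n'.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes `bytes` to `path` unchanged (binary mode,
// so no newline translation happens on any platform).
void writeRaw(const std::string& path, const std::string& bytes) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Hypothetical helper: returns the file's contents exactly as stored on disk.
std::string readRaw(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Reading a file that holds "ab", a Windows newline, and "cd" this way gives six chars, with '\r' and '\n' as two distinct elements. A text-mode read on Windows would hand back five, because the pair is translated to a single '\n'; sizeof never reports 2 because each element is still an ordinary char.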

If I set it to binary mode, both seekg and read work in terms of
bytes. Successive seeks read in snippets one right after the other as
they appear in the text file, with no overlap (actually, it drops 1
character in between, for which I have no explanation), and each
snippet has fewer visible characters than its length.

This problem does not occur on Unix, and it does not occur on Windows
if I transfer a text file from Unix in "binary" mode so that the \n's
don't get replaced with Windows newlines. In these cases, program2
behaves as expected regardless of whether it's in binary or text mode.

Is there any way to get a file pointer to seek to a place in a text
file according to the number of characters, like the way read behaves
in text mode? This would be the simplest solution.

I guess one could say: get program1 to store the snippet information
in terms of bytes and not number of characters. I'm not sure how to do
that. What it does is count characters, and on Windows the newline
still counts as 1 character. I guess I could add an extra byte every
time there is a Windows-style newline, but I'm not even sure my
assessment that a Windows newline = 1 char of 2 bytes is actually
true.
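The "add an extra byte per Windows newline" idea could be sketched roughly as follows. byteOffsetFromTextOffset is a hypothetical name, and the sketch assumes the file's bytes were read in binary mode and that the recorded offsets counted every "\r\n" pair as one character. A lone '\r' or '\n' is treated as one character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper. `raw` holds the file's bytes as read in binary mode;
// `textOffset` is a character count in which each "\r\n" pair counted as one.
// Returns the matching byte offset into `raw` (clamped to raw.size()).
std::size_t byteOffsetFromTextOffset(const std::string& raw,
                                     std::size_t textOffset) {
    std::size_t bytePos = 0;  // position in the raw bytes
    std::size_t chars = 0;    // characters counted the text-mode way
    while (chars < textOffset && bytePos < raw.size()) {
        const bool crlf = raw[bytePos] == '\r' &&
                          bytePos + 1 < raw.size() &&
                          raw[bytePos + 1] == '\n';
        bytePos += crlf ? 2 : 1;  // a "\r\n" pair is two bytes, one character
        ++chars;
    }
    return bytePos;
}
```

This walks the file from the start for every lookup, so it is only meant to show the arithmetic. Recording byte offsets directly in program1 avoids the conversion altogether.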

What would be a more elegant solution that would work on both
platforms?

Thank you for your help.
 
Jonathan Turkanis

wtnt said:
One can almost use "byte" and "char" interchangeably here, given that
sizeof(char) is 1, however it doesn't quite work that way.

You simply can't store offsets into files portably in text mode. In
binary mode you can, if you remember that there may be an arbitrary
number of null characters added at the end of the file. See the
Dinkumware online documentation for an explanation
(http://www.dinkumware.com/refxcpp.html; go to the C++ table of
contents and look under "Files and Streams"). Alternatively, see
P.J. Plauger's book on the C standard library.
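A minimal sketch of the binary-mode approach, assuming program1 also opens the file with ios::binary and records byte offsets and byte lengths. readSnippet is a hypothetical name, and returning a std::string instead of a new[]'d buffer is just a convenience of the sketch.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: reads up to `length` bytes starting `offset` bytes
// from the beginning of the file. Binary mode keeps seekg and read in the
// same unit (bytes) on both Unix and Windows.
std::string readSnippet(const std::string& path,
                        std::streamoff offset,
                        std::size_t length) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return std::string();  // could not open the file
    }
    in.seekg(offset, std::ios::beg);
    std::string snippet(length, '\0');
    in.read(&snippet[0], static_cast<std::streamsize>(length));
    // Keep only what was actually read (fewer bytes near end of file).
    snippet.resize(static_cast<std::size_t>(in.gcount()));
    return snippet;
}
```

Snippets read this way still contain the '\r' of any Windows newline inside them, so program2 would need to strip or translate those itself if it wants Unix-style text.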

Jonathan
 
