Reading from a text file

Michael McGarry

Hi,

I am reading strings from a text file using fscanf("%s", currToken);

This returns strings delimited by whitespace. How can I tell if the
string was followed by a carriage return in the file?

Michael
 
Walter Roberson

Michael McGarry said:
I am reading strings from a text file using fscanf("%s", currToken);
This returns strings delimited by whitespace. How can I tell if the
string was followed by a carriage return in the file?

The whitespace character (including newline) that terminates the
string will be "put back" into the input stream. Therefore, to
determine whether currToken was followed by a carriage return,
read the next character, test to see if it was carriage return,
and if not then ungetc() the character to put it back in the
input stream.
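
A minimal sketch of that peek-and-push-back step, assuming the token has already been read with fscanf(). The helper name consume_newline() is hypothetical, not a library function.

```c
#include <stdio.h>

/* Returns 1 if the next character in the stream is '\n' (and consumes it),
   0 otherwise. Any other character is pushed back with ungetc(). */
static int consume_newline(FILE *fp)
{
    int c = fgetc(fp);

    if (c == '\n')
        return 1;
    if (c != EOF)
        ungetc(c, fp);   /* not a newline: leave it for the next read */
    return 0;
}
```

Note that this only looks at one character, so a token followed by a trailing space and then a newline is reported as not ending the line.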

By the way, do you really mean "carriage return", ascii character 13?
Or do you mean "the operating system's end of line sequence, whether that
be carriage return or linefeed or some combination of the two or
something else completely" -- which C compactly encodes as \n ?
 
pete

Michael said:
Hi,

I am reading strings from a text file using fscanf("%s", currToken);

This returns strings delimited by whitespace. How can I tell if the
string was followed by a carriage return in the file?

You can read lines at a time, from a text file,
and find the white space in the resulting string,
if you're really looking to tokenise on white space.
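
One possible way to do the second half of this, sketched with strtok(). The function name split_line() and the delimiter set are assumptions made for illustration; the line itself would come from something like fgets(line, sizeof line, fp).

```c
#include <string.h>

/* Splits line in place on spaces, tabs, and line-ending characters.
   Stores up to max token pointers in tokens[] and returns the count.
   The pointers refer into line, which is modified by strtok(). */
static int split_line(char *line, char *tokens[], int max)
{
    int n = 0;
    char *tok = strtok(line, " \t\r\n");

    while (tok != NULL && n < max) {
        tokens[n++] = tok;
        tok = strtok(NULL, " \t\r\n");
    }
    return n;
}
```

Because each call handles exactly one line, the last token returned is by construction the one that was followed by the end of the line.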
 
Michael McGarry

Thanks for your reply, I just want to check for the EOL sequence
whatever that may be on some arbitrary OS. I want this code to be
portable.

Thanks for any additional guidance,

Michael
 
Richard Bos

Michael McGarry said:
I am reading strings from a text file using fscanf("%s", currToken);

This returns strings delimited by whitespace. How can I tell if the
string was followed by a carriage return in the file?

Others have given you an answer to your direct question. But let me make
another suggestion: if you need to know this because you want to read an
entire line, rather than separate words, you should also look at
fgets(). It will read a complete line for you without stopping for other
whitespace than newlines. It will also let (in fact, make) you specify a
buffer size, which means buffer overrun errors are less likely. So does
fscanf(), but it doesn't require it, and as you use it above it does not
know how large your buffer is and will merrily write over the end of it.
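
For illustration, here is a sketch of giving fscanf() a field width so it cannot write past the buffer. The limit of 63 characters and the names TOKEN_MAX and read_token() are arbitrary assumptions; the width in the format string has to be kept in step with the buffer size by hand.

```c
#include <stdio.h>

#define TOKEN_MAX 63   /* hypothetical limit; must match the width in "%63s" */

/* Reads one whitespace-delimited token of at most TOKEN_MAX characters.
   Returns 1 on success, 0 on end of file or error. A longer token is
   split: the remainder is returned by the next call. */
static int read_token(FILE *fp, char buf[TOKEN_MAX + 1])
{
    return fscanf(fp, "%63s", buf) == 1;
}
```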

Richard
 
Keith Thompson

Richard Bos said:
Others have given you an answer to your direct question. But let me make
another suggestion: if you need to know this because you want to read an
entire line, rather than separate words, you should also look at
fgets(). It will read a complete line for you without stopping for other
whitespace than newlines. It will also let (in fact, make) you specify a
buffer size, which means buffer overrun errors are less likely. So does
fscanf(), but it doesn't require it, and as you use it above it does not
know how large your buffer is and will merrily write over the end of it.

You also need to decide what to do if the input line is longer than
your buffer. In that case, fgets will read as much of the line as it
can, the buffer won't be terminated by a newline character, and the
remainder of the line will be left on the input stream to be read
later.
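
A small sketch of how that case can be detected. The function name read_line() and its return convention are assumptions chosen for this example, not part of the standard library.

```c
#include <stdio.h>
#include <string.h>

/* Reads up to size-1 characters of one line into buf.
   Returns -1 at end of file or on error,
            1 if a newline was read (it is stripped from buf),
            0 if no newline was read: the line was longer than the
              buffer, or it is a final line with no terminating newline. */
static int read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    return 0;   /* rest of the line, if any, is still in the stream */
}
```

When 0 is returned the caller has to choose a policy: keep calling to collect the remainder, discard it, or treat the input as an error.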
 
Malcolm

Michael McGarry said:
I am reading strings from a text file using fscanf("%s", currToken);

This returns strings delimited by whitespace. How can I tell if the
string was followed by a carriage return in the file?

Don't use fscanf().
The library functions are fine as long as you want the narrow purpose for
which they were designed. As soon as you want to do something a bit unusual,
such as check for carriage returns, your best bet is to build a custom
function on top of fgetc().
(You might need to open your file in binary mode, some OSes suppress the
'\r' character).
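
As a rough sketch of what such a custom function might look like, the following reads one field with fgetc() and reports what ended it. The name read_field() and its return convention are invented for this example, and it assumes the file is opened in text mode so that a line ending appears as '\n'.

```c
#include <ctype.h>
#include <stdio.h>

/* Reads the next whitespace-delimited field into buf (at most size-1
   characters are stored; size must be at least 1).
   Returns '\n' if the field was the last one on its line,
           EOF  if input ended after the field,
           ' '  if another field follows on the same line. */
static int read_field(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    /* Skip blanks before the field, but do not run past a newline. */
    while ((c = fgetc(fp)) != EOF && c != '\n' && isspace(c))
        ;

    /* Collect characters until whitespace or end of input. */
    while (c != EOF && !isspace(c)) {
        if (n + 1 < size)
            buf[n++] = (char)c;
        c = fgetc(fp);
    }
    buf[n] = '\0';

    /* Swallow trailing blanks so "field  \n" still counts as end of line. */
    while (c != EOF && c != '\n' && isspace(c))
        c = fgetc(fp);
    if (c != EOF && c != '\n') {
        ungetc(c, fp);   /* first character of the next field */
        return ' ';
    }
    return c;            /* '\n' or EOF */
}
```

An empty buf together with a return of '\n' indicates a blank line. Using isspace() rather than comparing against ' ' means tabs between fields are handled as well.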
 
Keith Thompson

Malcolm said:
Don't use fscanf().
The library functions are fine as long as you want the narrow purpose for
which they were designed. As soon as you want to do something a bit unusual,
such as check for carriage returns, your best bet is to build a custom
function on top of fgetc().
(You might need to open your file in binary mode, some OSes suppress the
'\r' character).

I suspect when the OP asked about a "carriage return", he really meant
an end-of-line, not a literal '\r' character. Opening a text file in
binary mode is seldom a good idea; the possible variations go beyond
the Unix "LF" vs. Windows "CR/LF" line terminators.
 
Michael McGarry

Hi,

Thanks for all the advice. I just wound up using fgetc() and as soon as
it returned whitespace that was not a space, I assume end of line. In
my case the file contains space separated fields, so a whitespace that
is not a space must be EOL. Is there explicitly an EOL character that
is OS independent?

I would like to make my design more robust to actually check for an
EOL.

Michael
 
Gordon Burditt

Michael McGarry said:
Thanks for all the advice. I just wound up using fgetc() and as soon as
it returned whitespace that was not a space, I assume end of line. In
my case the file contains space separated fields, so a whitespace that
is not a space must be EOL.

Warning: this ignores the existence of tabs, as well as other
characters considered to be white space. This may not be a problem
for *your* files, but you can't assume that for everyone.

Michael McGarry said:
Is there explicitly an EOL character that
is OS independent?

I would like to make my design more robust to actually check for an
EOL.

In C, lines from a text file, opened in text mode, as read by
functions like fgetc(), fgets(), fread(), and *scanf() end in '\n'.
It is irrelevant how the line ending is represented *on disk*.
C translates the line ending to '\n' on reading, and from '\n'
to whatever is the appropriate line ending on writing.

All bets are off for files being read in binary mode.
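
To illustrate the point about text mode, here is a short sketch that only ever compares against '\n' and never against a platform-specific sequence. The helper name skip_to_eol() is hypothetical, and the stream is assumed to have been opened in text mode.

```c
#include <stdio.h>

/* Discards characters up to and including the next '\n'.
   Returns 1 if a newline was found, 0 if input ended first. */
static int skip_to_eol(FILE *fp)
{
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            return 1;
    }
    return 0;
}
```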

Gordon L. Burditt
 
Michael McGarry

Thanks Gordon,

I will explicitly read with fscanf(file, "%s", currToken) until fgetc()
returns '\n'.

Michael
 
Joe Wright

Michael said:
Hi,

Thanks for all the advice. I just wound up using fgetc() and as soon as
it returned whitespace that was not a space, I assume end of line. In
my case the file contains space separated fields, so a whitespace that
is not a space must be EOL. Is there explicitly an EOL character that
is OS independent?

I would like to make my design more robust to actually check for an
EOL.

Michael

Choice of fgetc() and/or fgets() makes a lot of sense to me. Also,
reading from a text file is a very exact exercise. You must know exactly
how this file was constructed in order to read it successfully. There
are few Standards in this regard. You have to know your file.

Generally, text files consist of lines of characters, each line
ending with the '\n' character. There may be any number of lines.
Whether the last line in the file is ended with '\n' is an
implementation detail. Usually it is.
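
Code that counts or processes lines therefore has to allow for a final line with no '\n'. A minimal sketch, where the function name count_lines() is an assumption made for this example:

```c
#include <stdio.h>

/* Counts the lines in a text stream. A final line that is not
   terminated by '\n' is still counted; an empty stream has 0 lines. */
static long count_lines(FILE *fp)
{
    long lines = 0;
    int c;
    int last = '\n';   /* treat the start of the stream as a line start */

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')
        lines++;       /* final line had no terminating newline */
    return lines;
}
```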

Again, we have to know our file to make sense of it. You mention above
separated fields in text files. This suggests to me a database table of
rows and columns presented to us as text. Simply separating fields with
spaces is not normally useful.
 
