slurping in binary data

jameskuyper

Barry Schwarz wrote:
....
I'm confused. There are only two possible return values from fgets:
NULL or the first argument (line in this case). How is

Incorrect. There are many possible return values. There are only two
legal return values from fgets(), but laws get broken. fgets() can be
incorrectly implemented, or undefined behavior in some other part of
the code might have leaked over into my code which calls fgets(). Even
if the first of those two possibilities is pretty remote, the second
is not. Given the negligible performance difference between !=NULL
and ==line, I favor the choice that copes with those possibilities in
a more robust manner.

It might seem that there's nothing I can do if the standard library is
defective or a program had undefined behavior, but that conclusion is
based upon seeing everything in absolutes. In reality, implementations
that are only slightly defective are commonplace; I suspect that they
outnumber perfectly correct implementations, possibly by an infinite
factor. Undefined behavior is, in principle, completely unconstrained;
but in reality, it's commonplace for code with undefined behavior to
behave in a fashion that is, at least initially, only slightly
different from the way it was intended to behave. Defensive
programming techniques can catch such problems earlier than they
otherwise would be, reducing the amount of damage they can do.
 
George

No, I haven't the faintest idea what he's saying, either.

It was a dutch matter. When I'm given to be cryptic, I don't make much
sense to most people. It's not unlike one of my expressions for necessary
success, where I tell people, "well, let's return zero and get out of
here."

I've never had someone say, "are you alluding to C?", but they do register
that I want to leave pronto.

You can say that babbling is OT, but I think to balance it by categorically
addressing C in other posts.
--
George

Use power to help people. For we are given power not to advance our own
purposes nor to make a great show in the world, nor a name. There is but
one just use of power and it is to serve people.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
 
George

So that people could pass an argument of type va_list.

Thanks, Barry. I'll try to get my head around this soon. I'm gonna live
to understand va_list some other day.
And it did exactly what you asked, even if it is not what you wanted.
It processed all the characters up to the next white space or a max of
99, whichever came first. What you may have wanted was a format of
the form "%*s %99s" to skip over the first string.

#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define MAXFMTLEN 2000

int main(void)
{
  FILE *fp;
  char pattern[MAXFMTLEN];
  char lbin[NUMBER];
  char line[MAXFMTLEN];

  if ((fp = fopen(PATH, "r")) == NULL ) {
    fprintf(stderr, "can't open file\n");
    exit(1);
  }

  sprintf(pattern, "%*s %40s", NUMBER-1);


  while(fgets(line, MAXFMTLEN, fp) == line){
    sscanf(line, pattern , lbin);

    printf("%s\n", lbin);
  }

  if(ferror(fp))
    perror(PATH);

  fclose(fp);
  return 0;
}

// gcc -o x.exe chad5.c

The data set is again:
1 0001000000000000001
2 0001000000000000001
3 10000011001000000000000001
4 10000011001000000000000001
5 10000011001000000000000001
6 10000011001000000000000001
7 10000011001000000000000001
8 10000011001000000000000001
9 10000011001000000000000001
10 10000011001000000000000001
11 100000001111100
12 100000001111100
13 100000001111100
14 1000001110110111100000000000001
15 1000001110110111100000000000001
16 1000001110110111100000000000001
17 1000001110110111100000000000001
18 1000001110110111100000000000001
19 1000001110110111100000000000001
20 0001000000000000001


This compiles but gives me a runtime.

One other question here:

if ((fp = fopen(PATH, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

if(ferror(fp))
perror(PATH);

If there is an error in opening the file, will execution go through both of
these?

Next thanksgiving, try to get yourself invited to a place where they cut
the turkey up before they cook it. Because a turkey is a black body, the
breast will be 200 degrees before the middle is 160°, causing the proteins
to constrict and eject the water, for which you would have been glad, when
the bite sticks in your esophagus like a hockey puck. Jenny's was finger
lickin good.
--
George

Use power to help people. For we are given power not to advance our own
purposes nor to make a great show in the world, nor a name. There is but
one just use of power and it is to serve people.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
 
George

Well, your first thought, if you weren't sure, should have been to check
the documentation before asking whether I was sure.

The variadic ones came first, the non-variadic ones were labeled with
'v' to indicate that they take a va_list argument instead of being
variadic. If they'd been designed at the same time, it would probably
have been done the other way around.


In order to figure out why the va_list versions exist, just think about
what's required to do this:


Try to imagine how you could possibly write such a wrapper that used
fprintf() rather than vfprintf().


James,

I'm gonna have to think about this as I do some reading.

I'm not sure I understand why you consider that misbehavior. You told it
to scan and print only the line numbers, and that's exactly what it does.

Right, I posted the attempt to get the opposite to Barry.
C:\MinGW\source>
C:\MinGW\source>g95 kuyper1.f03 -o x.exe


I'm still scratching for clues why Fortran and C are different here.

integer :: this
character (len=10) :: anything
this = 42
anything = "the other"

print *, this, "that, ", anything
endprogram

! g95 kuyper1.f03 -o x.exe

C:\MinGW\source>x
42 that, the other

C:\MinGW\source>

Without wanting to sound flamy, I like the fortran version just fine.

You've provided C code and fortran code that perform very different
tasks. Why are you comparing them? The version of fortran you're using
is quite different from the last version that I ever used, but I believe
that equivalent C code would look something like this:

#include <stdio.h>

int main(void)
{
    int this = 42;
    char anything[] = "the other";
    printf("%d that, %s\n", this, anything);
    return 0;
}

It just looks all very much the same to me.

--
George

Now, there are some who would like to rewrite history - revisionist
historians is what I like to call them.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
 
Barry Schwarz

#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define MAXFMTLEN 2000

int main(void)
{
FILE *fp;
char pattern[MAXFMTLEN];
char lbin[NUMBER];
char line[MAXFMTLEN];

if ((fp = fopen(PATH, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

sprintf(pattern, "%*s %40s", NUMBER-1);

This code has been severely mangled. As it stands now, you are
missing two arguments and invoke undefined behavior. But this is NOT
what you want.

Look at your previous code for constructing the format string in
pattern. (Or make life easy on yourself and initialize pattern with a
string literal.)
while(fgets(line, MAXFMTLEN, fp) == line){
sscanf(line, pattern , lbin);

printf("%s\n", lbin);
}

if(ferror(fp))
perror(PATH);

fclose(fp);
return 0;
}

// gcc -o x.exe chad5.c

The data set is again:
1 0001000000000000001
2 0001000000000000001
3 10000011001000000000000001
4 10000011001000000000000001
5 10000011001000000000000001
6 10000011001000000000000001
7 10000011001000000000000001
8 10000011001000000000000001
9 10000011001000000000000001
10 10000011001000000000000001
11 100000001111100
12 100000001111100
13 100000001111100
14 1000001110110111100000000000001
15 1000001110110111100000000000001
16 1000001110110111100000000000001
17 1000001110110111100000000000001
18 1000001110110111100000000000001
19 1000001110110111100000000000001
20 0001000000000000001


This compiles but gives me a runtime.

error? In general, step through the code with a debugger or add
debugging printf statements so you can determine exactly where the
error occurs.
One other question here:

if ((fp = fopen(PATH, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

if(ferror(fp))
perror(PATH);

If there is an error in opening the file, will execution go through both of
these?

Look at the code that executes if fopen fails. Something very special
happens on the second statement (ignoring the portability issue).
 
George

#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define MAXFMTLEN 2000

int main(void)
{
FILE *fp;
char pattern[MAXFMTLEN];
char lbin[NUMBER];
char line[MAXFMTLEN];

if ((fp = fopen(PATH, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

sprintf(pattern, "%*s %40s", NUMBER-1);

This code has been severely mangled. As it stands now, you are
missing two arguments and invoke undefined behavior. But this is NOT
what you want.

Look at your previous code for constructing the format string in
pattern. (Or make life easy on yourself and initialize pattern with a
string literal.)

I want to ignore the first number in each line.
error? In general, step through the code with a debugger or add
debugging printf statements so you can determine exactly where the
error occurs.

Can't do it. Gdb is my only tool right now for debugging, and it is a
hardship.

Look at the code that executes if fopen fails. Something very special
happens on the second statement (ignoring the portability issue).

I see nothing remarkable in returning one. Under what circumstances does
ferror do something?

--
George

Saddam Hussein is a homicidal dictator who is addicted to weapons of mass
destruction.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
 
George

It's difficult to tell without seeing the context. Fortunately, I
have a copy of the article.

James Kuyper had posted a list of the *scanf function names:
vfscanf
vscanf
vsscanf
swscanf
fwscanf
vfwscanf
vswscanf
vwscanf
wscanf

You asked:
q1) Which of the above are variadic?

Nick Keighley replied:
all the ones that don't start with v

And you replied:
Huh? Are you sure?

Your question implied some doubt about the correctness of Nick's
answer, or at least surprise. I didn't, and still don't, understand
your reaction.

And this all would have been a lot easier if you'd responded to the
actual article.

I'm getting bogged down here and need to move on. I'd be more assured that
the list were complete if it included sscanf. Maybe if I figure out how to
use sscanf effectively, we can revisit this.
--
George

To those of you who received honours, awards and distinctions, I say well
done. And to the C students, I say you, too, can be president of the United
States.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
 
Barry Schwarz

#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define MAXFMTLEN 2000

int main(void)
{
FILE *fp;
char pattern[MAXFMTLEN];
char lbin[NUMBER];
char line[MAXFMTLEN];

if ((fp = fopen(PATH, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

sprintf(pattern, "%*s %40s", NUMBER-1);

This code has been severely mangled. As it stands now, you are
missing two arguments and invoke undefined behavior. But this is NOT
what you want.

Look at your previous code for constructing the format string in
pattern. (Or make life easy on yourself and initialize pattern with a
string literal.)

I want to ignore the first number in each line.

Yes, and you will use the string in pattern to cause sscanf to do just
that. First decide what the string should look like and then decide
how you want to populate pattern with that string. There is no need
to use sprintf but if you want to so that you can adjust NUMBER then
figure out how to do that. Your previous code was very close. This
code is not.
Can't do it. Gdb is my only tool right now for debugging, and it is a
hardship.

You certainly can add debugging printf statements at strategic points
in your code. Things like
printf("fp has been opened successfully for file %s\n", PATH);
I see nothing remarkable in returning one. Under what circumstances does
ferror do something?

Look at the code again. Will the ferror ever get a chance to execute
if fopen fails?

And as long as you brought it up, the only portable values to return
from main are 0, EXIT_SUCCESS, and EXIT_FAILURE.
 
James Kuyper

George said:
#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define MAXFMTLEN 2000

int main(void)
{
FILE *fp;
char pattern[MAXFMTLEN];
char lbin[NUMBER];
char line[MAXFMTLEN];

if ((fp = fopen(PATH, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

sprintf(pattern, "%*s %40s", NUMBER-1);
This code has been severely mangled. As it stands now, you are
missing two arguments and invoke undefined behavior. But this is NOT
what you want.

Look at your previous code for constructing the format string in
pattern. (Or make life easy on yourself and initialize pattern with a
string literal.)

I want to ignore the first number in each line.

Then go back to something approaching what you had in previous versions
of this code:

sprintf(pattern, "%%*s %%%ds", NUMBER-1);

....
Can't do it. Gdb is my only tool right now for debugging, and it is a
hardship.

Then insert debugging printf() calls.

....
I see nothing remarkable in returning one. Under what circumstances does
ferror do something?

The program exit()s immediately if fopen() returns a null pointer.
Therefore, the stream is guaranteed to not be in an error state by the
time that ferror() is called. As a result, ferror() can never return a
non-zero value, and therefore perror() never gets executed. I doubt that
this is what you intended.
 
James Kuyper

George said:
I'm getting bogged down here and need to move on. I'd be more assured that
the list were complete if it included sscanf.

The list is not complete, nor was it intended to be. It was a response
to the following question:
I hadn't even disambiguated these. Am I correct that

scanf
sscanf
fscanf

are the only ones that look like another?

The list I created was a list of the *scanf functions that were not on
your list. The combination of your list and my list is complete.
 
Barry Schwarz

George said:
#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define MAXFMTLEN 2000

int main(void)
{
FILE *fp;
char pattern[MAXFMTLEN];
char lbin[NUMBER];
char line[MAXFMTLEN];

if ((fp = fopen(PATH, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

sprintf(pattern, "%*s %40s", NUMBER-1);
This code has been severely mangled. As it stands now, you are
missing two arguments and invoke undefined behavior. But this is NOT
what you want.

Look at your previous code for constructing the format string in
pattern. (Or make life easy on yourself and initialize pattern with a
string literal.)

I want to ignore the first number in each line.

Then go back to something approaching what you had in previous versions
of this code:

sprintf(pattern, "%%*s %%%ds", NUMBER-1);

...
Can't do it. Gdb is my only tool right now for debugging, and it is a
hardship.

Then insert debugging printf() calls.

...
I see nothing remarkable in returning one. Under what circumstances does
ferror do something?

The program exit()s immediately if fopen() returns a null pointer.
Therefore, the stream is guaranteed to not be in an error state by the
time that ferror() is called. As a result, ferror() can never return a
non-zero value, and therefore perror() never gets executed. I doubt that
this is what you intended.

Due to George's excessive trimming, some intervening code was omitted.
His call to ferror followed a call to fgets. The point I was trying
to get him to realize was that if fopen failed his program called exit
and consequently the call to ferror would never be executed.
 
Chad

So that people could pass an argument of type va_list.

Thanks, Barry.  I'll try to get my head around this soon.  I'm gonna live
to understand va_list some other day.
And it did exactly what you asked, even if it is not what you wanted.
It processed all the characters up to the next white space or a max of
99, whichever came first.  What you may have wanted was a format of
the form "%*s %99s" to skip over the first string.

#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define MAXFMTLEN 2000

int main(void)
{
  FILE *fp;
  char pattern[MAXFMTLEN];
  char lbin[NUMBER];
  char line[MAXFMTLEN];

  if ((fp = fopen(PATH, "r")) == NULL ) {
    fprintf(stderr, "can't open file\n");
    exit(1);
  }

  sprintf(pattern, "%*s %40s", NUMBER-1);

 while(fgets(line, MAXFMTLEN, fp) == line){
    sscanf(line, pattern , lbin);

    printf("%s\n", lbin);
  }

  if(ferror(fp))
    perror(PATH);

  fclose(fp);
  return 0;

}

// gcc -o x.exe chad5.c

The data set is again:
1  0001000000000000001
2  0001000000000000001
3  10000011001000000000000001
4  10000011001000000000000001
5  10000011001000000000000001
6  10000011001000000000000001
7  10000011001000000000000001
8  10000011001000000000000001
9  10000011001000000000000001
10  10000011001000000000000001
11  100000001111100
12  100000001111100
13  100000001111100
14  1000001110110111100000000000001
15  1000001110110111100000000000001
16  1000001110110111100000000000001
17  1000001110110111100000000000001
18  1000001110110111100000000000001
19  1000001110110111100000000000001
20  0001000000000000001

This compiles but gives me a runtime.

One other question here:

  if ((fp = fopen(PATH, "r")) == NULL ) {
    fprintf(stderr, "can't open file\n");
    exit(1);
  }

  if(ferror(fp))
    perror(PATH);

If there is an error in opening the file, will execution go through both of
these?

Next thanksgiving, try to get yourself invited to a place where they cut
the turkey up before they cook it.  Because a turkey is a black body, the
breast will be 200 degrees before the middle is 160°, causing the proteins
to constrict and eject the water, for which you would have been glad, when
the bite sticks in your esophagus like a hockey puck.  Jenny's was finger
lickin good.
--

Wow, I still can't believe a variant of what I wrote in about 30
seconds at 4am in the morning a while back is still being brought up.
Good grief.


Chad
 
Nate Eldredge

pete said:
There's no ssize_t typedef in C.

ptrdiff_t should work. IIRC glibc's getline returns the number of
characters read, or -1 on error, and ptrdiff_t would be able to contain
these values. Of course, anyone writing their own version is free to
choose another convention to return this information.
 
Keith Thompson

George said:
It was a dutch matter. When I'm given to be cryptic, I don't make much
sense to most people. It's not unlike one of my expressions for necessary
success, where I tell people, "well, let's return zero and get out of
here."

I've never had someone say, "are you alluding to C?", but they do register
that I want to leave pronto.

You can say that babbling is OT, but I think to balance it by categorically
addressing C in other posts.

Have you considered just not babbling at all? Surely that would be
better than a "balance" between sense and nonsense.
 
Nick Keighley

there isn't

I'd have thought any Reasonable Compiler would generate identical
code.
He did that on my recommendation.

it was a daft recommendation. Why write obscure code when you can
write clear code?
There is a functional difference: it
treats an invalid return value from fgets() the same as an error, while
the original version treats an invalid return value from fgets() the
same as a successful call.

this is nonsense

The chance that fgets() will malfunction is
negligible, but so is the timing difference between !=NULL and ==line. I
prefer to trade off robust behavior when used with a defective version
of the standard library over the minor difference in performance. YMMV.

if your standard library is broken then all bets are off.
Maybe your compiler is broken as well?
Maybe your processor launches a pre-emptive nuclear strike
whenever fgets() reads "42\n".

We're into that whole Reflections of Trust stuff now

<snip>
 
Nick Keighley

So that people could pass an argument of type va_list.

Thanks, Barry.  I'll try to get my head around this soon.  I'm gonna live
to understand va_list some other day.
And it did exactly what you asked, even if it is not what you wanted.
It processed all the characters up to the next white space or a max of
99, whichever came first.  What you may have wanted was a format of
the form "%*s %99s" to skip over the first string.

#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define MAXFMTLEN 2000

int main(void)
{
  FILE *fp;
  char pattern[MAXFMTLEN];
  char lbin[NUMBER];
  char line[MAXFMTLEN];

  if ((fp = fopen(PATH, "r")) == NULL ) {
    fprintf(stderr, "can't open file\n");
    exit(1);
  }

  sprintf(pattern, "%*s %40s", NUMBER-1);

 while(fgets(line, MAXFMTLEN, fp) == line){
    sscanf(line, pattern , lbin);

    printf("%s\n", lbin);
  }

  if(ferror(fp))
    perror(PATH);

  fclose(fp);
  return 0;

}

This compiles but gives me a runtime.

what runtime?

One other question here:

  if ((fp = fopen(PATH, "r")) == NULL ) {
    fprintf(stderr, "can't open file\n");
    exit(1);
  }

  if(ferror(fp))
    perror(PATH);

If there is an error in opening the file, will execution go through both of
these?

both of what? If fopen() has an error it returns NULL (I'm assuming
you're not using James Kuyper's implementation of the standard
library). Hence it prints out the error message and calls exit().
exit() terminates the program (unless it's Mr Kuyper's library).

Next thanksgiving,

I'm assuming this is what I'd call "Christmas"

try to get yourself invited to a place where they cut
the turkey up before they cook it.

or better, somewhere they don't have turkey
Because a turkey is a black body,

not if the chef is sane

the
breast will be 200 degrees before the middle is 160°, causing the proteins
to constrict and eject the water,

--
George

Use power to help people. For we are given power not to advance our own
purposes nor to make a great show in the world, nor a name. There is but
one just use of power and it is to serve people.
George W. Bush

the more you quote this guy the more impressed I am with him.
He makes a *lot* of sense.
 
James Kuyper

Nick said:
there isn't


I'd have thought any Reasonable Compiler would generate identical
code.

For "!=NULL" and "==line"? I suppose that's a legal optimization for a
standard library function which is guaranteed to return either one or
the other; but I wouldn't expect that optimization.
it was a daft recommendation. Why write obscure code when you can
write clear code?

I consider "== Expected_value" and "!= Error_value" to be equally clear.
Of course, if you're unfamiliar with the expected value, that might be
less obvious. You should be familiar with it.
this is nonsense

You may not believe that the possibility that fgets() will return an
invalid value is worth worrying about, but in that case you should have
said "That's not worth worrying about.". If fgets() did return an
invalid value, then what I said about the consequences is the literal
truth, not nonsense. You're not claiming, I hope, that the chance that
fgets() would return an invalid value is exactly zero? This is the real
world I'm talking about, not some fantasy in which it's easy to write a
large piece of software that is free of defects.
if your standard library is broken then all bets are off.

That's the all-or-nothing "logic" that I've denigrated elsewhere in this
thread. Most real-world software has one or more defects. An fgets() bug
that gets triggered sufficiently rarely that it has not yet been noticed
before does not mean that the entire rest of the library is defective.
It might be, but that's not how I would place my bets.
Maybe your compiler is broken as well?

It's certainly a possibility.
Maybe your processor launches a pre-emptive nuclear strike
whenever fgets() reads "42\n".

What in the world could you possibly do to protect against that
possibility? Could you give me any example of an alternative to using
fgets() that would have a possibility of launching a nuclear strike
which is guaranteed to be less than the possibility when calling
fgets()? That possibility is so low that guaranteeing a lower one is
pretty difficult. Are the costs of this alternative low enough to
justify using it, given the extreme unlikelihood of that outcome?

It's trivial to protect against the possibility of fgets() returning an
invalid value. There's no significant difference between the costs of
doing so and the costs of not doing so. Why not do so?
 
Keith Thompson

James Kuyper said:
Nick said:
Barry Schwarz wrote: [...]
// while ((fgets(line, MAXFMTLEN, fp)) != NULL)
while(fgets(line, MAXFMTLEN, fp) == line){
Do you think there is any functional difference between the two
versions of your while?
[...]
I consider "== Expected_value" and "!= Error_value" to be equally
clear. Of course, if you're unfamiliar with the expected value, that
might be less obvious. You should be familiar with it.
[...]

It's trivial to protect against the possibility of fgets() returning
an invalid value. There's no significant difference between the costs
of doing so and the costs of not doing so. Why not do so?

Because code that checks whether fgets() returns NULL is clearer than
code that checks whether fgets() returns the value of its first
argument. Why is it clearer? Because the fact that fgets() returns
NULL on error or end-of-file is better known, and more often used,
than the fact that it returns its first argument on success.

I don't think I've ever actually seen code that calls fgets() and
makes use of its non-null returned value (other than, in this case,
comparing it against the value of the first argument). I suppose it
could be used in a chained call, something like:
strcat(s, fgets(line, MAX_LEN, fp));
but that ignores error checking. So it's perfectly sensible to think
of fgets() as returning two possible results, null or non-null.

The fact that fgets() returns its first argument on success is
relatively obscure, and this is an obstacle (admittedly a small one)
to understanding the code. There is the advantage of detecting some
cases of an incorrectly implemented fgets() call, but I hardly think
that it's worth it.
 
