beginner fscanf question

Roman Zeilinger

Hi

I have a beginner question concerning fscanf.
First I had a text file which just contained some
hex numbers:

0C100012
0C100012
....

It was easy to read these values into an integer variable:


FILE *fp = fopen("test.dat","r");
int mem_word = 0;
while (fscanf(fp,"%x", &mem_word) != EOF)
....

Now I have the following format, where each hex number
is split into bytes separated by spaces. Is there
an easy way to parse such a line without doing
a lot of space removal?

0C 10 00 12 // 00000000
0C 10 00 12 // 00000000
....

Thanks for helpful comments.
 
Mark Bluemel

Roman said:
Now I have the following format, where each hex number
is split into bytes separated by spaces. Is there
an easy way to parse such a line without doing
a lot of space removal?

I don't think so. But it's trivial to remove the spaces and then use
sscanf().
0C 10 00 12 // 00000000
0C 10 00 12 // 00000000

Or you could write your own simple parser.
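
As a rough illustration of the "simple parser" idea, here is a minimal sketch. The names hex_digit_value and parse_hex_bytes are made up for this example and do not come from the thread. It assumes each byte is written as exactly two hex digits and that anything that is neither a blank nor a hex digit (such as the "//" comment) ends the data.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_digit_value(int c)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;

    if (c == '\0')
        return -1;
    p = strchr(digits, tolower((unsigned char)c));
    return p != NULL ? (int)(p - digits) : -1;
}

/* Hypothetical helper: reads up to max_bytes two-digit hex groups from
 * line, skipping blanks between them, and packs them most significant
 * byte first into *out. Returns the number of bytes parsed. */
int parse_hex_bytes(const char *line, int max_bytes, unsigned long *out)
{
    unsigned long value = 0;
    int count = 0;

    while (count < max_bytes) {
        int hi, lo;

        while (*line == ' ' || *line == '\t')
            line++;
        hi = hex_digit_value(line[0]);
        if (hi < 0)
            break;              /* end of string or start of the comment */
        lo = hex_digit_value(line[1]);
        if (lo < 0)
            break;              /* a lone digit is treated as malformed */
        value = (value << 8) | (unsigned long)(hi * 16 + lo);
        line += 2;
        count++;
    }
    *out = value;
    return count;
}
```

A caller would read each line with fgets() and accept the result only when parse_hex_bytes(line, 4, &word) returns 4.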
 
Jens Thoms Toerring

Roman Zeilinger said:
I have a beginner question concerning fscanf.
First I had a text file which just contained some
hex numbers:

It was easy to read these values into an integer variable:
FILE *fp = fopen("test.dat","r");
int mem_word = 0;
while (fscanf(fp,"%x", &mem_word) != EOF)
...
Now I have the following format, where each hex number
is split into bytes separated by spaces. Is there
an easy way to parse such a line without doing
a lot of space removal?
0C 10 00 12 // 00000000
0C 10 00 12 // 00000000

If there's always " // 00000000" at the end of the line and
never anything else (except trailing white space), it's relatively
simple:

#include <stdio.h>
#include <stdlib.h>

int main( void ) {
    FILE *fp = fopen( "sh.dat", "r" );
    unsigned int d[ 4 ];
    unsigned int x;

    if ( fp == NULL )
        return EXIT_FAILURE;

    while ( fscanf( fp, "%x%x%x%x // 00000000",
                    d, d + 1, d + 2, d + 3 ) == 4 ) {
        x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
                + d[ 2 ] ) << 8 ) + d[ 3 ];
        printf( "x = %x\n", x );
    }

    return 0;
}

but if that's not the case then you will need to do a bit more
work in that while loop:

while ( fscanf( fp, "%x%x%x%x", d, d + 1, d + 2, d + 3 ) == 4 ) {
    x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
            + d[ 2 ] ) << 8 ) + d[ 3 ];
    printf( "x = %x\n", x );
    while ( ( c = fgetc( fp ) ) != '\n' && c != EOF )
        /* empty */ ;
}

You also will need to add a definition of 'c' (as an int, not a
char!).
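
Put together as a function, the skip-to-newline variant might look like the sketch below. The name read_words and its parameters are hypothetical, chosen only for this illustration, and the result is stored in an unsigned long rather than an unsigned int as a defensive choice for implementations with a narrow int.

```c
#include <stdio.h>

/* Hypothetical helper: reads lines that start with four hex bytes,
 * discards whatever follows on each line, and stores up to max_words
 * combined values in out. Returns the number of words stored. */
size_t read_words(FILE *fp, unsigned long *out, size_t max_words)
{
    unsigned int d[4];
    size_t n = 0;
    int c;  /* int, not char, so EOF can be told apart from real data */

    while (n < max_words
           && fscanf(fp, "%x%x%x%x", &d[0], &d[1], &d[2], &d[3]) == 4) {
        unsigned long x = d[0];

        x = (x << 8) + d[1];
        x = (x << 8) + d[2];
        x = (x << 8) + d[3];
        out[n++] = x;

        /* Throw away the rest of the line, whatever its length. */
        while ((c = fgetc(fp)) != '\n' && c != EOF)
            /* empty */ ;
    }
    return n;
}
```

Because the remainder of each line is consumed character by character, this version does not depend on how long the trailing comment is.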

You could use something simpler if you can be sure that there will
never be more than 12 characters following your quadruplet of hex
numbers (again except trailing white space characters):

#include <stdio.h>
#include <stdlib.h>

int main( void ) {
    FILE *fp = fopen( "sh.dat", "r" );
    unsigned int d[ 4 ];
    unsigned int x;
    char buffer[ 13 ];

    if ( fp == NULL )
        return EXIT_FAILURE;

    while ( fscanf( fp, "%x%x%x%x%12[^\n]",
                    d, d + 1, d + 2, d + 3, buffer ) == 5 ) {
        x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
                + d[ 2 ] ) << 8 ) + d[ 3 ];
        printf( "x = %x\n", x );
    }

    return 0;
}
Regards, Jens
 
Roman Zeilinger

You could use something simpler if you can be sure that there will
never be more than 12 characters following your quadruplet of hex
numbers (again except trailing white space characters):

#include <stdio.h>
#include <stdlib.h>

int main( void ) {
FILE *fp = fopen( "sh.dat", "r" );
unsigned int d[ 4 ];
unsigned int x;
char buffer[ 13 ];

if ( fp == NULL )
return EXIT_FAILURE;

while ( fscanf( fp, "%x%x%x%x%12[^\n]",
d, d + 1, d + 2, d + 3, buffer ) == 5 ) {
x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
+ d[ 2 ] ) << 8 ) + d[ 3 ];
printf( "x = %x\n", x );
}

return 0;
}

Great, thanks Jens. It works fine for one line, so now I have to figure out
how to parse the whole text file :)

Cheers!
 
Roman Zeilinger

I modified the code from Jens slightly, but
it looks like the program sticks with the
first line it reads from "icache.dat". Why
does the program not move to the next line of
the text file after the fscanf and remains with the
first one?

int main( void ) {
    FILE *fp = fopen( "icache.dat", "r" );
    unsigned int d[ 4 ];
    unsigned int x;

    if ( fp == NULL )
        return EXIT_FAILURE;

    while ( fscanf( fp, "%x%x%x%x%", d, d + 1, d + 2, d + 3 ) != EOF )
    {
        x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
                + d[ 2 ] ) << 8 ) + d[ 3 ];
        printf( "x = %x\n", x );
    }
    return 0;
}

Cheers!
 
Mark Bluemel

Roman said:
I modified the code from Jens slightly, but
it looks like the program sticks with the
first line it reads from "icache.dat". Why
does the program not move to the next line of
the text file after the fscanf and remains with the
first one?

Time for the C FAQ, I suspect. http://c-faq.com

There's loads of discussion of the pitfalls and snares of scanf.
 
Jens Thoms Toerring

Roman Zeilinger said:
I modified the code from Jens slightly, but
it looks like the program sticks with the
first line it reads from "icache.dat". Why
does the program not move to the next line of
the text file after the fscanf and remains with the
first one?
int main( void ) {
FILE *fp = fopen( "icache.dat", "r" );
unsigned int d[ 4 ];
unsigned int x;
if ( fp == NULL )
return EXIT_FAILURE;
while ( fscanf( fp, "%x%x%x%x%", d, d + 1, d + 2, d + 3 ) != EOF )
{
x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
+ d[ 2 ] ) << 8 ) + d[ 3 ];
printf( "x = %x\n", x );
}
return 0;
}

If there is unusable data on the line, as in your original
example, you don't skip it. On the next call fscanf() will try to
read that data as 4 hex numbers too, fail for obvious reasons,
return 0 and not move forward in the file. And since you only test
the return value for EOF and repeat if you get anything else,
fscanf() looks again at the stuff it already rejected as not being
4 hex numbers the last time round, giving you the same result
again and again, so you end up in an endless loop. That's why I
proposed either reading the remainder of the line with a loop over
getc() until you find the newline character, or using a long enough
dummy buffer that gets filled with the unusable data in the call of
fscanf().
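
To make those return values concrete, here is a small sketch using sscanf() on strings, which applies the same matching rules as fscanf(). The helper name count_hex_fields is invented for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reports what the scanf family returns when asked
 * to convert four hex numbers from the given text. */
int count_hex_fields(const char *text)
{
    unsigned int d[4];

    return sscanf(text, "%x%x%x%x", &d[0], &d[1], &d[2], &d[3]);
}
```

For "0C 10 00 12 // 00000000" this returns 4. For the leftover " // 00000000" it returns 0, because '/' cannot start a hex number, and only for an empty (or all-blank) input does it return EOF. A loop that continues on anything other than EOF therefore keeps running on that 0.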

Regards, Jens
 
Ben Bacarisse

Roman Zeilinger said:
I modified the code from Jens slightly, but
it looks like the program sticks with the
first line it reads from "icache.dat". Why
does the program not move to the next line of
the text file after the fscanf and remains with the
first one?
int main( void ) {
FILE *fp = fopen( "icache.dat", "r" );
unsigned int d[ 4 ];
unsigned int x;
if ( fp == NULL )
return EXIT_FAILURE;
while ( fscanf( fp, "%x%x%x%x%", d, d + 1, d + 2, d + 3 ) != EOF )
{
x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
+ d[ 2 ] ) << 8 ) + d[ 3 ];
printf( "x = %x\n", x );
}
return 0;
}

If there is unusable data on the line, as in your original
example, you don't skip it. fscanf() will try to read
it as 4 hex numbers too, fail for obvious reasons, return 0
and not move forward in the file.

For this reason it is almost always better to test the return value
from fscanf for success (== 4 in this case) rather than for one
particular failure (from the possible EOF, 0, 1, 2 or 3).
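
A minimal sketch of that advice follows. The function name count_quads and the stopped_at_eof flag are assumptions made for this example. The loop runs only while all four conversions succeed, and the saved return value tells the caller afterwards whether the input simply ended or was malformed.

```c
#include <stdio.h>

/* Hypothetical helper: counts complete groups of four hex numbers.
 * *stopped_at_eof is set to 1 if the loop ended at end of input and to 0
 * if it ended because fscanf() converted fewer than four items. */
size_t count_quads(FILE *fp, int *stopped_at_eof)
{
    unsigned int d[4];
    size_t n = 0;
    int rc;

    while ((rc = fscanf(fp, "%x%x%x%x", &d[0], &d[1], &d[2], &d[3])) == 4)
        n++;

    *stopped_at_eof = (rc == EOF);
    return n;
}
```

With a trailing "// 00000000" on the first line, this stops after one group and reports malformed input instead of looping forever.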
 
pete

Roman said:
Hi

I have a beginner question concerning fscanf.
First I had a text file which just contained some
hex numbers:

0C100012
0C100012
...

It was easy to read these values into an integer variable:

FILE *fp = fopen("test.dat","r");
int mem_word = 0;
while (fscanf(fp,"%x", &mem_word) != EOF)
...

Now I have the following format, where each hex number
is split into bytes separated by spaces. Is there
an easy way to parse such a line without doing
a lot of space removal?

0C 10 00 12 // 00000000
0C 10 00 12 // 00000000
...

Thanks for helpful comments.

I like to read lines and process strings.

/* BEGIN beginner.c */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <assert.h>

#define FILE_NAME ("test.dat")
#define LINE ("0C 10 00 12")
#define LINELENGTH 11
#define str(x) # x
#define xstr(x) str(x)

void hex_squeeze(char *s);

int main(void)
{
    unsigned long mem_word;
    char array[sizeof LINE];
    int rc;
    char *endptr;
    char *fn = FILE_NAME;
    FILE *fp;

    assert(LINELENGTH == sizeof LINE - 1);
    fp = fopen(fn, "r");
    if (fp == NULL) {
        printf("fopen problem with %s\n", fn);
        exit(EXIT_FAILURE);
    }
    do {
        /* Keep the first LINELENGTH characters, discard the rest of the line. */
        rc = fscanf(fp, "%" xstr(LINELENGTH) "[^\n]%*[^\n]", array);
        if (!feof(fp)) {
            getc(fp); /* consume the newline */
        }
        if (rc == 0) {
            array[0] = '\0'; /* empty line */
        }
        if (rc == 1) {
            hex_squeeze(array);
            mem_word = strtoul(array, &endptr, 16);
            printf("%lx\n", mem_word);
        }
    } while (rc != EOF);
    fclose(fp);
    return 0;
}

/* Remove every character that is not a hex digit from s, in place. */
void hex_squeeze(char *s)
{
    char *p = s;

    do {
        if (isxdigit((unsigned char)*s)) {
            *p++ = *s;
        }
    } while (*s++ != '\0');
    *p = '\0';
}

/* END beginner.c */
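
Another line-oriented variant, offered only as a sketch, reads each line with fgets() and lets sscanf() pick out the four bytes, so a bad line can never leave the stream stuck. The name read_word_line and the 256-character limit MAX_LINE are assumptions for this example; a longer line would be seen as two lines.

```c
#include <stdio.h>

#define MAX_LINE 256  /* assumed upper bound on the line length */

/* Hypothetical helper: reads one line from fp.
 * Returns 1 and stores the combined word in *out if the line starts with
 * four hex numbers, 0 if the line is malformed, EOF at end of input. */
int read_word_line(FILE *fp, unsigned long *out)
{
    char line[MAX_LINE];
    unsigned int d[4];
    unsigned long x;

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    if (sscanf(line, "%x%x%x%x", &d[0], &d[1], &d[2], &d[3]) != 4)
        return 0;   /* the line is already consumed, so no endless loop */

    x = d[0];
    x = (x << 8) | d[1];
    x = (x << 8) | d[2];
    x = (x << 8) | d[3];
    *out = x;
    return 1;
}
```

A caller would loop until the function returns EOF and decide for itself whether a 0 should be skipped or reported.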
 
Chris Torek

[a bunch of correct stuff with one subtle flaw]
unsigned int d[ 4 ];
unsigned int x; [snippage]
x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
+ d[ 2 ] ) << 8 ) + d[ 3 ];
printf( "x = %x\n", x );

To make this portable, you need to give "x" the type "unsigned
long", and convert at least one of the various operands in the
shift and add sequence to "unsigned long". Of course then the
printf() format argument must be "%lx".

(An implementation with 16-bit "unsigned int"s will turn:

x = (((((d[0] << 8) + d[1]) << 8) + d[2]) << 8) + d[3];

into the equivalent of just:

x = (d[2] << 8) + d[3];

since the extra bits will "fall off the end". A lot of
implementations today have 32-bit "int", where the problem
will not occur; and some have 64-bit "long", where "unsigned
long" is perhaps overkill, but the overkill is portable.)
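
A minimal sketch of the portable version follows. The function name combine_bytes is made up for this example. Starting from an unsigned long means every shift and add is carried out in unsigned long, which is at least 32 bits wide.

```c
/* Hypothetical helper: packs four byte values, most significant first,
 * into an unsigned long. The first assignment converts d[0] to unsigned
 * long, so none of the later shifts is done in (possibly 16-bit) int. */
unsigned long combine_bytes(const unsigned int d[4])
{
    unsigned long x = d[0];

    x = (x << 8) + d[1];
    x = (x << 8) + d[2];
    x = (x << 8) + d[3];
    return x;
}
```

The result would then be printed with printf("x = %lx\n", combine_bytes(d)). Keeping only its low 16 bits gives exactly the (d[2] << 8) + d[3] that a 16-bit unsigned int would have produced.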
 
Jens Thoms Toerring

Chris Torek said:
unsigned int d[ 4 ];
unsigned int x; [snippage]
x = ( ( ( ( ( d[ 0 ] << 8 ) + d[ 1 ] ) << 8 )
+ d[ 2 ] ) << 8 ) + d[ 3 ];
printf( "x = %x\n", x );
To make this portable, you need to give "x" the type "unsigned
long", and convert at least one of the various operands in the
shift and add sequence to "unsigned long". Of course then the
printf() format argument must be "%lx".
(An implementation with 16-bit "unsigned int"s will turn:
x = (((((d[0] << 8) + d[1]) << 8) + d[2]) << 8) + d[3];
into the equivalent of just:
x = (d[2] << 8) + d[3];
since the extra bits will "fall off the end". A lot of
implementations today have 32-bit "int", where the problem
will not occur; and some have 64-bit "long", where "unsigned
long" is perhaps overkill, but the overkill is portable.)

Thank you, I should have realized that myself;-)

Regards, Jens
 
