Eigenvector
I've been surfing the FAQ and Google for about a week and haven't quite
figured out this one.
I have a file that changes periodically, and every once in a while
^Z characters appear in it for reasons I don't want to get into. I need to
get rid of those ^Zs, and I need to do it in C because it is the only tool
available to me that can handle the file size.
So I cooked up some code and tried it on one platform, where it works great.
It doesn't work so well on another, and I am trying to understand why. I
did my best to write standard C, but perhaps that is where I'm failing.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    FILE *infile, *outfile;
    int c; /* picked that up from the FAQ */

    /* picked the binary part up from this google group */
    if ( (infile = fopen(argv[1], "rb")) == NULL)
    {
        printf("Cannot open file\n");
        exit(1);
    }
    if ( (outfile = fopen("Clean_file", "w+")) == NULL)
    {
        printf("Cannot open output file\n");
        exit(1);
    }
    while ((c=fgetc(infile)) != EOF )
    {
        if(c == 0x1a) /* This is where I'm having a problem */
        /* if(c == '\0x1a') This fails with compiler error - more than one
           character defined for type char */
        {
            c='_'; /* replace bad control char with something innocuous */
        }
        fputc(c,outfile);
    }
    fclose(infile);
    fclose(outfile);
    return 0;
}
Yeah, it's pretty primitive code, but I'm more interested in getting the
basics working before I go in and optimize the way it handles the input
file. This compiles on xlC and HP's ANSI C compilers.
In the first if statement dealing with the ^Z, the program doesn't detect
the control characters in the file; in the second (commented-out) statement,
the compiler complains about syntax. If I treat c as a char (with a
typecast), it finds the control characters and replaces them, but then blows
away the EOF character and nukes the file.
I have the suspicion that it's the way I'm writing the c == '\0x1a'
comparison that is leading me astray here. I can't find any good, consistent
documentation on exactly how to represent hex or octal values in C code or
in string/character operations.
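
For reference, here is a minimal sketch of how the copy loop and the character constants could be written. It is an illustration under assumptions, not a diagnosis of the platform difference: the helper name replace_ctrl_z and the CTRL_Z macro are made up for this example. In C, a hex escape is written '\x1a' and an octal escape '\032'; both have the same value as the integer constant 0x1a. By contrast, '\0x1a' is parsed as four characters ('\0', 'x', '1', 'a'), which is why a compiler complains about it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical name for this sketch. 0x1A is the same value as the
   character constants '\x1a' (hex escape) and '\032' (octal escape). */
#define CTRL_Z 0x1A

/* Hypothetical helper: copy every byte from `in` to `out`, replacing each
   Ctrl-Z byte with `replacement`. Returns the number of bytes replaced,
   or -1 if a read or write error occurred. */
long replace_ctrl_z(FILE *in, FILE *out, int replacement)
{
    int c;              /* int, not char, so EOF stays distinct from data */
    long replaced = 0;

    /* All three spellings denote the same value. */
    assert('\x1a' == CTRL_Z && '\032' == CTRL_Z);

    while ((c = fgetc(in)) != EOF) {
        if (c == CTRL_Z) {
            c = replacement;
            replaced++;
        }
        if (fputc(c, out) == EOF)   /* fputc writes a single character */
            return -1;
    }
    return ferror(in) ? -1 : replaced;
}
```

A caller would open the input with "rb" and the output with "wb", then pass both streams, for example replace_ctrl_z(infile, outfile, '_'). Keeping c as an int matters because fgetc returns EOF as a value outside the unsigned char range. Once the result is stored in a char, EOF can no longer be told apart reliably from data, so the loop may stop early at a 0xFF byte or, where plain char is unsigned, never stop.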