Replacing characters in file

A

Anony-mouse

Mumia W. said:
"Dr.Ruud" said:
[...]
But if your .DAT is a "binary" file, read perlopentut.

Call them whatever you want, but I have said all along that the
"characters" are 01 and 5F in hex ... 01 is not an ASCII character, but
is a control character for "start of record" (or something like that
from memory).

I have also said that it is a .DAT file containing various control
characters, so it can't be a plain text file, which is why I never said
"text file" anywhere.

If it was a simple text file, then SED would have worked.

So I take it that you want us to help you write a program that can make
modifications to a binary file, and you want that program to fit on a
USB key.

Why do you want to do this?

I'm really beginning to think it's not worth wasting my time with Perl
at all. There must be an easy (and small) way to achieve this. :o\
_
_/ \___
Anony-mouse says o_/O _/ \
"Eek-eek-eek!" \__/_|_/_|\____/
 
D

Dr.Ruud

Anony-mouse schreef:
Dr.Ruud:

Call them whatever you want, but I have said all along that the
"characters" are 01 and 5F in hex ... 01 is not an ASCII character,
but is a control character for "start of record" (or something like
that from memory).

01 is an ASCII character.

I think it is best that you rephrase your question to something like:
I want to replace, in a binary file on a DOS-type system, the
two-byte-sequence "0x01 0x5F" by "0x01 0x5C".

You should also tell something about the size of the file. If the file
fits easily in memory, it is much easier to do the replacement. If the
file is way big so it needs to be processed in chunks, then the 0x01
could be the last byte of a chunk, which complicates things a little.
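To illustrate the chunk-boundary complication, here is a minimal sketch, not a definitive implementation. The helper name `replace_in_chunks` and the tiny chunk size are made up for the example. The idea is to hold back a trailing 0x01 until the next chunk arrives, so that a pair split across two reads is still seen as a pair.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out in chunks of $chunk_size bytes,
# replacing 0x01 0x5F with 0x01 0x5C even when the pair straddles a boundary.
sub replace_in_chunks {
    my ($in, $out, $chunk_size) = @_;
    binmode $in;
    binmode $out;
    my $carry = '';
    while (read($in, my $buf, $chunk_size)) {
        $buf = $carry . $buf;
        # Hold back a trailing 0x01: its partner may be in the next chunk.
        $carry = $buf =~ s/\x01\z// ? "\x01" : '';
        $buf =~ s/\x01\x5F/\x01\x5C/g;
        print {$out} $buf;
    }
    print {$out} $carry;    # a final lone 0x01, if there was one
    return;
}

# Demo on in-memory filehandles with a deliberately tiny chunk size,
# so that the first pair is split across two reads.
my $data = "A\x01\x5FB\x01\x5F\x01";
open my $in,  '<', \$data      or die "open in: $!";
open my $out, '>', \my $result or die "open out: $!";
replace_in_chunks($in, $out, 2);
close $out;
print unpack('H*', $result), "\n";
```

For a file that fits in memory none of this is needed; the carry-over only matters when the data really has to be streamed.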

I have also said that it is a .DAT file containing various control
characters, so it can't be a plain text file, which is why I never
said "text file" anywhere.

A .DAT file (whatever that means) containing various control characters,
can be a perfectly normal plain text file.
See also http://filext.com/detaillist.php?extdetail=.DAT

If it was a simple text file, then SED would have worked.

Some seds can streamedit binary files as well.
 
J

Jürgen Exner

Anony-mouse said:
It's not a text file. As I said it's a .DAT file

Whatever that is supposed to mean.
A file name ending of .DAT doesn't imply anything as to the content of the
file.
that contains control
characters, including EOF ones. :o\

Well, if it is not a text file but a binary file, then it isn't very
meaningful to talk about characters.
It would have been much less confusing if you had said "I have a two-byte
sequence "01 5F" (denoted in hex) and would like to replace it with "01 5C"".

I still don't understand why a simple
s/\x01\x5F/\x01\x5C/g
shouldn't work.
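As a minimal sketch of that substitution applied to a byte string already in memory (no file I/O involved yet): the helper name `replace_pair` and the sample data are made up for illustration. The sample deliberately contains a 0x1A (^Z) byte and a 0x5F that is not preceded by 0x01, to show that the regex itself treats neither of them specially.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every 0x01 0x5F byte pair with 0x01 0x5C.
sub replace_pair {
    my ($bytes) = @_;
    (my $out = $bytes) =~ s/\x01\x5F/\x01\x5C/g;
    return $out;
}

# Sample data: two 01 5F pairs, one ^Z (0x1A) and one lone 0x5F.
my $sample = "AB\x01\x5F\x1A\x5F\x01\x5FZ";
my $fixed  = replace_pair($sample);

# Show the result as hex so the control bytes are visible.
print unpack('H*', $fixed), "\n";    # prints 4142015c1a5f015c5a
```

If this behaves as expected but the result file still comes out empty or truncated, the problem is more likely in how the file is read and written than in the s///.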

jue
 
A

Anony-mouse

"Jürgen Exner" said:
Whatever that is supposed to mean.
A file name ending of .DAT doesn't imply anything as to the content of the
file.

Although not necessarily true, it does imply that it's not a simple
.TXT file.


Well, if it is not a text file but a binary file, then it isn't very
meaningful to talk about characters.
It would have been much less confusing if you had said "I have a two-byte
sequence "01 5F" (denoted in hex) and would like to replace it with "01 5C"".

Semantics (or pedanticism, whichever you prefer). It still means the
same thing and some people seem to have understood what I was saying,
while others simply want to get picky about terminology.


I still don't understand why a simple
s/\x01\x5F/\x01\x5C/g
shouldn't work.

It doesn't work in SED because SED aborts when it hits the first EOF
character and so doesn't complete the file.

In the version of Perl I've been using it simply does nothing - I get a
completely empty result file. I'll have to wait until I finish
downloading Active Perl and see what happens there, but that may be too
big in filesize for my needs.


 
A

Anony-mouse

"Dr.Ruud" said:
Anony-mouse schreef:

01 is an ASCII character.

Yes, an ASCII control character. One of many different ones in the
file, so it's not a simple ASCII / text file.


I think it is best that you rephrase your question to something like:
I want to replace, in a binary file on a DOS-type system, the
two-byte-sequence "0x01 0x5F" by "0x01 0x5C".

The terminology is irrelevant since some people have obviously
understood.


You should also tell something about the size of the file. If the file
fits easily in memory, it is much easier to do the replacement. If the
file is way big so it needs to be processed in chunks, then the 0x01
could be the last byte of a chunk, which complicates things a little.

The size of the .DAT file is probably irrelevant too since it's under
10K.


A .DAT file (whatever that means) containing various control characters,
can be a perfectly normal plain text file.
See also http://filext.com/detaillist.php?extdetail=.DAT

If I'd meant a plain ASCII / text file then I would have said .TXT file.


Some seds can streamedit binary files as well.

Then maybe I need a different version, but every version I've looked at
(which probably isn't ALL of them) says that under DOS finding an EOF
character will cause it to stop processing the file.
 
U

Uri Guttman

A> Yes, an ASCII control character. One of many different ones in the
A> file, so it's not a simple ASCII / text file.

you seem to have this odd conception of what makes a binary or text
file. it is NEVER the suffix itself (however much redmond thinks so). so
stop calling your file whatever you want and start describing its actual
contents.

A> The size of the .DAT file is probably irrelevant too since it's under
A> 10K.

so slurp it in, do the s///g and spew it out. wow!

A> If I'd meant a plain ASCII / text file then I would have said .TXT file.

no. there is no such thing as a TXT file. there is a .TXT suffix. maybe
it is a normal text file if no one lied about the suffix but that
happens all the time. as i said, stop calling a file by its suffix. it
is wrong, not portable, makes no sense, etc.


A> Then maybe I need a different version, but every version I've looked at
A> (which probably isn't ALL of them) says that under DOS finding an EOF
A> character will cause it to stop processing the file.

again, there is NO eof character in a general sense. stop saying
that. you are obviously stuck on redmondware which has inherited the
ancient practice of ending 'text' files with ^Z. but you keep claiming
your file is not a text file so it shouldn't matter if there is a ^Z in
there. so your fallacious thinking is boxing you in. is it text and ^Z
matters or is it binary and ^Z doesn't matter?

since you seem to think it is a binary file, then sed is not for you as
it is a LINE oriented stream editor. neither is perl in a line by line
mode. and on redmond, maybe even perl will stop reading a file in 'text'
mode (see perldoc -f binmode) when it hits ^Z. in that case you need to
slurp the file in binary mode and do your s/// over the whole file with
NO regard to lines or ^Z. you can do this with File::Slurp with the
binary option enabled. do not do this with perl's -p/-n options as they
will run in text mode. i doubt even using -0777 will work as nothing is
telling perl or winblows that your file is binary so the ^Z will hit
you.
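a minimal sketch of the slurp-in-binary-mode approach, using only core perl rather than File::Slurp. the helper name `fix_file` is made up for the example, and the file name is taken from the command line because the real name was never given. it rewrites the file in place, so try it on a copy first.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite $path in place, treating it as raw bytes.
# Returns the number of 0x01 0x5F pairs that were replaced.
sub fix_file {
    my ($path) = @_;

    # The ':raw' layer reads bytes as-is: no CRLF translation and no
    # special treatment of ^Z (0x1A), which text mode on DOS/Windows applies.
    open my $in, '<:raw', $path or die "Cannot read $path: $!";
    my $data = do { local $/; <$in> };    # undef $/ slurps the whole file
    close $in;

    # s///g returns an empty string when nothing matched, hence the "|| 0".
    my $count = ($data =~ s/\x01\x5F/\x01\x5C/g) || 0;

    open my $out, '>:raw', $path or die "Cannot write $path: $!";
    print {$out} $data;
    close $out or die "Cannot close $path: $!";

    return $count;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    my $n = fix_file($ARGV[0]);
    print "Replaced $n occurrence(s) in $ARGV[0]\n";
}
```

the same open/slurp/print steps work with a separate output file if overwriting the original is too risky.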

now do the right thing already, slurp it in in binary mode, clean up the
mess, and write it out. and end this interminable thread!

uri
 
J

Jürgen Exner

Anony-mouse said:
Although not necessarily true, it does imply that it's not a simple
.TXT file.

Well, the file ending has no bearing on the content of a file. So, yes and
so what?
Semantics (or pedanticism, whichever you prefer).

Not really. Computers are notorious for being extremely pedantic. You will
have much more success when dealing with computers if you adopt precise
language, too.
It still means the
same thing

Well, it doesn't. A character and a byte are two very different things. And
a text file and data/binary file are two very different things, too.
and some people seem to have understood what I was saying,

Some people are better at guessing than others.
while others simply want to get picky about terminology.

As I mentioned, terminology is important when dealing with something as
pedantic as computers.
It doesn't work in SED because SED aborts when it hits the first EOF
character and so doesn't complete the file.

Not surprising. AFAIK SED is designed to work for text files and files that
contain text by definition don't have any content after an EOF. Do you now
realize how important the distinction between text and binary files is
sometimes?
In the version of Perl I've been using it simply does nothing - I get
a completely empty result file.

And which version would that be?
perl -v

jue

 
A

Anony-mouse

Many thanks to Klaus and a couple of others who actually tried to help
me with this problem by posting sensible replies, but I now give up
wasting time with Perl and this newsgroup.

It's simply not worth the hassle dealing with the pedantic fools in
this newsgroup that simply want to prove how clever they are by
nitpicking terminology. Thanks for nothing. :o(


 
U

Uri Guttman

A> Many thanks to Klaus and a couple of others who actually tried to help
A> me with this problem by posting sensible replies, but I now give up
A> wasting time with Perl and this newsgroup.

A> It's simply not worth the hassle dealing with the pedantic fools in
A> this newsgroup that simply want to prove how clever they are by
A> nitpicking terminology. Thanks for nothing. :o(

hmmm, methinks you are the fool for not understanding that computer
terminology matters (like it does in all technical worlds). they call it
jargon for a reason. i think perl is too good for you so stick with
cobol or fortran which are more your speed.

and thanx for dropping by and ignoring my recent posts which tell you
exactly (and pedantically) how to solve your coding problem. but i can't
solve your head problem. neither can you it seems.

or hire a programmer the next time because you aren't one.

uri
 
K

Klaus

Many thanks to Klaus and a couple of others who actually tried to help
me with this problem by posting sensible replies,

All replies to your problem so far were sensible.
but I now give up wasting time with Perl...

Don't hurt yourself.
...and this newsgroup.

Well, that's your opinion, but I would suggest at least that you read
this newsgroup from time to time (I suggest twice a week ?). One never
knows, maybe a satisfactory reply to your problem will be posted
later.
 
