Deleting Text From a File

woodyee

Hi! I had to manually remove decimals and tildes from a text file. How
could I have done this via programming? I googled this but didn't
really find anything, just stuff on removing spaces. Basically, I'd
like to be able to search a file and remove certain items (ex - all
decimals and tildes). Thanks!
 
Alex Young

Sounds like a job for regular expressions. Have you got a short before
and after example?
 
woodyee

Sure. See below. Basically, the entire document was like this. To me,
Reg Exp's were the key but they're a weakness of mine so I was
lost. :)


Original
 
Alex Young

woodyee said:
Original
---------
89.00~~~~
6330.00~15~CAT1~0005~1.00

Modified

irb(main):001:0> file = "89.00~~~~
irb(main):002:0" 6330.00~15~CAT1~0005~1.00
irb(main):003:0"
irb(main):004:0" "
=> "89.00~~~~\n6330.00~15~CAT1~0005~1.00\n\n"
irb(main):005:0> file.gsub(/\.\d+/, '').tr('~', ' ')
=> "89    \n6330 15 CAT1 0005 1\n\n"
irb(main):006:0> puts _
89
6330 15 CAT1 0005 1

I think that should do it :) I don't think it's feasible to do it all
in a single pass, which is why I've gone for separate gsub and tr phases.
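
For anyone wanting to run the same cleanup on a file instead of a string pasted into irb, here is a minimal sketch. The method names (`strip_decimals_and_tildes`, `strip_in_one_pass`, `clean_file`) and the file names are made up for illustration; only the gsub/tr calls come from the session above. The second method suggests a single pass may be workable with gsub's block form, though the two-phase version is arguably easier to read.

```ruby
# Hypothetical helper: the same two phases as the irb session above.
def strip_decimals_and_tildes(text)
  # Phase 1: drop each decimal point together with the digits after it.
  without_decimals = text.gsub(/\.\d+/, '')
  # Phase 2: turn every tilde into a space.
  without_decimals.tr('~', ' ')
end

# Hypothetical single-pass variant: the block decides the replacement
# for each match (a tilde becomes a space, a decimal part is removed).
def strip_in_one_pass(text)
  text.gsub(/\.\d+|~/) { |match| match == '~' ? ' ' : '' }
end

# Hypothetical helper: read the whole source file, clean it, and write
# the result to a separate destination file.
def clean_file(src, dest)
  File.write(dest, strip_decimals_and_tildes(File.read(src)))
end
```

Usage would be along the lines of `clean_file('input_file.txt', 'cleaned.txt')`. Reading the whole file at once is fine for small files; for very large ones a line-by-line loop is probably a better fit.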
 
MonkeeSage

How about...

"he~~ll.o".tr('~.', '') # => "hello"

String#tr translates a group of characters into another group, such
that for every character in a given position in group one, it is
replaced by the character in the same position in group two, or if
group two is shorter, it uses the last character in group two. In this
case, group two is empty, so everything in group one is replaced with
empty strings (i.e., deleted).

HTH,
Jordan
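
A few small cases may help make the String#tr rules described above concrete. This is only a sketch; the sample strings are invented, and String#delete is shown as an alternative spelling that is not mentioned in the post.

```ruby
# Positional mapping: 'a' becomes 'x', 'b' becomes 'y'.
puts "aabbcc".tr('ab', 'xy')    # prints "xxyycc"

# Second group shorter than the first: its last character ('y')
# is reused for the leftover characters, so 'c' also becomes 'y'.
puts "aabbcc".tr('abc', 'xy')   # prints "xxyyyy"

# Empty second group: every character in the first group is deleted.
puts "he~~ll.o".tr('~.', '')    # prints "hello"

# String#delete expresses the same deletion more directly.
puts "he~~ll.o".delete('~.')    # prints "hello"
```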
 
yermej

This will remove all decimals and tildes and create a backup of the
original file with the extension .bak (from the command line):

ruby -i.bak -n -e 'print $_.gsub(/[\.~]/, "")' input_file.txt

Jeremy
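
In the one-liner above, -i.bak edits the file in place and keeps the original under a .bak extension, -n wraps the code in a loop that reads each line into $_, and -e supplies the code. Written out as a script, roughly the same thing might look like the sketch below. The method name `delete_chars_in_place` and its parameters are invented for illustration, and copying the file first only approximates what -i does internally.

```ruby
require 'fileutils'

# Hypothetical helper approximating `ruby -i.bak -n -e 'print $_.gsub(...)'`.
def delete_chars_in_place(path, pattern = /[.~]/, backup_ext = '.bak')
  backup = path + backup_ext
  # Keep a copy of the original, like -i.bak does.
  FileUtils.cp(path, backup)
  # Read the backup line by line, stripping every match from each line.
  cleaned = File.foreach(backup).map { |line| line.gsub(pattern, '') }
  # Overwrite the original path with the cleaned lines.
  File.write(path, cleaned.join)
end
```

Calling `delete_chars_in_place('input_file.txt')` should leave the cleaned text in input_file.txt and the untouched original in input_file.txt.bak. Inside a character class the dot needs no backslash, so `[.~]` matches the same characters as `[\.~]`.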
 
woodyee

WOW! It worked and it was so cool!! Thanks! One thing I did - I
modified it to delete the .00's:

ruby -i.bak -n -e 'print $_.gsub(/[\.00~]/, " ")' input_file.txt

This worked but it deleted the zero in front of the decimal. How can I
avoid this? Thanks! :)
 
yermej

Typically, when you put characters in [], it means to match any one
character in that group. So [\.00~] is the same as [\.0~], which is the
same as [0\.~], etc., so you removed all occurrences of 0, ., and ~ no
matter what surrounded them.

If you want to delete just .00 and ~, try:

/\.00|~/

as the first argument to gsub. That means match .00 or ~ and nothing
else.

This is probably a decent site if you want to learn more about regular
expressions:
http://www.regular-expressions.info/tutorial.html

Jeremy
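
The difference between the character class and the alternation can be seen on a short sample. This is a sketch only: the sample string is a shortened version of the data posted earlier, and the empty replacement string is chosen to keep the results easy to read.

```ruby
line = "6330.00~15"

# Character class: removes every single '.', '0' or '~', wherever it is,
# so the trailing zero of 6330 disappears too.
puts line.gsub(/[\.00~]/, '')            # prints "63315"

# Alternation: removes only the literal sequence ".00" or a tilde.
puts line.gsub(/\.00|~/, '')             # prints "633015"

# Removing ".00" but turning tildes into spaces keeps the fields apart.
puts line.gsub(/\.00/, '').tr('~', ' ')  # prints "6330 15"
```

Note that `\.00` matches only a literal .00; a pattern such as `\.\d+` would be needed to catch other decimal parts.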
 
