Simple regex question.

Peter Bailey

Hello.
I need to parse through thousands of TIFF files and do some re-naming.
These files have underscores in them followed by a sequential number. I
need to grab just the "root" of the filename, without the underscore or
the numbers.
Dir.chdir("L:/infocontiffs/ehs-g7917741")
files = Dir.glob("*.tiff")
file = files[0]
puts file
file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")
puts file
What I get with this is:
ehs-g7917741_01.tiff
Why doesn't it give me my root filename?
Thanks,
Peter
 
Tim Hunter

Is this what you want?

while fname = DATA.gets
  m = fname.match(/(.*?)_\d+\.tiff/)
  if m
    puts "Match: '#{m[1]}'"
  else
    puts "No match: #{fname}"
  end
end

__END__
ehs-g7917741_01.tiff
asadsasd_12345.tiff
ljhkjhkh_1_2_3.tiff
xxxx__1.tiff
xxxx_.tiff
xxxx.tiff
xxxx
_.tiff
_01.tiff
 
Brian Candler

Peter said:
file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")

The argument "#{$1}" is expanded once, before gsub even executes. You
probably want the block form:

file = file.sub(/^(.*)_\d+\.tiff/) { $1 }
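
To make the difference concrete, here is a minimal sketch comparing the three ways of writing the replacement. The method and variable names are illustrative only, and the interpolated call is wrapped in a method on the assumption that no earlier match has set $1 there, which mirrors a fresh script.

```ruby
PATTERN = /^(.*)_[0-9]+\.tiff/

# Interpolated form: "#{$1}" is evaluated while the arguments are being built,
# before sub runs. No match has happened in this method yet, so $1 is nil and
# the replacement string is simply "".
def interpolated(name)
  name.sub(PATTERN, "#{$1}")
end

# Block form: the block is called after the match, so $1 holds the capture.
def with_block(name)
  name.sub(PATTERN) { $1 }
end

# Backreference form: sub itself expands '\1' (note the single quotes).
def with_backref(name)
  name.sub(PATTERN, '\1')
end

name = "ehs-g7917741_01.tiff"
puts interpolated(name).inspect  # ""
puts with_block(name)            # ehs-g7917741
puts with_backref(name)          # ehs-g7917741
```

The empty result from the interpolated form would explain why the second puts in the original script appeared to print nothing useful.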
 
Peter Bailey

Well, you gave me a good idea: using match. Here's what I did, and it
worked. Thank you very much, Tim.

Dir.chdir("L:/infocontiffs/ehs-g7917741")
files = Dir.glob("*.tiff")
file = files[0]
puts file
file = file.match(/^(.*)_[0-9]+\.tiff/)
#file = file.to_i
puts $1
#end
gives me:
ehs-g7917741_01.tiff
ehs-g7917741

Program exited with code 0
 
David A. Black

Hi --

Here's another good use of the string[//] technique:
file = "ehs-g7917741_01.tiff"
=> "ehs-g7917741_01.tiff"
file[/[^_]+/] # match non-underscore characters
=> "ehs-g7917741"
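
As a small sketch of how far this technique goes: string[regex] returns the first match, and string[regex, 1] returns the first capture group, or nil when nothing matches. The second sample filename below is made up for illustration, to show what happens if a root ever contains an underscore itself.

```ruby
STRICT = /\A(.*)_\d+\.tiff\z/

file = "ehs-g7917741_01.tiff"
first_run   = file[/[^_]+/]   # everything before the first underscore
strict_root = file[STRICT, 1] # capture group 1 of a whole-name match

# Hypothetical filename whose root contains an underscore.
multi = "my_scan_07.tiff"
multi_first_run   = multi[/[^_]+/]   # stops at the first underscore
multi_strict_root = multi[STRICT, 1] # greedy (.*) keeps the inner underscore

# A name without the _NN suffix gives nil with the strict pattern.
no_suffix = "xxxx.tiff"[STRICT, 1]

puts first_run          # ehs-g7917741
puts strict_root        # ehs-g7917741
puts multi_first_run    # my
puts multi_strict_root  # my_scan
puts no_suffix.inspect  # nil
```

Which pattern is right depends on whether the roots can contain underscores; for the filenames shown in this thread, both give the same answer.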


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com
 
Robert Klemme

2009/6/26 David A. Black said:
=> "ehs-g7917741_01.tiff"
file[/[^_]+/]      # match non-underscore characters

=> "ehs-g7917741"

Combining all the good suggestions this is probably what I'd do:

files = Dir.glob("L:/infocontiffs/ehs-g7917741/*.tiff")
files.each do |f|
  base = File.basename f
  root = base[/^([^_]+)_\d+\.tiff$/, 1]

  # Test root, not base: base is always set, root is nil when the name doesn't match.
  if root
    # rename or whatever
  else
    $stderr.puts "Dunno what to do with #{f}"
  end
end

The reason I left in the matching of underscores and digits is to be
sure that the complete name matches the pattern that we required in
order to detect other files that might accidentally have been placed
in that directory.
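
A rough sketch of that idea, kept free of any actual renaming so it can be tried safely: split a list of paths into those that match the required pattern and those that do not. The method name partition_by_pattern and the sample paths are hypothetical, not from the thread.

```ruby
# Whole-name pattern: root without underscores, underscore, digits, .tiff
PATTERN = /\A([^_]+)_\d+\.tiff\z/

# Returns [matched, unmatched]: matched maps each path to its root,
# unmatched lists the paths that do not fit the pattern.
def partition_by_pattern(paths)
  matched = {}
  unmatched = []
  paths.each do |path|
    root = File.basename(path)[PATTERN, 1]
    if root
      matched[path] = root
    else
      unmatched << path
    end
  end
  [matched, unmatched]
end

# Made-up sample paths for illustration.
sample = ["dir/ehs-g7917741_01.tiff", "dir/notes.txt", "dir/xxxx_.tiff"]
matched, unmatched = partition_by_pattern(sample)

matched.each { |path, root| puts "#{path} -> #{root}" }
unmatched.each { |path| $stderr.puts "Dunno what to do with #{path}" }
```

Reviewing the unmatched list before renaming anything is one way to catch stray files early.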

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
