Tom Cloyd
I'm baffled by this strange outcome - I cannot collapse runs of multiple
spaces in a text file. This isn't just a regex problem, somehow. I'm failing
to grasp something essential, but don't know what it is. All help
appreciated, as usual!
Here is a demo of my problem, in which I try two different ways, and
both fail:
=== code ===
# h2t.rb
def main
  # conversion table spec
  conv = [
    [ '<h1>', 'h1. ' ], [ '<h2>', 'h2. ' ], [ '<h3>', 'h3. ' ],
    [ '<h4>', 'h4. ' ], [ '<h5>', 'h5. ' ], [ '<h6>', 'h6. ' ],
    [ /<\/h\d>/, '' ],
    [ " +", ' ' ]] # <= this last array element should do the trick, but doesn't
  data = open( 'h2t-in2.txt', 'r' ) { |f| ( f.readlines( data )).to_s }
  conv.each do |i|
    data.gsub!( i[0], i[1] )
  end
  data.squeeze(' ') # <= putting this here was sheer desperation, but even THIS fails
  open( "h2t-out.txt", "w" ) { |f| f.write( data ) }
end
%w(rubygems ruby-debug readline strscan logger fileutils).each{ |lib| require lib }
main
=== input file ===
<h1>Library catalog listing </h1>x
<h3>Library catalog listing </h3>x
<h2>Library catalog listing </h2>x
p(subtitle). A complete listing of all material in the Library
=== output file ===
h1. Library catalog listing x
h3. Library catalog listing x
h2. Library catalog listing x
p(subtitle). A complete listing of all material in the Library
==============
The "x"s in the input file are to show that while the end tags are being
removed the space before them is NOT.
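A minimal sketch of what may be going on, offered as a guess rather than a diagnosis of the script above. It relies on two documented behaviours of Ruby's String class: gsub treats a String pattern as literal text, so " +" looks for a space followed by a plus sign, and squeeze returns a new string without touching the receiver. The helper names and the sample text are hypothetical and exist only for illustration.

```ruby
# Hypothetical helpers; only gsub, squeeze! and dup are core String methods.

# A String pattern is matched literally: " +" means
# "one space followed by a plus sign", not "one or more spaces".
def collapse_with_string_pattern(text)
  text.gsub(" +", ' ')
end

# A Regexp pattern / +/ means "one or more spaces".
def collapse_with_regexp(text)
  text.gsub(/ +/, ' ')
end

# squeeze returns a new string and leaves the receiver alone;
# squeeze! changes the receiver in place (and returns nil when
# nothing changed, so the copy itself is returned here).
def collapse_with_squeeze(text)
  copy = text.dup
  copy.squeeze!(' ')
  copy
end

sample = "Library catalog   listing  x"
puts collapse_with_string_pattern(sample)
puts collapse_with_regexp(sample)
puts collapse_with_squeeze(sample)
```

If this reading is right, the conversion table entry would need a Regexp such as / +/ in place of the String " +", and the bare squeeze call would need to be squeeze! or have its result assigned back to data.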
t.
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< (e-mail address removed) >> (email)
<< TomCloyd.com >> (website)
<< sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~