problem with ".scan"

Peter Bailey · Sep 27, 2006

Ruby's complaining about the following three lines of code. They're in a
new program, but I copied them directly from an older, working program.
Can someone help me understand what the problem is with the "scan" line
or, apparently, the "each" line?

Thanks,
Peter

10 Dir.glob("*.ps").each do |psfile|
11 file_contents = File.read(psfile)
12 file_contents.scan(/\%\%Pages: (\d{1,5})[ ]+\n/) do

Error message:

E:/PageCounts/test1.rb:12:in `scan': string modified (RuntimeError)
from E:/PageCounts/test1.rb:12
from E:/PageCounts/test1.rb:10:in `each'
from E:/PageCounts/test1.rb:10

ara.t.howard · Sep 27, 2006

10 Dir.glob("*.ps").each do |psfile|
11 file_contents = File.read(psfile)
12 file_contents.scan(/\%\%Pages: (\d{1,5})[ ]+\n/) do

the modification is probably here. can't you show us everything up through
the matching end?

-a

Peter Bailey · Sep 27, 2006

Sorry. It's a bit much. That's why I was holding back. Here's the whole
script.

require 'kirbybase'
Dir.chdir("E:/pagecounts")
#First, create the database table.
db = KirbyBase.new
# If table exists, delete it.
db.drop_table(:pageinfo) if db.table_exists?(:pageinfo)
pageinfo_tbl = db.create_table(:pageinfo,
  :filename, {:DataType=>:String, :Index=>1},
  :lconstant, :String,
  :compcode, :String,
  :primecode, :Integer,
  :costcenter, :String,
  :acctgroup, :Integer,
  :blank, :String,
  :description, :String,
  :pagecount, :Float,
  :sjccode, :String,
  :fullname, {:DataType=>:String, :Index=>2}
)
# Import the csv file.
pageinfo_tbl.import_csv('McArdle_indexes.csv')

=begin
Parse each postscript print file in the polled directory. Create
variables for:
the number of pages in each file; the number of blank pages in each
file; and,
what exact pages are blank.
=end
Dir.glob("*.ps").each do |psfile|
  file_contents = File.read(psfile)
  file_contents.scan(/\%\%Pages: (\d{1,5})[ ]+\n/) do
    totalpages = $1
    if (totalpages.to_i % 2) != 0 then
      newtotalpages = totalpages.to_i + 1
      file_contents << "\%\%Blank page for Asura.\n\%\%Page: #{newtotalpages.to_i}\nshowpage\n"
      File.open(psfile, "w") { |f| f.print file_contents }
      FileUtils.touch(psfile)
    end

=begin
Find blank pages in the postscript file. Look for the regular expression
that sees a page callout followed by postscript data that does not include
data in parentheses. Any type on a postscript page is enclosed in
parentheses, so that's why this is a legitimate search. Blank pages have
no parenthesized data.
=end
    blanks = []
    file_contents.scan(/\%\%Page: [()0-9{1,5}] ([0-9]{1,5})\n[^$.*$]\%\%Page/) do |match|
      blanks.push($1)
    end
    file_contents.scan(/\%\%Blank page for Asura.\n/) do |match|
      blanks.push(totalpages.to_i + 1)
    end

=begin
Open a "pageinfo" file. Put page information about the file into it.
Notice that the variable for the total number of pages differs depending
on whether a "newtotalpages" variable exists. And, that variable only
exists if the original page count was odd and a blank had to be added.
=end
    filename = File.basename("#{psfile}", '.ps')
    pageinfofile = File.basename("#{psfile}", '.ps') + ".pageinfo"
    File.open("E:/pagecounts/#{pageinfofile}", "a") do |fileinfo|
      if newtotalpages then
        fileinfo << "#{filename}\n" <<
          "Total number of pages in this PDF: #{newtotalpages}\n" <<
          "Number of blank pages in this PDF: #{blanks.size}\n" <<
          "Specific pages that are blank in this PDF: " <<
          "#{blanks.join(', ')}\n"
      else
        fileinfo << "#{filename}\n" <<
          "Total number of pages in this PDF: #{totalpages}\n" <<
          "Number of blank pages in this PDF: #{blanks.size}\n" <<
          "Specific pages that are blank in this PDF: " <<
          "#{blanks.join(', ')}\n"
      end
    end
  end
end

=begin
Back to the database table. . . .
Query against the table and match the filename in the directory with
whichever entry
in the "filename" column of the table matches. Then, if there's a match,
populate
the "pagecount" field in that row of the table with the variable for the
page count, as
found above. That variable name is "newtotalpages."
=end

Dir.glob("*.ps").each do |dirfile|
  result = pageinfo_tbl.select(:filename) { |r| dirfile =~ Regexp.new(r.filename) }
  pageinfo_tbl.update { |r| r.name == {filename}.set(:pagecount=>#{newtotalpages}) } unless result.nil?
end

ara.t.howard · Sep 27, 2006

Dir.glob("*.ps").each do |psfile|
file_contents = File.read(psfile)
file_contents.scan(/\%\%Pages: (\d{1,5})[ ]+\n/) do
totalpages = $1
if (totalpages.to_i % 2) !=0 then
newtotalpages = totalpages.to_i + 1
file_contents << "\%\%Blank page for Asura.\n\%\%Page:

^^
^^
^^
^^
the modification in question

#{newtotalpages.to_i}\nshowpage\n"
File.open(psfile, "w") { |f| f.print file_contents }
FileUtils.touch(psfile)
end

so, ruby is correct, you are modifying a string while in an in-progress scan
block. easy-cheasy.

kind regards.

-a

Peter Bailey · Sep 27, 2006

Thanks. I ended the scan block before doing any file writing, and that
seemed to do the trick. It still confuses me, though, because this code
was borrowed from an existing script that I've been using for six months,
and that part of it is just as you see it above.
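
For anyone reading along, here is a minimal sketch of what "ending the scan block before writing" could look like. The helper name pad_to_even_pages is made up for illustration and is not from the script above; it only works on the string and leaves the file writing to the caller.

```ruby
# Hypothetical helper (not from the original script): returns the possibly
# padded PostScript text plus the page count, and never touches the string
# while a scan is still in progress.
def pad_to_even_pages(contents)
  # scan without a block runs to completion and returns the captures.
  total = contents.scan(/%%Pages: (\d{1,5})[ ]+\n/).flatten.first
  return [contents, nil] if total.nil?

  pages = total.to_i
  if pages.odd?
    pages += 1
    # Build a new string instead of appending during the scan.
    contents = contents + "%%Blank page for Asura.\n%%Page: #{pages}\nshowpage\n"
  end
  [contents, pages]
end

padded, pages = pad_to_even_pages("%!PS\n%%Pages: 3 \n")
puts pages
puts padded
```

The caller can then write `padded` back to disk in a separate step, after all scanning is done.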

ara.t.howard · Sep 27, 2006

probably because in the old script totalpages has always been even, so the
'if' branch that appends to the string never ran. in your new script the
number of pages is odd, i'm guessing, and so the bug is triggered. if i
were you i'd update the other script - it's a bug in waiting.
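
A small sketch of why the error can stay hidden for months: the append only happens on odd page counts, so even counts pass straight through. The method name scan_and_maybe_append and the sample strings are assumptions for illustration only.

```ruby
# Same shape as the script: append inside the scan block, but only when the
# page count is odd. Returns :ok, or the error message if Ruby objects.
def scan_and_maybe_append(contents)
  contents = contents.dup # private, unfrozen copy
  contents.scan(/%%Pages: (\d{1,5})[ ]+\n/) do
    contents << "%%Blank page\n" if $1.to_i.odd?
  end
  :ok
rescue RuntimeError => e
  e.message
end

p scan_and_maybe_append("%%Pages: 2 \n") # even: nothing is appended
p scan_and_maybe_append("%%Pages: 3 \n") # odd: the append happens mid-scan
```

With an even count the block never modifies the string, so scan has nothing to complain about; with an odd count it should report the same "string modified" error shown at the top of the thread.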

regards.

-a

Peter Bailey · Sep 27, 2006

Well, I know that they're not always odd or even. They've been a mix of
both. But, I understand what you're saying. I will change my original
script. Basically, and, please tell me if I understand this correctly:
if I'm going to do a scan of a file, open the file, scan it, and then
close it. Right?

ara.t.howard · Sep 27, 2006

yup. just remember to avoid this

string = 'foobar'

string.scan(%r/foo/) do |word|
string << 'foo' # can't modify while scanning
end
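
and two ways around it, as a rough sketch (the method names are made up for illustration): either let scan finish and modify afterwards, or scan one string while appending to a different one.

```ruby
# Option 1: scan without a block returns an array once scanning is done,
# so appending while walking that array is safe.
def append_after_scan(text)
  result = text.dup
  result.scan(/foo/).each { result << 'foo' }
  result
end

# Option 2: scan the original, append to a separate copy.
def append_to_copy(text)
  result = text.dup
  text.scan(/foo/) { result << 'foo' }
  result
end

puts append_after_scan('foobar')
puts append_to_copy('foobar')
```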

regards.

-a

Peter Bailey · Sep 27, 2006

Thanks a lot, -a! I've cleaned up my code. But, if you notice way above,
I've got a File.read in the line before the file scan. If I do an "end"
for the file scan, my "read" is still open, right? Meaning, I can still
do stuff to the open file.

Robert Klemme · Sep 27, 2006

If you're referring to your original code, then no. You use File.read(name)
which returns the whole file in a single string. No open connection is
returned.
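
A quick sketch of the difference, assuming a throwaway temp file and a made-up helper name (read_then_append): File.read hands back a plain String, and anything you want to do to the file afterwards needs its own File.open.

```ruby
require 'tempfile'

# Hypothetical helper: read the whole file, then reopen it to append.
def read_then_append(path, extra)
  contents = File.read(path)                 # a String; nothing stays open
  File.open(path, 'a') { |f| f.print extra } # explicit reopen for writing
  [contents, File.read(path)]
end

Tempfile.create('pagecount') do |tmp|
  tmp.write("%%Pages: 3 \n")
  tmp.flush
  before, after = read_then_append(tmp.path, "showpage\n")
  p before.class
  puts after
end
```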

Btw, for efficiency reasons if your files are large you might consider using

File.foreach(file_name) do |line|
....
end

Or use File.readlines instead of File.read - that way you get an array with
lines and not the whole file in one piece.
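
As a rough illustration of both suggestions, here is a sketch with two hypothetical helpers (page_count and page_markers are invented names, and the sample PostScript is made up):

```ruby
require 'tempfile'

# Reads one line at a time and stops at the first "%%Pages:" comment.
def page_count(path)
  File.foreach(path) do |line|
    return $1.to_i if line =~ /\A%%Pages: (\d{1,5})/
  end
  nil
end

# Loads the file as an array of lines and counts the "%%Page:" markers.
def page_markers(path)
  File.readlines(path).grep(/\A%%Page: /).size
end

Tempfile.create('sample') do |tmp|
  tmp.write("%!PS\n%%Pages: 2 \n%%Page: 1 1\nshowpage\n%%Page: 2 2\nshowpage\n")
  tmp.flush
  p page_count(tmp.path)
  p page_markers(tmp.path)
end
```

File.foreach never holds more than one line at a time, while File.readlines still loads everything but splits it into lines for you.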

Kind regards

robert

Peter Bailey · Sep 27, 2006

Thanks, Robert. I'll look into that line-by-line technique. The reason I
probably haven't used it is that I often need to search for or
accommodate data that spans over multiple lines.

Robert Klemme · Sep 28, 2006

Yeah, in that case File.read is clearly superior (if the file fits into
memory, that is). For me line by line is the default because it scales
better, and I only switch to slurping the whole file at once if I need to
match across lines. But then again my typical problem might be different
from yours, so your different default might actually be the better
solution for you.
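
To make the trade-off concrete, here is a small sketch. The pattern below is a simplified stand-in for a "blank page" test (a %%Page comment followed directly by showpage), not the regular expression from Peter's script, and the helper names are made up.

```ruby
# Assumed, simplified pattern that spans two lines.
BLANK_PAGE = /%%Page: (\d+) \d+\nshowpage\n/

# Whole-file approach: the pattern can see across line breaks.
def blank_pages_slurped(text)
  text.scan(BLANK_PAGE).flatten.map(&:to_i)
end

# Line-by-line approach: each line is matched alone, so a pattern that
# needs two lines never matches.
def blank_pages_by_line(text)
  text.each_line.flat_map { |line| line.scan(BLANK_PAGE).flatten.map(&:to_i) }
end

sample = "%%Page: 1 1\n(Hello) show\nshowpage\n%%Page: 2 2\nshowpage\n"
p blank_pages_slurped(sample)
p blank_pages_by_line(sample)
```

The slurped version finds page 2; the per-line version finds nothing, which is the situation Peter describes.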

Kind regards

robert
