How do I get an integer from an array?

Peter Bailey

Hi,
I need to process individual pages of PDFs. To do so, I need to get the
page count of the PDF, then do some image magic with each page of that
PDF. So, the first thing I do is use a utility that gives me that page
count. I get the page count, but it's an array. And it doesn't let me
treat that "array" as a number, so I can't do what I want. Here's a
snippet of my script and what I get with it. Thanks.

Dir.chdir("N:/infoconpdf")
file = "ehs-X7917735.pdf"
pages = `pdfinfo #{file}`
pages = pages.scan(/^Pages:[ ]{2,99}([0-9]+)/)
puts pages
1.upto(pages) do |n|
  puts n
end

I get this:
78
================ ArgumentError =====================
C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:12:in `>'
1.upto(pages) do |n|
C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:12:in `upto'
1.upto(pages) do |n|
C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:12:in `<main>'
1.upto(pages) do |n|
Exception: comparison of Fixnum with Array failed
Program exited with code 0
 
Alex

[Note: parts of this message were removed to make it a legal post.]


I think his `pages' var is a single-element array, not an array of length
78, so this may give the desired result:

1.upto(pages[0].to_i) do |n|
  puts n
end


Alex
 
David A. Black

Hi --

When you do a scan where the regex has parentheses, you get an array
for each scan through the string, with the captures as elements of
that array. So you end up with an array of arrays:
=> [[" I "], [" a "]]

So you got back [["78"]], I believe. You have to dig the number out.
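
To make the nesting concrete, here is a minimal sketch of what scan returns when the pattern has a capture group. The sample string is a made-up stand-in for real pdfinfo output (its column spacing is an assumption), so the script runs without pdfinfo installed.

```ruby
# Made-up stand-in for pdfinfo output; real output comes from the pdfinfo utility.
sample = "Tagged:         no\nPages:          78\nEncrypted:      no\n"

# scan gives one inner array per match, with one element per capture group.
matches = sample.scan(/^Pages:[ ]{2,99}([0-9]+)/)
p matches            # prints [["78"]]

# Dig the string out of both levels, then convert it to an Integer.
page_count = matches[0][0].to_i
p page_count         # prints 78
```

Once page_count is an Integer, 1.upto(page_count) behaves as intended.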

Another way to do it is:

pages[/Pages:\D+(\d+)/, 1]

(plus .to_i to convert it to an integer).
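
A minimal sketch of that alternative, assuming the capture number is passed as a second argument to String#[], outside the regex delimiters. The info string is a made-up stand-in for the pdfinfo output.

```ruby
# Made-up stand-in for pdfinfo output (spacing assumed).
info = "Tagged:         no\nPages:          78\nEncrypted:      no\n"

# String#[] with a regex and a capture index returns that capture as a String,
# or nil when the pattern does not match.
page_count = info[/Pages:\D+(\d+)/, 1].to_i
p page_count                                  # prints 78

# No match gives nil, and nil.to_i is 0, so a missing "Pages:" line shows up as 0.
p "no page line"[/Pages:\D+(\d+)/, 1].to_i    # prints 0
```

The result has to be assigned (or used directly) for it to have any effect; the bracket call does not modify the string.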


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com
 
Peter Bailey

Peter said:
Yes, if I do pages.to_s, I get [["78"]].

Roger said:
Looks like you want pages[0][0].to_i or what not then.
=r

Yup. That did it. Thank you very much, Roger. Now, I have to admit, I've
never, ever seen that double array counter notation. [0][0]. That's
totally weird to me. But, it works!

Dir.chdir("N:/infoconpdf")
file = "ehs-X7917735.pdf"
pages = `pdfinfo #{file}`
pages = pages.scan(/^Pages:[ ]{2,99}([0-9]+)/)
pages = pages[0][0].to_i
1.upto(pages) do |n|
  puts n
end

I got:
1
2
3
4
5
6
7...
78
 
Alex

The double notation works just like chaining any other methods together, i.e.

foo = bar.method.method

The index method of the array class is a method just like any other, so you
could just as well write it like:

foo.[](0).[](0)

which calls the `[]' method on the result of the previous call of the `[]'
method, which is an array.

foo[0] is just a shortcut ruby gives you for calling foo.[](0)
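
A short sketch showing that the two spellings are interchangeable. Here foo is just the nested array from earlier in the thread, written out by hand.

```ruby
foo = [["78"]]

bracket  = foo[0][0]          # the usual index syntax
explicit = foo.[](0).[](0)    # the same two calls spelled as ordinary method calls

p foo[0]                      # prints ["78"]  (the inner array)
p bracket                     # prints "78"
p bracket == explicit         # prints true
```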



Alex
 
Brian Candler

Or in this particular case, you can do

foo.first.first
 
Peter Bailey

Well, interestingly, I've succeeded in some of my scripts. But, in this
one, it fails. It displays a few hundred filenames with page counts,
but, in this directory, there are literally thousands of PDF files. So,
it does a bunch, then dies.

Dir.glob("*.pdf").each do |pdffile|
  pages = `pdfinfo #{pdffile}`
  pages = pages.scan(/^Pages:[ ]{2,99}([0-9]+)/)
  pages = pages[0][0].to_i
  puts "#{pdffile} #{pages}"
end

I get:
================ NoMethodError =====================
C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:8:in `block in
<main>'
pages = pages[0][0].to_i
C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:5:in `each'
Dir.glob("*.pdf").each do |pdffile|
C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:5:in `<main>'
Dir.glob("*.pdf").each do |pdffile|

=============================================
Exception: undefined method `[]' for nil:NilClass
 
David A. Black

Hi --

That means that somewhere along the line, the scan operation isn't
finding what you expect it to. Is it possible that you have a document
with more than 99 occurrences of [] in a row?

I'd still recommend trying the technique I suggested in my earlier
answer. Getting a nested array of one element and unnesting it seems
like the long way around.
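
One way to see the failure and guard against it, as a sketch rather than a diagnosis of the actual files: when the pdfinfo output has no matching "Pages:" line, scan returns an empty array, [0] on that is nil, and the second [0] raises. The name page_count is a hypothetical helper, and the idea that pdfinfo printed nothing usable for some file (a damaged PDF, say, or a filename with spaces in an unquoted command) is an assumption.

```ruby
# page_count is a hypothetical helper, not part of Ruby or pdfinfo.
# Returns the page count as an Integer, or nil when there is no "Pages:" line.
def page_count(pdfinfo_output)
  digits = pdfinfo_output[/^Pages:[ \t]+(\d+)/, 1]
  digits ? digits.to_i : nil
end

good  = "Tagged:         no\nPages:          78\n"   # made-up sample output
empty = ""                                           # assumed output for a file pdfinfo could not read

p empty.scan(/^Pages:[ ]{2,99}([0-9]+)/)   # prints []; [][0] is nil, so nil[0] raises NoMethodError
p page_count(good)                         # prints 78
p page_count(empty)                        # prints nil

# In the loop, a nil result can be reported and skipped instead of crashing:
#   Dir.glob("*.pdf").each do |pdffile|
#     count = page_count(`pdfinfo "#{pdffile}"`)
#     if count
#       puts "#{pdffile} #{count}"
#     else
#       warn "no page count for #{pdffile}"
#     end
#   end
```

Printing the name of the file that yields nil would show which PDF is the odd one out.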


David

 
Peter Bailey

But it does hundreds of files just fine. Then it dies. So, you're
saying that in one file in particular it can't find what's in the scan?
I'm sorry, but I don't understand the technique you described earlier,
David. You say to do this:
pages[/Pages:\D+(\d+),1/]
pages = pages.to_i
I get "0" as output with this.

The output of pdfinfo is simple. Here's an example:
Author: pb4072
Creator: Microsoft® Office Word 2007
Producer: Microsoft® Office Word 2007
CreationDate: 09/27/07 13:36:28
ModDate: 02/19/09 14:13:47
Tagged: no
Pages: 1
Encrypted: no
Page size: 612 x 792 pts (letter)
File size: 55418 bytes
Optimized: yes
PDF version: 1.6
As you can see, there are no [] characters in here.
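
For what it's worth, the 0 can be reproduced without pdfinfo. In the two lines tried above, the result of the bracket expression is thrown away, so pages is still the whole pdfinfo text when to_i is called, and a string that starts with "Author:" converts to 0. A sketch, assuming the capture number belongs outside the slashes as a second argument; output is a made-up sample whose column spacing is assumed.

```ruby
# Made-up sample; real pdfinfo output would come from the backticks.
output = "Author:         pb4072\nPages:          1\nEncrypted:      no\n"

output[/Pages:\D+(\d+)/, 1]     # finds "1", but nothing keeps the result
p output.to_i                   # prints 0, because the text starts with "Author:"

# With ",1" inside the slashes the pattern wants a literal ",1" after the digits.
p output[/Pages:\D+(\d+),1/]    # prints nil

# Assign the capture, then convert it.
pages = output[/Pages:\D+(\d+)/, 1].to_i
p pages                         # prints 1
```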
 