Mechanize and charset issues

John Schmitz

I'm not sure what is causing this error. I can log in successfully, but
I can't submit this form without the script bombing out.

formatstring = "testing submission"

agent = WWW::Mechanize.new
page = agent.get 'hidden'
form = page.forms.first
# Log in first unless we are already on the submission form
unless form.action.eql?('submit.php')
  p "logging in....."
  form['username'] = 'hidden'
  form['password'] = 'hidden'

  page = agent.submit form
  page = agent.click(page.link_with(:text => 'Add'))
end

page = agent.click(page.link_with(:text => '[Add Content]'))
uploadForm = page.forms[6]
uploadForm['format'] = formatstring
page = agent.submit uploadForm
#pp page

Gives me the error:

/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in
`iconv': "\342\202\254\305\223a condition"... (Iconv::IllegalSequence)
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in
`from_native_charset'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:151:in
`from_native_charset'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:143:in
`proc_query'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:142:in
`map'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:142:in
`proc_query'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:165:in
`build_query'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:164:in
`each'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:164:in
`build_query'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:213:in
`request_data'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:392:in
`post_form'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:335:in
`submit'
 
John Schmitz

It is caused by the following HTML, â€œ, in one of the hidden form
entries that is being submitted. I'm not sure how to keep this from
bombing out and still submit the form, though.
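
For readers trying to decode the error, here is a minimal sketch of where those bytes could come from. It assumes Ruby 1.9 or later (for String#encode; the thread itself runs Ruby 1.8 with Iconv) and assumes the page charset is ISO-8859-1, which the thread does not state. It shows that a curly left quote misread as Windows-1252 turns into â€œ, that the last five bytes of that string match the octal escapes in the error, and that those characters cannot be converted to ISO-8859-1.

```ruby
# Sketch only. Assumes Ruby 1.9+; the thread uses Ruby 1.8 and Iconv.
left_quote = "\u201C" # curly left double quote, UTF-8 bytes E2 80 9C

# Misread those three bytes as Windows-1252, then re-encode as UTF-8.
mojibake = left_quote.dup.force_encoding("Windows-1252").encode("UTF-8")

# Octal dump of the last five bytes, in the style of the Ruby 1.8 error message.
tail_octal = mojibake.bytes.to_a.last(5).map { |b| "\\%o" % b }.join

# Assumption: the target charset is ISO-8859-1 (not stated in the thread).
begin
  mojibake.encode("ISO-8859-1")
  conversion_failed = false
rescue Encoding::UndefinedConversionError
  # The euro sign and the oe ligature have no ISO-8859-1 equivalent.
  conversion_failed = true
end

puts mojibake
puts tail_octal
puts conversion_failed
```

If that assumption holds, it would explain why the string quoted in the error starts right after the â: that character exists in ISO-8859-1, while the two that follow it do not.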
 
The Higgs bozo

John said:
page = agent.click(page.link_with(:text => '[Add Content]'))
uploadForm = page.forms[6]
uploadForm['format'] = formatstring
page = agent.submit uploadForm
#pp page

Gives me the error:

/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in
`iconv': "\342\202\254\305\223a condition"... (Iconv::IllegalSequence)
from

Running "ruby -KU ..." will probably fix it (at least it has worked for
me whenever I had errors from \nnn inside strings).
 
John Schmitz

The Higgs bozo said:
John said:
page = agent.click(page.link_with(:text => '[Add Content]'))
uploadForm = page.forms[6]
uploadForm['format'] = formatstring
page = agent.submit uploadForm
#pp page

Gives me the error:

/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in
`iconv': "\342\202\254\305\223a condition"... (Iconv::IllegalSequence)
from

Running "ruby -KU ..." will probably fix it (at least it has worked for
me whenever I had errors from \nnn inside strings).

Thank you for the response, but it doesn't seem to solve the issue. I
think it's related to charsets and iconv, but I have no idea where to go
from there. I get a nearly identical error message with ruby -KU:

/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in
`iconv': "€œa condition"... (Iconv::IllegalSequence)
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in
`from_native_charset'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:151:in
`from_native_charset'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:143:in
`proc_query'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:142:in
`map'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:142:in
`proc_query'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:165:in
`build_query'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:164:in
`each'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:164:in
`build_query'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:213:in
`request_data'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:392:in
`post_form'
from
/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:335:in
`submit'
 
John Schmitz

If anyone comes across this problem, this is how I fixed it. I found a
method online and made some minor changes and additions. I just pass the
problem strings through it and it gives me back strings that don't
have issues.

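def fix_quotes(c)
  # Curly double and single quotes (raw UTF-8 bytes) become ASCII quotes
  c.gsub!(/\342\200(?:\234|\235)/,'"')
  c.gsub!(/\342\200(?:\230|\231)/,"'")
  # En dash and ellipsis
  c.gsub!(/\342\200\223/,"-")
  c.gsub!(/\342\200\246/,"...")
  # Double-encoded (mojibake) variants of the same characters
  c.gsub!(/\303\242\342\202\254\342\204\242/,"'")
  c.gsub!(/\303\242\342\202\254\302\235/,'"')
  c.gsub!(/\303\242\342\202\254\305\223/,'"')
  c.gsub!(/\303\242\342\202\254"/,'-')
  c.gsub!(/\342\202\254\313\234/,'"')
  # gsub! returns nil when nothing matched, so return the string explicitly
  c
end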
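
As an alternative to a lookup table, the double encoding can sometimes be reversed in one step. The following is a minimal sketch of that different technique, not the method above. It assumes Ruby 1.9 or later and assumes the text is UTF-8 that was mis-decoded as Windows-1252. The name undo_mojibake and the field names in the hash are hypothetical.

```ruby
# Sketch only. Assumes Ruby 1.9+ and UTF-8 text that was mis-decoded as Windows-1252.
def undo_mojibake(text)
  # Map each character back to its Windows-1252 byte, then reread the bytes as UTF-8.
  repaired = text.encode("Windows-1252").force_encoding("UTF-8")
  # Keep the original if the result is not valid UTF-8 (the text was not mojibake).
  repaired.valid_encoding? ? repaired : text
rescue EncodingError
  # Some character has no Windows-1252 byte, so leave the text untouched.
  text
end

# Hypothetical stand-in for the values of the form's fields.
fields = {
  "format" => "testing submission",
  "note"   => "\u00E2\u20AC\u0153a condition"
}

cleaned = {}
fields.each { |name, value| cleaned[name] = undo_mojibake(value) }

puts cleaned["note"]
```

The function returns its input unchanged whenever the round trip is not possible, so some sequences may still need explicit replacements like the ones in fix_quotes.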
 
Jarmo Pertman

Have you tried setting the encoding for the page, something like this:
page.encoding = 'UTF-8'?

Jarmo
 
