Encoding problem in Mechanize / Net::HTTP.get?


Adam Cza

Hi there.
I'm trying to fetch a site to experiment with, using Mechanize.
The site's charset is ISO-8859-2 (Central European). I'm running
Ubuntu 8.10 and Ruby 1.8, but that doesn't seem to be the problem.
During the fetch I get this rather unpleasant error:
ruby pierdolety.rb
/usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/util.rb:29:in `iconv': "\352</a></b></font>"... (Iconv::IllegalSequence)
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/util.rb:29:in `to_native_charset'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/chain/response_header_handler.rb:29:in `handle'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/chain.rb:30:in `pass'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/chain/handler.rb:6:in `handle'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/chain/response_body_parser.rb:35:in `handle'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/chain.rb:30:in `pass'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/chain/handler.rb:6:in `handle'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/chain/pre_connect_hook.rb:14:in `handle'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/chain.rb:25:in `handle'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize.rb:494:in `fetch_page'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize.rb:229:in `get'
    from pierdolety.rb:9
Exit code: 1


from this code:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new { |a|
  a.user_agent_alias = 'Mac Safari'
}

# This is line 9 of the script, where the exception is raised.
agent.get('http://www.land-serwis.pl/hurt/zaloguj.php')

I tried googling it, but found nothing useful.

Any ideas?
 

7stud --

Adam said:
Hi there.
I'm trying to fetch a site to experiment with, using Mechanize.
The site's charset is ISO-8859-2 ...
During the fetch I get this rather unpleasant error:

/usr/lib/ruby/gems/1.8/gems/mechanize-0.9.1/lib/www/mechanize/util.rb:29:in `iconv': "\352</a></b></font>"... (Iconv::IllegalSequence)

from this code:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new { |a|
  a.user_agent_alias = 'Mac Safari'
}

agent.get('http://www.land-serwis.pl/hurt/zaloguj.php')

I tried googling it, but found nothing useful.

Any ideas?

Looking at the source code of that web site, Mechanize is having
problems with this word:

Się

The \352 in the error message is an octal escape, which is 234 in
decimal (0xEA in hex), which is the code for 'ę' in ISO-8859-2. Mechanize
is choking on that character. The error message mentions iconv, which is
a standard library module used to convert between different encodings. It
looks to me like get() is trying to use iconv to convert 'ę' to ASCII,
which is causing the error.
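
To make the byte arithmetic concrete, here is a minimal sketch. It assumes a newer Ruby (2.0 or later), where String#encode and String#b are built in; it does not use the Iconv call from the backtrace, and it only illustrates the reasoning about the byte, not what Mechanize does internally. The sample string "Si\352" stands in for the word from the page.

```ruby
# Sketch only. On Ruby 1.8 the rough equivalent of the transcoding step
# would be Iconv.conv('UTF-8', 'ISO-8859-2', raw).

raw = "Si\352".b   # raw bytes as served: "S", "i", 0xEA

# Octal 352 = 3*64 + 5*8 + 2 = 234 decimal = 0xEA.
last_byte = raw.bytes.last

# Reading the bytes as ISO-8859-2 and transcoding to UTF-8 succeeds.
decoded = raw.dup.force_encoding(Encoding::ISO_8859_2).encode(Encoding::UTF_8)

# Reading the same bytes as UTF-8 does not: 0xEA opens a multi-byte
# sequence that is never completed.
valid_as_utf8 = raw.dup.force_encoding(Encoding::UTF_8).valid_encoding?

# Converting the decoded character to plain ASCII fails, since ASCII
# has no code for it.
ascii_error =
  begin
    decoded.encode(Encoding::US_ASCII)
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end

puts last_byte      # 234
puts decoded
puts valid_as_utf8
p ascii_error
```

So the byte is fine as long as whoever converts it is told the source is ISO-8859-2; any converter that assumes a different source charset, or targets ASCII, will reject it.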

I looked around the Mechanize docs, and I couldn't find any mention of
"encoding", so I have no solution.
 
