Intermittent problem with Hpricot

Allison Newman

I've run into a weird problem with the Hpricot gem. I don't know if
this is a problem with gems in general on my system, or if it's specific
to Hpricot, so I'm posting here.

Basically I have code as follows:
require 'rubygems'
require 'net/http'
require 'uri'
require_gem 'hpricot'

class Page
  def some_function(html_text)
    doc = Hpricot(html_text)
    doc.search("/html/body//img").each do |img|
      logger.debug("== Found an image: #{img} ==")
      src = img[/src="/]
      logger.debug("src = #{src}")
    end
  end
end


When the function is called, sometimes I get a message saying that the
function Hpricot does not exist for the object of type Page. This
continues for the entire programming session.

But sometimes, when I sit down to work on the project, the code works!
It continues working until I restart the Rails server (this is a Rails
project running on Mac OS X).

Does anyone have any idea what might be causing this problem, and how to
correct it?

Alli
 
Allison Newman

Of interest, if I replace
doc = Hpricot(html_text)
with
doc = Hpricot.parse(html_text)

everything works just fine, all of the time. But I still don't
understand why Rails doesn't find the module level function all of the
time. Anyone have any ideas?

Alli
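
For readers puzzling over the same symptom: Ruby resolves Hpricot(html_text) and Hpricot.parse(html_text) differently. The first is an ordinary method call on self, so if the library file has not been loaded it fails with a NoMethodError on the calling object (here, a Page). The second begins with a constant lookup, which fails with a NameError instead. One possible explanation for the difference in behaviour, offered as an assumption and not confirmed anywhere in this thread, is that Rails hooks missing constants in order to autoload files but has no equivalent hook for missing methods. The sketch below uses hypothetical names (Markup, NotLoaded) in place of Hpricot so that it runs without the gem.

```ruby
# Stand-in for a library that, like Hpricot, offers both a module-level
# method and a top-level convenience method of the same name.
# `Markup` and `NotLoaded` are hypothetical names used only for this sketch.
module Markup
  def self.parse(text)
    "parsed: #{text}"
  end
end

# Defined at the top level, so it becomes a private method on Object and is
# callable (without an explicit receiver) from inside any instance method.
def Markup(text)
  Markup.parse(text)
end

class Page
  # Both forms work once the library file has been loaded.
  def loaded_forms(text)
    [Markup(text), Markup.parse(text)]
  end

  # Method-call form against a library that was never loaded.
  def missing_method_form(text)
    NotLoaded(text)
  rescue NoMethodError => e
    "#{e.class}: #{e.name}"
  end

  # Constant form against a library that was never loaded.
  def missing_constant_form(text)
    NotLoaded.parse(text)
  rescue NameError => e
    "#{e.class}: #{e.name}"
  end
end

page = Page.new
p page.loaded_forms('<img>')
puts page.missing_method_form('<img>')
puts page.missing_constant_form('<img>')
```

The two failure messages name different error classes, which is why the original report ("function Hpricot does not exist for the object of type Page") points at a method lookup and not at a missing constant.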
 
David Vallner

This might or might not be related, but I think there was some change to
the semantics of require_gem lately (it came up in a post or thread a week
or two ago?), or Hpricot might have been changed not to load any files
when the gem is loaded.

Either way, (I think) you should just use #require, not #require_gem, and
directly require files inside the Hpricot gem. Rubygems is intended to
be as transparent to use as possible, and personally I prefer code that
is agnostic to how a certain library might be packaged on the given
system.

Try using require 'hpricot' instead of require_gem 'hpricot' and see if
that helps?

Also, Rails has a hard dependency on rubygems, so you don't need to
require it in Rails code.

David Vallner


 
lrlebron

You can change the code in one of two ways to get it to work.

This way, which is useful if you want, for example, to use a particular
version of the library:

require 'rubygems'
require 'net/http'
require 'uri'
require_gem 'hpricot', '=0.4'
require 'hpricot'

Or this way, to use the latest version:

require 'net/http'
require 'uri'
require 'hpricot'

Luis
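
The '=0.4' argument in the first variant is a RubyGems version requirement. As a small illustration, RubyGems' own Gem::Requirement and Gem::Version classes can be used to check which versions a requirement string accepts. The helper name accepts? and the candidate version numbers are made up for this sketch; only the '=0.4' string comes from the post above.

```ruby
require 'rubygems'

# Hypothetical helper: does the requirement string accept this version?
def accepts?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

puts accepts?('=0.4', '0.4')    # an exact pin matches the same version
puts accepts?('=0.4', '0.5')    # and rejects any other version
puts accepts?('>= 0.4', '0.5')  # an open-ended requirement accepts newer versions
```

Pinning only selects which installed version will be used; the library file still has to be loaded with a plain require, as in both variants above.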
 
Allison Newman

David said:
Try using require 'hpricot' instead of require_gem 'hpricot' and see if
that helps?

Yup, that seems to fix the problem, thanks. It's still weird that the
problem was intermittent though!

I think I may leave the code in the Hpricot.parse form, just to be sure,
until I have some time to more fully investigate the problem.
 
Allison Newman

Ahhh, ok, so I have to require 'hpricot' regardless of whether I have
already done a require_gem 'hpricot'. Seems a little odd, that! I'll
have to add it to my list of "not principle of least surprise" examples
:) Thank you
 
David Vallner

Allison said:
Ahhh, ok, so I have to require 'hpricot' regardless of whether I have
already done a require_gem 'hpricot'. Seems a little odd that! I'll
have to add it to my list of "not principle of least surprise" examples
:) Thank you

Not really, the two methods aren't supposed to do the same thing at all.

require_gem means load a gem with the name "hpricot", and then load the
library (usually .rb) files the gem's author presumes you'll want.
However, this is optional, and the gem's author can decide not to load
any files in the gem by default, in which case require_gem only serves
to determine the correct version. If anything, it is a little weird to
me that the decision of what code from a gem you want to use is up to
its authors and not its users, and I'm not aware of there being a
"prefer this version of a gem" method that's strictly orthogonal to
"load this library". But I digress.

Rubygems modifies the semantics of require so that it also searches
installed gems for the file you're trying to load. The require 'hpricot'
doesn't try to load a gem named hpricot; it tries to load the file
'hpricot.rb', whether that file is part of a gem installed on the
machine or sits in one of the "standard" library locations. The fact
that gem names are usually identical to the names of the files inside
them that you want to load makes this difference hard to spot.

This manifests itself with the FOX bindings, for example: the gem name
is 'fxruby', but the file to load is 'fox16.so', and the gem doesn't
automatically load it. To wit, an irb session demonstrating the behaviour:

irb(main):001:0> $VERBOSE = nil
=> nil
irb(main):002:0> require 'rubygems'
=> true
irb(main):003:0> require 'fxruby'
LoadError: no such file to load -- fxruby
        from C:/Ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `gem_original_require'
        from C:/Ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `require'
        from (irb):3
        from C:/Ruby/lib/ruby/1.8/time.rb:65
irb(main):004:0> require_gem 'fxruby'
=> true
irb(main):005:0> Fox
NameError: uninitialized constant Fox
        from (irb):5
        from C:/Ruby/lib/ruby/1.8/time.rb:65
irb(main):006:0> require 'fox16'
=> true
irb(main):007:0> Fox
=> Fox

David Vallner
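
The point that require resolves a file name on the load path, and not a package name, can be reproduced without installing any gem. The sketch below builds a throwaway directory for a hypothetical package named 'fancy_pkg' whose only library file is 'fancy16.rb' (both names are invented, mirroring the fxruby/fox16 split above), puts that directory on $LOAD_PATH by hand, and then tries both names. Adding the directory manually stands in for what activating a gem does; it is an approximation for illustration, not a description of how RubyGems is implemented.

```ruby
require 'tmpdir'
require 'fileutils'

# Returns [result of requiring the package name,
#          result of requiring the file name,
#          whether the library's constant is defined afterwards].
def try_both_names
  Dir.mktmpdir do |root|
    # Hypothetical package layout: fancy_pkg/lib/fancy16.rb
    lib_dir = File.join(root, 'fancy_pkg', 'lib')
    FileUtils.mkdir_p(lib_dir)
    File.write(File.join(lib_dir, 'fancy16.rb'), "module Fancy; end\n")

    # Roughly what activating a gem does: make its lib directory searchable.
    $LOAD_PATH.unshift(lib_dir)
    begin
      by_package_name =
        begin
          require 'fancy_pkg'   # no file with this name exists anywhere
          :loaded
        rescue LoadError
          :load_error
        end
      by_file_name = require 'fancy16'  # matches lib/fancy16.rb
      [by_package_name, by_file_name, defined?(Fancy)]
    ensure
      $LOAD_PATH.delete(lib_dir)
    end
  end
end

p try_both_names
```

Requiring the package name fails because no file of that name is on the load path, while requiring the file name succeeds and defines the constant.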


 
Giles Bowkett

David said:
Not really, the two methods aren't supposed to do the same thing at all.

"Principle of least surprise" just means that things are consistent,
so that once you're familiar with Ruby API X, you don't have to
completely rethink your assumptions to work with Ruby API Y. The
language *is* supposed to conform to your intuition, but only *after*
you've *developed* that intuition -- which you do by learning the
language. It's much more about internal consistency and sensible
design than creating an effortless learning curve.
 
_why

David said:
Not really, the two methods aren't supposed to do the same thing at all.

require_gem means load a gem with the name "hpricot", and then load the
library (usually .rb) files the gem's author presumes you'll want.

You are talking about "autorequire", David. Which is long gone.[1]

In other words, require_gem should no longer be used to load any Ruby code.
It should only be used to tell RubyGems which version you'll be needing.
You know?

_why

[1] http://redhanded.hobix.com/inspect/autorequireIsBasicallyGoneEveryone.html
 
David Vallner

_why said:
You are talking about "autorequire", David. Which is long gone.[1]

[1] http://redhanded.hobix.com/inspect/autorequireIsBasicallyGoneEveryone.html

Ah, that's the change to rubygems I meant. Woohoo, I wasn't
hallucinating :) Thanks!

David Vallner


 
