Nokogiri does not read HTML file on CentOS 32-bit

Priyank Shah

Hi all,

I want to read and use the contents of an HTML file with Nokogiri on 32-bit
CentOS, but it does not read any of the HTML content.

@test = Nokogiri::HTML("abc.html")
puts "#{@test}"

But it only shows me the JavaScript from the page source, not any of the HTML content.

Please reply if anyone knows about this issue.


Thanks,
Priyank Shah
 
Ryan Davis

@test = Nokogiri::HTML("abc.html")

503 % ri Nokogiri.HTML
= Nokogiri.HTML

(from gem nokogiri-1.4.3.1)
----------------------------------------------------------------------------
HTML(thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_HTML, &block)

----------------------------------------------------------------------------
 
Priyank Shah

Ryan Davis wrote in post #961099:
503 % ri Nokogiri.HTML
= Nokogiri.HTML

(from gem nokogiri-1.4.3.1)


Hi,

Thanks for the reply, but I still have no solution. I get only

<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" .....

as output, not the actual HTML contents of the file.

I checked Nokogiri, and I think it is an HTML character set encoding
issue.

Can you give me some idea about this?

Thanks,
Priyank Shah
 
Florian Gilcher

What Ryan is telling you: you have to pass a file pointer or the actual HTML
as a string, not a string containing a filename.
 
Priyank Shah

Florian Gilcher wrote in post #961470:
What Ryan is telling you: you have to pass a filepointer or the actual
HTML as string, not a string containing a filename.


Hi,

Thanks for explaining, but I still get the same problem.

I am using the following on 32-bit CentOS 5.5:

$> nokogiri -v

Ruby

engine: mri
version: 1.8.7
platform: i686-linux

libxml:

loaded: 2.6.26
binding: extension
compiled: 2.6.26
nokogiri: 1.4.3.1


--------
My code looks like this:

f = File.open("test.html")
data = Nokogiri::HTML(f)
puts "#{data}"

p "#{data}"

But both of these give

Output:

"<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" .......

This is the kind of output it shows; I do not get the actual HTML contents.

Please help me if you have any other ideas.

Thanks,
Priyank Shah
 
Priyank Shah

Niklas Cathor wrote in post #961490:
Can't reproduce your problem. Try this:

require 'rubygems'
require 'nokogiri'
# make sure the file contains something
File.open('test.html', 'w') { |f| f.write("<html><body><h1>Foo</h1></body></html>") }

f = File.open('test.html')
data = Nokogiri::HTML(f)
puts data
p data

----- OUTPUT ------

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><h1>Foo</h1></body></html>
#<Nokogiri::HTML::Document:0x3ff244a4fb70 name="document"
children=[#<Nokogiri::XML::DTD:0x3ff244ad5e14 name="html">,
#<Nokogiri::XML::Element:0x3ff244adf5b8 name="html"
children=[#<Nokogiri::XML::Element:0x3ff244b50e0c name="body"
children=[#<Nokogiri::XML::Element:0x3ff244b50b28 name="h1"
children=[#<Nokogiri::XML::Text:0x3ff244b508a8 "Foo">]>]>]>]>




Hi,

First, thanks to all for helping me with my problem.

I finally got the solution.

I tried

f = open("test.html").read
data = Nokogiri::HTML(f)
puts data

instead of

f = File.open("test.html")
data = Nokogiri::HTML(f)
puts data

and I get the HTML data.

So basically I do not use the File class.

Thanks,
Priyank Shah
 
