What's the matter with my program: Web Analysis

Pen Ttt

HTMLRegexp =/(<!--.*?--\s*>)|
(<(?:[^"'>]*|"[^"]*"|'[^']*')+>)|
([^<]*)/xm

data =DATA.read

data.scan(HTMLRegexp){|match|
comment,tag,tdata=match[0..2]
if comment
p ["Comment",comment]
elseif tag
p ["Tag",tag]
elseif tdata
tdata.gsub!(/\s+/,"")
tdata.sub!(/ $/,"")
p [ "TextData",tdata] unless tdata.empty?

end
}
_END_
<!DOCTYPE HTML>
<HTML>
<BODY>
< A name="FOO" href="foo" attr >foo</A>
< A name="BAR" href="bar" attr >bar</A>
< A name=BAZ href=baz attr >baz</A>
<!--
<A href="dummy">dummy</A>
-->
<BODY>
</HTML>

I ran it, and the output is:
syntax error, unexpected '<', expecting $end
<!DOCTYPE HTML>
^
What's the problem? How can I solve it?
 
Thomas Preymesser

It's __END__ (two underscores on each side), not _END_.

-Thomas
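
A minimal sketch of the difference, assuming a `ruby` interpreter is available to run two throwaway scripts. `run_script` is a hypothetical helper written for this illustration, not part of the original program. Only a line containing exactly `__END__` stops the parser and exposes the rest of the file through `DATA`; with `_END_`, Ruby keeps parsing and trips over the HTML.

```ruby
require "tempfile"
require "open3"
require "rbconfig"

# Hypothetical helper: writes the source to a temporary file, runs it as a
# standalone script, and returns [combined output, success flag].
def run_script(source)
  Tempfile.create(["end_marker", ".rb"]) do |f|
    f.write(source)
    f.flush
    out, status = Open3.capture2e(RbConfig.ruby, f.path)
    [out, status.success?]
  end
end

# Correct marker: everything after the __END__ line is readable via DATA.
good = "print DATA.read\n__END__\n<HTML>\n"
# Wrong marker: _END_ is just an identifier, so "<HTML>" is parsed as Ruby code.
bad  = "print DATA.read\n_END_\n<HTML>\n"

good_out, good_ok = run_script(good)
bad_out,  bad_ok  = run_script(bad)

puts good_out   # the text that followed __END__
puts bad_ok     # false: the script failed with a syntax error
```

Note that `DATA` is only defined for the main script file, which is why the sketch runs each variant as its own script.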

 
Thomas Preymesser

[Note: parts of this message were removed to make it a legal post.]

Pen said:
I changed _END_ to __END__, but it made no difference.
Please run it on your computer to see what happens. Thank you.
 
Brian Candler

Pen said:
Please run it on your computer to see what happens. Thank you.

z.rb:11: undefined method `elseif' for main:Object (NoMethodError)

That's trying to tell you something.
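
A short sketch of what that error message suggests: Ruby's keyword is `elsif`, and since `elseif` is not a keyword, Ruby treats it as a call to a method of that name. The `classify` method and its arguments are made up for this illustration.

```ruby
# Correct spelling: `elsif` chains the branches of one conditional.
def classify(comment, tag, tdata)
  if comment
    "Comment"
  elsif tag
    "Tag"
  elsif tdata
    "TextData"
  end
end

puts classify(nil, "<BODY>", nil)   # prints "Tag"

# `elseif` is parsed as an ordinary method call, so calling it raises
# NoMethodError because no such method is defined.
missing_method = nil
begin
  elseif(true)
rescue NoMethodError => e
  missing_method = e.name
end
p missing_method   # prints :elseif
```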
 
Pen Ttt

There were two bugs in my first program:
1. It's __END__, not _END_.
2. It's "elsif", not "elseif".
I changed my program to:
#the filename is: /home/pt/htmlscan_test.rb
HTMLRegexp =/(<!--.*?--\s*>)|
(<(?:[^"'>]*|"[^"]*"|'[^']*')+>)|
([^<]*)/xm

data =DATA.read

data.scan(HTMLRegexp){|match|
comment,tag,tdata=match[0..2]
if comment
p ["Comment",comment]
elsif tag
p ["Tag",tag]
elsif tdata
tdata.gsub!(/\s+/,"")
tdata.sub!(/ $/,"")
p [ "TextData",tdata] unless tdata.empty?
end
}
__END__
<!DOCTYPE HTML>
<HTML>
<BODY>
< A name="FOO" href="foo" attr >foo</A>
< A name="BAR" href="bar" attr >bar</A>
< A name=BAZ href=baz attr >baz</A>
<!--
<A href="dummy">dummy</A>
-->
<BODY>
</HTML>

There is another problem, too.
It runs in NetBeans IDE 6.8 and gives the correct output:
["Tag", "<!DOCTYPE HTML>"]
["Tag", "<HTML>"]
["Tag", "<BODY>"]
["Tag", "< A name=\"FOO\" href=\"foo\" attr >"]
["TextData", "foo"]
["Tag", "</A>"]
["Tag", "< A name=\"BAR\" href=\"bar\" attr >"]
["TextData", "bar"]
["Tag", "</A>"]
["Tag", "< A name=BAZ href=baz attr >"]
["TextData", "baz"]
["Tag", "</A>"]
["Comment", "<!--\n <A href=\"dummy\">dummy</A>\n -->"]
["Tag", "<BODY>"]
["Tag", "</HTML>"]

But when I run it in a terminal:
pt@pt-laptop:~$ ruby /home/pt/htmlscan_test.rb
/home/pt/htmlscan_test.rb:20: syntax error, unexpected '<', expecting $end
<!DOCTYPE HTML>
^

What's the matter?
Can you try it on your computer?
Please help me.
 
Josh Cheek

It works for me on Ruby 1.8.6, 1.8.7, and 1.9.1:
http://img13.imageshack.us/img13/122/picture1svu.png
 
Pen Ttt

I restarted my computer, ran the script, and got the correct output.
There is still something I want to know. What is the meaning of

data.scan(HTMLRegexp){|match|
comment,tag,tdata=match[0..2]

in the script? It's difficult for me to understand.
I would like to see some material about the String#scan method.
Can you explain it for me?
 
Josh Cheek

[Note: parts of this message were removed to make it a legal post.]

The String#scan docs are here (this is for Ruby 1.8.6):
http://ruby-doc.org/core/classes/String.html#M000812
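
As a rough illustration of those two lines: when the pattern contains capture groups, `scan` yields one array per match, with one element per group and `nil` for groups that did not take part in the match. The assignment then unpacks that array into three variables. The `TOKEN` pattern below is a simplified stand-in invented for this sketch, not the `HTMLRegexp` from the script.

```ruby
# Simplified, hypothetical pattern with three alternative capture groups:
# group 1 = comment, group 2 = tag, group 3 = text between tags.
TOKEN = /(<!--.*?-->)|(<[^>]+>)|([^<]+)/m

tokens = []
"<B>hi</B><!-- note -->".scan(TOKEN) do |match|
  # match is an array such as [nil, "<B>", nil]; with three groups,
  # match[0..2] is the whole array, so plain `match` unpacks the same way.
  comment, tag, text = match[0..2]
  if comment
    tokens << [:comment, comment]
  elsif tag
    tokens << [:tag, tag]
  elsif text
    tokens << [:text, text]
  end
end

p tokens
# [[:tag, "<B>"], [:text, "hi"], [:tag, "</B>"], [:comment, "<!-- note -->"]]

# Without a block, scan returns the same per-match arrays in one list.
p "<B>hi".scan(TOKEN)   # [[nil, "<B>", nil], [nil, nil, "hi"]]
```

Exactly one of the three variables is non-nil for each match, which is why the `if`/`elsif` chain can tell comments, tags, and text apart.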
 
