Still Query Continues

Newb Newb

Thanks for taking the time to reply.
My problem is not solved yet. I am extracting these images
from other web pages.
Using Hpricot I got all the image URLs. My code is below:

doc = Hpricot.parse(item.description)
imgs = doc.search("//img")
@src_array = imgs.collect { |img| "<img src=\"#{img.attributes["src"]}\">" }

That gives me all the image URLs. But, for example, I also get image URLs
like these:

<img src="http://www.ingolfwetrust.com/golf-central/content/binary/Craig-Stadler-Davis-Love-II.jpg">

<img src="http://www.ingolfwetrust.com/golf-central/aggbug.ashx?id=0ab690de-a81c-4470-8539-7f9ce0f75ee3">

In this case only the first one is a valid image URL, because it has a .jpg
file extension. So I need to display only the image URLs that have a
.jpg, .png or .gif extension.

Is this possible using regular expressions?
Please help me understand.
 
Sandor Szücs

In this case only the first one is a valid image URL, because it has a .jpg
file extension. So I need to display only the image URLs that have a
.jpg, .png or .gif extension.

Is this possible using Regular Expressions?

Yes.

irb> a = %w{abc.jpg def ghi.png jkl.pngjpg mnp.bpng}
=> ["abc.jpg", "def", "ghi.png", "jkl.pngjpg", "mnp.bpng"]
irb> a.select { |w| w.match(/\.(png|jpg|gif)\z/) }
=> ["abc.jpg", "ghi.png"]
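
A pattern anchored at the end of the string stops working as soon as a URL carries a query string (for example pic.jpg?size=1). A minimal sketch of an alternative, assuming the URLs are already in an array: parse each one with the standard uri library and compare the extension of its path. The names IMAGE_EXTENSIONS and image_url? are made up for this illustration.

```ruby
require "uri"

# Extensions the original question asks for (illustrative constant name).
IMAGE_EXTENSIONS = %w[.jpg .png .gif].freeze

# True when the path part of the URL ends in one of the extensions above.
# The query string is ignored because only URI#path is inspected.
def image_url?(url)
  path = URI.parse(url).path.to_s
  IMAGE_EXTENSIONS.include?(File.extname(path).downcase)
rescue URI::InvalidURIError
  false
end

urls = [
  "http://www.ingolfwetrust.com/golf-central/content/binary/Craig-Stadler-Davis-Love-II.jpg",
  "http://www.ingolfwetrust.com/golf-central/aggbug.ashx?id=0ab690de-a81c-4470-8539-7f9ce0f75ee3"
]

image_urls = urls.select { |u| image_url?(u) }
puts image_urls
```

Only the first URL is kept here, because the path of the second one ends in .ashx.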
Please help me understand.

If you want to understand more, you should read more documentation and
Wikipedia articles on your topic. Also test your expressions carefully in an irb
session. It often helps to write a simple test to understand your problem.

In my opinion a great resource for regular expressions is
http://www.regular-expressions.info

Maybe the wikibook on Ruby will help you in the future:
http://en.wikibooks.org/wiki/Ruby_Programming

Please read the fine manuals on the web before writing forum entries.

regards, Sandor Szücs
--
 
Newb Newb

Again I have to start from scratch. I ran into a problem right at the
beginning. What I actually want is to extract the image tags that
contain image file extensions like .jpg or .png. Currently I am
using this regex (/<img.*?>/), but it also gives me img tags without a .jpg or
.png file extension.
So please help me, everyone. I am really struggling.
Thank you all very much for your help.
 
Sandor Szücs

Again I have to start from scratch. I ran into a problem right at the
beginning. What I actually want is to extract the image tags that
contain image file extensions like .jpg or .png.

You have defined the last part of your target: it ends with .jpg or .png.

Currently I am
using this regex (/<img.*?>/), but it also gives me img tags without a .jpg or
.png file extension.

That regular expression doesn't match the ending: .jpg or .png

Think about what your regex will match. Try it yourself with irb.
You want to match a URL string with a specific ending.
Try to match the start, the end and all characters in between, without
matching the characters before and after your target.

Your string looks like this: "<img src='http://host.domain.tld/path/to/pic.png' alt='test'/>"
What you want to match is just http://host.domain.tld/path/to/pic.png

What characters are allowed in your target?
What substrings should be a part of the string you want to match?
What characters are the boundaries that you don't want to match?

Please try to solve the problem yourself. You should learn to think
about the problem you want to solve, but I have included one solution.

Don't just copy and paste a solution. ;)

regards, Sandor Szücs
--

P.S.
What characters are allowed in your target?
a-zA-Z0-9:_%\-\.\/
What substrings should be a part of the string you want to match?
The last part has to be: \.png or \.jpg or \.gif
What characters are the boundaries that you don't want to match?
In my example above it is ' . Is this character a part of our target?
No!
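
The three answers can be put together into one pattern. The following is only a sketch under those answers: the constant names URL_CHARS, IMAGE_ENDING and IMAGE_URL are made up for illustration, and the lookahead for a closing quote is one possible way to express "the quote is a boundary, not part of the target".

```ruby
# Characters allowed inside the target URL (taken from the first answer).
URL_CHARS = %r{[a-zA-Z0-9:_%\-./]}
# Required ending of the target (taken from the second answer).
IMAGE_ENDING = /\.(?:png|jpg|gif)/
# The URL must be followed by a quote character. The lookahead checks for
# the quote without making it part of the match (third answer).
IMAGE_URL = /#{URL_CHARS}*#{IMAGE_ENDING}(?=['"])/

tag = "<img src='http://host.domain.tld/path/to/pic.png' alt='test'/>"
puts tag[IMAGE_URL]
```

String#[] with a regex returns the matched substring, or nil when nothing matches, so a tag whose src points to something like aggbug.ashx?id=... yields nil.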

Here is an example:
I downloaded the HTML file of http://rubyforge.org/

irb> s = File.read("RubyForge_ Welcome.html")
irb> s.split(' ').each do |w|
irb>   t = w.match(/[a-zA-Z0-9_:%\-\/\.]*\.(png|jpg|gif)/)
irb>   puts t if t
irb> end
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/header-bg.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/controls-bg.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/tabs-bg.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/inner-tabs-bg.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/active-tab-bg.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/active-inner-tab-bg.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/bottom-fade-bg.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/controls-left.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/header.png
/images/lsrc_2008_logo.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/clear.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/clear.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/clear.png
http://static.rubyforge.vm.bytemark.co.uk/themes/rubyforge/images/clear.png
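
The session above prints bare URLs. If the goal is to keep whole img tags, as in the original question, one possible variation is to first collect every img tag and then keep only those whose src ends in an image extension. This is a sketch, not a robust HTML parser: the sample HTML reuses the two tags from the question, and IMG_TAG and IMAGE_SRC are illustrative names.

```ruby
html = <<~HTML
  <p><img src="http://www.ingolfwetrust.com/golf-central/content/binary/Craig-Stadler-Davis-Love-II.jpg">
  <img src="http://www.ingolfwetrust.com/golf-central/aggbug.ashx?id=0ab690de-a81c-4470-8539-7f9ce0f75ee3"></p>
HTML

# Any img tag, from "<img" up to the next ">".
IMG_TAG = /<img\b[^>]*>/i
# A src attribute whose quoted value ends in .jpg, .png or .gif.
IMAGE_SRC = /\bsrc\s*=\s*["'][^"']*\.(?:jpg|png|gif)["']/i

# scan returns every img tag as a string; select keeps the image ones.
image_tags = html.scan(IMG_TAG).select { |tag| tag.match?(IMAGE_SRC) }
puts image_tags
```

Splitting the work into two small patterns keeps each one easy to test in irb on its own.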
 
