Regexp with Ruby

Discussion in 'Ruby' started by Ajay Vijey, Nov 15, 2006.

  1. Ajay Vijey

    Ajay Vijey Guest

    Hello all,

    I have to replace the image tags in a file with different ones.

    File Data:
    -------------
    <td bordercolor="#FFFFFF">
    <table border="0" id="table2" bgcolor="#FFFFFF" width="100%">
    <tr>
    <td align="left" valign="top" width="25%"><a href="../personal/po.htm">
    <img src="images/animationpundo_schwarz_kl.gif"
    alt="Personal und Organisation" border="0" width="84"
    height="64"></a></td>
    <td align="left" valign="top" width="25%"><font face="Arial"><a
    href="../personal/po.htm">

    I want to scan for this image tag:
    --------------------------
    <img src="images/animationpundo_schwarz_kl.gif"
    alt="Personal und Organisation" border="0" width="84" height="64">


    I already tested this with the following code:
    -----------------------------------------------
    ...scan(/<img.*>/m)
    and with
    ...scan(/<img.*?>/m)

    But the result was always:
    ----------------------------
    <img src="images/animationpundo_schwarz_kl.gif"
    alt="Personal und Organisation" border="0" width="84"
    height="64"></a></td>
    <td align="left" valign="top" width="25%"><font face="Arial"><a
    href="../personal/po.htm">


    I hope someone can help me! Thanks a lot!


    Kind Regards
    Ajay


    --
    Posted via http://www.ruby-forum.com/.
    Ajay Vijey, Nov 15, 2006
    #1

  2. Ajay Vijey

    Hugh Sasse Guest

    On Thu, 16 Nov 2006, Ajay Vijey wrote:

    > Hello all,
    >
    > I have to replace the image tags in a file with different ones.
    >
    > File Data:
    > -------------

    [trimmed]
    >
    > I want to scan for this image tag:
    > --------------------------
    > <img src="images/animationpundo_schwarz_kl.gif"
    > alt="Personal und Organisation" border="0" width="84" height="64">
    >
    >
    > I already tested this with the following code:
    > -----------------------------------------------
    > ...scan(/<img.*>/m)
    > and with
    > ...scan(/<img.*?>/m)
    >
    > But the result was always:
    > ----------------------------
    > <img src="images/animationpundo_schwarz_kl.gif"
    > alt="Personal und Organisation" border="0" width="84"
    > height="64"></a></td>
    > <td align="left" valign="top" width="25%"><font face="Arial"><a
    > href="../personal/po.htm">
    >
    >
    > I hope someone can help me! Thanks a lot!


    I'd agree with your choice of regexp. I think we need to see more of
    the surrounding code to fix this.

    > Kind Regards
    > Ajay


    Hugh
    Hugh Sasse, Nov 15, 2006
    #2

  3. Ajay Vijey

    Ajay Vijey Guest

    Hugh Sasse wrote:
    > I'd agree with your choice of regexp. I think we need to see more of
    > the surrounding code to fix this.



    rubyscript
    --------------
    datei_new = IO.read("index.htm")
    datei_regexp = datei_new.scan(/(<img.*>)/m)

    puts datei_regexp



    index.htm
    ------------

    <html>
    <head><title>test</title></head>
    <body>
    <table>
    <tr>
    <td bordercolor="#FFFFFF">

    <table border="0" id="table2" bgcolor="#FFFFFF" width="100%">
    <tr>

    <td align="left" valign="top" width="25%"><a href="../personal/po.htm">
    <img src="images/animationpundo_schwarz_kl.gif"
    alt="Personal und Organisation" border="0" width="84"
    height="64"></a></td>

    <td align="left" valign="top" width="25%"><font face="Arial"><a
    href="../personal/po.htm"></font></td>

    </tr>
    </table>

    </td>
    </tr>
    </table>
    </body>
    </html>




    Ajay Vijey, Nov 15, 2006
    #3
  4. Ajay Vijey

    hemant Guest

    On 11/16/06, Paul Lutus wrote:
    > Ajay Vijey wrote:
    >
    > > Hello all,
    > >
    > > I have to replace the image tags in a file with different ones.

    >
    > As another poster has pointed out, you aren't showing enough code for an
    > analysis, and, while you are replacing tags, please reformat your IMG tags
    > thus:
    >
    > <img src="..."/>
    >
    > Note the self-closing form. This won't bother older browsers, and it will
    > allow you to meet the newer (X)HTML standards as well.
    >
    > Here is a sample program that extracts all the IMG tags from a Web page (of
    > both the old and new varieties):
    >
    > ----------------------------------------
    > #!/usr/bin/ruby -w
    >
    > data = File.read("sample.html")
    >
    > extract = data.scan(%r{<img.*?/>}m)
    >
    > puts extract.join("\n")
    > ----------------------------------------
    >
    > This outputs from my sample page:
    >
    > <img src="../images/leftarrow.png" border="0" alt="" />
    > <img src="../images/rightarrow.png" border="0" alt="" />
    > <img src="rock_ptarmigan_chick_small.jpg" width="300" height="289" alt=""/>
    > <img src="pws_naked_island003_cropped_small.jpg" width="300" height="232"
    > alt=""/>
    > <img src="pws_naked_island011_cropped_small.jpg" width="300" height="225"
    > alt=""/>
    > <img src="pws_naked_island007_cropped_small.jpg" width="300" height="236"
    > alt=""/>
    > <img src="pws_naked_island012_small.jpg" width="300" height="200" alt=""/>
    > <img src="pws_naked_island013_cropped_small.jpg" width="300" height="236"
    > alt=""/>
    > <img src="../images/leftarrow.png" border="0" alt="" />
    > <img src="../images/rightarrow.png" border="0" alt="" />
    >
    > --
    > Paul Lutus
    > http://www.arachnoid.com
    >
    >
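    One caveat on the quoted pattern: %r{<img.*?/>}m only stops at "/>", so on
    old-style tags like the ones in index.htm it keeps going until the next
    "/>" it finds. Below is a minimal sketch of a pattern that accepts both
    forms. The sample markup and variable names are made up for illustration,
    and it assumes no attribute value contains a ">" character.

```ruby
# Hypothetical sample mixing an old-style and a self-closing IMG tag.
html = <<~HTML
  <td><a href="po.htm">
  <img src="a.gif"
  alt="first" border="0"></a></td>
  <td><img src="b.png" alt="" /></td>
HTML

# Requiring "/>" swallows everything from the first <img up to the first "/>".
merged = html.scan(%r{<img.*?/>}m)

# [^>]* cannot run past the closing ">", so each match ends at its own tag.
# A negated character class also matches newlines, so /m is not needed.
img_tags = html.scan(/<img\b[^>]*>/i)

puts merged.length    # one match spanning both tags
puts img_tags.length  # one match per tag
puts img_tags
```

    This is still only a regexp heuristic; comments, scripts or unusual
    quoting can fool it.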


    If I were to do this, I would use Hpricot.


    --
    There was only one Road; that it was like a great river: its springs
    were at every doorstep, and every path was its tributary.
    hemant, Nov 15, 2006
    #4
  5. Ajay Vijey

    Ken Bloom Guest

    Ajay Vijey wrote:
    > Hugh Sasse wrote:
    > > I'd agree with your choice of regexp. I think we need to see more of
    > > the surrounding code to fix this.

    >
    >
    > rubyscript
    > --------------
    > datei_new = IO.read("index.htm")
    > datei_regexp = datei_new.scan(/(<img.*>)/m)
    >
    > puts datei_regexp
    >
    >
    >
    > index.htm
    > ------------
    >
    > <html>
    > <head><title>test</title></head>
    > <body>
    > <table>
    > <tr>
    > <td bordercolor="#FFFFFF">
    >
    > <table border="0" id="table2" bgcolor="#FFFFFF" width="100%">
    > <tr>
    >
    > <td align="left" valign="top" width="25%"><a href="../personal/po.htm">
    > <img src="images/animationpundo_schwarz_kl.gif"
    > alt="Personal und Organisation" border="0" width="84"
    > height="64"></a></td>
    >
    > <td align="left" valign="top" width="25%"><font face="Arial"><a
    > href="../personal/po.htm"></font></td>
    >
    > </tr>
    > </table>
    >
    > </td>
    > </tr>
    > </table>
    > </body>
    > </html>


    Works for me with datei_new.scan(/(<img.*?>)/m). The .*? performs a
    non-greedy match, so it stops at the shortest match it can make
    rather than the longest.
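
    A minimal sketch of the greedy versus non-greedy difference. The inline
    markup is a shortened, made-up stand-in for index.htm, and the variable
    names are only illustrative.

```ruby
# Shortened stand-in for index.htm (hypothetical content).
html = <<~HTML
  <td><a href="po.htm">
  <img src="a.gif"
  alt="first" border="0"></a></td>
  <td><font face="Arial"><a href="po.htm"></font></td>
HTML

# Greedy: .* grabs as much as possible, so the match runs to the LAST ">".
greedy = html.scan(/<img.*>/m)

# Non-greedy: .*? grabs as little as possible, so the match ends at the
# first ">" after "<img".
lazy = html.scan(/<img.*?>/m)

puts greedy.first
puts "----"
puts lazy.first
```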

    The parentheses you have around the text of the regexp are unnecessary;
    they create a capture group, so scan returns each match nested in its
    own array. You should use /<img.*?>/m
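
    To illustrate the nesting, here is a small sketch using a made-up
    one-line string instead of the real file:

```ruby
# Hypothetical input with two IMG tags.
sample = '<p><img src="a.gif"><img src="b.gif"></p>'

# With a capture group, scan returns one inner array per match.
grouped = sample.scan(/(<img.*?>)/m)

# Without a group, scan returns the matched strings directly.
flat = sample.scan(/<img.*?>/m)

p grouped  # array of one-element arrays
p flat     # flat array of strings
```

    If the group is needed for some other reason, calling flatten on the
    result gives the same flat list.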

    --Ken Bloom
    Ken Bloom, Nov 15, 2006
    #5
