Regular expression

Discussion in 'Perl Misc' started by sk, Apr 10, 2007.

  1. sk

    sk Guest

    I'm trying to add an extra keyword before a specific word in an HTML document.
    In the following document I want to put "My" in front of every occurrence
    of the word "computer",
    excluding the id used by the stylesheet and the names of image files.

    <html>
    <head>
    <title>hello</titile>
    </head>
    <body>
    <div id="computer"> This is computer</div>
    <img src="images/computer.jpg" >Computer
    </body>
    </html>
     
    sk, Apr 10, 2007
    #1

  2. J. Gleixner

    J. Gleixner Guest

    sk wrote:
    > I m trying to add a extra keyword on a specific word in a html document
    > In the following document I want to put "My" in front of all the
    > words"computer"
    > excluding the id for the stylesheet and name of image files.
    >
    > <html>
    > <head>
    > <title>hello</titile>
    > </head>
    > <body>
    > <div id="computer"> This is computer</div>
    > <img src="images/computer.jpg" >Computer
    > </body>
    > </html>


    s/\scomputer/ My computer/;
     
    J. Gleixner, Apr 10, 2007
    #2

  3. On Tue, 10 Apr 2007 15:41:25 -0500, J. Gleixner wrote:

    > sk wrote:
    >> I m trying to add a extra keyword on a specific word in a html document
    >> In the following document I want to put "My" in front of all the
    >> words"computer"
    >> excluding the id for the stylesheet and name of image files.
    >>
    >> <html>
    >> <head>
    >> <title>hello</titile>
    >> </head>
    >> <body>
    >> <div id="computer"> This is computer</div>
    >> <img src="images/computer.jpg" >Computer
    >> </body>
    >> </html>

    >
    > s/\scomputer/ My computer/;


    will not work on:
    <img src="images/computer.jpg" >Computer
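
    A minimal sketch of why that line is missed, assuming the two body lines of the sample document are processed one at a time. The pattern needs a whitespace character directly before a lowercase "computer", and the img line has neither. The variable names are illustrative only.

```perl
use strict;
use warnings;

# The two body lines from the sample document.
my @lines = (
    '<div id="computer"> This is computer</div>',
    '<img src="images/computer.jpg" >Computer',
);

my @results;
for my $line (@lines) {
    # Copy the line, then apply the suggested substitution to the copy.
    (my $copy = $line) =~ s/\scomputer/ My computer/;
    push @results, $copy;

    my $label = $copy eq $line ? 'unchanged: ' : 'changed:   ';
    print "$label$copy\n";
}
```

    The div line changes because " computer" is preceded by a space. On the img line, "computer.jpg" is preceded by "/" and "Computer" starts with a capital C right after ">", so nothing matches.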
     
    4i4ko Trevi4ko, Apr 10, 2007
    #3
  4. sk

    Guest

    On Apr 10, 4:43 pm, 4i4ko Trevi4ko <> wrote:
    > On Tue, 10 Apr 2007 15:41:25 -0500, J. Gleixner wrote:
    > > sk wrote:
    > >> I m trying to add a extra keyword on a specific word in a html document
    > >> In the following document I want to put "My" in front of all the
    > >> words"computer"
    > >> excluding the id for the stylesheet and name of image files.

    >
    > >> <html>
    > >> <head>
    > >> <title>hello</titile>
    > >> </head>
    > >> <body>
    > >> <div id="computer"> This is computer</div>
    > >> <img src="images/computer.jpg" >Computer
    > >> </body>
    > >> </html>

    >
    > > s/\scomputer/ My computer/;

    >
    > will not work on:
    > <img src="images/computer.jpg" >Computer


    I got it through 2 regular expressions. Maybe someone can help me
    merge them together...

    $string =~ s/(>.*)(computer)(.*<)/\1my \2\3/igms;
    $string =~ s/(>.*)(computer)(.*<)/\1my \2\3/ig;
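
    One possible way to merge the two into a single pass, offered as a sketch only: let the first alternative swallow each tag whole and put it back, so the second alternative can only match outside tags. It assumes tags never contain ">" inside attribute values, and it does not know about comments or script blocks, so it is not a general HTML solution.

```perl
use strict;
use warnings;

my $string = <<'HTML';
<html>
<head>
<title>hello</titile>
</head>
<body>
<div id="computer"> This is computer</div>
<img src="images/computer.jpg" >Computer
</body>
</html>
HTML

# Alternative 1 consumes a whole tag and the /e code puts it back unchanged;
# alternative 2 only gets a chance to match in the text between tags.
$string =~ s{(<[^>]*>)|(computer)}{ defined $1 ? $1 : "my $2" }gie;

print $string;
```

    On the sample document this leaves id="computer" and images/computer.jpg alone and prefixes the two occurrences in the text.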
     
    , Apr 10, 2007
    #4
  5. sk

    sk Guest

    "J. Gleixner" <> wrote in message
    news:461bf675$0$496$...
    > sk wrote:
    >> I m trying to add a extra keyword on a specific word in a html document
    >> In the following document I want to put "My" in front of all the
    >> words"computer"
    >> excluding the id for the stylesheet and name of image files.
    >>
    >> <html>
    >> <head>
    >> <title>hello</titile>
    >> </head>
    >> <body>
    >> <div id="computer"> This is computer</div>
    >> <img src="images/computer.jpg" >Computer
    >> </body>
    >> </html>

    >
    > s/\scomputer/ My computer/;


    What is \s? A whole word? Is there another way to do this?
    Sometimes I define a function name like "GetComputer".
    I want to be able to change it to "GetMyComputer".
     
    sk, Apr 10, 2007
    #5
  6. sk

    Guest

    On Apr 10, 5:22 pm, "sk" <> wrote:
    > "J. Gleixner" <> wrote in message
    >
    > news:461bf675$0$496$...
    >
    >
    >
    > > sk wrote:
    > >> I m trying to add a extra keyword on a specific word in a html document
    > >> In the following document I want to put "My" in front of all the
    > >> words"computer"
    > >> excluding the id for the stylesheet and name of image files.

    >
    > >> <html>
    > >> <head>
    > >> <title>hello</titile>
    > >> </head>
    > >> <body>
    > >> <div id="computer"> This is computer</div>
    > >> <img src="images/computer.jpg" >Computer
    > >> </body>
    > >> </html>

    >
    > > s/\scomputer/ My computer/;

    >
    > what is \s? a whole word? Is there another way to do this?
    > Sometimes i define the function name like "GetComputer".
    > I want to be able to change to "GetMyComputer"


    \s matches a single whitespace character (space, tab, newline), not a whole word.
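
    A small sketch of both points, with made-up sample values: \s matches one whitespace character, and a name like "GetComputer" has no whitespace in front of "Computer", so a pattern without \s is needed there.

```perl
use strict;
use warnings;

# \s matches exactly one whitespace character; 'c' is the odd one out.
my @samples    = (' ', "\t", "\n", 'c');
my @whitespace = grep { /\A\s\z/ } @samples;
print scalar(@whitespace), " of ", scalar(@samples), " samples are whitespace\n";

# No whitespace precedes "Computer" here, so match the word itself.
my $name = 'GetComputer';
(my $renamed = $name) =~ s/Computer/MyComputer/;
print "$renamed\n";
```

    Note that a bare s/Computer/MyComputer/ would also hit "Computer" inside any other identifier, so in real code the pattern probably needs to be narrower.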
     
    , Apr 10, 2007
    #6
  7. Tony Curtis

    Tony Curtis Guest

    wrote:
    > On Apr 10, 4:43 pm, 4i4ko Trevi4ko <> wrote:
    >> On Tue, 10 Apr 2007 15:41:25 -0500, J. Gleixner wrote:
    >>> sk wrote:
    >>>> I m trying to add a extra keyword on a specific word in a html document
    >>>> In the following document I want to put "My" in front of all the
    >>>> words"computer"
    >>>> excluding the id for the stylesheet and name of image files.
    >>>> <html>
    >>>> <head>
    >>>> <title>hello</titile>
    >>>> </head>
    >>>> <body>
    >>>> <div id="computer"> This is computer</div>
    >>>> <img src="images/computer.jpg" >Computer
    >>>> </body>
    >>>> </html>
    >>> s/\scomputer/ My computer/;

    >> will not work on:
    >> <img src="images/computer.jpg" >Computer

    >
    > I got it through 2 regular expressions. Maybe someone can help me
    > merge them together...
    >
    > $string =~ s/(>.*)(computer)(.*<)/\1my \2\3/igms;
    > $string =~ s/(>.*)(computer)(.*<)/\1my \2\3/ig;


    A regex is the wrong (general) solution here. You need an HTML parser.

    hth
    t
     
    Tony Curtis, Apr 11, 2007
    #7
  8. sk wrote:
    > I m trying to add a extra keyword on a specific word in a html
    > document In the following document I want to put "My" in front of all
    > the words"computer"
    > excluding the id for the stylesheet and name of image files.


    As has been mentioned a gazillion times in this NG, parsing HTML correctly is
    not easy, and only a masochist would try to do it using REs. It is very easy
    to find sample HTML code that breaks any RE construct you can come up
    with. See the FAQ "How do I remove HTML from a string?" for some examples
    that you probably forgot to consider.

    Instead, just use a proper HTML parser from CPAN. They come ready-made and,
    most importantly, they actually work.
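
    As a rough sketch of what that could look like, here is one option using HTML::TokeParser from the HTML-Parser distribution on CPAN (the thread does not name a specific module, so this choice is an assumption). Only text tokens are edited; every other token is copied through as it appeared in the source. The head section of the sample is left out to keep the example short.

```perl
use strict;
use warnings;
use HTML::TokeParser;   # part of the HTML-Parser distribution on CPAN

my $html = <<'HTML';
<html>
<body>
<div id="computer"> This is computer</div>
<img src="images/computer.jpg" >Computer
</body>
</html>
HTML

my $parser = HTML::TokeParser->new(\$html) or die "cannot create parser";
my $out = '';

while (my $token = $parser->get_token) {
    if ($token->[0] eq 'T') {
        # Text token: [ 'T', $text, $is_data ]. Only here do we substitute.
        (my $text = $token->[1]) =~ s/\b(computer)\b/My $1/gi;
        $out .= $text;
    }
    else {
        # For start, end, comment, declaration and PI tokens the original
        # source text is the last element, so copy it through untouched.
        $out .= $token->[-1];
    }
}

print $out;
```

    Attribute values such as id="computer" and the image file name live inside start-tag tokens, so the substitution never sees them.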

    jue
     
    Jürgen Exner, Apr 11, 2007
    #8
  9. On 2007-04-10 21:17, <> wrote:
    > On Apr 10, 4:43 pm, 4i4ko Trevi4ko <> wrote:
    >> On Tue, 10 Apr 2007 15:41:25 -0500, J. Gleixner wrote:
    >> > sk wrote:
    >> >> I m trying to add a extra keyword on a specific word in a html
    >> >> document In the following document I want to put "My" in front of
    >> >> all the words"computer" excluding the id for the stylesheet and
    >> >> name of image files.

    >>
    >> >> <html>
    >> >> <head>
    >> >> <title>hello</titile>
    >> >> </head>
    >> >> <body>
    >> >> <div id="computer"> This is computer</div>
    >> >> <img src="images/computer.jpg" >Computer
    >> >> </body>
    >> >> </html>

    >>
    >> > s/\scomputer/ My computer/;

    >>
    >> will not work on:
    >> <img src="images/computer.jpg" >Computer

    >
    > I got it through 2 regular expressions. Maybe someone can help me
    > merge them together...
    >
    > $string =~ s/(>.*)(computer)(.*<)/\1my \2\3/igms;


    That will match
    $1: From the first ">" to the ">" in the img tag in line 7.
    $2: "Computer" in line 7
    $3: From the newline after "Computer" to the opening "<" in
    "</html".

    In other words it will only replace the last "Computer" in the file.
    The "g" is ineffective, since there can be only one last "computer".
    The "m" doesn't seem to have any effect either.
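
    The behaviour described above can be checked with a short sketch on the sample document. The replacement uses $1 instead of \1 only to avoid the "\1 better written as $1" warning, and the "m" flag is dropped since it changes nothing here.

```perl
use strict;
use warnings;

my $string = <<'HTML';
<html>
<head>
<title>hello</titile>
</head>
<body>
<div id="computer"> This is computer</div>
<img src="images/computer.jpg" >Computer
</body>
</html>
HTML

# With /s the greedy .* runs across newlines, so the first group grabs
# everything up to the LAST "computer" that is still followed by a "<".
my $count = ($string =~ s/(>.*)(computer)(.*<)/$1my $2$3/igs);

print "substitutions: $count\n";
print $string;
```

    Only the "Computer" after the img tag gets the prefix; the div line is left as it was.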

    > $string =~ s/(>.*)(computer)(.*<)/\1my \2\3/ig;


    This will replace any "computer" after a ">" and before a "<" on the
    same line. In the example file there is only one "computer" left which
    matches that criterion, but that doesn't seem guaranteed.
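
    A sketch of a case where that guarantee fails, using a made-up line that is not in the original document: two occurrences between the same pair of tags. The greedy first group again skips to the last one.

```perl
use strict;
use warnings;

# Hypothetical line with two occurrences between the same pair of tags.
my $line = '<p>one computer, two computer</p>';

# Without /s the match stays on one line, but .* is still greedy.
my $count = ($line =~ s/(>.*)(computer)(.*<)/$1my $2$3/ig);

print "substitutions: $count\n";
print "$line\n";
```

    The first "computer" is swallowed by the greedy group and never gets the prefix, even with the "g" flag.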

    hp

    --
    _ | Peter J. Holzer | I know I'd be respectful of a pirate
    |_|_) | Sysadmin WSR | with an emu on his shoulder.
    | | | |
    __/ | http://www.hjp.at/ | -- Sam in "Freefall"
     
    Peter J. Holzer, Apr 15, 2007
    #9
  10. Peter J. Holzer <> wrote:
    > On 2007-04-10 21:17, <> wrote:



    >> $string =~ s/(>.*)(computer)(.*<)/\1my \2\3/igms;



    > The "m" doesn't seem to have any effect either.



    "m" only affects the ^ and $ anchors.

    So it is a no-op for patterns that do not contain those anchors.
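
    A minimal sketch of that difference, with an invented two-line string: the anchored pattern behaves differently with and without "m", while the unanchored substitution gives the same result either way.

```perl
use strict;
use warnings;

my $text = "first\nsecond\n";

# Anchored pattern: without /m, ^ and $ refer to the whole string only.
my @without_m = $text =~ /^(\w+)$/g;
my @with_m    = $text =~ /^(\w+)$/mg;

print scalar(@without_m), " match(es) without /m\n";
print scalar(@with_m), " match(es) with /m\n";

# No anchors in the pattern, so /m makes no difference.
(my $plain = $text) =~ s/second/2nd/;
(my $multi = $text) =~ s/second/2nd/m;
print "same result with and without /m\n" if $plain eq $multi;
```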

    (but no sensible person would adopt code that generates easy-to-avoid
    warnings anyway.)


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 15, 2007
    #10
