Extracting the link text

Discussion in 'Perl Misc' started by Fritz Bayer, Apr 28, 2005.

  1. Fritz Bayer

    Hi,

    I would like to extract all the links from an HTML page that I have
    stored in a string variable.

    For each link, I would also like to print the link text, but with
    ALL tags in which the text could be embedded stripped out.

    I'm looking for a regular expression that does just that. Can
    somebody help me out?

    Fritz
    Fritz Bayer, Apr 28, 2005
    #1

  2. Fritz Bayer wrote:
    > I would like to extract all the links from a html page [...]
    > I'm looking for a regular expression, which does just that. Can
    > somebody help me out?


    perldoc -q "remove html"

    Chris
    , Apr 28, 2005
    #2

  3. Fritz Bayer wrote:

    > Hi,
    >
    > I would like to extract all the links from a html page, which I store
    > in a string variable.
    >
    > For each link, I would also like to print out the link text, however,
    > omitting ALL possible tags in which the text could be embedded.


    Use one of the modules for parsing HTML.
    >
    > I'm looking for a regular expression, which does just that.


    No, you aren't, because there ain't no such thing.

    > Can
    > somebody help me out?
    >
    > Fritz


    --
    Christopher Mattern

    "Which one you figure tracked us?"
    "The ugly one, sir."
    "...Could you be more specific?"
    Chris Mattern, Apr 28, 2005
    #3
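    To illustrate why a single regular expression tends to fall short
    here, the following is a minimal sketch, written in Python rather
    than Perl purely for illustration. The sample markup and the pattern
    are made up for this example and do not come from the thread. The
    pattern assumes that href is the first attribute and is
    double-quoted, which real pages do not guarantee.

```python
import re

# Hypothetical sample markup (not from the thread): the first link wraps its
# text in a nested tag, the second puts href after another attribute and
# uses single quotes.
html = '<p><a href="/a"><b>Bold</b> link</a> and <a class="x" href=\'/b\'>plain</a></p>'

# A naive pattern: assumes href comes first and is double-quoted.
naive = re.compile(r'<a\s+href="([^"]*)"[^>]*>(.*?)</a>', re.I | re.S)

matches = naive.findall(html)
print(matches)
# The second link is missed entirely, and the captured text of the first
# link still contains the nested <b> tags.
```

    Each of these gaps can be patched individually, but every patch
    makes the pattern longer while other valid HTML variations remain
    uncovered, which is the usual argument for a parser module.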
  4. Chris Mattern wrote:
    > Fritz Bayer wrote:
    >> I would like to extract all the links from a html page, which I store
    >> in a string variable.
    >>
    >> For each link, I would also like to print out the link text, however,
    >> omitting ALL possible tags in which the text could be embedded.

    >
    > Use one of the modules for parsing HTML.


    HTML::LinkExtor sounds promising. :)

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Apr 28, 2005
    #4
  5. wrote:
    > Fritz Bayer wrote:
    >>I would like to extract all the links from a html page [...]
    >>I'm looking for a regular expression, which does just that. Can
    >>somebody help me out?

    >
    > perldoc -q "remove html"


    Better yet:

    perldoc -q "extract URLs"

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Apr 28, 2005
    #5
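    The parser-based approach recommended in this thread can be sketched
    as follows. This is an illustration in Python using the standard
    library's html.parser, not the Perl modules named above, and the
    names LinkTextExtractor, extract_links and sample are hypothetical.
    It collects each link's href together with its text, dropping any
    tags nested inside the link and collapsing runs of whitespace.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._chunks = []    # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Only anchors that actually carry an href count as links.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._chunks = []

    def handle_data(self, data):
        # Nested tags such as <b> or <i> never reach this method,
        # so only the plain text is collected.
        if self._href is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join the pieces and collapse whitespace runs to single spaces.
            text = " ".join("".join(self._chunks).split())
            self.links.append((self._href, text))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) pairs found in an HTML string."""
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup, not taken from the thread.
sample = '<p><a href="/a"><b>Bold</b> <i>link</i></a> and <a class="x" href=\'/b\'>plain</a></p>'
for href, text in extract_links(sample):
    print(href, "->", text)
```

    Because the parser handles attribute order, quoting style and
    character references itself, the extraction code only has to track
    whether it is currently inside a link.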