Retrieve Description/Meta tags from a website as well as remove HTML

Discussion in 'ASP .Net' started by Mark, Jun 24, 2005.

  1. Mark

    Mark Guest

    Hi all, does anyone know of a nice utility/class which will allow me to
    retrieve the details of a webpage?

    Specifically, I would like to be able to retrieve the HTML and then call a
    method which would retrieve the meta tags:
    Keywords
    Description
    as well as another method which removes all the HTML from the string,
    starting at the body tag.

    Does one exist? I know I can write one using regular expressions etc., but
    I'd rather not reinvent the wheel :)

    Thanks
    Mark
    Mark, Jun 24, 2005
    #1

  2. JV

    JV Guest

    I assume you mean programmatically, since you can obviously hand-edit in VS
    or even just NOTEPAD.

    I had to do something like this to work around the VS bug where it
    occasionally eats the closing tag on a <link> tag. I didn't do a whole lot
    of research but here is what I can tell you.

    1) The HTML parsers I found were expensive. I didn't find a free one, at
    least not one that was useful.
    2) Sometimes people use the IE browser control for DOM access, but I found
    it to be pretty clunky for my purposes.
    3) You can't really load it in an XML document because the HTML is rarely
    well-formed XML (though maybe in VS2005 using XHTML it will be?)
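
    To illustrate point 3, here is a minimal sketch in Python (an assumption,
    since the thread itself contains no code; the same idea carries over to
    .NET). It shows a strict XML parser rejecting ordinary HTML with unclosed
    tags, while a tolerant HTML parser still reports every tag. SAMPLE,
    parses_as_xml and start_tags are made-up names for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Ordinary HTML: <meta> and <br> are never closed, so this is not well-formed XML.
SAMPLE = (
    "<html><head><meta name='description' content='Demo'></head>"
    "<body>Hi<br>there</body></html>"
)


def parses_as_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Tolerant HTML parser that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(markup):
    """List the start tags found by the tolerant parser, in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

    With this sketch, parses_as_xml(SAMPLE) is False, yet start_tags(SAMPLE)
    still finds the meta tag, which is why a dedicated HTML parser is the
    usual choice for this job.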

    I ended up doing some of my own string parsing since my need was relatively
    simple.
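
    For reference, a rough sketch of the two methods asked for in the original
    post: one that collects the meta tags (Keywords, Description) and one that
    returns the body text with the HTML removed. It is written in Python with
    the standard html.parser module purely as an illustration, not as the
    approach anyone in this thread actually used. PageDetails, page_details
    and SAMPLE_HTML are hypothetical names.

```python
from html.parser import HTMLParser


class PageDetails(HTMLParser):
    """Collects <meta name=... content=...> pairs and the text inside <body>."""

    SKIP = {"script", "style"}  # their contents are not visible text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.meta = {}
        self._chunks = []
        self._in_body = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            # Meta names are case-insensitive in practice, so normalise them.
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "body":
            self._in_body = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_body and not self._skip_depth:
            self._chunks.append(data)

    def body_text(self):
        # Join the text chunks and collapse runs of whitespace.
        return " ".join("".join(self._chunks).split())


def page_details(html):
    """Return (meta dict, body text with tags removed) for an HTML string."""
    parser = PageDetails()
    parser.feed(html)
    parser.close()
    return parser.meta, parser.body_text()


SAMPLE_HTML = """<html><head><title>T</title>
<meta name="Keywords" content="asp, html">
<meta name="description" content="A demo page">
<style>p { color: red }</style></head>
<body><h1>Hello</h1>
<p>Some <b>bold</b> text &amp; more.</p><script>var x = 1;</script></body></html>"""

meta, text = page_details(SAMPLE_HTML)
print(meta.get("keywords"), "|", meta.get("description"))
print(text)
```

    One known limitation of this sketch: adjacent block elements with no
    whitespace between them (for example "<p>One</p><p>Two</p>") run together
    in the output, so a real utility would want to insert breaks at block
    tags.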

    "Mark" <> wrote
    > Hi all, does anyone know of a nice utility/ class which will allow me to
    > retrieve the details of a webpage?
    JV, Jun 24, 2005
    #2

  3. Wilbur Slice

    Wilbur Slice Guest

    On Fri, 24 Jun 2005 15:59:03 +1200, "Mark" <> wrote:

    >Hi all, does anyone know of a nice utility/ class which will allow me to
    >retrieve the details of a webpage?

    Yeah, take a look at this:


    http://www.codefluent.com/smourier/download/htmlagilitypack.zip
    Wilbur Slice, Jun 24, 2005
    #3
  4. Mark

    Mark Guest

    Thanks for your help guys :)
    Cheers
    Mark
    Mark, Jun 25, 2005
    #4
