WWW CMS: filtering actual(ly relevant) content

Discussion in 'XML' started by lbrtchx@gmail.com, Feb 15, 2008.

  1. Guest

    ~
    I was wondering how the actual content of pages out there on the net
    gets filtered and kept track of, and how helpful current protocols
    and web servers are for that.
    ~
    Only on pages designed around 1994-95 could you use the Last-Modified
    response header to get an idea of whether something might have
    changed on the page. Current pages on almost all sites are googled,
    syndicated or just filled up with an incredible amount of clutter and
    nonsense. This makes searching the net a time-consuming and not so
    reliable endeavor, among many other reasons because of page
    contextualization: if you search for, say, CSS, you may find lots of
    pages that just have the acronym "CSS" in a left frame as a jump-off
    link to another page, or where it was probably included as a credit
    ("css" designed by ...) in the page's footer.
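
To make the Last-Modified point concrete, here is a minimal sketch of a change check that uses the header when a server sends it and falls back to a digest of the body otherwise. The helper names `snapshot` and `has_changed` are made up for illustration, and the header dict is assumed to come from whatever HTTP client fetched the page. On pages that stamp every response with the current time, the header will always look "changed", which is the unreliability described above.

```python
import hashlib
from email.utils import parsedate_to_datetime


def snapshot(headers, body):
    """Record what we know about one fetch of a page (hypothetical helper)."""
    stamp = None
    raw = headers.get("Last-Modified")
    if raw:
        try:
            stamp = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            stamp = None  # malformed header: ignore it and rely on the digest
    return {"last_modified": stamp, "digest": hashlib.sha256(body).hexdigest()}


def has_changed(previous, current):
    """Compare Last-Modified when both fetches had one, else compare digests."""
    old, new = previous["last_modified"], current["last_modified"]
    if old is not None and new is not None:
        return old != new
    return previous["digest"] != current["digest"]


first = snapshot({"Last-Modified": "Fri, 15 Feb 2008 10:00:00 GMT"}, b"<html>v1</html>")
second = snapshot({"Last-Modified": "Sat, 16 Feb 2008 10:00:00 GMT"}, b"<html>v2</html>")
print(has_changed(first, second))
```

A digest of the raw body still trips on every rotated ad or counter, so in practice it would make more sense to hash only the extracted content rather than the whole page.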
    ~
    I really don't know if and how the actual content of pages is
    indexed. I was thinking of basically:
    ~
    * keeping local copies of certain pages
    * on which tidy was run to make them well-formed XML, and
    * keeping and managing XPath indexes of the pages and
    * parsers to get the meat out of the pages
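
The steps listed above could be sketched roughly as follows, assuming the local copy has already been run through tidy and is well-formed, and (to keep it short) carries no XML namespace. The `CLUTTER` names, the `is_clutter` and `index_content` helpers and the sample page are all hypothetical; real sites would need their own per-site rules. The index maps XPath-like positional paths to the text found there, skipping navigation and footer subtrees.

```python
import xml.etree.ElementTree as ET

# Hypothetical ids/classes/tags that mark clutter; real sites need per-site rules.
CLUTTER = {"nav", "sidebar", "footer", "ads"}


def is_clutter(elem):
    """True if the element's tag, id or any class is a known clutter marker."""
    marks = {elem.get("id", "")} | set(elem.get("class", "").split())
    return elem.tag in CLUTTER or bool(marks & CLUTTER)


def index_content(elem, path=None, index=None):
    """Map a positional path to the leading text of each non-clutter element.

    Tail text (text that follows a child element) is ignored in this sketch.
    """
    path = path or "/" + elem.tag
    index = {} if index is None else index
    if is_clutter(elem):
        return index  # skip the whole subtree
    text = (elem.text or "").strip()
    if text:
        index[path] = text
    seen = {}
    for child in elem:
        seen[child.tag] = seen.get(child.tag, 0) + 1
        index_content(child, f"{path}/{child.tag}[{seen[child.tag]}]", index)
    return index


PAGE = """<html><body>
<div id="nav"><a href="/css">CSS</a></div>
<div id="main"><h1>XPath basics</h1><p>Selecting nodes by path.</p></div>
<div class="footer">css designed by someone</div>
</body></html>"""

index = index_content(ET.fromstring(PAGE))
for path, text in index.items():
    print(path, "->", text)
```

With an index like this, a search for "CSS" would no longer match the sample page just because of the jump-off link or the footer credit, since neither ends up in the index.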
    ~
    Any libraries or solid/comprehensive studies out there?
    ~
    Thanks
    lbrtchx
     