Indexing web pages - in Python?

Dan Stromberg

Are there any open source search engines written in Python for indexing a
given collection of (internal only) HTML pages? Right now I'm talking
about dozens of pages, but hopefully it'll be hundreds or thousands at some point.

I'm thinking some sort of CGI script, with perhaps a cron job that updates
the indexes.

I'm not particularly looking for something that has a full RDBMS behind
it - just a file that stores indexes. I'll go with an RDBMS-based
solution if I must, but I don't think that's really needed at this point.

TIA
 
Kevin T. Ryan

You could try:

http://gnosis.cx/download/indexer.py

There is an extensive write-up by the author at:

http://gnosis.cx/publish/programming/charming_python_15.txt

Might be something you'd be interested in ...
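
If you'd rather see how little is needed for a file-backed index before picking a tool, here is a minimal sketch using only the standard library. To be clear, this is not how indexer.py works; it's just an illustration of the "cron job builds an index file, CGI script reads it" idea from the question. The function names (build_index, save_index, load_index, search) and the JSON index format are my own assumptions, and it assumes the pages are UTF-8 .html files under one directory.

```python
import json
import re
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(root):
    """Return an inverted index: word -> list of page paths relative to root."""
    index = {}
    for path in sorted(Path(root).rglob("*.html")):
        parser = TextExtractor()
        parser.feed(path.read_text(encoding="utf-8", errors="replace"))
        parser.close()
        page = path.relative_to(root).as_posix()
        # set() so each page is listed at most once per word
        for word in set(tokenize(" ".join(parser.chunks))):
            index.setdefault(word, []).append(page)
    return index


def save_index(index, index_file):
    """Write the index to a single JSON file (the part a cron job would run)."""
    Path(index_file).write_text(json.dumps(index), encoding="utf-8")


def load_index(index_file):
    """Read the index back (the part a CGI script would run)."""
    return json.loads(Path(index_file).read_text(encoding="utf-8"))


def search(index, query):
    """Return the sorted pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)
```

The cron job would call build_index() and save_index(); the CGI script would call load_index() and search() per request. Rebuilding the whole index each run is wasteful but should be fine for dozens or hundreds of pages. For thousands you'd probably want incremental updates and some kind of ranking, which is where an existing tool starts to pay off.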
 
