database search using Lucene

amitatgroups · Aug 3, 2007

I need an example of database search using Lucene.

I found many Lucene examples that index files, but none that index a
database. Please guide me.

Thanks...

Manish Pandit · Aug 4, 2007

With a database, you build the index yourself. For a file, Lucene can
index the entire file, but for a database it does not know which
columns or tables you want indexed.

Step 1 - Create an index writer just like you would with a file

Step 2 - Create 1 document per record (this can be an aggregate -
anything you'd want to link a search result to). In this doc, keep
adding fields along with the values you'd want to index or store. Like
this:

myDoc.add(new Field("first_name", rs.getString("first_name"),
Field.Store.YES, Field.Index.TOKENIZED));

I cannot really say what you'd want stored, indexed, or both, but I
hope you have a general idea of how storing, indexing, or doing both
matters.

Step 3 - Close the writer.
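
To make the stored-versus-indexed distinction in Step 2 concrete, here is a minimal plain-Java sketch. It is not the Lucene API: ToyRecordIndex, addRecord and search are hypothetical names, and the sample rows are made up. It only illustrates the idea of one document per record, where indexed values become searchable terms and stored values are what a hit returns.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical toy index, NOT the Lucene API. It only illustrates
// "one document per record" and the stored-vs-indexed distinction.
class ToyRecordIndex {
    // term -> ids of the documents whose indexed fields contain that term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    // document id -> field values kept so that a hit can display them
    private final List<Map<String, String>> storedFields = new ArrayList<>();

    // Adds one document built from one database record and returns its id.
    int addRecord(Map<String, String> stored, Map<String, String> indexed) {
        int docId = storedFields.size();
        storedFields.add(new HashMap<>(stored));
        for (String value : indexed.values()) {
            // Crude tokenizer: lower-case, split on non-word characters.
            for (String term : value.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
                }
            }
        }
        return docId;
    }

    // Returns the stored fields of every document that contains the term.
    List<Map<String, String>> search(String term) {
        List<Map<String, String>> hits = new ArrayList<>();
        String key = term.toLowerCase(Locale.ROOT);
        for (int docId : postings.getOrDefault(key, new TreeSet<>())) {
            hits.add(storedFields.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyRecordIndex index = new ToyRecordIndex();
        // Sample rows (made up): the id is stored only, the name is indexed only.
        index.addRecord(Collections.singletonMap("id", "1"),
                Collections.singletonMap("first_name", "Alice"));
        index.addRecord(Collections.singletonMap("id", "2"),
                Collections.singletonMap("first_name", "Bob"));
        System.out.println(index.search("alice"));
    }
}
```

In this sketch a search for a name finds the record and hands back its id, while a search for the id itself finds nothing, because the id was stored but never indexed.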

Hope this helps.

-cheers,
Manish

Joe Attardi · Aug 4, 2007

You might want to try the Lucene mailing list. I am on that list right
now and the regular posters are very, very helpful with all sorts of
Lucene questions. All the info is here: http://lucene.apache.org/java/docs/mailinglists.html

Amit Jain · Aug 18, 2007

When I execute the code below, I get this exception:
"no segments* file found in org.apache.lucene.store.FSDirectory@C:\dbindex: files:"

It is thrown on the line
IndexWriter writer=new IndexWriter(new File("c:\\dbindex"),new StandardAnalyzer(),false);

import java.sql.*;
import java.io.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.*;

/**
* @author apurohit
*
*/
public class DBIndex {

/**
* @param args
*/
private Connection con;
private String dbDriver,connectionURL,user,password;

public DBIndex()
{
con=null;
dbDriver="com.mysql.jdbc.Driver";
connectionURL="jdbc:mysql://192.168.1.4:3306/startvisitindia";
user="root";
password="";
}

public void setDBDriver(String driver)
{
this.dbDriver=driver;
}

public void setConnectionURL(String connectionURL)
{
this.connectionURL=connectionURL;
}

public void setAuthentication(String user,String password)
{
this.user=user;
this.password=password;
}

public Connection getConnection()
{
try{
Class.forName(dbDriver);
con=DriverManager.getConnection(connectionURL,user,password);
}
catch(Exception e){
e.printStackTrace();
}
return con;
}

private boolean isIndexExist(String indexPath)
{
boolean exist=false;
try{
IndexReader ir=IndexReader.open(indexPath);
exist=true;
ir.close();
}catch(IOException e){
System.out.println("ioexception:-> "+e);
}catch(Exception e){
System.out.println("exception:-> "+e);
}

return exist;
}

public void performIndexing(String indexPath)
{
try{
Connection connection=getConnection();
String query="select DestId,Name,StateName,DestinationDescription "
+"from destination";
Statement statement=connection.createStatement();
ResultSet contentResutlset=statement.executeQuery(query);
IndexWriter writer=new IndexWriter(new File("c:\\dbindex"),new
StandardAnalyzer(),false);
while(contentResutlset.next()){
//Adding all fields' contents to a single string for indexing
String contents=contentResutlset.getString(2)+" "
+contentResutlset.getString(3)+" "+contentResutlset.getString(4);
//Extracting and adding tags to contents string for creating index
/*
String tagQuery="select t.tag_name from tags t,taggedcontents tc
where tc.content_id = '"+contentResutlset.getString(1)+"' and
tc.tag_id=t.tag_id";
Statement tagStatement=connection.createStatement();
ResultSet tagResultset=tagStatement.executeQuery(tagQuery);
while(tagResultset.next()){contents=contents+"
"+tagResultset.getString(1);}

System.out.println("Indexing Content no.(ID) " +
contentResutlset.getShort(1)+"\n"+contents);
*/
//Creating index for a single content(record in contents table)
Document doc = new Document();
doc.add(new Field("contents",
contents,Store.NO,Index.TOKENIZED,TermVector.YES));
doc.add(new
Field("id",contentResutlset.getString(1),Store.YES,Index.UN_TOKENIZED,TermVector.NO));
writer.addDocument(doc);
//writer.close();
}
writer.close();
}//try
catch(Exception e){
e.printStackTrace();
}//catch

}

public static void main(String[] args) {
DBIndex dbi=new DBIndex();
try{
Connection connection=dbi.getConnection();
String query="select DestId,Name,StateName,DestinationDescription "
+"from destination";
Statement statement=connection.createStatement();
ResultSet contentResutlset=statement.executeQuery(query);
System.out.println("111");
IndexWriter writer = new IndexWriter(new File("c:\\dbindex"),new
StandardAnalyzer(),false);
//System.out.println("IndexWriter:-> "+writer);
while(contentResutlset.next()){
//Adding all fields' contents to a single string for indexing
String contents=contentResutlset.getString(2)+" "
+contentResutlset.getString(3)+" "+contentResutlset.getString(4);
//Extracting and adding tags to contents string for creating index
/*
String tagQuery="select t.tag_name from tags t,taggedcontents tc
where tc.content_id = '"+contentResutlset.getString(1)+"' and
tc.tag_id=t.tag_id";
Statement tagStatement=connection.createStatement();
ResultSet tagResultset=tagStatement.executeQuery(tagQuery);
while(tagResultset.next())
{
contents=contents+" "+tagResultset.getString(1);
}
*/
//System.out.println("Indexing Content no.(ID) " +
//contentResutlset.getShort(1)+"\n"+contents);
System.out.println("Indexing Content no.(ID) " +
contentResutlset.getString(1));

//Creating index for a single content(record in contents table)
//Document doc = new Document();
//doc.add(new Field("contents",
//contents,Store.NO,Index.TOKENIZED,TermVector.YES));
//doc.add(new
//Field("id",contentResutlset.getString(1),Store.YES,Index.UN_TOKENIZED,TermVector.NO));
//writer.addDocument(doc);
//writer.close();
}
//writer.close();
contentResutlset.close();
statement.close();
connection.close();
}//try
catch(Exception e){
System.out.println(e.getMessage());
}//catch*/
}
}

Manish Pandit · Aug 18, 2007

IndexWriter writer=new IndexWriter(new File("c:\\dbindex"),new
StandardAnalyzer(),false);

Replace 'false' with 'true' so that Lucene creates a new index. With
'false', Lucene opens an existing index, so make sure the folder exists
and already contains one.
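
One way to pick the flag automatically is to check whether the folder already looks like an index. The following is a rough plain-Java sketch: IndexCreateFlag and shouldCreate are hypothetical names, and the rule "a file whose name starts with segments means an index exists" is an assumption taken from the wording of the exception above. Lucene 2.x also has IndexReader.indexExists, which is probably the better choice in real code.

```java
import java.io.File;

// Hypothetical helper, not part of Lucene.
class IndexCreateFlag {
    // Returns true when a new index should be created, i.e. when the folder
    // is missing or holds no entry whose name starts with "segments".
    // ASSUMPTION: a "segments*" file marks an existing index, as suggested
    // by the "no segments* file found" exception message.
    static boolean shouldCreate(File indexDir) {
        String[] names = indexDir.list(); // null if the path is not a directory
        if (names == null) {
            return true;
        }
        for (String name : names) {
            if (name.startsWith("segments")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The default folder name here is only an example.
        File indexDir = new File(args.length > 0 ? args[0] : "dbindex");
        System.out.println("create = " + shouldCreate(indexDir));
    }
}
```

The result could then be passed as the third argument of the IndexWriter constructor in place of the hard-coded 'false'.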

-cheers,
Manish

Amit Jain · Aug 18, 2007

Thanks for the reply...

Hi,
I replaced 'false' with 'true' and the folder exists,
but I still get the same exception.

Thanks...

Manish Pandit · Aug 18, 2007

A couple of things. Always print stack traces, either on the console
or in a log file.

You are not closing the writer, so Lucene cannot acquire the lock file
(which it creates in the temp directory) after indexing. So any
subsequent tries fail.

Put writer.close() in finally along with connection.close().
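
The shape of that try/finally can be sketched without Lucene on the classpath. FakeWriter below is a hypothetical stand-in for IndexWriter, and CloseInFinally and indexAll are made-up names. The point is only that close() runs even when adding a document throws.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for IndexWriter so the sketch runs without Lucene.
class FakeWriter {
    final List<String> docs = new ArrayList<>();
    boolean closed = false;

    void addDocument(String doc) {
        if (doc == null) {
            throw new IllegalArgumentException("null document");
        }
        docs.add(doc);
    }

    void close() {
        closed = true;
    }
}

class CloseInFinally {
    // Adds every row and closes the writer even if one of the rows fails.
    static void indexAll(FakeWriter writer, List<String> rows) {
        try {
            for (String row : rows) {
                writer.addDocument(row);
            }
        } finally {
            writer.close(); // always runs, so the lock is always released
        }
    }

    public static void main(String[] args) {
        FakeWriter writer = new FakeWriter();
        try {
            // The null row simulates a failure in the middle of indexing.
            indexAll(writer, Arrays.asList("row 1", null, "row 3"));
        } catch (IllegalArgumentException e) {
            System.out.println("indexing failed: " + e.getMessage());
        }
        System.out.println("writer closed: " + writer.closed);
    }
}
```

With the real classes, the same finally block would also close the ResultSet, Statement and Connection.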

I uncommented writer.close() (the last one in the code), changed the
db stuff to something I already had on my box, and changed 'false' to
'true' without the folder being present, and it worked. I hope you
changed the code in main(), because the duplicate code in
performIndexing() never gets called.

Here is the output:

111
Indexing Content no.(ID) 1
Indexing Content no.(ID) 3
Indexing Content no.(ID) 5
Indexing Content no.(ID) 6
Indexing Content no.(ID) 7
Indexing Content no.(ID) 9
Indexing Content no.(ID) 10

Here is the folder it created:

Directory of C:\dbindex

08/18/2007 07:17 AM <DIR> .
08/18/2007 07:17 AM <DIR> ..
08/18/2007 07:17 AM 20 segments
1 File(s) 20 bytes
2 Dir(s) 406,626,304 bytes free

-cheers,
Manish

Amit Jain · Aug 19, 2007

Thanks for your reply...
Done...

Thanks

Amit Jain · Aug 20, 2007

Hi,

I am getting a compile-time "symbol not found" error on the lines
doc.add(Field.Text("contents", new FileReader(f)));
doc.add(Field.Keyword("filename", f.getCanonicalPath()));
in the method indexFile(IndexWriter writer, File f).

When I looked into the Field class, I did not find anything about the
Text and Keyword symbols.

What do I have to do to resolve this compile-time error?

/**
*
*/

import java.sql.*;
import java.io.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.*;

/**
* @author apurohit
*
*/
/**
* This code was originally written for
* Erik's Lucene intro java.net article
*/
public class FileIndex {
public static void main(String[] args) throws Exception {

File indexDir = new File("fileindex");
File dataDir = new File("txt_file");

int numIndexed = index(indexDir, dataDir);

}

// open an index and start file directory traversal
public static int index(File indexDir, File dataDir)throws
IOException {
//Listing 1.1 Indexer: traverses a file system and indexes .txt files
//Create Lucene index in this directory; index files in this directory
if (!dataDir.exists() || !dataDir.isDirectory()) {
throw new IOException(dataDir
+ " does not exist or is not a directory");
}
IndexWriter writer = new IndexWriter(indexDir,new
StandardAnalyzer(), true);
writer.setUseCompoundFile(false);
indexDirectory(writer, dataDir);
int numIndexed = writer.docCount();
writer.optimize();
writer.close();
return numIndexed;
}

// recursive method that calls itself when it finds a directory
private static void indexDirectory(IndexWriter writer, File
dir)throws IOException {
File[] files = dir.listFiles();
for (int i = 0; i < files.length; i++) {
File f = files[i];
if (f.isDirectory()) {
indexDirectory(writer, f);
} else if (f.getName().endsWith(".txt")) {
indexFile(writer, f);
}
}
}

// method to actually index a file using Lucene
private static void indexFile(IndexWriter writer, File f)throws
IOException {
if (f.isHidden() || !f.exists() || !f.canRead()) {
return;
}
System.out.println("Indexing " + f.getCanonicalPath());
Document doc = new Document();
doc.add(Field.Text("contents", new FileReader(f)));
doc.add(Field.Keyword("filename", f.getCanonicalPath()));

writer.addDocument(doc);
}
}

Thanks...

Manish Pandit · Aug 20, 2007

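You are using an older API with a newer version of Lucene (2.2).
Field.Keyword and Field.Text no longer exist. They have been replaced
by Field constructors that take Field.Store, Field.Index and
Field.TermVector options. For example, Field.Keyword("filename", path)
becomes new Field("filename", path, Field.Store.YES,
Field.Index.UN_TOKENIZED), and Field.Text("contents", reader) becomes
new Field("contents", reader). Look for examples that use the new API,
or download an older Lucene (1.4.x).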

-cheers,
Manish

Amit Jain · Aug 20, 2007

Ok
Thanks for your reply...
