How to read in data from a non-XML text file in Java

Hi, I'm new to this forum. I need some help with Java programming.

I'm writing a program that extracts the data from text files and stores each piece of content in its respective field.

The input file looks like this.
<DOC>
<DOCNO> CNN19981001.0130.0263 </DOCNO>
<DOCTYPE> NEWS </DOCTYPE>
<TXTTYPE> CAPTION </TXTTYPE>
<TEXT>
The budget surplus was ignored by investors on Wall Street. The Dow
Jones industrial average lost 237 points to close at 7842. We'll have
more in "Dollars & Sense" at 46 minutes past the hour.
</TEXT>
</DOC>

I'd like to have
docno = CNN19981001.0130.0263
doctype = NEWS
and so on as output.

I tried using an XML parser, but I got an error when it parsed the "&" character:
"org.xml.sax.SAXParseException: The entity name must immediately follow the '&' in the entity reference."
I was wondering whether there is a better way to read in the file.

My code:

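import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

// Parse the input file as XML (this is the call that throws on the bare "&")
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
org.w3c.dom.Document tempDoc = dBuilder.parse(file);
tempDoc.getDocumentElement().normalize();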

Please help me. It's urgent. Thanks in advance.
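
One possible way to read such a file without an XML parser is to match the tags with regular expressions, so that a bare "&" in the text causes no trouble. The following is only a minimal sketch under several assumptions: the class name DocFieldExtractor and the method parse are hypothetical, the file is assumed to be UTF-8 and small enough to read into memory, and the fields inside each <DOC> block are assumed not to nest.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts tag/value pairs from <DOC> blocks without XML parsing.
class DocFieldExtractor {
    // One <DOC> ... </DOC> block; DOTALL lets '.' span line breaks.
    private static final Pattern DOC =
            Pattern.compile("<DOC>(.*?)</DOC>", Pattern.DOTALL);
    // One <TAG> ... </TAG> field; \1 requires the closing tag to match the opening one.
    private static final Pattern FIELD =
            Pattern.compile("<(\\w+)>(.*?)</\\1>", Pattern.DOTALL);

    // Returns one map per <DOC> block, keyed by lower-cased tag name.
    static List<Map<String, String>> parse(String content) {
        List<Map<String, String>> docs = new ArrayList<>();
        Matcher docMatcher = DOC.matcher(content);
        while (docMatcher.find()) {
            Map<String, String> fields = new LinkedHashMap<>();
            Matcher fieldMatcher = FIELD.matcher(docMatcher.group(1));
            while (fieldMatcher.find()) {
                String name = fieldMatcher.group(1).toLowerCase(Locale.ROOT);
                fields.put(name, fieldMatcher.group(2).trim());
            }
            docs.add(fields);
        }
        return docs;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the path to the input file is passed as the first argument.
        String content = new String(
                Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (Map<String, String> doc : parse(content)) {
            for (Map.Entry<String, String> field : doc.entrySet()) {
                System.out.println(field.getKey() + " = " + field.getValue());
            }
        }
    }
}
```

Because the text is never treated as XML, characters such as "&" are kept exactly as they appear in the file. If the real files contain nested or malformed tags, this sketch would need to be adapted.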
 
