pls give me some idea about parsing in java

Anindya

Hi all,
I need to parse a file of about 35 MB using java which has a lot of
info.The keywords are like <title>,<media>etc.I am a bit perplexed
about how to parse such a large file..Please give me some tips
 
anindya.kgp

Actually,I need to extract text between 2 markers like say <title> and
</title>,etc.So how do you do that?newayz thanx for the links
 
Arne Vajhøj

anindya.kgp said:
Actually,I need to extract text between 2 markers like say <title> and
</title>,etc.So how do you do that?newayz thanx for the links

If the file is valid XML you can read it into a W3C DOM
Document and select the data with XPath (assuming that the
file can fit in memory).

XPathAPI.selectSingleNode(doc.getDocumentElement(),
"//sometag/someothertag/title/text()")
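To make the DOM plus XPath route a bit more concrete, here is a minimal sketch. It uses the standard javax.xml.xpath API that ships with the JDK rather than the XPathAPI helper class shown above, and the class name TitleXPathSketch, the method name titles and the sample XML are made up for illustration. It assumes the input is well-formed XML that fits in memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not from the thread.
class TitleXPathSketch {

    // Builds a DOM tree from the XML text and returns the text of every <title> element.
    static List<String> titles(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // "//title/text()" selects the text nodes under any <title> element.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate("//title/text()", doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (Exception e) {
            // Malformed XML or a bad XPath expression ends up here.
            throw new IllegalStateException("could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input; a real run would parse the file instead.
        String xml = "<items><item><title>First</title><media>a.png</media></item>"
                   + "<item><title>Second</title></item></items>";
        System.out.println(titles(xml));
    }
}
```

For a real file, builder.parse(new java.io.File(path)) can replace the StringReader version. The whole document is held in memory as a tree, which usually takes several times the size of the file on disk.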

Else you can read the file as text and use regex to
select the relevant part.

Pattern.compile("(?:<title>)([^<]*)(?:</title>)")
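As a rough sketch of how that pattern could be applied, the snippet below loops over all matches and collects capture group 1, which is the text between the two markers. The class name TitleRegexSketch and the method name titles are invented for the example. It assumes the title text contains no '<' character and that the tags carry no attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not from the thread.
class TitleRegexSketch {

    // Same pattern as above: group 1 is whatever sits between <title> and </title>.
    private static final Pattern TITLE =
            Pattern.compile("(?:<title>)([^<]*)(?:</title>)");

    // Returns the text of every <title>...</title> pair found in the input.
    static List<String> titles(CharSequence text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TITLE.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample input; a real run would pass the file contents.
        String text = "junk <title>First</title> more junk <title>Second</title>";
        System.out.println(titles(text));
    }
}
```

A 35 MB file can normally be read into one String first (for example with java.nio.file.Files.readString on Java 11 or later), as long as the JVM heap is large enough.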

Arne
 
Arne Vajhøj

Lew said:
"pls" is not a word. "newayz" is not a word. "Java" is capitalized.
There should be a space, really two, after the period at the end of the
sentence. A single period ends a sentence. There should be a space
after a comma.

Why bother? The question was perfectly understandable!

Lew said:
The size of the file makes no difference to how you
parse a file, only to how long it takes.

Not everyone would recommend the same solution independent of file
size.

It is relevant to know the size of the file to evaluate whether
an "everything in memory at the same time" solution is feasible.
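If holding everything in memory turns out not to be feasible, one alternative that is not named in this thread is a streaming parser such as StAX (javax.xml.stream, part of the JDK). The sketch below is only an illustration of that idea: the class name TitleStreamSketch and the method name titles are invented, and it assumes well-formed XML in which each <title> element contains plain text only.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not from the thread.
class TitleStreamSketch {

    // Walks the XML one event at a time and collects the text of each <title> element.
    // Only the current event is kept by the parser, not the whole document.
    static List<String> titles(Reader source) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(source);
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag;
                    // it fails if the element contains child elements.
                    result.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("could not stream the XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample input; a real run would pass a reader over the file.
        String xml = "<items><title>First</title><media>a.png</media><title>Second</title></items>";
        System.out.println(titles(new StringReader(xml)));
    }
}
```

For a file on disk, a java.io.FileReader or a reader from java.nio.file.Files.newBufferedReader can be passed in place of the StringReader. The result list still grows with the number of titles, so for very many matches each title could be processed as it is found instead of being collected.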

Arne
 
