file manipulation

Vandana

Hi All,

I have one more question about file manipulation.

Suppose I have the following structure in a file:
---------------------
Instance J1 (
net n1()
net n2 ()
net n3()
)
Instance J2 (
net n1()
net n2()
net n3()
)
Instance J3 (
net n1()
net n2()
net n3()
)
-----------------------
As an example, I want to read J3/net n3.
(The files are huge, several GB each.)
I can grep for n3, but that returns all 3 instances of n3.

How can I ensure that I'm reading the n3 values that belong to instance
J3?

Thank you for your time. I really appreciate your help.

Thanks
Vandana.
 
Peter Szinek

If I understood correctly, something like this might work (with data
holding the file contents as a String):

data[/Instance J3(.+?)^\)/m, 1]

Cheers,
Peter
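
A minimal sketch of how that expression could be used, assuming the whole text fits in memory as a String (which may not hold for multi-GB files). The method name extract_j3_n3 and the sample data are made up for illustration; only the regex comes from the post above.

```ruby
# Hypothetical helper: pull the "net n3" lines out of the J3 block.
# Assumes `data` is a String containing the whole file.
def extract_j3_n3(data)
  # /m lets "." match newlines; "^\)" stops at the first line starting with ")".
  block = data[/Instance J3(.+?)^\)/m, 1]
  return [] if block.nil? # no J3 block found

  block.lines.grep(/\bnet n3\b/).map(&:strip)
end

# Illustrative sample in the format from the question.
SAMPLE = <<~TEXT
  Instance J1 (
  net n1()
  net n3()
  )
  Instance J3 (
  net n1()
  net n2()
  net n3()
  )
TEXT

puts extract_j3_n3(SAMPLE)
```

Note that the pattern as written would also match an instance whose name merely starts with J3 (J30, for example), so a stricter pattern may be needed for real data.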
 
Ken Bloom

If your file were in XML, you could use REXML::Parsers::SAX2Parser (or
some other SAX parser) to create a solution that turns on an
@inInstanceJ3 variable when it encounters the right piece of data, turns
off @inInstanceJ3 when it encounters the matching close tag, and records
n3's only when @inInstanceJ3 is true.

For your format, I suggest you find a parser generator that lets you
create actions for different grammar elements, to implement a similar
solution.
 
Brian Candler

Vandana said:
Suppose I have the following structure in a file :
---------------------
Instance J1 (
net n1()
net n2 ()
net n3()
)
Instance J2 (
net n1()
net n2()
net n3()
)
Instance J3 (
net n1()
net n2()
net n3()
)
-----------------------
As an example, I want to read J3/net n3.
(the files are huge ....in GB)
I can grep for n3 but it will return 3 instances of n3.

How can I ensure that I'm reading n3 values that belong to instance
J3?

Using (Unix shell command) grep, or using Ruby?

In Ruby you could just set a variable whenever you see a line matching
/Instance \S+/, so when you see a line matching /n3/ you can check what
the preceding Instance was.
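
A rough sketch of that idea, reading the file one line at a time so memory use stays small. The method name each_net_line and its parameters are hypothetical, and it assumes each net sits on a single line that starts with "net <name>", as in the sample.

```ruby
# Yields every "net <name>" line whose most recent "Instance" line
# named the requested instance. Streams the file line by line.
def each_net_line(path, instance, net)
  return enum_for(:each_net_line, path, instance, net) unless block_given?

  net_pattern = /^\s*net\s+#{Regexp.escape(net)}\b/
  current = nil # name from the last "Instance ..." line seen so far

  File.foreach(path) do |line|
    if (m = line.match(/^\s*Instance (\S+)/))
      current = m[1]
    elsif current == instance && line.match?(net_pattern)
      yield line.chomp
    end
  end
end
```

Usage would look something like each_net_line("design.txt", "J3", "n3") { |l| puts l }, where design.txt is a placeholder file name.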

Scanning multiple gigabytes this way on every lookup is never going to
be efficient, unless you have enough RAM to keep the whole dataset in
memory. If not, then consider indexing the data, perhaps with something
like cdb. This would let you jump immediately to the data for instance
J3 without scanning through the whole file.
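
To illustrate the indexing idea without depending on cdb, here is a sketch that uses a plain in-memory Hash of byte offsets as a stand-in. The method names are hypothetical, and it assumes an instance block ends at the first line that starts with ")", as in the sample.

```ruby
# One pass over the file: record the byte offset at which each
# "Instance <name>" line starts.
def build_instance_index(path)
  index = {}
  File.open(path) do |f|
    until f.eof?
      offset = f.pos # position of the line we are about to read
      line = f.gets
      if (m = line.match(/^\s*Instance (\S+)/))
        index[m[1]] = offset
      end
    end
  end
  index
end

# Seek straight to an instance and return its lines, up to and
# including the closing ")" line (assumed to start the line).
def read_instance(path, index, name)
  lines = []
  File.open(path) do |f|
    f.seek(index.fetch(name)) # raises KeyError for unknown names
    f.each_line do |line|
      lines << line.chomp
      break if line.start_with?(")")
    end
  end
  lines
end
```

Building the index still costs one full scan, but after that each lookup is a single seek. For repeated runs the Hash would need to be saved somewhere (a cdb file being one option), and rebuilt whenever the data file changes.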
 
