scottmf
I am parsing very large data files (4 million lines or more) to reorder
the data and eliminate unnecessary information. Unfortunately, because
of how the file is arranged, I have to read the entire file before
processing the data. Currently everything is written to 2-D arrays, which
takes about 3 GB of memory. I would like to start using a
temp file so that machines with less memory can still complete the
process, but to do so I need to be able to insert data into the
middle of the file.
E.g., starting with this data file:

Line:             Order read:
load1             1
ID1 a b c         2
ID2 d e f         3
ID3 g h i         4
load2             5
ID1 j k l         6
ID2 m n o         7
ID3 p q r         8

the temp file becomes:

Line:             Order written:
ID1               1
load1 a b c       2
load2 j k l       7
ID2               3
load1 d e f       4
load2 m n o       8
ID3               5
load1 g h i       6
load2 p q r       9
Any suggestions are much appreciated.
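
One direction that avoids inserting into the middle of a file altogether is a two-pass offset index: scan the input once, remembering only where each ID line starts, then write the output ID by ID by seeking back into the input. The Python sketch below is only an illustration under assumptions taken from the example above: a line with a single token starts a new load, and every other line is an ID followed by its values. The names build_index and write_regrouped are hypothetical.

```python
import io


def build_index(src):
    """Pass 1: map each ID to a list of (load name, byte offset of its line).

    Assumes a line with exactly one token is a load header and any other
    non-empty line is "<ID> <values...>". src must be opened in binary mode
    so that tell() and seek() use plain byte offsets.
    """
    index = {}  # dicts keep insertion order, so IDs stay in first-seen order
    load = None
    while True:
        offset = src.tell()
        line = src.readline()
        if not line:
            break
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 1:
            load = fields[0]
        else:
            index.setdefault(fields[0], []).append((load, offset))
    return index


def write_regrouped(src, dst, index):
    """Pass 2: for each ID, seek back to its lines and write them per load."""
    for ident, entries in index.items():
        dst.write(ident + b"\n")
        for load, offset in entries:
            src.seek(offset)
            values = src.readline().split()[1:]  # drop the leading ID
            dst.write(load + b" " + b" ".join(values) + b"\n")


if __name__ == "__main__":
    sample = (
        b"load1\nID1 a b c\nID2 d e f\nID3 g h i\n"
        b"load2\nID1 j k l\nID2 m n o\nID3 p q r\n"
    )
    src, dst = io.BytesIO(sample), io.BytesIO()
    write_regrouped(src, dst, build_index(src))
    print(dst.getvalue().decode(), end="")
```

With real files, src and dst would be opened with open(path, "rb") and open(path, "wb"). The index still grows with the number of lines, but it holds one small tuple per line instead of the values themselves, and the second pass trades memory for random seeks, so it may be slower than the in-memory version.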