Mike
This whole problem gets a lot simpler if you have a relational database.
You can read the whole file once in sequential order. For each field, write a database record with three columns: one containing the row
number, one containing the column number, and one containing the field value. The resulting table can then be easily processed by
column, by row, by field value, or by any combination of these.
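A minimal sketch of that approach in Python, assuming SQLite as the database (the post does not name one). The table name `cells`, its column names, and the functions `load_cells` and `read_by_column` are made up for illustration. The index on (column, row) is an addition of mine so that column-order reads need not sort the whole table each time.

```python
import sqlite3


def load_cells(conn, tsv_path):
    """Read the file once, top to bottom, storing one record per field."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cells "
        "(row_num INTEGER, col_num INTEGER, value TEXT)"
    )
    with open(tsv_path, encoding="utf-8") as f:
        for row_num, line in enumerate(f):
            fields = line.rstrip("\r\n").split("\t")
            # One (row, column, value) record per field on this line.
            conn.executemany(
                "INSERT INTO cells VALUES (?, ?, ?)",
                ((row_num, col_num, v) for col_num, v in enumerate(fields)),
            )
    # Assumed helper index: lets the database walk the table in column order.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS cells_by_col ON cells (col_num, row_num)"
    )
    conn.commit()


def read_by_column(conn):
    """Yield (col_num, row_num, value) top-to-bottom, then left-to-right."""
    yield from conn.execute(
        "SELECT col_num, row_num, value FROM cells ORDER BY col_num, row_num"
    )
```

For a file of the size described you would probably open an on-disk database, for example `sqlite3.connect("cells.db")`, rather than an in-memory one, since the table holds one record per field.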
Mike Sicilian
|I have a group of files in a format that is tab delimited with
| about a million columns and a thousand rows.
|
| Reading this file left-to-right top-to-bottom is not a problem but my
| requirements are to read it top-to-bottom left-to-right (to read each
| column in order as follows).
|
| 1,4,7
| 2,5,8
| 3,6,9
|
| It's an O(n^2) problem if I read each line for each column (it could
| take a week for a big file). The file is too big to hold the lines in
| memory and I see no strategy where I can hold a subset of lines in
| memory.
|
| This seems like a simple problem but I have struggled with lots of
| solutions that have come up short.
|
| Any suggestions would be appreciated.
|
| Thank you.
|