pkirk25
My data is in a big file that I have no control over. Sometimes it's
over 30 MB and often there are several of them.
It is machine generated and is nicely formatted. Example text follows:
AuctioneerSnapshotDB = {
	["nordrassil-neutral"] = {
		["nextAuctionId"] = 20,
		["version"] = 1,
		["updates"] = {
			[1] = "15416.012;;0;0;0;0;0;0",
		},
		["auctions"] = {
			[1] = "16717;0;0;0;1;1650000;1650000;Boneglay;0;0;3;1159391569;1159420369",
			[2] = "6661;0;0;0;1;399900;599900;Krius;0;0;2;1159391569;1159398769",
			[3] = "6657;0;0;1289192110;1;7300;7900;Bootyboy;0;0;4;1159391569;1159477969",
			[19] = "9865;1191;0;680935487;1;5013;8000;Warmist;0;0;1;1159391569;1159393369",
		},
		["ahKey"] = "nordrassil-neutral",
I think I will be able to find what I want and populate my structs by
looking for keywords like "nordrassil-neutral" and "ahKey". The code
is not pretty. In fact, it seems to have words like "Fragile - Handle
with Care" stamped all over it.
A pseudocode version might read:
do {
    Copy the next line into a temporary string
    If we have found the keyword "nordrassil-neutral" and have found the
    keyword "auctions"
        if the line contains 12 ";"
            populate the struct
} while we have not found the keyword "ahKey"
I can tell that there are 3 contiguous "\t" before each numbered line.
But my question is whether this is the right approach to a structured
document, or is there a better way? I can see that there is a rational
structure but can't see how to use the formatted text better than my
brute-force counting approach.
BTW, I have asked for guidance on the format, but decoding it myself is
the only option.