Jasper
Hi,
I have multiple data files which need parsing in real time, so high
performance is *crucial*.
I don't have a format definition, but from what I can see there is a
hierarchy of data.
Each data field is named thus <"name":> (the <> are mine).
The data can be quoted text, unquoted text, or a composite hierarchy field.
Each name/data pair is terminated by a comma unless it is the last in the
group.
A comma can also appear within a quoted text data field.
The hierarchical tokens are open and close braces <{}> and open and
close square brackets <[]>.
That's all there is to it.
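To make the rules above concrete, here is a rough sketch of a single-pass tokenizer for them. It is only an illustration: the names `Tok`, `Token` and `tokenize` are made up, it assumes C++17 (for `std::string_view`), and it assumes quoted text never contains an escaped quote, since none shows up in my sample.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Token kinds for the format described above (illustrative names).
enum class Tok { Open, Close, Colon, Comma, Quoted, Bare };

struct Token {
    Tok kind;
    std::string_view text;  // view into the input buffer, nothing is copied
};

// Split the input into tokens in one pass.
// Assumption: no escaped quotes inside quoted text.
std::vector<Token> tokenize(std::string_view in) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const char c = in[i];
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
            ++i;  // whitespace only appears in the prettified sample
        } else if (c == '{' || c == '[') {
            out.push_back({Tok::Open, in.substr(i, 1)});
            ++i;
        } else if (c == '}' || c == ']') {
            out.push_back({Tok::Close, in.substr(i, 1)});
            ++i;
        } else if (c == ':') {
            out.push_back({Tok::Colon, in.substr(i, 1)});
            ++i;
        } else if (c == ',') {
            out.push_back({Tok::Comma, in.substr(i, 1)});
            ++i;
        } else if (c == '"') {
            // Quoted text runs to the next quote, so commas inside are kept.
            std::size_t end = in.find('"', i + 1);
            if (end == std::string_view::npos) end = in.size();  // unterminated
            out.push_back({Tok::Quoted, in.substr(i + 1, end - i - 1)});
            i = (end < in.size()) ? end + 1 : end;
        } else {
            // Unquoted text runs to the next delimiter or whitespace.
            std::size_t end = in.find_first_of(",:{}[]\" \t\r\n", i);
            if (end == std::string_view::npos) end = in.size();
            out.push_back({Tok::Bare, in.substr(i, end - i)});
            i = end;
        }
    }
    return out;
}
```

Because the tokens are views into the input, the buffer holding the file has to stay alive for as long as the tokens are used.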
The data describes, say, a school class, so we have a rigid set of data
groups.
E.g. we have data describing the teacher, data describing the class taken, and
a repeating group describing each kid and grades.
So it would be nice to be able to parse this data out into appropriate
structures.
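As a starting point for "appropriate structures", here is a minimal sketch of a generic tree plus a recursive-descent parser that fills it. The `Node` and `Parser` names are hypothetical, it assumes C++17, balanced quotes and no escaped quotes, and it is deliberately lenient about malformed input rather than reporting errors. Children are kept in a vector rather than a map so that repeated names (like "Test" in my data) and their order survive.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// One value in the hierarchy (hypothetical layout).
struct Node {
    std::string_view name;       // empty for elements of [...]
    std::string_view text;       // set for quoted or unquoted text values
    bool quoted = false;         // true if the text came from "..."
    std::vector<Node> children;  // members of {...} or elements of [...]
};

class Parser {
public:
    explicit Parser(std::string_view in) : in_(in) {}

    // Parse one value (text or group) at the current position.
    Node parseValue() {
        Node n;
        skipSpace();
        if (pos_ >= in_.size()) return n;
        const char c = in_[pos_];
        if (c == '{' || c == '[') {
            ++pos_;
            parseGroup(n, c == '{' ? '}' : ']', c == '{');
        } else if (c == '"') {
            n.text = parseQuoted();
            n.quoted = true;
        } else {
            n.text = parseBare();
        }
        return n;
    }

private:
    // Parse members up to the closing token; {...} members carry a name.
    void parseGroup(Node& n, char close, bool named) {
        for (;;) {
            skipSpace();
            if (pos_ >= in_.size()) return;  // truncated input: stop
            if (in_[pos_] == close) { ++pos_; return; }
            if (in_[pos_] == ',') { ++pos_; continue; }
            std::string_view name;
            if (named) {
                name = (in_[pos_] == '"') ? parseQuoted() : parseBare();
                skipSpace();
                if (pos_ < in_.size() && in_[pos_] == ':') ++pos_;
            }
            Node child = parseValue();
            child.name = name;
            n.children.push_back(std::move(child));
        }
    }

    std::string_view parseQuoted() {
        const std::size_t start = pos_ + 1;  // skip the opening quote
        std::size_t end = in_.find('"', start);
        if (end == std::string_view::npos) end = in_.size();
        pos_ = (end < in_.size()) ? end + 1 : end;
        return in_.substr(start, end - start);
    }

    std::string_view parseBare() {
        const std::size_t start = pos_;
        std::size_t end = in_.find_first_of(",:{}[]\" \t\r\n", pos_);
        if (end == std::string_view::npos) end = in_.size();
        if (end == start) ++end;  // stray delimiter: consume it to keep advancing
        pos_ = end;
        return in_.substr(start, end - start);
    }

    void skipSpace() {
        while (pos_ < in_.size() &&
               (in_[pos_] == ' ' || in_[pos_] == '\t' ||
                in_[pos_] == '\r' || in_[pos_] == '\n')) {
            ++pos_;
        }
    }

    std::string_view in_;
    std::size_t pos_ = 0;
};
```

Usage would be something like `Node root = Parser(buffer).parseValue();`, after which `root.children[0].name` is the first top-level name. All names and text are views into `buffer`, so nothing is copied, but the buffer must outlive the tree. The fixed teacher/class/student structs could then be filled by walking this tree.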
Below is a snippet of dummy data (in reality there is much more). I have
added the spacing and carriage returns for clarity. The real data has no
whitespace. There may be a variable number of parameters (I think), so it
would be useful to be able to ID and potentially store the variable name
with its data value.
Anyone got any ideas/code snips/references for the best, most speedy (at run
time) way to go about it? A tight, pure C++ solution (with or without the
STL) would be needed.
Thanks in advance for any help
{
"teacher":{
"name":
"Mr Borat",
"age":
"35",
"Nationality":
"Kazakhstan"},
"Class":{
"Semester":
"Summer",
"Room":
null,
"Subject":
"Politics",
"Notes":
"We're happy, you happy?"},
"Students":
[
{
"Smith":
[{"First Name":"Mary","sex":"Female"}],
"Brown":
[{"First Name":"John","sex":"Male"}],
"Jackson":
[{"First Name":"Jackie","sex":"Female"}]
}
],
"Grades":
[
{
"Test":
[{"grade":A,"points":68},{"grade":B,"points":25},{"grade":C,"points":15}],
"Test":
[{"grade":C,"points":2},{"grade":B,"points":29},{"grade":A,"points":55}],
"Test":
[{"grade":C,"points":2},{"grade":A,"points":72},{"grade":A,"points":65}]
}
]
}