parsing of structured text

Robert Fendt

Hi all,

I have to parse a file containing (slightly erroneous) vCal data. The
format of vCal/iCal is that of a structured ASCII file, not unlike XML
in a way. A vCal block contains information on a line-by-line basis,
with the possibility of sub-blocks (for events).

BEGIN:VCALENDAR
VERSION:1.0
BEGIN:VEVENT
....
END:VEVENT
BEGIN:VEVENT
....
END:VEVENT
END:VCALENDAR
BEGIN:VCALENDAR
VERSION:1.0
....
END:VCALENDAR

Were this C++, I would use an iterator approach, with classes for the
calendar and event blocks respectively, and pass an iterator pointing
to the current position in the file for deserialisation, getting a
new iterator back that points to the position behind the block. That
way I decide what to do next based on the current line's contents,
i.e., implement a state machine of some sort.

While this approach is certainly possible in Python as well, I have
the nagging feeling that there should be a much cleaner, simpler
(i.e., "Pythonic") way to deal with such a problem. Ideally, the end
result would look something like this; however, I am a bit at a loss
right now as to how best to achieve it. Any suggestions?

for calendar_block in input_file:
    version = calendar_block.version
    num_events = len(calendar_block.events)


Thanks,
Robert
 
Kushal Kumaran

Using code someone else has already written would qualify as pythonic, IMO.

http://pypi.python.org/pypi/vobject
 
Robert Fendt

That seems to do what I need, thank you. I seem to have been a bit
blind when I looked for existing packages. Of course, using vobject
"only" works around my actual question efficiently (i.e., how to
formulate such a parser in the most simple/elegant/Pythonic way). ;-)

Regards,
Robert
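
On the open question of how such a parser could be formulated by hand: one possible sketch is a generator that shares a single line iterator with a recursive helper, so that each BEGIN: line hands the iterator down and each END: line hands it back. This is only an illustration under simplifying assumptions, not how vobject works internally. The names Block, parse_block and iter_blocks are made up for this example, and the sketch ignores folded lines and property parameters, keeps only the last value of a repeated property, and does not verify that END: names match.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One BEGIN:/END: block, e.g. a VCALENDAR or a VEVENT (hypothetical class)."""
    name: str
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    @property
    def version(self):
        # None if the block carries no VERSION line.
        return self.properties.get("VERSION")

    @property
    def events(self):
        return [child for child in self.children if child.name == "VEVENT"]


def parse_block(name, lines):
    """Consume lines from the shared iterator until the next END: line."""
    block = Block(name)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        key = key.upper()
        if key == "BEGIN":
            # Nested block: recurse on the same iterator.
            block.children.append(parse_block(value.strip().upper(), lines))
        elif key == "END":
            break  # a stricter parser would compare value against name
        else:
            block.properties[key] = value
    # Running out of input without END: is tolerated and returns what was read.
    return block


def iter_blocks(lines):
    """Yield each top-level block found in an iterable of lines."""
    lines = iter(lines)
    for raw in lines:
        key, _, value = raw.strip().partition(":")
        if key.upper() == "BEGIN":
            yield parse_block(value.strip().upper(), lines)
```

Since an open file is itself an iterable of lines, the loop from the original post would then read "for calendar_block in iter_blocks(input_file):", with calendar_block.version and len(calendar_block.events) available inside it.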
 
