Text Parser Help Please

Bucco

I am trying to put together a simple script that will parse a text file
that contains a list of tasks. Each line could be different in format
from the other. Most lines have words that are marked and can be
pulled out with a regex. Here is a simple example:

(A) @home Mow lawn d:6/30/06
@phone call home
(B) p:program @pc @desk Add text parser to the program

Basically, each line is a task in a list of todos. A task can have one
of three priority rankings: (A), (B), or (C). The priority, if present,
is always first on the line. Next there can be a project name that the
task is related to, such as "p:program". The next item on the line is
a context, which starts with the @ symbol. Each task can have more than
one context. After this comes the task description, which consists of
one or more words and has no definitive marker. Some tasks may have a
due date after the description, marked by d: followed by a date.

So basically, the program will read in the text file and process each
line so that the task is printed to a new file: a project file, a due
file, and/or a context file. When processing each line, I thought of
breaking it down by white space into an array, then using a regex to
match the easy items, remove them from the array, and use them as
hash keys for the task.

I guess the best way might be to extract each marker, assign it to a
hash as a key, and then extract the task and assign it to the hash as
the value. I can't seem to get to this point without a lot of if
statements. I was wondering if anyone else had a cleaner way of doing
this.
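
For readers following along, here is one way the format described above
could be parsed without a chain of if statements. It is only a sketch:
the method name parse_task and the hash keys are made up for
illustration, and it assumes markers never appear inside the task
description itself.

```ruby
# Sketch: pull the marked tokens out of one task line with regexes.
# parse_task and the hash keys are hypothetical names, not from the thread.
def parse_task(line)
  text = line.strip
  task = { priority: nil, project: nil, contexts: [], due: nil }

  # Priority, if present, is always first: (A), (B), or (C).
  text = text.sub(/\A\(([ABC])\)\s*/) { task[:priority] = $1; "" }

  # Project marker, e.g. p:program
  text = text.sub(/(?:\A|\s)p:(\S+)/) { task[:project] = $1; "" }

  # Due date marker, e.g. d:6/30/06 (kept as a plain string)
  text = text.sub(/(?:\A|\s)d:(\S+)/) { task[:due] = $1; "" }

  # Any number of contexts, e.g. @home @pc
  text = text.gsub(/(?:\A|\s)@(\S+)/) { task[:contexts] << $1; "" }

  # Whatever is left over is the unmarked description.
  task[:description] = text.split.join(" ")
  task
end

[
  "(A) @home Mow lawn d:6/30/06",
  "@phone call home",
  "(B) p:program @pc @desk Add text parser to the program"
].each { |line| p parse_task(line) }
```

Each marker gets one substitution that records the value and removes
it from the line, so the description falls out as the remainder.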

Thanks:)
SA
 
vasudevram

If the format of the text file is not hardwired (i.e. you have the
freedom to change it), why not try this:

Instead of your current format, use Ruby data (hashes, arrays, etc.) as
the format for the task list content - in a text file. That way, you
can directly read in the content - which is Ruby code - and eval it.
Almost no programming needed for parsing this way - the Ruby
interpreter will do the parsing for you. All you have to do is design
the data structure and a little code to read in and eval the text file.
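
A minimal sketch of what this could look like, assuming the task list
is stored as an array of hashes. The field names and the tasks.rb
filename in the comment are invented for the example. Note that eval
runs whatever code the file contains, so this only makes sense for
files you wrote yourself.

```ruby
# Hypothetical file contents: a Ruby array-of-hashes literal stored as text.
source = <<~RUBY
  [
    { priority: "A", contexts: ["home"], task: "Mow lawn", due: "6/30/06" },
    { contexts: ["phone"], task: "call home" }
  ]
RUBY

# In practice the text would come from disk, e.g.
#   source = File.read("tasks.rb")   # hypothetical filename
tasks = eval(source) # executes the file's code; trusted input only

tasks.each { |t| puts "#{t[:priority] || '-'} #{t[:task]}" }
```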

Vasudev
---
Vasudev Ram
Independent software consultant
http://www.geocities.com/vasudevram
PDF conversion toolkit:
http://sourceforge.net/projects/xtopdf
 
ccahua

Hi,

I'm still learning to 'put' :), but I found this script very handy and
it might fit your needs.
My Fiendish Plan - http://www.sedumphotos.net/nfagerlund/fmp/ from Mr.
Fagerlund

When run, it parses lines prefixed with a ^ symbol and category name
exporting them as separate text files. A text file with all your todos
categorized by project, context or whatever category is broken out into
their respective text files.

Example: ^project Learn Ruby in 10 years becomes project.txt with
'Learn Ruby in 10 years' as the content.
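
A rough sketch of that idea, for illustration only. This is a guess at
the behavior described above, not the actual code of the linked script,
and the method names are made up.

```ruby
# Group "^category text" lines by category (a guess at the idea only).
def group_by_category(lines)
  groups = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    match = line.strip.match(/\A\^([\w-]+)\s+(.+)\z/)
    next unless match # skip lines without a ^category prefix
    groups[match[1]] << match[2]
  end
  groups
end

# Write each category to <dir>/<category>.txt, one item per line.
def write_category_files(groups, dir)
  groups.each do |category, items|
    File.write(File.join(dir, "#{category}.txt"), items.join("\n") + "\n")
  end
end

lines = [
  "^project Learn Ruby in 10 years",
  "^home Mow lawn",
  "a line with no category"
]
groups = group_by_category(lines)
groups.each { |category, items| puts "#{category}.txt: #{items.inspect}" }
```

Calling write_category_files(groups, ".") would then create project.txt
and home.txt in the current directory.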

hth,
tony
 
snowball

vasudevram said:
If the format of the text file is not hardwired (i.e. you have the
freedom to change it), why not try this:

Instead of your current format, use Ruby data (hashes, arrays, etc.) as
the format for the task list content - in a text file. That way, you
can directly read in the content - which is Ruby code - and eval it.
Almost no programming needed for parsing this way - the Ruby
interpreter will do the parsing for you. All you have to do is design
the data structure and a little code to read in and eval the text file.

Another option might be to write the file in yaml (http://www.yaml.org)
and parse the data into Ruby using Syck.
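
A small sketch of the YAML route, assuming a list of mappings with
invented field names. Syck was the YAML engine at the time of this
thread; current Ruby versions load YAML through the standard yaml
library, which is what the example uses.

```ruby
require "yaml"

# Hypothetical task list written as YAML instead of the one-line format.
source = <<~DOC
  - priority: A
    contexts: [home]
    task: Mow lawn
    due: "6/30/06"
  - contexts: [phone]
    task: call home
DOC

# safe_load only builds plain data (arrays, hashes, strings, numbers).
tasks = YAML.safe_load(source)

tasks.each do |t|
  puts "#{t['priority'] || '-'} #{t['task']} #{t['contexts'].inspect}"
end
```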
 
Bucco

snowball said:
Another option might be to write the file in yaml (http://www.yaml.org)
and parse the data into Ruby using Syck.

I do not disagree that changing the format of the text file would be
easier, but that is not an option at this time. I think if I can
extract the marked words, I could then use them as keys in a hash with
the task as the value. I just can't think of an easy way to do it
without a lot of if statements.
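
One possible shape for this, sketched with a single regex and scan
instead of if statements. The method name index_tasks is hypothetical,
and the sketch assumes the only markers are @context, p:project, and
d:date as described earlier in the thread.

```ruby
# Sketch: build a hash of marker => list of task descriptions.
# index_tasks is a hypothetical name, not something from the thread.
def index_tasks(lines)
  # A marker starts at the beginning of the line or after whitespace.
  marker = /(?<!\S)(?:@|p:|d:)\S+/
  index = Hash.new { |hash, key| hash[key] = [] }

  lines.each do |line|
    keys = line.scan(marker)
    # Drop the markers and the optional leading priority; the rest is the task.
    task = line.gsub(marker, "").sub(/\A\s*\([ABC]\)/, "").split.join(" ")
    keys.each { |key| index[key] << task }
  end
  index
end

index = index_tasks([
  "(A) @home Mow lawn d:6/30/06",
  "@phone call home",
  "(B) p:program @pc @desk Add text parser to the program"
])
index.each { |key, tasks| puts "#{key}: #{tasks.inspect}" }
```

Every marker becomes a key, so a task with two contexts is listed under
both. The prefix of each key (@, p:, d:) could then decide which output
file the task goes to.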

Thanks:)
SA
 
surfunbear

Perl has a great parser generator, much like yacc, written by Damian
Conway. There is a book out that describes using it as well.
I don't think Ruby has this type of thing yet, but it would be nice.
I have used the Perl parser and it works great once you figure it
out, but I have used yacc, which is similar. It's
based on compiler theory. You could build a C, Java, or Ruby parser
with it or use it for simpler parsing.

here is the URL:

http://search.cpan.org/dist/Parse-RecDescent/lib/Parse/RecDescent.pod
 
Bucco

Exactly what I was looking for. This would allow me to dump to
specific files based upon different parameters. Thank you all for your
help.

Thanks:)
SA
 
