TimBenz
I need a RegEx that I can use to scroll through textual data to extract
lines in a semi-regular format. The original data is in a form something like
this:
AAA AAAAA BBBB BB CCCCC DDDDD EEEEEE FFFFFFF
Note that the "A" entity and the "B" entity may each contain zero or more
spaces, and the rest of the entities have no spaces. Also, there is no fixed
length for any of the entities. They can be any non-zero length. About the
only point of consistency is that the "B" entity has a finite number of
forms, about fifteen. So far my attempt has been like this:
(.*)(COM|COMMON SHARES|Domestic Common)\s{1,}(.*?)\s{1,}(.*?)\s{1,}(.*?)\s
From which I extract $1, $3, and $5.
How do I spool through the whole text file and extract every line for which
the above holds? Are there better ways of doing this without the arduous
part where I have to detail all the variants of the B entity?
Thanks.
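
For reference, here is a minimal sketch of the line-by-line extraction described above, written in Python purely for illustration (the post does not name a language). The `B_FORMS` list is a hypothetical subset of the roughly fifteen "B" forms, and the names `LINE_RE` and `extract` are made up for this example. It assumes exactly four space-free entities follow "B" on each line, which is what lets the pattern be anchored at the end of the line.

```python
import re

# Hypothetical subset of the known "B" forms. Longer forms are listed first
# so that "COMMON SHARES" is tried before the shorter "COM".
B_FORMS = ["COMMON SHARES", "Domestic Common", "COM"]

# A may contain spaces, so it is matched lazily up to the first point where a
# known B form is followed by exactly four space-free entities (C, D, E, F).
LINE_RE = re.compile(
    r"^\s*(?P<a>.+?)\s+"
    r"(?P<b>" + "|".join(re.escape(b) for b in B_FORMS) + r")"
    r"\s+(?P<c>\S+)\s+(?P<d>\S+)\s+(?P<e>\S+)\s+(?P<f>\S+)\s*$"
)


def extract(lines):
    """Yield (A, C, E) for every line that fits the layout; skip the rest."""
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            yield m.group("a"), m.group("c"), m.group("e")
```

Because `extract` accepts any iterable of strings, an open file object can be passed to it directly, so the whole file is scanned one line at a time. Note that this sketch still needs the list of "B" forms: with spaces allowed in both "A" and "B", the pattern has no other way to tell where one ends and the other begins.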