regular expression

gardsted

I just can't seem to get it:
I was having some trouble with finding the first <REAPER_PROJECT in the following with this regex:

Should these two approaches behave similarly?
I spent hours before I found the second one,
but then again, I'm not so smart...:

kind retards
jorgen / de mente
using python 2.5.1
-------------------------------------------
import re

TESTTXT="""<REAPER_PROJECT 0.1
<METRONOME 6 2.000000
SAMPLES "" ""
<TRACK
MAINSEND 1
<VOLENV2
ACT 1
<PANENV2
ACT 1
"""
print "The First approach - flags in finditer"
rex = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]*)')
for i in rex.finditer(TESTTXT,re.MULTILINE):
    print i,i.groups()

print "The Second approach - flags in pattern "
rex = re.compile(r'(?m)^<(?P<TAGNAME>[a-zA-Z0-9_]*)')
for i in rex.finditer(TESTTXT):
    print i,i.groups()
 
Diez B. Roggisch

What the heck is that format? XML's retarded cousin living in the attic?

Ok, back to the problem then...

This works for me:

rex = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]+)',re.MULTILINE)
for i in rex.finditer(TESTTXT):
    print i,i.groups()

However, you might think of getting rid of the ^, because otherwise you
_only_ get tags that begin a line. And making the * a + in
the TAGNAME might also be better.

Diez
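
The two suggestions above (dropping the ^ and using + instead of *) can be compared side by side. This is only a small sketch: the sample string and the variable names are made up for illustration, and it uses print() so it runs on Python 3 rather than the 2.5.1 mentioned in the thread.

```python
import re

# Illustrative text (not from the real file): one tag starts a line, one follows
# other data on the same line, and one "<" is not followed by a name at all.
TEXT = '<VOLENV2\nACT 1<PANENV2\nx < y\n'

anchored = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]+)', re.MULTILINE)
unanchored_plus = re.compile(r'<(?P<TAGNAME>[a-zA-Z0-9_]+)')
unanchored_star = re.compile(r'<(?P<TAGNAME>[a-zA-Z0-9_]*)')

# With ^ only tags at the start of a line are found.
anchored_names = [m.group('TAGNAME') for m in anchored.finditer(TEXT)]
# Without ^ a tag in the middle of a line is found as well.
plus_names = [m.group('TAGNAME') for m in unanchored_plus.finditer(TEXT)]
# With * a bare "<" also matches, yielding an empty tag name.
star_names = [m.group('TAGNAME') for m in unanchored_star.finditer(TEXT)]

print(anchored_names)  # ['VOLENV2']
print(plus_names)      # ['VOLENV2', 'PANENV2']
print(star_names)      # ['VOLENV2', 'PANENV2', '']
```

Whether the ^ should stay depends on whether tags in the real file always begin a line, which this sample cannot settle.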
 
gardsted

Ups - got it - there are no flags in finditer;-)
So rtfm, once again, jorgen!
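
For anyone hitting the same thing: the second positional argument of a compiled pattern's finditer is the start position (pos), not flags. re.MULTILINE has the integer value 8, so the first approach silently means "start searching at index 8", and without multiline mode ^ still only matches at the real beginning of the string. A minimal sketch, using a shortened sample text that is only illustrative and print() so it runs on Python 3:

```python
import re

# Shortened, flattened sample in the spirit of the thread's TESTTXT (illustrative only).
TESTTXT = """<REAPER_PROJECT 0.1
<METRONOME 6 2.000000
SAMPLES "" ""
<TRACK
MAINSEND 1
"""

rex = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]*)')

# finditer(string, pos, endpos): the flag is taken as pos, i.e. start at index 8.
start_index = int(re.MULTILINE)
first = [m.group('TAGNAME') for m in rex.finditer(TESTTXT, re.MULTILINE)]

# Flags belong in re.compile (or inline in the pattern as (?m)).
rex_m = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]*)', re.MULTILINE)
second = [m.group('TAGNAME') for m in rex_m.finditer(TESTTXT)]

print(start_index)  # 8
print(first)        # []
print(second)       # ['REAPER_PROJECT', 'METRONOME', 'TRACK']
```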
 
MonkeeSage

Diez B. Roggisch said:
What the heck is that format? XML's retarded cousin living in the attic?

ROFL... for some reason that makes me think of weird Ed Edison from
Maniac Mansion, heh ;)
 
gardsted

The retarded cousin - that's me!

I keep getting confused by the caret - sometimes it works - sometimes it's better with backslash-n.
Yes - retarded cousin, I guess.

The file format is a config-track for a multitrack recording software, which I need to automate a bit.
I can start it from the command line and have it create a remix (using various VST and other effects).
Sometimes, however, we may have deleted the 'guitar.wav' and thus have to leave
out that track from the config-file or the rendering won't work.

Since it seems 'whitespace matters' in the file, I have the following code to get me a tag.
It cost me a broken cup and coffee all over the kitchen tiles - temper!

I still don't understand why I have to use \n instead of ^ at the start of TAGCONTENTS and TAGEND.
But I can live with it!

Thank you for your kind and humorous help!
kind retards
jorgen / de mente
www.myspace.com/dementedk
------------------------------------------------------------

import re

TESTTXT=open('003autoreaper.rpp').read() # whole file now

def getLevel(levl):
    # build a regex matching one complete tag block indented by levl spaces
    rex = re.compile(
        r'(?m)' # multiline
        r'(?P<TAGSTART>^ {%d}[<])' # the < character
        r'(?P<TAGNAME>[a-zA-Z0-9_]*)' # the tagname
        r'(?P<TAGDATA>[\S \t]*?$)' # the rest of the tagstart line
        r'(?P<TAGCONTENTS>(\n {%d}[^>][\S \t]*$){0,})' # all the data coming before the >
        r'(?P<TAGEND>\n {%d}>[\S \t]*$)' %(levl,levl,levl) # the > character
        )
    return rex

for i in getLevel(2).finditer(TESTTXT):
    myMatch = i.groupdict()
    print i.group('TAGNAME'),i.start('TAGSTART'), i.end('TAGEND')
    #print i.groups()
    if myMatch['TAGNAME'] == 'TRACK':
        #print i.groups()
        # search only inside the span of this TRACK tag (pos, endpos)
        for j in getLevel(6).finditer(TESTTXT,i.start('TAGSTART'), i.end('TAGEND')):
            myMatch2 = j.groupdict()
            #print j.groups()
            print j.group('TAGNAME'),j.start('TAGSTART'), j.end('TAGEND')
            if myMatch2['TAGNAME'] == 'SOURCE':
                for m in myMatch2:
                    print m, myMatch2[m]
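
On the question of why \n is needed instead of ^ at the start of TAGCONTENTS and TAGEND: in multiline mode $ matches just before a newline and ^ just after one, but both are zero-width and neither consumes the newline itself. After the $ that ends the previous group, something has to match the \n before the pattern can continue on the next line. A minimal sketch with a made-up two-line string (the names are illustrative, and print() is used so it runs on Python 3):

```python
import re

TEXT = 'first\nsecond'

# '$' stops before the newline and '^' needs to be after it; nothing in this
# pattern consumes the '\n', so the match cannot step onto the second line.
no_newline = re.search(r'(?m)^first$^second$', TEXT)

# Consuming the newline explicitly lets the match continue on the next line.
with_newline = re.search(r'(?m)^first$\n^second$', TEXT)

print(no_newline)                    # None
print(repr(with_newline.group(0)))   # 'first\nsecond'
```

Once the \n has been matched, the position is already at the start of a line, so a ^ right after it is harmless but redundant.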
 
Paul McGuire

Sorry about your coffee cup! Would you be interested in a pyparsing
rendition?

-- Paul


from pyparsing import *

def defineGrammar():
    ParserElement.setDefaultWhitespaceChars(" \t")

    ident = Word(alphanums+"_")
    LT,GT = map(Suppress,"<>")
    NL = LineEnd().suppress()

    real = Word(nums,nums+".")
    integer = Word(nums)
    quotedString = QuotedString('"')

    dataValue = real | integer | Word(alphas,alphanums) | quotedString
    dataDef = ident + ZeroOrMore(dataValue) + NL
    tagDef = Forward()
    tagDef << LT + ident + ZeroOrMore(dataValue) + NL + \
        Dict(ZeroOrMore(Group(dataDef) | Group(tagDef))) + GT + NL
    tagData = Dict(OneOrMore(Group(tagDef)))
    return tagData

results = defineGrammar().parseString(TESTTXT)
print( results.dump() )
print results.REAPER_PROJECT.TRACK.keys()
print results.REAPER_PROJECT.TRACK.PANENV2
print results.REAPER_PROJECT.TRACK.PANENV2.ACT


prints out:

[['REAPER_PROJECT', '0.1', ['METRONOME', '6', '2.000000', ['SAMPLES',
'', '']], ['TRACK', ['MAINSEND', '1'], ['VOLENV2', ['ACT', '1']],
['PANENV2', ['ACT', '1']]]]]
- REAPER_PROJECT: ['0.1', ['METRONOME', '6', '2.000000', ['SAMPLES',
'', '']], ['TRACK', ['MAINSEND', '1'], ['VOLENV2', ['ACT', '1']],
['PANENV2', ['ACT', '1']]]]
  - METRONOME: ['6', '2.000000', ['SAMPLES', '', '']]
    - SAMPLES: ['', '']
  - TRACK: [['MAINSEND', '1'], ['VOLENV2', ['ACT', '1']], ['PANENV2',
['ACT', '1']]]
    - MAINSEND: 1
    - PANENV2: [['ACT', '1']]
      - ACT: 1
    - VOLENV2: [['ACT', '1']]
      - ACT: 1
['PANENV2', 'MAINSEND', 'VOLENV2']
[['ACT', '1']]
1
 
gardsted

Thank You Paul - I am very interested.
In between drinking coffee and smashing coffee cups, I actually visited your site and my
impression was: wow, if I could only take the time instead of struggling with this
'almost there' re thing!
I am not that good at it actually, but working hard, not worrying about the cups too much...

I will now revisit pyparsing and learn!

I cheated a bit on you and read this: http://www.oreillynet.com/pub/au/2557.

I live in a little Danish town, Svendborg, nice by the sea and all.
I learned steel construction in the 80's at the local shipyard
(now closed). Much later (96-98) I received a very short education in
IT skills at a business school in Odense, the nearest city.
I spent the years 98-05 working for Maersk Data, later IBM.
From 05 onwards, independent.
Struggling hard to keep orders at a bare minimum,
I spend some of my spare time working with the elderly, some of it
programming Python for different purposes at home, some of it playing
in the band: http://myspace.com/dementedk, and some of it combining the two.

So now You know more or less the same about me as I know about You.
Jorgen
 