# Re: Differential pattern match

Discussion in 'Perl Misc' started by Graham S, Oct 4, 2012.

1. ### Graham S (Guest)

I haven't had time to check this out fully yet, but the following is a quick
guide to guitar tabs:

The six strings of a guitar, from lowest to highest pitch, are tuned EADGBE.
Thus a tab of 333x22 would mean 3rd fret on the E, A and D strings, the G
string muted (not played) and the 2nd fret on the B and E strings. This is
the normal tablature for chords where all the notes are at or below the
9th fret. Beyond the 9th fret, one or two spaces are left to
make it clear which strings are being played and at what fret. Thus '999x
10x' would mean 9th fret on the E, A & D strings, the G and E strings muted,
and 10th fret on the B string. However, some tab writers might equally show
this as '999x 10 x' or '9 9 9 x 10 x '. Whatever the notation, there
should always be a reference to all 6 strings. Your reference to 'x11 10'
would therefore mean the (low) E string muted, 1st fret on the A & D
strings, 10th fret on the G string, but would be an incomplete tab as it
doesn't give the fretting for the B and (high) E strings (I also doubt if it
could be played, as the person who could bridge his fingers between the 1st
and 10th frets probably hasn't been born yet). Similarly your 'x999 10 11
12' chord is actually a tab for a seven string instrument. No problem
though - as you say, you don't play guitar.

It's important to appreciate that one x (or X), or one digit (0-9), or two
digits (10-24), can all refer to any one string. I'm not at all worried if
the regex extends up to 29 (in fact there might be the odd 1 in a million
guitars that extends to a 29th fret).
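
A minimal sketch of that rule, covering only the fully spaced form (such as '9 9 9 x 10 x ') where every string is its own whitespace-separated entry. The helper name `is_spaced_chord` is made up for illustration, and the 0-24 limit is an assumption taken from the fret range mentioned above; the clustered forms such as '999x 10x' are not handled here.

```perl
use strict;
use warnings;

# One string entry: a mute (x or X) or a fret number 0-24.
my $entry = qr/[xX]|2[0-4]|1[0-9]|[0-9]/;

# Hypothetical helper: true if the text is exactly six
# whitespace-separated entries, one per guitar string.
sub is_spaced_chord {
    my ($text) = @_;
    my @tokens = split ' ', $text;    # also drops leading/trailing spaces
    return 0 unless @tokens == 6;
    return 0 if grep { !/\A$entry\z/ } @tokens;
    return 1;
}

for my $tab ('9 9 9 x 10 x ', 'x 11 10', '9 9 9 x 25 x') {
    printf "%-16s %s\n", "'$tab'",
        is_spaced_chord($tab) ? 'chord' : 'not a chord';
}
```

Splitting first and checking each entry separately keeps the pattern itself trivial; the hard part of the problem is the clustered notation, where entry boundaries are not marked.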

I don't have a problem with working out what number or X refers to what
string and am entirely confident I can split an individual chord tab
correctly. My problem is finding a flexible pattern match to correctly
identify when a chord is a chord.

Will check out your proposed regex ASAP, but meanwhile if you've got any
further ideas having read the above, I'd be delighted to hear them.

Graham S

"Ben Morrow" <> wrote in message
news:...
>
> Quoth "Graham" <>:
>> I need a pattern match that will match all of the following (and other
>> variants too)
>>
>> x1234x    X99 10 10x    x 12 12 12 12 12
>>
>> The characters refer to chord fingering on a 6 string guitar. Up to the
>> 9th fret, the fingering is usually tabulated as on the left above, but
>> beyond this one or two spaces are usually left between the double figures
>> encountered from the 10th-24th frets.

>
> Patterns of the first and third type (assuming I've understood correctly
> where Xs can appear: I don't play the guitar) can be matched with
> something like
>
> m[
>     ([xX])? (?:
>         ([1-9]){0,4} |
>         [ ]{0,2} (?: ([12]?[0-9]) [ ]{1,2} ){1,4}
>     ) ([xX])?
> ]x
>
> Note that this will match the empty string, which probably isn't right;
> are 'x' or 'xx' alone valid? If not you can change the 'compact' part of
> the match to ([1-9]){1,4} to require at least one digit somewhere. Also
> note that this will allow fret numbers up to 29; if that matters you
> could change the 'extended' fret number match to (2[0-4]|1?[0-9]).
>
> Including patterns of the second type is harder. How do you know whether
> 'x11 10' is 'two fingers on 1 and one on 10' or 'one finger on 11 and
> one on 10'? Is there always a space after an initial 'x' if the first
> group of digits is a single fret number? (Is there always an initial
> 'x', in which case the pattern above could have been simplified?)
>
> If a pattern with no initial 'clustered' digits always starts with
> 'x ' then we can use that to distinguish:
>
> m[
>     (?: ([Xx]) | ([Xx])? ([1-9]){0,4} )
>     (?: [ ]{1,2} ([12]?[0-9]) ){1,4}
>     ([Xx])?
> ]x
>
> All the points in the second paragraph above apply here, as well; in
> addition, this will allow patterns for chords with more than four
> fingers, such as 'x999 10 11 12'. That can't be sensibly fixed within
> the regex, but can be easily checked by counting the capture groups
> afterwards.
>
> Ben
>

Graham S, Oct 4, 2012

2. ### Rainer Weikusat (Guest)

"Graham S" <> writes:
> The six strings of a guitar, from lowest to highest pitch, are tuned EADGBE.
> Thus a tab of 333x22 would mean 3rd fret on the E, A and D strings, the G
> string muted (not played) and the 2nd fret on the B and E strings. This is
> the normal tablature for chords where all the notes are at or below the
> 9th fret. Beyond the 9th fret, one or two spaces are left to
> make it clear which strings are being played and at what fret. Thus '999x
> 10x' would mean 9th fret on the E, A & D strings, the G and E strings muted,
> and 10th fret on the B string. However, some tab writers might equally show
> this as '999x 10 x' or '9 9 9 x 10 x '.

So what's '11x 11x'? Can this be decided without examining the
complete input? Examples of that:

11x 11x3
11x 11x 33
11x 11x
11x 11x 11x

Rainer Weikusat, Oct 5, 2012

3. ### Rainer Weikusat (Guest)

"Graham" <> writes:

[...]

> '11x 11x3', if it was anything, would be
> 1-lowE string
> 1-A string
> x-D string
> 11-G string
> x-B string
> 3-highE string
> but unplayable due to the stretch between the 3rd and 11th frets (and
> assuming a capo was used to barre at the first fret)
>
> '11x 11x 33' if it was anything would be
> 11-lowE string
> x-A string
> 11-D string
> x-G string
> 3-B string
> 3-highE string
> but again unplayable due to the stretch between the 3rd and 11th frets

[...]

> My brain can figure that out pretty easily. What it can't do is figure out a
> pattern match that accounts for all the variables.

And I assume that 11x 11x33 would be the same as 11x 11x 33. If this
is true, the sequence 11x 11x3 can mean 1 1 x 11 x 3 or 11 x 11 x 3,
depending on what, if anything, comes after the final 3. Either this
needs to be parsed back to front rather than front to back, or it needs
a real exhaustive search with backtracking, based on knowing the
meaning/semantics of the input, not just the patterns occurring in it.
And I don't think this can be done with regular expressions (I could,
of course, be wrong).
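
A rough sketch of what such an exhaustive search could look like in Perl. This is only an illustration under stated assumptions, not code from the thread: the helper name `parse_chord` is hypothetical, an entry is taken to be x/X or a fret number 0-24, spaces are treated purely as optional separators, and a reading only counts if it covers exactly six strings.

```perl
use strict;
use warnings;

# Hypothetical helper: return every way to read $text as exactly
# $strings entries, each entry being 'x' or a fret number 0-24.
sub parse_chord {
    my ($text, $strings) = @_;
    $strings = 6 unless defined $strings;
    my @found;
    my $walk;
    $walk = sub {
        my ($pos, @entries) = @_;
        # Spaces only separate entries; skip them.
        $pos++ while $pos < length($text) && substr($text, $pos, 1) eq ' ';
        if ($pos >= length $text) {
            push @found, [@entries] if @entries == $strings;
            return;
        }
        return if @entries == $strings;    # input left over: dead end
        my $rest = substr $text, $pos;
        if ($rest =~ /\A[xX]/) {
            $walk->($pos + 1, @entries, 'x');
            return;
        }
        # Backtracking step: try the two-digit reading, then the
        # one-digit reading, and keep whichever lead to six entries.
        $walk->($pos + 2, @entries, $1) if $rest =~ /\A(1[0-9]|2[0-4])/;
        $walk->($pos + 1, @entries, $1) if $rest =~ /\A([0-9])/;
    };
    $walk->(0);
    undef $walk;                           # break the closure's self-reference
    return @found;
}

for my $tab ('11x 11x3', '11x 11x 33', '333x22') {
    my @readings = map { join ' ', @$_ } parse_chord($tab);
    printf "%-14s %s\n", "'$tab'",
        @readings ? join(' | ', @readings) : '(no six-string reading)';
}
```

Under these assumptions '11x 11x 33' has a single six-string reading (11 x 11 x 3 3), while '11x 11x3' has two (11 x 1 1 x 3 and 1 1 x 11 x 3), so the string count alone does not settle every case; choosing between readings would need extra knowledge, such as which stretches are playable.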

Rainer Weikusat, Oct 5, 2012

4. ### Justin C (Guest)

On 2012-10-05, Graham <> wrote:
>

[snip]
> '11x 11x 33' if it was anything would be
> 11-lowE string
> x-A string
> 11-D string
> x-G string
> 3-B string
> 3-highE string
> but again unplayable due to the stretch between the 3rd and 11th frets

I'm not convinced this is unplayable. Tab allows for, AIUI, two hands
on the neck with hammering/tapping techniques. Though maybe the
notation is different for 'tricks' like that... it's been a long time
since I played seriously... or even mucking about.

Justin.

--
Justin C, by the sea.

Justin C, Oct 8, 2012

5. ### Martin Strömberg (Guest)

Graham <> wrote:
> Apart from the following line - which I obviously wouldn't expect to match
> 8X998X 7X778X 6X776X 5X556X AXBBAX 9X99AX
> Don't know what's going on there with the As and the Bs - maybe it's a typo, or
> maybe it means something I'm not aware of (I'm no guitarist either!)

I guess it's a computer-literate guitarist who wrote those. "A" is 10
and "B" is 11.

(I _am_ a (non-professional) guitarist. But alas I've never read the
above format.)
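
If that guess is right, decoding the compact form is straightforward. The sketch below assumes one character per string, X/x for a muted string, 0-9 for frets 0-9 and letters A, B, C... for frets 10, 11, 12...; the helper name `decode_compact_tab` is made up for illustration and the letter convention is only the guess stated above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a one-character-per-string tab such as
# 'AXBBAX' into a list of entries ('x' or a fret number).
sub decode_compact_tab {
    my ($tab) = @_;
    my @entries;
    for my $char (split //, $tab) {
        if ($char =~ /[xX]/) {
            push @entries, 'x';
        }
        elsif ($char =~ /[0-9]/) {
            push @entries, $char;
        }
        elsif ($char =~ /[A-Oa-o]/) {
            # Assumed convention: A = 10, B = 11, ... up to O = 24.
            push @entries, 10 + ord(uc $char) - ord('A');
        }
        else {
            return;    # not a character this sketch understands
        }
    }
    return @entries;
}

print join(' ', decode_compact_tab($_)), "\n" for qw(8X998X AXBBAX 9X99AX);
```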

--
MartinS

Martin Strömberg, Oct 8, 2012

6. ### J. Gleixner (Guest)

On 10/11/12 12:09, Graham wrote:

> Don't know if you're still about Ben, but pulling out the string numbers is
> proving tricky. The following code (predominately yours)
>
> $string = '(\b[12][0-9]\b|[0-9Xx])[]*';
> $chord = $string*6;

                  ^ --- that should be x not *

'x' is the repetition operator.

See Multiplicative Operators in perldoc perlop.

A stare and compare was easy enough, but 'use warnings' would have
pointed it out too.
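
A small sketch of the difference, with the warning captured so it can be printed. The pattern text is only a stand-in: the space inside `[ ]` is an assumption taken from a later version of the pattern quoted in this thread.

```perl
use strict;
use warnings;

my $string = '(\b[12][0-9]\b|[0-9Xx])[ ]*';

# 'x' is string repetition: six copies of the per-string pattern.
my $chord = $string x 6;

# '*' is numeric multiplication: the pattern text is treated as a number
# (0 here, since it does not start with digits), and 'use warnings'
# reports that the argument isn't numeric.
my @warnings;
my $product;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $product = $string * 6;
}

print "copies with x: ", length($chord) / length($string), "\n";
print "result with *: $product\n";
print "warning: $warnings[0]" if @warnings;
```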

J. Gleixner, Oct 11, 2012

7. ### Justin C (Guest)

On 2012-10-12, Graham <> wrote:
> Yes, I've gotten lazy and not been using 'strict' and 'warnings'. Apologies.
> Will backtrack and amend just as soon as I've sorted this problem (promise!)

No, do it now. Get valid code first and then work out
why it doesn't do what you want. It's much easier
that way than to bend invalid code to fit your
purpose and then try and make it valid.

Justin.

--
Justin C, by the sea.

Justin C, Oct 12, 2012

8. ### Rainer Weikusat (Guest)

"Graham" <> writes:

[...]

>>> > my $string = '(\b [12][0-9] \b | [1-9Xx]) [ ]*';

[...]

>>> String = (\b[12][0-9]\b|[0-9Xx])[]*

[...]

> my $string = '(\b [12][0-9] \b | [1-9Xx]) [ ]*';

[...]

> /(?: (?: (?:(?<=[Xx])|\b) [12][0-9] (?:(?=[Xx])|\b) | [xX0-9] ) [ ]{0,2} ){6}/x

[...]

> $string = '(?<=[Xx])|\b[12][0-9](?=[Xx])|\b|[xX0-9][ ]{0,2}';

[...]

> String = [(?<=[Xx])|\b[12][0-9](?=[Xx])|\b|[xX0-9][ ]{0,2}]
> Chord =
> [(?<=[Xx])|\b[12][0-9](?=[Xx])|\b|[xX0-9][ ]{0,2}(?<=[Xx])|\b[12][0-9](?=[Xx])|\b|[xX0-9][
> ]{0,2}(?<=[Xx])|\b[12][0-9](?=[Xx])|\b|[xX0-9][ ]{0,2}(?<=[Xx])|\b[12][0-9](?=[Xx])|\b|[xX0-9][
> ]{0,2}(?<=[Xx])|\b[12][0-9](?=[Xx])|\b|[xX0-9][ ]{0,2}(?<=[Xx])|\b[12][0-9](?=[Xx])|\b|[xX0-9][
> ]{0,2}]

[...]

> String = [(?: (?: (??<=[Xx])|\b) [12][0-9] (??=[Xx])|\b) | [xX0-9] )
> [ ]{0,2} )]
> Chord = [(?: (?: (??<=[Xx])|\b) [12][0-9] (??=[Xx])|\b) | [xX0-9] )
> [ ]{0,2} )(?: (?: (??<=[Xx])|\b) [12][0-9] (??=[Xx])|\b) | [xX0-9] )
> [ ]{0,2} )(?: (?: (??<=[Xx])|\b) [12][0-9] (??=[Xx])|\b) | [xX0-9] )
> [ ]{0,2} )(?: (?: (??<=[Xx])|\b) [12][0-9] (??=[Xx])|\b) | [xX0-9] )
> [ ]{0,2} )(?: (?: (??<=[Xx])|\b) [12][0-9] (??=[Xx])|\b) | [xX0-9] )
> [ ]{0,2} )(?: (?: (??<=[Xx])|\b) [12][0-9] (??=[Xx])|\b) | [xX0-9] )
> [ ]{0,2} )]

[...]

> Any thoughts on where I'm going wrong?

You're wrongly assuming that this sequence of expanding line noise
will 'magically' turn into sensible code which happens to produce the
desired result if you just keep adding more {@!!"~££$&&&!!```Honk!
Honk! Honk! --$%& to it. In the meantime, you could have implemented
a parser which actually solves your problem three times over, if you
weren't so hell-bent on "five hours of typing to save ten minutes of
thinking".

Rainer Weikusat, Oct 12, 2012