regex: splitting conditionally on |

computorist

I'm parsing MediaWiki markup and I'd like to split a multi-line string
on | (vertical bar), but only if it isn't contained within another
pattern.

| name1 = value1
| name2 = value2 | name3 = value3
| name4 = [[foo|bar]]

I'd like String.split(/pattern/) to return
['name1 = value1', 'name2 = value2', 'name3 = value3', 'name4 = [[foo|bar]]']

I've been working on a negative look-ahead pattern without success and my
brain is tired.

Thanks for any suggestions.
 
Robert Klemme

Without testing, something like this might work:

str.scan %r{
(?: \[ [^\]]* \] | [^|] )+
}xm

Cheers

robert
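
For anyone trying the scan approach above, here is a minimal sketch run against the sample input from the original post. The helper name scan_fields is hypothetical, and the strip/reject steps are an addition, not part of the suggested pattern: scan returns each run together with its surrounding whitespace and newlines, so the pieces need trimming to match the desired output.

```ruby
# Sample input from the original post.
str = <<~WIKI
  | name1 = value1
  | name2 = value2 | name3 = value3
  | name4 = [[foo|bar]]
WIKI

# Hypothetical helper: collect runs made of either a bracketed chunk
# ("[" up to the next "]") or any single character that is not "|".
def scan_fields(text)
  text.scan(/(?:\[[^\]]*\]|[^|])+/)
      .map(&:strip)          # drop the whitespace/newlines around each run
      .reject(&:empty?)      # drop runs that were only whitespace
end

p scan_fields(str)
# => ["name1 = value1", "name2 = value2", "name3 = value3", "name4 = [[foo|bar]]"]
```

The bracketed alternative is tried first, so a | inside [[...]] is consumed as part of the link and never ends a field.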
 
Sebastian Hungerecker

str.split(/\s*\|\s*(?=\s*\w+\s*=)/)
I don't know whether this exactly meets your requirement, but it splits
only on |s that are followed by a word and a =. For your sample input that
gives the desired fields, preceded by one empty string because the input
starts with a |.

HTH,
Sebastian
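
A short sketch of the look-ahead split on the sample input, for reference. The helper name split_fields and the cleanup steps are assumptions added for illustration; the pattern itself is the one posted above.

```ruby
# Sample input from the original post.
str = <<~WIKI
  | name1 = value1
  | name2 = value2 | name3 = value3
  | name4 = [[foo|bar]]
WIKI

# Hypothetical helper: split only on bars that are followed by "word =",
# then trim each piece and drop empty ones.
def split_fields(text)
  text.split(/\s*\|\s*(?=\s*\w+\s*=)/)
      .map(&:strip)
      .reject(&:empty?)   # the input starts with "|", so split yields a leading ""
end

p split_fields(str)
# => ["name1 = value1", "name2 = value2", "name3 = value3", "name4 = [[foo|bar]]"]
```

One caveat with this approach: a link whose text itself looks like an assignment, such as [[foo|bar=baz]], would still be split at the inner bar, because "bar=" satisfies the look-ahead.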
 
computorist

This seems to work well. Searching for what I want is better than
excluding the bits I don't want.

Thanks
 
