Regular Expression for the special character "|" pipe


Aman Kashyap

I would like to create a regular expression in which I can match the "|" special character too.

e.g.

start=|ID=ter54rt543d|SID=ter54rt543d|end=|

I want to match only |ID=ter54rt543d| from the above string, but I am unable to write a pattern that contains the "|" pipe character.

By default Python treats "|" as an OR operator.

But in my case I want to use it as part of the search string.
 
Vlastimil Brom

2014-05-27 12:59 GMT+02:00 Aman Kashyap said:
I would like to create a regular expression in which I can match the "|" special character too.

--

Hi,
you can just escape the pipe with a backslash like any other metacharacter:

r"start=\|ID=ter54rt543d"

be sure to use the raw string notation r"...", or you can double all
backslashes in the string.

hth,
vbr
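
A minimal sketch of the escaping advice above, applied to the string from the question. The character class [a-z0-9]+ is an assumption about what an ID value may contain, and re.escape() is shown only as an optional helper for building the literal part of a pattern.

```python
import re

text = "start=|ID=ter54rt543d|SID=ter54rt543d|end=|"

# An escaped pipe matches a literal "|" instead of acting as alternation.
# [a-z0-9]+ is an assumed description of the ID value.
match = re.search(r"\|ID=[a-z0-9]+\|", text)
print(match.group(0))  # |ID=ter54rt543d|

# re.escape() adds the needed backslashes when the literal text
# is not known in advance.
prefix = re.escape("|ID=")
prefix_match = re.search(prefix + r"\w+", text)
print(prefix_match.group(0))  # |ID=ter54rt543d
```

Because the pattern requires a pipe directly before ID=, the SID= field is not matched.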
 
Aman Kashyap

Thanks vbr for the quick response.

I have the string |SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|

and want to replace two substrings:
|ID=re65dgt5dd| with |ID=MAN|
|ID=fkjfkf| with |ID=MAN|

I am using the regular expression ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*|$

The output is |SOH=|ID=MAN|DS=fjkjf|SDID=MAN|ID=MAN|EOM=|ID=MAN

The expected value is |SOH=|ID=MAN|DS=fjkjf|SDID=fhkhkf|ID=MAN|EOM=|

Could you please help me with this?
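
The unexpected output can be reproduced with a short sketch. The post does not show the actual call, so re.sub() with the replacement text ID=MAN is an assumption here. It suggests two causes: the unescaped "|" turns the pattern into "ID=... OR end of string", and without a leading pipe the pattern also matches the ID= inside SDID=.

```python
import re

s = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"
pattern = r"ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*|$"

# Unescaped, "|" separates two alternatives: the ID=... part, or "$".
matches = [m.group(0) for m in re.finditer(pattern, s)]
print(matches)  # ['ID=re65dgt5dd', 'ID=fhkhkf', 'ID=fkjfkf', '']

# The empty match at the end of the string is replaced too,
# which appends an extra "ID=MAN" (assumed replacement text).
output = re.sub(pattern, "ID=MAN", s)
print(output)  # |SOH=|ID=MAN|DS=fjkjf|SDID=MAN|ID=MAN|EOM=|ID=MAN
```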
 
Daniel

What about skipping the re module and trying this:

'start=|ID=ter54rt543d|SID=ter54rt543d|end=|'.split('|')[1][3:]
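
For illustration, here is how the split-based idea plays out step by step, plus one possible way to carry it over to the replacement problem from the earlier post. The join-based rewrite is only a sketch and assumes every field that starts with "ID=" should become "ID=MAN".

```python
text = "start=|ID=ter54rt543d|SID=ter54rt543d|end=|"

fields = text.split("|")
print(fields)  # ['start=', 'ID=ter54rt543d', 'SID=ter54rt543d', 'end=', '']

# fields[1] is the ID field; [3:] drops the leading "ID=".
value = fields[1][3:]
print(value)  # ter54rt543d

# Assumed extension: rewrite every field that starts with "ID=".
s = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"
result = "|".join("ID=MAN" if f.startswith("ID=") else f for f in s.split("|"))
print(result)  # |SOH=|ID=MAN|DS=fjkjf|SDID=fhkhkf|ID=MAN|EOM=|
```

Since each field is tested with startswith(), SDID= is left alone without any regular expression.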
 
Aman Kashyap

Thanks for the response.

I got the answer finally.

This is the regular expression to be used: \\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|
 
Wolfgang Maier

Aman Kashyap wrote:
This is the regular expression to be used: \\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|

Or, more readably:

r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|'

This is what Vlastimil was talking about. It saves you from having to
escape the backslashes.
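
A quick check, as a sketch, that the doubled-backslash literal and the raw string describe the same pattern, and that it produces the expected value from the earlier post. The use of re.sub() with the replacement text |ID=MAN| is an assumption, since the thread never shows the actual call.

```python
import re

doubled = "\\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|"
raw = r"\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|"

# Both literals produce the same string, so they are the same regex.
print(doubled == raw)  # True

s = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"

# The pipes are part of the match, so the replacement puts them back.
result = re.sub(raw, "|ID=MAN|", s)
print(result)  # |SOH=|ID=MAN|DS=fjkjf|SDID=fhkhkf|ID=MAN|EOM=|
```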
 
Roy Smith

Wolfgang Maier said:
or, and more readable:

r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|'

This is what Vlastimil was talking about. It saves you from having to
escape the backslashes.

Sometimes what I do, instead of using backslashes, is put the problem
character into a character class by itself. It's a matter of personal
opinion which way is easier to read, but it certainly eliminates all the
questions about "how many backslashes do I need?"

r'[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]'
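
As a small sketch, the character-class form can be compared against the backslash form on the string from earlier in the thread. Inside [...] the pipe loses its special meaning, so both patterns should find the same substrings.

```python
import re

escaped = re.compile(r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|')
bracketed = re.compile(r'[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]')

s = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"

found = bracketed.findall(s)
print(found)                        # ['|ID=re65dgt5dd|', '|ID=fkjfkf|']
print(escaped.findall(s) == found)  # True
```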

Another thing that can help make regexes easier to read is the VERBOSE
flag. Basically, it ignores whitespace inside the regex, except inside a
character class or when escaped with a backslash (see
https://docs.python.org/2/library/re.html#module-contents for details).
So, you can write something like:

import re

pattern = re.compile(r'''[|]
                         ID=
                         [a-z]*
                         [0-9]*
                         [a-z]*
                         [0-9]*
                         [a-z]*
                         [|]''',
                     re.VERBOSE)

Or, alternatively, take advantage of the fact that Python concatenates
adjacent string literals, and write it like this:

import re

pattern = re.compile(r'[|]'
                     r'ID='
                     r'[a-z]*'
                     r'[0-9]*'
                     r'[a-z]*'
                     r'[0-9]*'
                     r'[a-z]*'
                     r'[|]'
                     )
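
To illustrate, the two layouts can be put side by side and checked against each other. This sketch also uses the "#" comments that VERBOSE mode permits; the comment wording and the variable names are only illustrative.

```python
import re

verbose = re.compile(r'''
    [|]                                  # opening pipe, literal inside [...]
    ID=
    [a-z]* [0-9]* [a-z]* [0-9]* [a-z]*   # alternating letter and digit runs
    [|]                                  # closing pipe
    ''', re.VERBOSE)

# Adjacent string literals are joined at compile time into one pattern.
concatenated = re.compile(
    r'[|]'
    r'ID='
    r'[a-z]*' r'[0-9]*' r'[a-z]*' r'[0-9]*' r'[a-z]*'
    r'[|]'
)

s = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"
verbose_found = verbose.findall(s)
concatenated_found = concatenated.findall(s)
print(verbose_found)                        # ['|ID=re65dgt5dd|', '|ID=fkjfkf|']
print(verbose_found == concatenated_found)  # True
```

The concatenated form needs no flag, while the VERBOSE form allows inline comments; which one reads better is a matter of taste.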
 
Mark Lawrence

I'm pleased to see that you have answers. In return would you please
use the mailing list
https://mail.python.org/mailman/listinfo/python-list or read and action
this https://wiki.python.org/moin/GoogleGroupsPython to prevent us
seeing double line spacing and single line paragraphs, thanks.
 
