One more regular expressions question

Victor Polukcht

I have a couple of strings like:

Unassigned Number (1) 32
No Route To Destination (3) 12
Normal call clearing (16) 2654
User busy (17) 630
No user respond (18) 5
User alerting no answer (19) 16
Call rejected (21) 3
Destination out of order (27) 1
Invalid number format (28) 32
Normal unspecified (31) 32
No channel available (34) 2
Temporary failure (41) 11
Switching equipment congestion (42) 4
Resource unavailable unspecified (47) 2
Bearer capability not authorized (57) 73
Incomp. dest. / Non-existent CUG (88) 1
Recovery on timer expiry (102) 2
Interworking, unspecified (127) 5

I need to get:
Error code (value in brackets) - Value - Message.

My actual problem is that I can't figure out how to include spaces, commas and slashes.
 
Roberto Bonvallet

Victor said:
My actual problem is that I can't figure out how to include spaces, commas and slashes.

Post here what you have written already, so we can tell you what the
problem is.
 
Peter Otten

Victor said:
I have a couple of strings like:

Unassigned Number (1) 32
No Route To Destination (3) 12
Normal call clearing (16) 2654
User busy (17) 630
No user respond (18) 5
User alerting no answer (19) 16
Call rejected (21) 3
Destination out of order (27) 1
Invalid number format (28) 32
Normal unspecified (31) 32
No channel available (34) 2
Temporary failure (41) 11
Switching equipment congestion (42) 4
Resource unavailable unspecified (47) 2
Bearer capability not authorized (57) 73
Incomp. dest. / Non-existent CUG (88) 1
Recovery on timer expiry (102) 2
Interworking, unspecified (127) 5

I need to get:
Error code (value in brackets) - Value - Message.

My actual problem is that I can't figure out how to include spaces, commas and slashes.

The following solution is 100% regex-free :)

>>> lines = """Unassigned Number (1) 32
... No Route To Destination (3) 12
... Normal call clearing (16) 2654
... User busy (17) 630
... No user respond (18) 5
... User alerting no answer (19) 16
... Call rejected (21) 3
... Destination out of order (27) 1
... Invalid number format (28) 32
... Normal unspecified (31) 32
... No channel available (34) 2
... Temporary failure (41) 11
... Switching equipment congestion (42) 4
... Resource unavailable unspecified (47) 2
... Bearer capability not authorized (57) 73
... Incomp. dest. / Non-existent CUG (88) 1
... Recovery on timer expiry (102) 2
... Interworking, unspecified (127) 5"""
>>> [int(line.rsplit("(", 1)[1].split(")", 1)[0]) for line in
... lines.splitlines()]
[1, 3, 16, 17, 18, 19, 21, 27, 28, 31, 34, 41, 42, 47, 57, 88, 102, 127]
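
The same rsplit/split idea can be stretched to pull out all three fields the question asks for. This is only a sketch: the helper name split_row and the output format are made up for illustration, and it assumes every row ends in "(code) value" as in the sample data.

```python
def split_row(row):
    """Split 'Message (code) value' into (code, value, message) without regex."""
    # Split from the right, so only the last "(" in the row is used.
    message, rest = row.rsplit("(", 1)
    # rest now looks like "1) 32"; int() tolerates the leading space.
    code, value = rest.split(")", 1)
    return int(code), int(value), message.strip()


rows = [
    "Unassigned Number (1) 32",
    "Incomp. dest. / Non-existent CUG (88) 1",
    "Interworking, unspecified (127) 5",
]
for row in rows:
    code, value, message = split_row(row)
    print("%d - %d - %s" % (code, value, message))
# prints:
# 1 - 32 - Unassigned Number
# 88 - 1 - Incomp. dest. / Non-existent CUG
# 127 - 5 - Interworking, unspecified
```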

Peter
 
Victor Polukcht

My pattern now is:

(?P<var1>[^(]+)(?P<var2>\d+)\)\s+(?P<var3>\d+)

And I expect to get:

var1 = "Unassigned Number "
var2 = "1"
var3 = "32"

I'm sure my regexp is incorrect, but can't understand where exactly.

Regex.debug shows that even the first block is incorrect.

Thanks in advance.
 
Daniele Varrazzo

Victor said:
I have a couple of strings like:

Unassigned Number (1) 32 [...]
Interworking, unspecified (127) 5

I need to get:
Error code (value in brackets) - Value - Message.

My actual problem is that I can't figure out how to include spaces, commas and slashes.

Probably you have some escaping problem. The substitution:

re.sub(r"^(.*)\s*\((\d+)\)\s+(\d+)", r'\2 - \3 - \1', row)

does the required job (where "row" is one of your lines)

To match a special character such as "(", you need to escape it with a
"\", because it has a special meaning in the regexp syntax. Because "\"
is also the escape character in Python strings, you had better use raw
strings to specify the pattern.

Other special sequences, such as "\s" for whitespace, are documented,
together with everything else you need, at
http://docs.python.org/lib/re-syntax.html
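
A minimal sketch of the substitution above applied to a few of the sample rows. The names PATTERN and reorder are made up for illustration, and the final .strip() is an addition that is not part of the original one-liner: it drops the trailing space that the greedy first group keeps at the end of the message.

```python
import re

# A raw string keeps every backslash intact for the regex engine,
# so \( and \) reach it as "match a literal parenthesis".
PATTERN = re.compile(r"^(.*)\s*\((\d+)\)\s+(\d+)")


def reorder(row):
    """Rewrite 'Message (code) value' as 'code - value - Message'."""
    # \1 is the message, \2 the code in brackets, \3 the trailing value.
    return PATTERN.sub(r"\2 - \3 - \1", row).strip()


for row in ("Unassigned Number (1) 32",
            "Incomp. dest. / Non-existent CUG (88) 1",
            "Interworking, unspecified (127) 5"):
    print(reorder(row))
# prints:
# 1 - 32 - Unassigned Number
# 88 - 1 - Incomp. dest. / Non-existent CUG
# 127 - 5 - Interworking, unspecified
```

Spaces, commas and slashes need no special treatment here, because "." already matches any character except a newline.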

HTH

Daniele
 
harvey.thomas

Victor said:
My pattern now is:

(?P<var1>[^(]+)(?P<var2>\d+)\)\s+(?P<var3>\d+)

And I expect to get:

var1 = "Unassigned Number "
var2 = "1"
var3 = "32"

I'm sure my regexp is incorrect, but can't understand where exactly.

Regex.debug shows that even the first block is incorrect.

Thanks in advance.

You are missing \( after the first group. The RE should be:

'(?P<var1>[^(]+)\((?P<var2>\d+)\)\s+(?P<var3>\d+)'
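
To see the difference the added \( makes, here is a small sketch that runs both patterns against the first sample row. The variable names broken and fixed are only for illustration.

```python
import re

row = "Unassigned Number (1) 32"

# Original pattern: nothing consumes the "(" between var1 and var2.
broken = re.compile(r"(?P<var1>[^(]+)(?P<var2>\d+)\)\s+(?P<var3>\d+)")
# Corrected pattern with the literal "(" escaped as \(.
fixed = re.compile(r"(?P<var1>[^(]+)\((?P<var2>\d+)\)\s+(?P<var3>\d+)")

print(broken.match(row))             # None
print(fixed.match(row).groupdict())
# {'var1': 'Unassigned Number ', 'var2': '1', 'var3': '32'}
```

Note that var1 still carries the trailing space, exactly as in the expected output quoted above.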
 
Victor Polukcht

Great thanks.

Your post helped me so much!

My resulting regexp is:
"(?P<var1>^(.*)\s*)\(((?P<var2>\d+))\)\s+((?P<var3>\d+))"

 
Jussi Salmela

Victor Polukcht wrote:
Great thanks.

Your post helped me so much!

My resulting regexp is:
"(?P<var1>^(.*)\s*)\(((?P<var2>\d+))\)\s+((?P<var3>\d+))"

If it doesn't have to be a regex:

#===================================================
s = '''\
Unassigned Number (1) 32
No Route To Destination (3) 12
Normal call clearing (16) 2654
User busy (17) 630
No user respond (18) 5
User alerting no answer (19) 16
Call rejected (21) 3
Destination out of order (27) 1
Invalid number format (28) 32
Normal unspecified (31) 32
No channel available (34) 2
Temporary failure (41) 11
Switching equipment congestion (42) 4
Resource unavailable unspecified (47) 2
Bearer capability not authorized (57) 73
Incomp. dest. / Non-existent CUG (88) 1
Recovery on timer expiry (102) 2
Interworking, unspecified (127) 5
'''

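for row in s.split('\n')[:-1]:
    # Split off the message at the opening parenthesis.
    var1, var2 = row.split('(')
    # What remains is "code) value"; split it on whitespace.
    var2, var3 = var2.split()
    var2 = var2[:-1]  # drop the trailing ")"
    print(var2, var3, var1)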
#===================================================

Cheers,
Jussi
 
Daniele Varrazzo

Victor said:
Great thanks.

Your post helped me so much!

My resulting regexp is:
"(?P<var1>^(.*)\s*)\(((?P<var2>\d+))\)\s+((?P<var3>\d+))"

Notice that this way you are including trailing whitespace in the var1
group. You may want to put the "\s*" outside the parentheses.

Mmm... in that case you should also make the ".*" in the first group
non-greedy. r"^(?P<var1>.*?)\s*\(((?P<var2>\d+))\)\s+((?P<var3>\d+))"
does the job.
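
A small sketch comparing the greedy and non-greedy forms on one sample row. The extra unnamed parentheses around var2 and var3 are left out here for brevity, and the names greedy and lazy are only illustrative.

```python
import re

row = "Unassigned Number (1) 32"

# Greedy ".*" grabs as much as it can, so it keeps the space before "(".
greedy = re.compile(r"^(?P<var1>.*)\s*\((?P<var2>\d+)\)\s+(?P<var3>\d+)")
# Non-greedy ".*?" stops as early as it can, leaving the space to "\s*".
lazy = re.compile(r"^(?P<var1>.*?)\s*\((?P<var2>\d+)\)\s+(?P<var3>\d+)")

greedy_msg = greedy.match(row).group("var1")
lazy_msg = lazy.match(row).group("var1")

print(repr(greedy_msg))  # 'Unassigned Number '
print(repr(lazy_msg))    # 'Unassigned Number'
```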

Bye

Daniele
 
Neil Cerutti

My pattern now is:

(?P<var1>[^(]+)(?P<var2>\d+)\)\s+(?P<var3>\d+)

And I expect to get:

var1 = "Unassigned Number "
var2 = "1"
var3 = "32"

I'm sure my regexp is incorrect, but can't understand where
exactly.

Break it up using verbose notation to help yourself. Also, use
more helpful names. With names like var1 and var2 you might as
well not use named groups.

r = re.compile(r"""(?x)
(?P<error> [^(]+ )
(?P<errno> \d+ )
\)
\s+
(?P<lineno> \d+ )""")

This way it's clearer that there's a \) with no matching \(.
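
For comparison, a sketch of the same verbose layout with the missing \( put back. The group names are kept from the pattern above; the comments inside the pattern and the name cause_re are additions for illustration.

```python
import re

cause_re = re.compile(r"""(?x)
    (?P<error>  [^(]+ )   # message text, everything before the bracket
    \(                    # literal opening parenthesis (was missing)
    (?P<errno>  \d+ )     # number inside the brackets
    \)                    # literal closing parenthesis
    \s+
    (?P<lineno> \d+ )     # trailing number
""")

m = cause_re.match("User busy (17) 630")
print(m.group("errno"), m.group("lineno"), m.group("error").strip())
# prints: 17 630 User busy
```

In verbose mode, whitespace in the pattern is ignored and "#" starts a comment, so each piece can sit on its own line.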
 
