re.match and non-alphanumeric characters

The Web President · Nov 16, 2008

Dear all,

this is really driving me nuts and any help would be extremely
appreciated.

I have a string that contains some numeric data. I want to isolate
these data using re.match, as follows.

import re

bogus = "IFC(35m)"
data = re.match(r'(\d+)', bogus)
print data.group(1)

I would expect to have "35" printed out to screen, but instead I get
an error that the regular expression did not match:

Traceback (most recent call last):
  File "C:\Documents and Settings\Mattia\Desktop\Neeltje\read.py", line 20, in <module>
    print data.group(1)
AttributeError: 'NoneType' object has no attribute 'group'

Note that the same holds if I look for "35" straight, instead of
"\d+". If instead I look for "IFC" it works fine. That is, apparently
re.match will match only up to the first non-alphanumeric character
and ignore anything after a "(", "_", "[" and god knows what else.

I am using Python 2.6 (r26:66721, latest stable version). Am I missing
something very big and very important?

r · Nov 16, 2008

Try re.search or re.findall.
re.match only matches at the beginning of a string;
I almost never use it.

(4, 6)
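A minimal sketch of that suggestion, applied to the string from the original post. It is written with print() calls so it runs on Python 3 (the thread itself uses Python 2.6); the name found is only illustrative. The (4, 6) in the reply above is consistent with the span of "35" in that string.

```python
import re

bogus = "IFC(35m)"

# re.match only tries position 0, where the character is "I", so it fails.
print(re.match(r"(\d+)", bogus))      # None

# re.search scans forward until the pattern matches somewhere in the string.
found = re.search(r"(\d+)", bogus)
print(found.group(1))                 # 35
print(found.span())                   # (4, 6)

# re.findall returns every non-overlapping match as a list of strings.
print(re.findall(r"\d+", bogus))      # ['35']
```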

MRAB · Nov 16, 2008

re.match() anchors the match at the start of the string. What you need
is re.search(). It's all in the documentation!
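Both re.match() and re.search() return None when nothing matches, which is where the AttributeError in the original post comes from. A small sketch of a guard against that, assuming a hypothetical helper name first_number (not from the thread) and Python 3 print syntax:

```python
import re

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    found = re.search(r"\d+", text)
    if found is None:
        # No digits anywhere: report that instead of calling .group() on None.
        return None
    return found.group(0)

print(first_number("IFC(35m)"))   # 35
print(first_number("IFC"))        # None
```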

Gabriel Genellina · Nov 16, 2008

On Sun, 16 Nov 2008 14:33:42 -0200, The Web President wrote:

I have a string that contains some numeric data. I want to isolate
these data using re.match, as follows.

bogus = "IFC(35m)"
data = re.match(r'(\d+)',bogus)
print data.group(1)

I would expect to have "35" printed out to screen, but instead I get
an error that the regular expression did not match:

http://docs.python.org/library/re.html#matching-vs-searching

Diez B. Roggisch · Nov 16, 2008

Yep - re.search. Match matches the whole string. You want searching.

Diez

John Machin · Nov 16, 2008

Match matches the whole string.

*ONLY* if the pattern ends with "$" or r"\Z"

Diez B. Roggisch · Nov 16, 2008

John said:
*ONLY* if the pattern ends with "$" or r"\Z"

You think so?

import re

rex = re.compile("abc.*def")

if rex.match("abc0123455678def"):
    print "matched"

Diez

Steve Holden · Nov 16, 2008

Diez said:
You think so?

import re

rex = re.compile("abc.*def")

if rex.match("abc0123455678def"):
    print "matched"

Your test is inconclusive: necessary, but not sufficient.
...     print "Matched"
...
Matched

regards
Steve
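One way to make the test conclusive, sketched here with a hypothetical string that has extra text after "def". The string is an assumption for illustration, not necessarily the one Steve used, and the code uses Python 3 print syntax.

```python
import re

rex = re.compile("abc.*def")

# Diez's string: the pattern happens to cover every character, so this
# case cannot tell "matches at the start" from "matches the whole string".
whole = rex.match("abc0123455678def")
print(whole.group(0) == "abc0123455678def")   # True

# Hypothetical string with trailing text: match() still succeeds,
# but the matched portion stops before "XYZ".
text = "abc0123455678defXYZ"
partial = rex.match(text)
print(partial.group(0))                       # abc0123455678def
print(partial.end() < len(text))              # True
```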

John Machin · Nov 17, 2008

You think so?

import re

rex = re.compile("abc.*def")

if rex.match("abc0123455678def"):
    print "matched"

OK, I'll try again:

The following 3-tuples represent (pattern, string,
matched_portion_of_string):
('abc', 'abc', 'abc')
('abc', 'abcdef', 'abc')
('abc$', 'abc', 'abc')
('abc$', 'abcdef', '<no match>')

Saying "Match matches the whole string" is incorrect; see the second
case. If you want to ensure that the whole string matches the pattern,
the pattern needs to be terminated by "$" or "\Z".
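The four cases above can be reproduced with a short script. This is a sketch in Python 3 syntax; matched_portion is a hypothetical helper name. The last lines go slightly beyond the thread: "$" also matches just before a trailing newline while r"\Z" does not, and re.fullmatch (Python 3.4 and later, so not available in the 2.6 used here) requires the whole string to match without an explicit anchor.

```python
import re

def matched_portion(pattern, string):
    """Return what re.match matched, or '<no match>' if it failed."""
    m = re.match(pattern, string)
    return m.group(0) if m else "<no match>"

cases = [
    ("abc", "abc"),
    ("abc", "abcdef"),
    ("abc$", "abc"),
    ("abc$", "abcdef"),
]

for pattern, string in cases:
    print((pattern, string, matched_portion(pattern, string)))

# "$" accepts a single trailing newline after the match; r"\Z" does not.
print(matched_portion(r"abc$", "abc\n"))     # abc
print(matched_portion(r"abc\Z", "abc\n"))    # <no match>

# Python 3.4+: fullmatch succeeds only if the entire string matches.
print(re.fullmatch("abc", "abcdef"))         # None
```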
