# group 0 in the re module

Discussion in 'Python' started by Yingjie Lan, Dec 8, 2010.

1. ### Yingjie Lan (Guest)

Hi,

According to the doc, group(0) is the entire match.

>>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
>>> m.group(0)  # The entire match
'Isaac Newton'

But if you do this:
>>> import re
>>> re.sub(r'(\d{3})(\d{3})', r'\0 to \1-\2', '757234')
'\x00 to 757-234'

where I expected
'757234 to 757-234'

Then I found that in Python's re module, '\0' in a replacement string
is treated as an octal escape (a NUL character), not as group 0.
So, is there any way to refer to the entire match with an escape
sequence?

Thanks,

Yingjie

Yingjie Lan, Dec 8, 2010

2. ### J. Gerlach (Guest)

On 08.12.2010 03:23, Yingjie Lan wrote:
> Hi,
>
> According to the doc, group(0) is the entire match.
>
> >>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
> >>> m.group(0)  # The entire match
> 'Isaac Newton'
>
> But if you do this:
> >>> import re
> >>> re.sub(r'(\d{3})(\d{3})', r'\0 to \1-\2', '757234')
> '\x00 to 757-234'
>
> where I expected
> '757234 to 757-234'
>
> Then I found that in Python's re module '\0' is treated as an octal escape.
> So, is there any way to refer to the entire match by an escaped
> notation?
>
> Thanks,
>
> Yingjie
>

the documentation of the re module says:

> \g<number> uses the corresponding group number; \g<2> is
> therefore equivalent to \2, but isn't ambiguous in a replacement such
> as \g<2>0. \20 would be interpreted as a reference to group 20, not a
> reference to group 2 followed by the literal character '0'. The
> backreference \g<0> substitutes in the entire substring matched by
> the RE.

... so you're looking for r"\g<0> to \1-\2"
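
For anyone who wants to try it, here is a small sketch that compares the two replacement strings from this thread. The variable names and the callable-replacement variant are my own additions for illustration and do not come from the docs excerpt above:

```python
import re

pattern = r'(\d{3})(\d{3})'
text = '757234'

# '\0' in a replacement template is read as an octal escape (NUL), not group 0.
wrong = re.sub(pattern, r'\0 to \1-\2', text)

# '\g<0>' substitutes the entire matched substring.
right = re.sub(pattern, r'\g<0> to \1-\2', text)

# Alternative: pass a function and call m.group(0) on the match object.
via_func = re.sub(
    pattern,
    lambda m: f"{m.group(0)} to {m.group(1)}-{m.group(2)}",
    text,
)

# '\g<1>0' is group 1 followed by a literal '0', avoiding the '\10' ambiguity.
padded = re.sub(pattern, r'\g<1>0', text)

print(repr(wrong))   # '\x00 to 757-234'
print(right)         # 757234 to 757-234
print(via_func)      # 757234 to 757-234
print(padded)        # 7570
```

The callable form is handy when the replacement needs more logic than a template can express; for a plain substitution, \g<0> is the shorter choice.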

J. Gerlach, Dec 8, 2010