reg exp and octal notation

Lucas Branca

Could someone explain to me the difference between the results below?

## $cat octals.txt
## \006\034abc

import re

a= "\006\034abc"
preg= re.compile(r'([\0-\377]*)')
res = preg.search(a)
print res.groups()

loader = open('./octals.txt', 'r')
b = loader.readline()
preg= re.compile(r'([\0-\377]*)')
res = preg.search(b)
print res.groups()


RESULTS

('\x06\x1cabc',)

('\\006\\034abc\n',)


Many thanks
Lucas
 
Ruud de Jong

Lucas Branca schreef:
Could someone explain me the difference between the results below?

## $cat octals.txt
## \006\034abc

import re

a= "\006\034abc"
preg= re.compile(r'([\0-\377]*)')
res = preg.search(a)
print res.groups()

loader = open('./octals.txt', 'r')
b = loader.readline()

Look at the value of b at this point and you'll see: '\\006\\034abc\n'

In other words, the backslashes are seen as literal backslashes.
readline() does no evaluation of the string, it just copies the
characters.
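
A minimal sketch of that point, written for Python 3 (the thread itself uses Python 2). The temporary file is only a stand-in for octals.txt and is an assumption of this example, not part of the original code:

```python
import os
import tempfile

# Literal in source code: the compiler turns \006 and \034 into single characters.
a = "\006\034abc"

# Write the same *visible* text to a file. The raw string keeps the
# backslashes and digits exactly as typed. (Stand-in for octals.txt.)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(r"\006\034abc" + "\n")
    path = f.name

# readline() copies the characters; it does not evaluate escape sequences.
with open(path) as loader:
    b = loader.readline()
os.remove(path)

print(repr(a), len(a))
print(repr(b), len(b))
```

The two lengths differ because every `\006` in the file is four separate characters, while in the literal it is one.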

Regards,

Ruud
Peter Otten

a and b are two entirely different strings. Whatever similarity there
appears to be is an artifact of Python's treatment of escape sequences,
which happens only in source code, not in an arbitrary file.

Your literal string:
'\x06\x1c\n'

What you read from the text file:
'\\006\\034\n'

Maybe it helps to learn what's really inside these two strings, so let's
have a look at the ASCII codes (s is your literal string, t is what you
read from the file):
>>> map(ord, s)
[6, 28, 10]
>>> map(ord, t)
[92, 48, 48, 54, 92, 48, 51, 52, 10]
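
The same inspection as a Python 3 sketch. The names `s` and `t` follow the post; building `t` from a raw string is an assumption made here to imitate what the file contains:

```python
# s: escape sequences are evaluated by the compiler, giving three characters.
s = "\006\034\n"

# t: a raw string keeps backslashes and digits as typed, like text read from a file.
t = r"\006\034" + "\n"

codes_s = [ord(c) for c in s]
codes_t = [ord(c) for c in t]

print(codes_s)
print(codes_t)
```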

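Another example: in source code you can write the newline in several
ways (such as "\n", "\x0a" or "\012"), and the interpreter shows the same
character for each of them:
('\n', '\n', '\n', '\n')

But if read from a file, \n, \x0a and \012 would just be sequences of two or
four characters.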
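
A short Python 3 sketch of this. The use of `chr(10)` as a fourth spelling is an assumption of this example, and the raw strings stand in for text that arrives from a file:

```python
# Four ways to produce the newline character in source code.
spellings = ("\n", "\x0a", "\012", chr(10))
all_same = all(x == "\n" for x in spellings)

# The same spellings as plain text (raw strings): nothing is evaluated,
# so they are just runs of two or four characters.
raw_lengths = [len(r"\n"), len(r"\x0a"), len(r"\012")]

print(all_same)
print(raw_lengths)
```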

Only when you have understood the above should you return to regular
expressions. Your regexp always matches the whole string, i.e. it is
redundant (and probably not what you want, but you would need to
explain that in another post).

[\0-\377] is just a fancy way of writing "match any character"
* means "repeat the preceding as often as you want" (including zero times)
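
To see that the pattern accepts anything in these strings, here is a hedged Python 3 sketch. The variable names are made up for this example, and the claim only holds for characters with codes up to 255:

```python
import re

preg = re.compile(r'([\0-\377]*)')

literal = "\006\034abc"          # as written in source code
from_file = "\\006\\034abc\n"    # as returned by readline() in the thread

# In both cases the group captures the entire input string.
whole_literal = preg.search(literal).group(1) == literal
whole_file = preg.search(from_file).group(1) == from_file

# "*" also allows zero repetitions, so an empty string matches as well.
empty = preg.search("").group(1)

print(whole_literal, whole_file, repr(empty))
```

So the regular expression cannot tell the two strings apart; the difference exists before it ever runs.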

Peter
 
Lucas Branca

-- snip --
In other words, the backslashes are seen as literal backslashes.
readline() does no evaluation of the string, it just copies the
characters

Yeah, you are right, guys. I mixed up two problems;
the regexps are innocent.

OK, let me put it this way:
I have to read each line of a file and strip a particular string from it
(a string containing octal notation too).

The problem is actually file.readline(), which doesn't return
what I expected.

Pardon my 'newbyeeeee' question, but is there a way to read a line xy from
that file and obtain

line xy: \006\034abc

('\x06\x1cabc',)

and not every single character in it, as I get now?
('\\006\\034abc\n',)

(before I start to reinvent the wheel ....... :) )

Thank you
Lucas
 
Jeff Epler

If you have a string and want to perform backslash-substitution on it,
use Python 2.3's "string_escape" codec.

Two examples:
'0'

You can remove the trailing newline this way:
if s.endswith("\n"): s = s[:-1]
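
The "string_escape" codec exists only in Python 2. A rough Python 3 counterpart, sketched under the assumption that the line contains only Latin-1 characters, goes through bytes and the "unicode_escape" codec:

```python
# What readline() returned in the thread: literal backslashes plus a newline.
line = "\\006\\034abc\n"

# Remove the trailing newline, as suggested above.
if line.endswith("\n"):
    line = line[:-1]

# Interpret the backslash escapes. The latin-1 round trip is an assumption
# of this sketch; it keeps each character as one byte before decoding.
decoded = line.encode("latin-1").decode("unicode_escape")

print(repr(decoded))
```

After this step `decoded` holds the same five characters as the literal "\006\034abc" in source code.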

Jeff
 
Lucas Branca

Great!
It's just what I was looking for.
(...and I read it in "what's new" this morning ......
.... "boing boing" with my head now ... :) )

Thank you very much



 
