Raw String Question

Jim Garrison · Mar 12, 2009

I'm an experienced Perl developer learning Python, but I seem to
be missing something about raw strings. Here's a transcript of
a Python shell session:

Python 3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit
(Intel)] on win32
Type "copyright", "credits" or "license()" for more information.

****************************************************************
Personal firewall software may warn about the connection IDLE
makes to its subprocess using this computer's internal loopback
interface. This connection is not visible on any external
interface and no data is sent to or received from the Internet.
****************************************************************

IDLE 3.0
'a\\"'
r"a\"
SyntaxError: EOL while scanning string literal (<pyshell#45>, line 1)

It seems the parser is interpreting the backslash as an escape
character in a raw string if the backslash is the last character.
Is this expected?

Tim Chase · Mar 12, 2009

r"a\"

SyntaxError: EOL while scanning string literal (<pyshell#45>, line 1)

It seems the parser is interpreting the backslash as an escape
character in a raw string if the backslash is the last character.
Is this expected?

Yep...as documented[1], "even a raw string cannot end in an odd
number of backslashes".

-tkc

[1]
http://docs.python.org/reference/lexical_analysis.html
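
A minimal sketch of the rule quoted above, assuming a Python 3 interpreter. The helper name literal_compiles is made up for illustration; the source text of each literal is assembled from ordinary strings so that the example file itself stays valid.

```python
def literal_compiles(source):
    """Return True if `source` is accepted by the compiler as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

# Build the *source text* of two raw-string literals.
one_backslash = 'r"a' + "\\" + '"'          # the text  r"a\"
two_backslashes = 'r"a' + "\\" * 2 + '"'    # the text  r"a\\"

print(literal_compiles(one_backslash))      # odd number of trailing backslashes
print(literal_compiles(two_backslashes))    # even number of trailing backslashes

# Both backslashes stay in the value of the raw literal.
value = eval(two_backslashes)
print(len(value))
```

The odd case is rejected before the string ever exists, which is why the error is a SyntaxError rather than a runtime exception.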

Jim Garrison · Mar 12, 2009

Tim said:
SyntaxError: EOL while scanning string literal (<pyshell#45>, line 1)

It seems the parser is interpreting the backslash as an escape
character in a raw string if the backslash is the last character.
Is this expected?

Yep...as documented[1], "even a raw string cannot end in an odd number
of backslashes".

-tkc

[1]
http://docs.python.org/reference/lexical_analysis.html

OK, I'm curious as to the reasoning behind saying that

When an 'r' or 'R' prefix is present, a character following a
backslash is included in the string without change, and all
backslashes are left in the string.

which sounds reasonable, but then saying in effect "Oh wait, let's
introduce a special case and make it impossible to have a literal
backslash as the last character of a string without doubling it".

So you have a construct (r'...') whose sole reason for existence
is to ignore escapes, but it REQUIRES an escape mechanism for one
specific case (which comes up frequently in Windows pathnames).

I would suggest that this is pathologically inconsistent (now donning
my flameproof underwear :-)

At the very least the "all backslashes are left in the string" quote
from the Lexical Analysis page (rendered in italics no less) needs to
be reworded to include the exception instead of burying this in a
parenthetical side-comment.
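
For the Windows pathname case mentioned above, two common workarounds can be sketched as follows. This is only an illustration: the path C:\temp is an arbitrary example, and ntpath is used (rather than os.path) so the result is the same on any platform.

```python
import ntpath  # Windows path rules, importable on any platform

base = r"C:\temp"

# 1. Adjacent string literals are joined at compile time, so the raw part
#    can be followed by a regular literal holding the final backslash.
with_literal = r"C:\temp" "\\"

# 2. Joining with an empty last component appends the separator.
with_join = ntpath.join(base, "")

print(with_literal)
print(with_join)
```

Either way the raw literal itself never has to end in a backslash.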

Jim Garrison · Mar 12, 2009

Tim said:
SyntaxError: EOL while scanning string literal (<pyshell#45>, line 1)

It seems the parser is interpreting the backslash as an escape
character in a raw string if the backslash is the last character.
Is this expected?

Yep...as documented[1], "even a raw string cannot end in an odd number
of backslashes".

So how do you explain this?
"a\\'b"

The backslash is kept, but it causes the following quote to be escaped.

Albert Hopkins · Mar 12, 2009

Yep...as documented[1], "even a raw string cannot end in an odd number
of backslashes".

So how do you explain this?
"a\\'b"

That doesn't "end in an odd number of backslashes."

Python is __repr__esenting a raw string as a "regular" string.
Literally they are equivalent:
True
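
The equivalence can be checked with a short sketch like the one below, assuming the literal under discussion is a raw string containing a backslash followed by a single quote. The variable names are illustrative only.

```python
raw = r'a\'b'        # raw literal: the backslash is kept and the quote does not end the string
regular = "a\\'b"    # regular literal spelling of the same characters

print(raw == regular)   # same string object value either way
print(len(raw))         # a, backslash, quote, b
print(list(raw))
print(repr(raw))        # repr() always shows the "regular" spelling
```

The doubled backslash appears only in the repr; the string itself holds a single backslash.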

Krishnakant · Mar 12, 2009

Hello all, especially John and Terry.

I finally got my way around odfpy and could manage the spreadsheet to
some extent.

However I now have a small but unexpected problem.

I would be very happy if someone could help me understand why the
text is not getting centered in the spreadsheet I create.

The cell merging is happening but no text centering in those merged
cells.

If anyone is interested I can send my code snippet.
In short, it follows this pseudocode:
create document

create a style to set centered text

create table and add rows to which cells are added.
the cell has a p (paragraph ) element with the style of centered text
applied.

cells are merged

but no centering happens.

Please let me know if anyone wants me to send the code off the list.

Even better, if someone has a code snippet which can just do that.

happy hacking.
Krishnakant.

John Machin · Mar 12, 2009

Hello all, especially John and Terry.

Hello again Krishnakant,

I see that you haven't had the evil spirits exorcised from your mail/
news client ... it's hijacked a thread again :-(

I finally got my way around odfpy and could manage the spreadsheet to
some extent.

However I now have a small but unexpected problem.

I would be very happy if someone could help me understand why the
text is not getting centered in the spreadsheet I create.

The cell merging is happening but no text centering in those merged
cells.

If anyone is interested I can send my code snippet.
In short, it follows this pseudocode:
create document

create a style to set centered text

create table and add rows to which cells are added.
the cell has a p (paragraph ) element with the style of centered text
applied.

cells are merged

but no centering happens.

Please let me know if anyone wants me to send the code off the list.

Even better, if someone has a code snippet which can just do that.

You might like to try:
(a) checking that you can get text centred in an UNmerged cell
(b) using Calc, creating a small ods file with your desired
formatting, then comparing the XML in that ods file with the one your
script has created
(c) contacting the author/maintainer of the odfpy package

Miles · Mar 12, 2009

OK, I'm curious as to the reasoning behind saying that

When an 'r' or 'R' prefix is present, a character following a
backslash is included in the string without change, and all
backslashes are left in the string.

which sounds reasonable, but then saying in effect "Oh wait, let's
introduce a special case and make it impossible to have a literal
backslash as the last character of a string without doubling it".

That's not a special case; that's the *opposite* of a special case.

So you have a construct (r'...') whose sole reason for existence
is to ignore escapes, but it REQUIRES an escape mechanism for one
specific case (which comes up frequently in Windows pathnames).

The backslash still IS an escape character, it just behaves
differently than it does for a non-raw string.

At the very least the "all backslashes are left in the string" quote
from the Lexical Analysis page (rendered in italics no less) needs to
be reworded to include the exception instead of burying this in a
parenthetical side-comment.

There is no exception. All backslashes are left in the string. The
impossibility of ending a raw string in an unescaped backslash is also
rendered in italics.

-Miles

MRAB · Mar 12, 2009

Jim said:
Tim said:

r"a\"
SyntaxError: EOL while scanning string literal (<pyshell#45>, line 1)

It seems the parser is interpreting the backslash as an escape
character in a raw string if the backslash is the last character.
Is this expected?

Yep...as documented[1], "even a raw string cannot end in an odd number
of backslashes".

So how do you explain this?
"a\\'b"

The backslash is kept, but it causes the following quote to be escaped.

(The following examples are from Python 2.x.)

The other special case is with \u in a Unicode string:
ur"\u0041"
u'A'

However, \x isn't special:
u'\\x41'

and \u isn't a recognised escape sequence in a bytestring:
'\\u0041'

MRAB · Mar 13, 2009

andrew said:
MRAB wrote:
[...]

The other special case is with \u in a Unicode string:

ur"\u0041"
u'A'

this isn't true for 3.0:
r"\u0041"
'\\u0041'

(there's no "u" because it's a string, not a bytes literal)

and as far as i can tell, that's correct behaviour according to the docs.

From the 3.0 docs "Even in a raw string, string quotes can be escaped
with a backslash, but the backslash remains in the string". Seems a bit
pointless to me. I would've preferred the backslash to have no special
behaviour at all. Simpler, IMHO...

MRAB · Mar 13, 2009

andrew said:
MRAB said:

andrew said:

MRAB wrote:
[...]
The other special case is with \u in a Unicode string:

ur"\u0041"
u'A'
this isn't true for 3.0:

r"\u0041"
'\\u0041'

(there's no "u" because it's a string, not a bytes literal)

and as far as i can tell, that's correct behaviour according to the
docs.

From the 3.0 docs "Even in a raw string, string quotes can be escaped
with a backslash, but the backslash remains in the string". Seems a bit
pointless to me. I would've preferred the backslash to have no special
behaviour at all. Simpler, IMHO...

not sure what you are implying here. i understood "string quotes" in the
text you quote (which i had read) to mean \" and \', which is the
behaviour the original poster saw (and why you cannot end a string with a
slash).

however, you seem to think "string quotes" are \u escapes?

Huh? I don't know why you think that.

did you see:

As a result, '\U' and '\u' escapes in raw strings are not
treated specially.

a few paragraphs above?

also,

6

My point is this:

In Python 3.x a backslash doesn't have a special meaning in a raw
string, except that it can prevent a following quote from ending the
string, but the backslash is still included. Why? How useful is that? I
think it would've been simpler if a backslash had _no_ special effect,
not even with a following quote. If you want a quote then either use the
other quote character as the delimiter or use a triple-quoted raw
string.
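
The Python 3 behaviour described in this post, and the two alternatives it suggests, can be sketched as below. The sample strings are invented for illustration.

```python
raw = r"\u0041"      # six characters: backslash, u, 0, 0, 4, 1
cooked = "\u0041"    # one character: the letter A

print(len(raw), len(cooked))
print(raw == "\\u0041")

# Two ways to include a quote in a raw string without any backslash:
other_delimiter = r'she said "hi"'
triple_quoted = r"""both ' and " fit here"""
print(other_delimiter)
print(triple_quoted)

# The backslash-quote form keeps each backslash in the value.
escaped_quote = r"she said \"hi\""
extra_chars = len(escaped_quote) - len(other_delimiter)
print(extra_chars)
```

The last line shows the cost of the backslash-quote form: the quotes get through, but so do the backslashes.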

Lie Ryan · Mar 13, 2009

MRAB said:
In Python 3.x a backslash doesn't have a special meaning in a raw
string, except that it can prevent a following quote from ending the
string, but the backslash is still included. Why? How useful is that? I
think it would've been simpler if a backslash had _no_ special effect,
not even with a following quote. If you want a quote then either use the
other quote character as the delimiter or use a triple-quoted raw
string.

I think the reason is because rawstring is originally devised for
regular expressions and in regex it is common to want to allow both
single quote and double quote in the same pattern, like:

<(.*?) (.*?)=("|')(.*?)("|')>
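
As a rough illustration of that use case, the pattern above can be written as a single-quoted raw string by putting a backslash before each single quote. The backslash stays in the pattern, and the regex engine reads a backslash followed by a quote as a literal quote. The sample tags are made up.

```python
import re

# Same pattern as above; the \' pairs survive into the regex unchanged.
pattern = re.compile(r'<(.*?) (.*?)=("|\')(.*?)("|\')>')

double_quoted = pattern.match('<a href="x.html">')
single_quoted = pattern.match("<img src='p.png'>")

print(double_quoted.groups())
print(single_quoted.groups())
```

A triple-quoted raw string would avoid the backslashes altogether, at the price of a slightly noisier literal.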

Krishnakant · Mar 14, 2009

I see that you haven't had the evil spirits exorcised from your mail/
news client ... it's hijacked a thread again :-(

Don't worry, it won't happen this time.

It seems I did something wrong with the settings and it drops out
mails.
Now the problem is sorted out.

You might like to try:
(a) checking that you can get text centred in an UNmerged cell

Tried, it works perfectly. The text gets centered in an unmerged cell.

(b) using Calc, creating a small ods file with your desired
formatting, then comparing the XML in that ods file with the one your
script has created

This is the first thing I tried and even got some insight into the xml.
However when I apply the same elements and attributes to the one I am
creating with odfpy, I get "attribute not allowed " errors.
If someone is interested to look at the code, please let me know, I can
send an attachment off the list so that others are not forced to
download something they are not concerned about.

(c) contacting the author/maintainer of the odfpy package

Done, and I have been awaiting a reply for the last 4 days.

It was then that I came to this thread.

happy hacking.
Krishnakant.

David Bolen · Mar 14, 2009

Krishnakant said:
However when I apply the same elements and attributes to the one I am
creating with odfpy, I get "attribute not allowed " errors.
If someone is interested to look at the code, please let me know, I can
send an attachment off the list so that others are not forced to
download something they are not concerned about.

I just tried this myself and the following creates a 3x3 spreadsheet
with the first row spanning all three columns (no special formatting
like centering or anything), using odfpy 0.8:

import sys

from odf.opendocument import OpenDocumentSpreadsheet
from odf.style import Style, TableColumnProperties
from odf.table import Table, TableRow, TableColumn, \
     TableCell, CoveredTableCell
from odf.text import P

def make_ods():
    ods = OpenDocumentSpreadsheet()

    col = Style(name='col', family='table-column')
    col.addElement(TableColumnProperties(columnwidth='1in'))

    table = Table()
    table.addElement(TableColumn(numbercolumnsrepeated=3, stylename=col))
    ods.spreadsheet.addElement(table)

    # Add first row with cell spanning columns A-C
    tr = TableRow()
    table.addElement(tr)
    tc = TableCell(numbercolumnsspanned=3)
    tc.addElement(P(text="ABC1"))
    tr.addElement(tc)
    # Uncomment this to more accurately match native file
    ##tc = CoveredTableCell(numbercolumnsrepeated=2)
    ##tr.addElement(tc)

    # Add two more rows with non-spanning cells
    for r in (2,3):
        tr = TableRow()
        table.addElement(tr)
        for c in ('A','B','C'):
            tc = TableCell()
            tc.addElement(P(text='%s%d' % (c, r)))
            tr.addElement(tc)

    ods.save("ods-test.ods")

Maybe that will give you a hint as to what is happening in your case.

Note that it appears creating such a spreadsheet directly in Calc also
adds covered table cells for those cells beneath the spanned cell, but
Calc loads a file fine without those and still lets you later split
the merge and edit the underlying cells. So I'm not sure how required
that is as opposed to just how Calc manages its own internal structure.

-- David

Krishnakant · Mar 14, 2009

Hi David,
Based on your code snippet I added a couple of lines to actually
center-align the text in the merged cell in the first row.

Please note in the following code that I have added ParagraphProperties
in the imports and created one style with textalign="center" as an
attribute.

*** code follows ***

import sys

from odf.opendocument import OpenDocumentSpreadsheet
from odf.style import Style, TableColumnProperties, ParagraphProperties
from odf.table import Table, TableRow, TableColumn, TableCell, \
     CoveredTableCell
from odf.text import P

class makeods:
    def make_ods(self):
        ods = OpenDocumentSpreadsheet()

        col = Style(name='col', family='table-column')
        col.addElement(TableColumnProperties(columnwidth='1in'))
        tablecontents = Style(name="Table Contents", family="paragraph")
        tablecontents.addElement(ParagraphProperties(textalign="center"))

        table = Table()
        table.addElement(TableColumn(numbercolumnsrepeated=3,
                                     stylename=col))
        ods.spreadsheet.addElement(table)

        # Add first row with cell spanning columns A-C
        tr = TableRow()
        table.addElement(tr)
        tc = TableCell(numbercolumnsspanned=3)
        tc.addElement(P(stylename=tablecontents, text="ABC1"))
        tr.addElement(tc)
        # Uncomment this to more accurately match native file
        ##tc = CoveredTableCell(numbercolumnsrepeated=2)
        ##tr.addElement(tc)

        # Add two more rows with non-spanning cells
        for r in (2,3):
            tr = TableRow()
            table.addElement(tr)
            for c in ('A','B','C'):
                tc = TableCell()
                tc.addElement(P(text='%s%d' % (c, r)))
                tr.addElement(tc)

        ods.save("ods-test.ods")

m = makeods()
m.make_ods()

Still the text in the cell is not centered.

happy hacking.
Krishnakant.

David Bolen · Mar 14, 2009

Krishnakant said:
Based on your code snippet I added a couple of lines to actually
center-align the text in the merged cell in the first row.

Sorry, guess I should have verified handling all the requirements

I think there's two issues:

* I neglected to add the style I created to the document, so even in my
first example, columns had a default style (not the 1in style I thought
I was creating).

* I don't think you want a paragraph style applied to the paragraph
text within the cell, but to the cell as a whole. I think if you
just try to associate it with the text.P() element the "width" of
the paragraph is probably just the text itself so there's nothing to
center, although that's just a guess.

I've attached an adjusted version that does center the spanned cell
for me. Note that I'll be the first to admit I don't necessarily
understand all the ODF style rules. In particular, I got into a lot
of trouble trying to add my styles to the overall document styles
(e.g., ods.styles) which I think can then be edited afterwards rather
than the automatic styles (ods.automaticstyles).

The former goes into the styles.xml file whereas the latter is included
right in content.xml. For some reason using ods.styles kept causing
OpenOffice to crash trying to load the document, so I finally just went
with the flow and used automaticstyles. It's closer to how OO itself
creates the spreadsheet anyway.

-- David

from odf.opendocument import OpenDocumentSpreadsheet
from odf.style import Style, TableColumnProperties, ParagraphProperties
from odf.table import Table, TableRow, TableColumn, \
     TableCell, CoveredTableCell
from odf.text import P

def make_ods():
    ods = OpenDocumentSpreadsheet()

    col = Style(name='col', family='table-column')
    col.addElement(TableColumnProperties(columnwidth='1in'))

    centered = Style(name='centered', family='table-cell')
    centered.addElement(ParagraphProperties(textalign='center'))

    ods.automaticstyles.addElement(col)
    ods.automaticstyles.addElement(centered)

    table = Table()
    table.addElement(TableColumn(numbercolumnsrepeated=3, stylename=col))
    ods.spreadsheet.addElement(table)

    # Add first row with cell spanning columns A-C
    tr = TableRow()
    table.addElement(tr)
    tc = TableCell(numbercolumnsspanned=3, stylename=centered)
    tc.addElement(P(text="ABC1"))
    tr.addElement(tc)

    # Add two more rows with non-spanning cells
    for r in (2,3):
        tr = TableRow()
        table.addElement(tr)
        for c in ('A','B','C'):
            tc = TableCell()
            tc.addElement(P(text='%s%d' % (c, r)))
            tr.addElement(tc)

    ods.save("ods-test.ods")

if __name__ == "__main__":
    make_ods()

John Machin · Mar 15, 2009

[snip]

Note that it appears creating such a spreadsheet directly in Calc also
adds covered table cells for those cells beneath the spanned cell, but
Calc loads a file fine without those and still lets you later split
the merge and edit the underlying cells. So I'm not sure how required
that is as opposed to just how Calc manages its own internal structure.

Don't feel lonely: OASIS and Calc aren't sure either. Here's what
the ODF 1.1 standard has to say [section 8.1.3]:
"""
Table Cell

The <table:table-cell> and <table:covered-table-cell> elements specify
the content of a table cells. They are contained in table row
elements. A table cell can contain paragraphs and other text content
as well as sub tables. Table cells may be empty.

The <table:table-cell> element is very similar to the table cell
elements of [XSL] and [HTML4], and the rules regarding cells that span
several columns or rows that exist in HTML and XSL apply to the
OpenDocument specification as well. This means that there are no
<table:table-cell> elements in the row/column grid for positions that
are covered by a merged cell, that is, that are covered by a cell that
spans several columns or rows. The <table:covered-table-cell> element
exists to be able to specify cells for such positions . It has to
appear wherever a position in the row/column grid is covered by a cell
that spans several rows or columns. Its position in the grid is
calculated by a assuming a column and row span of 1 for all cells
regardless whether they are specified by a <table:table-cell> or a
<table:covered-table-cell> element. The <table:covered-table-cell> is
especially used by spreadsheet applications, where it is a common use
case that a covered cell contains content.
"""

So I was under the impression that only a covered-table-cell could
appear under the cover, especially if the cell were to carry
meaningful content. Not so, evidently, ...

I have adapted your second script as follows, to specify:
row 1: 3-col 2-row spanning cell, then 1 ordinary (not "covered") cell
containing text "X"
row 2: 3 ordinary cells "P2" "Q2" "R2"
row 3: 3 ordinary cells "P3" "Q3" "R3"

odfpy does not automagically change the ordinary cells to covered.

According to the standard, the X cell and the P2-R2 cells are illegal;
they should be covered-table-cell elements.
However Calc neither complains nor bumps the cells out of the arena
(e.g. bumping the X cell from B1 to D1). X and the P2-R2 cells are
hidden ("covered"). If you unmerge the 3x2 A1, Calc displays X in B1,
P2 in A2, etc.

8<--- changed piece of script
    # Add first row with first cell spanning A1:C2
    tr = TableRow()
    table.addElement(tr)
    tc = TableCell(numbercolumnsspanned=3, numberrowsspanned=2,
                   stylename=centered)
    tc.addElement(P(text="ABC12"))
    tr.addElement(tc)

    # first row, second cell
    tc = TableCell()
    tc.addElement(P(text="X"))
    tr.addElement(tc)

    # Add two more rows with non-spanning non-covered cells
    for r in (2,3):
        tr = TableRow()
        table.addElement(tr)
        for c in "PQR":
            tc = TableCell()
            tc.addElement(P(text='%s%d' % (c, r)))
            tr.addElement(tc)
8<---

In practice, it seems that all cells in the spanned range must be
filled in somehow -- the minimum safest (in the sense of both
complying with the standard and working with Calc) filler being a
covered-table-cell with no attributes other than number-columns-
repeated.

Cheers,
John
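
To make the "filler" idea concrete, here is a rough sketch of the XML such a row would contain, built with the standard library instead of odfpy so that it runs without extra packages. The helper name merged_row is hypothetical, and the sketch only covers a span within a single row; a span over several rows would need covered cells in each affected row as well.

```python
import xml.etree.ElementTree as ET

# ODF namespaces for table and text elements.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ET.register_namespace("table", TABLE_NS)
ET.register_namespace("text", TEXT_NS)

def merged_row(text, columns_spanned):
    """Hypothetical helper: a row whose first cell spans `columns_spanned` columns."""
    row = ET.Element(f"{{{TABLE_NS}}}table-row")
    cell = ET.SubElement(
        row, f"{{{TABLE_NS}}}table-cell",
        {f"{{{TABLE_NS}}}number-columns-spanned": str(columns_spanned)})
    ET.SubElement(cell, f"{{{TEXT_NS}}}p").text = text
    if columns_spanned > 1:
        # One covered-table-cell element stands in for all hidden positions.
        ET.SubElement(
            row, f"{{{TABLE_NS}}}covered-table-cell",
            {f"{{{TABLE_NS}}}number-columns-repeated": str(columns_spanned - 1)})
    return row

row = merged_row("ABC1", 3)
print(ET.tostring(row, encoding="unicode"))
```

This mirrors the commented-out CoveredTableCell(numbercolumnsrepeated=2) lines in David's first script: a cell spanning three columns is followed by a filler covering the remaining two positions.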

Krishnakant · Mar 17, 2009

Hi David and John.
Thanks a lot, the problem is solved.

David, your idea was the key to solving the problem.
Actually John, in his code and explanation, made it clear that the
wrong attributes were being used on the wrong elements.

David's code confirmed the fact. The centered style in David's code
solved the problem for me: we don't need paragraph alignment styles;
instead, the merged cells need center alignment for the text they
contain.

This implied that the centering ought to happen in the cells when they
are created. So that style was applied to the cells in David's code.

The other mistake was that after merging 4 cells I had to actually add 3
cells to be used by the merged (spanned) cells. After those cells I
could start my next set of merged cells.
Initially my mistake was adding a cell which spanned 4 cells and then
trying to add another cell right after the code that created that
merged set of cells.

But later on I realised that once cells are merged we have to physically
add the blank cells to fill up the merged space, else the next few cells
won't show up with the other text.

Now the code works fine, and if anyone ever needs it, please do
write to me.

Thanks to all again, especially David and John.

happy hacking.
Krishnakant.
