Quickie: converting r"\x2019" to int

Robin Haswell

Hey guys. This should just be a quickie: I can't figure out how to convert
r"\x2019" to an int - could someone give me a hand please?

Cheers

-Rob
 
Kent Johnson

Robin said:
Hey guys. This should just be a quickie: I can't figure out how to convert
r"\x2019" to an int - could someone give me a hand please?

Is this what you mean?
In [9]: int(r'\x2019'[2:], 16)
Out[9]: 8217

or maybe you meant this:
In [6]: ord(u'\u2019')
Out[6]: 8217

Kent
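For anyone reading along on Python 3, here is a minimal sketch of the two readings Kent suggests. The thread itself is on Python 2.4; the variable names below are only illustrative.

```python
# Python 3 sketch of the two readings; variable names are illustrative.
raw = r"\x2019"  # six literal characters: backslash, x, 2, 0, 1, 9

# Reading 1: drop the leading "\x" and parse the rest as hexadecimal text.
from_hex_text = int(raw[2:], 16)

# Reading 2: take the code point of the character U+2019 directly.
# (Python 3 strings are Unicode, so no u'' prefix is needed.)
from_code_point = ord("\u2019")

print(from_hex_text, from_code_point)
```

Both readings arrive at the same number because 0x2019 is 8217 in decimal.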
 
Just

Kent Johnson said:
Robin said:
Hey guys. This should just be a quickie: I can't figure out how to convert
r"\x2019" to an int - could someone give me a hand please?

Is this what you mean?
In [9]: int(r'\x2019'[2:], 16)
Out[9]: 8217

or maybe you meant this:
In [6]: ord(u'\u2019')
Out[6]: 8217

Or even:
>>> import struct
>>> struct.unpack("q", "\0\0"+ r'\x2019')[0]
101671307850041L
>>>

Who knows :)

Just
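A rough Python 3 sketch of what the struct joke does, assuming the bytes are the two NULs plus the six literal characters of the raw string. The plain "q" format uses the machine's native byte order, so the result depends on the platform; the value posted above is what a big-endian read gives.

```python
import struct

# Two NUL bytes plus the six literal characters of r'\x2019' make 8 bytes,
# which is exactly the size of a "q" (signed 64-bit integer).
data = b"\0\0" + rb"\x2019"

# Spell out the byte order explicitly instead of relying on the native one.
big = struct.unpack(">q", data)[0]     # big-endian read
little = struct.unpack("<q", data)[0]  # little-endian read

print(big)
print(little)
```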
 
Fredrik Lundh

Just said:
Robin said:
Hey guys. This should just be a quickie: I can't figure out how to convert
r"\x2019" to an int - could someone give me a hand please?

Is this what you mean?
In [9]: int(r'\x2019'[2:], 16)
Out[9]: 8217

or maybe you meant this:
In [6]: ord(u'\u2019')
Out[6]: 8217

Or even:
import struct
struct.unpack("q", "\0\0"+ r'\x2019')[0]
101671307850041L

Who knows :)

I think we can be pretty sure that he didn't mean
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: not a numeric character

though...
'RIGHT SINGLE QUOTATION MARK'

</F>
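The interpreter input for the two results above is not shown in this copy of the thread. As a guess only, the standard unicodedata module produces both: unicodedata.numeric() raises ValueError for a character with no numeric value, and unicodedata.name() returns the character's name. A Python 3 sketch under that assumption:

```python
import unicodedata

ch = "\u2019"

# name() returns the official Unicode name of the character.
name = unicodedata.name(ch)

# numeric() raises ValueError when the character has no numeric value
# and no default is supplied.
try:
    unicodedata.numeric(ch)
    numeric_error = None
except ValueError as exc:
    numeric_error = str(exc)

print(name)
print(numeric_error)
```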
 
Robin Haswell

Kent Johnson said:
Robin Haswell wrote:
Is this what you mean?
In [9]: int(r'\x2019'[2:], 16)
Out[9]: 8217

or maybe you meant this:
In [6]: ord(u'\u2019')
Out[6]: 8217

Or even:
import struct
struct.unpack("q", "\0\0"+ r'\x2019')[0]
101671307850041L


rob@aranea:~$ python
Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Something like that. Except with:
Traceback (most recent call last):


:)

-Rob
 
Frank Millman

Robin said:
rob@aranea:~$ python
Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Something like that. Except with:
Traceback (most recent call last):


:)

-Rob

I decided to use this as a learning exercise for myself. This is what I
figured out. All quotes are paraphrased from the docs.

"\xhh in a string substitutes the character with the hex value hh.
Unlike in Standard C, at most two hex digits are accepted."

\x20 is equal to a space. Therefore '\x2019' is equal to ' 19'.

"int(x) converts a string or number to a plain integer. If the argument
is a string, it must contain a possibly signed decimal number
representable as a Python integer, possibly embedded in whitespace."

Therefore int(' 19') is equal to 19.

"When an 'r' prefix is present, a character following a backslash is
included in the string without change, and all backslashes are left in
the string".

Therefore r'\x2019' is left unchanged, and cannot be converted to an
int.

Rob, this explains *why* you are getting the above error. It does not
explain how to achieve your objective, as you have not specified what
it is. If you give more information, one of the resident gurus may be
able to assist you.

Frank Millman
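Frank's three points can be checked directly. This is a small Python 3 sketch (the same escape and int() rules apply there); the variable names are made up for illustration.

```python
# Without the r prefix, \x consumes exactly two hex digits: \x20 is a space,
# and the trailing "19" stays as ordinary characters.
escaped = "\x2019"
print(repr(escaped))   # a space followed by "19"

# int() tolerates surrounding whitespace, so the string parses as 19.
print(int(escaped))

# With the r prefix the backslash and "x" are kept literally,
# so int() has nothing it can parse.
raw = r"\x2019"
try:
    int(raw)
    raw_error = False
except ValueError:
    raw_error = True
print(len(raw), raw_error)
```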
 
Serge Orlov

Robin said:
Hey guys. This should just be a quickie: I can't figure out how to convert
r"\x2019" to an int - could someone give me a hand please?
19
 
Robin Haswell

Therefore r'\x2019' is left unchanged, and cannot be converted to an
int.

Rob, this explains *why* you are getting the above error. It does not
explain how to achieve your objective, as you have not specified what
it is. If you give more information, one of the resident gurus may be
able to assist you.

Thanks, I think that helps.

Basically I'm decoding HTML character references. "&#x2019" is a character
reference, equal to a single quote (ish).
http://ganesh.bronco.co.uk/example.html is the character in action. I want
to get from the string "x2019" to the Unicode character ’.

However, your help has led me to a solution!

That's got it - thanks :)

-Rob
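Rob's actual solution is not shown in this copy of the thread, so the following is only one possible way to get from the string "x2019" to the character, written for Python 3. The helper name decode_char_ref is hypothetical; html.unescape is the standard-library route for whole strings.

```python
import html


def decode_char_ref(body):
    """Turn the body of a numeric character reference into one character.

    `body` is the text between "&#" and ";", e.g. "x2019" or "8217".
    (Hypothetical helper, for illustration only.)
    """
    if body[:1] in ("x", "X"):
        # Hexadecimal form: strip the "x" and parse base 16.
        return chr(int(body[1:], 16))
    # Decimal form.
    return chr(int(body, 10))


print(decode_char_ref("x2019"))

# For real HTML text, the standard library already handles references.
print(html.unescape("It&#x2019;s"))
```

On Python 2, as used in the thread, unichr() plays the role that chr() plays here.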
 
Dennis Lee Bieber

Basically I'm decoding HTML character references. "&#x2019" is a character
reference, equal to a single quote (ish).

Right Single Quote, I believe...

Did you look at the contents of htmlentitydefs ?

>>> import htmlentitydefs
>>> htmlentitydefs.codepoint2name[int("2019",16)]
'rsquo'
>>> htmlentitydefs.entitydefs[htmlentitydefs.codepoint2name[int("2019",16)]]
'&#8217;'
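In Python 3 the htmlentitydefs module was renamed html.entities. A minimal sketch of the same lookup there, using only the codepoint2name and name2codepoint tables (variable names are illustrative):

```python
from html.entities import codepoint2name, name2codepoint

code_point = int("2019", 16)

# Map the code point to its HTML entity name.
name = codepoint2name[code_point]

# Map the name back to a code point and turn it into a character.
round_trip = chr(name2codepoint[name])

print(name)
print(round_trip)
```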