Odd json encoding error

Discussion in 'Python' started by Wells, Dec 15, 2009.

  1. Wells

    Wells Guest

    Wells, Dec 15, 2009
    #1

  2. Wells

    Chris Rebert Guest

    On Tue, Dec 15, 2009 at 2:03 PM, Wells <> wrote:
    > I get this exception when decoding a certain JSON string:
    >
    > 'ascii' codec can't encode character u'\u2019' in position 8: ordinal
    > not in range(128)
    >
    > The JSON data in question:
    >
    > http://mlb.com/lookup/json/named.player_info.bam?sport_code='mlb'&player_id='489002'
    >
    > It's in the 'high_school' key. Is there some string function I can run
    > on the information before I decode it to avoid this?


    From what I can guess (you didn't include any code), you're printing
    the result of loading the JSON (which probably loaded correctly) to
    the terminal without specifying the exact encoding to use. In such
    cases, Python defaults to ASCII. However, your data obviously includes
    non-ASCII characters, thus resulting in the error you're encountering.
    Instead of `print the_high_school`, try `print
    the_high_school.encode('utf8')`.
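
    A minimal sketch of that difference, assuming the loaded value is the
    school name from the question (the variable name `the_high_school` is
    only a placeholder): encoding with the ASCII codec raises the error,
    while an explicit UTF-8 encoding yields bytes that can be written out.
    It is written so that it should run on Python 2.6+ as well as 3.3+.

```python
# Placeholder for the value loaded from the 'high_school' key.
the_high_school = u'St. Paul\u2019s School For Boys (MN) HS'

# The ASCII codec cannot represent U+2019 (right single quotation mark),
# so this raises the same UnicodeEncodeError reported in the question.
try:
    the_high_school.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Naming the encoding explicitly succeeds and returns a byte string.
encoded = the_high_school.encode('utf8')

print(ascii_failed)
print(repr(encoded))
```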

    Note that the `json` library returns Unicode strings of the type
    `unicode` and not byte strings of type `str` (unless you're using
    Python 3.0, in which case `unicode` got renamed to `str` and `str` got
    renamed to `bytes`). When outputting Unicode, it needs to be encoded
    to bytes. The built-in type() function* can help determine when you
    have Unicode data.
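
    As a rough illustration of the type check, the sketch below loads a
    hypothetical one-field document (not the full record) and inspects
    the types before and after encoding. `text_type` is a helper name
    introduced here so the same code runs on Python 2.6+ and 3.3+.

```python
import json

# Hypothetical one-field document; the real record has many more keys.
doc = json.loads('{"high_school": "St. Paul\\u2019s School For Boys (MN) HS"}')
value = doc['high_school']

# type(u'') is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u'')
is_text = isinstance(value, text_type)

# Encoding turns the text into a byte string
# (`str` on Python 2, `bytes` on Python 3).
encoded = value.encode('utf8')
is_bytes = isinstance(encoded, bytes)

print(type(value).__name__)
print(type(encoded).__name__)
```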

    Cheers,
    Chris
    --
    http://blog.rebertia.com

    *Yes, it's not /really truly/ a function, but the distinction is not
    relevant here.
     
    Chris Rebert, Dec 15, 2009
    #2

  3. On Dec 15, 3:03 pm, Wells <> wrote:
    > I get this exception when decoding a certain JSON string:
    >
    > 'ascii' codec can't encode character u'\u2019' in position 8: ordinal
    > not in range(128)
    >
    > The JSON data in question:
    >
    > http://mlb.com/lookup/json/named.player_info.bam?sport_code='mlb'....
    >
    > It's in the 'high_school' key. Is there some string function I can run
    > on the information before I decode it to avoid this?


    In my test using this same data, I did not get such an error. Here's
    my code:

    import json

    # Adjacent string literals are concatenated, so the JSON text can span
    # several source lines without embedding newlines in the string.
    data = (
        '{"player_info": {"queryResults": { "row": { "active_sw": "Y", '
        '"bats": "R", "birth_city": "Baltimore", "birth_country": "USA", '
        '"birth_date": "1987-08-31T00:00:00", "birth_state": "MD", "college": '
        '"", "death_city": "", "death_country": "", "death_date": "", '
        '"death_state": "", "end_date": "", "file_code": "sf", "gender": "M", '
        '"height_feet": "6", "height_inches": "1", "high_school": "St. Paul'
        '\u2019s School For Boys (MN) HS", "jersey_number": "", '
        '"name_display_first_last": "Steve Johnson", '
        '"name_display_first_last_html": "Steve Johnson", '
        '"name_display_last_first": "Johnson, Steve", '
        '"name_display_last_first_html": "Johnson, Steve", '
        '"name_display_roster": "Johnson, S", "name_display_roster_html": '
        '"Johnson, S", "name_first": "Steven", "name_full": "Johnson, Steve", '
        '"name_last": "Johnson", "name_matrilineal": "", "name_middle": '
        '"David", "name_nick": "", "name_prefix": "", "name_title": "", '
        '"name_use": "Steve", "player_id": "489002", "primary_position": "1", '
        '"primary_position_txt": "P", "primary_sport_code": "", '
        '"pro_debut_date": "", "start_date": "2009-12-10T00:00:00", "status": '
        '"Active", "status_code": "A", "status_date": "2009-12-10T00:00:00", '
        '"team_abbrev": "SF", "team_code": "sfn", "team_id": "137", '
        '"team_name": "San Francisco Giants", "throws": "R", "weight": "200" }, '
        '"totalSize": "1" }}}'
    )

    # Decode the JSON text into Python objects and print the resulting dict.
    print json.loads(data)

    (I'm running 2.6.4 on Mac OS X)
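
    One reading of why this test shows no error is that decoding is not
    where the failure happens: `json.loads` turns the `\u2019` escape into
    a single Unicode character, and nothing has to be encoded to ASCII
    until the value itself is converted or written out. The sketch below
    uses a trimmed-down copy of the record (two fields only, chosen here
    for illustration) and just inspects the decoded value.

```python
import json

# Trimmed-down copy of the record; only two fields are kept.
data = ('{"player_info": {"queryResults": {"row": {'
        '"high_school": "St. Paul\\u2019s School For Boys (MN) HS", '
        '"player_id": "489002"}, "totalSize": "1"}}}')

# Walk the nested dicts down to the single result row.
row = json.loads(data)['player_info']['queryResults']['row']
high_school = row['high_school']

# The six-character escape \u2019 has become one character at index 8,
# the position named in the error message.
code_point = ord(high_school[8])
print(hex(code_point))  # 0x2019
```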

    Intchanter
    Daniel Fackrell
     
    Intchanter / Daniel Fackrell, Dec 15, 2009
    #3
  4. Wells

    Wells Guest

    Sorry- more detail- the actual problem is an exception thrown when
    running str() on the value, like so:

    >>> a = u'St. Paul\u2019s School For Boys (MN) HS'
    >>> print str(a)

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in
    position 8: ordinal not in range(128)

    Is there some way to run str() against a unicode object?
     
    Wells, Dec 15, 2009
    #4
  5. Wells

    Chris Rebert Guest

    On Tue, Dec 15, 2009 at 3:04 PM, Wells <> wrote:
    > Sorry- more detail- the actual problem is an exception thrown when
    > running str() on the value, like so:
    >
    > >>> a = u'St. Paul\u2019s School For Boys (MN) HS'
    > >>> print str(a)
    >
    > Traceback (most recent call last):
    >  File "<stdin>", line 1, in <module>
    > UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in
    > position 8: ordinal not in range(128)
    >
    > Is there some way to run str() against a unicode object?


    To repeat what I said earlier, you use the .encode() method instead:

    print a.encode('utf8')
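
    On Python 2, `str(a)` implicitly encodes with the default ASCII codec,
    which is why it fails here. If the output has to stay pure ASCII (an
    assumption; nothing in this thread says it must), `.encode()` also
    accepts an error handler as its second argument. This is an
    alternative to the UTF-8 call above, not something proposed in the
    thread; the variable names other than `a` are made up for this sketch.

```python
a = u'St. Paul\u2019s School For Boys (MN) HS'

# Lossless: every character is kept, encoded as UTF-8 bytes.
utf8_bytes = a.encode('utf8')

# Lossy ASCII fallbacks using the standard codec error handlers.
replaced = a.encode('ascii', 'replace')             # U+2019 becomes '?'
dropped = a.encode('ascii', 'ignore')               # U+2019 is removed
as_entity = a.encode('ascii', 'xmlcharrefreplace')  # U+2019 becomes '&#8217;'

for label, value in [('utf8', utf8_bytes), ('replace', replaced),
                     ('ignore', dropped), ('xmlcharrefreplace', as_entity)]:
    print('%s: %r' % (label, value))
```

    Only the UTF-8 form can be decoded back to the original text; the
    other three trade fidelity for ASCII-only output.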

    Might I recommend reading:
    http://www.joelonsoftware.com/articles/Unicode.html

    Regards,
    Chris
    --
    http://blog.rebertia.com
     
    Chris Rebert, Dec 15, 2009
    #5