EBCDIC <--> ASCII

martinjamesevans · Dec 4, 2008

I'm having a problem trying to use the codecs package to aid me in
converting some bytes from EBCDIC into ASCII.

I have some 8bit text that is in mixed format. I extract the bytes
that are coded for EBCDIC and would like to display them correctly.
The bytes that are EBCDIC could take any value from 0-255, but I'm only
really interested in the printable portions and could, say, leave the
rest as dots.

I've tried starting with something like this, but I assume it is
expecting the source to be in unicode already?

e.g. (pretend the second half are EBCDIC characters)

sAll = "This bit is ASCII, <this bit ebcdic>"
sSource = sAll[19:]

sEBCDIC = unicode(sSource, 'cp500', 'ignore')
sASCII = sEBCDIC.encode('ascii')

Obviously I could just knock up a 256-entry lookup table and do it
myself, I was just trying to be a little more Pythonic and use the
built-in table.

Thanks,

Martin
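
For reference, the lookup table mentioned above does not have to be typed in by hand. The following is only a sketch, written for Python 3 (the thread itself uses Python 2) and assuming cp500 is the right variant. The names EBCDIC_TO_ASCII and ebcdic_to_printable_ascii are made up for illustration. It derives the table from the built-in codec and maps anything that is not printable ASCII to a dot.

```python
# Python 3 sketch; assumes cp500 is the EBCDIC variant in use.
# Decode every possible byte value once with the built-in codec.
_decoded = bytes(range(256)).decode("cp500")

# Keep printable ASCII (space through tilde); everything else becomes a dot.
EBCDIC_TO_ASCII = bytes(
    ord(ch) if " " <= ch <= "~" else ord(".")
    for ch in _decoded
)

def ebcdic_to_printable_ascii(data: bytes) -> bytes:
    """Translate cp500 bytes to ASCII bytes, using dots for the rest."""
    return data.translate(EBCDIC_TO_ASCII)

sample = b"\x81\x82\x83\xf1\xf2\xf3\x00\xff"
result = ebcdic_to_printable_ascii(sample)
print(result)
```

Because bytes.translate works on the byte string directly, no intermediate Unicode object is visible to the caller, which may suit the "byte string in, byte string out" requirement.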

Ulrich Eckhardt · Dec 4, 2008

I've tried starting with something like this, but I assume it is
expecting the source to be in unicode already?

e.g. (pretend the second half are EBCDIC characters)

sAll = "This bit is ASCII, <this bit ebcdic>"

Why pretend? You can use this:

"abcde\x81\x82\x83\x84"

sSource = sAll[19:]

sEBCDIC = unicode(sSource, 'cp500', 'ignore')

If you mean this sSource, then no. sSource is treated as byte string here
which is converted to Unicode using 'cp500' as encoding. Note that in
interactive mode, 'print x' will actually convert the string according to
stdout's current encoding (typically ASCII or - I think - Latin 1).

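s1 = u'abcde'                # Unicode text
s2 = s1.encode('cp500')      # byte string holding the EBCDIC (cp500) form
s3 = s1.encode('ascii')      # byte string holding the ASCII form
s4 = unicode(s2, 'cp500')    # EBCDIC bytes back to Unicode
s5 = unicode(s3, 'ascii')    # ASCII bytes back to Unicode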
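
As a rough Python 3 equivalent of the above (an assumption on my part, since the thread is about Python 2.5), the mixed test string can be split and each half decoded with its own codec. The split position of 5 is simply where the ASCII part of this sample ends.

```python
# Python 3 sketch: bytes literals stand in for Python 2 byte strings.
mixed = b"abcde\x81\x82\x83\x84"   # first 5 bytes ASCII, last 4 EBCDIC

ascii_part = mixed[:5].decode("ascii")    # bytes -> str via ASCII
ebcdic_part = mixed[5:].decode("cp500")   # bytes -> str via EBCDIC cp500

combined = ascii_part + ebcdic_part
print(combined)

# Round trip: str -> EBCDIC bytes -> str
round_trip = "abcde".encode("cp500").decode("cp500")
```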

Uli

Michael Ströder · Dec 4, 2008

I'm having a problem trying to use the codecs package to aid me in
converting some bytes from EBCDIC into ASCII.

Which EBCDIC variant?

sEBCDIC = unicode(sSource, 'cp500', 'ignore')

Are you sure CP500 is the EBCDIC variant for the language you want?

http://www.ietf.org/rfc/rfc1345.txt lists it as:

&charset IBM500
&rem source: IBM NLS RM Vol2 SE09-8002-01, March 1990
&alias CP500
&alias ebcdic-cp-be
&alias ebcdic-cp-ch
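
To see why the variant matters, here is a small Python 3 sketch (not from the thread) that resolves the aliases listed above through the codecs module and then lists the byte values where cp500 and another built-in EBCDIC code page, cp037, disagree. The choice of cp037 for comparison is mine, purely as an example.

```python
import codecs

# Aliases from the RFC 1345 entry, looked up in Python's codec registry.
names = ["cp500", "IBM500", "ebcdic-cp-be", "ebcdic-cp-ch"]
resolved = {name: codecs.lookup(name).name for name in names}

# Byte values that decode to different characters in cp500 and cp037.
all_bytes = bytes(range(256))
differing = [
    b for b in range(256)
    if all_bytes[b:b + 1].decode("cp500") != all_bytes[b:b + 1].decode("cp037")
]

print(resolved)
print([hex(b) for b in differing])
```

Letters and digits sit at the same positions in both code pages, so a wrong guess tends to show up only in punctuation.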

Obviously I could just knock up a 256-entry lookup table and do it
myself, I was just trying to be a little more Pythonic and use the
built-in table.

It's pythonic to implement a Unicode codec for unknown character tables.
I've put these two on my web site:

http://www.stroeder.com/pylib/encodings/ebcdicatde.py
http://www.stroeder.com/pylib/encodings/cp273.py (needs ebcdicatde)
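
For readers who cannot fetch those modules, the general shape of a table-driven codec can be sketched with the standard codecs module. This is not the content of ebcdicatde.py. It is a Python 3 illustration in which the codec name demo_ebcdic is hypothetical and the table is simply borrowed from cp500 as a stand-in for a real variant's table.

```python
import codecs

# Stand-in table: one character per byte value (256 in total).
# Borrowed from cp500 for illustration; a real variant supplies its own.
DECODING_TABLE = bytes(range(256)).decode("cp500")
ENCODING_TABLE = codecs.charmap_build(DECODING_TABLE)

def _decode(data, errors="strict"):
    # Returns a (str, bytes_consumed) tuple, as the codec API expects.
    return codecs.charmap_decode(data, errors, DECODING_TABLE)

def _encode(text, errors="strict"):
    # Returns a (bytes, chars_consumed) tuple.
    return codecs.charmap_encode(text, errors, ENCODING_TABLE)

def _search(name):
    # Called by the codec registry with the normalized encoding name.
    if name == "demo_ebcdic":
        return codecs.CodecInfo(encode=_encode, decode=_decode,
                                name="demo_ebcdic")
    return None

codecs.register(_search)

decoded = b"\x81\x82\x83\xf1\xf2\xf3".decode("demo_ebcdic")
encoded = "abc123".encode("demo_ebcdic")
print(decoded, encoded)
```

Once registered, the codec is used like any built-in one, which is what makes this approach fit in with the rest of the string API.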

Ciao, Michael.

Michael Ströder · Dec 5, 2008

Thanks for the tables, ebcdicatde.py does look more suitable.

My problem appears to be that my source is a byte string. In a
nutshell I need "\x81\x82\x83\xf1\xf2\xf3" to become "abc123" in a
byte string.

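Python 2.5.2 (r252:60911, Aug 1 2008, 00:43:38)
>>> import ebcdicatde
>>> "\x81\x82\x83\xf1\xf2\xf3".decode('ebcdic-at-de').encode('ascii')
'abc123'
>>>
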
Ciao, Michael.

martinjamesevans · Dec 8, 2008

Python 2.5.2 (r252:60911, Aug 1 2008, 00:43:38)
>>> import ebcdicatde
>>> "\x81\x82\x83\xf1\xf2\xf3".decode('ebcdic-at-de').encode('ascii')
'abc123'
>>>

Ciao, Michael.

Many thanks for all your posts!
Just what I needed.
