uuDecode problem

py

Hi,
I am encoding a string such as...

Code:
data = someFile.readlines()
encoded = []
for line in data:
    encoded.append(binascii.b2a_uu(stringToEncode))
return encoded

....I then try to decode this by...

Code:
def decode(data):
    result = []
    for val in data:
        result.append(binascii.a2b_uu(val))
    return result

this seems to work sometimes....for example a list which has a short
string in it like ["this is a test"]

however if the list of data going into the decode function contains a
bunch of elements I get the following error...

result.append(binascii.a2b_uu(val))
binascii.Error: Trailing garbage

...any idea why this is happening? Anyone successfully use the uu to
encode/decode strings of varying length (even larger strings, more than
a few hundred characters)?
 
Alex Martelli

py said:
encoded.append(binascii.b2a_uu(stringToEncode))

binascii.b2a_uu only works for up to 45 bytes at once; but if you were
feeding it more than 45 bytes, this should raise a binascii.Error
itself.
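
The limit is easy to check. A small sketch, assuming Python 3 (where b2a_uu takes bytes rather than the str used in this thread); the variable names are only illustrative:

```python
import binascii

# 45 bytes is the most that b2a_uu accepts in a single call.
line = binascii.b2a_uu(b"x" * 45)
round_trip = binascii.a2b_uu(line)

# One byte more is rejected at encoding time, not at decoding time.
try:
    binascii.b2a_uu(b"x" * 46)
    too_long_rejected = False
except binascii.Error:
    too_long_rejected = True
```

So an oversized input should fail in the encoding step, which is why a "Trailing garbage" error from a2b_uu points at how the encoded data is being split up before decoding.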
..any idea why this is happening? Anyone successfully use the uu to
encode/decode strings of varying length (even larger strings, more than
a few hundred characters)?

Definitely not, given the above limit. But I still don't quite
understand the exact mechanics of the error you're getting.


Alex
 
py

Alex said:
binascii.b2a_uu only works for up to 45 bytes at once; but if you were
feeding it more than 45 bytes, this should raise a binascii.Error
itself.
Definitely not, given the above limit. But I still don't quite
understand the exact mechanics of the error you're getting.


Alex

here is an example.

import binascii

def doSomething():
    data = aFile.readlines()
    result = []
    for x in data:
        result.append(encode(x))
    return result

def printResult(encodedData):
    """encodedData is a list of strings which are uu encoded"""
    print decode(encodedData)

def encode(data):
    """data is a string"""
    if len(data) > 45:
        tmp = []
        for c in data:
            tmp.append(binascii.b2a_uu(c))
        return ''.join(tmp)
    else:
        return binascii.b2a_uu(data)


def decode(data):
    """data is a list of strings"""
    result = []
    for val in data:
        if len(val) > 45:
            response = []
            for x in val:
                response.append(binascii.a2b_uu(x))
            result.append(response)
        else:
            result.append(binascii.a2b_uu(val))
    return ''.join(result)

....i would use those functions like

data = doSomething()
printResult(data)

Now i get this...
" response.append(binascii.a2b_uu(x))
java.lang.StringIndexOutOfBoundsException:
java.lang.StringIndexOutOfBoundsException: String index out of range: 1"

So the error is in the decode method .....this is in Jython...perhaps
Jython doesn't handle binascii.a2b_uu ? or perhaps since the actual
data is being encoded in python, then read in and decoded in my jython
script..that could be the problem?

thanks.
 
jepler

Note that you can use the 'uu' encoding, which will handle
arbitrary-length input and give multi-line uuencoded output, including
the 'begin' and 'end' lines:
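
A minimal sketch of that, assuming Python 3, where the codec is reached through codecs.encode and works on bytes (in the Python 2 of this thread the spelling would be some_string.encode("uu")). The sample data is made up:

```python
import codecs

# Well over 45 bytes, so the codec has to split it into several lines.
data = b"some text that is longer than forty-five bytes, " * 5

# The "uu" codec is bytes-to-bytes and does the chunking itself.
encoded = codecs.encode(data, "uu")

# Decoding takes the whole block, 'begin' and 'end' lines included.
decoded = codecs.decode(encoded, "uu")
```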

Otherwise, you should use something like
encoded = [binascii.b2a_uu(chunk)
           for chunk in iter(lambda: someFile.read(45), "")]
to send at most 45 bytes to each call to b2a_uu.

Jeff

 
Alex Martelli

py said:
"""data is a string"""
if len(data) > 45:
    tmp = []
    for c in data:
        tmp.append(binascii.b2a_uu(c))

You can't decode b2a-encoded data character by character, blindly, as
you're trying to do here. Each character in the source string can be
encoded into multiple characters in the target string, and the slicing,
if slicing is needed, must be appropriate. I suggest a redesign...!
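
To make the point concrete, here is a small sketch, assuming Python 3 and CPython's binascii (the thread itself is about Python 2 and Jython, which reported the problem differently). It shows that an encoded line is longer than its input and has to be decoded as a whole:

```python
import binascii

# Three input bytes become one line: a length character,
# four data characters and a trailing newline.
encoded = binascii.b2a_uu(b"abc")
encoded_length = len(encoded)

# Decoding the complete line gives the input back.
whole = binascii.a2b_uu(encoded)

# Decoding the line one character at a time does not: each call sees
# a lone character and treats it as the start of a new encoded line.
per_char = []
for i in range(len(encoded)):
    try:
        per_char.append(binascii.a2b_uu(encoded[i:i + 1]))
    except binascii.Error:
        per_char.append(b"")
reassembled = b"".join(per_char)
```

Whatever the individual calls return or raise on a given implementation, the pieces do not reassemble into the original data.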


Alex
 
py

Alex Martelli wrote:
I suggest a redesign...!


What would you suggest? I have to encode/decode in chunks b/c of the
45 byte limitation.

Thanks.
 
Fredrik Lundh

py said:
What would you suggest? I have to encode/decode in chunks b/c of the
45 byte limitation.

so use 45-byte chunks, instead of one-byte chunks.

but why are you using UU encoding in a nonstandard way ? why not just
use the "uu" module to do the chunking for you? the third example on this
page might be helpful:

http://effbot.org/librarybook/uu.htm

(if you don't want the standard begin/end lines, it's probably a better idea
to use base64 encoding instead...)
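
For comparison, a minimal base64 round trip, assuming Python 3 and made-up sample data. There is no per-call length limit and no begin/end framing:

```python
import base64

# Several hundred bytes in one call.
data = b"no 45-byte limit here. " * 40

encoded = base64.b64encode(data)   # a single line of ASCII bytes
decoded = base64.b64decode(encoded)
```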

</F>
 
Alex Martelli

py said:
Alex Martelli wrote:
I suggest a redesign...!


What would you suggest? I have to encode/decode in chunks b/c of the
45 byte limitation.

Not quite:

I.e., you can pass to a2b_uu ANY string (and ONLY such a string, not,
e.g., a slice or single char of it, as you're trying to do) that's a
result of a b2a_uu call; the length limitation applies only the other
way.
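
A sketch of what that means in practice, assuming Python 3. The helper names uu_encode_lines and uu_decode_lines are hypothetical, not part of binascii: encode at most 45 bytes per call, keep each result intact, and hand each result back to a2b_uu unchanged.

```python
import binascii

def uu_encode_lines(data, chunk_size=45):
    """Hypothetical helper: one b2a_uu line per chunk of at most 45 bytes."""
    return [binascii.b2a_uu(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def uu_decode_lines(lines):
    """Hypothetical helper: decode each b2a_uu result whole, then join."""
    return b"".join(binascii.a2b_uu(line) for line in lines)

data = b"a string of several hundred bytes. " * 20
lines = uu_encode_lines(data)
restored = uu_decode_lines(lines)
```

Each element of lines is up to 62 bytes long, and a2b_uu accepts it as is; only the input to b2a_uu is capped at 45 bytes, so chunk_size must not be raised above that.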

I join /F in suggesting you use binascii the standard way, but, at any
rate, you should at least redesign your decoding strategy so it only
calls a2b_uu on strings which are the results of b2a_uu calls.


Alex
 
py

Thanks...I think base64 will work just fine...and doesn't seem to have
45 byte limitations, etc.

Thanks.
 
Alex Martelli

py said:
Thanks...I think base64 will work just fine...and doesn't seem to have
45 byte limitations, etc.

Sure, base64 is a better encoding by all criteria, unless you
specifically need to use uu encoding for compatibility with other old
software.


Alex
 
