module: zipfile.writestr - line endings issue

towers

Hi

I'm probably doing something stupid, but I've run into a problem when
adding a csv file to a zip archive - see the example code below.

The csv just has several rows, each ending in a carriage return + line
feed (CRLF).

However, after adding it to an archive and then decompressing it, the line
endings have been converted to plain line feeds (LF).

Does anyone know whether this is a bug, or am I just doing something wrong?

Many Thanks,
Damon

****************************************************************************
import zipfile


zipFile = zipfile.ZipFile('LocalZipFile.zip', 'a',
zipfile.ZIP_DEFLATED)
dfile = open('LocalCSVFile.csv', 'r')
zipFile.writestr('test.csv',dfile.read())
dfile.close()
zipFile.close()
****************************************************************************
 
programmer.py

Line endings will drive you up the wall. Anyway, Python's zipfile
library does not seem to do any translation. The code below
works for me.

# begin code
import zipfile
import os.path

# First create some work data...
csv_data = '1,2\r\n3,4\r\n5,6\r\n'

# Now, create the zipfile
zip_file = zipfile.ZipFile(r'c:\tmp\test.zip', 'w',
zipfile.ZIP_DEFLATED)
zip_file.writestr('test.csv',csv_data)
zip_file.close()

# Now, extract the info
zip_file = zipfile.ZipFile(r'c:\tmp\test.zip')
assert len(zip_file.read('test.csv')) == len(csv_data)
# end code

Something else must be tweaking your line endings.
HTH!

jw
 
towers

Thanks - your code works for me also.

But I still get the issue when I read the file directly and add it to
the archive.

Say if I:

1. Use the test.csv file created with your code - currently the line
endings look good (viewed in notepad on Win XP)
2. Run the following code:

# begin code
import zipfile
import os.path

# Now, create the zipfile
dfile = open('test.csv', 'r')
zip_file = zipfile.ZipFile(r'C:\temp\ice\line endings\test.zip', 'w',
zipfile.ZIP_DEFLATED)
zip_file.writestr('test.csv',dfile.read())
dfile.close()
zip_file.close()
# end code

3. Then extract the file, and the line endings have been corrupted: it is
now one long line in Notepad. (Other programs interpret it correctly, though.)
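Rather than judging by how Notepad renders the file, the bytes can be counted directly. A minimal sketch, assuming Python 3; the helper name count_line_endings is made up for this example and is not part of zipfile:

```python
def count_line_endings(data):
    """Count CRLF, bare LF and bare CR sequences in a bytes object."""
    crlf = data.count(b'\r\n')
    # Every CRLF contains one '\n' and one '\r', so subtract those out
    # to get the endings that stand on their own.
    lf = data.count(b'\n') - crlf
    cr = data.count(b'\r') - crlf
    return {'CRLF': crlf, 'LF': lf, 'CR': cr}


# Three rows, all terminated with CRLF.
sample = count_line_endings(b'1,2\r\n3,4\r\n5,6\r\n')
# sample == {'CRLF': 3, 'LF': 0, 'CR': 0}
```

Running it on the bytes returned by zip_file.read('test.csv') shows whether the archived copy still has its CRLFs.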

Maybe the issue lies with this way (i.e. dfile.read()) of writing the
file to the archive...possibly.

Damon
 
Paul Carter

Please don't top post.

The problem is with how you are opening the file. You need to open in
binary mode if you wish to read your file unaltered. Also, file() is
preferred over open() these days I think. Use:

dfile = file('test.csv', 'rb')
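
To illustrate the difference, here is a minimal sketch. It assumes Python 3, where open() is the only spelling and text mode translates CRLF to LF on read on every platform; the helper name zip_roundtrip and the temporary file layout are made up for this example.

```python
import os
import tempfile
import zipfile


def zip_roundtrip(mode):
    """Archive a CRLF file read with `mode`; return the archived bytes."""
    workdir = tempfile.mkdtemp()
    csv_path = os.path.join(workdir, 'test.csv')
    zip_path = os.path.join(workdir, 'test.zip')

    # Write the sample data in binary mode so the CRLFs reach the disk untouched.
    with open(csv_path, 'wb') as f:
        f.write(b'1,2\r\n3,4\r\n5,6\r\n')

    # Read the file back in the mode under test and add it to the archive.
    with open(csv_path, mode) as dfile:
        data = dfile.read()
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        zf.writestr('test.csv', data)

    # Return what actually ended up inside the archive.
    with zipfile.ZipFile(zip_path) as zf:
        return zf.read('test.csv')


binary_result = zip_roundtrip('rb')  # CRLF preserved
text_result = zip_roundtrip('r')     # CRLF already collapsed to LF by open()
```

The translation happens in open(), before zipfile ever sees the data, which is why the earlier in-memory test passed.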
 
Suresh Babu Kolla

From the Python 2.5 library documentation:

<quote>
When opening a file, it's preferable to use `open()' instead of
invoking this constructor directly. `file' is more suited to type
testing (for example, writing `isinstance(f, file)').
</quote>

The Python documentation seems to recommend using open(). I personally prefer
open(), because it mirrors the familiar C/POSIX fopen() call (name, then
mode), so even beginner programmers can understand the intent of the code clearly.

Kolla
 
towers

Opening in binary mode solves the issue. Thanks very much for the help.
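
For anyone finding this later, a possible alternative is ZipFile.write(), which takes a path and reads the file itself in binary mode, so no manual open() is needed. A minimal sketch, assuming Python 3; the file names and temporary directory are just for illustration.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, 'test.csv')
zip_path = os.path.join(workdir, 'test.zip')
csv_bytes = b'1,2\r\n3,4\r\n5,6\r\n'

# Create a CRLF-terminated file on disk.
with open(csv_path, 'wb') as f:
    f.write(csv_bytes)

# ZipFile.write() copies the file's bytes into the archive as-is;
# arcname sets the name stored inside the zip.
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
    zf.write(csv_path, arcname='test.csv')

# Read the member back to compare with the original bytes.
with zipfile.ZipFile(zip_path) as zf:
    archived = zf.read('test.csv')
```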
 
Paul Carter

Yeah, you're right. I know I had read that file() was preferred
somewhere, but obviously it wasn't a good source. Thanks for the
correction!
 
Ricardo Aráoz

Hi, I'm new to this Python stuff, so maybe I'm stating the obvious or,
worse, maybe I'm completely off track.

Not long ago someone was asking about a way to hide source code. I
stumbled upon the zipimport standard module. It seems it lets you get your
imports from zip files. The docs say it is invoked implicitly, so you
could keep your modules in a zipped file. That should make them a bit
more obscure.
Another idea would be to modify this module to use encrypted zip files,
or to use PyCrypto or some other module as a middleman in order to keep
the contents encrypted.
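
A rough sketch of the zip import idea, assuming Python 3; the archive name bundle.zip and the module name hidden_mod are invented for the example. Note that the source is still stored as plain text inside the archive.

```python
import os
import sys
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, 'bundle.zip')

# Put a tiny module into a zip archive (hypothetical module name).
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('hidden_mod.py',
                'def greet():\n    return "hello from the zip"\n')

# Adding the archive to sys.path is enough; the import system
# handles zip files on the path implicitly.
sys.path.insert(0, zip_path)
import hidden_mod

greeting = hidden_mod.greet()
```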

HTH
 
