newbie - merging xls files using xldt and xlwt

Discussion in 'Python' started by Albert-jan Roskam, Oct 15, 2008.

  1. Hi,

    I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, the program works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4, xlwt version 0.7.0, xlrd version 0.5.2, Win NT.

    Any ideas?

    Thanks in advance!

    Albert-Jan


    """
    Merge all xls files in a given directory into one multisheet xls file.
    The sheets get the orginal file name, without the extension.
    File names should not exceed 31 characters, as this is the maximum
    sheet name length
    """

    import xlrd, xlwt
    import glob, os.path

    def merge_xls (in_dir, out_file="d:/merged_output.xls"):

    xls_files = glob.glob(in_dir + "*.xls")
    sheet_names = []
    merged_book = xlwt.Workbook()

    [sheet_names.append(os.path.basename(v)[:-4]) for k, v in enumerate(xls_files)]
    for k, xls_file in enumerate(xls_files):
    if len (sheet_names[k]) <= 31:
    book = xlrd.open_workbook(xls_file)
    ws = merged_book.add_sheet(sheet_names[k])
    for sheetx in range(book.nsheets):
    sheet = book.sheet_by_index(sheetx)
    for rx in range(sheet.nrows):
    for cx in range(sheet.ncols):
    ws.write(rx, cx, sheet.cell_value(rx, cx))
    else:
    print "File name too long: <%s.xls> (maximum is 31 chars) " % (sheet_names[k])
    print "File <%s.xls> is *not* included in the merged xls file." % (sheet_names[k])
    merged_book.save(out_file)

    print "---> Merged xls file written to %s using the following source files: " % (out_file)
    for k, v in enumerate(sheet_names):
    if len(v) <= 31: print "\t", str(k+1).zfill(3), "%s.xls" % (v)

    merge_xls(in_dir="d:/temp/")

    *** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero ***
    put_cell 0 1

    Traceback (most recent call last):
    File "G:\generic_syntaxes\merge_xls.py", line 37, in -toplevel-
    merge_xls(in_dir="d:/temp/")
    File "G:\generic_syntaxes\merge_xls.py", line 21, in merge_xls
    book = xlrd.open_workbook(xls_file)
    File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 139, in open_workbook
    bk.get_sheets()
    File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 389, in get_sheets
    sht = self.get_sheet(sheetno)
    File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 379, in get_sheet
    sh.read(self)
    File "C:\Python24\lib\site-packages\xlrd\sheet.py", line 285, in read
    self.put_cell(rowx, colx, XL_CELL_TEXT, bk._sharedstrings[index])
    File "C:\Python24\lib\site-packages\xlrd\sheet.py", line 214, in put_cell
    self._cell_types[rowx][colx] = ctype
    IndexError: list assignment index out of range
     
    Albert-jan Roskam, Oct 15, 2008
    #1
    1. Advertising

  2. Albert-jan Roskam

    John Machin Guest

    On Oct 15, 9:16 pm, Albert-jan Roskam <> wrote:
    > Hi,
    >
    > I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, the program works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4, xlwt version 0.7.0, xlrd version 0.5.2, Win NT.
    >
    > Any ideas?


    The version of xlrd that you are using is an antique. Go to
    http://pypi.python.org/pypi/xlrd and get the latest version. If that
    doesn't work, send me a copy of one of the files that is displaying
    this problem.
    >
    > Thanks in advance!
    >
    > Albert-Jan
    >
    > """
    > Merge all xls files in a given directory into one multisheet xls file.
    > The sheets get the orginal file name, without the extension.
    > File names should not exceed 31 characters, as this is the maximum
    > sheet name length
    > """
    >
    > import xlrd, xlwt
    > import glob, os.path
    >
    > def merge_xls (in_dir, out_file="d:/merged_output.xls"):
    >
    >     xls_files   = glob.glob(in_dir + "*.xls")
    >     sheet_names = []
    >     merged_book = xlwt.Workbook()
    >
    >     [sheet_names.append(os.path.basename(v)[:-4]) for k, v in enumerate(xls_files)]


    Wah! Try this:
    sheet_names = [os.path.basename(v)[:-4]) for v in xls_files]


    >     for k, xls_file in enumerate(xls_files):
    >         if len (sheet_names[k]) <= 31:
    >             book = xlrd.open_workbook(xls_file)
    >             ws = merged_book.add_sheet(sheet_names[k])
    >             for sheetx in range(book.nsheets):
    >                 sheet = book.sheet_by_index(sheetx)
    >                 for rx in range(sheet.nrows):
    >                     for cx in range(sheet.ncols):
    >                          ws.write(rx, cx, sheet..cell_value(rx, cx))


    I presume that you're not too worried about any date data.

    If an input file has more than 1 sheet, you are creating only one
    sheet in the output file, and overwriting cells.

    >         else:
    >[snip]
    >
    > merge_xls(in_dir="d:/temp/")
    >
    > *** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero ***
    > put_cell 0 1
    >
    > Traceback (most recent call last):

    [snip]
    >     self._cell_types[rowx][colx] = ctype
    > IndexError: list assignment index out of range


    Yeah, symptom of an xls file with a DIMENSIONS records that lies ...
    xlrd has like Excel become more resilient over time :)

    BTW, consider joining the python-excel group at http://groups.google.com/group/python-excel

    Cheers,
    John
     
    John Machin, Oct 15, 2008
    #2
    1. Advertising

  3. Albert-jan Roskam

    Support Desk Guest

    account balance checker

    Hello all,

    I was wondering if it would be possible to make a script to grab my
    balance account balance a few times a day without having to login every
    time. I know I can use the urlib2 library, but not sure how to go about
    filling in the forms and submitting them. BOA has a mobile site that is
    pretty simple. Anyone else use Bank of America and would be interested in
    this. This is not for anything illegal, just for me to prevent overdrafting
    my account

    https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobileDevice=true



    y =
    urllib.urlopen('https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobl
    eDevice=true',urllib.urlencode({'onlineID':'MYONLLINEID'})).readlines()
     
    Support Desk, Oct 15, 2008
    #3
  4. Albert-jan Roskam

    Chris Rebert Guest

    Re: account balance checker

    You'd probably have to use something like mechanize
    (http://wwwsearch.sourceforge.net/mechanize/) to fill out the forms,
    but if BofA's website uses Javascript at all, you're probably out of
    luck.

    Cheers,
    Chris
    --
    Follow the path of the Iguana...
    http://rebertia.com

    On Wed, Oct 15, 2008 at 8:09 AM, Support Desk
    <> wrote:
    > Hello all,
    >
    > I was wondering if it would be possible to make a script to grab my
    > balance account balance a few times a day without having to login every
    > time. I know I can use the urlib2 library, but not sure how to go about
    > filling in the forms and submitting them. BOA has a mobile site that is
    > pretty simple. Anyone else use Bank of America and would be interested in
    > this. This is not for anything illegal, just for me to prevent overdrafting
    > my account
    >
    > https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobileDevice=true
    >
    >
    >
    > y =
    > urllib.urlopen('https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobl
    > eDevice=true',urllib.urlencode({'onlineID':'MYONLLINEID'})).readlines()
    >
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >
     
    Chris Rebert, Oct 16, 2008
    #4
  5. Albert-jan Roskam

    Support Desk Guest

    RE: account balance checker

    I was also looking at the ClientForm Library
    http://wwwsearch.sourceforge.net/ClientForm/

    which can get me past the first username form, but I noticed it then goes
    to a challenge question form and im not sure how to take the resulting for
    and resubmit it with new information and then resubmit the resulting form
    with the password

    -----Original Message-----
    From: [mailto:] On Behalf Of Chris
    Rebert
    Sent: Thursday, October 16, 2008 10:15 AM
    To: Support Desk
    Cc:
    Subject: Re: account balance checker

    You'd probably have to use something like mechanize
    (http://wwwsearch.sourceforge.net/mechanize/) to fill out the forms,
    but if BofA's website uses Javascript at all, you're probably out of
    luck.

    Cheers,
    Chris
    --
    Follow the path of the Iguana...
    http://rebertia.com

    On Wed, Oct 15, 2008 at 8:09 AM, Support Desk
    <> wrote:
    > Hello all,
    >
    > I was wondering if it would be possible to make a script to grab my
    > balance account balance a few times a day without having to login every
    > time. I know I can use the urlib2 library, but not sure how to go about
    > filling in the forms and submitting them. BOA has a mobile site that is
    > pretty simple. Anyone else use Bank of America and would be interested in
    > this. This is not for anything illegal, just for me to prevent

    overdrafting
    > my account
    >
    > https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobileDevice=true
    >
    >
    >
    > y =
    >

    urllib.urlopen('https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobl
    > eDevice=true',urllib.urlencode({'onlineID':'MYONLLINEID'})).readlines()
    >
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >
     
    Support Desk, Oct 16, 2008
    #5
  6. paster 64bit windows compile

    Hello all,

    I've been tearing my hair out trying to get pylon installed most of the day,
    and it seems that both setup tools and paster.exe have some serious issues
    with 64bit on windows.

    Unfortunately I'm stuck with 2.6 64bit.

    I think I've got it nearly all up and running, the biggest problem is that
    paster fails to find the python executable when run, as it is 32bit and the
    python executable is 64 [afaik]. Does anyone either have a 64bit compile, or
    know where the source code for paster.exe lives so I can attempt to compile
    it myself. It doesn't appear to be in the source for PasterScript package.
    Somehow it 'appears' when setup .py install is run, but I cannot find any c
    code anywhere.

    Many thanks in advance

    Jules
     
    Jules Stevenson, Oct 23, 2008
    #6
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Chanman
    Replies:
    0
    Views:
    966
    Chanman
    Jun 19, 2008
  2. Albert-jan Roskam
    Replies:
    0
    Views:
    671
    Albert-jan Roskam
    Oct 15, 2008
  3. pp
    Replies:
    7
    Views:
    2,750
    John Machin
    Jan 10, 2010
  4. John Mathew

    Merging xls into a common one

    John Mathew, Oct 17, 2013, in forum: Python
    Replies:
    0
    Views:
    87
    John Mathew
    Oct 17, 2013
  5. Sameer Gupta
    Replies:
    0
    Views:
    124
    Sameer Gupta
    Oct 17, 2013
Loading...

Share This Page