Editing/Writing Word-Files from Python

Discussion in 'Python' started by Daniel Cloutier, Apr 20, 2004.

  1. Hi,

    is it possible to edit or write Word-files out of a Python-Program?

    thx in advance
    daniel
    Daniel Cloutier, Apr 20, 2004
    #1

  2. Peter Hansen

    Daniel Cloutier wrote:
    > is it possible to edit or write Word-files out of a Python-Program?


    Word uses a proprietary binary format, so you don't really
    have as an option the simple "open file, make some changes,
    write file" approach you might be picturing.

    On the other hand, this *is* Python, so you have alternatives:

    1. Use ActiveX and control Word from Python. This is described
    well in Mark Hammond's book on Win32 programming with Python,
    web pages (use Google), and posts in the comp.lang.python archives.

    2. Write HTML or RTF files, which are not proprietary binary
    formats. These can be edited in Python and then written back
    again. Not sure if there's an RTF library, but "it's just text".

    3. Define in more detail what you are actually looking for
    ("edit" is ill-defined) including the context, and you'll
    probably get another three or four ways of doing it.
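To illustrate option 2, here is a minimal sketch of writing an RTF file by hand from Python, since RTF really is "just text". The function names `rtf_escape` and `make_rtf` are made up for this example, and the font table and font size are arbitrary choices, not anything from the thread. It only covers plain paragraphs, and I have not checked the output in every version of Word.

```python
def rtf_escape(text):
    """Escape plain text for inclusion in an RTF document body."""
    out = []
    for ch in text:
        if ch in '\\{}':
            # Backslash and braces are RTF syntax and must be escaped.
            out.append('\\' + ch)
        elif ch == '\n':
            # Treat a newline inside the text as a paragraph break.
            out.append('\\par\n')
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit, where N is a
            # signed 16-bit decimal and '?' is the fallback character.
            data = ch.encode('utf-16-le')
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], 'little')
                if unit > 32767:
                    unit -= 65536
                out.append('\\u%d?' % unit)
    return ''.join(out)


def make_rtf(paragraphs):
    """Return a complete RTF document with one paragraph per list item."""
    # Header: RTF version 1, ANSI charset, one font (an arbitrary choice).
    header = '{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\n\\f0\\fs24 '
    body = ''.join(rtf_escape(p) + '\\par\n' for p in paragraphs)
    return header + body + '}'
```

The result is pure ASCII, so something like `open('hello.rtf', 'w', encoding='ascii').write(make_rtf(['Hello World!']))` should give a file Word can open. Editing an existing RTF file is harder, because you would need to parse the control words rather than just emit them.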

    -Peter
    Peter Hansen, Apr 20, 2004
    #2

  3. Alan Kennedy

    [Daniel Cloutier]
    > is it possible to edit or write Word-files out of a Python-Program?


    If you have access to Office 2003, are feeling brave, and have a lot
    of time on your hands, you could create and manipulate the XML
    structures that Word 2003 uses.

    I thought the group members might find it interesting to see such a
    file, so I have exported a "Hello World!" document as XML and posted
    the result below. I had to tidy it up a little, since the original
    came out all on one line. And I had to add an encoding declaration :)

    In terms of generating such structures, well, everybody has their own
    favourite *ML templating language. I'd use TAL or XSLT in "Literal
    Result Element as Stylesheet" mode ...

    http://www.w3.org/TR/xslt#result-element-stylesheet

    #--------- helloworld.xml --- cut here ------------------------
    <?xml version="1.0" encoding='utf-8'?>
    <?mso-application progid="Word.Document"?>
    <w:wordDocument
    w:embeddedObjPresent="no"
    w:macrosPresent="no"
    w:ocxPresent="no"
    xml:space="preserve"
    xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
    xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core"
    xmlns:v="urn:schemas-microsoft-com:vml"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:w10="urn:schemas-microsoft-com:office:word"
    xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
    >

    <o:DocumentProperties>
    <o:Title>Hello World</o:Title>
    <o:Author>Alan</o:Author>
    <o:LastAuthor>Alan</o:LastAuthor>
    <o:Revision>1</o:Revision>
    <o:TotalTime>1</o:TotalTime>
    <o:Created>2004-04-20T15:38:00Z</o:Created>
    <o:LastSaved>2004-04-20T15:39:00Z</o:LastSaved>
    <o:Pages>1</o:Pages>
    <o:Words>1</o:Words>
    <o:Characters>12</o:Characters>
    <o:Company>Alan</o:Company>
    <o:Lines>1</o:Lines>
    <o:Paragraphs>1</o:Paragraphs>
    <o:CharactersWithSpaces>12</o:CharactersWithSpaces>
    <o:Version>11.6113</o:Version>
    </o:DocumentProperties>
    <w:fonts>
    <w:defaultFonts
    w:ascii="Times New Roman"
    w:cs="Times New Roman"
    w:fareast="Times New Roman"
    w:h-ansi="Times New Roman"
    />
    </w:fonts>
    <w:styles>
    <w:versionOfBuiltInStylenames w:val="4"/>
    <w:latentStyles
    w:defLockedState="off"
    w:latentStyleCount="156"
    />
    <w:style
    w:default="on"
    w:styleId="Normal"
    w:type="paragraph"
    >

    <w:name w:val="Normal"/>
    <w:rPr>
    <wx:font wx:val="Times New Roman"/>
    <w:lang
    w:bidi="AR-SA"
    w:fareast="EN-GB"
    w:val="EN-GB"
    />
    </w:rPr>
    </w:style>
    <w:style
    w:default="on"
    w:styleId="DefaultParagraphFont"
    w:type="character"
    >

    <w:name w:val="Default Paragraph Font"/>
    <w:semiHidden/>
    </w:style>
    <w:style
    w:default="on"
    w:styleId="TableNormal"
    w:type="table"
    >

    <w:name w:val="Normal Table"/>
    <wx:uiName wx:val="Table Normal"/>
    <w:semiHidden/>
    <w:rPr>
    <wx:font wx:val="Times New Roman"/>
    </w:rPr>
    <w:tblPr>
    <w:tblInd w:type="dxa" w:w="0"/>
    <w:tblCellMar>
    <w:top w:type="dxa" w:w="0"/>
    <w:left w:type="dxa" w:w="108"/>
    <w:bottom w:type="dxa" w:w="0"/>
    <w:right w:type="dxa" w:w="108"/>
    </w:tblCellMar>
    </w:tblPr>
    </w:style>
    <w:style w:default="on" w:styleId="NoList" w:type="list">
    <w:name w:val="No List"/>
    <w:semiHidden/>
    </w:style>
    </w:styles>
    <w:docPr>
    <w:view w:val="print"/>
    <w:zoom w:percent="100"/>
    <w:doNotEmbedSystemFonts/>
    <w:proofState w:grammar="clean" w:spelling="clean"/>
    <w:attachedTemplate w:val=""/>
    <w:defaultTabStop w:val="720"/>
    <w:displayHorizontalDrawingGridEvery w:val="0"/>
    <w:displayVerticalDrawingGridEvery w:val="0"/>
    <w:useMarginsForDrawingGridOrigin/>
    <w:characterSpacingControl w:val="DontCompress"/>
    <w:optimizeForBrowser/>
    <w:validateAgainstSchema/>
    <w:saveInvalidXML w:val="off"/>
    <w:ignoreMixedContent w:val="off"/>
    <w:alwaysShowPlaceholderText w:val="off"/>
    <w:compat>
    <w:footnoteLayoutLikeWW8/>
    <w:shapeLayoutLikeWW8/>
    <w:alignTablesRowByRow/>
    <w:forgetLastTabAlignment/>
    <w:doNotUseHTMLParagraphAutoSpacing/>
    <w:layoutRawTableWidth/>
    <w:layoutTableRowsApart/>
    <w:useWord97LineBreakingRules/>
    <w:dontAllowFieldEndSelect/>
    <w:useWord2002TableStyleRules/>
    </w:compat>
    </w:docPr>
    <w:body>
    <wx:sect>
    <w:p>
    <w:pPr>
    <w:jc w:val="center"/>
    <w:rPr>
    <w:sz w:val="40"/>
    <w:sz-cs w:val="40"/>
    </w:rPr>
    </w:pPr>
    <w:r>
    <w:rPr>
    <w:sz w:val="40"/>
    <w:sz-cs w:val="40"/>
    </w:rPr>
    <w:t>Hello World!</w:t>
    </w:r>
    </w:p>
    <w:sectPr>
    <w:pgSz w:h="16838" w:w="11906"/>
    <w:pgMar
    w:bottom="1440"
    w:footer="720"
    w:gutter="0"
    w:header="720"
    w:left="1800"
    w:right="1800"
    w:top="1440"
    />
    <w:cols w:space="720"/>
    </w:sectPr>
    </wx:sect>
    </w:body>
    </w:wordDocument>
    #--------- helloworld.xml --- cut here ------------------------
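The listing above is what Word itself exports. For generating such a file from Python, here is a minimal sketch using the standard library's `xml.etree.ElementTree` rather than TAL or XSLT. It assumes that Word 2003 will accept a document containing only the `w:body`, `w:p`, `w:r` and `w:t` elements plus the `mso-application` processing instruction, leaving out the properties, fonts and styles shown above; that assumption is mine and is untested here. The names `w` and `build_wordml` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the exported document above.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
ET.register_namespace("w", W_NS)


def w(tag):
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return "{%s}%s" % (W_NS, tag)


def build_wordml(paragraphs):
    """Return a Word 2003 XML document (as str), one w:p per item."""
    root = ET.Element(w("wordDocument"))
    body = ET.SubElement(root, w("body"))
    for text in paragraphs:
        para = ET.SubElement(body, w("p"))
        run = ET.SubElement(para, w("r"))
        ET.SubElement(run, w("t")).text = text
    # The processing instruction tells Windows to open the file in Word.
    header = ('<?xml version="1.0" encoding="utf-8"?>\n'
              '<?mso-application progid="Word.Document"?>\n')
    return header + ET.tostring(root, encoding="unicode")
```

Write the returned string to a file opened with `encoding="utf-8"` so that it matches the declaration. ElementTree escapes characters such as `<` and `&` in the text, which is the main advantage over pasting strings together.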

    --
    alan kennedy
    ------------------------------------------------------
    check http headers here: http://xhaus.com/headers
    email alan: http://xhaus.com/contact/alan
    Alan Kennedy, Apr 20, 2004
    #3
  4. Hung Jung Lu

    Daniel Cloutier <> wrote in message news:<c63fgi$7esh7$-berlin.de>...
    >
    > is it possible to edit or write Word-files out of a Python-Program?


    Yes.

    (a) You need the win32all modules from Mark Hammond.

    (b) To try out the object model of MS Word, press ALT+F11 to bring up
    the VBA environment, then F2 to view the object browser. It's a
    complicated subject.

    -------------------------------------------
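    import pythoncom
    from win32com.client import Dispatch

    # Start Word through COM automation.
    app = Dispatch('Word.Application')
    app.Visible = True  # show the Word window while the script runs
    doc = app.Documents.Add()  # create a new, empty document
    s = doc.Sentences(1)  # COM collections are 1-based
    s.Text = 'This is a test.'
    doc.SaveAs(r'C:\mydoc2.doc')
    app.Quit()

    # Release the reference to Word, then shut down COM for this thread.
    app = None
    pythoncom.CoUninitialize()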
    --------------------------------------------

    You may want to add better exception handling; otherwise you may
    often be left with dangling Word processes when exceptions happen.
    (You'll then have to kill them manually from the Task Manager.)
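One way to do that is a `try`/`finally` around the Word calls, sketched below. The function name `write_word_document` and its `dispatch` parameter are made up for this example; the parameter exists only so the function can be exercised without Word installed. Passing `0` to `Quit` (the value of `wdDoNotSaveChanges`) is my addition, so that Word does not sit waiting on a "save changes?" prompt after a failure.

```python
def write_word_document(path, text, dispatch=None):
    """Create a Word document containing `text` and save it to `path`.

    `dispatch` is a hypothetical hook for testing; by default the real
    win32com Dispatch is used, which requires Windows, Word and pywin32.
    """
    if dispatch is None:
        # Imported lazily so this module still loads without pywin32.
        from win32com.client import Dispatch as dispatch

    app = dispatch('Word.Application')
    try:
        doc = app.Documents.Add()
        sentence = doc.Sentences(1)
        sentence.Text = text
        doc.SaveAs(path)
    finally:
        # Runs whether or not an exception occurred, so the Word
        # process is not left behind. 0 means wdDoNotSaveChanges.
        app.Quit(0)
```

Calling `write_word_document(r'C:\mydoc2.doc', 'This is a test.')` then behaves like the script above, but Word is closed even if `SaveAs` fails, for example because the path is not writable.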

    regards,

    Hung Jung
    Hung Jung Lu, Apr 20, 2004
    #4
  5. Hung Jung Lu wrote:
    > Daniel Cloutier <> wrote in message news:<c63fgi$7esh7$-berlin.de>...
    >
    >>is it possible to edit or write Word-files out of a Python-Program?

    >
    >
    > Yes.
    >
    > (a) You need the win32all modules from Mark Hammond.
    >
    > (b) To try out the object model of MS Word, press ALT+F11 to bring up
    > the VBA environment, then F2 to view the object browser. It's a
    > complicated subject.
    >
    > -------------------------------------------
    > import pythoncom
    > from win32com.client import Dispatch
    >
    > app = Dispatch('Word.Application')
    > app.Visible = 1
    > doc = app.Documents.Add()
    > s = doc.Sentences(1)
    > s.Text = 'This is a test.'
    > doc.SaveAs('C:\\mydoc2.doc')
    > app.Quit()
    >
    > app = None
    > pythoncom.CoUninitialize()
    > --------------------------------------------
    >
    > You may want to add better exception handling, otherwise, you may
    > often have dangling processes when exceptions happen. (You'll then
    > have to manually kill them from the task manager.)
    >
    > regards,
    >
    > Hung Jung


    thanks a lot, yesterday i found out myself that i have to use the
    win32com package, but i didn't know how to add some text

    greetings
    daniel
    Daniel Cloutier, Apr 21, 2004
    #5