Opening MS Word files via Python

Discussion in 'Python' started by Fazer, Apr 21, 2004.

  1. Fazer

    Fazer Guest

    Here comes another small question from me :)

    I am curious as to how I should approach reading MS Word files from
    Python. For now I just want to parse simple text, and maybe tables in
    the future. Would I have to save the Word file and open it in a text
    editor? That would kind of suck... Has anyone else tackled this issue?

    Thanks,
    Fazer, Apr 21, 2004
    #1

  2. Rob Nikander

    Rob Nikander Guest

    Fazer wrote:
    > I am curious as to how I should approach this issue. I would just
    > want to parse simple text and maybe tables in the future.
    > Would I have to save the Word file and open it in a text editor? That
    > would kind of suck... Has anyone else tackled this issue?


    The win32 extensions for Python (pywin32) let you get at the COM objects
    for applications like Word, and that would let you get the text and tables.
    google: win32 python.

    import win32com.client
    word = win32com.client.Dispatch('Word.Application')
    doc = word.Documents.Open('C:\\myfile.doc')

    But I don't know the best way to find out the methods and properties of
    the "word" object.

    Rob
    Rob Nikander, Apr 21, 2004
    #2
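
    On Rob's open point about finding the methods and properties of the
    "word" object from Python itself, here is a minimal sketch. It assumes
    pywin32's early-bound wrappers (built by gencache.EnsureDispatch) and
    relies on the _prop_map_get_ dictionary that makepy-generated classes
    carry, which is an implementation detail of pywin32 rather than a
    documented API. The helper name list_com_members is made up for this
    example.

```python
def list_com_members(obj):
    """Return (methods, properties) name lists for a makepy-wrapped COM object."""
    # Methods of an early-bound wrapper show up as ordinary Python attributes.
    methods = sorted(
        name for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name, None))
    )
    # Assumption: makepy-generated classes keep their readable properties in a
    # class-level dict named _prop_map_get_ (a pywin32 implementation detail).
    properties = sorted(getattr(obj, "_prop_map_get_", {}))
    return methods, properties


def main():
    # Imported lazily so list_com_members stays usable without pywin32.
    from win32com.client import gencache

    # EnsureDispatch builds (or reuses) early-bound wrappers for Word.
    word = gencache.EnsureDispatch("Word.Application")
    try:
        methods, properties = list_com_members(word)
        print("methods:", methods[:10])
        print("properties:", properties[:10])
    finally:
        word.Quit()  # do not leave a hidden Word process behind
```

    Calling main() needs Windows with Word and pywin32 installed. With a
    plain Dispatch (late binding) the wrapper may not expose these names,
    which is why the sketch uses EnsureDispatch.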

  3. Simon Brunning

    Fazer wrote:
    > I am curious as to how I should approach this issue. I would just
    > want to parse simple text and maybe tables in the future.
    > Would I have to save the Word file and open it in a text editor? That
    > would kind of suck... Has anyone else tackled this issue?


    See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/279003

    Cheers,
    Simon B.
    Simon Brunning, Apr 21, 2004
    #3
  4. jmdeschamps

    jmdeschamps Guest

    Rob Nikander wrote:
    <snip>
    > But I don't know the best way to find out the methods and properties of
    > the "word" object.

    You can use the VBA documentation for Word and, using dot notation and
    the normal Pythonesque way of calling functions, play with its various
    objects, methods and attributes...
    Here's some pretty straightforward code along these lines:
    #************************
    import win32com.client
    import tkFileDialog  # Python 2 name; on Python 3: from tkinter import filedialog

    # Launch Word without showing its window
    MSWord = win32com.client.Dispatch("Word.Application")
    MSWord.Visible = 0
    # Ask the user for a file and open it
    myWordDoc = tkFileDialog.askopenfilename()
    doc = MSWord.Documents.Open(myWordDoc)
    # Get the textual content (Content is a Range object; .Text is the string)
    docText = doc.Content.Text
    # Get the collection of tables
    listTables = doc.Tables
    # ... work with docText and listTables here, while the document is open ...
    # Close the document without saving and shut Word down
    doc.Close(False)
    MSWord.Quit()
    #************************

    Happy parsing,

    Jean-Marc
    jmdeschamps, Apr 21, 2004
    #4
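
    For the "tables in the future" part of the question, the Tables
    collection above can be turned into plain Python lists. This is a rough
    sketch, not a definitive implementation: it assumes tables without
    merged cells (Cell(row, column) can fail on those), and the helper
    names clean_cell_text, table_to_rows and read_word_file are invented
    for the example. Passing an absolute path to Open is usually safest,
    since Word may resolve relative paths against its own working directory.

```python
def clean_cell_text(raw):
    """Strip the end-of-cell marker Word appends to each cell's Range.Text."""
    return raw.rstrip("\r\x07")


def table_to_rows(table):
    """Convert a Word Table COM object into a list of rows of strings."""
    rows = []
    # Word collections are 1-based, so Cell(1, 1) is the top-left cell.
    for r in range(1, table.Rows.Count + 1):
        row = []
        for c in range(1, table.Columns.Count + 1):
            row.append(clean_cell_text(table.Cell(r, c).Range.Text))
        rows.append(row)
    return rows


def read_word_file(path):
    """Return (text, tables) for the document at path; needs Windows + Word."""
    import win32com.client  # imported lazily so the helpers work without pywin32

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(path)
        try:
            text = doc.Content.Text
            tables = [table_to_rows(t) for t in doc.Tables]
            return text, tables
        finally:
            doc.Close(False)  # False: close without saving changes
    finally:
        word.Quit()  # always shut down the hidden Word instance
```

    The data is copied out before the document is closed, so the returned
    strings and lists stay valid after Word has quit.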
  5. Fazer

    Fazer Guest

    jmdeschamps wrote:
    <snip>
    > Here's some pretty straightforward code along these lines:
    <snip>

    That is awesome! Thanks!

    How would I save something in Word format? I am guessing
    MSWord.Documents.Save(myWordDoc) or something along those lines? Where
    can I find more documentation? Thanks.
    Fazer, Apr 24, 2004
    #5
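
    On the saving question: in Word's object model the Documents collection
    does have a Save method, but it saves all open documents and does not
    take a file name, so the guess above is close but not quite right.
    Saving under a given name is done with SaveAs on an individual document.
    A minimal sketch follows; the helper name save_text_as_doc and the
    output path are hypothetical, and 0 is the value of Word's
    wdFormatDocument constant.

```python
WD_FORMAT_DOCUMENT = 0  # value of Word's wdFormatDocument constant (.doc)


def save_text_as_doc(word, path, text):
    """Create a new document in the Word instance, fill it, save it to path."""
    doc = word.Documents.Add()
    try:
        doc.Content.Text = text
        # SaveAs belongs to a single Document, not to the Documents collection.
        doc.SaveAs(path, WD_FORMAT_DOCUMENT)
    finally:
        doc.Close(False)  # already saved above, so discard nothing of value


def main():
    import win32com.client  # needs Windows, Word and pywin32

    word = win32com.client.Dispatch("Word.Application")
    try:
        # Hypothetical output path, used only for illustration.
        save_text_as_doc(word, r"C:\temp\example.doc", "Hello from Python")
    finally:
        word.Quit()
```

    For a document that was opened from disk and only modified, calling
    Save() on that document should be enough.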
  6. anon

    anon Guest

    Fazer wrote:
    > jmdeschamps wrote:
    >> Rob Nikander wrote:
    <snip>
    >>> But I don't know the best way to find out the methods and properties of
    >>> the "word" object.
    <snip>
    > How would I save something in Word format? I am guessing
    > MSWord.Documents.Save(myWordDoc) or something along those lines? Where
    > can I find more documentation? Thanks.

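    Open MS Word and press ALT + F11 to open the Visual Basic editor, then
    F2 to open the Object Browser, which lists Word's objects with their
    methods and properties.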
    anon, Apr 24, 2004
    #6