RE: Parsing xml file using python

Discussion in 'Python' started by David LeBlanc, Mar 5, 2004.


  1. > Hello, all,
    >
    > I am new to Python.
    >
    > I need to read an XML document, ignore all the XML tags, and write only
    > the text between the tags to a text file. In other words, if I have an
    > XML document like so:
    >
    > <tag1>This</tag1>
    > <tag2>is</tag2>
    > <tag3>a</tag3>
    > <tag1>test</tag1>
    >
    > I need to write "This is a test" to a text file. How do I achieve
    > this? Thanks.


    PyXML would be a perfect solution - and easy too.

    Dave LeBlanc
    Seattle, WA USA
     
    David LeBlanc, Mar 5, 2004
    #1

  2. Josiah Carlson, Mar 5, 2004
    #2

  3. David LeBlanc

    chad Guest

    Hi, Thanks.

    I downloaded PyXML from SourceForge, but could not install it. When I
    double-click the installer, it asks me to click Next and choose an
    install folder, but then it offers no folders to choose from.
    It seems the installer cannot read my file system info.

    I am using Python on Win2K.


    "David LeBlanc" <> wrote in message news:<>...
    > > Hello, all,
    > >
    > > I am new to Python.
    > >
    > > I need to read an XML document, ignore all the XML tags, and write only
    > > the text between the tags to a text file. In other words, if I have an
    > > XML document like so:
    > >
    > > <tag1>This</tag1>
    > > <tag2>is</tag2>
    > > <tag3>a</tag3>
    > > <tag1>test</tag1>
    > >
    > > I need to write "This is a test" to a text file. How do I achieve
    > > this? Thanks.

    >
    > PyXML would be a perfect solution - and easy too.
    >
    > Dave LeBlanc
    > Seattle, WA USA
     
    chad, Mar 5, 2004
    #3
  4. David LeBlanc

    Tim Heaney Guest

    (chad) writes:
    >
    > I downloaded PyXML from sourceforge, but could not install it.


    Python already comes with stuff like xmllib, minidom, and the
    aforementioned sgmllib:

    http://www.python.org/doc/current/lib/markup.html

    You should be able to do what you described without installing
    anything else.
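
A minimal sketch of the standard-library route, assuming Python 3 and xml.dom.minidom. The <doc> root element and the names collect_text and xml_to_text are hypothetical additions for illustration: the sample in the question has four top-level tags, and an XML parser only accepts a document with a single root.

```python
from xml.dom import minidom

# Sample from the question, wrapped in a hypothetical <doc> root because
# well-formed XML needs exactly one top-level element.
SAMPLE = """<doc>
<tag1>This</tag1>
<tag2>is</tag2>
<tag3>a</tag3>
<tag1>test</tag1>
</doc>"""


def collect_text(node):
    """Return the data of every text node below `node`, in document order."""
    pieces = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            pieces.append(child.data)
        else:
            # Elements are searched recursively; other node types have no children.
            pieces.extend(collect_text(child))
    return pieces


def xml_to_text(xml_string):
    """Drop the tags and collapse the whitespace left between the words."""
    document = minidom.parseString(xml_string)
    return " ".join("".join(collect_text(document)).split())


if __name__ == "__main__":
    print(xml_to_text(SAMPLE))
```

To write the result to a file instead of printing it, pass the returned string to the write() method of a file opened for writing.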

    I hope this helps,

    Tim
     
    Tim Heaney, Mar 5, 2004
    #4
  5. David LeBlanc

    chad Guest

    Josiah Carlson <> wrote in message news:<c29981$4eu$>...
    > > PyXML would be a perfect solution - and easy too.

    >
    > Even easier would be to use an SGML parser:
    > http://flangy.com/dev/python/striphtml.html
    >
    > Works for XML, HTML, etc.
    >
    > - Josiah


    This is absolutely cool. Nice and neat and beautiful. Thank you very
    much for that cool link, Josiah.
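
A rough sketch of the same tag-stripping idea for current Python versions. This is not the code from the linked page: sgmllib was removed in Python 3, so the sketch swaps in html.parser.HTMLParser from the standard library. The names TagStripper and strip_tags are hypothetical.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: keep the character data, ignore every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pieces = []

    def handle_data(self, data):
        # Called with the text found between tags.
        self.pieces.append(data)

    def text(self):
        # Collapse newlines and repeated spaces into single spaces.
        return " ".join("".join(self.pieces).split())


def strip_tags(markup):
    """Return the text content of an HTML- or XML-like string."""
    parser = TagStripper()
    parser.feed(markup)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = "<tag1>This</tag1>\n<tag2>is</tag2>\n<tag3>a</tag3>\n<tag1>test</tag1>"
    print(strip_tags(sample))
```

Unlike a strict XML parser, this lenient parser does not require a single root element, so the sample from the question can be fed to it unchanged.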
     
    chad, Mar 5, 2004
    #5
  6. David LeBlanc

    Sam Smith Guest

    Please try the following code. As Tim says, everything you need for
    the problem at hand is included in the core distribution. See the
    documentation for the xml.sax module.

    Code ::

    import sys
    import xml.sax
    import xml.sax.handler

    class ReadXML(xml.sax.handler.ContentHandler):
        """Collect the text between tags; the tags themselves are ignored."""

        def __init__(self, xml_file):
            xml.sax.handler.ContentHandler.__init__(self)
            self.content = ""
            # Parsing starts immediately; the input must be well-formed XML
            # with a single root element.
            xml.sax.parse(xml_file, self)

        def characters(self, content):
            # Called for each run of text; join the runs with single spaces.
            self.content = (self.content + " " + content.strip()).strip()

    if __name__ == "__main__":
        if len(sys.argv) != 3:
            print("usage: %s xml_file output_file" % sys.argv[0])
            sys.exit(1)
        xml_file = sys.argv[1]
        output_file = sys.argv[2]
        readXML = ReadXML(xml_file)
        f = open(output_file, "w")
        f.write(readXML.content)
        f.close()
     
    Sam Smith, Mar 5, 2004
    #6