XML/encoding/prolog/python hell...

fscked

I am a beginning pythoner and I am having a terrible time trying to
figure out how to do something that (it would seem to me) should be
fairly simple.

I have a CSV file of unknown encoding and I need to parse that file to
get the fields <--- DONE
I need to create an xml document that has the proper prolog and
namespace information in it. <--- NOT DONE
I need it to be encoded properly <--- Looks right in IE, not right in
any other app.

I should say that I have googled my butt off, tried ElementTree,
CSV2XML, and various other things and cannot get any of them to work.

A sample of the output I am looking for is as follows:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:boxes xmlns:ns2="Boxes">
<ns2:box id="9" mac="333333" d_a="2006" hw_ver="v1.1" sw_ver="3"
pl_h="No Data" name="Lounge" address="here" phone="555-5555"
country="US" city="LA"/>
<ns2:box id="7" mac="444444" d_a="2005" hw_ver="v1.0" sw_ver="3"
pl_h="No Data" name="MyHouse" address="there" phone="555-5556"
country="US" city="New York"/>
</ns2:boxes>

Is there some fundamental thing I am not getting? I cannot get
'tostring' to work in ElementTree and I cannot figure the prolog out.

I posted a similar message back in January, but haven't had much luck.

PS
No I haven't been trying to do this since January, more important
things came up at work and I have just revived this. :)
 
kyosohma

I've never done this, but I found a recipe on the ActiveState website
that looks like it would be helpful:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/159100

I think you could modify it to make it work.

You could probably also use a combination of the csv module and the
pyxml module (links below).

http://pyxml.sourceforge.net/topics/
http://www.rexx.com/~dkuhlman/pyxmlfaq.html

I also found a Python XML book: http://www.oreilly.com/catalog/pythonxml/chapter/ch01.html

I hope that helps. I've started my own adventure into XML with XRC and
wxPython.

Mike
 
fscked

I've never done this, but I found a recipe on the ActiveState website
that looks like it would be helpful:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/159100


I tried looking at that but couldn't figure out how to get the
property file working.

I think you could modify it to make it work.

You could probably also use a combination of the csv module and the
pyxml module (links below).

http://pyxml.sourceforge.net/topics/
http://www.rexx.com/~dkuhlman/pyxmlfaq.html

These are a little too confusing for me. :)
 
Diez B. Roggisch

fscked said:
I am a beginning pythoner and I am having a terrible time trying to
figure out how to do something that (it would seem to me) should be
fairly simple.

Show us code. As concise as possible. Then we might be able to help you.

Diez
 
fscked

Here is what I currently have. Still missing prolog information and
namespace info. Encoding is irritating me also. :)

import csv
# The standalone "elementtree" package now ships with Python as xml.etree.
from xml.etree.ElementTree import Element, SubElement, ElementTree

def indent(elem, level=0):
    # Pretty-printing helper: adds newlines and indentation in place.
    i = "\n" + level*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        for elem in elem:
            indent(elem, level+1)
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i

root = Element("boxes")
myfile = open('ClientsXMLUpdate.csv')
csvreader = csv.reader(myfile)

for row in csvreader:
    mainbox = SubElement(root, "box")
    # Use the row from the loop itself; creating a second reader on the
    # same file here would silently skip every other row.
    b = row
    mainbox.attrib["city"] = b[10]
    mainbox.attrib["country"] = b[9]
    mainbox.attrib["phone"] = b[8]
    mainbox.attrib["address"] = b[7]
    mainbox.attrib["name"] = b[6]
    mainbox.attrib["pl_heartbeat"] = b[5]
    mainbox.attrib["sw_ver"] = b[4]
    mainbox.attrib["hw_ver"] = b[3]
    mainbox.attrib["date_activated"] = b[2]
    mainbox.attrib["mac_address"] = b[1]
    mainbox.attrib["boxid"] = b[0]

myfile.close()
indent(root)
ElementTree(root).write('test.xml', "UTF-8")
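
On the "CSV file of unknown encoding" part of the problem: one common approach is to read the file as bytes and try a short list of likely encodings before handing text to the csv module. The sketch below assumes Python 3; the helper names `decode_bytes` and `read_rows` and the candidate list are illustrative assumptions, not something from the thread, and guessing an encoding this way can still pick the wrong one.

```python
import csv
import io

# Candidate encodings to try, in order. This list is an assumption; adjust
# it to whatever the exporting application is likely to produce.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252")


def decode_bytes(raw, encodings=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first encoding that decodes cleanly."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this fallback never fails.
    return raw.decode("latin-1"), "latin-1"


def read_rows(raw):
    """Parse CSV rows from raw bytes after guessing the encoding."""
    text, _ = decode_bytes(raw)
    return list(csv.reader(io.StringIO(text, newline="")))
```

With the fields decoded to proper text up front, ElementTree can then encode the output as UTF-8 itself, e.g. `rows = read_rows(open('ClientsXMLUpdate.csv', 'rb').read())`.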
 
Stefan Behnel

fscked said:
I am a beginning pythoner and I am having a terrible time trying to
figure out how to do something that (it would seem to me) should be
fairly simple.

I have a CSV file of unknown encoding and I need to parse that file to
get the fields <--- DONE
I need to create an xml document that has the proper prolog and
namespace information in it. <--- NOT DONE
I need it to be encoded properly <--- Looks right in IE, not right in
any other app.

UTF-8 encoding is the default. No need for a prologue here.

ET 1.3 will have an xml_declaration keyword argument for write() that will
allow you to write the declaration even if unnecessary. lxml already has it
now (and is ET compatible, so your code should just straight work).

http://codespeak.net/lxml
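
For reference, the `xml_declaration` argument described above did land in the ElementTree that ships with current Python (`xml.etree.ElementTree`). A minimal sketch, assuming Python 3: the `serialize` helper is a hypothetical name, and the hand-written prolog for `standalone="yes"` is a workaround rather than an ElementTree feature, since the standard-library `write()` does not appear to offer a standalone option.

```python
import io
import xml.etree.ElementTree as ET


def serialize(root, standalone=False):
    """Return the document as UTF-8 bytes, prolog included."""
    buf = io.BytesIO()
    tree = ET.ElementTree(root)
    if standalone:
        # Write the prolog by hand and suppress the automatic one, so the
        # output carries exactly one declaration.
        buf.write(b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n')
        tree.write(buf, encoding="UTF-8", xml_declaration=False)
    else:
        # Let ElementTree emit the declaration itself.
        tree.write(buf, encoding="UTF-8", xml_declaration=True)
    return buf.getvalue()
```

Writing to a `BytesIO` here just keeps the example free of real files; passing a filename to `write()` works the same way for the non-standalone case.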

I should say that I have googled my butt off, tried ElementTree,
CSV2XML, and various other things and cannot get any of them to work.

A sample of the output I am looking for is as follows:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:boxes xmlns:ns2="Boxes">
<ns2:box id="9" mac="333333" d_a="2006" hw_ver="v1.1" sw_ver="3"
pl_h="No Data" name="Lounge" address="here" phone="555-5555"
country="US" city="LA"/>
<ns2:box id="7" mac="444444" d_a="2005" hw_ver="v1.0" sw_ver="3"
pl_h="No Data" name="MyHouse" address="there" phone="555-5556"
country="US" city="New York"/>
</ns2:boxes>

This should help you to get namespaces working:

http://effbot.org/zone/element.htm#xml-namespaces
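
As a rough illustration of what that page describes: ElementTree expects namespaced tags in the `{uri}local` form and picks the prefix when serializing. The sketch below uses the standard-library `xml.etree.ElementTree` on Python 3; `qname` and `build_boxes` are hypothetical helper names, and the two sample rows are cut down from the sample output in the question.

```python
import xml.etree.ElementTree as ET

NS = "Boxes"  # namespace URI taken from the sample output


def qname(local):
    """Build a namespace-qualified tag in the '{uri}local' form."""
    return "{%s}%s" % (NS, local)


def build_boxes(rows):
    """rows: iterable of dicts mapping attribute names to string values."""
    root = ET.Element(qname("boxes"))
    for attrs in rows:
        ET.SubElement(root, qname("box"), attrib=dict(attrs))
    return root


root = build_boxes([{"id": "9", "city": "LA"}, {"id": "7", "city": "New York"}])
xml_bytes = ET.tostring(root, encoding="UTF-8")
```

The serializer chooses its own prefix (typically `ns0`), which a namespace-aware consumer treats the same as `ns2`. `ET.register_namespace()` can set a preferred prefix, but the CPython implementation seems to reserve prefixes of the form `ns` plus digits for its own use, so getting literally `ns2` is easier with lxml, whose `Element()` accepts an `nsmap` argument such as `nsmap={"ns2": "Boxes"}`.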

Hope it helps,
Stefan
 
Stefan Behnel

with lxml (although untested):

import csv

from lxml.etree import Element, SubElement, ElementTree

root = Element("{Boxes}boxes")
myfile = open('ClientsXMLUpdate.csv')
csvreader = csv.reader(myfile)

for row in csvreader:
    mainbox = SubElement(root, "{Boxes}box")
    # set() assigns an attribute; read fields from the loop's own row
    # rather than from a second reader on the same file.
    mainbox.set("city", row[10])
    [...]

# lxml takes the encoding as a keyword argument here.
ElementTree(root).write('test.xml', encoding="UTF-8", xml_declaration=True,
                        pretty_print=True)

Hope it helps,
Stefan
 
