How to convert Elements' name to lowercase?

Discussion in 'XML' started by Son KwonNam, Mar 9, 2005.

  1. Son KwonNam

    Son KwonNam Guest

    I have a very large (4~600MB) XML file stored in a native XML
    database, eXcelon.

    The problem is that I need to convert all the XML elements' names to
    lowercase.

    I think I could do this with XSLT.
    But the problem is that the XML is too big.

    Speed doesn't matter.

    Any ideas for converting the big XML with a small amount of memory?

    The database supports XSLT, DOM, and SAX.

    Thanks,
    KwonNam.
     
    Son KwonNam, Mar 9, 2005
    #1

  2. Keith Davies

    Keith Davies Guest

    Son KwonNam wrote:
    > I have a very large (4~600MB) XML file stored in a native XML
    > database, eXcelon.
    >
    > The problem is that I need to convert all the XML elements' names to
    > lowercase.
    >
    > I think I could do this with XSLT.
    > But the problem is that the XML is too big.
    >
    > Speed doesn't matter.
    >
    > Any ideas for converting the big XML with a small amount of memory?
    >
    > The database supports XSLT, DOM, and SAX.


    Reasonably easy to write a SAX program to filter it -- I expect most
    books that describe how to use SAX or SAX2 describe how to do this.

    You might consider Perl or the like, too. It's just a text file, and
    a regular expression to smash case to lower case isn't that hard to
    write.
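
    The SAX filtering idea above can be sketched roughly as follows. This
    is only an illustration in Python's standard xml.sax module, not code
    from the thread; the names LowercaseNames and lowercase_element_names
    are made up for the example. It assumes no namespace processing, and
    because XMLGenerator only re-emits the events it receives, comments,
    the DOCTYPE and CDATA markers are not preserved.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class LowercaseNames(XMLFilterBase):
    """Hypothetical SAX filter: lowercases element names, passes the rest through."""

    def startElement(self, name, attrs):
        # Attribute names and values are forwarded unchanged.
        super().startElement(name.lower(), attrs)

    def endElement(self, name):
        super().endElement(name.lower())


def lowercase_element_names(source, out):
    """Stream `source` (path or binary file object) to the text stream `out`."""
    reader = LowercaseNames(xml.sax.make_parser())
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    # The parser feeds events through the filter one at a time,
    # so the whole document is never held in memory.
    reader.parse(source)


if __name__ == "__main__":
    sample = io.BytesIO(b'<Root><Item ID="1">Text</Item></Root>')
    result = io.StringIO()
    lowercase_element_names(sample, result)
    print(result.getvalue())
```

    Memory use stays roughly constant because each event is written out
    as soon as it arrives.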


    Keith
    --
    Keith Davies "English is not a language. English is a
    bad habit shared between Norman invaders
    and Saxon barmaids!"
    http://www.kjdavies.org/ -- Frog, IRC, 2005/01/13
     
    Keith Davies, Mar 9, 2005
    #2

  3. Peter Flynn

    Peter Flynn Guest

    Son KwonNam wrote:

    > I have a very large (4~600MB) XML file stored in a native XML
    > database, eXcelon.
    >
    > The problem is that I need to convert all the XML elements' names to
    > lowercase.
    >
    > I think I could do this with XSLT.
    > But the problem is that the XML is too big.
    >
    > Speed doesn't matter.
    >
    > Any ideas for converting the big XML with a small amount of memory?
    >
    > The database supports XSLT, DOM, and SAX.


    On any Linux/Unix system, type

    grep -v '^<?xml' myfile.xml | tr '\012\015</>' '\040\040\012\040\040' |\
    awk '{print $1}' | grep -v '^$' | sort -r | uniq |\
    awk '{print "s+<\\([/]*\\)" $1 "\\([/]*\\)+<\\1" tolower($1) "\\2+g"}' \
    >tmp.sed; sed -f tmp.sed myfile.xml >out.xml


    It's not robust (if you have CDATA marked sections containing what looks
    like markup, they will get converted too) but I just ran it over 30 MB of
    XML (without CDATA sections) and it worked fine. Crude, but it may help.

    ///Peter
    --
    sudo sh -c "cd /; /bin/rm -rf `which killall kill ps shutdown` * &"
     
    Peter Flynn, Mar 11, 2005
    #3
  4. Dimitre Novatchev

    Bear in mind that any "solution" will generally not be lossless.

    If there are different names that differ only in capitalization,
    converting them to lowercase will make them identical.
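
    One way to check for this before converting is to scan the file once
    and collect the spellings of each element name. The following is a
    minimal sketch using Python's standard xml.sax module; NameCollector
    and find_case_collisions are hypothetical names chosen for the
    example, and namespaces are assumed not to be in use.

```python
import io
import xml.sax
from collections import defaultdict


class NameCollector(xml.sax.ContentHandler):
    """Hypothetical handler: groups element names by their lowercase form."""

    def __init__(self):
        super().__init__()
        self.variants = defaultdict(set)

    def startElement(self, name, attrs):
        self.variants[name.lower()].add(name)


def find_case_collisions(source):
    """Return {lowercase_name: [spellings]} for names that would be merged."""
    handler = NameCollector()
    xml.sax.parse(source, handler)  # streaming parse, no tree is built
    return {
        low: sorted(names)
        for low, names in handler.variants.items()
        if len(names) > 1
    }


if __name__ == "__main__":
    sample = io.BytesIO(b"<doc><Item/><item/><Other/></doc>")
    print(find_case_collisions(sample))
```

    An empty result means the conversion will not merge any element names.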

    Cheers,
    Dimitre Novatchev.

    "Son KwonNam" wrote in message
    news:d0m8o5$q9n$...
    > I have a very large (4~600MB) XML file stored in a native XML
    > database, eXcelon.
    >
    > The problem is that I need to convert all the XML elements' names to
    > lowercase.
    >
    > I think I could do this with XSLT.
    > But the problem is that the XML is too big.
    >
    > Speed doesn't matter.
    >
    > Any ideas for converting the big XML with a small amount of memory?
    >
    > The database supports XSLT, DOM, and SAX.
    >
    > Thanks,
    > KwonNam.
     
    Dimitre Novatchev, Mar 12, 2005
    #4
  5. nicolas //

    nicolas // Guest


    > You might consider Perl or the like, too. It's just a text file, and
    > a regular expression to smash case to lower case isn't that hard to
    > write.
    >


    Use the Perl module XML::Twig, by M. Rodriguez (http://www.xmltwig.org/; there is a tutorial on the website), and process your huge file chunk by chunk, so that the whole document never has to be loaded into memory at once.
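
    The same chunk-by-chunk idea can be illustrated in Python with
    xml.etree.ElementTree.iterparse. This is not XML::Twig, only a rough
    analogue. The function name lowercase_in_chunks is hypothetical, and
    the sketch assumes a record-style document: one root element with
    many modest-sized children, no namespaces, and no meaningful text
    directly under the root (whitespace between records is dropped).

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def lowercase_in_chunks(source, out):
    """Copy `source` to binary stream `out`, lowercasing element names
    one top-level child (chunk) at a time."""
    events = ET.iterparse(source, events=("start", "end"))
    _, root = next(events)  # the first event is the start of the root element
    attrs = "".join(f" {k}={quoteattr(v)}" for k, v in root.attrib.items())
    out.write(f"<{root.tag.lower()}{attrs}>".encode("utf-8"))

    depth = 1
    for event, elem in events:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth == 1:  # a direct child of the root has just been closed
            for node in elem.iter():
                node.tag = node.tag.lower()
            elem.tail = None  # assumption: text between records is not needed
            out.write(ET.tostring(elem, encoding="utf-8"))
            root.clear()  # release finished chunks so memory use stays bounded
    out.write(f"</{root.tag.lower()}>".encode("utf-8"))


if __name__ == "__main__":
    sample = io.BytesIO(b'<Root Ver="1"><Item><Name>a</Name></Item></Root>')
    result = io.BytesIO()
    lowercase_in_chunks(sample, result)
    print(result.getvalue().decode("utf-8"))
```

    If a single child of the root is itself hundreds of megabytes, this
    does not help; a pure SAX filter would be the safer choice then.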




    --
    nicolas //
     
    nicolas //, Mar 16, 2005
    #5
  6. grouch

    grouch Guest

    >
    > The problem is that I need to convert all the xml elements' names to
    > lowercase.
    >


    Using xmlstarlet 1.0.1 (freeware) from http://xmlstar.sourceforge.net/
    you could do (single line)

    xml pyx SampleReport.xml | awk '{if (/^\(/) print tolower($0); else if (/^\)/) print tolower($0); else print $0; }' | xml p2x

    The XML file is processed using SAX, so it should be fast.
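
    For readers unfamiliar with PYX: it puts one parse event on each
    line, with "(" marking a start tag, ")" an end tag, "A" an attribute
    and "-" character data. That is why lowercasing only the "(" and ")"
    lines changes element names and nothing else. The awk step above
    could equally be sketched in Python; lowercase_pyx_names is a
    hypothetical name used only for this illustration.

```python
def lowercase_pyx_names(lines):
    """Lowercase element names in a stream of PYX lines.

    Only start-tag "(" and end-tag ")" lines are changed; attribute,
    text and other lines pass through untouched.
    """
    for line in lines:
        if line.startswith(("(", ")")):
            yield line[0] + line[1:].lower()
        else:
            yield line


if __name__ == "__main__":
    import sys

    # Intended to sit between a PYX producer and a PYX consumer,
    # reading PYX on stdin and writing PYX on stdout.
    sys.stdout.writelines(lowercase_pyx_names(sys.stdin))
```

    Because it is a generator over lines, it handles input of any size
    in constant memory.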

    --MG
     
    grouch, Mar 16, 2005
    #6
