M
Marc Dubois
hi,
is it possible to parse an XML file in C so that i can fulfill these
requirements :
1) replace all "<" and ">" signs inside the body of tag by a space, e.g. :
Example 1:
<foo> blabla < bla </foo>
becomes
<foo> blabla bla </foo>
Example 2:
<foo>> blablabla </foo>
becomes
<foo> blablabla </foo>
2) Remove all extra spaces at the end of every line of the XML file
3) Replace all special characters ( Unicode or Hexadecimal characters) by a
space
I mean the XML file is not well formed if there are "<" and ">" signs a
little bit everywhere,
it is not a valid file in that case, so i do not think the use of a parser
would be appropriate in that case. (How would the parser react when it
encounters a < that does not correspond to the beginning of a tag ???)
Do you have an idea on how i can write a program to deal with these
requirements ?
Technical environment is : Unix, KSH, and C (gcc)
I am thinking of using the "sed" command instead, i can get rid of the extra
spaces and replace the special characters but i still do not know how to
deal with the extra ">" and "<" signs.
Thanks for your help.
is it possible to parse an XML file in C so that i can fulfill these
requirements :
1) replace all "<" and ">" signs inside the body of tag by a space, e.g. :
Example 1:
<foo> blabla < bla </foo>
becomes
<foo> blabla bla </foo>
Example 2:
<foo>> blablabla </foo>
becomes
<foo> blablabla </foo>
2) Remove all extra spaces at the end of every line of the XML file
3) Replace all special characters ( Unicode or Hexadecimal characters) by a
space
I mean the XML file is not well formed if there are "<" and ">" signs a
little bit everywhere,
it is not a valid file in that case, so i do not think the use of a parser
would be appropriate in that case. (How would the parser react when it
encounters a < that does not correspond to the beginning of a tag ???)
Do you have an idea on how i can write a program to deal with these
requirements ?
Technical environment is : Unix, KSH, and C (gcc)
I am thinking of using the "sed" command instead, i can get rid of the extra
spaces and replace the special characters but i still do not know how to
deal with the extra ">" and "<" signs.
Thanks for your help.