Removing obscure chars

Yobbo

Hi All

I have an ASP function in place to strip invalid chars out of a data store
before I create an XML file of this data, but my function doesn't work on a
certain set of chars.

As far as I can see these are the following:

a) trademark char
b) long hyphen/dash char
c) smart/curly quotes (both left and right)

Even though my function is set up as follows:

Function ReFormatStringForXML(s)
IF LEN(s) > 0 AND NOT IsNull(s) THEN
s = Replace(s,"™","&trade;")
s = Replace(s,"—","-")
s = Replace(s,"’","&quot;")
s = Replace(s,"'","&quot;")
s = Replace(s,"""","&quot;")
s = Replace(s,"&","&amp;")
s = Replace(s,"<","&lt;")
s = Replace(s,">","&gt;")
END IF
ReFormatStringForXML = s
End Function

These chars still pass by and foul up my XML file.

I have a feeling that it's down to the fact that my function is looking for
the html equiv rather than the actual char, but I can't possibly get away
with simply copying and pasting these friggin(!!) chars into my function.
Surely this is bad practice?

Does anybody know how I can trap and replace/remove these chars if need be?

Thanks
 
Adrienne Boswell

As far as I can see these are the following:

a) trademark char
b) long hyphen/dash char
c) smart/curly quotes (both left and right)

I detest these "smart" quotes. Are regular quotes dumb by comparison?
Even though my function is set up as follows:

Function ReFormatStringForXML(s)
IF LEN(s) > 0 AND NOT IsNull(s) THEN
s = Replace(s,"™","&trade;")
s = Replace(s,"—","-")
s = Replace(s,"’","&quot;")
s = Replace(s,"'","&quot;")
s = Replace(s,"""","&quot;")
s = Replace(s,"&","&amp;")
s = Replace(s,"<","&lt;")
s = Replace(s,">","&gt;")
END IF
ReFormatStringForXML = s
End Function

These chars still pass by and foul up my XML file.

I have a feeling that it's down to the fact that my function is looking
for the html equiv rather than the actual char, but I can't possibly
get away with simply copying and pasting these friggin(!!) chars into my
function. Surely this is bad practice?

You are putting in the HTML entity; you may need to use the character
code instead, for example:
s = Replace(s, Chr(60), "&lt;")
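The same idea extends to the trademark, dash and curly-quote characters, which sit outside the ASCII range and so need ChrW with their Unicode code points rather than Chr. What follows is only a minimal sketch: the function name ReFormatStringForXML2 and the choice of replacement strings are assumptions for illustration, not something from the original post. It escapes the ampersand first, because running that replacement last would turn an already inserted &quot; into &amp;quot;.

```vbscript
' Hypothetical variant of the poster's function; the name and the
' replacement choices are illustrative assumptions.
Function ReFormatStringForXML2(s)
    If Not IsNull(s) Then
        ' Escape the ampersand first so references added below are not escaped again.
        s = Replace(s, "&", "&amp;")
        s = Replace(s, "<", "&lt;")
        s = Replace(s, ">", "&gt;")
        s = Replace(s, Chr(34), "&quot;")        ' straight double quote
        s = Replace(s, ChrW(8482), "&#8482;")    ' trademark sign
        s = Replace(s, ChrW(8212), "-")          ' em dash
        s = Replace(s, ChrW(8216), "'")          ' left single curly quote
        s = Replace(s, ChrW(8217), "'")          ' right single curly quote
        s = Replace(s, ChrW(8220), "&quot;")     ' left double curly quote
        s = Replace(s, ChrW(8221), "&quot;")     ' right double curly quote
    End If
    ReFormatStringForXML2 = s
End Function

' ReFormatStringForXML2("A & B" & ChrW(8482)) returns "A &amp; B&#8482;"
```

Using code points keeps the source file plain ASCII, so the result should not depend on how the .asp file itself was saved.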
Does anybody know how I can trap and replace/remove these chars if
need be?

Thanks

HTH
 
Daniel Crichton

Yobbo wrote on Tue, 3 Apr 2007 18:17:59 +0100:
Does anybody know how I can trap and replace/remove these chars if need
be?

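Your function is quite limited. What happens when a character not in your
list appears? The XML supported entity list is pretty small: only &amp;,
&lt;, &gt;, &quot; and &apos; are predefined, so an HTML entity such as
&trade; won't be recognised unless a DTD declares it.

Here's the function I use in my own XML generation code; it's crude but it works: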

Function XMLEncode(strText)

    'loop through the text and replace all non-alphanumeric characters with
    'an entity or their numeric character code
    Dim strNewText, i, j
    strNewText = ""

    For i = 1 To Len(strText)

        j = Asc(Mid(strText, i, 1))

        If j = 10 Then
            'replace a line feed with an escaped <br>
            strNewText = strNewText & "&lt;br&gt;"
        ElseIf j = 13 Or j = 9 Then
            'cr, tab: strip them
        ElseIf j = 34 Then 'double quote
            strNewText = strNewText & "&quot;"
        ElseIf j = 39 Then 'apostrophe
            strNewText = strNewText & "&apos;"
        ElseIf j = 32 Or j = 45 Or (j >= 48 And j <= 57) Or (j >= 65 And j <= 90) Or _
               (j >= 97 And j <= 122) Then
            'space, hyphen, 0-9, A-Z, a-z: ok
            strNewText = strNewText & Mid(strText, i, 1)
        ElseIf j = 38 Then '&
            strNewText = strNewText & "&amp;"
        ElseIf j = 60 Then '<
            strNewText = strNewText & "&lt;"
        ElseIf j = 62 Then '>
            strNewText = strNewText & "&gt;"
        Else
            'everything else becomes a numeric character reference
            strNewText = strNewText & "&#" & j & ";"
        End If

    Next

    XMLEncode = strNewText
End Function


This checks each character in the string in turn, and replaces some with
entities, and the rest of the non-alphanumeric characters with their numeric
value. You could easily add a few more entity replacements as required. Just
watch out for the first couple of replacements where I replace line feeds
with an escaped <br>, and strip out carriage returns and tabs, as that might
not fit what you want to do with the XML yourself.
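One caveat worth hedging: as far as I know, Asc returns the character's code in the current ANSI code page, so on a Windows-1252 system the trademark sign would come out as &#153; rather than the Unicode reference &#8482;. A small sketch of a helper built on AscW instead is below; the name NumericRef is hypothetical, and it does not attempt to combine surrogate pairs.

```vbscript
' Hypothetical helper, not part of the function above.
Function NumericRef(ch)
    Dim code
    code = AscW(ch)
    ' AscW returns a signed 16-bit value, so code units above 32767
    ' come back negative and need shifting into the 0..65535 range.
    If code < 0 Then code = CLng(code) + 65536
    NumericRef = "&#" & code & ";"
End Function

' NumericRef(ChrW(8482)) returns "&#8482;" for the trademark sign
```

It could stand in for the "&#" & j & ";" expression in the final Else branch, assuming the loop passes it the current character.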

Dan
 
Anthony Jones

Yobbo said:
Does anybody know how I can trap and replace/remove these chars if need be?

Thanks

If you are creating an XML file, can you use a DOMDocument to build it and
save it? That'll ensure correct XML is created, since the DOM does the
escaping when it writes the file out.
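
A rough sketch of that approach in classic ASP is below. The ProgID version, the element names and the output file name are all assumptions for illustration; use whichever MSXML version is installed on the server.

```vbscript
' Sketch only: element names and the file name are made up for illustration.
Dim doc, root, item
Set doc = Server.CreateObject("MSXML2.DOMDocument.6.0")   ' assumes MSXML 6 is installed

' Declare UTF-8 so characters such as the trademark sign are saved as-is.
doc.appendChild doc.createProcessingInstruction("xml", "version=""1.0"" encoding=""UTF-8""")

Set root = doc.createElement("products")
doc.appendChild root

Set item = doc.createElement("product")
' Assign raw text; the DOM escapes markup characters when it serialises.
item.text = "Widget" & ChrW(8482) & " <new> & improved"
root.appendChild item

' The account running ASP needs write permission on the target folder.
doc.save Server.MapPath("products.xml")
```

With this route there is no hand-written replace list to maintain, because the text never has to be turned into markup by string concatenation.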
 
