tempest
Hi all.
This is a rather long posting but I have some questions concerning the
usage of character entities in XML documents and PCI security
compliance.
The company I work for is using a third party ecommerce service for
hosting its online store. A few months ago this third party commerce
site began using PGP file encryption on XML files (e.g. web orders)
transferred to us as part of the ongoing PCI security compliance.
Basically we only need to add a PGP decryption process before we can
parse the incoming XML files so there should not have been any
technical issue.
However, we noticed that XML files they created since PGP encryption
was implemented contain some unusual character entities.
For example, if an XML file has elements containing characters such as
<, >, &, -, /, ' and so on, the XML file uses the following
character entities to represent them:
Character    Unusual Character Entity
<            &amp;lt;
>            &amp;gt;
&            &amp;#38;
-            &amp;#45;
/            &amp;#47;
'            &amp;#39;
No matter how you look at them, they are NOT the proper character
entities for the original characters shown.
The problem with these bad character entities is that when we use .NET
Framework components such as XmlReader to load the XML file, the
character entities are not expanded back to the original characters
they represent.
Instead I get the following result:
Unusual Character Entity    Expanded Result
&amp;lt;                    &lt;
&amp;gt;                    &gt;
&amp;#38;                   &#38;
&amp;#45;                   &#45;
&amp;#47;                   &#47;
&amp;#39;                   &#39;
If you take a close look at the expanded results, you would see that
they are the normal character entities you would expect to see.
It seems to me that the XML export process used by the ecommerce site
has applied character entity "encoding" twice.
For example, a proper character entity for / is &#47;.
However, if you treat &#47; as a data string and not as a character
entity and apply another "encoding", you get &amp;#47;.
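To illustrate what I mean by encoding twice, here is a minimal sketch in Python. I have not seen the hosting company's export code, so this only reproduces the symptom; the <addr> element and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

original = "c/o R. Fenton, M.D."

# First pass: "/" becomes a numeric character reference (valid, if unnecessary).
once = original.replace("/", "&#47;")

# Second pass: the "&" of that reference is itself escaped to "&amp;".
twice = escape(once)

# A parser undoes exactly one level of encoding.
parsed_once = ET.fromstring(f"<addr>{once}</addr>").text
parsed_twice = ET.fromstring(f"<addr>{twice}</addr>").text

print(twice)         # c&amp;#47;o R. Fenton, M.D.
print(parsed_once)   # c/o R. Fenton, M.D.
print(parsed_twice)  # c&#47;o R. Fenton, M.D.
```

The parser is behaving correctly in both cases: it removes one level of encoding, so text that was encoded twice comes back still encoded once.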
This means that whenever an online customer enters characters such as &
or / in their name or shipping address, the XML file we parse will
not give us the correct text.
For example, if a customer entered "Christian & Cruz" in their shipping
address, the XML file we downloaded will show it as "Christian
&amp;#38; Cruz". And when the XML file is parsed, the resulting string
we get is "Christian &#38; Cruz".
Another example: if a customer entered "c/o R. Fenton, M.D." in their
shipping address, the XML file will show this string as "c&amp;#47;o
R. Fenton, M.D.", and the resulting string we parse is
"c&#47;o R. Fenton, M.D.".
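As a stopgap on our side, one option would be to decode the parsed text a second time. The sketch below is in Python rather than .NET, and it assumes every text value in the file is consistently encoded twice; the <order> and <shipTo> element names are invented for the example.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical fragment as it arrives after PGP decryption.
doc = "<order><shipTo>Christian &amp;#38; Cruz</shipTo></order>"

# The XML parser removes the first level of encoding.
raw = ET.fromstring(doc).findtext("shipTo")

# A second, single-pass decode removes the leftover level.
fixed = html.unescape(raw)

print(raw)    # Christian &#38; Cruz
print(fixed)  # Christian & Cruz
```

This is only safe while the export stays double-encoded: if the hosting company ever corrects it, the extra decode would start mangling any text that legitimately contains something like &#38;. Note also that html.unescape recognizes HTML named entities in addition to the five predefined XML ones.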
When we reported this problem to the ecommerce hosting company, their
response was that these character entities were "encoded" per PCI
security policy and thus they have no plan to "fix" them.
Their reply sounds strange because these weird character entities they
use in XML files are NOT data encryption, nor do they provide any
security benefit.
Can anyone tell me if there are in fact some kind of special character
entities required in XML files for PCI security compliance?
Or is our ecommerce hosting company wrong?
Any information would be appreciated.
Thank you.