XMLHTTP: translating entities like curly quotes and curly apostrophes?

Ken Fine

I'm using XMLHTTP to screen-scrape many thousands of pages of content as
part of a data-structuring project.

One issue that I'm running into is that some entities such as curly quotes
and curly apostrophes do not translate properly; they're returned as
question marks indicating an unidentified character.

I'm guessing the usual hack of writing a translate function doesn't work
since the problem lies in the data being pulled down by XMLHTTP.

Is there anything that can be done, short of using a different
screen-scraping component? I initially used something called "ASPTear", but
moved to XMLHTTP since it seems to return fewer errors in production.

Thanks,
-KF
 
Ken Fine

For what it's worth: the best solution I found was to save the content
locally and then search/replace all of the entities with more standard
equivalents. Maybe this helps someone.
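
A minimal sketch of that search/replace step, written in Python for illustration since the thread doesn't include code. The names `REPLACEMENTS`, `normalize_entities`, and `load_saved_page` are hypothetical, the mapping is only a starting point, and the Windows-1252 decoding is an assumption about how the scraped pages were encoded, not something confirmed above.

```python
# Hypothetical mapping from "curly" characters and their HTML entities
# to plain ASCII equivalents. Extend it for whatever the pages contain.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / curly apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2026": "...",  # horizontal ellipsis
    "&lsquo;": "'",
    "&rsquo;": "'",
    "&ldquo;": '"',
    "&rdquo;": '"',
    "&#8216;": "'",
    "&#8217;": "'",
    "&#8220;": '"',
    "&#8221;": '"',
}


def load_saved_page(raw: bytes) -> str:
    """Decode a locally saved page.

    Assumption: the pages are Windows-1252, where bytes 0x91-0x94 are the
    curly quotes. Bytes that cannot be decoded become U+FFFD.
    """
    return raw.decode("cp1252", errors="replace")


def normalize_entities(text: str) -> str:
    """Replace each curly character or entity with its plain equivalent."""
    for src, dst in REPLACEMENTS.items():
        text = text.replace(src, dst)
    return text


if __name__ == "__main__":
    page = load_saved_page(b"\x93It\x92s here,\x94 she said &ldquo;twice&rdquo;")
    print(normalize_entities(page))
```

Working from the saved bytes rather than an already-decoded string is deliberate: once a character has been turned into a question mark during decoding, no replace function can recover it.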
 
