How to process HTML pages on server side with HTML DOM?

Vince C.

Hi.

I'd like to process HTML documents in an ASP script, i.e. remove any unwanted
elements and extract the desired elements and attributes. I know how to do it
on the client side within IE using its HTML DOM, but I'd like to do it
server-side. Is there a way, for instance, to reuse MSIE technology to retrieve
interfaces like IHTMLElement, IHTMLDOMAttribute, and so on, or are there
built-in features that would allow me to do the same?

Thanks for any hint/suggestion.

Vince C.
 
Yan-Hong Huang[MSFT]

Hi Vince,

Thanks for posting in the group.

I am currently looking for someone who can help you with this. We will get
back to you here with more information as soon as possible. If you have any
other concerns, please feel free to post them here.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.
 
MSFT

Hi Vince,

As I understand it, you want to parse the DOM elements of an HTML file in an
ASP server script. To achieve this, we can read the HTML file with FSO and load
it into an HTMLDocument object, for example:

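<%@Language=VBScript CODEPAGE=65001 %>
<%
Dim doc, objFSO, htmlFile

' "HTMLFILE" is the ProgID of the MSHTML document object (the IE HTML DOM).
Set doc = CreateObject("HTMLFILE")
Set objFSO = Server.CreateObject("Scripting.FileSystemObject")

' Open the page for reading (the default mode) and feed it to the parser.
Set htmlFile = objFSO.OpenTextFile("c:\test.html")
doc.write htmlFile.ReadAll
htmlFile.Close
doc.close  ' end the write so the document is fully parsed

'doc.body.innerText="hello world"

' Output the parsed document, serialized back to HTML.
Response.Write doc.documentElement.outerHTML
%>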
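
Once the document is loaded, the usual DOM calls can be used to do what the original question asks: drop unwanted elements and read attributes from the ones that remain. The following is only a minimal sketch, not a tested production script. The inline markup, the choice of <img> as the "unwanted" element, and the variable names are all assumptions made for illustration; in practice the string passed to doc.write would be the file contents read with FSO.

```vbscript
<%@ Language=VBScript %>
<%
Option Explicit

Dim doc, unwanted, link, href, i

' Parse a small hard-coded page (hypothetical sample markup).
Set doc = CreateObject("htmlfile")
doc.write "<html><body><img src=""logo.gif"">" & _
          "<a href=""http://example.com/"" title=""Example"">A link</a>" & _
          "</body></html>"
doc.close

' Remove every <img> element. The collection is live, so walk it
' backwards to keep the remaining indexes valid while removing.
Set unwanted = doc.getElementsByTagName("img")
For i = unwanted.length - 1 To 0 Step -1
    unwanted.item(i).parentNode.removeChild unwanted.item(i)
Next

' Extract the href attribute of every remaining <a> element.
For Each link In doc.getElementsByTagName("a")
    href = link.getAttribute("href")
    If Not IsNull(href) Then
        Response.Write Server.HTMLEncode(href) & "<br>"
    End If
Next
%>
```

The cleaned markup can still be written out afterwards with doc.documentElement.outerHTML, as in the example above.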

Hope this answers your question.

Regards,

Luke
Microsoft Online Support

Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)
 
Vince C.

MSFT said:
As I understand it, you want to parse the DOM elements of an HTML file in an
ASP server script. To achieve this, we can read the HTML file with FSO and load
it into an HTMLDocument object [...]

Oh my! I never thought it would be so simple! It knocks one's socks off... It
was worth asking the question before reinventing the wheel.

So there are no threading or performance issues, apart from those related to
using FSO? Note you don't need to say "yes" as I'm already satisfied ;-).

Vince C.
"- Use the forge, Luke..."
 
MSFT

Hi Vince,

For a frequently requested web page, we do need to consider the performance of
FSO. That said, FSO is the common way to read a file in ASP.

Luke
Microsoft Online Support

 
Vince C.

Pete said:
Or, if your html is xml-compliant, you can always use the XML Parser.

That's my problem: it's not, because the pages are modified (for now) by
Office tools that are not XHTML-aware.

Vince C.
 
