Saving contents of page to a string variable

John Kotuby

Hi all,

I am using a 3rd party program in a VS2005 web project. The tool takes as
input a string containing HTML and converts it to RTF.
I have been creating a page by dynamically loading UserControls and then
sending the page to the browser.

Now I want to write the contents of the page to a string variable,
convert that string to RTF and send the result to the browser. I am stuck
on the "simple" part of getting the contents of the rendered page into a
string variable. It also occurs to me that I might not be able to get the
rendered page from the LoadComplete sub.

Below is some code that demonstrates how the 3rd party control should work.
Note that the line "Dim htmlString = Response.Output" indicates my lack of
understanding of how to get the rendered contents of the page into a string
variable, which is what I am trying to do. I have been looking at the Response
object to accomplish this task, but maybe I am looking in the wrong place.

Also, can I likewise persist the contents of a page that I have assembled in
.NET to a file on the server itself, rather than sending it to the browser?

--------------------------
Protected Sub Common_DeckRTFOutput_LoadComplete(ByVal sender As Object, _
        ByVal e As System.EventArgs) Handles Me.LoadComplete
    'placeholder only: Response.Output is a TextWriter, not the rendered HTML
    Dim htmlString = Response.Output
    Dim rtfString As String = ""

    'create object (instance) of html2rtf converter
    Dim h As SautinSoft.HtmlToRtf.Converter = New SautinSoft.HtmlToRtf.Converter()
    If Not h Is Nothing Then
        'set converter options
        h.OutputTextFormat = SautinSoft.HtmlToRtf.eOutputTextFormat.Rtf
        h.HtmlPath = "C:\development\powercard.net"

        'convert strings
        rtfString = h.ConvertString(htmlString)
        Response.Write(rtfString)
    End If
End Sub
 
gfergo

John,

Are you trying to do a "screen scrape"?? In other words, do you want
to grab the HTML as if you were viewing the source?

There is a great article on 4Guys (http://www.4guysfromrolla.com)
called Screen Scrapes in ASP.NET.

Here is a snippet from that article -

<%@ Import Namespace="System.Net" %>
<script language="VB" runat="server">
Sub Page_Load(sender as Object, e as EventArgs)
    'STEP 1: Create a WebClient instance
    Dim objWebClient as New WebClient()

    'STEP 2: Call the DownloadData method
    Const strURL as String = "http://www.aspmessageboard.com/"
    Dim aRequestedHTML() as Byte

    aRequestedHTML = objWebClient.DownloadData(strURL)

    'STEP 3: Convert the Byte array into a String
    Dim objUTF8 as New UTF8Encoding()
    Dim strRequestedHTML as String
    strRequestedHTML = objUTF8.GetString(aRequestedHTML)

    'WE'RE DONE! - display the string
    lblHTMLOutput.Text = strRequestedHTML
End Sub
</script>
 
George Ter-Saakov

I see what you are trying to do.

You can use the following code.

1. Move all the HTML that is currently in your Page into one big UserControl.

Then in your aspx page

MyUserControl cn = (MyUserControl)LoadControl("~/MyUserControl.ascx");
StringWriter sw = new StringWriter();
HtmlTextWriter hw = new HtmlTextWriter(sw);
cn.RenderControl(hw);
string sHtml = sw.ToString();

It's C#, but you should be able to convert it to VB.NET easily.
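
For reference, a rough VB.NET version of the same idea might look like the sketch below. It assumes the code lives in the page's code-behind, so that LoadControl is available. The class name RenderToStringPage and the control path are hypothetical. Controls that must sit inside a server form (buttons, text boxes and so on) may throw when rendered this way, and some of the control's lifecycle events may not have run unless it is first added to the page's control tree, so treat it as a starting point only.

```vbnet
Imports System.IO
Imports System.Web.UI

' Hypothetical code-behind class; match the name to your own .aspx page.
Partial Class RenderToStringPage
    Inherits System.Web.UI.Page

    ' Renders the user control at virtualPath into a string
    ' instead of writing it to the response stream.
    Private Function RenderControlToHtml(ByVal virtualPath As String) As String
        Dim ctl As Control = LoadControl(virtualPath)
        Dim sw As New StringWriter()
        Dim hw As New HtmlTextWriter(sw)
        ctl.RenderControl(hw)
        Return sw.ToString()
    End Function
End Class
```

A call such as RenderControlToHtml("~/MyUserControl.ascx") would then return the markup as a string that can be handed to the converter.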

George.
 
John Kotuby

Hi Steve,
I have been using your ExportPanel with good results. Thanks very much.

This latest need came about when the boss saw the DOC output from the
ExportPanel (which I thought was very nice) and said "Where are the custom
logos/graphics that I told our customers we could place in DOC and PDF files
we generate?"

Well as far as I can tell, if an image is included in the HTML and exported
as a DOC attachment, there is only a web reference to the location of the
image... which is not actually part of the DOC file.

So I'm trying a product which claims to actually integrate images into the
RTF document structure itself, hence the need to grab the HTML as it
is sent to the browser.

From what I am seeing, instead of directly calling the page that generates
the HTML, I call a "proxy" page containing the WebClient, which then requests
the other page (even if it is on the same server in the same web) and collects
the HTML stream. I can then place that in a variable or save it to a file,
probably with FSO or something similar in .NET.
--------------------
Dim myWebClient As New WebClient()
'WebClient needs an absolute URL; "~/" is only understood by ASP.NET itself
Dim pageUri As New Uri(Request.Url, ResolveUrl("~/htmlout.aspx?deckid=233"))
Dim myStream As Stream = myWebClient.OpenRead(pageUri)
Dim sr As New StreamReader(myStream)
Dim htmlString As String = sr.ReadToEnd()
sr.Close()
--------------------

Or something like that to capture stream output to a variable?
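
A slightly fuller sketch of the "proxy" page idea, including the save-to-file part, might look like this. The class name ProxyPage, the App_Data file name and the UTF-8 encoding are assumptions, not something confirmed anywhere in this thread. Note also that WebClient makes a brand new HTTP request, so it does not carry the current user's session, cookies or login along with it.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Text

' Hypothetical "proxy" page that fetches another page of the same site.
Partial Class ProxyPage
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) Handles Me.Load
        ' WebClient needs an absolute URL, so resolve "~/" against the current request.
        Dim target As New Uri(Request.Url, ResolveUrl("~/htmlout.aspx?deckid=233"))

        Dim client As New WebClient()
        client.Encoding = Encoding.UTF8   ' assumption: the target page emits UTF-8
        Dim htmlString As String = client.DownloadString(target)

        ' Keep a copy on the server. The path is only an example, and the
        ' ASP.NET worker account needs write permission on that folder.
        File.WriteAllText(Server.MapPath("~/App_Data/deck233.html"), htmlString, Encoding.UTF8)
    End Sub
End Class
```

File.WriteAllText together with Server.MapPath is the .NET counterpart of what FSO did in classic ASP; htmlString can be passed to the converter in the same handler.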

Looks great if I can get it to work.

Thanks muchly....


Steve C. Orr said:
You can use the WebClient class to accomplish that.
Here's more info:
http://www.superdotnet.com/Article.aspx?ArticleID=39
 
John Kotuby

Thanks George,
I might try Steve's suggestion first, but this looks good also. So many ways
to skin the poor cat, and I couldn't find even one.

 
John Kotuby

Thanks...

That makes for 2 suggestions to use the WebClient as a "proxy" browser to
collect the contents of another page. Your added info about converting a
Byte array to String may be a very important step.
 
