HttpWebResponse contains carriage returns, tab characters, etc?

Dave

I'm trying to download a web page using HttpWebRequest. It returns the
HTML source, but the text contains "\r\n", "\t", etc. throughout. Is
there a way to get back the same HTML I see when I navigate to the URL in the
browser and do a "View Source"? Or do I have to strip these out manually?
The full code is below; it posts the necessary data to simulate a
form post.

// Requires: using System.IO; using System.Net; using System.Text;
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(Url);
req.Method = "POST";
req.UserAgent = "Mozilla/4.0+";
req.ContentType = "application/x-www-form-urlencoded";

// Encode the form data and write it to the request body.
byte[] postBuffer = Encoding.ASCII.GetBytes(PostData);
req.ContentLength = postBuffer.Length;
using (Stream stm = req.GetRequestStream())
{
    stm.Write(postBuffer, 0, postBuffer.Length);
}

// Get the response.
string result;
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
{
    result = sr.ReadToEnd(); // returns the source, but with carriage returns etc.
}
 
densial

I would expect your source to contain CRs and tabs; that is probably
how the server is sending it. If you see something different in "View Source",
I suspect that's down to the browser, not the request.

So yes, if you don't want them, you have to strip them out manually.
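
If you do decide to strip them, here is a minimal sketch of one way to do it. The class and method names (WhitespaceStripper, Collapse) are made up for illustration; Regex.Replace is the standard .NET call. It assumes you are happy to turn every run of CR, LF and tab characters into a single space, which may not be what you want for content inside <pre> or <script> blocks.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceStripper
{
    // Hypothetical helper: replaces each run of CR, LF and tab
    // characters with a single space.
    public static string Collapse(string html)
    {
        return Regex.Replace(html, @"[\r\n\t]+", " ");
    }

    public static void Main()
    {
        // Stand-in for the string returned by sr.ReadToEnd().
        string result = "<html>\r\n\t<body>Hello</body>\r\n</html>";
        Console.WriteLine(Collapse(result));
    }
}
```

If you only want to drop the characters rather than replace them with a space, pass an empty string as the replacement instead.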
 
Anthony Jones

I have a suspicion that you believe the string actually contains the literal
text \r\n and \t rather than the equivalent control characters, and that you
believe this because that's what the debugger is showing you. However, the
debugger shows you the string in an escaped form that is valid as a string
literal in C#.

Otherwise, something really bizarre is happening.
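
One way to check which case you are in is to test the string itself rather than trusting the debugger display. The sketch below is only an illustration: the class and method names (EscapeDemo, HasControlChars, HasLiteralEscapes) are invented, and the sample string stands in for whatever sr.ReadToEnd() returned.

```csharp
using System;

public static class EscapeDemo
{
    // True if the text holds real CR, LF or tab control characters.
    public static bool HasControlChars(string text)
    {
        return text.IndexOfAny(new[] { '\r', '\n', '\t' }) >= 0;
    }

    // True if the text holds a literal backslash followed by r, n or t,
    // i.e. the two-character sequences as they appear in source code.
    public static bool HasLiteralEscapes(string text)
    {
        return text.Contains("\\r") || text.Contains("\\n") || text.Contains("\\t");
    }

    public static void Main()
    {
        // Stand-in for the downloaded page; the escapes in this literal
        // compile to single control characters.
        string result = "<p>\r\n\tHello</p>";

        Console.WriteLine(HasControlChars(result));   // True
        Console.WriteLine(HasLiteralEscapes(result)); // False

        // Writing the string out shows real line breaks and tabs,
        // not the escaped form.
        Console.WriteLine(result);
    }
}
```

If HasLiteralEscapes came back true for the real response, the server would be sending backslash sequences as text, which would be the bizarre case.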
 
