Brent
I'm having odd problems with the HttpWebResponse class. Some servers are
quite speedy, while others don't seem to want to talk to my code.
Consider the following pages*:
http://planetbrent.com/test.aspx?url=http://www.yahoo.com
http://planetbrent.com/test.aspx?url=http://www.msn.com
http://planetbrent.com/test.aspx?url=http://www.sec.gov
http://planetbrent.com/test.aspx?ur...5/000090266405001703/0000902664-05-001703.txt
The first two return a "screen scrape" quite rapidly -- all but
instantly. The third and fourth links -- both on the same server, I'm
guessing -- take several seconds (if you test them, please be patient!).
The four pages are roughly the same size in bytes.
Viewing the slow pages in a normal browser from the same machine is
acceptably fast.
I'm thinking there may be an issue with the way the response comes back
from the server. Is it possible that some servers deliver data that's
confusing to the HttpWebResponse class?
I read something about bad chunking (or something like that) from Apache
servers, and that one possible solution was to get bytes instead of
lines. That might be a solution ... except that I need to read through
URLs like the fourth one line-by-line, as each line gets parsed by
other parts of the code.
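A minimal sketch of how the byte-oriented idea could still allow line-by-line parsing, assuming the response is small enough to hold in memory: copy the raw bytes into a MemoryStream first, then run a StreamReader over that copy. The class and method names (BufferedLineReader, BufferAll, LinesFrom, FetchLines) are hypothetical and not part of the page below, and whether this actually helps with the slow server is untested.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

public static class BufferedLineReader
{
    // Hypothetical helper: copies the whole stream into memory in raw byte blocks.
    public static MemoryStream BufferAll(Stream source)
    {
        MemoryStream buffer = new MemoryStream();
        byte[] block = new byte[8192];
        int read;
        while ((read = source.Read(block, 0, block.Length)) > 0)
        {
            buffer.Write(block, 0, read);
        }
        buffer.Position = 0; // rewind so the copy can be read from the start
        return buffer;
    }

    // Hypothetical helper: buffers the stream as bytes, then splits it into lines.
    // StreamReader defaults to UTF-8 here; pass an Encoding if the server uses another.
    public static List<string> LinesFrom(Stream source)
    {
        List<string> lines = new List<string>();
        using (MemoryStream buffered = BufferAll(source))
        using (StreamReader sr = new StreamReader(buffered))
        {
            string thisRow;
            while ((thisRow = sr.ReadLine()) != null)
            {
                lines.Add(thisRow);
            }
        }
        return lines;
    }

    // Hypothetical helper: fetches strURL and returns its body as a list of lines.
    public static List<string> FetchLines(string strURL)
    {
        WebRequest request = WebRequest.Create(strURL);
        using (WebResponse response = request.GetResponse())
        using (Stream network = response.GetResponseStream())
        {
            return LinesFrom(network);
        }
    }
}
```

With this, the parsing code could loop over BufferedLineReader.FetchLines(url) instead of calling ReadLine directly on the network stream. The network read happens in 8 KB blocks, and the line splitting only ever touches memory.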
I'd sure appreciate any pointers on handling this situation!
--Brent
*Full source code for the page:
=========================================================
<%@ Page Language="C#" %>
<%@ Import Namespace="System" %>
<%@ Import Namespace="System.Web" %>
<%@ Import Namespace="System.Net" %>
<%@ Import Namespace="System.IO" %>
<%@ Import Namespace="System.Text" %>
<%@ Import Namespace="System.Data" %>
<script language="C#" runat="server">
string rowsdeclared = "";
public void Page_Load(Object sender, EventArgs e)
{
    string url = Request.QueryString["url"] != null ?
        Request.QueryString["url"] : "http://www.yahoo.com";
    ctrlSF.Text = "Started: " +
        System.DateTime.Now.ToString("HH:mm:ss") + " | ";
    ctrlText.Text = getHeader(url);
    ctrlSF.Text += "Finished: " + System.DateTime.Now.ToString("HH:mm:ss");
}
// Despite the name, this returns the whole response body, not just the headers.
public string getHeader(string strURL)
{
    StringBuilder returnString = new StringBuilder();
    string thisRow;
    WebRequest objRequest = WebRequest.Create(strURL);
    // Dispose the response as well as the reader so the connection is released.
    using (WebResponse objResponse = objRequest.GetResponse())
    using (StreamReader sr = new StreamReader(objResponse.GetResponseStream()))
    {
        while ((thisRow = sr.ReadLine()) != null)
        {
            // ReadLine strips the line terminator, so add it back.
            returnString.AppendLine(thisRow);
        }
    }
    return returnString.ToString();
}
</script>
<asp:Literal id="ctrlSF" runat="server" />
<asp:Literal id="ctrlText" runat="server" />