Long strings and memory errors.


Brent

I'd like to think that my code* is pretty simple, but I'm running into
memory errors when loading larger documents.

The document at the URL in the first line of the Page_Load function is
about 3 MB. That document then gets parsed with several regular
expressions, and the code often runs out of memory. I'm guessing that
each time the document is parsed, a new 3 MB string is created -- so the
several regular expressions I use consume memory rapidly.

One option I have explored is reading the file in one line at a time.
The problem is that the regular-expression parsing looks across multiple
lines, and I can't guarantee where the lines I need will occur.
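The multi-line constraint above doesn't force loading the whole file: one middle ground is a bounded sliding window of recent lines, so a regex that spans lines can still match without the full 3 MB in memory. This is a sketch, not code from the thread; the window size of 50 lines is an arbitrary assumption about how many lines a match can span.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public class WindowScanner
{
    // Assumption: no match spans more than this many lines.
    const int WindowLines = 50;

    // Scan the reader line by line, keeping only the last WindowLines
    // lines in memory, and return the first capture of the named group.
    public static string FindFirstGroup(TextReader reader, Regex pattern, string group)
    {
        var window = new Queue<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            window.Enqueue(line);
            if (window.Count > WindowLines)
                window.Dequeue(); // drop the oldest line; memory stays bounded

            Match m = pattern.Match(string.Join("\n", window));
            if (m.Success)
                return m.Groups[group].Value.Trim();
        }
        return "0"; // same "not found" convention as getRegExGroupValue below
    }
}
```

Re-joining the window on every line is not the fastest possible approach, but it keeps memory use proportional to the window, not the document.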

I'm a bit frustrated at this point, as the code works fine on smaller
documents. I'd sure appreciate any help.

-- Brent



*==============================================================
public void Page_Load(Object sender, EventArgs e) {

    string strResponse =
        getText("http://www.sec.gov/Archives/edgar/data/1085158/0001085158-99-000008.txt");
    string report_date = getRegExGroupValue(strResponse, regExPattern1, "G2");
    // as posted this reused regExPattern1; regExPattern2 is presumably intended
    string report_header = getRegExGroupValue(strResponse, regExPattern2, "G2");
    string report_companyname = getRegExGroupValue(strResponse, regExPattern3, "G2");
    // as posted this redeclared report_date, which would not compile
    string report_period = getRegExGroupValue(strResponse, regExPattern4, "G2");

}

public string getText(string strURL)
{
    HttpWebRequest oRequest = (HttpWebRequest)WebRequest.Create(strURL);
    oRequest.Timeout = 10 * 60000; // 10 minutes, for long files (10000 = 10 seconds)
    oRequest.UserAgent = "Web Client";

    // "using" blocks close the response and stream even if ReadToEnd
    // throws; in the original, myStream.Close() sat after the return
    // statement and never ran.
    using (HttpWebResponse oResponse = (HttpWebResponse)oRequest.GetResponse())
    using (StreamReader sr = new StreamReader(oResponse.GetResponseStream()))
    {
        return sr.ReadToEnd();
    }
}

public string getRegExGroupValue(string strText, string strPat, string strGroup)
{
    // Match once, with the same options used for both the test and the
    // capture (the original matched twice and dropped IgnoreCase on the
    // second pass).
    Match strMatch = Regex.Match(strText, strPat,
        RegexOptions.Multiline | RegexOptions.IgnoreCase);
    if (strMatch.Success)
    {
        // The original called Regex.Replace(strText, @"\s", " ") and
        // discarded the result -- strings are immutable. Presumably the
        // intent was to normalize whitespace in the captured value.
        string value = strMatch.Groups[strGroup].Value;
        return Regex.Replace(value, @"\s+", " ").Trim();
    }
    else
    {
        return "0";
    }
}
==========================================================
 

Kevin Spencer

I'm guessing that the content doesn't change very often. Perhaps creating
a class to parse the content and caching an instance of that class in
Application or Session state would do the trick.
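Kevin's suggestion, sketched: parse once, cache the result, and let later requests skip the download and the regex passes. In a real page the cache would be `Application` (or `Session`) indexed directly, e.g. `Application["edgarReport"]`; here a plain dictionary stands in so the idea is self-contained, and the key name and parse delegate are illustrative.

```csharp
using System;
using System.Collections.Generic;

public class ReportCache
{
    // Stands in for Application/Session state in this sketch.
    readonly IDictionary<string, object> store;

    public ReportCache(IDictionary<string, object> store)
    {
        this.store = store;
    }

    // The parse delegate runs only on the first call for a given key;
    // later calls reuse the cached value.
    public T GetOrParse<T>(string key, Func<T> parse)
    {
        object cached;
        if (!store.TryGetValue(key, out cached))
        {
            cached = parse();
            store[key] = cached;
        }
        return (T)cached;
    }
}
```

In Page_Load this would wrap the expensive work, so the 3 MB fetch-and-parse happens once per application lifetime instead of once per request. (Note that anything cached in Application state is shared by all users, so this fits content that is the same for everyone, as an SEC filing is.)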

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
The sun never sets on
the Kingdom of Heaven
 

Brent

Thanks, Kevin. I ended up grabbing the first 150 lines of text, where
the header info occurs, then grabbing the whole file again line by line.
It works pretty quickly, but it's probably not elegant...so be it!
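The first part of Brent's fix might look something like this: read only the first 150 lines (where the EDGAR header information lives) and run the header regexes on that much smaller string. This is a sketch, not code from the thread; the reader would come from `oResponse.GetResponseStream()` exactly as in getText above.

```csharp
using System.IO;
using System.Text;

public class HeadReader
{
    // Accumulate at most maxLines lines from the reader, so only the
    // header region is ever held in memory -- not the whole 3 MB file.
    public static string ReadHeadLines(TextReader reader, int maxLines)
    {
        var sb = new StringBuilder();
        string line;
        int count = 0;
        while (count++ < maxLines && (line = reader.ReadLine()) != null)
            sb.AppendLine(line);
        return sb.ToString();
    }
}
```

The resulting string would then be passed to getRegExGroupValue in place of the full document.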

--Brent
 
