Performance problem with RegEx

jmacduff

I have a performance issue related to regular expressions and
caching. Hopefully someone can point me in the right direction.

I have an ASP.NET web service that is called several million times a
day. It does not have data caching enabled, since the input variables
for the webmethods change every time it's called. It's my understanding
that if the input parameters change frequently, the hash table
created by a caching option won't actually do any good.

Every time the web service is called, among other things, the webmethod
gets a list of regular expressions from a SQL database and then loops
through each expression trying to find a match with a webmethod input
parameter.

I have seen that as I add more regular expressions to test against, my
processor usage on the machine goes up accordingly. Right now I have
about 100 regular expressions in my db table and my processor usage is
pretty consistently at 100%. In addition, I see "random" out of memory
exceptions from the RegEx engine itself. The web server has 4 GB of RAM
and runs Server 2003.

I have two questions:

1. Is there a way for me to cache the list of regular expressions I am
getting from the SQL db? If I were using an .aspx page I would use the
datacache property on ADO.NET, but how can I accomplish the same
thing in a web service?

2. Can I "Pre-Compile" or otherwise improve the performance of the
regular expression testing itself?

thanks,
Jeff
 
Samuel R. Neff

Regex caching is automatic if you use the static methods on the Regex
class, but not if you create your own Regex instances.

I would suggest retrieving the regular expressions from the db once,
creating a Regex instance from each, and storing them in a static array.
Use the RegexOptions.Compiled flag when creating the instances, since
they'll be used many times. Regex instances are immutable and thread-safe,
so you don't have to worry about using the same instance many times
simultaneously.
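
As a rough sketch of the static-array idea (not tested against your setup): the class name PatternCache and the LoadPatterns delegate are made up for illustration, and the hard-coded patterns stand in for whatever your SQL query returns. It also assumes a framework version that has Lazy<T>; on older versions a static constructor would serve the same purpose.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical holder class; the names here are illustrative only.
public static class PatternCache
{
    // Stand-in for the database query that returns the pattern strings.
    public static Func<IEnumerable<string>> LoadPatterns =
        () => new[] { @"^\d{5}$", @"^[a-z]+@example\.com$" };

    // Lazy<T> runs the loader once, even if many requests arrive at the same time.
    private static readonly Lazy<Regex[]> Compiled = new Lazy<Regex[]>(() =>
    {
        var list = new List<Regex>();
        foreach (string pattern in LoadPatterns())
            list.Add(new Regex(pattern, RegexOptions.Compiled));
        return list.ToArray();
    });

    // Returns the index of the first matching pattern, or -1 if none match.
    public static int FirstMatchIndex(string input)
    {
        Regex[] regexes = Compiled.Value;
        for (int i = 0; i < regexes.Length; i++)
        {
            if (regexes[i].IsMatch(input))
                return i;
        }
        return -1;
    }
}
```

The webmethod would then call PatternCache.FirstMatchIndex(input) instead of hitting the database and rebuilding the Regex objects on every request. Picking up new rows from the table would need some refresh mechanism, which this sketch leaves out.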

Another thing to look at is optimizing the array order. Say you are
matching against 100 regular expressions and assume you only need to
find the first match. Keep stats on how often each pattern is
matched, and then over time reorder the expressions so the most common
ones are checked first.
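
One way the reordering could look, again only as a sketch: RankedPatterns, FirstMatch and Reorder are hypothetical names, and the idea of calling Reorder from a timer or every N requests is my assumption rather than anything required.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Threading;

// Hypothetical helper; names are illustrative only.
public sealed class RankedPatterns
{
    private sealed class Entry
    {
        public readonly Regex Regex;
        public long Hits;   // how many times this pattern was the first match
        public Entry(Regex regex) { Regex = regex; }
    }

    // Replaced as a whole by Reorder, so readers never see a half-sorted array.
    private volatile Entry[] _entries;

    public RankedPatterns(params string[] patterns)
    {
        _entries = patterns
            .Select(p => new Entry(new Regex(p, RegexOptions.Compiled)))
            .ToArray();
    }

    // Returns the text of the first matching pattern, or null if none match.
    public string FirstMatch(string input)
    {
        foreach (Entry e in _entries)
        {
            if (e.Regex.IsMatch(input))
            {
                Interlocked.Increment(ref e.Hits);
                return e.Regex.ToString();
            }
        }
        return null;
    }

    // Call occasionally: most frequently matched patterns move to the front.
    public void Reorder()
    {
        _entries = _entries
            .OrderByDescending(e => Interlocked.Read(ref e.Hits))
            .ToArray();
    }
}
```

Keep in mind that reordering changes which pattern "wins" when an input matches more than one of them, so this only fits if any match is as good as another.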

HTH,

Sam
 
Kevin Spencer

Store a DataTable in the Application Cache. You will only need to create it
when the Application starts.

--
HTH,

Kevin Spencer
Microsoft MVP