FSO + XMLHTTP + reading large files + errr....

Discussion in 'ASP General' started by Steven Burn, May 22, 2005.

  1. Steven Burn

    Steven Burn Guest

    The application;

    A service on my web server that lets a user upload their HOSTS file so its entries can be checked to verify they are still valid.

    Uses;

    1. XMLHTTP (MSXML2)
    2. FileSystemObject
    3. CrazyBeavers Upload control (couldn't get the Dundas one to work)

    How it's supposed to work;

    1. User uploads file (test file = 1.10MB)
    2. FSO saves file to server then prepares for reading
    3. File is opened using the Read(n) method (line # 45 in res_upload.txt)
    4. Content is parsed and the parts required, passed to a function (URLExists)
    5. Status saved to report to later show to client
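
For anyone following along, steps 3 and 4 above could be sketched roughly as below. This is Python rather than the ASP in res_upload.txt (which isn't reproduced in this thread), and `parse_hosts` is a hypothetical name. The sketch assumes the usual HOSTS layout of an IP address followed by one or more hostnames, with `#` starting a comment.

```python
def parse_hosts(text):
    """Return the hostnames listed in HOSTS-file text, skipping comments and blanks."""
    hosts = []
    for raw in text.splitlines():
        # Drop anything after '#', then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no hostname after it
        # First field is the IP address; the rest are hostnames.
        hosts.extend(fields[1:])
    return hosts


sample = """# comment
127.0.0.1 localhost
127.0.0.1 ads.example.com tracker.example.net  # blocked

0.0.0.0 bad.example.org
"""
print(parse_hosts(sample))
```

Each hostname returned would then be handed to whatever plays the role of URLExists.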

    Files:

    Test file: hpHosts - http://www.hosts-file.net

    Example report: http://mysteryfcm.plus.com/?mode=Hosts&bFile=2252005_132703_copy of hosts.txt.htm

    Code that uploads then processes the file: http://mysteryfcm.plus.com/res_upload.txt

    The problem;

    The file can contain anything from 10 lines to 20,000+ lines; each one is parsed and passed to the URLExists function. I'm not entirely sure whether the problem is due to the content itself or the number of calls to the XMLHTTP object, but a file containing 10,000 lines (the test file) times out after around 1,000 lines.

    The timeout set for the URLExists function is 5 seconds, and the script timeout is set to 5000 (the script timeout takes well over an hour of reading/parsing to occur).
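
A quick back-of-the-envelope check on those numbers, as a Python sketch. It assumes the script timeout of 5000 is in seconds and that the lookups are made one at a time, with each slow one running the full 5 seconds; neither is confirmed in the post.

```python
URL_TIMEOUT_S = 5        # per-lookup timeout quoted in the post
SCRIPT_TIMEOUT_S = 5000  # script timeout quoted in the post (assumed to be seconds)
LINES = 10_000           # size of the test file


def worst_case_seconds(lines, per_lookup_s):
    """Total time if every lookup runs until its timeout."""
    return lines * per_lookup_s


def lookups_before_script_timeout(script_timeout_s, per_lookup_s):
    """How many full-length lookups fit inside the script timeout."""
    return script_timeout_s // per_lookup_s


print(worst_case_seconds(LINES, URL_TIMEOUT_S) / 3600, "hours worst case")
print(SCRIPT_TIMEOUT_S / 60, "minutes of script timeout")
print(lookups_before_script_timeout(SCRIPT_TIMEOUT_S, URL_TIMEOUT_S), "lookups fit")
```

Under those assumptions a script timeout of 5000 seconds is about 83 minutes, and it would be used up by roughly 1,000 lookups that each hit the 5-second limit, which is at least consistent with the symptoms described.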

    I've tried cutting a lot of the content out of the test file so it's 206K instead of 1.10MB, but it's still taking forever to process, and then timing out anyway.

    The question;

    I've found some docs online that show how to read and parse large files with a ton of content and have applied this to the application in question but, for reasons best known to itself, it is still timing out. What I'm wondering is:

    1. Would it be best to ditch the application server-side and make it a downloadable application instead?

    It would be easier to work with in VB, but that's the reason I didn't want to do it that way (I like a challenge).

    2. Would it be viable to split the file into parts once uploaded, and process each part separately instead of doing it the way I'm currently doing it?

    3. Is there a better alternative that I've simply not thought of?
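
On question 2, splitting the work into parts could look something like the sketch below (Python, with hypothetical names `next_batch` and `BATCH_SIZE`). The idea is that each request handles one slice and remembers where it stopped, so no single run has to get through the whole file. How the offset is kept between requests (session, database, query string) is left open.

```python
BATCH_SIZE = 3  # hypothetical value, kept tiny for demonstration


def next_batch(lines, offset, size=BATCH_SIZE):
    """Return (chunk, new_offset) for the slice of lines starting at offset."""
    chunk = lines[offset:offset + size]
    return chunk, offset + len(chunk)


# Stand-in for the parsed HOSTS entries.
lines = [f"host{i}.example.com" for i in range(7)]

offset = 0
while True:
    chunk, offset = next_batch(lines, offset)
    if not chunk:
        break  # nothing left to process
    print(f"processed {len(chunk)} entries, next offset {offset}")
```

In a web setting each pass through the loop body would be its own request, so the per-request work stays well under the script timeout.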

    I've probably not given enough info and apologies in advance if this is the case (got a million and one things going through my head atm). Thanks in advance for any advice/suggestions.

    --
    Regards

    Steven Burn
    Ur I.T. Mate Group
    www.it-mate.co.uk

    Keeping it FREE!
    Steven Burn, May 22, 2005
    #1

  2. [I don't understand why my news reader won't prefix the lines from the OP...
    sorry for any confusion... I prefixed the short sections by hand...]

    >>>>>>>>>>>>>>>>>>>

    "Steven Burn" <> wrote in message
    news:...
    The application;

    Service on my webserver that allows a user to upload their HOSTS file for
    functions to verify the contents are still valid.

    [snip]

    The problem;

    The file can contain anything from 10 lines to 20,000+ lines, each one is
    parsed and passed to the URLExists function. I'm not entirely sure whether
    the problem is due to the content itself, or the number of calls to the
    XMLHTTP object but, a file containing 10,000 lines (the test file) times out
    after around 1,000 lines.

    The timeout set for the URLExists function is 5 seconds (the timeout takes
    well over an hour of reading/parsing, to occur), the script timeout is set
    to 5000

    > I've tried cutting a lot of the content of the test file out so it's 206K
    > instead of 1.10MB, but it's still taking forever to process, and then
    > timing out anyway.

    <<<<<<<<<<<<<<<<<<<



    Are there really boxes out there with HOSTS files anywhere even near that
    long? We're talking %windir%\system32\drivers\etc\HOSTS, yeah? Any time
    I'm even tempted to put more than 25 lines in HOSTS [esp. the same
    entries on more than one PC] I find somewhere reasonably convenient to
    install BIND! Don't, like, 99.5% of the HOSTS files out there have just
    one line?

    127.0.0.1 localhost

    Sorry if all that's beside the point, I'm mostly just curious as to whether
    or not this commonly exists, and if so, why?



    >>>>>>>>>>>>>>>>>>>>>

    > The question;
    >
    > I've found some docs online that show how to read and parse large files
    > with a ton of content and have applied this to the application in question
    > but for reasons best known to itself, it is still timing out. What I'm
    > wondering is;
    >
    > 1. Would it be best to ditch the application server-side and make it a
    > downloadable application instead?

    <<<<<<<<<<<<<<<<<<<<<<



    I would think so. You don't really care about the contents, do you? Aren't
    you really only interested in whether or not the file has changed? I can
    see potential value in storing the last confirmed copy off of the local
    machine, to prevent tampering, but wouldn't a CRC be just as valid a
    check, without all the muss and fuss?

    I would also consider setting a file system change hook, and then comparing
    the contents [CRC] to a non-locally stored value just once per session, to
    verify that it was not changed while your app wasn't running. After that,
    you'll know instantly when any other changes are made.
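
The CRC comparison described here could be sketched as follows, in Python with the standard `zlib.crc32`. The function names and the sample contents are made up for illustration, and where the reference value is stored is left out.

```python
import zlib


def crc32_of(data: bytes) -> int:
    """CRC-32 of the given bytes as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def has_changed(current: bytes, stored_crc: int) -> bool:
    """True if the current contents no longer match the stored checksum."""
    return crc32_of(current) != stored_crc


original = b"127.0.0.1 localhost\n"
stored = crc32_of(original)  # would be kept off the local machine

print(has_changed(original, stored))
print(has_changed(original + b"0.0.0.0 ads.example.com\n", stored))
```

A CRC catches accidental changes cheaply; if deliberate tampering is the concern, a cryptographic hash (for example SHA-256 via `hashlib`) would be the more usual choice.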

    Long story short, the only real value a server-side process can add to this
    paradigm is off-site storage.



    >>>>>>>>>>>>>>>>>>>>>>>>

    > It would be easier to work with in VB but that's the reason I didn't want
    > to do it that way (I like a challenge).
    >
    > 2. Would it be viable to split the file into parts once uploaded, and
    > process each part separately instead of doing it the way I'm currently
    > doing it?
    >
    > 3. Is there a better alternative that I've simply not thought of?

    <<<<<<<<<<<<<<<<<<<<<<<<<<



    If you really have a good reason to parse and store the contents
    entry-for-entry, then I'd store them in a database and leverage that
    technology; that's by far the easiest way to make this kind of thing scale.
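
As a rough illustration of the database route, here is a Python sketch using the standard `sqlite3` module. The table and column names are hypothetical, and SQLite just stands in for whatever database the server actually has.

```python
import sqlite3


def store_entries(conn, entries):
    """Insert (hostname, ip) pairs, silently skipping duplicate hostnames."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS host_entries ("
        " hostname TEXT PRIMARY KEY,"
        " ip TEXT NOT NULL,"
        " status TEXT)"  # status stays NULL until the entry has been checked
    )
    conn.executemany(
        "INSERT OR IGNORE INTO host_entries (hostname, ip) VALUES (?, ?)",
        entries,
    )
    conn.commit()


def unchecked(conn):
    """Hostnames that have not been given a status yet."""
    rows = conn.execute(
        "SELECT hostname FROM host_entries WHERE status IS NULL ORDER BY hostname"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
store_entries(conn, [
    ("ads.example.com", "127.0.0.1"),
    ("bad.example.org", "0.0.0.0"),
    ("ads.example.com", "127.0.0.1"),  # duplicate, ignored
])
print(unchecked(conn))
```

With the entries in a table, duplicates collapse on insert and the checking can be resumed from whatever rows still have no status.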


    -Mark
    Mark J. McGinty, May 23, 2005
    #2

  3. Adrienne

    Adrienne Guest

    Gazing into my crystal ball I observed "Mark J. McGinty"
    <> writing in
    news::

    > Are there really boxes out there with HOSTS files anywhere even near
    > that long? We're talking %windir%\system32\drivers\etc\HOSTS, yeah?
    > Any time I'm even tempted to put more than 25 lines in HOSTS [esp. the
    > same entries on more than one PC] I find somewhere reasonably
    > convenient to install BIND! Don't like 99.5% of the HOSTS files out
    > there have just one line?
    >
    > 127.0.0.1 localhost
    >
    > Sorry if all that's beside the point, I'm mostly just curious as to
    > whether or not this commonly exists, and if so, why?
    >


    My hosts file is 6351 lines. It contains listings of bad hosts, spyware
    hosts, advertising hosts, etc. Because of this I see very few ads, and
    have had no problems with spyware or viruses for at least five years. I go
    to web sites with third party ads and see a nice, friendly 404. I even
    changed my 404 to read "Doh! The website cannot be found" in red.

    You might want to take a look at
    <http://www.mvps.org/winhelp2002/hosts.htm> to see how this can help your
    system.

    Although my Hosts file is long, I have no lag time in requesting a page
    that is not on the list. I love it!

    --
    Adrienne Boswell
    http://www.cavalcade-of-coding.info
    Please respond to the group so others can share
    Adrienne, May 23, 2005
    #3
  4. Steven Burn

    Steven Burn Guest

    Thanks for your comments.

    The reason for this is quite simply that the hpHosts, MVPS HOSTS, and similar files contain server IPs/URLs that are no longer valid (my app simply detects and reports their validity). The contents themselves will not be stored unless the user asks my app to do so.

    --
    Regards

    Steven Burn
    Ur I.T. Mate Group
    www.it-mate.co.uk

    Keeping it FREE!
    Steven Burn, May 23, 2005
    #4