How to screen scrape for results?

Discussion in 'ASP .Net' started by Swanand Mokashi, Apr 3, 2006.

  1. Hi all --

    I would like to create an application (call it Application "A") that mimics
    a form on a foreign system (Application "F"). Application "F" is on the web,
    so I cannot control it. Application "A" will have an exact copy of the form
    on Application "F" and will submit it to the URL of Application "F". I would
    then like to screen scrape the confirmation returned after the form is
    submitted to Application "F".

    I can easily screen scrape a static page, but I am not sure how to screen
    scrape a form result.

    Any ideas?

    TIA
    Swanand
     
    Swanand Mokashi, Apr 3, 2006
    #1

  2. sloan Guest

    Here are the hints you need:

    You will not have the FormPostCollection and FormPostItem objects.
    These are simple objects. The FormPostItem basically holds the key and value
    you want to post.
    FormPostCollection is an implementation of CollectionBase that keeps a
    collection of FormPostItems.
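
    The two helper types are not shown in this post. A minimal sketch of what
    they might look like, going only by the description above; the constructor,
    the Add method and the indexer are assumptions, not the original code:

```csharp
using System;
using System.Collections;

// Hypothetical reconstruction: one key/value pair to send in the POST body.
public class FormPostItem
{
    private readonly string itemKey;
    private readonly string itemValue;

    public FormPostItem(string key, string value)
    {
        itemKey = key;
        itemValue = value;
    }

    public string Key { get { return itemKey; } }
    public string Value { get { return itemValue; } }
}

// Hypothetical reconstruction: a strongly typed list built on CollectionBase.
// CollectionBase already supplies Count and foreach support.
public class FormPostCollection : CollectionBase
{
    // Adds an item and returns the index it was stored at.
    public int Add(FormPostItem item)
    {
        return List.Add(item);
    }

    public FormPostItem this[int index]
    {
        get { return (FormPostItem) List[index]; }
    }
}
```

    Filling it would then be a matter of calling
    fields.Add(new FormPostItem("firstName", "Swanand")) once per form field,
    where the field name is whatever the target form uses.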

    However, minus that, you can figure out what is going on.

    This took me about 3 days to figure out (the GET and querystring was easy,
    the form/post was the tough one).
    So post a "thank you" for this one.

    I got the code for everything outside of the FORM/POST part from another
    developer, but the form/post stuff is the part I figured out myself.




    // Requires: using System; using System.IO; using System.Net;
    //           using System.Text; using System.Web;
    // m_enc (Encoding) and myCollectionOfFormPostValues (FormPostCollection)
    // are fields of the containing class.

    public void WriteTextFile(string Url, string FilePath, long BufferSize)
    {
        //create a web request
        HttpWebRequest oHttpWebRequest = (HttpWebRequest) WebRequest.Create(Url);

        //set the connection timeout (milliseconds; 100 is very short)
        oHttpWebRequest.Timeout = 100;//m_ConnectTimeout;

        //write the form fields into the request body (this makes it a POST)
        this.postDataToHttpWebRequest(oHttpWebRequest, myCollectionOfFormPostValues);

        //create a response object that we can read a stream from;
        //GetResponse throws a WebException on failure, so no null check is needed
        using (HttpWebResponse oHttpResponse = (HttpWebResponse) oHttpWebRequest.GetResponse())
        {
            //define the encoding type
            try
            {
                //see if the page tells us its character set (CharacterSet, not
                //ContentEncoding, which holds values such as "gzip")
                if (!string.IsNullOrEmpty(oHttpResponse.CharacterSet))
                    m_enc = Encoding.GetEncoding(oHttpResponse.CharacterSet);
                else
                    m_enc = Encoding.GetEncoding(1252);
            }
            catch
            {
                // *** Invalid encoding passed
                m_enc = Encoding.GetEncoding(1252);
            }

            //buffer that holds each chunk of characters while the file is downloading
            char[] DownloadedCharChunk = new char[BufferSize];

            //reader for the text we get over HTTP, writer for the file on disk
            using (StreamReader sr = new StreamReader(oHttpResponse.GetResponseStream(), m_enc))
            using (StreamWriter sw = new StreamWriter(FilePath, false, m_enc))
            {
                //Read returns 0 once the download has finished
                int workingbuffersize;
                while ((workingbuffersize = sr.Read(DownloadedCharChunk, 0, DownloadedCharChunk.Length)) > 0)
                {
                    //write the characters we received to the file on disk
                    sw.Write(DownloadedCharChunk, 0, workingbuffersize);
                }
            }
        }
    }


    private string buildPostString(FormPostCollection formPostCollec)
    {
        StringBuilder sb = new StringBuilder();

        foreach (FormPostItem fpi in formPostCollec)
        {
            //separate pairs with '&' and URL-encode both key and value
            if (sb.Length > 0) sb.Append('&');
            sb.Append(HttpUtility.UrlEncode(fpi.Key));
            sb.Append('=');
            sb.Append(HttpUtility.UrlEncode(fpi.Value));
        }

        return sb.ToString();
    }


    private void postDataToHttpWebRequest(HttpWebRequest webRequest,
        FormPostCollection formPostCollec)
    {
        if (null != formPostCollec)
        {
            //the URL-encoded string is plain ASCII
            byte[] data = Encoding.ASCII.GetBytes(this.buildPostString(formPostCollec));

            webRequest.Method = "POST";
            webRequest.ContentType = "application/x-www-form-urlencoded";
            //webRequest.ContentType = "text/xml";//Does Not Work

            webRequest.ContentLength = data.Length;

            // Send the data.
            using (Stream newStream = webRequest.GetRequestStream())
            {
                newStream.Write(data, 0, data.Length);
            }
        }
    }
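
    As an alternative to building the request by hand, WebClient.UploadValues
    in System.Net sends a NameValueCollection as an
    application/x-www-form-urlencoded POST and URL-encodes the fields itself.
    A minimal sketch, not the code used above; the class name is a placeholder,
    and decoding the reply as UTF-8 is an assumption about the target site:

```csharp
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;

// Hypothetical helper, not part of the original post.
public static class FormPoster
{
    // Posts the fields to the given URL and returns the response body as text.
    public static string PostForm(string url, NameValueCollection fields)
    {
        using (WebClient client = new WebClient())
        {
            // UploadValues sets the form content type and encodes the pairs.
            byte[] body = client.UploadValues(url, "POST", fields);

            // Assumption: the confirmation page is UTF-8.
            return Encoding.UTF8.GetString(body);
        }
    }
}
```

    The returned string is the confirmation page's HTML, which can then be
    scraped the same way as a static page.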







     
    sloan, Apr 3, 2006
    #2

  3. And just why would you want to do this suspicious activity?

    --
    ( OHM ) - One Handed Man
    AKA Terry Burns - http://TrainingOn.net

     
    OHM \( One Handed Man \), Apr 3, 2006
    #3
  4. Not trying to do anything suspicious.
    I have a client who wants to sync data with another web site. We have no
    control over the other web site (to, say, write a web service or drop an XML
    file and ask them to parse it). The data that needs to be synced is the same
    as that submitted by the form on the other web site.

    Not trying to create an auto-posting bot :)

     
    Swanand Mokashi, Apr 3, 2006
    #4
  5. OK, did sloan's idea work? I didn't really read it through.

    --
    ( OHM ) - One Handed Man
    AKA Terry Burns - http://TrainingOn.net



     
    OHM \( One Handed Man \), Apr 3, 2006
    #5
  6. OK, I have started playing with your code. I tried it with an ASP.NET page
    and it did not seem to work -- probably not your problem. I have a check for

    if (!Page.IsPostBack)
    {
    }

    and when I post to this page with your code, Page.IsPostBack seems to return
    false. That may not be a problem, as the form I ultimately want to post to
    is not an ASP.NET form. I will test this with an ASP form and let you know.
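
    One likely explanation, offered as a guess rather than a diagnosis of this
    page: ASP.NET Web Forms treats a request as a postback only when the POST
    carries the hidden fields the page itself rendered, such as __VIEWSTATE.
    A scraper would have to GET the page first, read those fields out of the
    HTML and send them back along with the other form values. A rough sketch of
    the extraction step, assuming the name attribute comes before value in the
    tag; the class and method names are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class HiddenFieldReader
{
    // Returns the value of <input ... name="fieldName" ... value="...">,
    // or null when the page contains no such field.
    public static string GetHiddenField(string html, string fieldName)
    {
        string pattern = "<input[^>]*\\bname=\"" + Regex.Escape(fieldName)
            + "\"[^>]*\\bvalue=\"([^\"]*)\"";
        Match match = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

    The extracted value is base64 text, so it has to be URL-encoded like any
    other field before it goes into the POST body.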

    Thanks for all your help!

    Swanand



     
    Swanand Mokashi, Apr 3, 2006
    #6
  7. Works with ASP!!
    Thanks again
    Swanand

     
    Swanand Mokashi, Apr 3, 2006
    #7
  8. Guest

    You can also try SWExplorerAutomation (SWEA)
    (http://www.webunittesting.com). SWEA supports frames, DHTML (AJAX)
    pages, windows and HTML dialogs, popup windows, and file downloads. SWEA
    solutions can be run from ASP.NET pages or a Windows service.
     
    , Apr 3, 2006
    #8
