HTTP and Java

Discussion in 'Java' started by Roedy Green, Apr 2, 2011.

  1. Roedy Green

    Roedy Green Guest

    If Java sent an identical HTTP header, to that sent by a browser,
    including User-Agent to a website, is there a plausible mechanism by
    which a website would treat the requests differently, namely reject
    Java and accept the browser request?
    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    Doing what the user expects with respect to navigation is absurdly important for user satisfaction.
    ~ anonymous Google Android developer
    Roedy Green, Apr 2, 2011
    #1

  2. Luuk

    Luuk Guest

    On 02-04-2011 08:35, Roedy Green wrote:
    > If Java sent an identical HTTP header, to that sent by a browser,
    > including User-Agent to a website, is there a plausible mechanism by
    > which a website would treat the requests differently, namely reject
    > Java and accept the browser request?


    How would this webserver know the difference, if there is none?

    I don't think it's possible to do something different if you send the same
    information.

    --
    Luuk
    Luuk, Apr 2, 2011
    #2

  3. Roedy Green wrote:

    > If Java sent an identical HTTP header, to that sent by a browser,
    > including User-Agent to a website, is there a plausible mechanism by
    > which a website would treat the requests differently, namely reject
    > Java and accept the browser request?


    Other than by pulling JavaScript tricks, you mean?
    Lawrence D'Oliveiro, Apr 2, 2011
    #3
  4. Tom Anderson

    Tom Anderson Guest

    On Fri, 1 Apr 2011, Roedy Green wrote:

    > If Java sent an identical HTTP header, to that sent by a browser,
    > including User-Agent to a website, is there a plausible mechanism by
    > which a website would treat the requests differently, namely reject Java
    > and accept the browser request?


    For a single request, no.

    A server might be able to tell the difference between a browser and Java
    by looking at previous interactions, e.g., if the browser had made some AJAX
    calls that the Java program did not because it was not running JavaScript.

    IME, if i have a Java program and a browser behaving differently, it's
    because the request is different in some way i hadn't realised. Get
    Wireshark on the case immediately, and save yourself some puzzlement.

    tom

    --
    Freedom is the right of all sentient beings. -- Optimus Prime
    Tom Anderson, Apr 2, 2011
    #4
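
As a complement to a Wireshark capture, here is a minimal sketch for seeing what Java itself puts on the wire. It points `HttpURLConnection` at a throwaway local socket and returns the raw request line and headers. The class name `RequestEcho`, the `/probe` path, and the placeholder User-Agent are assumptions made for illustration; none of them come from the thread. `HttpURLConnection` typically adds headers of its own (such as `Accept` and `Connection`) whose values may differ from a browser's, which is one way two requests thought to be identical can end up different.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not from the thread: shows what HttpURLConnection really sends.
final class RequestEcho {

    /** Sends one GET to a throwaway local socket and returns the raw request line and headers. */
    static String captureRequest(String userAgent) {
        StringBuilder raw = new StringBuilder();
        try (ServerSocket server = new ServerSocket(0)) { // port 0 = any free port
            Thread acceptor = new Thread(() -> {
                try (Socket socket = server.accept()) {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
                    String line;
                    // The request head ends at the first empty line.
                    while ((line = in.readLine()) != null && !line.isEmpty()) {
                        raw.append(line).append('\n');
                    }
                    OutputStream out = socket.getOutputStream();
                    out.write("HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"
                            .getBytes(StandardCharsets.ISO_8859_1));
                    out.flush();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            acceptor.start();

            URI target = URI.create("http://127.0.0.1:" + server.getLocalPort() + "/probe");
            HttpURLConnection conn =
                    (HttpURLConnection) target.toURL().openConnection(Proxy.NO_PROXY);
            conn.setRequestProperty("User-Agent", userAgent);
            conn.getResponseCode(); // forces the request to be sent
            conn.disconnect();
            acceptor.join();        // after this, 'raw' is safe to read
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return raw.toString();
    }

    public static void main(String[] args) {
        // Placeholder User-Agent; paste the browser's real string here to compare.
        System.out.print(captureRequest("Mozilla/5.0 (placeholder)"));
    }
}
```

Comparing this output line by line with the browser's request, as shown by Wireshark or the browser's own tools, should reveal any header that was overlooked.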
  5. On 04/02/2011 02:35 AM, Roedy Green wrote:
    > If Java sent an identical HTTP header, to that sent by a browser,
    > including User-Agent to a website, is there a plausible mechanism by
    > which a website would treat the requests differently, namely reject
    > Java and accept the browser request?


    A plausible technique is to split the real content into pieces that
    require further loading, such as client-side scripting, cookies, CSS,
    iframes, images, embedded plugins (e.g., Flash or Silverlight), or
    client-side redirects. The server can then check that these pieces
    have also been downloaded before it accepts other page requests.

    An additional factor could be if browsers actually use HTTP differently
    from Java... e.g., if one pipelines and the other doesn't, or perhaps
    automatic SSL upgrading, etc. It all comes down to how many false
    negatives the server is willing to bear.

    --
    Beware of bugs in the above code; I have only proved it correct, not
    tried it. -- Donald E. Knuth
    Joshua Cranmer, Apr 2, 2011
    #5
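
The server-side check described above can be sketched as follows. This is only an illustration of the idea, not how any particular site works: the class `SubresourceGate`, the notion of a `clientId` (a session cookie value or an IP address, say), and the resource paths are all assumptions. The first page is always served; later pages are served only once the client has also fetched the subsidiary resources a browser would have loaded.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical server-side gate; names and policy are illustrative assumptions.
final class SubresourceGate {
    private final Set<String> required;
    private final Map<String, Set<String>> fetchedByClient = new HashMap<>();
    private final Set<String> clientsServedAPage = new HashSet<>();

    SubresourceGate(Set<String> requiredResources) {
        this.required = Set.copyOf(requiredResources);
    }

    /** Notes that a client downloaded a subsidiary resource (stylesheet, script, image...). */
    synchronized void recordResource(String clientId, String path) {
        if (required.contains(path)) {
            fetchedByClient.computeIfAbsent(clientId, k -> new HashSet<>()).add(path);
        }
    }

    /** Decides whether to serve a page: always the first time, afterwards only to "browser-like" clients. */
    synchronized boolean allowPage(String clientId) {
        if (clientsServedAPage.add(clientId)) {
            return true; // first page request from this client
        }
        Set<String> fetched = fetchedByClient.getOrDefault(clientId, Set.of());
        return fetched.containsAll(required);
    }

    public static void main(String[] args) {
        SubresourceGate gate = new SubresourceGate(Set.of("/site.css", "/beacon.js"));
        System.out.println(gate.allowPage("client-1")); // first page is served
        System.out.println(gate.allowPage("client-1")); // refused: no subresources fetched yet
    }
}
```

A plain Java client that fetches only the HTML would pass the first request and fail the second under such a scheme, matching a "the first works" symptom even with identical headers.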
  6. Roedy Green

    Roedy Green Guest

    On Sat, 02 Apr 2011 06:26:37 -0700, Patricia Shanahan
    wrote, quoted or indirectly quoted someone who said :

    >I don't think it could tell a single request apart, without requiring
    >the requester to solve e.g. a text recognition problem. It might be able
    >to detect a difference in number and frequency of requests from a single
    >IP address, if the Java program sent multiple requests.


    I have sidestepped the problem by going to a different website that
    has almost the same information. That still leaves the puzzle.

    The websites in question are www.ecs.com.tw and www.ecsusa.com

    The 500 code I get back is supposedly a server-side error, not my
    fault.

    In theory it could be a timeout, something to do with sending
    multiple GETs (the first one works), or the timing between GETs. Yet my
    experiments seemed to eliminate those causes. I think I will have to
    leave this. It is a black hole without much reward for the solution other
    than satisfying curiosity. I just hoped someone would think of some
    factor I had not considered.



    Roedy Green, Apr 3, 2011
    #6
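
The idea quoted above, detecting a difference in the number and frequency of requests from a single IP address, could look roughly like this on the server. It is a sketch under stated assumptions: the class `RequestRateTracker`, the sliding-window policy, and the limits are invented for illustration and say nothing about what the ECS site actually does.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sliding-window limiter keyed by client address.
final class RequestRateTracker {
    private final int maxRequests;
    private final long windowMillis;
    private final Map<String, Deque<Long>> recentByAddress = new HashMap<>();

    RequestRateTracker(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /** Records a request and reports whether the address is still within the allowed rate. */
    synchronized boolean allow(String address, long nowMillis) {
        Deque<Long> recent = recentByAddress.computeIfAbsent(address, k -> new ArrayDeque<>());
        // Forget requests that have slid out of the window.
        while (!recent.isEmpty() && nowMillis - recent.peekFirst() >= windowMillis) {
            recent.pollFirst();
        }
        recent.addLast(nowMillis);
        return recent.size() <= maxRequests;
    }

    public static void main(String[] args) {
        RequestRateTracker tracker = new RequestRateTracker(2, 1_000);
        long now = System.currentTimeMillis();
        for (int i = 0; i < 3; i++) {
            // A scraper firing requests back to back trips the limit on the third one.
            System.out.println(tracker.allow("203.0.113.7", now + i));
        }
    }
}
```

If a site behaved this way, spacing the GETs further apart would change the outcome, which gives a simple experiment for ruling the hypothesis in or out.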
  7. Luuk

    Luuk Guest

    On 03-04-2011 08:17, Roedy Green wrote:
    > On Sat, 02 Apr 2011 06:26:37 -0700, Patricia Shanahan
    > wrote, quoted or indirectly quoted someone who said :
    >
    >> I don't think it could tell a single request apart, without requiring
    >> the requester to solve e.g. a text recognition problem. It might be able
    >> to detect a difference in number and frequency of requests from a single
    >> IP address, if the Java program sent multiple requests.

    >
    > I have sidestepped the problem by going to a different website that
    > has almost the same information. That still leaves the puzzle.
    >
    > The websites in question are www.ecs.com.tw and www.ecsusa.com
    >
    > the 500 code I get back is supposedly a server side error, not my
    > fault.


    If you get an ERROR 500 from www.ecs.com.tw, then it's your error....

    luuk@opensuse:/tmp> wget -S -U "this is a weird browser" http://www.ecs.com.tw/
    --2011-04-03 12:56:29-- http://www.ecs.com.tw/
    Resolving www.ecs.com.tw... 210.17.27.2
    Connecting to www.ecs.com.tw|210.17.27.2|:80... connected.
    HTTP request sent, awaiting response...
    HTTP/1.1 200 OK
    Content-Length: 123
    Content-Type: text/html
    Content-Location: http://www.ecs.com.tw/index.html
    Last-Modified: Mon, 17 May 2010 13:11:33 GMT
    Accept-Ranges: bytes
    ETag: "8ed7ab78c2f5ca1:aa4"
    Server: Microsoft-IIS/6.0
    X-Powered-By: ASP.NET
    Date: Sun, 03 Apr 2011 10:57:04 GMT
    Connection: keep-alive
    Length: 123 [text/html]
    Saving to: `index.html.12'



    >
    > In theory it could be a timeout, or something to do with sending
    > multiple gets.


    No, a timeout should give a different error.



    --
    Luuk
    Luuk, Apr 3, 2011
    #7
  8. On 11-04-02 03:35 AM, Roedy Green wrote:
    > If Java sent an identical HTTP header, to that sent by a browser,
    > including User-Agent to a website, is there a plausible mechanism by
    > which a website would treat the requests differently, namely reject
    > Java and accept the browser request?


    When curling that first URL (www.ecs.com.tw), I get a

    <meta http-equiv="Refresh"
    content="0;url=http://www.ecs.com.tw/ECSWebSite/Index.aspx">

    A curl on that gives me a 302 to

    http://www.ecs.com.tw/ECSWebSite/Index.aspx?MenuID=0&amp;LanID=0

    which when fetched (a curl -L on the 302-producing page) is a "real" page.

    I think it's simply that your Java code is incorrect. Every request
    above produces an <html>...</html> page.

    AHS

    --
    That's not the recollection that I recall...All this information is
    certainly in the hands of the auditor and we certainly await his report
    to indicate what he deems has occurred.
    -- Halifax, Nova Scotia mayor Peter Kelly, who is currently deeply in
    the shit
    Arved Sandstrom, Apr 3, 2011
    #8
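
The chain described above mixes two kinds of redirect. `HttpURLConnection` follows the HTTP 302 by default, but it does not interpret HTML, so the `<meta http-equiv="Refresh">` hop has to be handled by the program. A minimal sketch of extracting the target follows; the class name `MetaRefresh` and the regex-based approach are assumptions for illustration, and a real HTML parser would be more robust.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the target of a <meta http-equiv="Refresh"> tag.
final class MetaRefresh {
    private static final Pattern META_TAG =
            Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern IS_REFRESH =
            Pattern.compile("http-equiv\\s*=\\s*[\"']?refresh", Pattern.CASE_INSENSITIVE);
    private static final Pattern TARGET =
            Pattern.compile("url\\s*=\\s*[\"']?([^\"'>\\s]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the refresh URL of the first meta refresh tag, if the page has one. */
    static Optional<String> target(String html) {
        Matcher tags = META_TAG.matcher(html);
        while (tags.find()) {
            String tag = tags.group();
            if (IS_REFRESH.matcher(tag).find()) {
                Matcher url = TARGET.matcher(tag);
                if (url.find()) {
                    return Optional.of(url.group(1));
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String page = "<meta http-equiv=\"Refresh\"\n"
                + "content=\"0;url=http://www.ecs.com.tw/ECSWebSite/Index.aspx\">";
        System.out.println(target(page).orElse("(no meta refresh)"));
    }
}
```

A client that wants to land on the same "real" page as a browser would fetch the returned URL next and let the 302 be followed automatically.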
  9. Roedy Green

    Roedy Green Guest

    On Sun, 03 Apr 2011 12:58:07 +0200, Luuk wrote,
    quoted or indirectly quoted someone who said :

    >If you get an ERROR 500 from www.ecs.com.tw, then it's your error....


    According to http://www.checkupdown.com/status/E500.html

    "This error can only be resolved by fixes to the Web server software.
    It is not a client-side problem. It is up to the operators of the Web
    server site to locate and analyse the logs which should give further
    information about the error."

    Yet clearly it was something I was doing as a client that triggered
    it.
    Roedy Green, Apr 4, 2011
    #9
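
When a server answers 500, the response body often says more than the status code does, and in Java that body is easy to miss: `HttpURLConnection.getInputStream()` throws an `IOException` for 4xx and 5xx responses, and the body is available from `getErrorStream()` instead. The sketch below reads whichever stream applies. The class name `ErrorBody` is an assumption for illustration, and whether the ECS server returns a useful error page is not known from the thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: returns the status code and body even for error responses.
final class ErrorBody {

    /** Returns "<status>\n<body>" for the given address, using the error stream on 4xx/5xx. */
    static String statusAndBody(String address) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(address).toURL().openConnection();
            int status = conn.getResponseCode();
            // getInputStream() throws for error statuses; the body is on getErrorStream(),
            // which may be null when the server sent no body at all.
            InputStream stream = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
            String body = "";
            if (stream != null) {
                try (InputStream in = stream) {
                    body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
            return status + "\n" + body;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass the failing URL on the command line.
        System.out.println(statusAndBody(args[0]));
    }
}
```

A 500 means the server hit an unexpected condition while handling the request, which is compatible with the request itself being what provoked it; the error page, if any, may name the input the server choked on.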
  10. Roedy Green

    Roedy Green Guest

    On Sun, 03 Apr 2011 09:50:38 -0300, Arved Sandstrom
    wrote, quoted or indirectly quoted
    someone who said :

    >When curling that first URL (www.ecs.com.tw), I get a


    Sorry, that is the website where the problem is, but not the URL. You
    need to try a specific motherboard e.g.
    http://www.ecs.com.tw/ECSWebSite/Pr...ryID=1&TypeID=68&MenuID=19&LanID=0#Socket AM3
    then
    http://www.ecs.com.tw/ECSWebSite/Pr...goryID=1&DetailName=Feature&MenuID=19&LanID=0
    Roedy Green, Apr 4, 2011
    #10
  11. Tom Anderson

    Tom Anderson Guest

    On Sun, 3 Apr 2011, Roedy Green wrote:

    > On Sun, 03 Apr 2011 09:50:38 -0300, Arved Sandstrom
    > wrote, quoted or indirectly quoted
    > someone who said :
    >
    >> When curling that first URL (www.ecs.com.tw), I get a

    >
    > Sorry, that is the website where the problem is, but not the URL. You
    > need to try a specific motherboard e.g.
    > http://www.ecs.com.tw/ECSWebSite/Pr...ryID=1&TypeID=68&MenuID=19&LanID=0#Socket AM3
    > then
    > http://www.ecs.com.tw/ECSWebSite/Pr...goryID=1&DetailName=Feature&MenuID=19&LanID=0


    I can download both URLs perfectly fine with both curl and
    URLConnection. The data downloaded is the same either way.

    tom

    --
    Your words are mostly meaningless symbols -- Andrew, to Niall
    Tom Anderson, Apr 4, 2011
    #11
  12. Ian Shef

    Ian Shef Guest

    Roedy Green wrote:

    <snip>
    > Sorry, that is the website where the problem is, but not the URL. You
    > need to try a specific motherboard e.g.
    > http://www.ecs.com.tw/ECSWebSite/Product/Product_Model.aspx?CategoryID=1&
    > TypeID=68&MenuID=19&LanID=0#Socket%20AM3 then
    > http://www.ecs.com.tw/ECSWebSite/Product/Product_Detail.aspx?DetailID=111
    > 5&CategoryID=1&DetailName=Feature&MenuID=19&LanID=0


    Why do both? The second one alone gets me the same result.

    An HTTP GET without cookies or other mechanisms (e.g. Javascript, a session
    identifier in the URL, etc.) is stateless.

    I have found WebScarab

    http://www.owasp.org/index.php/Category:OWASP_WebScarab_Project

    useful for investigating these issues. It's like using a cannon to kill a
    fly, but the cannon is free.
    Ian Shef, Apr 5, 2011
    #12
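
Cookies are the usual way a sequence of GETs stops being stateless, and by default `HttpURLConnection` neither stores nor replays them. A minimal sketch of making a Java client carry cookies from one request to the next, as a browser does, is shown below. The class name `SessionAwareFetcher` is an assumption for illustration, and so is the premise that the site in question issues a session cookie; the thread does not establish that.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: fetches pages in order while replaying cookies like a browser.
final class SessionAwareFetcher {

    /** Installs a JVM-wide cookie store so later HttpURLConnection requests replay cookies. */
    static CookieManager installCookieStore() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    /** Fetches each address in order, drains the response, and returns the status codes. */
    static int[] fetchInOrder(String... addresses) {
        int[] statuses = new int[addresses.length];
        for (int i = 0; i < addresses.length; i++) {
            try {
                HttpURLConnection conn =
                        (HttpURLConnection) URI.create(addresses[i]).toURL().openConnection();
                statuses[i] = conn.getResponseCode();
                InputStream body =
                        statuses[i] >= 400 ? conn.getErrorStream() : conn.getInputStream();
                if (body != null) {
                    try (InputStream in = body) {
                        in.readAllBytes(); // drain so the connection can be reused
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return statuses;
    }

    public static void main(String[] args) {
        installCookieStore();
        // Pass the list page first, then the detail page, to mimic the browsing order.
        for (int status : fetchInOrder(args)) {
            System.out.println(status);
        }
    }
}
```

Running the same pair of URLs with and without `installCookieStore()` is a quick way to test whether the second page depends on state set by the first.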
  13. Roedy Green

    Roedy Green Guest

    On Mon, 04 Apr 2011 23:06:13 GMT, Ian Shef
    wrote, quoted or indirectly quoted someone who said :

    >Why do both? The second one alone gets me the same result

    To reproduce what I am doing. I read the first page and collect
    information from it to create the individual motherboard URLs.

    I suspect the problem has something to do with the order of reading
    pages.
    Roedy Green, Apr 5, 2011
    #13
  14. Roedy Green

    Roedy Green Guest

    On Mon, 04 Apr 2011 23:06:13 GMT, Ian Shef
    wrote, quoted or indirectly quoted someone who said :

    >
    >An HTTP GET without cookies or other mechanisms (e.g. Javascript, a session
    >identifier in the URL, etc.) is stateless.


    It is supposed to be. I suspect caching, reuse of connections ...
    might mean it is not quite stateless.
    Roedy Green, Apr 5, 2011
    #14
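
One way to test the caching and connection-reuse suspicion is to take both out of the picture and see whether the 500 persists. The sketch below prepares a connection that bypasses any installed response cache and asks for a one-shot connection. The class name `FreshConnections` is an assumption for illustration; connection reuse can also be switched off for the whole JVM by setting the `http.keepAlive` system property to `false` before the first request.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: prepares requests that avoid caches and connection reuse.
final class FreshConnections {

    /** Opens (without connecting yet) a connection configured to avoid caching and keep-alive. */
    static HttpURLConnection open(String address) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(address).toURL().openConnection();
            conn.setUseCaches(false);                             // bypass any installed ResponseCache
            conn.setRequestProperty("Connection", "close");       // ask for a one-shot connection
            conn.setRequestProperty("Cache-Control", "no-cache"); // ask intermediaries to revalidate
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the URLs in the order the scraper visits them.
        for (String address : args) {
            HttpURLConnection conn = open(address);
            System.out.println(conn.getResponseCode() + " " + address);
            conn.disconnect();
        }
    }
}
```

If the second page still fails with every request on a fresh connection, reuse and caching can be crossed off the list.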