HTTP and Java

Roedy Green

If Java sent a website an HTTP header identical to the one sent by a
browser, including User-Agent, is there a plausible mechanism by which
the website would treat the requests differently, namely reject the
Java request and accept the browser request?
 
Luuk

> If Java sent a website an HTTP header identical to the one sent by a
> browser, including User-Agent, is there a plausible mechanism by which
> the website would treat the requests differently, namely reject the
> Java request and accept the browser request?

How would this webserver know the difference, if there is none?

I don't think it's possible to do something different if you send the same
information.
 
Lawrence D'Oliveiro

> If Java sent a website an HTTP header identical to the one sent by a
> browser, including User-Agent, is there a plausible mechanism by which
> the website would treat the requests differently, namely reject the
> Java request and accept the browser request?

Other than by pulling JavaScript tricks, you mean?
 
Tom Anderson

> If Java sent a website an HTTP header identical to the one sent by a
> browser, including User-Agent, is there a plausible mechanism by which
> the website would treat the requests differently, namely reject the
> Java request and accept the browser request?

For a single request, no.

A server might be able to tell the difference between a browser and Java
by looking at previous interactions, e.g. if the browser had made some AJAX
calls that the Java program did not because it was not running JavaScript.

IME, if I have a Java program and a browser behaving differently, it's
because the request is different in some way I hadn't realised. Get
Wireshark on the case immediately, and save yourself some puzzlement.

tom
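
As a lighter-weight complement to a packet capture, the request Java actually sends can be dumped by pointing it at a throwaway local socket. This is only a sketch under stated assumptions: the class name `RequestCapture`, the `/probe` path, and the User-Agent string are made up for illustration, and it uses `HttpURLConnection`, which may not be the client used in the program under discussion.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

final class RequestCapture {

    /** Sends one GET to a local socket and returns the request line and headers as sent. */
    static List<String> capture(String userAgent) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // port 0: any free port
            CompletableFuture<List<String>> seen =
                    CompletableFuture.supplyAsync(() -> readRequest(server));
            String url = "http://127.0.0.1:" + server.getLocalPort() + "/probe";
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            conn.setRequestProperty("User-Agent", userAgent);
            conn.getResponseCode(); // forces the request to be sent
            conn.disconnect();
            return seen.get(5, TimeUnit.SECONDS);
        }
    }

    /** Accepts one connection, records lines up to the blank line, answers 200. */
    private static List<String> readRequest(ServerSocket server) {
        try (Socket socket = server.accept()) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            List<String> lines = new ArrayList<>();
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                lines.add(line);
            }
            socket.getOutputStream().write(
                    "HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"
                            .getBytes(StandardCharsets.ISO_8859_1));
            socket.getOutputStream().flush();
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical User-Agent; paste in the browser's real one to compare.
        capture("Mozilla/5.0 (illustrative)").forEach(System.out::println);
    }
}
```

Comparing this output line by line with the browser's request (from its developer tools or a capture) shows any headers that Java adds or omits beyond User-Agent.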
 
Joshua Cranmer

> If Java sent a website an HTTP header identical to the one sent by a
> browser, including User-Agent, is there a plausible mechanism by which
> the website would treat the requests differently, namely reject the
> Java request and accept the browser request?

A plausible technique is to split the real content into pieces that
require further loading, such as client-side scripting, cookies, CSS,
iframes, images, embedded plugins (e.g., Flash or Silverlight), or
client-side redirects. The server can then check that these pieces have
also been downloaded before it accepts other page requests.

An additional factor could be that browsers actually use HTTP differently
from Java, e.g., if one pipelines and the other doesn't, or perhaps
automatic SSL upgrading, etc. It all comes down to how many false
negatives the server is willing to bear.
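
To make the cookie variant of this concrete, here is a toy sketch, not a description of what any real site does. A local server (the JDK's built-in `com.sun.net.httpserver`) hands out a cookie on the index page and answers 500 on a detail page when the cookie is missing. The paths, the cookie name `seen`, and the class name `CookieGateDemo` are all invented for illustration. The client carries the cookie by hand; installing a `java.net.CookieManager` via `CookieHandler.setDefault` is the usual way to have `HttpURLConnection` do this automatically.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

final class CookieGateDemo {

    /** Toy server: "/" sets a cookie, "/detail" returns 500 unless the cookie comes back. */
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            boolean isDetail = "/detail".equals(exchange.getRequestURI().getPath());
            String cookie = exchange.getRequestHeaders().getFirst("Cookie");
            boolean visitedIndex = cookie != null && cookie.contains("seen=1");
            if (!isDetail) {
                exchange.getResponseHeaders().add("Set-Cookie", "seen=1");
            }
            // -1 means "no response body"
            exchange.sendResponseHeaders(isDetail && !visitedIndex ? 500 : 200, -1);
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Opens a GET connection, optionally sending a Cookie header. */
    static HttpURLConnection open(String url, String cookie) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection(Proxy.NO_PROXY);
        if (cookie != null) {
            conn.setRequestProperty("Cookie", cookie);
        }
        return conn;
    }

    /** Returns the detail-page status without the cookie, then with it. */
    static int[] demo() throws IOException {
        HttpServer server = startServer();
        try {
            String base = "http://127.0.0.1:" + server.getAddress().getPort();
            int cold = open(base + "/detail", null).getResponseCode();
            String cookie = open(base + "/", null).getHeaderField("Set-Cookie");
            int warm = open(base + "/detail", cookie).getResponseCode();
            return new int[] { cold, warm };
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        int[] result = demo();
        System.out.println("without cookie: " + result[0] + ", with cookie: " + result[1]);
    }
}
```

The two detail requests have the same URL and differ only in the Cookie header, which is exactly the kind of difference that is easy to overlook when only User-Agent is being compared.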
 
Roedy Green

I don't think it could tell a single request apart, without requiring
the requester to solve e.g. a text recognition problem. It might be able
to detect a difference in number and frequency of requests from a single
IP address, if the Java program sent multiple requests.
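
For illustration only, a server-side frequency check of the kind speculated about here could be a sliding-window counter per IP address. Everything in this sketch (the class name `RequestRateTracker`, the limits, the in-memory map) is an assumption; nothing in this thread shows that the sites in question do anything like it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class RequestRateTracker {
    private final int maxRequests;
    private final long windowMillis;
    private final Map<String, Deque<Long>> recent = new HashMap<>();

    RequestRateTracker(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /** Records a request and reports whether this IP now exceeds the limit. */
    synchronized boolean recordAndCheck(String ip, long nowMillis) {
        Deque<Long> times = recent.computeIfAbsent(ip, k -> new ArrayDeque<>());
        // Drop timestamps that have fallen out of the window.
        while (!times.isEmpty() && nowMillis - times.peekFirst() >= windowMillis) {
            times.pollFirst();
        }
        times.addLast(nowMillis);
        return times.size() > maxRequests;
    }

    public static void main(String[] args) {
        RequestRateTracker tracker = new RequestRateTracker(3, 1000);
        for (long t = 0; t <= 300; t += 100) {
            System.out.println(t + " ms: over limit = " + tracker.recordAndCheck("192.0.2.1", t));
        }
    }
}
```

A scraper that fetches a list page and then every linked page back to back would trip a check like this, while a person clicking through the same pages would not.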

I have sidestepped the problem by going to a different website that
has almost the same information. That still leaves the puzzle.

The websites in question are www.ecs.com.tw and www.ecsusa.com

The 500 code I get back is supposedly a server-side error, not my
fault.

In theory it could be a timeout, or something to do with sending
multiple GETs (the first one works), or the timing between GETs. Yet my
experiments seemed to eliminate those causes. I think I will have to
leave this. It is a black hole without much reward for the solution other
than satisfying curiosity. I just hoped someone would think of some
factor I had not considered.
 
Luuk

> I have sidestepped the problem by going to a different website that
> has almost the same information. That still leaves the puzzle.
>
> The websites in question are www.ecs.com.tw and www.ecsusa.com
>
> The 500 code I get back is supposedly a server-side error, not my
> fault.

If you get an ERROR 500 from www.ecs.com.tw, then it's your error....

luuk@opensuse:/tmp> wget -S -U "this is a weird browser" http://www.ecs.com.tw/
--2011-04-03 12:56:29-- http://www.ecs.com.tw/
Resolving www.ecs.com.tw... 210.17.27.2
Connecting to www.ecs.com.tw|210.17.27.2|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Content-Length: 123
Content-Type: text/html
Content-Location: http://www.ecs.com.tw/index.html
Last-Modified: Mon, 17 May 2010 13:11:33 GMT
Accept-Ranges: bytes
ETag: "8ed7ab78c2f5ca1:aa4"
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Sun, 03 Apr 2011 10:57:04 GMT
Connection: keep-alive
Length: 123 [text/html]
Saving to: `index.html.12'


> In theory it could be a timeout, or something to do with sending
> multiple GETs.

No, a timeout should give a different error.
 
Arved Sandstrom

> If Java sent a website an HTTP header identical to the one sent by a
> browser, including User-Agent, is there a plausible mechanism by which
> the website would treat the requests differently, namely reject the
> Java request and accept the browser request?

When curling that first URL (www.ecs.com.tw), I get a

<meta http-equiv="Refresh"
content="0;url=http://www.ecs.com.tw/ECSWebSite/Index.aspx">

A curl on that gives me a 302 to

http://www.ecs.com.tw/ECSWebSite/Index.aspx?MenuID=0&LanID=0

which when fetched (a curl -L on the 302-producing page) is a "real" page.

I think it's simply that your Java code is incorrect. Every request
above produces an <html>...</html> page.
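
One detail in this chain matters for a Java client: `HttpURLConnection` follows 3xx redirects by default (within the same protocol), but a meta refresh is plain HTML that the client has to find and follow itself. A minimal sketch of pulling the target out of such a tag follows; the class name `MetaRefresh` and the regular expression are illustrative assumptions, and a real HTML parser would be more robust than a regex.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetaRefresh {
    // Matches <meta http-equiv="Refresh" content="N;url=TARGET">, ignoring case
    // and allowing whitespace or line breaks between the attributes.
    private static final Pattern META_REFRESH = Pattern.compile(
            "<meta\\s+http-equiv\\s*=\\s*[\"']?refresh[\"']?\\s+content\\s*=\\s*[\"']"
                    + "\\s*\\d+\\s*;\\s*url\\s*=\\s*([^\"'>]+)[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Returns the refresh target URL if the HTML contains a meta refresh tag. */
    static Optional<String> target(String html) {
        Matcher m = META_REFRESH.matcher(html);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<meta http-equiv=\"Refresh\"\n"
                + "content=\"0;url=http://www.ecs.com.tw/ECSWebSite/Index.aspx\">";
        System.out.println(target(html).orElse("(no meta refresh found)"));
    }
}
```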

AHS

--
That's not the recollection that I recall...All this information is
certainly in the hands of the auditor and we certainly await his report
to indicate what he deems has occurred.
-- Halifax, Nova Scotia mayor Peter Kelly, who is currently deeply in
the shit
 
Roedy Green

> If you get an ERROR 500 from www.ecs.com.tw, then it's your error....

According to http://www.checkupdown.com/status/E500.html

"This error can only be resolved by fixes to the Web server software.
It is not a client-side problem. It is up to the operators of the Web
server site to locate and analyse the logs which should give further
information about the error."

Yet clearly it was something I was doing as a client that triggered
it.
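
When a 500 does come back, the response body sometimes carries a clue about what the server choked on. With `HttpURLConnection`, `getInputStream()` throws for 4xx/5xx responses, so the body has to be read from `getErrorStream()` instead. The sketch below demonstrates this against a toy local server; the class name `ErrorBodyDemo`, the `/fail` path, and the error text are invented, and `readAllBytes()` needs Java 9 or later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;

final class ErrorBodyDemo {

    /** Returns "<status> <body>", reading the body even for error responses. */
    static String statusAndBody(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection(Proxy.NO_PROXY);
        int status = conn.getResponseCode();
        // For 4xx/5xx the body is only available through getErrorStream(),
        // which may be null if the server sent no body.
        try (InputStream body = status >= 400 ? conn.getErrorStream() : conn.getInputStream()) {
            String text = body == null
                    ? ""
                    : new String(body.readAllBytes(), StandardCharsets.UTF_8);
            return status + " " + text;
        }
    }

    /** Runs the client against a toy server that always answers 500 with a message. */
    static String demo() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/fail", exchange -> {
            byte[] message = "hypothetical failure detail".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(500, message.length);
            exchange.getResponseBody().write(message);
            exchange.close();
        });
        server.start();
        try {
            return statusAndBody("http://127.0.0.1:" + server.getAddress().getPort() + "/fail");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```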
 
Ian Shef

> Sorry, that is the website where the problem is, but not the URL. You
> need to try a specific motherboard e.g.
> http://www.ecs.com.tw/ECSWebSite/Product/Product_Model.aspx?CategoryID=1&TypeID=68&MenuID=19&LanID=0#Socket%20AM3
> then
> http://www.ecs.com.tw/ECSWebSite/Product/Product_Detail.aspx?DetailID=1115&CategoryID=1&DetailName=Feature&MenuID=19&LanID=0

Why do both? The second one alone gets me the same result.

An HTTP GET without cookies or other mechanisms (e.g. Javascript, a session
identifier in the URL, etc.) is stateless.

I have found WebScarab

http://www.owasp.org/index.php/Category:OWASP_WebScarab_Project

useful for investigating these issues. It's like using a cannon to kill a
fly, but the cannon is free.
 
Roedy Green

> Why do both? The second one alone gets me the same result.

To reproduce what I am doing. I read the first page and collect
information from it to create the individual motherboard URLs.

I suspect the problem has something to do with the order of reading
pages.
 
Roedy Green

> An HTTP GET without cookies or other mechanisms (e.g. Javascript, a session
> identifier in the URL, etc.) is stateless.

It is supposed to be. I suspect caching, reuse of connections ...
might mean it is not quite stateless.
 
