Premature EOF exception when reading in https input stream

epicwinter

I am making a simple HTTPS request and reading the response. When I
do so I always get a Premature EOF exception from this particular
site, which is the one I need to work with. The responses will be a
mix of text and binary files, so somehow I need to determine whether
I am getting a binary file or a plain text response. Whatever the
response is, though, I get this error. I don't think this is typical,
because when I try the same exercise with curl I receive no error, so
I assume the problem is in my code. This is how I have always read
the response stream in the past, with success. What am I doing wrong?

Below is a working test that reproduces the error. It only receives a
text response: it requests the web page, prints the complete HTML
contained in the response, and then throws the Premature EOF
exception.



import java.net.*;
import java.io.*;

public class Test
{
    public static void main(String [] arstring)
    {
        try
        {
            URL url1 = new java.net.URL("https", "cert.access.webmd.com", 443, "/ITS/post.aspx");
            URLConnection connection = url1.openConnection();
            connection.setUseCaches( false );
            connection.setDoOutput( true );
            connection.setDoInput( true );
            InputStream inputStream = connection.getInputStream();
            InputStreamReader isr = new InputStreamReader(inputStream);
            BufferedReader br = new BufferedReader(isr);
            int c = -1;
            while ((c = br.read()) >= 0)
            {
                System.err.print((char)c);
            }
            br.close();
            isr.close();

        }
        catch (Exception exception)
        {
            exception.printStackTrace();
        }
    }
}
 
John C. Bollinger

A cut & paste of the stack trace would have been helpful. For the
curious, I ran the code and got this trace:

java.io.IOException: Premature EOF
    at sun.net.www.http.ChunkedInputStream.readAheadBlocking(Unknown Source)
    at sun.net.www.http.ChunkedInputStream.readAhead(Unknown Source)
    at sun.net.www.http.ChunkedInputStream.read(Unknown Source)
    at java.io.FilterInputStream.read(Unknown Source)
    at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(Unknown Source)
    at sun.nio.cs.StreamDecoder$CharsetSD.implRead(Unknown Source)
    at sun.nio.cs.StreamDecoder.read(Unknown Source)
    at java.io.InputStreamReader.read(Unknown Source)
    at java.io.BufferedReader.fill(Unknown Source)
    at java.io.BufferedReader.read(Unknown Source)
    at Test.main(Test.java:19)

import java.net.*;
import java.io.*;

public class Test
{
    public static void main(String [] arstring)
    {
        try
        {
            URL url1 = new java.net.URL("https", "cert.access.webmd.com", 443, "/ITS/post.aspx");
            URLConnection connection = url1.openConnection();
            connection.setUseCaches( false );
            connection.setDoOutput( true );
            connection.setDoInput( true );
            InputStream inputStream = connection.getInputStream();
            InputStreamReader isr = new InputStreamReader(inputStream);

This InputStreamReader is incorrect: it decodes the response with the
default charset. You should look for a charset parameter in the
Content-Type header and, if one is present, use that charset to read
the response. (The Content-Encoding header is a different thing: it
names content codings such as gzip, not a character encoding.) If the
headers give no charset then you should be using "ISO-8859-1", the
default that HTTP/1.1 (RFC 2616) assigns to text types, but even this
can pose a problem if the encoding is specified inside an HTML <META>
tag.
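
As a rough illustration of picking the charset from the response headers, here is a minimal sketch. The class ResponseCharset and its method fromContentType are hypothetical names of my own, not part of the JDK, and the parsing is deliberately simple (it does not look inside the HTML for a <META> tag).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a JDK class.
final class ResponseCharset {

    // Returns the charset named in a Content-Type header value such as
    // "text/html; charset=utf-8", or ISO-8859-1 when none is usable.
    static Charset fromContentType(String contentType) {
        if (contentType != null) {
            for (String param : contentType.split(";")) {
                String p = param.trim();
                // Case-insensitive match on the "charset=" prefix.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim();
                    // The value may be a quoted string; strip the quotes.
                    if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                        name = name.substring(1, name.length() - 1);
                    }
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: use the fallback.
                        break;
                    }
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        System.out.println(fromContentType("text/html; charset=utf-8"));
        System.out.println(fromContentType("text/html"));
    }
}
```

With a real connection the header value would come from connection.getContentType(), and the result would be passed as the second argument to the InputStreamReader constructor.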

In this particular case, however, it appears that the response is
encoded in ISO-8859-1, with no funny business. There also do not seem
to be any non-ASCII characters or any control characters other than
standard whitespace, and so it is unlikely that a character encoding
issue is the problem.

            BufferedReader br = new BufferedReader(isr);

It is usually best to do the buffering as close to the source as
possible. In this case that would mean wrapping the Connection's input
stream in a BufferedInputStream instead of using a BufferedReader at
this point.

            int c = -1;
            while ((c = br.read()) >= 0)
            {
                System.err.print((char)c);
            }
            br.close();
            isr.close();

You do not need to close both. It is sufficient to close just the
outermost reader or stream.

        }
        catch (Exception exception)
        {
            exception.printStackTrace();
        }
    }
}
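
To make the two style comments above concrete (buffer next to the source, close only the outermost object), here is a minimal sketch. ReadAll and readText are hypothetical names, and a ByteArrayInputStream stands in for the connection's input stream so the example runs without a network. It does not address the Premature EOF itself.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class ReadAll {

    // Reads the whole stream as text in the given charset.
    static String readText(InputStream source, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        // The BufferedInputStream sits directly on the source. Closing the
        // outermost Reader (done here by try-with-resources) closes the
        // whole chain, so no second close() call is needed.
        try (Reader reader = new InputStreamReader(new BufferedInputStream(source), charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) >= 0) {
                text.append(buf, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for connection.getInputStream().
        InputStream fake = new ByteArrayInputStream(
                "<html>ok</html>".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(readText(fake, StandardCharsets.ISO_8859_1));
    }
}
```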

Overall, although your code has some problems, I don't think any of them
are responsible for the failure you see. It looks more like the server
is sending a malformed HTTP response, or possibly that Java's support
for the chunked transfer coding is broken. I'd guess the former,
especially since it's odd to be using the chunked transfer coding for a
plain, fairly small HTML page in the first place. If the server is
getting the size of the last chunk wrong then that might explain the
error. If you want to investigate further, for instance to confirm an
error on the server side, you would need to capture and record the full
HTTP response. The use of HTTPS will complicate that.
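
To show why a wrong chunk size or a missing final chunk would surface as a premature end of stream, here is a simplified decoder for the chunked transfer coding. It is only a sketch of the framing, not the JDK's sun.net.www.http.ChunkedInputStream; ChunkedSketch and its methods are hypothetical names, and trailers and chunk extensions are skipped rather than interpreted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified illustration of chunked decoding.
final class ChunkedSketch {

    // Decodes a chunked body: each chunk is "<hex size>CRLF<data>CRLF",
    // and a chunk of size 0 (plus optional trailers) ends the body.
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semi = sizeLine.indexOf(';');      // drop any chunk extension
            if (semi >= 0) sizeLine = sizeLine.substring(0, semi);
            int size = Integer.parseInt(sizeLine.trim(), 16);
            if (size == 0) {
                // Skip trailer lines up to the terminating empty line.
                while (!readLine(in).isEmpty()) { }
                return body.toByteArray();
            }
            for (int i = 0; i < size; i++) {
                int b = in.read();
                if (b < 0) {
                    throw new EOFException("Premature EOF: chunk declared "
                            + size + " bytes, stream ended after " + i);
                }
                body.write(b);
            }
            readLine(in);                          // CRLF that follows the chunk data
        }
    }

    // Reads one line of framing, without its line terminator.
    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != '\n') {
            if (b < 0) throw new EOFException("Premature EOF: stream ended inside the chunk framing");
            if (b != '\r') line.append((char) b);
        }
        return line.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] good = "5\r\nhello\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decode(new ByteArrayInputStream(good)), StandardCharsets.US_ASCII));
        // Same data, but the server closes without sending the final 0-size chunk.
        byte[] bad = "5\r\nhello\r\n".getBytes(StandardCharsets.US_ASCII);
        try {
            decode(new ByteArrayInputStream(bad));
        } catch (EOFException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the second case all of the data arrives before the failure, which would be consistent with a page that is displayed in full and only then followed by the exception.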
 
