Using sockets to open a connection to a search engine

Damo

Hi,
I'm trying to open a connection to altavista.com from Java to
retrieve the search results for a query. This is the code I'm using; it
works for Google and Yahoo but not AltaVista or MSN.

s = new Socket("altavista.com",80);
p = new PrintStream(s.getOutputStream());
p.print("GET /web/results?q=java HTTP/1.0\r\n");
p.print("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.\r\n");
p.print("Connection: close\r\n\r\n");
in = s.getInputStream();


If you type www.altavista.com/web/results?q=java into
the address bar, it will return the result page.

Can anyone help me?
Thanks

Arne Vajhøj

Put something in between your browser and AltaVista and
see what the browser sends.

You already have User-Agent, but maybe it wants Referer or
Accept or Accept-Language or Accept-Encoding.

Or maybe it wants HTTP/1.1 (which requires Host).

There is a limited number of things to add until
you are fully browser compatible.

Arne
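
Arne's suggestion (HTTP/1.1 with a Host header, plus extra browser-style headers) could look roughly like the sketch below. The class name HttpRequestSketch, the method buildRequest, and the specific Accept and Accept-Language values are assumptions made for illustration; the thread does not say which headers AltaVista actually checks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HttpRequestSketch {
    // Builds an HTTP/1.1 GET request. HTTP/1.1 requires a Host header,
    // so it is always emitted; any extra headers follow in insertion order.
    static String buildRequest(String host, String path, Map<String, String> extraHeaders) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path).append(" HTTP/1.1\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        for (Map.Entry<String, String> h : extraHeaders.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        // Ask the server to close the connection so the reader sees end-of-stream.
        sb.append("Connection: close\r\n\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Header values here are illustrative guesses, not taken from a real browser capture.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept", "text/html");
        headers.put("Accept-Language", "en-US");
        System.out.print(buildRequest("www.altavista.com", "/web/results?q=java", headers));
    }
}
```

Adding headers one at a time and re-testing is one way to find out which one the server is sensitive to.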
 
Damo

Sorry, I meant to say that the error was a 404 (resource not found on this
server), so it's connecting but not returning the results.
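
A 404 means the TCP connection and the HTTP exchange both worked and the server answered with a status line. As a minimal sketch (the class name StatusLineSketch and the method parseStatusCode are hypothetical, not from the thread), the status code can be pulled out of the first response line like this:

```java
class StatusLineSketch {
    // Extracts the numeric status code from a line such as "HTTP/1.0 404 Not Found".
    static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not an HTTP status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.println(parseStatusCode("HTTP/1.0 404 Not Found")); // prints 404
    }
}
```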
 
Tom Hawtin

Damo said:
If you type www.altavista.com/web/results?q=java into
the address bar, it will return the result page.

This seems to work (once I managed to spell alta-vista with both Ts -
shouldn't have repeated myself):

import java.io.*;
import java.net.*;

class Search {
    public static void main(String[] args) throws Exception {
        // Connect to the www host, not the bare domain.
        Socket s = new Socket("www.altavista.com", 80);
        // HTTP/1.1 requires a Host header.
        String request =
            "GET /web/results?q=java HTTP/1.1\r\n" +
            "Host: www.altavista.com:80\r\n" +
            "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;rv:1.8.1) Gecko/20061010 Firefox/2.\r\n" +
            "Connection: close\r\n\r\n";
        OutputStream out = s.getOutputStream();
        out.write(request.getBytes());
        out.flush();
        // Echo the raw response (headers and body) until the server closes the connection.
        InputStream in = s.getInputStream();
        for (;;) {
            int b = in.read();
            if (b == -1) { break; }
            System.out.print((char) b);
        }
    }
}

Tom Hawtin
 
Martin Gregorie

Try opening the socket to "www.altavista.com".

It's not the same host as "altavista.com". You can see the difference by
pinging them both and looking at the IPs and true host names.
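
The same comparison can be done from Java instead of ping. This is a minimal sketch, assuming the illustrative names HostCompareSketch, addressesOf and sameAddresses (none of them come from the thread); it uses InetAddress.getAllByName to list the addresses each name resolves to.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.TreeSet;

class HostCompareSketch {
    // Returns the textual IP addresses that a host name (or IP literal) resolves to.
    static Set<String> addressesOf(String host) throws UnknownHostException {
        Set<String> result = new TreeSet<>();
        for (InetAddress a : InetAddress.getAllByName(host)) {
            result.add(a.getHostAddress());
        }
        return result;
    }

    // True when both names resolve to exactly the same set of addresses.
    static boolean sameAddresses(String a, String b) throws UnknownHostException {
        return addressesOf(a).equals(addressesOf(b));
    }

    public static void main(String[] args) throws UnknownHostException {
        // Defaults are the two names discussed in the thread; pass others as arguments.
        String a = args.length > 0 ? args[0] : "altavista.com";
        String b = args.length > 1 ? args[1] : "www.altavista.com";
        System.out.println(a + " -> " + addressesOf(a));
        System.out.println(b + " -> " + addressesOf(b));
        System.out.println("same addresses: " + sameAddresses(a, b));
    }
}
```

Even if both names resolve to the same address, a server that hosts several sites may still pick the site from the Host header, so sending that header remains worthwhile.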
 
