using sockets to open connection to a search engine

Discussion in 'Java' started by Damo, Jan 15, 2007.

  1. Damo

    Damo Guest

    Hi,
    I'm trying to open a connection to altavista.com from Java to
    retrieve the search results for a query. This is the code I'm using; it
    works for Google and Yahoo but not for AltaVista or MSN.

    s = new Socket("altavista.com",80);
    p = new PrintStream(s.getOutputStream());
    p.print("GET /web/results?q=java HTTP/1.0\r\n");
    p.print("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.\r\n");
    p.print("Connection: close\r\n\r\n");
    in = s.getInputStream();


    If you type www.altavista.com/web/results?q=java into the address
    bar of a browser, it returns the results page.

    Can anyone help me?
    Thanks
     
    Damo, Jan 15, 2007
    #1

  2. Damo wrote:
    > Hi,
    > I'm trying to open a connection to altavista.com through java to
    > retrieve the search results for a query. This is the code I'm using, it
    > works for google and yahoo but not altavista or MSN.
    >
    > s = new Socket("altavista.com",80);
    > p = new PrintStream(s.getOutputStream());
    > p.print("GET /web/results?q=java HTTP/1.0\r\n");
    > p.print("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
    > rv:1.8.1) Gecko/20061010 Firefox/2.\r\n");
    > p.print("Connection: close\r\n\r\n");
    > in = s.getInputStream();
    >
    > If you type this : www.altavista.com/web/results?q=java into
    > the address bar, it will return the result page.


    Put something in between your browser and AltaVista and
    see what the browser sends.

    You already have User-Agent, but maybe it wants Referer or
    Accept or Accept-Language or Accept-Encoding.

    Or maybe it wants HTTP/1.1 (which requires Host).

    There is a limited number of things to add until
    you are fully browser compatible.
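
    The suggestion above can be sketched roughly as follows. This is only an illustration: the class name RequestBuilder, the method buildRequest, and the header values are made up for the example, and nothing here is known to be what AltaVista actually requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RequestBuilder {
    // Builds an HTTP/1.1 GET request. The extra headers are illustrative
    // guesses at what a browser sends, not values any server is known to need.
    static String buildRequest(String host, String path) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", host);                 // mandatory in HTTP/1.1
        headers.put("User-Agent", "Mozilla/5.0");  // shortened for the example
        headers.put("Accept", "text/html");
        headers.put("Accept-Language", "en-US");
        headers.put("Connection", "close");

        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path).append(" HTTP/1.1\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        sb.append("\r\n"); // blank line ends the header section
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the request text only; nothing is sent over the network.
        System.out.print(buildRequest("www.altavista.com", "/web/results?q=java"));
    }
}
```

    Keeping the headers in a map makes it easy to add or remove one at a time while comparing against what the browser sends.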

    Arne
     
    Arne Vajhøj, Jan 15, 2007
    #2

  3. Damo

    Damo Guest

    Sorry, I meant to say the error was a 404, "resource not found on this
    server".
    So it's connecting but not returning the results.
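
    For reference, the 404 is the second field of the first line of the response. A minimal sketch of pulling it out, where the class name StatusLine and the method statusCode are hypothetical names for this example:

```java
class StatusLine {
    // Extracts the numeric status code from a line such as
    // "HTTP/1.1 404 Not Found". Throws if the line does not look like one.
    static int statusCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not an HTTP status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.println(statusCode("HTTP/1.1 404 Not Found")); // prints 404
    }
}
```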
     
    Damo, Jan 15, 2007
    #3
  4. Tom Hawtin

    Tom Hawtin Guest

    Damo wrote:
    >
    > If you type this : www.altavista.com/web/results?q=java into
    > the address bar, it will return the result page.


    This seems to work (once I managed to spell alta-vista with both Ts -
    shouldn't have repeated myself):

    import java.io.*;
    import java.net.*;

    class Search {
        public static void main(String[] args) throws Exception {
            Socket s = new Socket("www.altavista.com", 80);
            // HTTP/1.1 requires a Host header.
            String request =
                "GET /web/results?q=java HTTP/1.1\r\n" +
                "Host: www.altavista.com:80\r\n" +
                "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;rv:1.8.1) Gecko/20061010 Firefox/2.\r\n" +
                "Connection: close\r\n\r\n";
            OutputStream out = s.getOutputStream();
            out.write(request.getBytes());
            out.flush();
            // Echo the raw response (headers and body) until the server closes.
            InputStream in = s.getInputStream();
            for (;;) {
                int b = in.read();
                if (b == -1) { break; }
                System.out.print((char) b);
            }
        }
    }

    Tom Hawtin
     
    Tom Hawtin, Jan 15, 2007
    #4
  5. Damo

    Damo Guest

    Excellent, cheers, that did the trick.
     
    Damo, Jan 15, 2007
    #5
  6. Damo wrote:
    > Hi,
    > I'm trying to open a connection to altavista.com through java to
    > retrieve the search results for a query. This is the code I'm using, it
    > works for google and yahoo but not altavista or MSN.
    >
    > s = new Socket("altavista.com",80);
    > p = new PrintStream(s.getOutputStream());
    > p.print("GET /web/results?q=java HTTP/1.0\r\n");
    > p.print("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
    > rv:1.8.1) Gecko/20061010 Firefox/2.\r\n");
    > p.print("Connection: close\r\n\r\n");
    > in = s.getInputStream();
    >
    >
    > If you type this : www.altavista.com/web/results?q=java into
    > the address bar, it will return the result page.
    >
    > Can anyone help me
    > Thanks
    >

    Try opening the socket to "www.altavista.com".

    It's not the same host as "altavista.com". You can see the difference by
    pinging them both and looking at the IPs and true host names.
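
    The same comparison can be made from Java instead of ping. A rough sketch, where HostCompare, addressesOf and sameAddresses are names invented for this example; it only compares resolved IP addresses and says nothing about how the servers behave:

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashSet;
import java.util.Set;

class HostCompare {
    // Returns every IP address the name resolves to, as text.
    static Set<String> addressesOf(String host) {
        try {
            Set<String> result = new HashSet<>();
            for (InetAddress a : InetAddress.getAllByName(host)) {
                result.add(a.getHostAddress());
            }
            return result;
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when both names resolve to exactly the same set of addresses.
    static boolean sameAddresses(String a, String b) {
        return addressesOf(a).equals(addressesOf(b));
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("usage: java HostCompare <host1> <host2>");
            return;
        }
        System.out.println(args[0] + " -> " + addressesOf(args[0]));
        System.out.println(args[1] + " -> " + addressesOf(args[1]));
        System.out.println("same addresses: " + sameAddresses(args[0], args[1]));
    }
}
```

    For example: java HostCompare altavista.com www.altavista.com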


    --
    martin@ | Martin Gregorie
    gregorie. | Essex, UK
    org |
     
    Martin Gregorie, Jan 16, 2007
    #6
