Fetching a web page with a space in the URL

cool2005

I tried to fatch a wab page in java as following

public static String downloadWWWPage(String pageAddr) throws IOException {
    ....

    URL url = new URL(pageAddr);
    String websiteAddress = url.getHost();
    String file = url.getFile();
    ....
    Socket clientSocket = new Socket(websiteAddress, 80);
    System.out.println("Socket opened to " + websiteAddress + "\n");

    // Create a BufferedReader over the socket's input stream;
    // it will read the content sent by the web server.
    BufferedReader inFromServer = new BufferedReader(
            new InputStreamReader(clientSocket.getInputStream()));

    // Create an output stream writer to talk to the web server.
    OutputStreamWriter outWriter = new OutputStreamWriter(
            clientSocket.getOutputStream());

    // Send the GET request with the file name you intend to fetch and
    // the protocol version (HTTP/1.0). The web server responds with the
    // page, which is then read from the input stream.
    // The request must end with an empty line, i.e. two CRLF pairs.
    outWriter.write("GET " + file + " HTTP/1.0\r\n\r\n");

===============================

The problem is that when the variable "file" above contains a space
character (even if I encode the space as %20 or  ) I get a 404, but the
same URL works in the address field of a browser.

Is there any way around this?

thanks

mark
 
anon

Dude,
You need to encode the URI ("file" in your example).
Try java.net.URLEncoder.encode(String s, String enc);
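
A minimal sketch of what that encoding step can look like, under some assumptions: the path "/my docs/a b.html" and the class name PathEncodingDemo are made up for illustration. URLEncoder implements HTML form encoding, so applied to a whole path it turns spaces into "+" and also escapes the slashes. The multi-argument java.net.URI constructors quote only the characters that are illegal in a path, which is usually closer to what a GET request line needs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical demo class; the names and sample path are illustrative only.
class PathEncodingDemo {

    // Form-style encoding: space becomes '+', and '/' becomes %2F.
    static String withUrlEncoder(String path) throws UnsupportedEncodingException {
        return URLEncoder.encode(path, "UTF-8");
    }

    // Path-style quoting: the 4-argument URI constructor
    // (scheme, host, path, fragment) percent-encodes illegal characters
    // such as spaces and leaves the slashes alone.
    static String withUri(String path) throws URISyntaxException {
        return new URI(null, null, path, null).getRawPath();
    }

    public static void main(String[] args) throws Exception {
        String path = "/my docs/a b.html"; // assumed example path
        System.out.println(withUrlEncoder(path)); // %2Fmy+docs%2Fa+b.html
        System.out.println(withUri(path));        // /my%20docs/a%20b.html
    }
}
```

One caveat: the multi-argument URI constructors always quote "%", so a path that is already encoded (containing %20) would be encoded a second time. Pass in the raw, unencoded path.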

But you sure are doing it in a complicated way. Why not:

InputStream in = url.openConnection().getInputStream();

or

Object o = url.openConnection().getContent();
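
For illustration, here is a hedged sketch of the url.openConnection() route as a complete helper. The class name OpenConnectionDemo and the method readAll are hypothetical, and the response is assumed to be UTF-8 text. As far as I know, URLConnection does not escape spaces for you, so the URL passed in should already be properly encoded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the original poster's code.
class OpenConnectionDemo {

    // Reads everything the URL's connection returns into a String.
    // Assumption: the content is UTF-8 text.
    static String readAll(URL url) throws IOException {
        try (InputStream in = url.openConnection().getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java OpenConnectionDemo <already-encoded URL>
        System.out.println(readAll(new URL(args[0])));
    }
}
```

Compared with the raw socket version, this leaves the request line, headers and connection handling to the JDK.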

-a
 
Andrew Thompson

cool2005 said:
I tried to fatch a wab

'fatch a wab' ..is that code?
..page ..

What URL?

( And do you have the site owner's consent to fetch and
use the information from a Java (or other) program? Many
site owners do not allow it, and take measures to prevent it. )
...in java as following

URL url = new URL(pageAddr);

...please don't post 'tab' characters to usenet.
I don't know how wide that is in your newsclient,
but on Google it's this wide..
URL url = new
URL(pageAddr);

Andrew T.
 
moonhk

Andrew said:
'fatch a wab' ..is that code?


What URL?

( And do you have the site owner's consent to fetch and
use the information from a Java (or other) program? Many
site owners do not allow it, and take measures to prevent it. )


..please don't post 'tab' characters to usenet.
I don't know how wide that is in your newsclient,
but on Google it's this wide..
URL url = new
URL(pageAddr);

Andrew T.

The code below works.

/*
 * 2006/08/14 eric.leung Work at Home. Need to check with proxy Authenticator.
 *
 * Copyright (c) 2000 David Flanagan. All rights reserved.
 * This code is from the book Java Examples in a Nutshell, 2nd Edition.
 * It is provided AS-IS, WITHOUT ANY WARRANTY either expressed or implied.
 * You may study, use, and modify it for any non-commercial purpose.
 * You may distribute it non-commercially as long as you retain this notice.
 * For a commercial use license, or to purchase the book (recommended),
 * visit http://www.davidflanagan.com/javaexamples2.
 */
// package com.davidflanagan.examples.net;
import java.io.*;
import java.net.*;

/**
 * This program connects to a Web server and downloads the specified URL
 * from it. It uses the HTTP protocol directly.
 **/
public class HttpClient {
    public static void main(String[] args) {
        try {
            // Check the arguments
            if ((args.length != 1) && (args.length != 2))
                throw new IllegalArgumentException("Wrong number of args");

            // Get an output stream to write the URL contents to
            OutputStream to_file;
            if (args.length == 2) to_file = new FileOutputStream(args[1]);
            else to_file = System.out;

            // Now use the URL class to parse the user-specified URL into
            // its various parts.
            URL url = new URL(args[0]);
            String protocol = url.getProtocol();
            if (!protocol.equals("http")) // Check that we support the protocol
                throw new IllegalArgumentException("Must use 'http:' protocol");
            String host = url.getHost();
            int port = url.getPort();
            if (port == -1) port = 80; // if no port, use the default HTTP port
            String filename = url.getFile();

            // Open a network socket connection to the specified host and port
            Socket socket = new Socket(host, port);

            // Get input and output streams for the socket
            InputStream from_server = socket.getInputStream();
            PrintWriter to_server = new PrintWriter(socket.getOutputStream());

            // Send the HTTP GET command to the Web server, specifying the file
            // This uses an old and very simple version of the HTTP protocol
            to_server.print("GET " + filename + "\n\n");
            to_server.flush(); // Send it right now!

            // Now read the server's response, and write it to the file
            byte[] buffer = new byte[4096];
            int bytes_read;
            while ((bytes_read = from_server.read(buffer)) != -1)
                to_file.write(buffer, 0, bytes_read);

            // When the server closes the connection, we close our stuff
            socket.close();
            if (args.length == 2) {
                to_file.close();
                // Only report a file name when one was actually given;
                // args[1] does not exist in the one-argument case.
                System.out.println("Output to " + args[1]);
            } else {
                to_file.flush(); // writing to System.out: flush, don't close
            }
        }
        catch (Exception e) { // Report any errors that arise
            System.err.println(e);
            System.err.println("\n2006/08/11\n");
            System.err.println("Usage: java HttpClient <URL> [<filename>]");
            System.err.println("e.g. java HttpClient http://hk.yahoo.com abc.txt");
        }
    }
}
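
The program above sends the file part of the URL unchanged, so a path containing a space would still end up as an extra token in the request line. A hedged sketch of how the request could be assembled with the path quoted first follows; the class GetRequestDemo, the method buildGetRequest and the host example.com are assumptions for illustration, not part of the book's code. It uses HTTP/1.0 with a Host header, which many virtual-hosted servers expect.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper; names and the sample URL are illustrative only.
class GetRequestDemo {

    // Builds a complete HTTP/1.0 GET request for the given URL,
    // percent-encoding illegal characters (such as spaces) in the
    // path and query. Assumes the URL is NOT already encoded.
    static String buildGetRequest(URL url) throws URISyntaxException {
        String path = url.getPath().isEmpty() ? "/" : url.getPath();

        // The 5-argument constructor (scheme, authority, path, query,
        // fragment) quotes illegal characters in path and query.
        URI quoted = new URI(null, null, path, url.getQuery(), null);
        String target = quoted.getRawPath();
        if (quoted.getRawQuery() != null) {
            target += "?" + quoted.getRawQuery();
        }

        String host = url.getHost();
        if (url.getPort() != -1) {
            host += ":" + url.getPort();
        }

        // Each header line ends with CRLF; an empty line ends the request.
        return "GET " + target + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n";
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com/my docs/a b.html"); // assumed URL
        System.out.print(buildGetRequest(url));
    }
}
```

The returned string can be written to the socket's output stream in place of the hand-built "GET ..." line, followed by a flush.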
 
