anthony
I am writing a program in C++ to open a specific web site and extract
its HTML for parsing.
Here is a segment of my code:
void OpenWebPage(const RCString& url, const RCString& page)
{
// url, e.g. www.statistics.gov.uk
// page, e.g. /index.php
// Connect to the server over a socket on the default HTTP port 80
int Socket = clientSock(url.c_str(), 80);
// Build the request; assumes RCString provides operator+ alongside the c_str() used above
RCString request = "GET " + page + " HTTP/1.0\n\n";
// Send the HTTP request over the socket
send(Socket, request.c_str(), strlen(request.c_str()), 0);
// retrieve information
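char buf[500] = {0};              // zero-initialised so RCString(buf) always finds a terminator
int sizeOfBuf = sizeof(buf) - 1;  // leave the last byte free for the trailing '\0'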
int bytes_read = 0;
do
{
bytes_read = recv(Socket, buf, sizeOfBuf, 0);
myStrBuffer += RCString(buf);
memset(buf, 0, sizeof(buf));
}
while (bytes_read > 0);  // recv may return fewer bytes than asked; stop only on close (0) or error (<0)
}
This works well for simple web sites, e.g.
www.statistics.gov.uk/index.php.
However, when I try it with a more complicated page value, e.g.
http://epp.eurostat.cec.eu.int/port...t=EUROIND&root=EUROIND/shorties/euro_cp/cp240
the information cannot be retrieved, because the server does not
understand the url / page in the request.
Can someone please help me figure out what to do, or at least point
me in the right direction?
Cheers,
Anthony
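One possible direction, offered as a rough sketch rather than a confirmed fix: a full URL has to be split into a host (for the socket connection) and a path plus query string (for the GET line). Servers that host several sites on one address generally also expect a Host header, and HTTP lines end in "\r\n". The sketch below uses std::string instead of RCString. The names UrlParts, SplitUrl and BuildGetRequest are hypothetical, and it assumes plain http on port 80 with no port, user-info or fragment in the URL.

```cpp
#include <string>

// Hypothetical helper type: the two parts of a URL a plain HTTP request needs.
struct UrlParts {
    std::string host;    // goes to the socket connect call and the Host header
    std::string target;  // path plus query string, goes on the GET line
};

// Split "http://host/path?query" into host and request target.
// Assumption: plain http only; a URL without a path gets the target "/".
UrlParts SplitUrl(const std::string& url) {
    const std::string scheme = "http://";
    std::string rest = url;
    if (rest.compare(0, scheme.size(), scheme) == 0) {
        rest.erase(0, scheme.size());  // drop the scheme prefix if present
    }
    UrlParts parts;
    const std::string::size_type slash = rest.find('/');
    if (slash == std::string::npos) {
        parts.host = rest;
        parts.target = "/";
    } else {
        parts.host = rest.substr(0, slash);
        parts.target = rest.substr(slash);  // keeps everything after the host, query included
    }
    return parts;
}

// Build an HTTP/1.0 GET request with a Host header and CRLF line endings.
std::string BuildGetRequest(const UrlParts& parts) {
    return "GET " + parts.target + " HTTP/1.0\r\n"
           "Host: " + parts.host + "\r\n"
           "\r\n";
}
```

With something like this, the host part would be passed to clientSock and the string from BuildGetRequest would be sent over the socket, so the query string travels on the GET line unchanged.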