Web pages extraction & Libcurl

Discussion in 'C++' started by Choi, Nov 9, 2007.

  1. Choi

    Choi Guest

    Good morning.

    I tried to fetch web pages using libcurl, but it failed. The class I
    wrote compiles without errors, but problems appear when I compile a
    main function that calls this class.

    My class:

    **********************************libcurl_tools.h

    #include <curl/curl.h>


    struct MemoryStruct {
    char *memory;
    size_t size;
    };




    class libcurl_tools {


    CURL * curl_handle;
    MemoryStruct chunk;


    public:

    libcurl_tools();
    void init();
    bool close();
    void set_referrer(std::string, CURL *);
    std::string perform(std::string, CURL *);



    };

    **********************************libcurl_tools.cpp

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #include <curl/curl.h>
    #include <curl/types.h>
    #include <curl/easy.h>

    #include <iostream> // for std::cout
    #include <string> // for std::string

    #include "libcurl_tools.h"



    libcurl_tools::libcurl_tools() {}



    static void *myrealloc(void *ptr, size_t size)
    {
    /* There might be a realloc() out there that doesn't like reallocing
    NULL pointers, so we take care of it here */
    if(ptr)
    return realloc(ptr, size);
    else
    return malloc(size);
    }



    static size_t WriteMemoryCallback(void *ptr, size_t size, size_t
    nmemb, void *data)
    {
    size_t realsize = size * nmemb;
    struct MemoryStruct *mem = (struct MemoryStruct *)data;

    mem->memory = (char *)myrealloc(mem->memory, mem->size + realsize +
    1);
    if (mem->memory) {
    memcpy(&(mem->memory[mem->size]), ptr, realsize);
    mem->size += realsize;
    mem->memory[mem->size] = 0;
    }
    return realsize;
    }




    void libcurl_tools::set_referrer(std::string referer, CURL
    *curl_handle ) {

    curl_easy_setopt(curl_handle, CURLOPT_REFERER, referer.c_str());

    }




    //initializes the connection
    void libcurl_tools::init() {




    chunk.memory=NULL; /* we expect realloc(NULL, size) to work */
    chunk.size = 0; /* no data at this point */


    curl_global_init(CURL_GLOBAL_ALL);

    /* init the curl session */
    curl_handle = curl_easy_init();




    }


    //downloads the URL given as argument
    std::string libcurl_tools::perform(std::string url , CURL
    *curl_handle) {

    /* specify URL to get */
    curl_easy_setopt(curl_handle, CURLOPT_URL, url.c_str());

    /* send all data to this function */
    curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION,
    WriteMemoryCallback);

    /* we pass our 'chunk' struct to the callback function */
    curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&chunk);

    /* some servers don't like requests that are made without a user-agent
    field, so we provide one */
    curl_easy_setopt(curl_handle, CURLOPT_USERAGENT, "libcurl-agent/1.0");

    /* get it! */
    curl_easy_perform(curl_handle);

    /* cleanup curl stuff */
    curl_easy_cleanup(curl_handle);

    /*
    * Now, our chunk.memory points to a memory block that is chunk.size
    * bytes big and contains the remote file.
    *
    * Do something nice with it!
    *
    * You should be aware of the fact that at this point we might have an
    * allocated data block, and nothing has yet deallocated that data. So when
    * you're done with it, you should free() it as a nice application.
    */




    //return chunk.memory; // both seem to work (for now...)
    return std::string(chunk.memory);

    }


    bool libcurl_tools::close() {

    if(chunk.memory)
    free(chunk.memory);

    return 0;
    }

    *********************************** test.cpp which includes the main()


    #include <iostream>
    #include <string>
    #include <sstream>
    #include "libcurl_tools.h"


    //#include <stdio.h>

    using namespace std;


    int main()
    {

    libcurl_tools e();

    e.init();

    std::string chaine;
    std:: string s = "www.yahoo.com";


    chaine = e.perform(s);

    e.close();



    }

    I get 3 compilation errors on the calls to init(), perform() and
    close(), and I really don't know why. I hope you can help me...

    Thanks
     
    Choi, Nov 9, 2007
    #1

  2. Choi wrote:
    > Good morning.
    >
    > I've tried to extract, using libcurl, web pages but it failed. There
    > is no compilation error concerning the class I wrote, but the problems
    > appear when I compile a main method which calls this class.
    >
    > My class :
    >

    [snip]
    >
    > int main()
    > {
    >
    > libcurl_tools e();

    did you mean:
    libcurl_tools e;

    instead?
    >
    > e.init();
    >
    > std::string chaine;
    > std:: string s = "www.yahoo.com";
    >
    >
    > chaine = e.perform(s);
    >
    > e.close();
    >
    >
    >
    > }
    >
    > There are 3 compilation errors concerning the calls of init(),
    > perform() and close() methods, and I really don't know why it happens.
    > I wish you could help me guys...


    Might be helpful in the future if you could post the errors too...
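
    To illustrate the suggestion above: with empty parentheses, the line
    declares a function named e that takes no arguments and returns a
    libcurl_tools, so e has no members to call. A minimal sketch, using a
    hypothetical stand-in class (Fetcher and demo are made-up names, not
    part of the posted code):

```cpp
// Hypothetical stand-in for the class in this thread; not the real libcurl_tools.
class Fetcher {
public:
    Fetcher() : ready_(false) {}
    void init() { ready_ = true; }
    bool ready() const { return ready_; }
private:
    bool ready_;
};

bool demo()
{
    // Fetcher e();  // would declare a function e returning Fetcher,
    //               // so e.init() fails: e is not an object of class type
    Fetcher e;       // defines an object; the default constructor runs
    e.init();
    return e.ready();
}
```

    Uncommenting the first declaration in place of the second should
    produce a "request for member" style error on the call to init().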

    Alan
     
    Alan Woodland, Nov 9, 2007
    #2

  3. Choi

    Choi Guest

    In case it helps, the errors look like this:

    test.cpp:91: error: request for member 'init' in 'e', which is of
    non-class type 'libcurl_tools ()()'
     
    Choi, Nov 9, 2007
    #3
  4. Choi

    Choi Guest

    Thanks Alan, I changed libcurl_tools e(); to libcurl_tools e; and I
    hadn't spotted that stupid mistake! I also modified the class. Now it
    works!

    Thanks a lot
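
    The modified class is not shown in the thread. As one possible
    simplification, and only as a sketch, the write callback could append
    straight into a std::string instead of managing a realloc'd buffer.
    The names append_to_string and simulate_transfer are hypothetical; the
    callback keeps the same parameter shape as WriteMemoryCallback in the
    original post, and the helper calls it by hand so the example runs
    without libcurl:

```cpp
#include <cstddef>
#include <string>

// Same parameter shape as WriteMemoryCallback in the original post, but the
// user pointer is assumed to be a std::string* rather than a MemoryStruct*.
static std::size_t append_to_string(void *ptr, std::size_t size,
                                    std::size_t nmemb, void *data)
{
    const std::size_t realsize = size * nmemb;
    std::string *out = static_cast<std::string *>(data);
    out->append(static_cast<const char *>(ptr), realsize);
    return realsize;  // report every received byte as handled
}

// Hypothetical helper: feeds two chunks through the callback by hand,
// the way a transfer would deliver the body piece by piece.
std::string simulate_transfer()
{
    std::string body;
    char first[] = "<html>";
    char second[] = "</html>";
    append_to_string(first, 1, sizeof first - 1, &body);
    append_to_string(second, 1, sizeof second - 1, &body);
    return body;
}
```

    With libcurl, such a callback would be registered through
    CURLOPT_WRITEFUNCTION and the address of the string passed through
    CURLOPT_WRITEDATA, in the same way the original post passes &chunk.
    The string then owns the memory, so no free() is needed afterwards.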
     
    Choi, Nov 9, 2007
    #4
  5. Choi

    Choi Guest

    Thanks Alan, I changed libcurl_tools e(); to libcurl_tools e; and I
    hadn't spotted that stupid mistake! I also modified the class. Now it
    works!

    Thanks a lot
     
    Choi, Nov 12, 2007
    #5
