HTML2JPEG

Discussion in 'Perl Misc' started by katja, Dec 7, 2004.

  1. katja

    katja Guest

    How can I convert an Internet-page to a PDF/JPEG-file from a
    Perl-script under Linux? (I would like to make a screenshot from a
    command line).
    Thank you
    Katja
    katja, Dec 7, 2004
    #1

  2. katja wrote:
    > How can I convert an Internet-page to a PDF/JPEG-file from a
    > Perl-script under Linux? (I would like to make a screenshot from a
    > command line).
    > Thank you
    > Katja


    Since every browser on every platform renders WWW pages differently for
    every user, it is unclear whether your task is well-defined, so there may
    be no single correct answer. You can easily download the contents of a web
    page as a text file. Converting documents on Linux is often done with the
    wv suite (wvText, wvPDF et al.; the wv tools themselves are aimed at MS
    Word documents), which drives installed converters such as latex, distill
    and others through a shell script wrapper.
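
    The download step mentioned above can be sketched as follows. This is
    Python rather than Perl, purely for illustration. The helper names
    fetch_text and save_page and the UTF-8 fallback are assumptions, not
    something from this thread. Note that this only retrieves the markup; it
    does not render anything.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=30):
    """Download a page and return its body decoded to text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset the server declares; falling back to UTF-8 is an assumption.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def save_page(url, path):
    """Write the downloaded markup to a local file and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(fetch_text(url))
    return path
```

    For example, save_page("http://example.com/", "page.html") would leave
    the raw HTML in page.html, ready for whatever converter is installed.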

    --
    Roman M. Parparov - NASA EOSDIS project node at TAU technical manager.
    Email: http://www.nasa.proj.ac.il/
    Phone/Fax: +972-(0)3-6405205 (work), +972-(0)50-734-18-34 (home)
    ----------------------------------------------------------------------
    The economy depends about as much on economists as the weather does on
    weather forecasters.
    -- Jean-Paul Kauffmann
    Roman M. Parparov, Dec 7, 2004
    #2

  3. *** Roman M. Parparov, 07-Dec-04, 14:06 h ***

    > katja wrote:
    > > How can I convert an Internet-page to a PDF/JPEG-file from a
    > > Perl-script under Linux? (I would like to make a screenshot from a
    > > command line).
    > > Thank you
    > > Katja

    >
    > Since every browser for every platform for every user renders the WWW
    > pages differently, it is unclear whether your task is well-defined, thus
    > an answer should not be possible. You can download the contents of a web
    > page as a text file easily. Converting text files on Linux is often
    > done using wv suite (wvText, wvPDF et al.) which operates installed
    > converting programs like latex, distill and others through a shell script
    > wrapper.


    Thanks for the hint, and sorry, yes, my question was too vague. What I wanted to
    ask is: Is there a perlish way to save the contents of a web page to an image
    file exactly the way a browser (any recent browser) would render it? I'd like to
    pass my script a URL and automatically get a kind of screenshot of the web
    page. There doesn't seem to be a Perl module (at least not on CPAN) for that
    purpose, and as far as I could find out, Mozilla, Firefox, Galeon, Konqueror and
    Opera don't support any command line options to have them print to a file.

    Thank you,
    Katja
    Katja Zinchenko, Dec 8, 2004
    #3
  4. henq

    henq Guest

    Would http://freshmeat.net/projects/htmldoc/ help?

    Regards,

    Henk

    www.windsurfpedia.com
    henq, Dec 9, 2004
    #4
  5. Katja Zinchenko wrote:

    > Thanks for the hint, and sorry, yes, my question was too vague. What I wanted to
    > ask is: Is there a perlish way to save the contents of a web page to an image
    > file exactly the way a browser (any recent browser) would render it? I'd like to
    > pass my script a url and automatically get a kind of screen shot of the web
    > page. There doesn't seem to be a perl module (at least not on CPAN) to that
    > purpose, and as far as I could find out, mozilla, firefox, galeon, konqueror and
    > opera don't support any command line options to have them print to a file.


    > Thank you,
    > Katja


    But even the same browser would render it differently for different users.
    Keep in mind, too, that most sites do not fit within a single browser
    window, which is about 1024x768 pixels in size.

    The only way that comes to mind, as an abstract outline:
    1) Launch the browser on the requested URL from a Perl script.
    Your browser and whatever it displays are an X11 window within your
    window manager/desktop environment.

    2) Use the Perl library corresponding to your window manager, or even an
    X11 Perl interface, to capture the window into an image.

    OR

    2) Use the scriptlets supplied with your window manager (some have them)
    that do the window capture. AFAIK AfterStep has this ability. I am not sure,
    but the X11 package may include a program that takes a shot of the full
    desktop. Then, knowing your browser's X11 geometry, you can crop out the
    browser itself.

    The image would usually be saved in the XWD format.

    3) Use a Perl image conversion library (or an external application)
    to convert the XWD file to JPG or whatever.

    Note: all this might be done with a shell script and existing Linux
    utilities, without Perl.
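
    The outline above could be sketched roughly like this, in Python rather
    than Perl or shell, purely for illustration. It assumes a running X
    display, that xwd (from the X11 utilities) and ImageMagick's convert are
    installed, and that the browser command is "firefox". The function names,
    the fixed wait, and the default file names are hypothetical. It takes the
    full-desktop route and crops by geometry, since that avoids having to
    look up the browser's window ID.

```python
import subprocess
import time


def capture_commands(url, xwd_path, jpg_path, geometry=None, browser="firefox"):
    """Build the three argv lists: launch browser, dump the root window, convert."""
    launch = [browser, url]
    # xwd -root dumps the whole desktop; -silent suppresses the bell.
    dump = ["xwd", "-root", "-silent", "-out", xwd_path]
    convert = ["convert", xwd_path]
    if geometry:
        # geometry is WIDTHxHEIGHT+X+Y of the browser window, e.g. "1024x768+0+0".
        convert += ["-crop", geometry, "+repage"]
    convert.append(jpg_path)
    return launch, dump, convert


def capture(url, xwd_path="page.xwd", jpg_path="page.jpg", geometry=None, wait=10):
    """Run the pipeline; the fixed wait for rendering is a crude assumption."""
    launch, dump, convert = capture_commands(url, xwd_path, jpg_path, geometry)
    browser = subprocess.Popen(launch)
    try:
        time.sleep(wait)  # give the page time to load and paint
        subprocess.run(dump, check=True)
    finally:
        browser.terminate()
    subprocess.run(convert, check=True)
    return jpg_path
```

    Only what is visible on screen is captured, so a page taller than the
    browser window would still be cut off at the bottom.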

    Roman M. Parparov, Dec 9, 2004
    #5
  6. Katja Zinchenko wrote:
    > Thanks for the hint, and sorry, yes, my question was too vague. What
    > I wanted to ask is: Is there a perlish way to save the contents of a
    > web page to an image file exactly the way a browser (any recent
    > browser) would render it?


    Of course. And it is so trivial that you don't even need Perl.
    Just use Lynx. It has an option to write the rendered text to a file:

    -dump
    dumps the formatted output of the default document or one specified on the
    command line to standard out.

    For further details, see
    http://lynx.isc.org/current/lynx2-8-5/lynx_help/Lynx_users_guide.html
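
    Calling lynx with -dump from a script might look like the sketch below,
    written in Python purely for illustration. It assumes lynx is installed
    and on the PATH; the function names are hypothetical. The result is the
    page as formatted plain text, not an image.

```python
import subprocess


def lynx_dump_command(url):
    """Return the argv list for a plain-text dump of the page."""
    return ["lynx", "-dump", url]


def dump_to_file(url, path):
    """Run lynx and write its standard output (the formatted text) to path."""
    result = subprocess.run(
        lynx_dump_command(url), capture_output=True, text=True, check=True
    )
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
    return path
```

    For example, dump_to_file("http://example.com/", "page.txt") would store
    the text rendering in page.txt.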

    jue
    Jürgen Exner, Dec 9, 2004
    #6
  7. Jürgen Exner wrote:
    > Katja Zinchenko wrote:
    > > Thanks for the hint, and sorry, yes, my question was too vague. What
    > > I wanted to ask is: Is there a perlish way to save the contents of a
    > > web page to an image file exactly the way a browser (any recent
    > > browser) would render it?


    > Of course. And it is so trivial that you don't even need Perl.
    > Just use Lynx. It has an option to write the rendered text to a file:


    > -dump
    > dumps the formatted output of the default document or one specified on the
    > command line to standard out.


    I suspect Katja wants to take screenshots of specific pages _including_
    graphics, so lynx isn't sufficient.

    > Further details see
    > http://lynx.isc.org/current/lynx2-8-5/lynx_help/Lynx_users_guide.html


    > jue




    Roman M. Parparov, Dec 9, 2004
    #7
  8. Roman M. Parparov wrote:
    > Jürgen Exner wrote:
    >> Katja Zinchenko wrote:
    >>> Thanks for the hint, and sorry, yes, my question was too vague. What
    >>> I wanted to ask is: Is there a perlish way to save the contents of a
    >>> web page to an image file exactly the way a browser (any recent
    >>> browser) would render it?

    >
    >> Of course. And it is so trivial that you don't even need Perl.
    >> Just use Lynx. It has an option to write the rendered text to a file:

    >
    >> -dump
    >> dumps the formatted output of the default document or one specified
    >> on the command line to standard out.

    >
    > I suspect Katja wants to take screenshots of specific pages
    > _including_ graphics, so lynx isn't sufficient.


    <quote>
    the way [...] any recent browser would render it
    </quote>

    Anything beyond that is pure speculation.

    jue
    Jürgen Exner, Dec 9, 2004
    #8
