Using Python to visit web sites and save images of the pages to files

imx

Hi there,

I wonder whether Python can be used to simulate a real user doing the
following:
1) open a web site in a browser;
2) press Print Screen, to copy the image of the current active window to
the clipboard;
3) save the image to a file.

Any pointers would be appreciated!

Xiong
 
daftspaniel

imx said:
I wonder whether Python can be used to simulate a real user doing the
following:
1) open a web site in a browser;
2) press Print Screen, to copy the image of the current active window to
the clipboard;
3) save the image to a file.
Any pointers would be appreciated!
Xiong

Google pywinauto.

HTH

Davy
 
Michael Bentley

imx said:
I wonder whether Python can be used to simulate a real user doing the
following:
1) open a web site in a browser;
2) press Print Screen, to copy the image of the current active window to
the clipboard;
3) save the image to a file.

Any pointers would be appreciated!

Which OS?
 
Goldfish

You can definitely create a web bot with Python. It doesn't require
that you "drive" a real web browser. There are libraries to open web
pages, scrape their contents, and download files. That would make your
bot platform neutral. A script that drives a GUI browser risks being
brittle: it might not handle different browsers, different platforms,
or even different versions of the same browser.

I run a MediaWiki web site, and found a handy Python-based library
written to manage it, called pywikipediabot, at http://sourceforge.net/projects/pywikipediabot/.

Okay, this library won't do your legwork for you, but it has pieces
and parts that demonstrate how to use Python to surf a web site. Then,
with an HTML parser, you can hunt down images.

Greg
 
Paul Boddie

Goldfish said:
You can definitely create a web bot with Python. It doesn't require
that you "drive" a real web browser.

That's true, but if you want to print the page to a file, you need
something that can reproduce the intended layout. The Pyglet library
developers mention "XML/HTML+CSS" as something the layout engine can
deal with, which sounds quite impressive if its support of CSS is
comprehensive:

http://pyglet.org/

Paul
 
imx

Paul Boddie said:
That's true, but if you want to print the page to a file, you need
something that can reproduce the intended layout. The Pyglet library
developers mention "XML/HTML+CSS" as something the layout engine can
deal with, which sounds quite impressive if its support of CSS is
comprehensive:

http://pyglet.org/

Paul

Thanks for all the replies.
I will check pyglet to see if it can help.

The reason I want simulation rather than just crawling is that we have
to check the front pages of many web sites to see whether they conform
to our visual standard, e.g. that there is a search box in the top part
of the page. It's tedious work for a human. So I want to 'crawl and save
the visual presentation of the web site automatically', and check
these image files later by eye.
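
A minimal sketch of that crawl-and-save loop, under assumptions nobody in this thread has confirmed: it uses the third-party Selenium package with a locally installed Firefox and its driver, so it does drive a real browser. The names screenshot_filename and capture_front_pages, and the 1024x768 window size, are invented for the example.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def screenshot_filename(url):
    """Turn a URL into a safe PNG file name based on its host and path."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    # Replace anything that is not a letter, digit, dot or hyphen.
    safe = re.sub(r"[^A-Za-z0-9.-]+", "_", raw).strip("_")
    return (safe or "page") + ".png"


def capture_front_pages(urls, out_dir, width=1024, height=768):
    """Open each URL in Firefox and save a PNG of the browser viewport.

    Assumes Selenium and a Firefox driver are installed; the import is
    done here so the helper above works without them.
    """
    from selenium import webdriver  # third-party package

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    driver = webdriver.Firefox()
    try:
        # Same window size for every page, so the images are comparable.
        driver.set_window_size(width, height)
        for url in urls:
            driver.get(url)
            target = out_path / screenshot_filename(url)
            if driver.save_screenshot(str(target)):
                saved.append(target)
    finally:
        driver.quit()
    return saved
```

Something like capture_front_pages(["http://www.example.com/"], "shots") would leave shots/www.example.com.png ready for review. The images show only what fits in the window, which suits a check for a search box near the top of the page.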

-Xiong
 
daftspaniel

imx said:
The reason I want simulation rather than just crawling is that we have
to check the front pages of many web sites to see whether they conform
to our visual standard, e.g. that there is a search box in the top part
of the page. It's tedious work for a human. So I want to 'crawl and save
the visual presentation of the web site automatically', and check
these image files later by eye.

-Xiong

Hi Xiong,

I have been working on a program that does something very similar: it
generates thumbnails of web sites.

The code is in IronPython (which may put you off!) and would need to be
modified, or scripted with pywinauto, to deal with multiple images.

Let me know if it is of use to you and I will upload it.

Cheers,
Davy
 
imx

daftspaniel said:
Hi Xiong,

I have been working on a program that does something very similar: it
generates thumbnails of web sites.

The code is in IronPython (which may put you off!) and would need to be
modified, or scripted with pywinauto, to deal with multiple images.

Let me know if it is of use to you and I will upload it.

Cheers,
Davy

Cool, but does that mean I will need .NET to run the code?

Xiong
 
daftspaniel

imx said:
Cool, but does that mean I will need .NET to run the code?

Yep, though the .NET runtime is free, as is IronPython. My program is
under the BSD license.

Cheers,
Davy
 
