Fetch info from website and write to txt file.

Pitmairen · Mar 6, 2006

I want to make a program that gets info from a website and writes it
to a txt file.

I made this:

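import urllib.request

# Download the page; read() returns the raw HTML as bytes
f = urllib.request.urlopen("http://www.imdb.com/title/tt0407304/")
s = f.read()
# Write the bytes unchanged, so open the file in binary mode
k = open("test.txt", "wb")
k.write(s)
k.close()
f.close()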

That saves all the HTML code into the test.txt file. But what if I only
want, for example, the genre, plot outline and cast overview to be
written to the txt file? How can I do that?

I have another problem:

If the txt file I want to save the information in already has some
text in it, how can I insert the info from the website in the middle of
the text that was there before?

For example:

blablablablablablablabla
blablablablablablablabla
blablablablablablablabla
(insert info from website here)
blablablablablablablabla
blablablablablablablabla
blablablablablablablabla

Pitmairen

gene tani · Mar 6, 2006

To get a text file that looks like your web page, stripped of markup,
look at "lynx -dump" or "w3m -dump" (I think links2 does the same).
Otherwise, see:

http://groups.google.com/group/comp...arch+this+group&&_doneTitle=Back+to+Search&&d
http://groups.google.com/group/comp...=2&as_maxy=2005&&_doneTitle=Back+to+Search&&d
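
If a text browser is not available, the same idea (keep the text, drop
the markup) can be roughly approximated with Python's standard
html.parser module. This is only a sketch, assuming Python 3; the
TextExtractor class, the html_to_text helper and the sample markup are
made up for illustration, and the result will be much cruder than what
lynx or w3m produce.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of a page, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.parts.append(text)


def html_to_text(html):
    """Return the text fragments of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Made-up sample markup, not the real IMDb page.
sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Title</h1><p>Plot <b>outline</b></p></body></html>"
)
print(html_to_text(sample))
```

The string returned by html_to_text could then be written to test.txt
in place of the raw HTML.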

gene tani · Mar 6, 2006

Path of even less resistance:
http://imdbpy.sourceforge.net/

Dennis Lee Bieber · Mar 6, 2006

Pitmairen said:
That saves all the HTML code into the test.txt file. But what if I only
want, for example, the genre, plot outline and cast overview to be
written to the txt file? How can I do that?

Well, how would you do it by hand? Write down the steps you go
through to extract that information from your HTML file by hand... Clean
that up into a generalized algorithm... Write code that performs that
algorithm...

IOW: You're going to have to write code to parse the HTML (there may be
libraries available to help, but you still need to write the recognizer
for the parts you want).
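
As an illustration of such a recognizer, here is a minimal sketch using
only the standard html.parser module, assuming Python 3. It assumes each
item sits in a <div> that starts with an <h5> label; that layout, the
LabelledFieldParser class and the extract_fields helper are hypothetical,
so the tag names would have to be adapted after looking at the real
page source.

```python
from html.parser import HTMLParser


class LabelledFieldParser(HTMLParser):
    """Recognize '<h5>Label:</h5> value' pairs inside <div> blocks."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)  # labels we care about
        self.fields = {}           # label -> list of text fragments
        self._in_heading = False
        self._current = None       # label whose value is being read

    def handle_starttag(self, tag, attrs):
        if tag == "h5":
            self._in_heading = True
            self._current = None

    def handle_endtag(self, tag):
        if tag == "h5":
            self._in_heading = False
        elif tag == "div":
            self._current = None   # a closing div ends the current field

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_heading:
            label = text.rstrip(":")
            self._current = label if label in self.wanted else None
        elif self._current is not None:
            self.fields.setdefault(self._current, []).append(text)


def extract_fields(html, wanted):
    """Return {label: value text} for the wanted labels found in html."""
    parser = LabelledFieldParser(wanted)
    parser.feed(html)
    parser.close()
    return {label: " ".join(parts) for label, parts in parser.fields.items()}


# Made-up sample markup, not the real IMDb page.
sample = """
<div class="info"><h5>Genre:</h5> <a href="#">Action</a> / <a href="#">Drama</a></div>
<div class="info"><h5>Runtime:</h5> 120 min</div>
<div class="info"><h5>Plot Outline:</h5> A made-up summary.</div>
"""
print(extract_fields(sample, ["Genre", "Plot Outline"]))
```

The sketch does not handle nested <div> elements inside a field; a real
page would probably need a more careful recognizer.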

Pitmairen said:
I have another problem:

If the txt file I want to save the information in already has some
text in it, how can I insert the info from the website in the middle of
the text that was there before?

{I'm making enemies today}

Same answer... How would you do this by hand? Translate that
procedure to code.

Though I suspect, in this case, "by hand" would be to open the
entire file into memory (using notepad or some editor). Open the other
text into another memory-based editor. Select, copy, paste... But that
puts all the work of the insertion on the editor program (i.e., someone
else had to code the very thing you are asking about to make the editor
work).

Question: how do you identify /where/ to do the insert... By number
of lines, by some keyword, etc.?

http://cis.stvincent.edu/swd/extsort/extsort.html

Modify as needed (it assumes each "line" is a record to be
sorted/merged, while you want to merge on some arbitrary boundary)
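
For a file small enough to hold in memory, the two ways of locating the
insertion point mentioned above (by line number or by keyword) could
look roughly like this. It is a sketch assuming Python 3; the function
names and the sample lines are invented for illustration.

```python
def insert_at_line(lines, new_lines, line_number):
    """Return a new list with new_lines starting at 1-based line_number."""
    index = line_number - 1
    return lines[:index] + new_lines + lines[index:]


def insert_after_marker(lines, new_lines, marker):
    """Return a new list with new_lines placed after the first line
    that contains marker."""
    for i, line in enumerate(lines):
        if marker in line:
            return lines[:i + 1] + new_lines + lines[i + 1:]
    raise ValueError("marker not found: %r" % marker)


# Made-up file contents, as they would come back from readlines().
existing = ["bla 1\n", "bla 2\n", "bla 3\n", "bla 4\n"]
info = ["info from website\n"]

print(insert_at_line(existing, info, 3))
print(insert_after_marker(existing, info, "bla 2"))
```

Both calls put the new line between "bla 2" and "bla 3"; the resulting
list can be written back with writelines().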

Bruno Desthuilliers · Mar 6, 2006

Pitmairen wrote:

That saves all the HTML code into the test.txt file. But what if I only
want, for example, the genre, plot outline and cast overview to be
written to the txt file? How can I do that?

Seems like you want BeautifulSoup:
http://www.crummy.com/software/BeautifulSoup/
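
A minimal sketch of what that might look like, assuming Python 3 and the
current beautifulsoup4 package (imported as bs4). The markup below is
invented: it assumes each item sits in a <div> that starts with an <h5>
label, which may not match the real page, and extract_fields is a
hypothetical helper name.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

# Made-up sample markup, not the real IMDb page.
sample = """
<div class="info"><h5>Genre:</h5> <a href="#">Action</a> / <a href="#">Drama</a></div>
<div class="info"><h5>Runtime:</h5> 120 min</div>
<div class="info"><h5>Plot Outline:</h5> A made-up summary.</div>
"""


def extract_fields(html, wanted):
    """Return {label: value text} for each wanted <h5> label in html."""
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    for heading in soup.find_all("h5"):
        label = heading.get_text(strip=True).rstrip(":")
        if label in wanted:
            block = heading.parent
            heading.extract()  # drop the label so only the value remains
            fields[label] = block.get_text(" ", strip=True)
    return fields


print(extract_fields(sample, ["Genre", "Plot Outline"]))
```

The returned dictionary can then be formatted and written to the txt
file instead of the full page source.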

I have another problem:

If the txt file I want to save the information in already has some
text in it, how can I insert the info from the website in the middle of
the text that was there before?

For example:

blablablablablablablabla
blablablablablablablabla
blablablablablablablabla
(insert info from website here)
blablablablablablablabla
blablablablablablablabla
blablablablablablablabla

You need to be able to identify the place where you want to insert your
data. Then it's a matter of reading the original file, creating a temp
file, writing the lines before the insertion point, writing the data to
insert, writing the remaining lines, closing all files, and replacing
the original file with the temp file.
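
The steps above can be sketched as follows, assuming Python 3 and that
the insertion point is the first line containing some marker text. The
insert_into_file name and the marker convention are assumptions made for
this example.

```python
import os
import tempfile


def insert_into_file(path, data, marker):
    """Insert data after the first line of path that contains marker,
    going through a temp file that then replaces the original."""
    directory = os.path.dirname(os.path.abspath(path))
    inserted = False
    # Create the temp file next to the original so that os.replace
    # does not have to move it across filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as original:
            for line in original:
                tmp.write(line)  # lines before and after the insertion point
                if not inserted and marker in line:
                    if not line.endswith("\n"):
                        tmp.write("\n")
                    tmp.write(data if data.endswith("\n") else data + "\n")
                    inserted = True
        if not inserted:
            raise ValueError("marker not found: %r" % marker)
        os.replace(tmp_path, path)  # swap the temp file in for the original
    except BaseException:
        os.remove(tmp_path)  # leave the original untouched on failure
        raise
```

A call such as insert_into_file("test.txt", text, "some marker line")
would then place the fetched text right after the marker line and leave
the rest of the file as it was.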
