How to get the values saved in JavaScript variables?

Ben

Hi, I work like a robot today. My job is to visit a webpage, copy
several numbers, paste them to a text file. Then another webpage......
There are about 100 URLs. I decided to download all webpages and
process those files to extract the numbers.

I used a program called url2file to download webpages. However, the
numbers I need to extract are not there. I got something like

<script>document.write(v1)</script>

where v1 holds the number.

Is it possible to get the values in javascript variables without manual
work?

Thank you.

Ben
 
Guillermo Rauch

Ben said:
Hi, I work like a robot today. My job is to visit a webpage, copy
several numbers, paste them to a text file. Then another webpage......
There are about 100 URLs. I decided to download all webpages and
process those files to extract the numbers.

I used a program called url2file to download webpages. However, the
numbers I need to extract are not there. I got something like

<script>document.write(v1)</script>

where v1 holds the number.

Is it possible to get the values in javascript variables without manual
work?

Thank you.

Ben

I doubt you'll be able to get the numbers out of the JavaScript unless
they're stored in a variable, or perhaps by redefining document.write().
Anyhow, to automate the task you describe, I'd recommend writing a
PHP/bash/Perl/Ruby/... script that parses each file with a regular
expression and extracts the value between ( ).

For that, I guess you'll get more help on another list.

Best,
- Guillermo.
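
As a rough illustration of the regular-expression approach suggested above, here is a minimal sketch in JavaScript. The helper names findWrittenExpressions and findAssignedValue are hypothetical, and the sketch assumes the variable is assigned a simple literal somewhere in the same downloaded file. If the value is computed, or lives in an external .js file, this will not find it.

```javascript
// Hypothetical helper: list whatever is passed to document.write(...).
function findWrittenExpressions(html) {
  const re = /document\.write\(([^)]*)\)/g;
  const found = [];
  let m;
  while ((m = re.exec(html)) !== null) {
    found.push(m[1].trim());
  }
  return found;
}

// Hypothetical helper: look for a simple literal assignment such as
// `var v1 = 42;` or `v1 = "3.14";` and return the literal as a string.
function findAssignedValue(html, name) {
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const re = new RegExp(
    "(?:^|[^\\w$])" + escaped + "\\s*=(?!=)\\s*([\"']?)([^\"';\\n]+)\\1\\s*;"
  );
  const m = re.exec(html);
  return m ? m[2].trim() : null;
}

// Made-up sample page, only for demonstration.
const page =
  '<script>var v1 = 42;</script>\n' +
  '<p>Total: <script>document.write(v1)</script></p>';
const names = findWrittenExpressions(page);
console.log(names[0], "=", findAssignedValue(page, names[0]));
```

Note that the first step only yields the expression inside the parentheses (here the name v1); the second lookup is what recovers the number.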
 
Martin Honnen

Ben said:
I used a program called url2file to download webpages. However, the
numbers I need to extract are not there. I got something like

<script>document.write(v1)</script>

where v1 holds the number.

Is it possible to get the values in javascript variables without manual
work?

IE on Windows can be automated with script so you could write a script
to fire up IE, load a URL, read out a value, load the next URL.

Another way might be to use HTTPUnit <http://www.httpunit.org/>.
 
Ben

I am working on a project to collect data from websites. Server-side
scripting is not possible.

There would be no problem if the data were returned in plain HTML files.
For JavaScript variables, is there any way to simulate a web browser so
that it interprets the code in the downloaded files, and we can add some
code to write the values out to a text file?
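
One way the simulation asked about here might look is sketched below. It assumes Node.js and its built-in vm module, neither of which is mentioned in this thread, and the helper name captureDocumentWrites is hypothetical. It runs the inline scripts of a saved page in a sandbox where document.write only records its arguments. It does not load external script files, and scripts that need a real browser will simply fail.

```javascript
const vm = require("vm");

// Hypothetical helper: run each inline <script> block of a saved page in
// one shared sandbox and record what is passed to document.write.
function captureDocumentWrites(html) {
  const written = [];
  const errors = [];
  const sandbox = {
    document: {
      write: function (...args) {
        written.push(args.join(""));
      },
    },
  };
  vm.createContext(sandbox);

  const scriptRe = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  let m;
  while ((m = scriptRe.exec(html)) !== null) {
    try {
      // Globals declared with var by earlier scripts stay visible to later ones.
      vm.runInContext(m[1], sandbox, { timeout: 1000 });
    } catch (err) {
      // Scripts relying on window, the DOM, etc. end up here.
      errors.push(err.message);
    }
  }
  return { written: written, errors: errors };
}

// Made-up sample page, only for demonstration.
const result = captureDocumentWrites(
  "<script>var v1 = 42;</script><p><script>document.write(v1)</script></p>"
);
console.log(result.written);
```

The vm module is not a security boundary, so this is only reasonable for pages you trust.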
 
Evertjan.

Ben wrote on 02 jan 2007 in comp.lang.javascript:
Hi, I work like a robot today. My job is to visit a webpage, copy
several numbers, paste them to a text file. Then another webpage......
There are about 100 URLs. I decided to download all webpages and
process those files to extract the numbers.
[....]

Is it possible to get the values in javascript variables without manual
work?


Certainly.

Storing the whole pages is not necessary.

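Write a JavaScript programme for MS-Cscript or MS-Wscript, using the
following function:

// Synchronous GET via the MSXML ActiveX object (Windows Script Host only).
var http = new ActiveXObject("Msxml2.XMLHTTP");
function getUrl(url) {
    http.open("GET", url, false); // false = synchronous request
    http.send();
    return http.responseText;
}

and process the incoming string using a regular expression.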

Multiple pages can be searched in one go.

You can even append the resulting string values to a local file, adding
date and time stamps as you go, since you are not restricted by browser
security.

You could schedule such a little programme as a daily task and go on
holiday in an area without internet for a month, after telling your
neighbor how to reset the PC after a crash.
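
The "multiple pages in one go, with time stamps" idea could be sketched roughly as follows. This is written in present-day JavaScript (for example Node.js) rather than the JScript of Windows Script Host, and the helper name collectValues is hypothetical. The fetching function is passed in by the caller, so a synchronous getUrl like the one above, or any replacement, can be plugged in.

```javascript
// Hypothetical helper: fetch each URL with a caller-supplied synchronous
// getUrl(url) function, take the first capture group of `pattern`, and
// build one tab-separated, timestamped line per page.
function collectValues(urls, getUrl, pattern, now) {
  const stamp = (now || new Date()).toISOString();
  return urls.map(function (url) {
    pattern.lastIndex = 0; // keep a /g pattern from carrying state between pages
    const match = pattern.exec(getUrl(url));
    const value = match ? match[1] : "NOT FOUND";
    return stamp + "\t" + url + "\t" + value;
  });
}

// Made-up pages standing in for real downloads.
const fakePages = {
  "http://example.com/a": "<script>var v1 = 17;</script>",
  "http://example.com/b": "<p>nothing here</p>",
};
const lines = collectValues(
  Object.keys(fakePages),
  function (url) { return fakePages[url]; },
  /var v1\s*=\s*(\d+)/,
  new Date(0)
);
console.log(lines.join("\n"));
```

The resulting lines can then be appended to a local file, for instance with fs.appendFileSync in Node.js, or with Scripting.FileSystemObject under Windows Script Host.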
 
Ben

Martin Honnen said:
IE on Windows can be automated with script so you could write a script
to fire up IE, load a URL, read out a value, load the next URL.

This is the approach I am looking for. I searched for IE automation and
got tons of links, but I don't know which one is closest to what I need.
I will start with the MSDN forums...
 
zero0x

Have you tried wget? It is very easy to use, and it can download the
pages from a list of URLs into a single file.
 
Ted Zlatanov

Ben said:
Hi, I work like a robot today. My job is to visit a webpage, copy
several numbers, paste them to a text file. Then another webpage......
There are about 100 URLs. I decided to download all webpages and
process those files to extract the numbers.

I used a program called url2file to download webpages. However, the
numbers I need to extract are not there. I got something like

<script>document.write(v1)</script>

where v1 holds the number.

This should work on most OSs:

curl URL | perl -ne'm/document\.write\((.*)\)/ && print "$1\n"'

It prints everything between the parentheses of document.write(...)
and nothing else, separating the values with newlines.

If you are on Windows, you may want to try Cygwin, which will let you
run the command above easily (as long as you've installed Perl and curl).

Ted
 
