reading internet data to generate random numbers.

Levi Campbell

Hi, I'm working on a random number generator using the internet as a
way to gather entropy, I have two questions.

1. is there a way to capture the internet stream?
2. how would I skip every 2nd, 3rd, or 4th byte to protect privacy?
 
Fredrik Lundh

Levi said:
Hi, I'm working on a random number generator using the internet as a
way to gather entropy, I have two questions.

1. is there a way to capture the internet stream?

what's an internet stream?

</F>
 
Peter Hansen

Levi said:
Hi, I'm working on a random number generator using the internet as a
way to gather entropy, I have two questions.

1. is there a way to capture the internet stream?

What specifically do you mean by the term "internet stream" here?
Generally speaking, the internet is not "streamed" at all, but perhaps
you have some special meaning in mind that isn't in general use.

-Peter
 
Grant Edwards

Hi, I'm working on a random number generator using the internet as a
way to gather entropy, I have two questions.

1. is there a way to capture the internet stream?

What OS? What, exactly, do you want to capture?

2. how would I skip every 2nd, 3rd, or 4th byte to protect privacy?

2nd, 3rd, 4th, byte of what?

Doesn't your OS have an entropy-gathering RN generator built-in?
 
Roman Suzi

So far interesting.

Most news sites provide RSS and/or ATOM feeds these days.
Or maybe you mean video/audio stream from Internet stations?
(not sure how much entropy such a stream could contain: probably
depends on the genre ;-)

Or perhaps you mean a low-level Ethernet/TCP/IP "stream"? Then the idea is
not original, and I have already seen answers with recommendations.


Sincerely yours, Roman Suzi
 
Grant Edwards

Alternatively, if you want lots of high-quality random numbers, buy
a cheap web camera: http://www.lavarnd.org/.

The thermal noise present in a CCD sensor is a good source of
random bits, but I don't get what all the stuff about taking
"snapshots of a physical chaotic process" has to do with it.
Using data from the Internet is just a bad idea.

I think that the timing of certain network events is one of the
Linux kernel's entropy sources.
 
Mike Meyer

Grant Edwards said:
I think that the timing of certain network events is one of the
Linux kernel's entropy sources.

BSD as well. The key word is "one". While network events don't make a
good source of random data, properly combining such sources can create
good random data. Randomness is a deep subject. You should use a
library built by experts (and appropriate for your application) rather
than try and build one yourself. Most modern Unix systems have a
/dev/random that qualifies for a lot of applications.
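
For the Python case, a minimal sketch of leaning on the operating system's generator instead of building one. It assumes a modern Python 3 (the secrets module is newer than this thread); on Unix-like systems os.urandom is backed by the kernel's generator, the same one behind /dev/random and /dev/urandom. The variable names are only illustrative.

```python
import os
import secrets

# Ask the operating system's generator for 16 raw bytes.
raw = os.urandom(16)

# secrets wraps the same OS source with convenience helpers.
token = secrets.token_hex(16)         # 32 hexadecimal characters
die_roll = secrets.randbelow(6) + 1   # integer in the range 1..6

print(raw.hex(), token, die_roll)
```

Whether this is "appropriate for your application" still depends on the application; for simulations the plain random module is usually enough.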

<mike
 
Steven D'Aprano

Mike said:
BSD as well. The key word is "one". While network events don't make a
good source of random data, properly combining such sources can create
good random data.

<pedant>

Depends on what you mean by "random". In particular,
the randomness of network events does not follow a
uniform distribution, but then not many things do.
Uniformly distributed random data is what you want for
cryptography. If you are modelling physical events, you
might want some other distribution, e.g. normal (bell
curve), Poisson, exponential, binomial, geometric,
hypergeometric, and so forth.
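
To make the point about distributions concrete, here is a small sketch using Python's standard random module. Note the assumptions: the seed, the parameters, and the binomial helper are all made up for illustration, and the random module is a Mersenne Twister, so it suits modelling but not cryptography.

```python
import random

rng = random.Random(12345)  # seeded only so the sketch is repeatable

uniform_sample = rng.random()              # uniform on [0.0, 1.0)
normal_sample = rng.gauss(0.0, 1.0)        # normal: mean 0, std dev 1
exponential_sample = rng.expovariate(2.0)  # exponential: rate 2, mean 0.5

def binomial(n, p, rng):
    """Count successes in n independent trials, each succeeding with probability p."""
    return sum(rng.random() < p for _ in range(n))

binomial_sample = binomial(10, 0.3, rng)

print(uniform_sample, normal_sample, exponential_sample, binomial_sample)
```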

I have no idea what distribution data from the Internet
would have, I would imagine it is *extremely*
non-uniform and *very* biased towards certain values
(lots of "<" and ">" I bet, and relatively few "\x03").
But, for the sake of the argument, if that's the random
distribution that you actually need, then the Internet
would be a good source of randomness.
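
A rough way to check that guess is to count byte values and compute the Shannon entropy of the histogram. This is only a sketch: the sample below is a made-up HTML fragment standing in for captured traffic, and the function name is my own.

```python
import math
from collections import Counter

def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value histogram, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical sample standing in for bytes read off the network.
sample = b"<html><head><title>News</title></head><body><p>Hello</p></body></html>"

counts = Counter(sample)
print(counts[ord("<")], counts[ord(">")], counts[3])  # counts of "<", ">", "\x03"
print(shannon_entropy_bits_per_byte(sample))
```

A result well below 8 bits per byte shows the bias. A result near 8 would not prove the data is unpredictable, since the histogram ignores the order of the bytes.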

<\pedant>

Just not for encryption. It would be terrible for that.


Randomness is a deep subject.

This is certainly true. I love the Dilbert cartoon
where Dilbert is on a tour of Accounting. He comes
across a troll sitting at a desk chanting "Nine, nine,
nine, nine, ...". His guide says, "This is our random
number generator." Dilbert looks skeptical and asks
"Are you sure that's random?", to which the guide
answers "That's the trouble with randomness, you can
never be sure."
 
Robert Kern

Steven said:
Mike Meyer wrote:

<pedant>

Depends on what you mean by "random". In particular,
the randomness of network events does not follow a
uniform distribution, but then not many things do.
Uniformly distributed random data is what you want for
cryptography. If you are modelling physical events, you
might want some other distribution, e.g. normal (bell
curve), Poisson, exponential, binomial, geometric,
hypergeometric, and so forth.

I have no idea what distribution data from the Internet
would have, I would imagine it is *extremely*
non-uniform and *very* biased towards certain values
(lots of "<" and ">" I bet, and relatively few "\x03").
But, for the sake of the argument, if that's the random
distribution that you actually need, then the Internet
would be a good source of randomness.

No, it works just fine as a source of randomness. It does not work as a
stream of uniform random bytes, which is a different thing altogether
(and to be fair, Mike made that distinction fairly clearly). It's
perfectly good as one of many sources to draw on to rekey a
cryptographically strong PRNG, though. Cf.
http://en.wikipedia.org/wiki/Fortuna_(PRNG)
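
To illustrate the "one of many sources" idea, here is a toy sketch that hashes several inputs into one pool and derives a key from it. It is emphatically not Fortuna (no multiple pools, no reseed schedule, no generator); the class name, the source ids, and the sample inputs are all assumptions made for the example.

```python
import hashlib
import os
import time

class EntropyPool:
    """Toy pool: hashes every contribution into a running SHA-256 state."""

    def __init__(self):
        self._hash = hashlib.sha256()

    def add(self, source_id: int, data: bytes) -> None:
        # Prefix each input with a source id and a length so that
        # separate contributions cannot run together ambiguously.
        self._hash.update(bytes([source_id & 0xFF]))
        self._hash.update(len(data).to_bytes(4, "big"))
        self._hash.update(data)

    def derive_key(self) -> bytes:
        """Return a 32-byte key from everything added so far."""
        return self._hash.copy().digest()

pool = EntropyPool()
pool.add(0, os.urandom(32))                              # OS generator
pool.add(1, time.perf_counter_ns().to_bytes(8, "big"))   # a timing sample
pool.add(2, b"bytes seen on the network")                # biased, low-quality input
key = pool.derive_key()
print(key.hex())
```

The design point is that a biased input cannot make the key worse than the best input already mixed in, as long as the hash holds up; a real design would hand the key to a vetted generator rather than use it directly.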

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Grant Edwards

<pedant>

Depends on what you mean by "random". In particular,
the randomness of network events does not follow a
uniform distribution, but then not many things do.

One presumes there is a way to "uniformize" the events, but I'm
just guessing.
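
One classic way to do that is von Neumann's debiasing trick: read bits in pairs, emit the first bit when the two differ, and discard the pair when they match. The sketch below assumes the input bits are independent with a fixed bias, which real network events may well not be; the function name is my own.

```python
def von_neumann_debias(bits):
    """Turn independent, identically biased bits into unbiased ones.

    Bits are consumed in non-overlapping pairs: (0, 1) yields 0,
    (1, 0) yields 1, and equal pairs are thrown away. A trailing
    unpaired bit is ignored.
    """
    out = []
    it = iter(bits)
    for first, second in zip(it, it):
        if first != second:
            out.append(first)
    return out

biased = [0, 1, 1, 1, 1, 0, 0, 0, 1, 0]
print(von_neumann_debias(biased))  # [0, 1, 1]
```

The cost is throughput: the more biased the input, the more pairs get discarded.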

[...]
I have no idea what distribution data from the Internet would
have, I would imagine it is *extremely* non-uniform and *very*
biased towards certain values (lots of "<" and ">" I bet, and
relatively few "\x03").

I've never heard of anybody using the data as source of
entropy. All the entropy gathering I've read about used the
timing of network events, not the user-data associated with
those events.
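
As a rough sketch of the timing approach, the code below keeps only the low byte of each gap between consecutive timestamps and hashes the result. Everything here is an assumption for illustration: the function names are invented, and time.sleep(0) merely stands in for waiting on a real network event. It says nothing about how any kernel actually does it.

```python
import hashlib
import time

def timing_samples(n):
    """Record n timestamps in nanoseconds; a stand-in for event arrival times."""
    stamps = []
    for _ in range(n):
        time.sleep(0)  # hypothetical placeholder for "wait for the next packet"
        stamps.append(time.perf_counter_ns())
    return stamps

def condense_timings(stamps):
    """Hash the low byte of each inter-arrival gap into a 32-byte digest."""
    h = hashlib.sha256()
    for earlier, later in zip(stamps, stamps[1:]):
        h.update(bytes([(later - earlier) & 0xFF]))
    return h.digest()

print(condense_timings(timing_samples(64)).hex())
```

Note that the payload of the events never enters the calculation, which also sidesteps the privacy worry in the original question.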
 
Steven D'Aprano

I've never heard of anybody using the data as source of
entropy. All the entropy gathering I've read about used the
timing of network events, not the user-data associated with
those events.


Me neither, but the original poster did ask how to read every nth byte
of "the Internet stream", so I assumed he had something like that in mind.
 
Grant Edwards

Me neither, but the original poster did ask how to read every nth byte
of "the Internet stream", so I assumed he had something like that in mind.

I agree that would be a pretty bad idea unless you went to some
effort to reduce the bias in the distribution of the value of
data bytes.
 
Peter Hansen

Steven said:
Me neither, but the original poster did ask how to read every nth byte
of "the Internet stream", so I assumed he had something like that in mind.

And to think that if you'd just waited for the OP to explain what the
heck he meant by "the Internet stream", you'd have saved ever so much
time. ;-)

(But then, if we always did that Usenet wouldn't be any fun.)

-Peter
 
Grant Edwards

And to think that if you'd just waited for the OP to explain
what the heck he meant by "the Internet stream", you'd have
saved ever so much time. ;-)

(But then, if we always did that Usenet wouldn't be any fun.)

That's for sure. The real questions are rarely as interesting
as the imagined ones.
 
Steven D'Aprano

And to think that if you'd just waited for the OP to explain what the
heck he meant by "the Internet stream", you'd have saved ever so much
time. ;-)

Has he done so yet? I can't see it anywhere.
 
