Re: Bayesian kids content filtering in Python?

Discussion in 'Python' started by John J. Lee, Aug 30, 2003.

  1. John J. Lee

    John J. Lee Guest

    (John J. Lee) writes:

    > "Paul Paterson" <> writes:

    [...]
    > censor, and you're not going to block them all. It may work well most
    > of the time, but is that enough? What's needed here, perhaps, is an
    > open effort to train on categories of things that people would like to
    > block. That might be enough, since I suppose *most* things you're

    [...]

    Thinking a bit more, that might well fail. It assumes that our
    high-level categories of things we want to block line up in a simple
    way with the workings of the algorithm, which is very doubtful when
    the set of things to filter is no longer highly restricted as in the
    email case. Though some censorship targets probably *are* spammishly
    predictable and unimaginative, no doubt lots aren't, too.
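
    To make "the workings of the algorithm" concrete, here is a minimal
    sketch of a token-counting naive Bayes scorer of the general kind used
    for spam. It is not the code of any particular filter; the toy training
    pages, the 'block'/'allow' labels and the function names are all
    assumptions made up for illustration.

```python
import math
from collections import Counter

# Hypothetical toy training set: (page text, label).
TRAINING = [
    ("win cash casino bonus now", "block"),
    ("casino bonus free spins", "block"),
    ("python tutorial for kids", "allow"),
    ("kids science homework help", "allow"),
]

def train(docs):
    """Count tokens per label and documents per label."""
    counts = {"block": Counter(), "allow": Counter()}
    n_docs = Counter()
    for text, label in docs:
        counts[label].update(text.lower().split())
        n_docs[label] += 1
    return counts, n_docs

def block_score(text, counts, n_docs):
    """Log-odds that text belongs to 'block'; positive leans towards blocking."""
    vocab_size = len(set(counts["block"]) | set(counts["allow"]))
    total_block = sum(counts["block"].values())
    total_allow = sum(counts["allow"].values())
    score = math.log(n_docs["block"] / n_docs["allow"])  # prior odds
    for tok in text.lower().split():
        # Add-one (Laplace) smoothing so unseen tokens don't give log(0).
        p_block = (counts["block"][tok] + 1) / (total_block + vocab_size)
        p_allow = (counts["allow"][tok] + 1) / (total_allow + vocab_size)
        score += math.log(p_block / p_allow)
    return score

counts, n_docs = train(TRAINING)
familiar = block_score("free casino bonus", counts, n_docs)
unfamiliar = block_score("poker night tournament", counts, n_docs)
```

    With this toy data, `familiar` comes out clearly positive, while
    `unfamiliar` sits close to zero: a page from the same high-level
    category but with vocabulary the filter has never seen gives it almost
    no evidence either way.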

    I'm reminded of the experience of a military research project using
    neural networks to recognise tanks in aerial photographs. They got
    someone to go and take photos of tanks and other large tank-like
    objects partially hidden in forested terrain, and trained their
    network on a fraction of the photos. When they tested the network on
    the rest of the photographs, they were delighted to discover that it
    performed fantastically well, despite the great variability in the
    appearance of the objects and terrain, distinguishing tank from
    non-tank almost perfectly. Inexplicably, though, when they fed a very
    similar set of photos to the same network, it failed miserably. It turned
    out that what the network had *really* trained on was not at all what
    they'd assumed. To take the photos, they'd gone out and scattered the
    real tanks over the landscape, and taken a set of tank photos. Then
    they'd moved the tanks out and put the mock-tanks in, and taken a set
    of mock-tank photos. Of course, that meant all the tank-photos were
    taken in bright light, and all the mock-tank photos were in dim light
    of late afternoon. The neural network sensibly picked up that the
    easy way to tell the two apart was just to look at how bright they
    were!
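
    The failure mode in that story can be reproduced with a toy simulation.
    This is only a sketch under made-up assumptions: each "photo" is reduced
    to two invented numbers (a noisy shape cue and a brightness value), and
    a one-feature threshold rule stands in for the neural network.

```python
import random

def make_photos(n, lighting_tracks_label, rng):
    """Return a list of ((shape, brightness), is_tank) pairs."""
    photos = []
    for _ in range(n):
        is_tank = rng.random() < 0.5
        # Shape cue: genuinely related to the label, but noisy.
        shape = (1.0 if is_tank else 0.0) + rng.gauss(0, 0.8)
        if lighting_tracks_label:
            # First shoot: tanks in bright light, mock-tanks in dim light.
            brightness = (0.8 if is_tank else 0.3) + rng.gauss(0, 0.05)
        else:
            # Later shoot: lighting unrelated to what is in the photo.
            brightness = rng.uniform(0.2, 0.9)
        photos.append(((shape, brightness), is_tank))
    return photos

def accuracy(photos, feature, threshold):
    """Fraction of photos where 'feature > threshold' matches the label."""
    hits = sum((x[feature] > threshold) == y for x, y in photos)
    return hits / len(photos)

def fit_stump(photos):
    """Pick the single (feature, threshold) with the best training accuracy."""
    best = None
    for feature in (0, 1):
        for x, _ in photos:
            acc = accuracy(photos, feature, x[feature])
            if best is None or acc > best[0]:
                best = (acc, feature, x[feature])
    return best[1], best[2]

rng = random.Random(0)
first_shoot = make_photos(200, True, rng)
train_set, test_set = first_shoot[:100], first_shoot[100:]
second_shoot = make_photos(200, False, rng)

feature, threshold = fit_stump(train_set)
held_out_acc = accuracy(test_set, feature, threshold)
second_acc = accuracy(second_shoot, feature, threshold)
print(feature, held_out_acc, second_acc)
```

    The rule latches onto brightness (feature 1) because it separates the
    first shoot perfectly, so the held-out photos from that same shoot look
    fine too. On the second shoot, where lighting no longer tracks the
    label, accuracy falls to roughly coin-flipping.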

    I made the details of that story up, but who cares ;-)


    John
     
    John J. Lee, Aug 30, 2003
    #1

  2. Marc Wilson

    Marc Wilson Guest

    In comp.lang.python, (John J. Lee) wrote in
    <>::

    | (John J. Lee) writes:
    |
    |> "Paul Paterson" <> writes:
    |[...]
    |> censor, and you're not going to block them all. It may work well most
    |> of the time, but is that enough? What's needed here, perhaps, is an
    |> open effort to train on categories of things that people would like to
    |> block. That might be enough, since I suppose *most* things you're
    |[...]
    |
    |Thinking a bit more, that might well fail. It assumes that our
    |high-level categories of things we want to block line up in a simple
    |way with the workings of the algorithm, which is very doubtful when
    |the set of things to filter is no longer highly restricted as in the
    |email case. Though some censorship targets probably *are* spammishly
    |predictable and unimaginative, no doubt lots aren't, too.
    |
    |I'm reminded of the experience of a military research project using
    |neural networks to recognise tanks in aerial photographs. They got
    |someone to go and take photos of tanks and other large tank-like
    |objects partially hidden in forested terrain, and trained their
    |network on a fraction of the photos. When they tested the network on
    |the rest of the photographs, they were delighted to discover that it
    |performed fantastically well, despite the great variability in the
    |appearance of the objects and terrain, distinguishing tank from
    |non-tank almost perfectly. Inexplicably, though, when they fed a very
    |similar set of photos to the same network, it failed miserably. It turned
    |out that what the network had *really* trained on was not at all what
    |they'd assumed. To take the photos, they'd gone out and scattered the
    |real tanks over the landscape, and taken a set of tank photos. Then
    |they'd moved the tanks out and put the mock-tanks in, and taken a set
    |of mock-tank photos. Of course, that meant all the tank-photos were
    |taken in bright light, and all the mock-tank photos were in dim light
    |of late afternoon. The neural network sensibly picked up that the
    |easy way to tell the two apart was just to look at how bright they
    |were!
    |
    |I made the details of that story up, but who cares ;-)

    If you did, it's escaped into the wild: I read the same story in New
    Scientist(?) some years ago.
    --
    Marc Wilson

    Cleopatra Consultants Limited - IT Consultants
    2 The Grange, Cricklade Street, Old Town, Swindon SN1 3HG
    Tel: (44/0) 70-500-15051 Fax: (44/0) 870 164-0054
    Mail: Web: http://www.cleopatra.co.uk
    _________________________________________________________________
    Try MailTraq at https://my.mailtraq.com/register.asp?code=cleopatra
     
    Marc Wilson, Aug 30, 2003
    #2

  3. John J. Lee

    John J. Lee Guest

    Marc Wilson <> writes:

    > In comp.lang.python, (John J. Lee) wrote in
    > <>::

    [...]
    > |I made the details of that story up, but who cares ;-)
    >
    > If you did, it's escaped into the wild: I read the same story in New
    > Scientist(?) some years ago.


    Yeah, but the *details* are certain to be wrong. I didn't make up the
    *whole thing*!

    I saw it on some TV program years ago, IIRC.


    John
     
    John J. Lee, Aug 30, 2003
    #3
