Clustering technique

Discussion in 'Python' started by Luca, Dec 22, 2009.

  1. Luca

    Luca Guest

    Dear all, excuse me if I am posting a simple question. I am trying to find
    software or an algorithm that can "cluster" simple data in an Excel sheet.

    Example:
              Variable a   Variable b   Variable c
    Case 1        1            0            0
    Case 2        0            1            1
    Case 3        1            0            0
    Case 4        1            1            0
    Case 5        0            1            1


    The system should recognize that there are 3 possible clusters:

    the first contains the cases that have Variable a as true,
    the second contains the cases that have Variables b and c as true,
    the third is "all the rest".

              Variable a   Variable b   Variable c

    Case 1        1            0            0
    Case 3        1            0            0

    Case 2        0            1            1
    Case 5        0            1            1

    Case 4        1            1            0


    Thank you in advance
     
    Luca, Dec 22, 2009
    #1

  2. Jon Clements

    Jon Clements Guest

    If you haven't already, download and install xlrd from http://www.python-excel.org,
    a library that can read Excel workbooks (but not Excel 2007 files yet).

    Or, export as CSV...

    Then using either the csv module/xlrd (both well documented) or any
    other way of reading the data, you effectively want to end up with
    something like this:

    rows = [
        # A        B  C  D
        ['Case 1', 1, 0, 0],
        ['Case 2', 0, 1, 1],
        ['Case 3', 1, 0, 0],
        ['Case 4', 1, 1, 0],
        ['Case 5', 0, 1, 1],
    ]
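
    Getting from a CSV export to that structure might look roughly like the
    sketch below. It assumes Python 3, a header line, and a first column holding
    the case label; the header names and the load_rows helper are made up for
    illustration, and an in-memory string stands in for the exported file.

```python
import csv
import io

# Stand-in for a CSV export of the sheet (assumed layout: label, then flags).
CSV_TEXT = """Case,Variable a,Variable b,Variable c
Case 1,1,0,0
Case 2,0,1,1
Case 3,1,0,0
Case 4,1,1,0
Case 5,0,1,1
"""


def load_rows(handle):
    """Return [label, a, b, c] lists, converting the flag columns to int."""
    reader = csv.reader(handle)
    next(reader)  # skip the header line
    return [[row[0]] + [int(cell) for cell in row[1:]] for row in reader]


rows = load_rows(io.StringIO(CSV_TEXT))
```

    For a real export, the same function should accept a file opened with
    open(path, newline='') in place of the StringIO object.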

    One approach is to sort 'rows' by B,C & D. This will bring the
    identical elements adjacent to each other in the list. Then you need
    an iterator to group them... take a look at itertools.groupby.
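
    A minimal sketch of the sort-then-group idea, assuming Python 3 and the
    'rows' list shown above. The names flags and cluster_by_groupby are only
    illustrative. Note that this groups rows whose B, C and D values are
    identical, which matches the example but is not a general clustering
    algorithm.

```python
from itertools import groupby
from operator import itemgetter

rows = [
    ['Case 1', 1, 0, 0],
    ['Case 2', 0, 1, 1],
    ['Case 3', 1, 0, 0],
    ['Case 4', 1, 1, 0],
    ['Case 5', 0, 1, 1],
]

# Key function: the (B, C, D) values of a row as a tuple.
flags = itemgetter(1, 2, 3)


def cluster_by_groupby(rows):
    """Sort by the flag columns, then collect the labels of each run."""
    ordered = sorted(rows, key=flags)  # groupby only merges adjacent items
    return [(key, [row[0] for row in group])
            for key, group in groupby(ordered, key=flags)]


clusters = cluster_by_groupby(rows)
for key, cases in clusters:
    print(key, cases)
```

    The sort has to use the same key as groupby; without it, Case 1 and
    Case 3 would end up in separate groups because they are not adjacent.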

    Another is to use a defaultdict(list), found in the collections module:
    loop over the rows, using B, C & D as the key and appending A to the
    list stored under that key.
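
    The dictionary approach could be sketched as follows, again assuming
    Python 3 and the same 'rows' list; cluster_by_dict is a hypothetical
    helper name.

```python
from collections import defaultdict

rows = [
    ['Case 1', 1, 0, 0],
    ['Case 2', 0, 1, 1],
    ['Case 3', 1, 0, 0],
    ['Case 4', 1, 1, 0],
    ['Case 5', 0, 1, 1],
]


def cluster_by_dict(rows):
    """Map each distinct (B, C, D) tuple to the list of matching labels."""
    clusters = defaultdict(list)
    for label, *values in rows:
        # Lists are not hashable, so the key is converted to a tuple.
        clusters[tuple(values)].append(label)
    return clusters


clusters = cluster_by_dict(rows)
for key, cases in clusters.items():
    print(key, cases)
```

    This needs only one pass over the data and no sorting, at the cost of
    holding all the groups in memory at once.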

    hth
    Jon.
     
    Jon Clements, Dec 22, 2009
    #2

  3. Taliesin Nuin

    Luca,

    How many news groups and lists have you posted this on? I just answered
    this question on the PHP mailing list. If I'd seen your post here I
    would have written you a different, Pythonic answer (or left it to the
    other poster who has already given an answer). I appreciate that you
    want a reply, but it is *not good* to cross-post all over the place and
    waste a lot of people's time. Somebody takes the time and effort here to
    answer your questions whilst at the same time others are duplicating the
    effort elsewhere.

    If you need a solution specific to a certain language, then ask on that
    news group. If you're interested in a general answer then ask on a more
    general news group such as comp.programming or comp.theory.

    Taliesin Nuin.
     
    Taliesin Nuin, Dec 23, 2009
    #3
