Concatenating dictionary values and keys, and further operations

Discussion in 'Python' started by Girish Sahani, Jun 5, 2006.

  1. I wrote the following code to concatenate every 2 keys of a dictionary and
    their corresponding values.
    e.g. if I have tiDict1 = {'a':[1,2],'b':[3,4,5]} I should get
    tiDict2={'ab':[1,2][3,4,5]}, and similarly for dicts with a larger number of
    features.
    Now I want to check each pair to see if they are connected. Each pair
    takes one element from the first list and one from the second, e.g.
    for 'ab' I want to check if 1 and 3 are connected, then 1 and 4, then 1 and
    5, then 2 and 3, then 2 and 4, then 2 and 5.
    The connection information is in a text file as follows:
    1,'a',2,'b'
    3,'a',5,'a'
    3,'a',6,'a'
    3,'a',7,'b'
    8,'a',7,'b'
    ..
    ..
    This means 1 (type 'a') and 2 (type 'b') are connected, 3 and 5 are
    connected, and so on.
    I am not able to figure out how to do this. Any pointers would be helpful.
    Here is the code I have written so far:
    Code (Text):

    def genTI(tiDict):
        tiDict1 = {}
        tiList = [tiDict1.keys(),tiDict1.values()]
        length =len(tiDict1.keys())-1
        for i in range(0,length,1):
            for j in range(0,length,1):
                for k in range(1,length+1,1):
                    if j+k <= length:
                        key = tiList[i][j] + tiList[i][j+k]
                        value = [tiList[i+1][j],tiList[i+1][j+k]]
                        tiDict2[key] = value
                        continue
                    continue
                continue
            return tiDict2
     
    Thanks in advance,
    girish
     
    Girish Sahani, Jun 5, 2006
    #1


  2. Girish

    It seems you want the Cartesian product of every pair of lists in the
    dictionary, including the product of lists with themselves (but you
    don't say why ;-)).

    I'm not sure the following is exactly what you want or if it is very
    efficient, but maybe it will start you off. It uses a function
    'xcombine' taken from a recipe in the ASPN cookbook by David
    Klaffenbach (2004).

    (It should give every possibility, which you then check in your file)

    Gerard

    -------------------------------------------------------------------------

    def nkRange(n,k):
        m = n - k + 1
        indexer = range(0, k)
        vector = range(1, k+1)
        last = range(m, n+1)
        yield vector
        while vector != last:
            high_value = -1
            high_index = -1
            for i in indexer:
                val = vector[i]
                if val > high_value and val < m + i:
                    high_value = val
                    high_index = i
            for j in range(k - high_index):
                vector[j+high_index] = high_value + j + 1
            yield vector

    def kSubsets( alist, k ):
        n = len(alist)
        for vector in nkRange(n, k):
            ret = []
            for i in vector:
                ret.append( alist[i-1] )
            yield ret

    data = { 'a': [1,2], 'b': [3,4,5], 'c': [1,4,7] }

    pairs = list( kSubsets(data.keys(),2) ) + [ [k,k] for k in data.iterkeys() ]
    print pairs
    for s in pairs:
        for t in xcombine( data[s[0]], data[s[1]] ):
            print "%s,'%s',%s,'%s'" % ( t[0], s[0], t[1], s[1] )


    -------------------------------------------------------------------------

    1,'a',1,'c'
    1,'a',4,'c'
    1,'a',7,'c'
    2,'a',1,'c'
    2,'a',4,'c'
    2,'a',7,'c'
    1,'a',3,'b'
    1,'a',4,'b'
    1,'a',5,'b'
    2,'a',3,'b'
    2,'a',4,'b'
    2,'a',5,'b'
    1,'c',3,'b'
    1,'c',4,'b'
    1,'c',5,'b'
    4,'c',3,'b'
    4,'c',4,'b'
    4,'c',5,'b'
    7,'c',3,'b'
    7,'c',4,'b'
    7,'c',5,'b'
    1,'a',1,'a'
    1,'a',2,'a'
    2,'a',1,'a'
    2,'a',2,'a'
    1,'c',1,'c'
    1,'c',4,'c'
    1,'c',7,'c'
    4,'c',1,'c'
    4,'c',4,'c'
    4,'c',7,'c'
    7,'c',1,'c'
    7,'c',4,'c'
    7,'c',7,'c'
    3,'b',3,'b'
    3,'b',4,'b'
    3,'b',5,'b'
    4,'b',3,'b'
    4,'b',4,'b'
    4,'b',5,'b'
    5,'b',3,'b'
    5,'b',4,'b'
    5,'b',5,'b'
     
    Gerard Flanagan, Jun 5, 2006
    #2

  3. http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302478
     
    Gerard Flanagan, Jun 5, 2006
    #3
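
For readers on Python 3, the enumeration from post #2 can probably be sketched with the standard library instead of the cookbook recipes. This is only a sketch under assumptions: itertools.combinations stands in for kSubsets, itertools.product stands in for xcombine, and the helper names key_pairs and candidate_lines are made up for illustration. Keys are sorted so the order is reproducible, which means the lines come out in a different order than in the listing above.

```python
from itertools import combinations, product

data = {'a': [1, 2], 'b': [3, 4, 5], 'c': [1, 4, 7]}


def key_pairs(d):
    """Every unordered pair of distinct keys, then each key paired with itself."""
    keys = sorted(d)  # sorted only to make the order reproducible
    return list(combinations(keys, 2)) + [(k, k) for k in keys]


def candidate_lines(d):
    """Yield one "n1,'k1',n2,'k2'" string per element of each Cartesian product."""
    for k1, k2 in key_pairs(d):
        for n1, n2 in product(d[k1], d[k2]):
            yield "%s,'%s',%s,'%s'" % (n1, k1, n2, k2)


lines = list(candidate_lines(data))
print(len(lines))   # same number of candidates as the listing in post #2
print(lines[0])
```

Each generated string has the same shape as a line of the connection file, so the candidates could be compared against the file contents directly.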
  4. I have a text file in the following format:

    1,'a',2,'b'
    3,'a',5,'c'
    3,'a',6,'c'
    3,'a',7,'b'
    8,'a',7,'b'
    ..
    ..
    ..
    Now i need to generate 2 things by reading the file:
    1) A dictionary with the numbers as keys and the letters as values.
    e.g the above would give me a dictionary like
    {1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}
    2) A list containing pairs of numbers from each line.
    The above format would give me the list as
    [[1,2],[3,5],[3,6],[3,7],[8,7]......]

    I wrote the following code for both of these, but the problem is that
    lines is a list like ["1,'a',2,'b'","3,'a',5,'c","3,'a',6,'c'".....]
    Because of the "" around each line, each line is treated as one object
    and I cannot access its individual elements.

    Code (Text):

    #code to generate the dictionary
    def get_colocations(filename):
        lines = open(filename).read().split("\n")
        colocnDict = {}
        i = 0
        for line in lines:
            if i <= 2:
                colocnDict[line[i]] = line[i+1]
                i+=2
                continue
            return colocnDict
     
    Code (Text):

    def genPairs(filename):
        lines = open(filename).read().split("\n")
        pairList = []
        for line in lines:
            pair = [line[0],line[2]]
            pairList.append(pair)
            i+=2
            continue
    return pairList
     
    Please help :((
     
    Girish Sahani, Jun 6, 2006
    #4
  5. Girish Sahani

    K.S.Sreeram Guest

    def get_dict( f ) :
        out = {}
        for line in file(f) :
            n1,s1,n2,s2 = line.split(',')
            out.update( { int(n1):s1[1], int(n2):s2[1] } )
        return out

    def get_pairs( f ) :
        out = []
        for line in file(f) :
            n1,_,n2,_ = line.split(',')
            out.append( [int(n1),int(n2)] )
        return out

    Regards
    Sreeram


     
    K.S.Sreeram, Jun 6, 2006
    #5
  6. Girish Sahani

    John Machin Guest

    Check out the csv module.
    You managed to split the file contents into lines using
    lines = open(filename).read().split("\n")
    Same principle applies to each line:

    >>> lines = ["1,'a',2,'b'","3,'a',5,'c","3,'a',6,'c'"]
    >>> lines[0].split(',')
    ['1', "'a'", '2', "'b'"]
    >>> lines[1].split(',')
    ['3', "'a'", '5', "'c"]

    def get_both(filename):
        lines = open(filename).read().split("\n")
        colocnDict = {}
        pairList = []
        for line in lines:
            n1, b1, n2, b2 = line.split(",")
            n1 = int(n1)
            n2 = int(n2)
            a1 = b1.strip("'")
            a2 = b2.strip("'")
            colocnDict[n1] = a1
            colocnDict[n2] = a2
            pairList.append([n1, n2])
        return colocnDict, pairList

    def get_both_csv(filename):
        import csv
        reader = csv.reader(open(filename, "rb"), quotechar="'")
        colocnDict = {}
        pairList = []
        for n1, a1, n2, a2 in reader:
            n1 = int(n1)
            n2 = int(n2)
            colocnDict[n1] = a1
            colocnDict[n2] = a2
            pairList.append([n1, n2])
        return colocnDict, pairList

    HTH,
    John
     
    John Machin, Jun 6, 2006
    #6
  7. Hi Girish,

    it's better to reply to the Group.
    I'm confused. You say *for each* key-value pair, and you wrote above
    that the keys were the 'concatenation' of "every 2 keys of a
    dictionary".

    Sorry, too early for me. Maybe if you list every case you want, given
    the example data.

    All the best.

    Gerard
     
    Gerard Flanagan, Jun 6, 2006
    #7
  8. Really sorry about the indentation :)
    I tried out the code you gave, and also the code Sreeram wrote.
    In both I get the same type of error.
    The error I get in Sreeram's code is:
    n1,_,n2,_ = line.split(',')
    ValueError: need more than 1 value to unpack

    And error i get in your code is:
    for n1, a1, n2, a2 in reader:
    ValueError: need more than 0 values to unpack

    Any ideas why this is happening?

    Thanks a lot,
    girish
     
    Girish Sahani, Jun 6, 2006
    #8
  9. Girish Sahani

    John Machin Guest

    In the case of my code, this is consistent with the line being empty,
    probably the last line. As my mentor Bruno D. would say, your test data
    does not match your spec :) Which do you want to change, the spec or
    the data?

    You can change my csv-reading code to detect dodgy data like this (for
    example):

    for row in reader:
        if not row:
            continue # ignore empty lines, wherever they appear
        if len(row) != 4:
            raise ValueError("Malformed row %r" % row)
        n1, a1, n2, a2 = row

    In the case of Sreeram's code, perhaps you could try inserting
    print "line = ", repr(line)
    before the statement that is causing the error.
     
    John Machin, Jun 6, 2006
    #9
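
The two different error messages in post #8 are consistent with how each approach treats an empty line. The short sketch below (Python 3 assumed; the variable names are illustrative only) shows that str.split with an explicit separator returns a one-element list for an empty string, while csv.reader yields an empty row, which would explain "need more than 1 value" in one case and "need more than 0 values" in the other.

```python
import csv

# What a trailing newline leaves behind after read().split("\n")
line = ""
fields = line.split(",")   # one empty string, not zero items
print(fields)

# csv.reader accepts any iterable of strings and yields an empty list for a blank line
rows = list(csv.reader(["1,'a',2,'b'", ""], quotechar="'"))
print(rows)
```

Either way, unpacking into four names fails on that line, so the blank line has to be skipped or removed from the data.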
  10. Girish Sahani

    skip Guest

    Girish> I have a text file in the following format:
    Girish> 1,'a',2,'b'
    Girish> 3,'a',5,'c'
    Girish> 3,'a',6,'c'
    Girish> 3,'a',7,'b'
    Girish> 8,'a',7,'b'
    Girish> .
    Girish> .
    Girish> .
    Girish> Now i need to generate 2 things by reading the file:
    Girish> 1) A dictionary with the numbers as keys and the letters as values.
    Girish> e.g the above would give me a dictionary like
    Girish> {1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}
    Girish> 2) A list containing pairs of numbers from each line.
    Girish> The above format would give me the list as
    Girish> [[1,2],[3,5],[3,6],[3,7],[8,7]......]

    Running this:

    open("some.text.file", "w").write("""\
    1,'a',2,'b'
    3,'a',5,'c'
    3,'a',6,'c'
    3,'a',7,'b'
    8,'a',7,'b'
    """)

    import csv

    class dialect(csv.excel):
        quotechar = "'"
    reader = csv.reader(open("some.text.file", "rb"), dialect=dialect)
    mydict = {}
    mylist = []
    for row in reader:
        numbers = [int(n) for n in row[::2]]
        letters = row[1::2]
        mydict.update(dict(zip(numbers, letters)))
        mylist.append(numbers)

    print mydict
    print mylist

    import os

    os.unlink("some.text.file")

    displays this:

    {1: 'a', 2: 'b', 3: 'a', 5: 'c', 6: 'c', 7: 'b', 8: 'a'}
    [[1, 2], [3, 5], [3, 6], [3, 7], [8, 7]]

    That seems to be approximately what you're looking for.

    Skip
     
    skip, Jun 6, 2006
    #10
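
The csv-based answers above are written for Python 2 (print statements, files opened in "rb" mode). A rough Python 3 adaptation might look like the following; it is a sketch, not anyone's posted code. It takes an already open text file object instead of a filename (an assumption made so the example can run on an in-memory io.StringIO), and it folds in the blank-line and malformed-row checks suggested in post #9.

```python
import csv
import io


def get_both_csv(fileobj):
    """Return (number -> letter dict, list of [n1, n2] pairs) from an open text file."""
    reader = csv.reader(fileobj, quotechar="'")
    colocn_dict = {}
    pair_list = []
    for row in reader:
        if not row:
            continue  # skip blank lines, such as a trailing one
        if len(row) != 4:
            raise ValueError("Malformed row %r" % (row,))
        n1, a1, n2, a2 = row
        n1, n2 = int(n1), int(n2)
        colocn_dict[n1] = a1
        colocn_dict[n2] = a2
        pair_list.append([n1, n2])
    return colocn_dict, pair_list


# Sample data from the thread, with a trailing blank line added on purpose
sample = io.StringIO(
    "1,'a',2,'b'\n3,'a',5,'c'\n3,'a',6,'c'\n3,'a',7,'b'\n8,'a',7,'b'\n\n"
)
mydict, mylist = get_both_csv(sample)
print(mydict)
print(mylist)
```

With a real file on disk, the csv documentation recommends opening it with newline="", for example open(filename, newline="").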
  11. Thanks John, I just changed my data file so that it does not contain any
    empty lines; I guess that was the easier solution ;)
     
    Girish Sahani, Jun 7, 2006
    #11
  12. Girish said, through Gerard's forwarded message:
    The problem is that the two lists aren't distinguishable when
    concatenated, so what you get is [1, 2, 3, 4, 5]. You have to pack
    both lists in a tuple: {'ab': ([1, 2], [3, 4, 5]), ...}

    {'ac': ([1, 2], [6, 7]), 'ab': ([1, 2], [3, 4, 5]), 'bc': ([3, 4, 5], [6, 7])}
    You can do this without creating an additional dictionary:
    ....     cartesian_product = [(x, y) for x in d[i] for y in d[j]]
    ....     print i + j, cartesian_product
    ....
    ac [(1, 6), (1, 7), (2, 6), (2, 7)]
    ab [(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)]
    bc [(3, 6), (3, 7), (4, 6), (4, 7), (5, 6), (5, 7)]

    You can do whatever you want with this cartesian product inside the loop.

    I don't understand the semantics of the file format, so I leave this
    as an exercise to the reader :)
    Best regards.
     
    Roberto Bonvallet, Jun 7, 2006
    #12
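
One possible way to finish the exercise, tying the Cartesian product to the connection data, is sketched below for Python 3. Several things here are assumptions rather than anything stated in the thread: the pair_list values are invented sample data in the [n1, n2] form produced by the file-reading posts, a connection is treated as symmetric (1 connected to 3 implies 3 connected to 1), and the function name connected_pairs is hypothetical. The dictionary d matches the data implied by the output in post #12.

```python
from itertools import combinations, product

d = {'a': [1, 2], 'b': [3, 4, 5], 'c': [6, 7]}

# Hypothetical connection data, in the [n1, n2] form built by the file-reading code
pair_list = [[1, 3], [2, 5], [6, 4]]

# Assumption: connections are symmetric, so store each pair order-independently
connected = {frozenset(p) for p in pair_list}


def connected_pairs(d, connected):
    """Map each concatenated key ('ab', ...) to the (x, y) combinations that are connected."""
    result = {}
    for i, j in combinations(sorted(d), 2):
        result[i + j] = [
            (x, y)
            for x, y in product(d[i], d[j])
            if frozenset((x, y)) in connected
        ]
    return result


print(connected_pairs(d, connected))
```

Using a set of frozensets keeps each membership test cheap; if the direction of a connection matters in the real data, a set of plain tuples would be the more faithful choice.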