I used defaultdict to store some variables but the output is blank

Discussion in 'Python' started by claire morandin, Jun 9, 2013.

  1. I have the following script, which does not return anything. There is no apparent mistake, but my output file is empty. I am just trying to extract some decimal numbers from one file according to their names, which are listed in another file.

    Code:
    from collections import defaultdict
    import numpy as np

    ercc_contigs= {}
    for line in open ('Faq_ERCC_contigs_name.txt'):
        gene = line.strip().split()

    ercc_rpkm = defaultdict(lambda: np.zeros(1, dtype=float))
    output_file = open('out.txt','w')

    rpkm_file = open('RSEM_Faq_Q1.genes.results.txt')
    rpkm_file.readline()
    for line in rpkm_file:
        line = line.strip()
        columns =  line.strip().split()
        gene = columns[0].strip()
        rpkm_value = float(columns[6].strip())
        if gene in ercc_contigs:
            ercc_rpkm[gene] += rpkm_value

    ercc_fh = open ('out.txt','w')
    for gene, rpkm_value in ercc_rpkm.iteritems():
        ercc = '{0}\t{1}\n'.format(gene, rpkm_value)
        ercc_fh.write (ercc)

    If someone could help me spot what's wrong, it would be much appreciated. Cheers
    claire morandin, Jun 9, 2013
    #1

  2. claire morandin

    Peter Otten Guest

    claire morandin wrote:

    > I have the following script which does not return anything, no apparent
    > mistake but my output file is empty.I am just trying to extract some
    > decimal number from a file according to their names which are in another
    > file. from collections import defaultdict import numpy as np
    >
    >
    > ercc_contigs= {}
    > for line in open ('Faq_ERCC_contigs_name.txt'):
    >     gene = line.strip().split()
    
    You probably planned to use the loop above to populate the ercc_contigs
    dict, but there's no code for that.
    
    > ercc_rpkm = defaultdict(lambda: np.zeros(1, dtype=float))
    > output_file = open('out.txt','w')
    >
    > rpkm_file = open('RSEM_Faq_Q1.genes.results.txt')
    > rpkm_file.readline()
    > for line in rpkm_file:
    >     line = line.strip()
    >     columns =  line.strip().split()
    >     gene = columns[0].strip()
    >     rpkm_value = float(columns[6].strip())
    
    Remember that ercc_contigs is empty; therefore the test
    >     if gene in ercc_contigs:
    
    always fails and the following line is never executed.
    >         ercc_rpkm[gene] += rpkm_value
    >
    > ercc_fh = open ('out.txt','w')
    > for gene, rpkm_value in ercc_rpkm.iteritems():
    >     ercc = '{0}\t{1}\n'.format(gene, rpkm_value)
    >     ercc_fh.write (ercc)
    >
    > If someone could help me spot what's wrong it would be much appreciate
    > cheers
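
To make the point about the empty container concrete, here is a minimal sketch. It assumes Python 3 and uses made-up gene names and values in place of the real files; the helper name accumulate is hypothetical. With an empty lookup container nothing is ever accumulated, so there is nothing to write to the output file.

```python
from collections import defaultdict

# Made-up (gene, rpkm) pairs standing in for rows of the results file.
rows = [("ERCC-00002", 12.5), ("contig_1", 3.0), ("ERCC-00002", 1.5)]


def accumulate(rows, wanted):
    """Sum the values per gene, keeping only genes that are in `wanted`."""
    totals = defaultdict(float)
    for gene, rpkm in rows:
        if gene in wanted:  # never true when `wanted` is empty
            totals[gene] += rpkm
    return dict(totals)


# Same situation as the posted script: the lookup dict was never filled.
empty_result = accumulate(rows, {})

# Once the container holds the wanted names, values are accumulated.
filled_result = accumulate(rows, {"ERCC-00002"})

print(empty_result)
print(filled_result)
```

The only difference between the two calls is whether the lookup container has been filled, which is the missing step in the posted script.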


    By the way: it is unclear to me why you are using a numpy array here:

    > ercc_rpkm = defaultdict(lambda: np.zeros(1, dtype=float))


    I think

    ercc_rpkm = defaultdict(float)

    should suffice. Also:

    > line = line.strip()
    > columns = line.strip().split()
    > gene = columns[0].strip()
    > rpkm_value = float(columns[6].strip())


    You can remove all strip() method calls here, as line.split() without
    arguments splits on any run of whitespace and discards leading and
    trailing whitespace.
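
A short sketch of both suggestions, assuming Python 3 and a made-up, tab-separated line with the value of interest in column 6 (the sample line and its gene name are invented for illustration):

```python
from collections import defaultdict

# Invented sample line with surrounding whitespace and a trailing newline.
line = "  gene_A\t1\t2\t3\t4\t5\t7.25  \n"

# split() without arguments already ignores leading/trailing whitespace,
# so stripping first makes no difference.
columns = line.split()
stripped = line.strip().split()
print(columns == stripped)

# defaultdict(float) starts every missing key at 0.0, so plain floats
# can be accumulated without a numpy array.
totals = defaultdict(float)
totals[columns[0]] += float(columns[6])
print(totals[columns[0]])
```

One caveat: split() without arguments also collapses consecutive separators, so a file with empty tab-separated fields would have its columns shifted. If that can happen in your data, line.rstrip('\n').split('\t') is the safer choice.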
    Peter Otten, Jun 9, 2013
    #2

  3. Thanks Peter, true I did not realize that ercc_contigs is empty, but I am not sure how to "populate" the dictionary if I only have one column for the value but no key
    claire morandin, Jun 9, 2013
    #3
  4. claire morandin

    Peter Otten Guest

    claire morandin wrote:

    > Thanks Peter, true I did not realize that ercc_contigs is empty, but I am
    > not sure how to "populate" the dictionary if I only have one column for
    > the value but no key


    You could use a "dummy value":

    ercc_contigs = {}
    for line in open('Faq_ERCC_contigs_name.txt'):
        gene = line.split()[0]
        ercc_contigs[gene] = None

    but a better approach is to use a set instead of a dict:

    ercc_contigs = set()
    for line in open('Faq_ERCC_contigs_name.txt'):
        gene = line.split()[0]
        ercc_contigs.add(gene)
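
Putting the pieces of this thread together, the whole task could look roughly like the sketch below. It is not a drop-in replacement. It assumes Python 3 (so items() instead of iteritems()), a header line in the results file, and the value in column 6 as in the posted script. The function names load_names, sum_rpkm and write_totals and all the sample data are made up for the demonstration.

```python
import os
import tempfile
from collections import defaultdict


def load_names(path):
    """Return the set of names found in the first column of `path`."""
    with open(path) as handle:
        return {line.split()[0] for line in handle if line.strip()}


def sum_rpkm(path, names, column=6):
    """Sum the values in `column` for every gene that is in `names`."""
    totals = defaultdict(float)
    with open(path) as handle:
        handle.readline()  # skip the header line
        for line in handle:
            columns = line.split()
            if columns and columns[0] in names:
                totals[columns[0]] += float(columns[column])
    return totals


def write_totals(path, totals):
    """Write one 'gene<TAB>value' line per gene, sorted by gene name."""
    with open(path, "w") as handle:
        for gene, value in sorted(totals.items()):
            handle.write("{0}\t{1}\n".format(gene, value))


# Demonstration with invented data written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    names_path = os.path.join(tmp, "names.txt")
    results_path = os.path.join(tmp, "results.txt")
    out_path = os.path.join(tmp, "out.txt")

    with open(names_path, "w") as handle:
        handle.write("ERCC-00002\nERCC-00003\n")
    with open(results_path, "w") as handle:
        handle.write("gene_id\tc1\tc2\tc3\tc4\tc5\tRPKM\n")
        handle.write("ERCC-00002\t0\t0\t0\t0\t0\t12.5\n")
        handle.write("contig_1\t0\t0\t0\t0\t0\t3.0\n")
        handle.write("ERCC-00002\t0\t0\t0\t0\t0\t1.5\n")

    totals = sum_rpkm(results_path, load_names(names_path))
    write_totals(out_path, totals)
    with open(out_path) as handle:
        written = handle.read()

print(written)
```

The with blocks close each file as soon as it has been used. The posted script opens out.txt for writing twice and never closes either handle, so tidying that up is worthwhile as well.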
    Peter Otten, Jun 9, 2013
    #4
