Re: Compare list entry from csv files

Discussion in 'Python' started by Anatoli Hristov, Nov 27, 2012.

  1. On Tue, Nov 27, 2012 at 4:23 AM, Dave Angel <> wrote:
    > On 11/26/2012 05:27 PM, Anatoli Hristov wrote:
    >> I understand, but in my case I have for sure the field "Name" in the
    >> second file that contains at least the first or the last name on it...
    >> So probably it should be possible:)
    >> The Name "Billgatesmicrosoft" contains the word "Gates" so logically I
    >> might find a solution for it.
    >>

    >
    > (Please don't top-post. Or if you must, then delete everything after
    > your post, as I'm doing here. Otherwise you end up with insanities like
    > new stuff, quote-4, quote-1, quote-3, quote-2. In this case, long
    > tradition on this forum and many like it work well, even if Microsoft
    > mail programs and some others decide to put the cursor at the wrong end
    > of the existing text. In most programs, it's configurable.)
    >
    > If you can come up with an algorithm for comparing first+last in one
    > file to name in the other, then the problem can be solved. But you
    > can't do it by hand-waving, you have to actually figure out a mechanism.
    > Then we can help you code such a thing. And I can just about guarantee
    > that if these fields are created independently by human beings, that
    > there will be exceptions that have to be fixed by human beings.
    >
    >
    > --
    >
    > DaveA


    Thanks for your help. I will do my best for the forum :)

    I advanced a little bit with the algorithm and at least I can now
    extract and compare the fields :)
    For my beginner skills I think this is too much for me. Now next step
    is to add the second field with the number to the Namelist and copy it
    to a third filename I suppose.

    import csv

    origf = open('c:/Working/Test_phonebook.csv', 'rt')
    secfile = open('c:/Working/phones.csv', 'rt')

    phonelist = []
    namelist = []

    names = csv.reader(origf, delimiter=';')
    phones = csv.reader(secfile, delimiter=';')
    for tel in phones:
        phonelist.append(tel)

    #print "*"*25,phonelist,"*"*25
    rows = 0

    def finder(name_row):
        for ex_phone in phonelist:
            # phonelist.append(tel)
            telstr = ex_phone[0].lower()
            # print "*"*25 + " I got %s" % telstr
            # print "\nGot name from Name_Find :%s" % name_row
            if telstr.find(name_row) >= 0:
                print "\nName found: %s" % name_row
            else:
                pass
        return
        # print "\nNot found %s" % name_row

    def name_find():

        for row in names:
            namelist.append(row)
            name_row = row[0].lower()
            # print "\nExtracted Name is :% s" % name_row
            finder(name_row)
            # name_find()
            # rows = rows +1
    name_find()
     
    Anatoli Hristov, Nov 27, 2012
    #1

  2. Anatoli Hristov

    Neil Cerutti Guest

    On 2012-11-27, Anatoli Hristov <> wrote:
    > Thanks for your help. I will do my best for the forum :)
    >
    > I advanced a little bit with the algorithm and at least I can
    > now extract and compare the fields :) For my beginner skills I
    > think this is too much for me. Now next step is to add the
    > second field with the number to the Namelist and copy it to a
    > third filename I suppose.


    I had to write a similar type of program, and I imagine it's a
    common problem. Sometimes new students provide incorrect SSN's or
    simply leave them blank. This makes it impossible for us to match
    their application for financial aid to their admissions record.

    You have to analyze how you're going to match records.

    In my case, missing SSN's are one case. A likely match in this
    case is when the names are eerily similar.

    In the other case, where they simply got their SSN wrong, I have
    to check for both a similar SSN and a similar name.

    But you still have to define "similar." I looked up an algorithm
    on the web called Levenshtein Distance, and implemented it like
    so.

    def levenshteindistance(first, second):
        """Find the Levenshtein distance between two strings."""
        if len(first) > len(second):
            first, second = second, first
        if len(second) == 0:
            return len(first)
        first_length = len(first) + 1
        second_length = len(second) + 1
        distance_matrix = [[0] * second_length for x in range(first_length)]
        for i in range(first_length):
            distance_matrix[i][0] = i
        for j in range(second_length):
            distance_matrix[0][j] = j
        for i in range(1, first_length):
            for j in range(1, second_length):
                deletion = distance_matrix[i-1][j] + 1
                insertion = distance_matrix[i][j-1] + 1
                substitution = distance_matrix[i-1][j-1]
                if first[i-1] != second[j-1]:
                    substitution += 1
                distance_matrix[i][j] = min(insertion, deletion, substitution)
        return distance_matrix[first_length-1][second_length-1]

    The algorithm returns a count of every difference between the two
    strings, from 0 to the length of the longer string.
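For anyone who wants to try the idea on Python 3, here is a minimal sketch of the same dynamic-programming approach that keeps only two rows of the matrix at a time. The function name `levenshtein` and the sample strings are made up for illustration and are not taken from the program described above.

```python
def levenshtein(first, second):
    """Return the edit distance between two strings (illustrative sketch)."""
    # previous[j] holds the distance between first[:i-1] and second[:j]
    previous = list(range(len(second) + 1))
    for i, a in enumerate(first, start=1):
        current = [i]  # distance from first[:i] to the empty string
        for j, b in enumerate(second, start=1):
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            substitution = previous[j - 1] + (a != b)
            current.append(min(deletion, insertion, substitution))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("gates", "gates"))     # 0
print(levenshtein("", "abc"))            # 3
```

A distance of 0 means the strings are identical; what counts as "close enough" is still a threshold you have to pick for your own data.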

    Python provides difflib, which implements a similar algorithm, so
    I used that as well (kinda awkwardly). I used
    difflib.get_close_matches to get candidates, and then
    difflib.SequenceMatcher to provide me a score measuring the
    closeness.

    import difflib

    # s2 must be a list of candidate strings
    matches = difflib.get_close_matches(s1, s2)
    for m in matches:
        scorer = difflib.SequenceMatcher(None, s1, m)
        ratio = scorer.ratio()
        if ratio == 1.0:
            pass  # perfect match: ratio() returns 1.0 for identical strings
        elif ratio > MAX_RATIO:  # You gotta choose this. I used 0.1
            pass  # close match

    The two algorithms come up with different guesses, and I pass on
    their suggestions for fixes to a human being. Both versions of
    the program take roughly 5 minutes to run the comparison on
    2000-12000 records between the two files.

    I like the results of Levenshtein distance a little better, but
    difflib finds some stuff that it misses.
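A small Python 3 sketch of the difflib approach, in case it helps to see it end to end. The helper name `close_candidates` and the sample names are assumptions made up for this example; `get_close_matches` and `SequenceMatcher.ratio` are the standard difflib calls, and the 0.6 cutoff is simply difflib's default.

```python
import difflib


def close_candidates(name, candidates, cutoff=0.6):
    """Return (candidate, ratio) pairs in the order get_close_matches gives them."""
    matches = difflib.get_close_matches(name, candidates, n=3, cutoff=cutoff)
    scored = []
    for m in matches:
        # ratio() is 1.0 for identical strings and falls towards 0.0
        ratio = difflib.SequenceMatcher(None, name, m).ratio()
        scored.append((m, ratio))
    return scored


people = ["bill gates", "steve jobs", "linus torvalds"]  # made-up data
result = close_candidates("bil gates", people)
for candidate, ratio in result:
    print("%s  %.2f" % (candidate, ratio))
```

The scored pairs are what you would hand to a human reviewer; the program only proposes matches.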

    In your case, the name is munged horribly in one of the files so
    you'll first have to sort it out somehow.

    --
    Neil Cerutti
     
    Neil Cerutti, Nov 27, 2012
    #2

  3. On Tue, Nov 27, 2012 at 4:05 PM, Neil Cerutti <> wrote:
    > On 2012-11-27, Anatoli Hristov <> wrote:
    >> Thanks for your help. I will do my best for the forum :)
    >>
    >> I advanced a little bit with the algorithm and at least I can
    >> now extract and compare the fields :) For my beginner skills I
    >> think this is too much for me. Now next step is to add the
    >> second field with the number to the Namelist and copy it to a
    >> third filename I suppose.




    Thank you all for the help, but I figured that out and the program now
    works perfect. I would appreciate if you have some notes about my
    script as I'm noob :)
    Here is the code:

    import csv

    origf = open('c:/Working/Test_phonebook.csv', 'rt')
    secfile = open('c:/Working/phones.csv', 'rt')

    phonelist = []
    namelist = []

    names = csv.reader(origf, delimiter=';')
    phones = csv.reader(secfile, delimiter=';')
    for tel in phones:
        phonelist.append(tel)

    def finder(name_row,rows):
        for ex_phone in phonelist:
            telstr = ex_phone[0].lower()
            if telstr.find(name_row) >= 0:
                print "\nName found: %s" % name_row
                namelist[rows][-1] = ex_phone[-1].lower()
            else:
                pass
        return

    def name_find():
        rows = 0
        for row in names:
            namelist.append(row)
            name_row = row[0].lower()
            finder(name_row,rows)
            rows = rows+1
    name_find()
    wfile = open('c:/Working/ttest.csv', "wb")
    writer = csv.writer(wfile, delimiter=';')
    for insert in namelist:
        writer.writerow(insert)
    wfile.close()
     
    Anatoli Hristov, Nov 27, 2012
    #3
  4. Anatoli Hristov

    Neil Cerutti Guest

    On 2012-11-27, Anatoli Hristov <> wrote:
    > Thank you all for the help, but I figured that out and the
    > program now works perfect. I would appreciate if you have some
    > notes about my script as I'm noob :) Here is the code:
    >
    > import csv
    >
    > origf = open('c:/Working/Test_phonebook.csv', 'rt')
    > secfile = open('c:/Working/phones.csv', 'rt')


    The csv module expects files to be opened in binary mode in Python
    versions less than version 3.0. For Python versions >= 3.0, you
    use the special keyword argument, newline='', instead.
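To make the Python 3 half of that concrete, here is a minimal sketch that writes and re-reads a semicolon-separated file with `newline=''`. The temporary file and the sample rows are stand-ins invented for the example; on Python 2 you would use the 'rb' and 'wb' modes instead.

```python
import csv
import os
import tempfile

rows = [["Bill Gates", "555-0100"], ["Steve Jobs", "555-0199"]]  # made-up data

# A temporary file stands in for the real phonebook so the sketch is runnable.
path = os.path.join(tempfile.mkdtemp(), "phones.csv")

# Python 3: text mode plus newline='' lets the csv module handle line endings.
with open(path, "w", newline="") as outfile:
    csv.writer(outfile, delimiter=";").writerows(rows)

with open(path, newline="") as infile:
    read_back = list(csv.reader(infile, delimiter=";"))

print(read_back)
```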

    > phonelist = []
    > namelist = []


    The structure of your program is poor. It's workable for such a
    short script, and sometimes my first cuts are similar, but it's
    better to get out of the habit right away.

    Once you get this working the way you'd like you should clean up
    the structure as a service to your future self.

    > names = csv.reader(origf, delimiter=';')
    > phones = csv.reader(secfile, delimiter=';')


    Your csv files don't seem to have header rows, but even so you can
    improve your code by providing fieldnames and using a DictReader
    instead.

    name_reader = csv.DictReader(origf, fieldnames=[
        'Name', 'Blah', 'Phone#'])

    Then you can read from records with

    name = row['Name']

    instead of using bare, undocumented integers.
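A runnable Python 3 sketch of the DictReader idea, using an in-memory file so it does not depend on the real phonebook. The column names 'Name', 'City' and 'Phone' and the sample rows are assumptions for illustration only.

```python
import csv
import io

# In-memory sample data standing in for the phonebook file (made up).
sample = io.StringIO(
    "Bill Gates;Redmond;555-0100\n"
    "Steve Jobs;Cupertino;555-0199\n"
)

# The field names are assumptions; use whatever your columns really mean.
reader = csv.DictReader(sample, fieldnames=["Name", "City", "Phone"],
                        delimiter=";")
records = list(reader)

for row in records:
    print(row["Name"], row["Phone"])
```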

    > for tel in phones:
    >     phonelist.append(tel)
    >
    > def finder(name_row,rows):
    >     for ex_phone in phonelist:
    >         telstr = ex_phone[0].lower()
    >         if telstr.find(name_row) >= 0:


    This strikes me as a crude way to match names. You don't really
    want Donald to match perfectly with McDonald, do you? Or for
    Smith to match with Smithfield?

    Yes, a human being will clean it up, but your program can do a
    better job.
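One possible way to tighten the match, sketched in Python 3: compare against whole words rather than any substring. Both helper names are made up for this example, and the word-based variant assumes the names are separated by spaces or punctuation, so it would not help with a run-together field like "Billgatesmicrosoft".

```python
import re


def substring_match(name, text):
    """The approach from the script: true if name occurs anywhere in text."""
    return text.lower().find(name.lower()) >= 0


def word_match(name, text):
    """Stricter variant: name must appear as a whole word in text."""
    words = re.findall(r"[a-z]+", text.lower())
    return name.lower() in words


print(substring_match("Donald", "Ronald McDonald"))  # True
print(word_match("Donald", "Ronald McDonald"))       # False
print(word_match("Smith", "John Smith"))             # True
```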

    >             print "\nName found: %s" % name_row
    >             namelist[rows][-1] = ex_phone[-1].lower()
    >         else:
    >             pass
    >     return
    >
    > def name_find():
    >     rows = 0
    >     for row in names:
    >         namelist.append(row)
    >         name_row = row[0].lower()
    >         finder(name_row,rows)
    >         rows = rows+1


    You can use the useful enumerate function instead of your own
    counter.

    for rows, row in enumerate(names):

    ...though I would find 'rownum' or 'num' or just 'i' better than
    the name 'rows', which I find confusing.
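A tiny Python 3 illustration of enumerate replacing the hand-maintained counter. The sample rows are made up.

```python
names = [["Bill Gates", ""], ["Steve Jobs", ""]]  # made-up rows

for rownum, row in enumerate(names):
    # rownum counts 0, 1, 2, ... without a separate counter variable
    print(rownum, row[0])

# The same pairs collected into a list, just to show what enumerate yields.
indexed = [(rownum, row[0]) for rownum, row in enumerate(names)]
print(indexed)  # [(0, 'Bill Gates'), (1, 'Steve Jobs')]
```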

    > name_find()
    > wfile = open('c:/Working/ttest.csv', "wb")
    > writer = csv.writer(wfile, delimiter=';')
    > for insert in namelist:
    >     writer.writerow(insert)
    > wfile.close()


    --
    Neil Cerutti
     
    Neil Cerutti, Nov 27, 2012
    #4
  5. On Tue, Nov 27, 2012 at 9:41 PM, Neil Cerutti <> wrote:
    > On 2012-11-27, Anatoli Hristov <> wrote:
    >> Thank you all for the help, but I figured that out and the
    >> program now works perfect. I would appreciate if you have some
    >> notes about my script as I'm noob :) Here is the code:
    >>
    >> import csv
    >>
    >> origf = open('c:/Working/Test_phonebook.csv', 'rt')
    >> secfile = open('c:/Working/phones.csv', 'rt')



    Hello,

    I tried to document the script a little bit, but I'm not that good at that either :)

    The only problem I have is that I can't compare any field other than the
    first one in
        for ex_phone in phones:
            telstr = ex_phone[0].lower()
    When I use telstr = ex_phone[0].lower() it says out of range and the
    strange thing is that the range is 6. I can't figure that out. So when
    I edit the csv I modify the look of the file and then I start the
    script and it works, but I wanted to use more than one condition and I
    can't :(




    import csv

    # Open the file with the names and addresses
    origf = open('c:/Working/vpharma.csv', 'rt')
    # Open the file with the phone numbers
    secfile = open('c:/Working/navori.csv', 'rt')

    # Creates the empty list with the names
    namelist = []
    # Creates the empty list with the phone numbers
    PHONELIST = []


    # Reads the file with the names
    # Format "Name","Phone"
    names = csv.reader(origf, delimiter=';')

    # Reads the file with the phone numbers
    # Format "First name","Lastname","Address","City","Country","Phone"
    phones = csv.reader(secfile, delimiter=';')

    # Creates a list with phone numbers
    #for tel in phones:
    #    PHONELIST.append(tel)


    def finder(Compare_Name,rows):
        '''
        Compare the names from the namelist with the names from the phonelist.
        If the name match - then the phone number is added to the specified field
        '''
        for ex_phone in phones:
            telstr = ex_phone[0].lower()
            print telstr
            if telstr.find(Compare_Name) >= 0:
                print "\nName found: %s" % Compare_Name
                namelist[rows][-1] = ex_phone[-1].lower()
            else:
                print "Not found %s" % Compare_Name
                pass
        return

    def name_find():
        rows = 0
        for row in names:
            namelist.append(row)
            Compare_Name = row[1].lower()
            finder(Compare_Name,rows)
            rows = rows+1

    if __name__ == '__main__':
        name_find()

    # Writes the list to a file
    wfile = open('c:/Working/ttest.csv', "wb")
    writer = csv.writer(wfile, delimiter=';')
    for insert in namelist:
        writer.writerow(insert)
    wfile.close()
     
    Anatoli Hristov, Nov 29, 2012
    #5
  6. Anatoli Hristov

    Thomas Bach Guest

    Can you please cut the message you are responding to down to the
    relevant parts?

    On Thu, Nov 29, 2012 at 11:22:28AM +0100, Anatoli Hristov wrote:
    > The only problem I have is that I can't compare any field other than the
    > first one in
    >     for ex_phone in phones:
    >         telstr = ex_phone[0].lower()
    > When I use telstr = ex_phone[0].lower() it says out of range and the
    > strange thing is that the range is 6. I can't figure that out.


    As I understand it, phones is a csv.reader instance and you are
    iterating over it repeatedly. But csv.reader does not work this
    way. You either have to reinstantiate phones with a fresh
    file descriptor (not so good) or cache the values in an appropriate
    data structure (better), e.g. a list.
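The single-pass behaviour is easy to see in a small Python 3 sketch. The in-memory data below is made up and stands in for the phone file.

```python
import csv
import io

source = io.StringIO("a;1\nb;2\n")  # made-up data in place of the phone file
reader = csv.reader(source, delimiter=";")

first_pass = [row for row in reader]
second_pass = [row for row in reader]  # the reader is already exhausted
print(len(first_pass), len(second_pass))  # 2 0

# Caching the rows in a list makes them reusable for every name you look up.
source.seek(0)
phones = list(csv.reader(source, delimiter=";"))
passes = [sum(1 for _ in phones) for _ in range(2)]
print(passes)  # [2, 2]
```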

    > import csv
    >
    > # Open the file with the names and addresses
    > origf = open('c:/Working/vpharma.csv', 'rt')
    > # Open the file with the phone numbers
    > secfile = open('c:/Working/navori.csv', 'rt')


    Note that you never close origf and secfile.

    > […]
    > # Reads the file with the phone numbers
    > # Format "First name","Lastname","Address","City","Country","Phone"
    > phones = csv.reader(secfile, delimiter=';')


    So this should probably be
    PHONES = list(csv.reader(secfile, delimiter=';'))

    (in uppercase letters as it is a global)

    > […]
    > if __name__ == '__main__':
    >     name_find()
    >
    > # Writes the list to a file
    > wfile = open('c:/Working/ttest.csv', "wb")
    > writer = csv.writer(wfile, delimiter=';')
    > for insert in namelist:
    >     writer.writerow(insert)
    > wfile.close()


    This should go either in the "if __name__ == …" part or in a function
    of its own.

    Also have a look at the with statement; you can use it in several
    places in your code.

    There are several other improvements you can make:
    + instead of having the file names hard-coded, try to use argparse to
    get them from the command line,
    + let functions stand on their own and use fewer globals,
    + try to avoid putting the type of the data structure in the name
    (e.g. names is IMHO a better name than namelist),
    + add tests.
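A rough Python 3 sketch of the first two suggestions (argparse for the file names, with for closing files). The argument names, the helper names and the example file names are all assumptions chosen for illustration; only the argparse and csv calls themselves are standard.

```python
import argparse
import csv


def parse_args(argv=None):
    """Read the three file names from the command line."""
    parser = argparse.ArgumentParser(
        description="Merge phone numbers into a name list.")
    parser.add_argument("names_file")
    parser.add_argument("phones_file")
    parser.add_argument("output_file")
    return parser.parse_args(argv)


def read_rows(path):
    """Return all rows of a ';'-separated file; the with block closes it."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter=";"))


# Example invocation with made-up file names; no file is opened here.
args = parse_args(["names.csv", "phones.csv", "merged.csv"])
print(args.names_file, args.phones_file, args.output_file)
```

Passing a list to `parse_args` is only there to keep the example self-contained; in a real script you would call it with no argument so it reads `sys.argv`.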

    Regards,
    Thomas
     
    Thomas Bach, Nov 29, 2012
    #6
  7. Anatoli Hristov wrote:

    > Hello,
    >
    > I tried to document the script a little bit, but I'm not that good at that either :)
    >
    > The only problem I have is that I can't compare any field other than the
    > first one in
    >     for ex_phone in phones:
    >         telstr = ex_phone[0].lower()
    > When I use telstr = ex_phone[0].lower() it says out of range and the
    > strange thing is that the range is 6. I can't figure that out. So when
    > I edit the csv I modify the look of the file and then I start the
    > script and it works, but I wanted to use more than one condition and I
    > can't :(
    >
    >


    Can you print ex_phone first? You are opening the files in text mode,
    so I wonder if the line endings are causing you to read an extra
    "line" in between. Can you try reading the csv as "rb" instead of
    "rt"?




    ~Ramit


     
    Prasad, Ramit, Nov 29, 2012
    #7
  8. Anatoli Hristov

    Dave Angel Guest

    On 11/29/2012 05:22 AM, Anatoli Hristov wrote:
    > <SNIP>
    > Hello,
    >
    > I tried to document the script a little bit, but I'm not that good at that either :)
    >
    > The only problem I have is that I can't compare any field other than the
    > first one in
    >     for ex_phone in phones:
    >         telstr = ex_phone[0].lower()
    > When I use telstr = ex_phone[0].lower() it says out of range and the
    > strange thing is that the range is 6. I can't figure that out. So when
    > I edit the csv I modify the look of the file and then I start the
    > script and it works, but I wanted to use more than one condition and I
    > can't :(
    >
    >
    >
    >
    > import csv
    >
    > # Open the file with the names and addresses
    > origf = open('c:/Working/vpharma.csv', 'rt')
    > # Open the file with the phone numbers
    > secfile = open('c:/Working/navori.csv', 'rt')
    >
    > # Creates the empty list with the names
    > namelist = []
    > # Creates the empty list with the phone numbers
    > PHONELIST = []
    >
    >
    > # Reads the file with the names
    > # Format "Name","Phone"
    > names = csv.reader(origf, delimiter=';')
    >
    > # Reads the file with the phone numbers
    > # Format "First name","Lastname","Address","City","Country","Phone"
    > phones = csv.reader(secfile, delimiter=';')
    >
    > # Creates a list with phone numbers
    > #for tel in phones:
    > #    PHONELIST.append(tel)

    Without populating the PHONELIST here, you have a serious problem. Why
    is it commented out?

    >
    > def finder(Compare_Name,rows):
    >     '''
    >     Compare the names from the namelist with the names from the phonelist.
    >     If the name match - then the phone number is added to the specified field
    >     '''
    >     for ex_phone in phones:


    You should be using PHONELIST here as well. phones is a pseudo-file,
    which can only be traversed once. A list can be traversed as many times
    as you like, which is quite a few in your code.

    >         telstr = ex_phone[0].lower()
    >         print telstr
    >         if telstr.find(Compare_Name) >= 0:
    >             print "\nName found: %s" % Compare_Name
    >             namelist[rows][-1] = ex_phone[-1].lower()
    >         else:
    >             print "Not found %s" % Compare_Name
    >             pass
    >     return
    >
    > def name_find():
    >     rows = 0
    >     for row in names:
    >         namelist.append(row)
    >         Compare_Name = row[1].lower()
    >         finder(Compare_Name,rows)
    >         rows = rows+1
    >
    > if __name__ == '__main__':
    >     name_find()
    >
    > # Writes the list to a file
    > wfile = open('c:/Working/ttest.csv', "wb")
    > writer = csv.writer(wfile, delimiter=';')
    > for insert in namelist:
    >     writer.writerow(insert)
    > wfile.close()


    As I said before, process both files into lists, one that you treat as
    constant (and therefore capitalized) and the other containing the data
    you intend to modify.

    It'd be much cleaner if you did all that input file parsing stuff in one
    function, returning only the lists. Call it just before calling
    name_find(). Similarly, the part you have at the end belongs in a
    different function, called just after calling name_find().
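One way that structure could look, as a Python 3 sketch rather than a finished program. The function names are invented, and the column positions (name in the first field, phone number in the last) are assumptions that may not match the real files; the substring test is the same crude match the script already uses.

```python
import csv


def load_rows(path):
    """Read one ';'-separated file into a list of rows."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter=";"))


def add_phones(names, phones):
    """Copy the phone number into every name row whose name is found."""
    for row in names:
        wanted = row[0].lower()
        for phone_row in phones:
            if wanted in phone_row[0].lower():
                row[-1] = phone_row[-1]
    return names


def save_rows(path, rows):
    """Write the merged rows back out."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter=";").writerows(rows)


# Tiny in-memory demonstration with made-up data; no files are touched.
names = [["gates", ""], ["jobs", ""]]
phones = [["Billgatesmicrosoft", "555-0100"]]
merged = add_phones(names, phones)
print(merged)  # [['gates', '555-0100'], ['jobs', '']]
```

In the real script the main part would then shrink to three calls: load both files, run the matching, save the result.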

    There's lots of other stuff that should be cleaner, but you've ignored
    nearly all the suggestions from various people.


    --

    DaveA
     
    Dave Angel, Nov 30, 2012
    #8
  9. > Can you print ex_phone first? You are opening the files in text mode,
    > so I wonder if the line endings are causing you to read an extra
    > "line" in between. Can you try reading the csv as "rb" instead of
    > "rt"?


    Yes, I did this: I now use the global list PHONELIST and open the CSV
    in binary mode. It works now.

    Thanks

    Anatoli
     
    Anatoli Hristov, Nov 30, 2012
    #9
  10. > As I said before, process both files into lists, one that you treat as
    > constant (and therefore capitalized) and the other containing the data
    > you intend to modify.
    >
    > It'd be much cleaner if you did all that input file parsing stuff in one
    > function, returning only the lists. Call it just before calling
    > name_find(). Similarly, the part you have at the end belongs in a
    > different function, called just after calling name_find().
    >
    > There's lots of other stuff that should be cleaner, but you've ignored
    > nearly all the suggestions from various people.


    I'm not ignoring anything, I just need more time :) I will clean it all up
    and will keep you updated, I promise.


    Regards

    Anatoli
     
    Anatoli Hristov, Nov 30, 2012
    #10