How to link foreign keys & primary keys using python?

Discussion in 'Python' started by Harry George, Jun 9, 2006.

  1. Harry George

    Harry George Guest

    "sonal" <> writes:

    > Hi all,
    > I hv started with python just recently... and have been assigned to
    > make an utility which would be used for data validations...

    ,snip]
    > plz help me... till get my hands on python... :)
    >


    1. This sure looks like a school assignment.

    2. "till get my..." Do you actually not have python installed yet? It
    is hopeless to tackle this problem if you haven't done a few simple
    "hello, world" tasks.

    3. Do you have a data model? In this case, you need to think
    carefully about what a RDBMS, PK, and FK are. In otherwords, a
    metamodel. Generally, once you understand the data structures you
    should implement them pretty much verbatim in a "model" module.
    Then do reader/writer modules so you can load that model from your
    data sources and dump out to your data sinks.

    4. Do you have a testsuite and test harness? Put together your test
    harness, then develop for the simplest case, then add complexity.
    E.g., no FK, FK with 1 attr, FK with multiple attrs, FKs with
    shared attrs.


    --
    Harry George
    PLM Engineering Architecture
     
    Harry George, Jun 9, 2006
    #1
    1. Advertising

  2. Harry George

    sonal Guest

    Hi all,
    I hv started with python just recently... and have been assigned to
    make an utility which would be used for data validations...
    In short we take up various comma separated data files
    for eg: area.txt, school.txt, students.txt.... and so on (ok?!?)
    now,
    1. area code used in the school.txt must be defined in the area.txt
    (Primary key in area => area_code defined in area.txt
    & Foreign key on school => area_code defined in school.txt)

    i hv created primary key using the following piece of code:

    # primary key for area.txt (index created on the column AREACODE as per
    the
    # schema defined earlier )
    area_pk = PartitionedPK( name = 'Area PK ',
    save_to =
    '../Index/area_code.idx',
    fields = ['A_AREACODE'] )
    # col name in area schema

    # foreign key is defined as follows...
    school_fk = HashedFK( name = ' School code FK',
    load_from = '../Index/area_code.idx',
    fields = ['S_AREACODE'] )
    # col name in school schema

    Description for abv code:
    1. An index {area_code.idx } is formed on the field AREACODE in the
    area.txt
    (i.e.,A_AREACODE)
    2. The data values in the S_AREACODE field in the school.txt are
    checked in the index {area_code.idx}
    If the area code given in school.txt is not present in the area
    code, then the record is not validated(as foreign key constraint
    fails.)

    Now if the Primary key is on mutiple columns...the foreign key, which
    is also definedon the same no. of columns works..
    for eg..

    # primary key for school.txt (index created on the columns AREACODE &
    SCHOOLCODE as per the
    # schema defined earlier )
    school_pk = PartitionedPK( name = 'School PK ',
    save_to = '../Index/school.idx',
    fields =
    ['S_AREACODE','S_SCHOOLCODE'] ) # col names in school schema

    # foreign key for students is defined as follows...
    student_fk = HashedFK( name = ' STUDENT code FK',
    load_from = '../Index/student.idx',
    fields =
    ['STUD_AREACODE','STUD_SCHOOLCODE'] ) # col name in
    student schema

    Now if I hv to define foreign key on student.txt but with only one
    field, school code, so as to make sure that the school name given in
    the student.txt exists in the school.txt whatever be the area code...
    I am unable to do this...

    Say, if the AREACODE field is not present in the student.txt file....
    then???
    I tried using the foll code

    student_fk = HashedFK( name = ' Student FK',
    load_from = '../Index/school.idx',
    fields = ['STUD_SCHOOLCODE'] )

    Its showing an AttributeError : 'list' object has no attribute
    'has_key'

    I also tried making another index with only SCHOOLCODE field (another
    PK for school.txt) and the foreign key on student.txt to be loaded from
    this index...
    it still shows the same error
    That's may be becoz... one SCHOOLCODE may repeat often, with a
    different AREACODE... so may be we need to index on distinct
    SCHOOLCODES in school.txt...
    how do i do that??

    I hope i am not confusing the genious' ... plz help me...
    till get my hands on python... :)
     
    sonal, Jun 9, 2006
    #2
    1. Advertising

  3. Harry George

    MTD Guest

    Your post is confusing. Here is my advice: investigate the use of
    dictionaries. Dictionaries can allow you to define data in the form {
    key:data }, e.g.

    { area_code : area_data }

    { (area_code,school_code) : school_data }

    { (school_code,student_code) : student_data }
     
    MTD, Jun 9, 2006
    #3
  4. Harry George

    Guest

    Hi Mr. George,

    Let me try it again...

    I am not using any relational database to store the required tables
    with primary keys & foreign keys....

    When I say PRIMARY KEY =>
    1. It means an index is created on the specified fields
    (Out of various fields given in the comma separated txt file)
    FileFormat: CODE, FIRST_NAME, last_name, area_of_expertise, country
    Eg: A1,Harry,George, python, XYZCOUNTRY--------(1st record)

    2. The index can be formed on a single field or on multiple fields
    Eg: a. 'CODE' (single field ) {pk_code}
    b. 'CODE' & 'NAME' (multiple fields ) {pk_code_fname}

    Now when I say FOREIGN KEY =>
    1. If the foreign Key is formed on the field 'CODE' in another text
    file
    Format: subsriber_code,CODE,first_name, no_of_posts,active(Y/N)
    Eg: SUB_001, A1, Harry, 50, Y

    This means the CODE (A1) given here is checked in the index formed
    above
    with primary key: pk_code...

    2. If the foreign Key is formed on the fields 'CODE' & 'FIRST_NAME'
    Format: subsriber_code,CODE,FIRST_NAME, no_of_posts,active(Y/N)
    Eg: SUB_001, A1, Harry, 50, Y

    This means the CODE (A1) & FIRST_NAME (Harry) given here
    are checked in the index formed above with primary key:
    pk_code_fname...

    I am done till here.....

    The problem starts if I have defined an index on multiple fields
    (composite PK)
    say: CODE & FIRST_NAME (pk_code_fname)
    and if I need to define a FK on a single field out of these
    say: CODE

    I am unable to do that...
    Mr. George, I thought i must explain the code i am dealin with,
    for better understanding.., but i am sorry i confused you all the more
    (incase, u want to view the code please refer to the code snippets in
    my first query posted)

    I hope you atleast get an idea of what i am hunting for.... :(
     
    , Jun 12, 2006
    #4
  5. Harry George

    Guest

    MTD wrote:
    > Your post is confusing. Here is my advice: investigate the use of
    > dictionaries. Dictionaries can allow you to define data in the form {
    > key:data }, e.g.
    >
    > { area_code : area_data }
    >
    > { (area_code,school_code) : school_data }
    >
    > { (school_code,student_code) : student_data }


    Thanx Mr. Marc...
    I am surely investigating the dictionaries... but the problem is that
    I have to use the existing code...
    and thas what is creating problems for me... :(
     
    , Jun 12, 2006
    #5
  6. Harry George

    sonal Guest

    Hi Mr. George,
    Sorry for confusing u so much...
    Let me try it again...

    I am not using any relational database to store the required tables
    with primary keys & foreign keys....

    When I say PRIMARY KEY =>
    1. It means an index is created on the specified fields
    (Out of various fields given in the comma separated txt file)
    FileFormat: CODE, FIRST_NAME, last_name, area_of_expertise, country
    Eg: A1,Harry,George, python, XYZCOUNTRY--------(1st record)

    2. The index can be formed on a single field or on multiple fields
    Eg: a. 'CODE' (single field ) {pk_code}
    b. 'CODE' & 'NAME' (multiple fields ) {pk_code_fname}

    Now when I say FOREIGN KEY =>
    1. If the foreign Key is formed on the field 'CODE' in another text
    file
    Format: subsriber_code,CODE,first_name, no_of_posts,active(Y/N)
    Eg: SUB_001, A1, Harry, 50, Y

    This means the CODE (A1) given here is checked in the index formed
    above
    with primary key: pk_code...

    2. If the foreign Key is formed on the fields 'CODE' & 'FIRST_NAME'
    Format: subsriber_code,CODE,FIRST_NAME, no_of_posts,active(Y/N)
    Eg: SUB_001, A1, Harry, 50, Y

    This means the CODE (A1) & FIRST_NAME (Harry) given here
    are checked in the index formed above with primary key:
    pk_code_fname...

    I am done till here.....

    The problem starts if I have defined an index on multiple fields
    (composite PK)
    say: CODE & FIRST_NAME (pk_code_fname)
    and if I need to define a FK on a single field out of these
    say: CODE

    I am unable to do that...
    Mr. George, I thought i must explain the code i am dealin with,
    for better understanding.., but i am sorry i confused you all the more
    (incase, u want to view the code please refer to the code snippets in
    my first query posted)

    Thanks & regards,
    sonal
     
    sonal, Jun 12, 2006
    #6
  7. Harry George

    sonal Guest

    MTD wrote:
    > Your post is confusing. Here is my advice: investigate the use of
    > dictionaries. Dictionaries can allow you to define data in the form {
    > key:data }, e.g.
    >
    > { area_code : area_data }
    >
    > { (area_code,school_code) : school_data }
    >
    > { (school_code,student_code) : student_data }


    Thanx Mr. Marc...
    I am surely investigating the dictionaries... but the problem is that
    I have to use the existing code...
    and thats what is creating problems for me... :(
     
    sonal, Jun 12, 2006
    #7
  8. Harry George

    Steve Holden Guest

    wrote:
    > Hi Mr. George,
    >
    > Let me try it again...
    >
    > I am not using any relational database to store the required tables
    > with primary keys & foreign keys....
    >

    None the less, you are using relational database terminology. If you
    want people to understand you then you should use it correctly or
    explain yourself in other ways ...

    > When I say PRIMARY KEY =>
    > 1. It means an index is created on the specified fields
    > (Out of various fields given in the comma separated txt file)
    > FileFormat: CODE, FIRST_NAME, last_name, area_of_expertise, country
    > Eg: A1,Harry,George, python, XYZCOUNTRY--------(1st record)
    >
    > 2. The index can be formed on a single field or on multiple fields
    > Eg: a. 'CODE' (single field ) {pk_code}
    > b. 'CODE' & 'NAME' (multiple fields ) {pk_code_fname}
    >
    > Now when I say FOREIGN KEY =>
    > 1. If the foreign Key is formed on the field 'CODE' in another text
    > file
    > Format: subsriber_code,CODE,first_name, no_of_posts,active(Y/N)
    > Eg: SUB_001, A1, Harry, 50, Y
    >
    > This means the CODE (A1) given here is checked in the index formed
    > above
    > with primary key: pk_code...
    >
    > 2. If the foreign Key is formed on the fields 'CODE' & 'FIRST_NAME'
    > Format: subsriber_code,CODE,FIRST_NAME, no_of_posts,active(Y/N)
    > Eg: SUB_001, A1, Harry, 50, Y
    >
    > This means the CODE (A1) & FIRST_NAME (Harry) given here
    > are checked in the index formed above with primary key:
    > pk_code_fname...
    >
    > I am done till here.....
    >
    > The problem starts if I have defined an index on multiple fields
    > (composite PK)
    > say: CODE & FIRST_NAME (pk_code_fname)
    > and if I need to define a FK on a single field out of these
    > say: CODE
    >
    > I am unable to do that...
    > Mr. George, I thought i must explain the code i am dealin with,
    > for better understanding.., but i am sorry i confused you all the more
    > (incase, u want to view the code please refer to the code snippets in
    > my first query posted)
    >
    > I hope you atleast get an idea of what i am hunting for.... :(
    >

    You say you have to use existing code, which is a pity because it seems
    that code is organised in such a way as to make your problem difficult.

    The only ways to locate information by a *portion* of the primary key,
    given that you appear to be using tuples as primary key values, is
    either to maintain a separate index on the partial key (in other words
    have a lookup table that holds the partial key values).

    Note, also, that you run a risk here: there is no guarantee that only
    one record will have any given partial key value.

    The whole point of foreign keys in databases is that they are *complete*
    primary key values, and can therefore refer at at most one occurrence of
    the referenced entity. It is not correct to refer to a reference to a
    partial primary key as a "foreign key", precisely because all guarantees
    of uniqueness are lost.

    It seems that you are not a native English speaker, so I am sorry if
    this is leading to further problems.

    The more simply you can explain yourself the easier it will be to help!

    regards
    Steve
    --
    Steve Holden +44 150 684 7255 +1 800 494 3119
    Holden Web LLC/Ltd http://www.holdenweb.com
    Love me, love my blog http://holdenweb.blogspot.com
    Recent Ramblings http://del.icio.us/steve.holden
     
    Steve Holden, Jun 12, 2006
    #8
  9. Harry George

    Steve Holden Guest

    sonal wrote:
    > Hi Mr. George,
    > Sorry for confusing u so much...
    > Let me try it again...
    >
    > I am not using any relational database to store the required tables
    > with primary keys & foreign keys....
    >
    > When I say PRIMARY KEY =>
    > 1. It means an index is created on the specified fields
    > (Out of various fields given in the comma separated txt file)
    > FileFormat: CODE, FIRST_NAME, last_name, area_of_expertise, country
    > Eg: A1,Harry,George, python, XYZCOUNTRY--------(1st record)
    >
    > 2. The index can be formed on a single field or on multiple fields
    > Eg: a. 'CODE' (single field ) {pk_code}
    > b. 'CODE' & 'NAME' (multiple fields ) {pk_code_fname}
    >

    What, in Python, *are* these indexes - lists, dictionaries, tuples or
    something else?

    > Now when I say FOREIGN KEY =>
    > 1. If the foreign Key is formed on the field 'CODE' in another text
    > file
    > Format: subsriber_code,CODE,first_name, no_of_posts,active(Y/N)
    > Eg: SUB_001, A1, Harry, 50, Y
    >
    > This means the CODE (A1) given here is checked in the index formed
    > above
    > with primary key: pk_code...
    >
    > 2. If the foreign Key is formed on the fields 'CODE' & 'FIRST_NAME'
    > Format: subsriber_code,CODE,FIRST_NAME, no_of_posts,active(Y/N)
    > Eg: SUB_001, A1, Harry, 50, Y
    >
    > This means the CODE (A1) & FIRST_NAME (Harry) given here
    > are checked in the index formed above with primary key:
    > pk_code_fname...
    >
    > I am done till here.....
    >
    > The problem starts if I have defined an index on multiple fields
    > (composite PK)
    > say: CODE & FIRST_NAME (pk_code_fname)
    > and if I need to define a FK on a single field out of these
    > say: CODE
    >
    > I am unable to do that...
    > Mr. George, I thought i must explain the code i am dealin with,
    > for better understanding.., but i am sorry i confused you all the more
    > (incase, u want to view the code please refer to the code snippets in
    > my first query posted)
    >
    > Thanks & regards,
    > sonal
    >

    A lot depends on the data structure that the code uses to represent the
    "indexes". Perhaps you could explain how things are looked up in them?

    regards
    Steve
    --
    Steve Holden +44 150 684 7255 +1 800 494 3119
    Holden Web LLC/Ltd http://www.holdenweb.com
    Love me, love my blog http://holdenweb.blogspot.com
    Recent Ramblings http://del.icio.us/steve.holden
     
    Steve Holden, Jun 12, 2006
    #9
  10. Harry George

    sonal Guest

    Hi Mr. Steve,
    The *indexes* i am using are lists...
    The code for creation of the PartionedPK is given below...
    *******************************************************************************
    class PartitionedPK(object):
    def __init__(self, name, save_to, fields):
    self.name = name
    self.idx_name = save_to
    self.part_name = save_to + PARTITION_SUFFIX
    self.fields = fields
    self.digester = sha.new

    def setup(self, schema):
    self.partitions = [[] for i in range(256)]
    self.flush_pos = [[] for i in range(256)]
    self.flush_count = 0
    self.index = {}
    self.offsets = field_offsets(self.fields, schema)
    if not self.offsets:
    raise ValueError('One or more index field names are
    invalid')

    self.idx_file = open(self.idx_name, 'wb+')
    self.part_file = open(self.part_name, 'wb+')

    def save(self):
    pickle.dump(self.flush_count, self.part_file, -1)
    pickle.dump(self.flush_pos, self.part_file, -1)
    self.idx_file.close()
    self.part_file.close()

    def flush(self):
    self.flush_count += 1
    for i in range(256):
    self.flush_pos.append(self.idx_file.tell())
    pickle.dump(self.partitions, self.idx_file, -1)
    self.partitions = []

    def valid(self, record, data):
    key = self.digester(''.join( [data for i in self.offsets]
    )).digest()
    self.partitions[ ord(key[0]) ].append( (key, record) )
    # defer checking till later
    return True

    def finalize(self):
    self.flush()
    errors = []
    for bin in range(256):
    #show('Checking %s, bin %d/256 ... ' % (self.name, bin))
    seen = {}
    has = seen.has_key
    for flush in range(self.flush_count):
    self.idx_file.seek( self.flush_pos[bin][flush] )
    records = pickle.load(self.idx_file)
    for key, value in records:
    if has(key):
    errors.append(value)
    else:
    seen[key] = value
    return errors
    *******************************************************************************
    the PK definition is as follows:
    vol_pk = PartitionedPK( name = 'VOL_PK',
    save_to = '../Index/vol.idx',
    fields = ['ID','Type','Curr_Code','Tenor'])

    The code for the Foreign Key declaration (referencing to the indexes)
    is as given below...
    *******************************************************************************
    class HashedFK(object):
    def __init__(self, name, load_from, fields):
    self.name = name
    self.filename = load_from
    self.fields = fields
    self.digester = sha.new

    def setup(self, schema):
    self.index = {}
    self.offsets = field_offsets(self.fields, schema)

    if not self.offsets:
    raise ValueError('One or more index field names are
    invalid')

    file = open(self.filename, 'rb+')
    self.index = pickle.load(file)
    file.close()

    def valid(self, record, fields):
    key = self.digester(''.join( [fields for i in self.offsets]
    )).digest()

    return self.index.has_key(key)

    def flush(self):
    pass

    def finalize(self):
    return None
    *******************************************************************************
    the FK definition is as follows:
    vol_fk = HashedFK( name = ' VOL_FK,
    load_from = '../Index/vol.idx',
    fields= ['ID','Type','Curr_Code','Tenor'])

    The code is working fine when the foreign key is referenced to the
    complete primary key

    But if the FK were to be on only 'ID'
    the FK defn would have been like =>

    vol_fk = HashedFK( name = ' VOL_FK,
    load_from = '../Index/vol.idx',
    fields = ['ID'] )
    This is were the problem lies...
    it shows AttributeError: 'list' object has no attribute 'has_key'

    I have also tried defining another PK with a single field as follows =>
    Tvol_pk = PartitionedPK( name = 'TVOL_PK',
    save_to = '../Index/tvol.idx',
    fields = ['ID'] )
    The index ''tvol.idx'' is being created at the given location(path
    specified)
    but referencing to this index(i.e., tvol.idx) with the vol_fk given
    above also gives
    the same error.
     
    sonal, Jun 13, 2006
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?Utf-8?B?RGF2ZQ==?=

    Multiple primary keys with a datagrid?

    =?Utf-8?B?RGF2ZQ==?=, Dec 13, 2004, in forum: ASP .Net
    Replies:
    7
    Views:
    2,981
    Francois
    Dec 15, 2004
  2. Richard Corfield
    Replies:
    2
    Views:
    2,038
    Richard Corfield
    Apr 22, 2004
  3. H5N1
    Replies:
    0
    Views:
    449
  4. Wolfgang Keller
    Replies:
    5
    Views:
    494
    Wolfgang Keller
    Nov 9, 2007
  5. Lionel
    Replies:
    9
    Views:
    3,457
Loading...

Share This Page