[ANN] Struqtural: High level database interface library

Discussion in 'Python' started by Nathan Rice, Jul 17, 2010.

  1. Nathan Rice

    Nathan Rice Guest

    Struqtural makes it easy to get data into a database, and easy to work
    with it once it's there. Some of the big features include:

    * Automatically generate all tables and relations needed to represent
    XML in a database, including one to one, one to many, many to one and
    many to many relationships (JSON and YAML support is planned for a
    future release), returning mapped objects.
    * Automatically generate python objects for a specified table in a
    database, with support for discovery of all types of relationships
    (sort of like SQL Soup on 4 different kinds of steroids).
    * Automatically create persisted representations of python objects in
    a database, along with a persistent version of the object class.
    * Automatically infer SQL data types and create table representations
    for delimited text files (i.e. CSV/TSV), returning mapped objects.
    * Easily generate and query EAV/vertical attribute tables.
    * Easily generate and query graphs/trees/directed acyclic graphs.
    * Easily manage session configuration and creation for multiple databases.
    * Easily override almost all behavior with custom code or
    configuration variables
    * And much more...


    Once you're up and running, it's SQL Alchemy under the hood, with a
    few tune-ups that you are free to bypass if you don't like them.


    Home page: http://www.turnkey-analytics.com/struqtural/
    PyPI: http://pypi.python.org/pypi/Struqtural
    Source: https://code.launchpad.net/pydatastep
    Bugs: https://bugs.launchpad.net/pydatastep


    To see Struqtural in action, take a look at the tutorial...
    =================================================

    Struqtural has been designed to be as smart as possible. Because of
    that, most common use cases are retardedly simple. Let’s take a look,
    shall we?

    First, let’s just examine the most basic use case:

    >>> from struqtural.structures.structure import Structure
    >>> from struqtural.loaders import loader
    >>> data = [{"A":True, "B":1, "C":1.5, "D":"a"},

    .... {"A":False, "B":2, "C":2.5, "D":"b"},
    .... {"A":True, "B":3, "C":3.5, "D":"c"},
    .... {"A":False, "B":4, "C":4.5, "D":"d"}]
    >>> example_structure = Structure(loader.CollectionLoader, data, "B", "SimpleInstance")
    >>> print "primary keys:", example_structure.primary_keys

    primary keys: ['B']
    # Note that you could have also specified the primary key using an iterable
    >>> print "table name:", example_structure.table

    table name: SimpleInstances
    >>> print "table columns:", example_structure.columns

    table columns: ['SimpleInstances.A', 'SimpleInstances.C',
    'SimpleInstances.B', 'SimpleInstances.D']
    >>> for true_instance in example_structure.filter("B>2"):

    .... print true_instance
    SimpleInstance(A=True, C=3.5, B=3, D="c")
    SimpleInstance(A=False, C=4.5, B=4, D="d")
    >>> for small_true_instance in example_structure.query().filter("A", "C<3"):

    .... print small_true_instance
    SimpleInstance(A=True, C=1.5, B=1, D="a")
    Another nice feature is that Struqtural is pretty flexible about how
    you pass it data:

    >>> from struqtural.structures.structure import Structure
    >>> from struqtural.loaders import loader
    >>> more_data = [["T", "1", "1.0"],

    .... ["FALSE", "2", "2.0"],
    .... ["true", "3", "3.0"],
    .... ["False", "4", "4.0"]]
    # Note that more_data is strings, which isn't uncommon in the real world
    >>> data_headers = ["A", "B", "E"]
    >>> categories = Structure(loader.CollectionLoader, (data_headers, more_data), ["B", "E"], "Category")

    # Just a quick aside to demonstrate that Struqtural gets your table names right
    >>> print "table name:", categories.table

    table name: Categories
    >>> for category in categories:

    .... print category
    ....
    Category(A=True, B=1, E=1.0)
    Category(A=False, B=2, E=2.0)
    Category(A=True, B=3, E=3.0)
    Category(A=False, B=4, E=4.0)
    As you can see the strings have been handled elegantly. Type
    conversion is completely controllable and extensible. A small
    collection of useful type inference and conversion functions have been
    included.

    How about if we want to create a new structure out of multiple
    previously existing ones:

    >>> example_structure = Structure(loader.CollectionLoader, data, "B", "SimpleInstance")
    >>> connector = example_structure.connector
    >>> categories = Structure(loader.CollectionLoader, (data_headers, more_data), ["B", "E"], "Category", connector)
    >>> joint_structure = Structure(loader.JoinLoader, categories, example_structure)
    >>> for instance in joint_structure:

    .... print instance
    Category(A=True, B=1, E=1.0, C=1.5, D="a")
    Category(A=False, B=2, E=2.0, C=2.5, D="b")
    Category(A=True, B=3, E=3.0, C=3.5, D="c")
    Category(A=False, B=4, E=4.0, C=4.5, D="d")
    Note that we took advantage of some intelligence on the join we did
    there, since we have identically named, identically typed columns in
    the two tables. If you needed to be explicit about it you could just
    as easily have called structure like so (where the dictionary key is
    the column on the left table, in this case categories, and the value
    is the column on the joined table):

    >>> joined_with = (example_structure, {'A':'A', 'B':'B'})
    >>> joint_structure = Structure(loader.JoinLoader, categories, joined_with)

    Next, let’s create persisted versions of pre-existing objects:

    >>> from struqtural.structures.structure import Structure
    >>> from struqtural.loaders import loader
    >>> class Mock(object):

    .... def __init__(self, a, b, c):
    .... self.a = a
    .... self.b = b
    .... self.c = c
    >>> mocks = [Mock(1, 2, 3), Mock(4, 5, 6), Mock(7, 8, 9)]
    >>> another_example_structure = Structure(loader.ObjectLoader, mocks, "a")
    >>> for mock in another_example_structure.query():

    .... print mock
    Mock(a=1, c=3, b=2)
    Mock(a=4, c=6, b=5)
    Mock(a=7, c=9, b=8)
    Pretty easy, right? Notice that you got a nice __repr__ for free
    there, too. Any type which can be converted to a known SQL primitive
    will be mapped automatically by default, though this is easily
    configurable. If a variable cannot be directly mapped, or more than
    one type occurs on for a given variable name, by default it will be
    pickled so that what comes off the database is identical to what went
    in, though this can also be easily disabled.

    Note
    This does not directly instrument pre-existing objects. It populates
    the database with the data of the objects. Any changes made to the
    original Mock objects in the above example will not be persisted. I
    suggest replacing the original objects with the new objects from the
    session immediately. Also, rather than modify the original class, a
    modified subclass is created, so it is advisable to save the modified
    class as well. This was a design decision, however it would be trivial
    to switch, so the behavior may change depending on user feedback.
    Now on to something a little bit more complicated... Struqtural can
    automatically generate a schema from XML. As a basic example, let’s
    use the following document:

    <Sales>
    <Order>
    <Number>123</Number>
    <Date>10/29/00</Date>
    <Customer>
    <CustNum>007</CustNum>
    <Name>Bond, Inc.</Name>
    </Customer>
    <Line>
    <LineNum>1</LineNum>
    <Quantity>3</Quantity>
    <Part>
    <PartNum>ABC</PartNum>
    <Price>12.95</Price>
    </Part>
    </Line>
    </Order>
    </Sales>
    What happens when you give this document to Struqtural? Let’s take a look:

    >>> from struqtural.structures.structure import MultiStructure
    >>> from struqtural.loaders import loader, file as file_
    >>> multi = MultiStructure(file_.XmlMultiLoader, fake_file_1)
    >>> order_structure = multi.properties["Order"]
    >>> for order in order_structure.query():

    .... print "Order number:", order.number
    .... print "Order date:", order.date
    .... print "Order customer count:", len(order.customers)
    .... for customer in order.customers:
    .... print "Order customer:", customer
    .... print "Order line count:", len(order.lines)
    .... for line in order.lines:
    .... print "Order line:", line
    .... print "Order line part count:", len(line.parts)
    .... for part in line.parts:
    .... print "Order line part:", part
    ....
    Order number: 123
    Order date: 2000-10-29
    Order customer count: 1
    Order customer: Customer(order_id=1, custnum=7, name="Bond, Inc.", id=1)
    Order line count: 1
    Order line: Line(order_id=1, linenum=1, id=1, quantity=3)
    Order line part count: 1
    Order line part: Part(line_id=1, price=12.95, id=1, partnum="ABC")
    Ok, Struqtural automatically figured out all the relations and
    properties from the XML file, that’s pretty neat. Because we didn’t
    set any tag primary keys or provide a SchemaFormatter object with a
    primary key identification engine, an id key is automatically added.
    Since this behavior is typically not desirable, it is very easy to
    override.

    Note
    You will probably notice that our customer has an order_id... In this
    case, that customer appears only on the one Order, so that is not too
    odd. Had the customer appeared on multiple orders, the order would
    have had a customer_id. Even better, if multiple customers appeared on
    any order, a secondary table would be automatically created to handle
    the relationship, as we’ll demonstrate in a minute.
    So Struqtural can get data into a database easily, you say. What if my
    data is already in the database? What can you do for me? No problem,
    check it out:

    >>> from struqtural.structures.structure import MultiStructure
    >>> from struqtural.loaders import loader, file as file_
    >>> multi = MultiStructure(file_.XmlMultiLoader, fake_file_1)
    >>> order_structure = multi.properties["Order"]
    >>> connector = order_structure.connector

    # This isn't strictly needed, but gives us nice, informative names.
    # The key in names is the table name, the value is the name you would
    like to give
    # the instance you are creating to represent that table.
    >>> names = {"Customers":"Customer", "Lines":"Line", "Parts":"Part"}
    >>> multi = MultiStructure(loader.DatabaseMultiLoader, "Orders", connector, names)
    >>> order_structure = multi.properties["Orders"]
    >>> for order in order_structure.query():

    .... print "Order number:", order.number
    .... print "Order date:", order.date
    .... print "Order customer count:", len(order.customers)
    .... for customer in order.customers:
    .... print "Order customer:", customer
    .... print "Order line count:", len(order.lines)
    .... for line in order.lines:
    .... print "Order line:", line
    .... print "Order line part count:", len(line.parts)
    .... for part in line.parts:
    .... print "Order line part:", part
    Order number: 123
    Order date: 2000-10-29
    Order customer count: 1
    Order customer: Customer(order_id=1, custnum=7, name="Bond, Inc.", id=1)
    Order line count: 1
    Order line: Line(order_id=1, linenum=1, id=1, quantity=3)
    Order line part count: 1
    Order line part: Part(line_id=1, price=12.95, partnum="ABC", id=1)
    You just got the “Orders” table and everything connected with it
    (including stuff that points to it as well as stuff it points to)
    loaded and mapped, and it took almost no work! You’re free to do
    interesting stuff! Pretty awesome, huh?

    Let’s push things to the edge now with a quick demo of many to many
    relationship support. For this example we’re going to be using the
    following XML:

    <Departments>
    <Department>
    <DeptNum>123</DeptNum>
    <DeptName>Sales</DeptName>
    <Employee>
    <Number>143</Number>
    <Name>Raul Lopez</Name>
    </Employee>
    <Employee>
    <Number>687</Number>
    <Name>John Smith</Name>
    </Employee>
    <Employee>
    <Number>947</Number>
    <Name>Ming Chu</Name>
    </Employee>
    </Department>
    <Department>
    <DeptNum>456</DeptNum>
    <DeptName>Marketing</DeptName>
    <Employee>
    <Number>157</Number>
    <Name>Jim Jones</Name>
    </Employee>
    <Employee>
    <Number>687</Number>
    <Name>John Smith</Name>
    </Employee>
    <Employee>
    <Number>947</Number>
    <Name>Ming Chu</Name>
    </Employee>
    </Department>
    </Departments>
    Note that we’re also going to be passing tag primary keys here, so
    that things work properly:

    >>> from struqtural.structures.structure import MultiStructure
    >>> from struqtural.loaders import loader, file as file_
    >>> primary_keys = {"Department":"DeptNum", "Employee":"Number"}
    >>> multi = MultiStructure(file_.XmlMultiLoader, fake_file_2,

    .... primary_keys=primary_keys)
    >>> department_structure = multi.properties["Department"]
    >>> for department in department_structure.query():

    .... print "Department number:", department.deptnum
    .... print "Employee count:", len(department.employees)
    .... for employee in department.employees:
    .... print "Employee name:", employee.name
    .... print "Employee number:", employee.number
    ....
    Department number: 123
    Employee count: 3
    Employee name: Raul Lopez
    Employee number: 143
    Employee name: John Smith
    Employee number: 687
    Employee name: Ming Chu
    Employee number: 947
    Department number: 456
    Employee count: 3
    Employee name: Jim Jones
    Employee number: 157
    Employee name: John Smith
    Employee number: 687
    Employee name: Ming Chu
    Employee number: 947
    If you already have a database with many to many relationships, that
    is no problem. As long as the secondary tables for the relationship
    don’t contain any non-foreign key columns or foreign keys for more
    than two tables, you’re in good shape. For example:

    >>> from struqtural.structures.structure import MultiStructure
    >>> from struqtural.loaders import loader, file as file_
    >>> primary_keys = {"Department":"DeptNum", "Employee":"Number"}
    >>> multi = MultiStructure(file_.XmlMultiLoader, fake_file_2,

    .... primary_keys=primary_keys)
    >>> department_structure = multi.properties["Department"]
    >>> connector = department_structure.connector
    >>> names = {"Departments":"Department", "Employees":"Employee"}
    >>> multi = MultiStructure(loader.DatabaseMultiLoader, "Departments", connector,

    .... names, backref_by_default=True)
    >>> departments = multi.properties["Departments"]
    >>> for department in department_structure.query():

    .... print "Department number:", department.deptnum
    .... print "Employee count:", len(department.employees)
    .... for employee in department.employees:
    .... print "Employee name:", employee.name
    .... print "Employee number:", employee.number
    ....
    Department number: 123
    Employee count: 3
    Employee name: Raul Lopez
    Employee number: 143
    Employee name: John Smith
    Employee number: 687
    Employee name: Ming Chu
    Employee number: 947
    Department number: 456
    Employee count: 3
    Employee name: Jim Jones
    Employee number: 157
    Employee name: John Smith
    Employee number: 687
    Employee name: Ming Chu
    Employee number: 947
    You will note that in all these examples we used a memory resident
    SQLite database. If you want to use a real database, you will need to
    pass a connector to your structure. You create one like so:

    >>> from struqtural.database import DatabaseConnector
    >>> connector = DatabaseConnector("my_database_config.cfg")

    That’s it! In fact, Struqtural will even automatically look in a few
    logical places for your configuration information, so you can set it
    up once and you’ll never have to pass it the location of a
    configuration file again. Just to give you an example of the format
    for the configuration file (this is included in the resources
    subdirectory):

    [current engine]
    engine=<db engine>

    [parameter values]
    user=<user>
    pass=<pass>
    host=<host>
    db=<database>

    [parameter lists]
    mysql=mysql://{user}:{pass}@{host}/{db}
    postgresql=postgresql://{user}:{pass}@{host}/{db}
    Note
    The configuration file shown here does not include driver, port or
    other information that you would probably want to be able to specify
    in a real case. Those parameters are supported, so if you modify the
    configuration to include them, the
    struqtural.database.connector.Databaseconnector will work as expected.
     
    Nathan Rice, Jul 17, 2010
    #1
    1. Advertising

  2. Nathan Rice

    sturlamolden Guest

    Re: Struqtural: High level database interface library

    On 17 Jul, 07:29, Nathan Rice <> wrote:

    > Let’s push things to the edge now with a quick demo of many to many
    > relationship support. For this example we’re going to be using the
    > following XML:
    >
    > <Departments>
    >     <Department>
    >         <DeptNum>123</DeptNum>
    >         <DeptName>Sales</DeptName>
    >         <Employee>
    >             <Number>143</Number>
    >             <Name>Raul Lopez</Name>
    >         </Employee>
    >         <Employee>
    >             <Number>687</Number>
    >             <Name>John Smith</Name>
    >         </Employee>
    >         <Employee>
    >             <Number>947</Number>
    >             <Name>Ming Chu</Name>
    >         </Employee>
    >     </Department>
    >     <Department>
    >         <DeptNum>456</DeptNum>
    >         <DeptName>Marketing</DeptName>
    >         <Employee>
    >             <Number>157</Number>
    >             <Name>Jim Jones</Name>
    >         </Employee>
    >         <Employee>
    >             <Number>687</Number>
    >             <Name>John Smith</Name>
    >         </Employee>
    >         <Employee>
    >             <Number>947</Number>
    >             <Name>Ming Chu</Name>
    >         </Employee>
    >     </Department>
    > </Departments>



    Oh yes, I'd rather write pages of that rather than some SQL in a
    Python string.
     
    sturlamolden, Jul 17, 2010
    #2
    1. Advertising

  3. Re: Struqtural: High level database interface library

    On 7/17/2010 6:25 AM sturlamolden said...
    > On 17 Jul, 07:29, Nathan Rice<> wrote:
    >
    >> Let’s push things to the edge now with a quick demo of many to many
    >> relationship support. For this example we’re going to be using the
    >> following XML:
    >>
    >> <Departments>
    >> <Department>
    >> <DeptNum>123</DeptNum>
    >> <DeptName>Sales</DeptName>
    >> <Employee>
    >> <Number>143</Number>
    >> <Name>Raul Lopez</Name>
    >> </Employee>
    >> <Employee>
    >> <Number>687</Number>
    >> <Name>John Smith</Name>
    >> </Employee>
    >> <Employee>
    >> <Number>947</Number>
    >> <Name>Ming Chu</Name>
    >> </Employee>
    >> </Department>
    >> <Department>
    >> <DeptNum>456</DeptNum>
    >> <DeptName>Marketing</DeptName>
    >> <Employee>
    >> <Number>157</Number>
    >> <Name>Jim Jones</Name>
    >> </Employee>
    >> <Employee>
    >> <Number>687</Number>
    >> <Name>John Smith</Name>
    >> </Employee>
    >> <Employee>
    >> <Number>947</Number>
    >> <Name>Ming Chu</Name>
    >> </Employee>
    >> </Department>
    >> </Departments>

    >
    >
    > Oh yes, I'd rather write pages of that rather than some SQL in a
    > Python string.
    >


    That's not the point. I've got examples of XML content that I don't
    create that could be tamed quite easily (if I understand from a quick
    once over).

    This looks really interesting. I've got to go read more now...

    Emile
     
    Emile van Sebille, Jul 17, 2010
    #3
  4. Nathan Rice

    Nathan Rice Guest

    Re: Struqtural: High level database interface library

    > Oh yes, I'd rather write lines of that rather than pages of SQL in a Python string.

    (not to mention, avoid some easy to fall into security flaws, not have
    to worry about porting dialect specific SQL code, etc, etc).

    Fixed that for you. I can't take the credit for that part though,
    that magic comes from SQL Alchemy. All Struqtural does is let you
    move data into a database easily, move data from a database to Python
    easily, and simplify the creation of some common complex schemas.
     
    Nathan Rice, Jul 17, 2010
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    8
    Views:
    479
  2. Jp Calderone

    python-xlib -- high-level interface?

    Jp Calderone, Jul 18, 2003, in forum: Python
    Replies:
    0
    Views:
    393
    Jp Calderone
    Jul 18, 2003
  3. pabbu
    Replies:
    8
    Views:
    769
    Marc Boyer
    Nov 7, 2005
  4. Scorpiion
    Replies:
    1
    Views:
    1,401
    Scorpiion
    Dec 25, 2008
  5. Gavin Sinclair
    Replies:
    0
    Views:
    138
    Gavin Sinclair
    Jul 25, 2010
Loading...

Share This Page