tool to compare DB schema against DDL file

F

Fabian Marin

First of all any feedback from you guys will be of tremendous help.

I'm looking for a good topic for my thesis (to major in Computer
Engineering) and at the same time I'm encountering a problem in my
current job. Simply stated:

I have a DDL file that describes the entire Data Model of an
application.

We use 6 different environments (including production) with a DB that
must (eventually) match the schema proscribed by the DDL.

We delegate the execution of the DDL to a DBA; I work in a highly
bureaucratic environment so I can't just go ahead and run DDL myself.
Additionally the app itself is currently running Java, though I want to
use Ruby to generate the tool I need right now.

Such a tool would parse the DDL as an external DSL and then compare it
with the current DB schema, to then generate a series of DDL statements
that would synchronize the two, a sort of DDL diff, if you will.

I've done my share of research, looking for a similar tool to avoid
reinventing the wheel. However, I could not reach the author of
db_discovery (http://rubyforge.org/projects/db-discovery/), and
otherwise I found no other author. It seems like db_discovery did not
find a niche even though it seems to be a very good tool.

Does anyone know if such a tool already exists and is actively
supported??? As a newbie Rubyist I'm trying to exploit the power of this
community by querying you guys to make sure such a tool would actually
be novel.

As for its implementation I have a couple of ideas (e.g. parse DDL as an
external DSL using a multiple dispatch library, use ActiveRecord to
reflect on a database's schema). However, I want to make sure that such
a tool would actually benefit the community.
 
R

Robert Klemme

First of all any feedback from you guys will be of tremendous help.

I'm looking for a good topic for my thesis (to major in Computer
Engineering) and at the same time I'm encountering a problem in my
current job. Simply stated:

I have a DDL file that describes the entire Data Model of an
application.

We use 6 different environments (including production) with a DB that
must (eventually) match the schema proscribed by the DDL.

We delegate the execution of the DDL to a DBA; I work in a highly
bureaucratic environment so I can't just go ahead and run DDL myself.
Additionally the app itself is currently running Java, though I want to
use Ruby to generate the tool I need right now.

Such a tool would parse the DDL as an external DSL and then compare it
with the current DB schema, to then generate a series of DDL statements
that would synchronize the two, a sort of DDL diff, if you will.

I've done my share of research, looking for a similar tool to avoid
reinventing the wheel. However, I could not reach the author of
db_discovery (http://rubyforge.org/projects/db-discovery/), and
otherwise I found no other author. It seems like db_discovery did not
find a niche even though it seems to be a very good tool.

Does anyone know if such a tool already exists and is actively
supported??? As a newbie Rubyist I'm trying to exploit the power of this
community by querying you guys to make sure such a tool would actually
be novel.

It wouldn't. This is a solved problem already, for example Sybase has
an excellent product which has sophisticated schema comparison
functionality with a good graphical UI:
http://www.sybase.com/products/modelingdevelopment/powerdesigner
As for its implementation I have a couple of ideas (e.g. parse DDL as an
external DSL using a multiple dispatch library, use ActiveRecord to
reflect on a database's schema). However, I want to make sure that such
a tool would actually benefit the community.

I would create these tools:

1. schema extractor into a standard format which you define (could be an
XML representation or just marshalled object model).

2. diff for the standard format schema (two way, maybe also three way)

3. maybe also a converter from a given DDL to the standard format so you
do not have to go through the DB although this might be fragile since
the actual DB's settings may influence how the schema eventually looks.

Kind regards

robert
 
F

Fabian Marin

Robert said:
It wouldn't. This is a solved problem already, for example Sybase has
an excellent product which has sophisticated schema comparison
functionality with a good graphical UI:
http://www.sybase.com/products/modelingdevelopment/powerdesigner

Is Sybase open source? I'm disregarding proprietary, non open source
solutions.
I would create these tools:

1. schema extractor into a standard format which you define (could be an
XML representation or just marshalled object model).

2. diff for the standard format schema (two way, maybe also three way)

3. maybe also a converter from a given DDL to the standard format so you
do not have to go through the DB although this might be fragile since
the actual DB's settings may influence how the schema eventually looks.

Kind regards

robert

How about this option?

Use RoR Active Record to create a model from a DDL file, and also a
comparable model from a DB, then generate a module that can compare
equivalent Entity/Table classes from both?
 
B

brabuhr

How about this option?

Use RoR Active Record to create a model from a DDL file, and also a
comparable model from a DB, then generate a module that can compare
equivalent Entity/Table classes from both?

RoR ActiveRecord can dump a ruby representation of the database schema
(db:schema:dump) and another rake task (db:structure:dump) that dumps
the database schema as SQL CREATE TABLE statements.

I have a (really, extremely very ugly) script to "diff" ruby-schema files:
ruby schemadiff.rb schema.production schema.development
schema.production has 22 tables
schema.development has 24 tables
Tables removed:
resources
Tables added:
faq
page_translations
surveys
Table answers:
Columns added:
created_at datetime
updated_at datetime
Table inquiries:
Columns added:
spam boolean
Table question_groups:
Columns added:
custom_class string
custom_renderer string
Table questions:
Columns removed:
is_mandatory integer
Columns added:
is_mandatory boolean
Table resources:
Columns removed:
created_at datetime
updated_at datetime
parent_id integer
size integer
content_type string
filename string
cat schemadiff.rb
def read_schema(filename)
{}.tap{|h| File.read(filename).scan(/^\W*(create_table\W*"(.*?)".*?$.*?^\=
W*end)$/m).each{|s|
h[s[1]] =3D s[0].scan(/^\W*t.(\w+)\W*"(\w+)".*?$/m)}}
end

cur =3D read_schema(ARGV[0])
new =3D read_schema(ARGV[1])

puts ARGV[0] + "\thas #{cur.keys.size} tables"
puts ARGV[1] + "\thas #{new.keys.size} tables"

puts "Tables removed:\n\t" + (cur.keys.sort - new.keys.sort).join("\n\t")
puts "Tables added:\n\t" + (new.keys.sort - cur.keys.sort).join("\n\t")

cur.keys.sort.each do |t|
rem =3D (cur[t].sort - (new[t] || []).sort).map{|a| "#{a[1]}\t#{a[0]}"}
add =3D ((new[t] || []).sort - cur[t].sort).map{|a| "#{a[1]}\t#{a[0]}"}
next if rem =3D=3D add
puts "Table #{t}:"
puts "\tColumns removed:\n\t\t" + rem.join("\n\t\t") unless rem =3D=3D []
puts "\tColumns added:\n\t\t" + add.join("\n\t\t") unless add =3D=3D []
end
 
R

Robert Klemme

Is Sybase open source?
No.

I'm disregarding proprietary, non open source
solutions.

Well, you didn't say so initially if I'm not mistaken.
How about this option?

Use RoR Active Record to create a model from a DDL file, and also a
comparable model from a DB, then generate a module that can compare
equivalent Entity/Table classes from both?

The key question is: can an AR model represent a DB schema entirely?
What does it tell you about constraints and triggers? Does it cover all
sorts of special indexes as Oracle's FBI's and special types found only
in one RDBMS?

Kind regards

robert
 
F

Fabian Marin

Robert said:
The key question is: can an AR model represent a DB schema entirely?
What does it tell you about constraints and triggers? Does it cover all
sorts of special indexes as Oracle's FBI's and special types found only
in one RDBMS?

Kind regards

robert

The answer might be that it cannot, it's a matter of where you put your
logic. RoR relies on logic in the AR model even for testing constraints
like uniqueness, as far as I know. They themselves admit RoR is
opinionated in many respects. I guess such a tool won't be useful for
developers that have opted for an architecture strongly reliant on the
Database (i.e. using triggers, stored procedures, sequences and so
forth).

And I do realize I did not state I was looking for open source
alternatives. I've looked in Rubyforge, and the ruby gems repository,
and have not found much.
 
F

Fabian Marin

unknown said:
RoR ActiveRecord can dump a ruby representation of the database schema
(db:schema:dump) and another rake task (db:structure:dump) that dumps
the database schema as SQL CREATE TABLE statements.

I have a (really, extremely very ugly) script to "diff" ruby-schema
files:
cat schemadiff.rb
def read_schema(filename)
{}.tap{|h|
File.read(filename).scan(/^\W*(create_table\W*"(.*?)".*?$.*?^\W*end)$/m).each{|s|
h[s[1]] = s[0].scan(/^\W*t.(\w+)\W*"(\w+)".*?$/m)}}
end

cur = read_schema(ARGV[0])
new = read_schema(ARGV[1])

puts ARGV[0] + "\thas #{cur.keys.size} tables"
puts ARGV[1] + "\thas #{new.keys.size} tables"

puts "Tables removed:\n\t" + (cur.keys.sort -
new.keys.sort).join("\n\t")
puts "Tables added:\n\t" + (new.keys.sort -
cur.keys.sort).join("\n\t")

cur.keys.sort.each do |t|
rem = (cur[t].sort - (new[t] || []).sort).map{|a| "#{a[1]}\t#{a[0]}"}
add = ((new[t] || []).sort - cur[t].sort).map{|a| "#{a[1]}\t#{a[0]}"}
next if rem == add
puts "Table #{t}:"
puts "\tColumns removed:\n\t\t" + rem.join("\n\t\t") unless rem == []
puts "\tColumns added:\n\t\t" + add.join("\n\t\t") unless add == []
end

Well I'll grant you one thing, it sure ain't pretty but it gets the job
done, at least with simple requirements. It even seems surprising that
it is fairly short. Then again, have you ever tried parsing DDL files
to generate a similar diff? I would think even mixing (a ruby schema
against a DDL file) would be possible.

Did you code this for fun or did your job demand it?
 
B

brabuhr

Well I'll grant you one thing, it sure ain't pretty but it gets the job
done, at least with simple requirements. =A0It even seems surprising that
it is fairly short. =A0Then again, have you ever tried parsing DDL files
to generate a similar diff? =A0I would think even mixing (a ruby schema
against a DDL file) would be possible.

I've not tried it; it should work, but I'm more comfortable reading
Ruby than SQL :)
Did you code this for fun or did your job demand it?

Mostly fun :) one recent case where it's handy is with an internal
Rails app I have that hits an Oracle database for an external PHP app;
for development, I originally did a schema:dump from Oracle and a
schema:load into H2. The script provides a quick way to get an idea
of what they've changed in the database.

At present, I'm just reading the existing data, so losing
database-side integrity in the rails schema.rb doesn't cause much
trouble.
 
R

Robert Klemme

The answer might be that it cannot, it's a matter of where you put your
logic. RoR relies on logic in the AR model even for testing constraints
like uniqueness, as far as I know. They themselves admit RoR is
opinionated in many respects. I guess such a tool won't be useful for
developers that have opted for an architecture strongly reliant on the
Database (i.e. using triggers, stored procedures, sequences and so
forth).

But what are you looking for? The subject says you want to compare a
schema and a DDL file now it seems you need to compare a schema and an
AR model. These are two quite different sets of requirements.

Cheers

robert
 
F

Fabian Marin

Robert said:
But what are you looking for? The subject says you want to compare a
schema and a DDL file now it seems you need to compare a schema and an
AR model. These are two quite different sets of requirements.

Cheers

robert

I ultimately need to compare a DB (extracting data to a DB custom tool
or from the system catalogs) and a DDL file. However it might be easier
if one or both of those forms are translated to an intermediate, more
easily comparable form such as ... an AR model! The idea is basically
to find an intermediate form that allows easy comparison of the two: a
DB's current schema, and a DDL file. Translating information to this
intermediate form would also allow maximum flexibility (e.g. how about
comparing two DDL files, two databases, a ruby schema and a DDL file).
 
F

Fabian Marin

Robert said:
I would create these tools:

1. schema extractor into a standard format which you define (could be an
XML representation or just marshalled object model).

2. diff for the standard format schema (two way, maybe also three way)

3. maybe also a converter from a given DDL to the standard format so you
do not have to go through the DB although this might be fragile since
the actual DB's settings may influence how the schema eventually looks.

Kind regards

robert

Robert, wouldn't an AR model fit the bill for the marshalled object
model in item #1 in your list? We're thinking similarly here. As long
as you can transform information into a common intermediate format, you
will be able to control it and digest it in useful ways. That info can
come from a DB schema, a DDL file or any other form
 
F

Fabian Marin

Roger said:
Is this just for oracle then?
I wrote one once for mysql :)
-r

Actually this would be for DB2. I'm not particularly fond of it but
I'll let you guess why in my project we have no choice in what DBMS
vendor to go with. I want it to at least work for DB2; whether expanding
to other vendors would benefit anyone would depend on whether there is
no open source tool for the job already.

Roger, what was it that you wrote once for mysql? Could you comment a
bit more on it?

Ah mysql, how I miss thee ...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top