Namespaces/introspection: collecting sql strings for validation

M

Martin Drautzburg

I would like to validate sql strings, which are spread all over the
code, i.e. I run ("prepare") them against a database to see if it happy
with the statements. Spelling errors in sql have been a major pain for
me.

The statements will not be assembled from smaller pieces, but they will
not neccessarily be defined at module level. I could live with class
level, but method level would be best. And I definitely don't want to
parse the source file, but I am willing to mark the strings so it is
easier to tell sql from non-sql.

So the code will look like this

class Foo:(...):
def aQuery(...):
stmt = """
-- checkSql
select 1 from dual
"""
executeQuery()

at the end of the file I would like to write something like
if (...):
validateSql()

The validateSql() is the problem. It would be imported from elsewhere.
and has to do some serious magic. It would have to lookup its caller's
module, find all the classes (and methods, if we go that far) extract
the constants, check if any of them are an SQL statement and validate
it.

The problem is the first part: how can I lookup the callers module and
the classobjs defined in there? Or finding any constant strings in the
caller's module would also be just fine. Or is there a completely
different way to do such a thing?
 
P

Peter Otten

Martin said:
I would like to validate sql strings, which are spread all over the
code, i.e. I run ("prepare") them against a database to see if it happy
with the statements. Spelling errors in sql have been a major pain for
me.

The statements will not be assembled from smaller pieces, but they will
not neccessarily be defined at module level. I could live with class
level, but method level would be best. And I definitely don't want to
parse the source file, but I am willing to mark the strings so it is
easier to tell sql from non-sql.

So the code will look like this

class Foo:(...):
def aQuery(...):
stmt = """
-- checkSql
select 1 from dual
"""
executeQuery()

at the end of the file I would like to write something like
if (...):
validateSql()

The validateSql() is the problem. It would be imported from elsewhere.
and has to do some serious magic. It would have to lookup its caller's
module, find all the classes (and methods, if we go that far) extract
the constants, check if any of them are an SQL statement and validate
it.

The problem is the first part: how can I lookup the callers module and
the classobjs defined in there? Or finding any constant strings in the
caller's module would also be just fine. Or is there a completely
different way to do such a thing?

Since all strings are constants you could just tokenize the source code:

def strings(filename):
with open(filename, "rU") as instream:
for t in tokenize.generate_tokens(instream.readline):
if t[0] == token.STRING:
yield eval(t[1])

def validateSQL(filename=None):
if filename is None:
# by default operate on the calling module
filename = sys._getframe(1).f_globals["__file__"]
for s in strings(filename):
print "validating", repr(s)

Another option would be to mark SQL statements similar to gettext by
enclosing them in a function call

sql = SQL("select * from...")

and then defining SQL() as either a no-op in production or an actual
validation while you are debugging. Even simpler and safer would be to
always validate once:

def SQL(sql, checked=set()):
if sql in checked:
return True
if not valid_sql(sql): raise ValueError
checked.add(sql)
return sql

Peter
 
M

Martin Drautzburg

def validateSQL(filename=None):
if filename is None:
# by default operate on the calling module
filename = sys._getframe(1).f_globals["__file__"]
for s in strings(filename):
print "validating", repr(s)

This involves parsing the file. I can see that it would even work on a
pyc file and it actually does solve the problem. Still (for the glory
of the human mind) I would like to do this without parsing a file, but
just the python internal memory.
Another option would be to mark SQL statements similar to gettext by
enclosing them in a function call

sql = SQL("select * from...")

and then defining SQL() as either a no-op in production or an actual
validation while you are debugging. Even simpler and safer would be to
always validate once:

def SQL(sql, checked=set()):
if sql in checked:
return True
if not valid_sql(sql): raise ValueError
checked.add(sql)
return sql

No this does not do the trick. I will not be able to validate an sql
statement bofore I run over the piece of code that uses it. Or I would
have to define all my sql = SQL stuff on module level, isn't id. I
mean, the problem is: when are those sql = SQL statement really
ececuted?
 
S

Scott David Daniels

Martin said:
I would like to validate sql strings, which are spread all over the
code, .... The statements will not be assembled from smaller pieces,
but they will not neccessarily be defined at module level. I could
live with class level, ....
parse the source file, but I am willing to mark the strings so it is
easier to tell sql from non-sql.

... Or is there a completely different way to do such a thing?

How about using some variation of:
class _Dummy: pass
OLD_STYLE = type(_Dummy)


def generate_strings(module):
'''Generate <class> <name> <value> triples for a module'''
for top_level_name in dir(module):
top_level_value = getattr(module, top_level_name)
if isinstance(top_level_value, basestring): # strings
yield None, top_level_name, top_level_value
elif isinstance(top_level_value, type): # new-style class
for name in dir(top_level_value):
value = getattr(top_level_value, name)
if isinstance(value, basestring):
yield top_level_name, name, value


def sometest(somestring):
'''Your criteria for "is an SQL string and has misspellings".'''
return len(somestring) > 20 and '


def investigate(module):
for modname in sys.argv[1:]:
for class_, name, text in generate_strings(
__import__(modname)):
if remarkable(text):
if class_ is None:
print 'Look at %s's top-level string %s.' % (
modname, name)
else:
print "Look at %s, class %s, string %s.' %
modname, class_, name)


if __name__ == '__main__':
import sys
for modname in sys.argv[1: ]:
investigate(modname, sometest)
 
G

George Sakkis

I would like to validate sql strings, which are spread all over the
code, i.e. I run ("prepare") them against a database to see if it happy
with the statements. Spelling errors in sql have been a major pain for
me.

The statements will not be assembled from smaller pieces, but they will
not neccessarily be defined at module level. I could live with class
level, but method level would be best. And I definitely don't want to
parse the source file, but I am willing to mark the strings so it is
easier to tell sql from non-sql.

So the code will look like this

class Foo:(...):
def aQuery(...):
stmt = """
-- checkSql
select 1 from dual
"""
executeQuery()

at the end of the file I would like to write something like
if (...):
validateSql()

The validateSql() is the problem. It would be imported from elsewhere.
and has to do some serious magic. It would have to lookup its caller's
module, find all the classes (and methods, if we go that far) extract
the constants, check if any of them are an SQL statement and validate
it.

The problem is the first part: how can I lookup the callers module and
the classobjs defined in there? Or finding any constant strings in the
caller's module would also be just fine. Or is there a completely
different way to do such a thing?

Yes, there is: use an ORM to do the SQL generation for you. Check out
SQLAlchemy, it will buy you much more than what you asked for.

George
 
A

Alex Martelli

Martin Drautzburg said:
The problem is the first part: how can I lookup the callers module and
the classobjs defined in there? Or finding any constant strings in the
caller's module would also be just fine. Or is there a completely
different way to do such a thing?

Don't do black magic in production code.

For just hacking around, see sys._getframe -- it can give you a frame
object from where you can introspect into your caller's globals -- and
the inspect module of the standard Python library.

But don't put such black magic in production. The completely different
way is: just don't.


Alex
 
P

Peter Otten

Martin said:
No this does not do the trick. I will not be able to validate an sql
statement bofore I run over the piece of code that uses it. Or I would
have to define all my sql = SQL stuff on module level, isn't id. I
mean, the problem is: when are those sql = SQL statement really
ececuted?

Let's see:
.... print sql
........ b = SQL("class")
.... def method(self, c=SQL("default arg")):
.... d = SQL("method")
....
class # ha, class statements are executed, too...
default arg # ...as are default arguments

Peter
 
M

Martin Drautzburg

George said:
Yes, there is: use an ORM to do the SQL generation for you. Check out
SQLAlchemy, it will buy you much more than what you asked for.

Might look, though in general I don't like OR mappers much. Having SQL
generated feels as strange as having python code generated. Too much
magic, too many layers. I think it is better to simply learn SQL.

And I don't really believe in OO databases much. OO databases have been
around for several decades and still have not reached the maturity of
relational databases. My feeling is that OO and persistence to not play
together well.
 
M

Martin Drautzburg

Peter said:
...     print sql
...
...     b = SQL("class")
...     def method(self, c=SQL("default arg")):
...             d = SQL("method")
...

You are my hero. Indeed very cool!
 
A

Aahz

Don't do black magic in production code.

For just hacking around, see sys._getframe -- it can give you a frame
object from where you can introspect into your caller's globals -- and
the inspect module of the standard Python library.

But don't put such black magic in production. The completely different
way is: just don't.

Could you expand on that? After all, that's exactly what we do to
implement a super() that works with classic classes -- and it's certainly
production code.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"...string iteration isn't about treating strings as sequences of strings,
it's about treating strings as sequences of characters. The fact that
characters are also strings is the reason we have problems, but characters
are strings for other good reasons." --Aahz
 
A

Alex Martelli

Aahz said:
Could you expand on that? After all, that's exactly what we do to
implement a super() that works with classic classes -- and it's certainly
production code.

Personally, I'd much rather give up super (not a big loss) and classic
classes (a substantial net gain:) than have to maintain such "black
magic" in a production environment. Explicit is better than implicit,
&c.


Alex
 
A

Alex Martelli

Martin Drautzburg said:
Might look, though in general I don't like OR mappers much. Having SQL
generated feels as strange as having python code generated. Too much
magic, too many layers. I think it is better to simply learn SQL.

Amen, brother; this opinion is definitely an outdated and minority one,
today, but I do tend to share it (and yet I'm programming in Django a
lot these days... ah well, consistency, hobgoblin, little minds, &c:).


Alex
 
A

Aahz

Personally, I'd much rather give up super (not a big loss) and classic
classes (a substantial net gain:) than have to maintain such "black
magic" in a production environment. Explicit is better than implicit,
&c.

This application was started with Python 1.4. We still use Python 2.2,
which has some subtle but critical differences with the way new-style
classes are handled in 2.3+. Using super() makes our plugin model work;
otherwise, we'd have to switch to a non-cooperative mechanism for calling
plugin hooks. That would have some advantages, of course, but so much
code would need rewriting, and managing the calling of plugin hooks would
require creating an API registry.

The world is not always a simple place....
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"...string iteration isn't about treating strings as sequences of strings,
it's about treating strings as sequences of characters. The fact that
characters are also strings is the reason we have problems, but characters
are strings for other good reasons." --Aahz
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,770
Messages
2,569,583
Members
45,073
Latest member
DarinCeden

Latest Threads

Top