Finding out what row DBI chokes on?

Discussion in 'Ruby' started by Francis Hwang, Dec 30, 2004.

  1. Some times I deal with a legacy table with screwy datetime info, so you
    get fields that in MySQL are '0000-00-00' or other kinds of invalid
    values ... I pull this out with DBI I get an ArgumentError that simply
    says "argument out of range", which I believe is being caused by DBI
    calling Time.gm( 0, 0, 0, 0, 0, 0 ) or something like that.

    Now, it makes sense to me that DBI doesn't want to deal with this, but
    in some cases it might be convenient for me to get a message that tells
    me specifically what row has the bad value. Does anybody know how I
    might do this, short of hacking DBI directly?

    Francis Hwang
    http://fhwang.net/
     
    Francis Hwang, Dec 30, 2004
    #1
    1. Advertising

  2. Hi Francis,

    --- Francis Hwang <> wrote:

    > Some times I deal with a legacy table with screwy
    > datetime info, so you
    > get fields that in MySQL are '0000-00-00' or other
    > kinds of invalid
    > values ... I pull this out with DBI I get an
    > ArgumentError that simply
    > says "argument out of range", which I believe is
    > being caused by DBI
    > calling Time.gm( 0, 0, 0, 0, 0, 0 ) or something
    > like that.


    Is it possible to do validation at the SQL level
    itself? What does the SQL generates the result look
    like?

    -- shanko





    __________________________________
    Do you Yahoo!?
    Dress up your holiday email, Hollywood style. Learn more.
    http://celebrity.mail.yahoo.com
     
    Shashank Date, Dec 31, 2004
    #2
    1. Advertising

  3. On Dec 30, 2004, at 7:00 PM, Shashank Date wrote:

    > Hi Francis,
    >
    > --- Francis Hwang <> wrote:
    >
    >> Some times I deal with a legacy table with screwy
    >> datetime info, so you
    >> get fields that in MySQL are '0000-00-00' or other
    >> kinds of invalid
    >> values ... I pull this out with DBI I get an
    >> ArgumentError that simply
    >> says "argument out of range", which I believe is
    >> being caused by DBI
    >> calling Time.gm( 0, 0, 0, 0, 0, 0 ) or something
    >> like that.

    >
    > Is it possible to do validation at the SQL level
    > itself? What does the SQL generates the result look
    > like?
    >
    > -- shanko


    Perhaps I didn't explain my problem very clearly. None of my current
    code is generating bad datetimes, but at work I deal with a database
    with legacy data going back 7 years, and from time to time I come
    across bad datetimes that were inserted years before. DBI can't handle
    these at all. Here's test code to show it:

    1) First I create a table and insert a bogus datetime:

    mysql> create table test ( dt datetime );
    Query OK, 0 rows affected (0.26 sec)

    mysql> insert into test( dt ) values( '2005-01-01' );
    Query OK, 1 row affected (0.13 sec)

    mysql> insert into test( dt ) values( '2004-01-01' );
    Query OK, 1 row affected (0.37 sec)

    mysql> insert into test( dt ) values( '0000-00-00' );
    Query OK, 1 row affected (0.21 sec)

    mysql> select * from test;
    +---------------------+
    | dt |
    +---------------------+
    | 2005-01-01 00:00:00 |
    | 2004-01-01 00:00:00 |
    | 0000-00-00 00:00:00 |
    +---------------------+
    3 rows in set (0.24 sec)

    2) Then I try to query this through DBI:

    irb(main):001:0> require 'dbi'=> true
    irb(main):002:0> dbh = DBI.connect( 'dbi:Mysql:test:localhost',
    'francis', 'xxxxxxx' )
    => #<DBI::DatabaseHandle:0x5e76a4 @trace_mode=2,
    @handle=#<DBI::DBD::Mysql::Database:0x5e6ec0 @handle=#<Mysql>,
    @mutex=#<Mutex:0x5df4b8 @waiting=[], @locked=false>,
    @have_transactions=true, @attr={"AutoCommit"=>true}>,
    @trace_output=#<IO:0x1d3a5c>>
    irb(main):003:0> dbh.select_all( 'select * from test' )
    ArgumentError: argument out of range
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/sql.rb:59:in `gm'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/sql.rb:59:in
    `as_timestamp'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/sql.rb:79:in `send'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/sql.rb:79:in `coerce'
    from
    /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:418:in
    `fill_array'
    from
    /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:415:in
    `each_with_index'
    from
    /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:415:in `each'
    from
    /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:415:in
    `each_with_index'
    from
    /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:415:in
    `fill_array'
    from
    /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:425:in `fetch'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:1145:in
    `fetch_all'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:1144:in `loop'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:1144:in
    `fetch_all'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:909:in
    `fetch_all'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:666:in
    `select_all'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:662:in
    `execute'
    from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:662:in
    `select_all'
    from (irb):3irb(main):004:0>
    irb(main):005:0*

    Now, it isn't surprising or necessarily a bug that DBI won't even
    sensibly process this value, and catching the exception is easy enough.
    But it slows me down a little that

    1) I don't know exactly what the bad value is. You'd be surprised how
    many ways MySQL lets you insert bad datetime strings.
    2) I don't know which row the bad value is in if I've done a select
    that should return more than 1 row.

    So I'm wondering if, short of patching DBI, there is a way for a DBI
    client (Lafcadio, in my case) to get more info to return a more
    informative error message. I'm big into informative errors.

    Francis Hwang
    http://fhwang.net/
     
    Francis Hwang, Jan 1, 2005
    #3
  4. Hi Francis,

    Francis Hwang wrote:
    > Perhaps I didn't explain my problem very clearly. None of my current
    > code is generating bad datetimes, but at work I deal with a database
    > with legacy data going back 7 years, and from time to time I come across
    > bad datetimes that were inserted years before.


    This much I had guessed from your earlier email, so I guess you had
    explained the problem clearly ;-)

    Apologies if my answers did not make sense to you.

    > DBI can't handle these at all. Here's test code to show it:


    <snip>

    > Now, it isn't surprising or necessarily a bug that DBI won't even
    > sensibly process this value, and catching the exception is easy enough.
    > But it slows me down a little that
    >
    > 1) I don't know exactly what the bad value is. You'd be surprised how
    > many ways MySQL lets you insert bad datetime strings.


    I deal with such data almost on a routine basis (although not using
    MySQL), and use standard "scrubbing techniques":

    1. Try to limit the data in the SELECT, like borrowing from your example:

    SELECT * FROM TEST WHERE DT > '1997-01-01';

    or


    SELECT DISTINCT DT FROM TEST;

    to see what values show up

    2. Use the datetime / string conversion functions to try to identify the
    bad data.

    3. If the data set is small (< 100,000 rows) use CURSOR logic (usually
    frowned upon) to identify the bad rows

    4. Export the table to a CSV and then use the text filtering tools and
    then Import it back

    Again, not knowing exactly what your situation is, I do not know if all
    these are even relevant. So I guess, what I am trying to say is, as far
    as possible try to handle it at the backend BEFORE it reaches DBI.


    > 2) I don't know which row the bad value is in if I've done a select that
    > should return more than 1 row.
    > So I'm wondering if, short of patching DBI, there is a way for a DBI
    > client (Lafcadio, in my case) to get more info to return a more
    > informative error message. I'm big into informative errors.


    I am afraid there may be no way to do that. But I am by no means a DBI
    expert.

    HTH,

    > Francis Hwang
    > http://fhwang.net/


    -- shanko
     
    Shashank Date, Jan 2, 2005
    #4
  5. On Jan 2, 2005, at 10:31 AM, Shashank Date wrote:

    > I deal with such data almost on a routine basis (although not using
    > MySQL), and use standard "scrubbing techniques":
    >
    > 1. Try to limit the data in the SELECT, like borrowing from your
    > example:
    >
    > SELECT * FROM TEST WHERE DT > '1997-01-01';
    >
    > or
    >
    >
    > SELECT DISTINCT DT FROM TEST;
    >
    > to see what values show up
    >
    > 2. Use the datetime / string conversion functions to try to identify
    > the bad data.
    >
    > 3. If the data set is small (< 100,000 rows) use CURSOR logic (usually
    > frowned upon) to identify the bad rows
    >
    > 4. Export the table to a CSV and then use the text filtering tools and
    > then Import it back
    >
    > Again, not knowing exactly what your situation is, I do not know if
    > all these are even relevant. So I guess, what I am trying to say is,
    > as far as possible try to handle it at the backend BEFORE it reaches
    > DBI.


    These scrubbing techniques do help, thanks.

    Francis Hwang
    http://fhwang.net/
     
    Francis Hwang, Jan 2, 2005
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. William Payne

    New compiler chokes on template class

    William Payne, Aug 21, 2004, in forum: C++
    Replies:
    3
    Views:
    379
    Old Wolf
    Aug 22, 2004
  2. ‘5ÛHH575-UAZWKVVP-7H2H48V3
    Replies:
    7
    Views:
    681
    Kanenas
    Feb 15, 2005
  3. Bram Stolk
    Replies:
    4
    Views:
    346
    Bram Stolk
    May 25, 2005
  4. Rene Pijlman
    Replies:
    6
    Views:
    686
    Fredrik Lundh
    May 29, 2006
  5. Thomas Reat

    Finding out a DBI connection is broken?

    Thomas Reat, Dec 29, 2003, in forum: Perl Misc
    Replies:
    3
    Views:
    121
    Craig Manley
    Jan 4, 2004
Loading...

Share This Page