perl -d interfering with program execution?

Pat Deegan

Greetings,

I originally posted this to 'comp.lang.perl' which, apparently, doesn't
exist... brilliant...

I've been having issues while debugging programs. The problems only
appear when using the '-d' switch and I get the impression it has to do
with UTF8 and regex matching but I don't know how to deal with it.

One particular example is a program which uses the DBI to connect to a
database, which will run fine normally but will hang during the call to
DBI->connect() when running under the debugger.

In order to demonstrate the problem, I inserted a couple of bread crumbs
in the DBI.pm file, in the connect() method (at the <=== comments):

1) Just after the DSN is set (around line 460):


$dsn ||= $ENV{DBI_DSN} || $ENV{DBI_DBNAME} || '' unless $old_driver;

print STDERR "DSN: $dsn\n"; # <=== output DSN as passed to connect()


2) Just after the "dbi:driver" prefix is extracted from the DSN (near line
470):

# extract dbi:driver prefix from $dsn into $1
$dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
or '' =~ /()/; # ensure $1 etc are empty if match fails

print STDERR "DSN: $dsn\nDriver: '$1'\n"; # <=== output modified DSN
# and extracted driver



Running the program therefore outputs some info on the dsn and driver to
stderr. A normal program run outputs:

$ perl ./myapp.pl
DSN: dbi:mysql:database=mydb;host=localhost
DSN: database=mydb;host=localhost
Driver: 'mysql'
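
The extraction step that produces this output can be tried outside DBI. The following is only a sketch: split_dsn is a hypothetical helper, not part of DBI, wrapped around the same substitution and the same "or '' =~ /()/" idiom quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI): strips the "dbi:driver" prefix
# from a DSN and returns the driver name plus whatever is left.
sub split_dsn {
    my ($dsn) = @_;

    # Same substitution as in DBI.pm. If it fails, the always-successful
    # empty match resets $1 so a stale value from an earlier match is
    # not mistaken for the driver name.
    $dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
        or '' =~ /()/;

    my $driver = $1;
    return ($driver, $dsn);
}

my ($driver, $rest) = split_dsn('dbi:mysql:database=mydb;host=localhost');
print "DSN: $rest\nDriver: '$driver'\n";
```

Running this file with 'perl' and again with 'perl -d' shows whether the difference can be reproduced without DBI loaded.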


The problem appears when calling the program with -d. Here the program is
loaded in the debugger and simply 'c'ontinued.

$ perl -d ./myapp.pl

Loading DB routines from perl5db.pl version 1.19
Editor support available.

Enter h or `h h' for help, or `man perldebug' for more help.

main::(./myapp.pl:2): my $numTrials = shift || "1";
DB<1> c

DSN: dbi:mysql:database=mydb;host=localhost
DSN: dbi:mysql:database=mydb;host=localhost
Driver: ''


This is where the program hangs. It seems the (same) dsn did NOT match
the regex this time--the DSN is unchanged and the driver variable is
empty. Hitting ctrl-C will abort the loop and land me somewhere in a
UTF8-related sub, like:

utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:308):
308: for ($key = $min; $key <= $max; $key++) {



And all this only happens with 'perl -d'. I tried tweaking a few
environment variables, like LANG and SUPPORTED, setting them as


$ export LANG=en_US
$ export SUPPORTED=en_US:en

in a vain attempt to avoid UTF-8 altogether. This had no effect.

Does anyone know how to resolve or work around this problem?


Thanks in advance for any assistance.






In case it's relevant, here's the output of 'perl -V' on this Red Hat
Linux 9 (Shrike) system:


Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
Platform:
osname=linux, osvers=2.4.20-2.48smp, archname=i386-linux-thread-multi
uname='linux stripples.devel.redhat.com 2.4.20-2.48smp #1 smp thu feb
13 11:44:55 est 2003 i686 i686 i386 gnulinux ' config_args='-des
-Doptimize=-O2 -march=i386 -mcpu=i686 -g -Dmyhostname=localhost
-Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc.
-Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux
-Dvendorprefix=/usr -Dsiteprefix=/usr
-Dotherlibdirs=/usr/lib/perl5/5.8.0 -Duseshrplib -Dusethreads
-Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db
-Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio
-Dinstallusrbinperl -Ubincompat5005 -Uversiononly
-Dpager=/usr/bin/less -isr' hint=recommended, useposix=true,
d_sigaction=define usethreads=define use5005threads=undef
useithreads=define usemultiplicity=define useperlio=define
d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef
use64bitall=undef uselongdouble=undef usemymalloc=n,
bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-DDEBUGGING -fno-strict-aliasing -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
optimize='-O2 -march=i386 -mcpu=i686 -g', cppflags='-D_REENTRANT
-D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing
-I/usr/local/include -I/usr/include/gdbm' ccversion='',
gccversion='3.2.2 20030213 (Red Hat Linux 8.0 3.2.2-1)',
gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8,
byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define,
longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8,
Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib
/usr/lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt -lutil
perllibs=-lnsl -ldl -lm -lpthread -lc -lcrypt -lutil
libc=/lib/libc-2.3.1.so, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.3.1'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic
-Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS
USE_LARGE_FILES PERL_IMPLICIT_CONTEXT Locally applied patches:
MAINT18379
Built under linux
Compiled at Feb 18 2003 22:19:53
@INC:
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0
/usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0
/usr/lib/perl5/vendor_perl
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0
 
Vetle Roeim

* Pat Deegan
Greetings,

I originally posted this to 'comp.lang.perl' which, apparently, doesn't
exist... brilliant...

It shouldn't exist, but for some reason it exists on
groups.google.com. For instance, here's your post on comp.lang.perl:

<URL:http://tinyurl.com/29sms>

I'm confused.


[...]
 
Joe Smith

Pat Deegan wrote:
Wrote two separate posts; comp.lang.perl and comp.lang.perl.misc

1) Don't multi-post. It's OK to cross-post a single message to multiple
groups simultaneously; it's not OK to post twice.
# extract dbi:driver prefix from $dsn into $1
$dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
or '' =~ /()/; # ensure $1 etc are empty if match fails
print STDERR "DSN: $dsn\nDriver: '$1'\n"; # <=== output modified DSN

That's not the way to deal with $1.

my $raw_dsn = "dbi:foo(bar):xyz";
# The pattern has three capture groups: driver, optional (attributes), rest of DSN
if (my ($driver, $attr, $dsn) = $raw_dsn =~ /^dbi:(\w*?)(?:\((.*?)\))?:(.*)/i) {
    print STDERR "DSN: $dsn\nDriver: '$driver'\n";
} else {
    warn "Unable to parse '$raw_dsn'";
}

-Joe
 
Joe Smith

Vetle said:
* Pat Deegan


It shouldn't exist, but for some reason it exists on
groups.google.com.

Google archives everything, including newsgroups that were officially
disbanded many years ago. Too bad Google's USENET web interface
does not warn uninformed users about the folly of using comp.lang.perl.
 
Joe Smith

Pat said:
I originally posted this to 'comp.lang.perl' which, apparently, doesn't
exist... brilliant...

Apologies to anyone seeing uncancelled replies to the misunderstanding.
# extract dbi:driver prefix from $dsn into $1
$dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
or '' =~ /()/; # ensure $1 etc are empty if match fails
print STDERR "DSN: $dsn\nDriver: '$1'\n"; # <=== output modified DSN

A better way to deal with failed matches is:

$raw_dsn = "dbi:foo(bar):xyz";
# The pattern has three capture groups: driver, optional (attributes), rest of DSN
if (($driver, $attr, $dsn) = $raw_dsn =~ /^dbi:(\w*?)(?:\((.*?)\))?:(.*)/i) {
    print STDERR "DSN: $dsn\nDriver: '$driver'\n";
} else {
    warn "Unable to parse '$raw_dsn'";
    $dsn = $raw_dsn; # Hope this is OK
}

-Joe
 
Pat Deegan

Greets

A better way to deal with failed matches is:
[snip]

Cool. The code snippet is from the DBI module though, so I can't do many
permanent mods on that end.

I've isolated the problem a bit better now...

As stated, this regex (from DBI.pm):

/^dbi:(\w*?)(?:\((.*?)\))?:(.*)/i

works fine under normal execution but fails under 'perl -d'. However,
everything works ok under both 'perl' and 'perl -d' if I edit the DBI.pm
file and remove the 'i' modifier, setting the regex to:

/^dbi:(\w*?)(?:\((.*?)\))?:(.*)/

Although this workaround will let me debug my code now without hanging or
aborting (so long as I stick to lower case DSNs for the DBI), I am worried
this behavior will affect other parts of my programs during debugging.

Any clues as to what's going on?
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to
Pat Deegan
As stated, this regex (from DBI.pm):

/^dbi:(\w*?)(?:\((.*?)\))?:(.*)/i

works fine under normal execution but fails under 'perl -d'. However,
everything works ok under both 'perl' and 'perl -d' if I edit the DBI.pm
file and remove the 'i' modifier

Check the data it is matched against, and whether you can reproduce the
failure in a standalone perl. If not, I suspect a bug in DBI XS code,
which may write through a spurious pointer, breaking some Perl table.

Another hint: enable REx debugging, and see what differs between these
situations.
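
A standalone check along these lines might look like the sketch below. try_match is a hypothetical name, and testing a UTF-8-upgraded copy of the DSN is an assumption based on the UTF8-related symptoms reported earlier in the thread, not something confirmed at this point.

```perl
use strict;
use warnings;

# Hypothetical helper: match the DBI prefix pattern with or without /i
# and return the captured driver name, or undef if there is no match.
sub try_match {
    my ($dsn, $case_insensitive) = @_;
    my $re = $case_insensitive
        ? qr/^dbi:(\w*?)(?:\((.*?)\))?:/i
        : qr/^dbi:(\w*?)(?:\((.*?)\))?:/;
    return $dsn =~ $re ? $1 : undef;
}

my $plain    = 'dbi:mysql:database=mydb;host=localhost';
my $upgraded = $plain;
utf8::upgrade($upgraded);    # same characters, stored internally as UTF-8

for my $case ([ plain => $plain ], [ upgraded => $upgraded ]) {
    my ($label, $dsn) = @$case;
    for my $flag (0, 1) {
        my $driver = try_match($dsn, $flag);
        printf "%-8s /i=%d: %s\n", $label, $flag,
            defined $driver ? "driver '$driver'" : 'no match';
    }
}
```

Run it once as 'perl script.pl' and once as 'perl -d script.pl'. Adding -Mre=debug to either command line prints the regex engine trace, so the two runs can be compared.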

Hope this helps,
Ilya
 
Pat Deegan

Hello and thanks for your response,

Check the data it is matched against, and whether you can reproduce the
failure in a standalone perl. If not, I suspect a bug in DBI XS code,
which may write through a spurious pointer, breaking some Perl table.

My main question here is: why would this only happen under the -d
debugger?
Another hint: enable REx debugging, and see what differs between these
situations.

Ah, this is interesting. I've gone through the joyous process of
compiling and installing the latest stable Perl (and just about every
third party module I use), with -DDEBUGGING enabled in order to run the
program with the -Dr switch.

The same behavior is repeated.

The entire process takes about 10 times longer when running 'perl -Dr
./myapp.pl' than when I use 'perl -Dr -d ./myapp.pl'. With -d the REx
match fails like this:


Matching REx `^dbi:(\w*?)(?:\((.*?)\))?:' against
`dbi:mysql:database=mydb;host=localhost'
Setting an EVAL scope, savestack=223
0 <> <dbi:mysql:database=mydb;host=localhost> | 1: BOL
0 <> <dbi:mysql:database=mydb;host=localhost> | 2: EXACTF <dbi:>
Guessing start of match, REx `^_<' against
`/usr/lib/perl5/5.8.3/utf8.pm'...
String not equal...
Match rejected by optimizer



And here is the relevant portion of output when running standalone:


Matching REx `^dbi:(\w*?)(?:\((.*?)\))?:' against
`dbi:mysql:database=mydb;host=localhost'
Setting an EVAL scope, savestack=132
0 <> <dbi:mysql:database=mydb;host=localhost> | 1: BOL
0 <> <dbi:mysql:database=mydb;host=localhost> | 2: EXACTF <dbi:>
4 <dbi:> <mysql:database=mydb;host=localhost> | 4: OPEN1
4 <dbi:> <mysql:database=mydb;host=localhost> | 6: MINMOD
4 <dbi:> <mysql:database=mydb;host=localhost> | 7: STAR
Setting an EVAL scope, savestack=132


... goes on for a long time, then ...

9 <mysql> <:database=mydb;host=localhost> | 25: NOTHING
9 <mysql> <:database=mydb;host=localhost> | 26: EXACTF <:>
10 <ysql:> <database=mydb;host=localhost> | 28: END
Match successful!


So why does the re engine start
"Guessing start of match, REx `^_<' against `/usr/lib/perl5/5.8.3/utf8.pm'..."
when running 'perl -d ./myapp.pl' and not with just 'perl ./myapp.pl'?

Thanks for your assistance :)
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to
Pat Deegan
My main question here is: why would this only happen under the -d
debugger?

By definition. ;-) A write through a spurious pointer hits an
unpredictable target. Any change to memory layout may lead to a
change of symptoms...
Ah, this is interesting. I've gone through the joyous process of
compiling and installing the latest stable Perl (and just about every
third party module I use), with -DDEBUGGING enabled in order to run the
program with the -Dr switch.

?!!!

use re 'debugcolor';
Matching REx `^dbi:(\w*?)(?:\((.*?)\))?:' against
`dbi:mysql:database=mydb;host=localhost'
Setting an EVAL scope, savestack=223
0 <> <dbi:mysql:database=mydb;host=localhost> | 1: BOL
0 <> <dbi:mysql:database=mydb;host=localhost> | 2: EXACTF <dbi:>
Guessing start of match, REx `^_<' against
`/usr/lib/perl5/5.8.3/utf8.pm'...
String not equal...
Match rejected by optimizer

Your data is in utf8 mode. My guess is:

a) such a match calls an external Perl module to understand what
Unicode case-conversions of 'dbi:' are;

b) under debugger, calling Perl code results in a REx match /^_</
somewhere in the internals of the debugger;

c) As a result, another REx is executed while Perl is trying to
understand what `dbi:' means in the original REx;

d) apparently, nobody continued my work to make the REx engine
reenterable. As a result, the REx engine is confused.

So it looks like a triple fault (a+b+d).
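
If the diagnosis above is right, one conceivable workaround is to make sure the DSN is not in utf8 mode before it reaches the /i match. The sketch below is only that: plain_dsn is a hypothetical helper, not part of DBI, and whether it actually avoids the hang under 'perl -d' is an untested assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI): return a copy of the DSN with
# the internal UTF-8 flag cleared whenever the characters allow it.
sub plain_dsn {
    my ($dsn) = @_;

    # utf8::downgrade() does nothing to a string that is already
    # byte-encoded. The true second argument makes it return false
    # instead of dying when a character above 0xFF is present.
    utf8::downgrade($dsn, 1)
        or warn "DSN contains characters above 0xFF; left in utf8 mode\n";

    return $dsn;
}

# Illustrative use only; $user and $password are placeholders:
#   my $dbh = DBI->connect(plain_dsn($dsn), $user, $password);
```

The helper works on a copy, so the caller's string is left untouched. A pure-ASCII DSN such as the one in this thread downgrades without loss.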

Enjoy, :-(
Ilya
 
Joe Smith

Pat said:
Guessing start of match, REx `^_<' against
`/usr/lib/perl5/5.8.3/utf8.pm'... String not equal...
Match rejected by optimizer

Just a wild guess: Does putting in a dummy block to get utf8 loaded before
the regexp make any difference?

{ use utf8; } # Pre-load the utf8 module just in case
/^dbi:(\w*?)(?:\((.*?)\))?:(.*)/i
 
