perlio problem? redhat 9, perl 5.8.0

Discussion in 'Perl Misc' started by gordon, Jun 24, 2003.

  1. gordon

    gordon Guest

    I'm having some problems running a bit of legacy perl code on a newly
    installed redhat 9. With the perl 5.8.0 that comes with redhat 9, the
    match against the string literal works, but the match against the same
    string read from a file fails.

    The weird thing is that with a 5.8.0 compiled from source with default
    settings, both matches work, which is what we'd expect.

    To make the script work you can add a binmode F after the second file
    open, or you can change the [^\s]+ to [\S]+ or \S+. But the question
    is, why are either of these things necessary? I don't understand the
    fundamental cause of this problem and don't want to go through all the
    legacy scripts if I can help it.
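
    For reference, the two workarounds can be sketched roughly like this.
    It is only an illustration under assumptions: the scratch file from
    File::Temp and the helper name first_token are made up for the example
    and are not part of the legacy scripts.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the first run of non-whitespace, or undef.
# Uses \S+ in place of [^\s]+ (workaround 2).
sub first_token {
    my ($line) = @_;
    return $line =~ /^\s*(\S+)/ ? $1 : undef;
}

# Write a single ASCII line to a scratch file.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "this\n";
close($out) or die "close failed: $!";

# Read it back, calling binmode right after the open (workaround 1).
open(my $in, '<', $path) or die "cannot open $path: $!";
binmode($in);
my @lines = <$in>;
close($in);

for my $line (@lines) {
    my $token = first_token($line);
    print defined $token ? "matched '$token'\n" : "no match\n";
}
```

    Either change on its own was enough in my tests; the sketch just shows
    both in one place.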

    Is there a problem in the redhat 9 perl distribution, or just a
    misunderstanding of something on my part?

    Here's a tiny script to show the problem:

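    my $x = "this\n";

    # Write the string to a file, then read it back.
    open(F, ">/tmp/test.txt") or die "cannot write /tmp/test.txt: $!";
    print F $x;
    close(F);

    open(F, "</tmp/test.txt") or die "cannot read /tmp/test.txt: $!";
    my @lines = <F>;
    close(F);

    # Match against the string literal.
    if ($x =~ /^\s*([^\s]+)/)
    {
        print "'$x' internal string matched OK\n";
    }
    else
    {
        print "'$x' internal string FAILED!\n";
    }

    # Match against the same string read from the file.
    for $x (@lines)
    {
        if ($x =~ /^\s*([^\s]+)/)
        {
            print "'$x' file string matched OK\n";
        }
        else
        {
            print "'$x' file string FAILED!\n";
        }
    }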

    - gordon
    gordon, Jun 24, 2003
    #1

  2. gordon

    gordon Guest

    (gordon) wrote in message news:<>...

    > To make the script work you can add a binmode F after the second file
    > open, or you can change the [^\s]+ to [\S]+ or \S+. But the question
    > is, why are either of these things necessary? I don't understand the
    > fundamental cause of this problem and don't want to go through all the
    > legacy scripts if I can help it.


    As an update, I can run the legacy scripts without change with
    LC_ALL=POSIX in the environment. But I still don't understand why it's
    necessary when I've some simple ASCII text (read: unchanged in UTF8
    encoding vs ASCII encoding) being read/written to a file, and matched
    against [^\s].
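
    A small sketch for checking which locale setting is actually in effect
    when a script runs, which is what LC_ALL=POSIX overrides. The helper
    name effective_ctype_locale is hypothetical; the lookup order
    (LC_ALL, then LC_CTYPE, then LANG) is the usual POSIX precedence.

```perl
use strict;
use warnings;

# Hypothetical helper: report which environment variable decides the
# character-handling locale, following LC_ALL > LC_CTYPE > LANG.
sub effective_ctype_locale {
    my (%env) = @_;
    for my $name (qw(LC_ALL LC_CTYPE LANG)) {
        my $value = $env{$name};
        return ($name, $value) if defined $value && length $value;
    }
    return ('(none)', 'C');   # nothing set: the default "C" locale applies
}

my ($name, $value) = effective_ctype_locale(%ENV);
print "effective locale comes from $name: $value\n";
print "this looks like a UTF-8 locale\n" if $value =~ /utf-?8/i;
```

    Running it with and without LC_ALL=POSIX in the environment shows
    whether the default on the machine is a UTF-8 locale.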

    Very mysterious. Any feedback much appreciated!

    - gordon
    gordon, Jun 25, 2003
    #2

  3. On Tue, Jun 24, gordon inscribed on the eternal scroll:

    (well, no-one seems to have offered an answer, so I suppose I might
    try...)

    > The weird thing is that with a 5.8.0 compiled from source with default
    > settings, both matches work, which is what we'd expect.


    I don't see any logical reason why it would not work, so I'd rate it
    prima facie as a bug in the particular implementation that was
    giving the problem.

    Sorry, I'm not in a position to reproduce your error, so I'm neither
    confirming nor denying your report - just saying that on the basis of
    what you reported, it does seem like a bug.

    > Here's a tiny script to show the problem:


    [works for me, on several different platforms, but I didn't have the
    specific one you mentioned]

    (I think you'd need to report the release details of the RPM,
    the output of perl -V and so forth, to make it a proper bug report.)

    As you rightly say: with all of the characters involved being
    us-ascii, there shouldn't be any difference. Could it be that for
    some bizarre reason one of them got "upgraded" to unicode, and the
    other didn't, and they were then reported as not matching? But I
    might be talking rowlocks - it needs someone who understands the
    internals.
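
    One way to test the "upgraded to unicode" guess is to ask Perl how each
    string is stored internally. This is a rough sketch, not a diagnosis:
    the helper name storage_kind is made up, and utf8::upgrade is used here
    only to produce a flagged copy of the same ASCII text for comparison.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: describe how Perl is storing a string internally.
sub storage_kind {
    my ($string) = @_;
    return Encode::is_utf8($string) ? 'utf8-flagged' : 'bytes';
}

my $literal  = "this\n";      # plain ASCII literal
my $upgraded = "this\n";
utf8::upgrade($upgraded);     # same text, now stored with the UTF8 flag on

print "literal:  ", storage_kind($literal),  "\n";
print "upgraded: ", storage_kind($upgraded), "\n";
print "equal as strings\n" if $literal eq $upgraded;
```

    Calling storage_kind on the lines read from the file in the original
    script would show whether they arrive in a different internal form
    from the literal.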
    Alan J. Flavell, Jun 26, 2003
    #3
  4. gordon

    gordon Guest

    "Alan J. Flavell" <> wrote in message news:<>...
    > On Tue, Jun 24, gordon inscribed on the eternal scroll:
    >
    > (well, no-one seems to have offered an answer, so I suppose I might
    > try...)
    >
    > > The weird thing is that with a 5.8.0 compiled from source with default
    > > settings, both matches work, which is what we'd expect.

    >
    > I don't see any logical reason why it would not work, so I'd rate it
    > prima facie as a bug in the particular implementation that was
    > giving the problem.
    >
    > Sorry, I'm not in a position to reproduce your error, so I'm neither
    > confirming nor denying your report - just saying that on the basis of
    > what you reported, it does seem like a bug.
    >
    > > Here's a tiny script to show the problem:

    >
    > [works for me, on several different platforms, but I didn't have the
    > specific one you mentioned]
    >
    > (I think you'd need to report the release details of the RPM,
    > the output of perl -V and so forth, to make it a proper bug report.)
    >
    > As you rightly say: with all of the characters involved being
    > us-ascii, there shouldn't be any difference. Could it be that for
    > some bizarre reason one of them got "upgraded" to unicode, and the
    > other didn't, and they were then reported as not matching? But I
    > might be talking rowlocks - it needs someone who understands the
    > internals.


    Thanks for your comments!

    Yeah, it looks like a bug to me the more I look at it. There's no
    reason, unicodedness or otherwise, that a match using [^\s] fails, and
    using [\S] or \S works. The printout of the strings that occurs in the
    test script shows good text, so nothing's obviously out of whack. Oh
    and, as you point out, it runs fine on lots of systems.

    FYI this was a pristine installation of redhat 9 inside of vmware.
    I'd not upgraded the perl or anything when I got the problem. The
    perl -V is attached to the end of this message in case someone can say
    "my god, why did they use *those* options" :) -- I'll report a bug to,
    errr, redhat or perl.

    It would be good though to have confirmation from someone else who can
    run this on a pristine redhat 9 with no LC_ALL=POSIX or C in their
    environment.

    perl -V:

    Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
    Platform:
    osname=linux, osvers=2.4.20-2.48smp,
    archname=i386-linux-thread-multi
    uname='linux str'
    config_args='-des -Doptimize=-O2 -march=i386 -mcpu=i686 -g
    -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red
    Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux
    -Dvendorprefix=/usr -Dsiteprefix=/usr
    -Dotherlibdirs=/usr/lib/perl5/5.8.0 -Duseshrplib -Dusethreads
    -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db
    -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio
    -Dinstallusrbinperl -Ubincompat5005 -Uversiononly
    -Dpager=/usr/bin/less -isr'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef
    useithreads=define usemultiplicity=
    useperlio= d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=un uselongdouble=
    usemymalloc=, bincompat5005=undef
    Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
    -DDEBUGGING -fno-strict-aliasing -I/usr/local/include
    -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
    -DDEBUGGING -fno-strict-aliasing -I/usr/local/include
    -I/usr/include/gdbm'
    ccversion='', gccversion='3.2.2 20030213 (Red Hat Linux 8.0
    3.2.2-1)', gccosandvers=''
    intsize=e, longsize= , ptrsize=p, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define,
    longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=, Off_t='', lseeksize=8
    alignbytes=4, prototype=define
    Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt -lutil
    perllibs=
    libc=/lib/libc-2.3.1.so, so=so, useshrplib=true, libperl=libper
    gnulibc_version='2.3.1'
    Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef,
    ccdlflags='-rdynamic
    -Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='s


    Characteristics of this binary (from libperl):
    Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS
    USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
    Locally applied patches:
    MAINT18379
    Built under linux
    Compiled at Feb 18 2003 22:19:53
    @INC:
    /usr/lib/perl5/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/5.8.0
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl
    /usr/lib/perl5/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/5.8.0
    gordon, Jun 27, 2003
    #4