Wide character in print

Discussion in 'Perl Misc' started by Yuri Shtil, Jul 31, 2003.

  1. Yuri Shtil

    Yuri Shtil Guest

    Hi all

    I am getting this when I try to print certain strings. Is it harmless ?

    If not, how do I get rid of it ?

    Yuri.
     
    Yuri Shtil, Jul 31, 2003
    #1

  2. "Yuri Shtil" <> wrote in message
    news:a3eWa.21387$cF.8823@rwcrnsc53...
    > Hi all
    >
    > I am getting this when I try to print certain strings. Is it harmless ?
    >
    > If not, how do I get rid of it ?
    >
    > Yuri.


    Upgrade! Fixed on Eniac 2.

    gtoomey
     
    Gregory Toomey, Aug 1, 2003
    #2

  3. Eric Amick

    Eric Amick Guest

    On Sat, 02 Aug 2003 19:01:28 GMT, "Yuri Shtil" <> wrote:

    >What is Eniac 2 ?
    >
    >Sorry for ignorance !!!


    It's a stupid joke. Ignore it. I suspect you're trying to print Unicode
    characters to a filehandle that isn't expecting them. You should be able
    to fix the problem by adding

    binmode(FILEHANDLE, ":utf8");

    after the opening of the filehandle. If that doesn't work, you should be
    able to turn off the warning.

    perldoc perldiag
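
    For illustration, here is a minimal sketch of the warning and of the
    binmode fix suggested above. It assumes Perl 5.8 or later and writes to
    in-memory filehandles so that nothing depends on a terminal or a real
    file; the variable names are invented for the example.

    ```perl
    use strict;
    use warnings;

    my $text = "snowman: \x{2603}\n";   # U+2603 is above 255, hence "wide"

    # Collect warnings instead of letting them go to STDERR.
    my @warnings;
    $SIG{__WARN__} = sub { push @warnings, @_ };

    # 1. No layer on the handle: perl warns, then writes UTF-8 bytes anyway.
    my $raw = '';
    open(my $plain, '>', \$raw) or die "open failed: $!";
    print {$plain} $text;
    close($plain);

    # 2. Same data, but the handle is marked as UTF-8 first: no warning.
    my $marked = '';
    open(my $utf8, '>', \$marked) or die "open failed: $!";
    binmode($utf8, ':utf8');
    print {$utf8} $text;
    close($utf8);

    print "warnings seen: ", scalar(@warnings), "\n";
    print "first warning: $warnings[0]" if @warnings;
    ```

    Only the first print should warn. Whether UTF-8 is actually what the
    consumer of the output wants is a separate question that the layer
    cannot answer for you.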

    --
    Eric Amick
    Columbia, MD
     
    Eric Amick, Aug 3, 2003
    #3
  4. On Sun, Aug 3, Eric Amick inscribed on the eternal scroll:

    > On Sat, 02 Aug 2003 19:01:28 GMT, "Yuri Shtil" <>

    had, it seems, blurted out atop a fullquote:

    > >What is Eniac 2 ?
    > >
    > >Sorry for ignorance !!!

    >
    > It's a stupid joke.


    Well, I thought it was rather amusing; but then, the hon. Usenaut
    could perhaps be advised to pay more attention to Usenet posting
    conventions, and to entrust unknown terminology to a search engine of
    their choice before revealing ignorance of the history of computers in
    public... [An aside on the topic of character coding and old
    computers: http://www.mailcom.com/besm6/ shows what can happen when
    people try to put two different character codings into the same web
    page - Mozilla decided it must be Chinese, with unfortunate
    results...] [OK, so BESM-6 was a youngster compared to ENIAC]

    > I suspect you're trying to print Unicode
    > characters to a filehandle that isn't expecting them.


    OK, let's get serious.

    There is a Perl document (perldiag) which lists the error messages
    issued by perl itself. For 5.8.0 this document could be perused at
    http://www.perldoc.com/perl5.8.0/pod/perldiag.html ,
    although it's also part of any complete Perl installation.

    This should be the _first_ recourse for any unrecognised message.

    And indeed, here is the offending item:

    Wide character in %s
    (W utf8) Perl met a wide character (>255) when it wasn't expecting
    one. This warning is by default on for I/O (like print) but can be
    turned off by no warnings 'utf8';. You are supposed to explicitly
    mark the filehandle with an encoding, see open and perlfunc/binmode.

    Seems to me that the key phrase here is "You are supposed to...".

    > You should be able to fix the problem by adding
    >
    > binmode(FILEHANDLE, ":utf8");


    Do you think so? That tells Perl that the filehandle *is* expecting
    utf-8 encoding, but if it isn't in fact expecting it, then it's
    likely to cause an even worse problem.

    If the hon. Usenaut is expecting a particular character coding on
    their output, I would recommend (in 5.8.0) defining that coding in
    an encoding layer, to give Perl the chance to convert between "Wide
    characters" internally, and the expected encoding externally.

    Without some context, I've no idea whether the material in question
    might want to be koi8-r (the traditional encoding for Russian
    Cyrillic), or nothing more exciting than Windows-1252; but either way,
    an :encoding layer is what I'd recommend.

    The relevant documentation page that's called out from the binmode()
    page is: http://www.perldoc.com/perl5.8.0/lib/open.html

    (In earlier Perl versions, one needs to call the encoding explicitly,
    instead of including it in the open/binmode calls).
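
    As a rough sketch of the :encoding layer approach, assuming Perl 5.8 or
    later and picking koi8-r purely as an example (nothing in the thread
    establishes the real target encoding), the same six Cyrillic letters can
    be written through two different layers and the resulting bytes counted.
    The helper encoded_bytes is hypothetical.

    ```perl
    use strict;
    use warnings;

    # Six Cyrillic letters, written as escapes to avoid source-encoding issues.
    my $word = "\x{41F}\x{440}\x{438}\x{432}\x{435}\x{442}";

    # Print a string through the given layer into an in-memory file and
    # return the bytes that came out the other side.
    sub encoded_bytes {
        my ($layer, $string) = @_;
        my $buffer = '';
        open(my $fh, ">$layer", \$buffer) or die "cannot open in-memory handle: $!";
        print {$fh} $string;
        close($fh) or die "close failed: $!";
        return $buffer;
    }

    my $koi8 = encoded_bytes(':encoding(koi8-r)', $word);
    my $utf8 = encoded_bytes(':encoding(UTF-8)',  $word);

    printf "koi8-r: %d bytes, UTF-8: %d bytes\n", length($koi8), length($utf8);
    ```

    For a handle that is already open, such as STDOUT, the same layer can
    be applied with binmode(STDOUT, ':encoding(koi8-r)').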

    > If that doesn't work, you should be able to turn off the warning.


    But again: the warning is there for a reason. Just hiding the warning
    doesn't make that reason go away. I would recommend identifying and
    then solving the problem, not just hiding it.

    You then added, almost it seems as an afterthought:

    > perldoc perldiag


    Oh, right: but I'd suggest putting that up-front, IMNSHO it's the
    single most important part of this reply.

    cheers
     
    Alan J. Flavell, Aug 3, 2003
    #4
  5. Yuri Shtil

    Yuri Shtil Guest

    I am amazed at how a simple question can start something close to a flame
    war!
    Are only those superbly educated in computer history allowed to participate
    in this group?

    On a serious note, my problem showed up when I tried to parse/write XML
    code that came from a third-party application.
    So I have no idea what to expect, since the application does not specify the
    encoding (or at least I don't know how to extract it).

    These wide characters just showed up in some records.

    There is another problem.

    My code passes extracted XML strings to another application as counted
    strings. It seems that the Perl length function returns an incorrect result
    when these "wide" characters are present.

    Again, please pardon my ignorance and try to avoid flaming each other.
     
    Yuri Shtil, Aug 4, 2003
    #5
  6. On Mon, Aug 4, Yuri Shtil continued in TOFU style:

    > Are only those superbly educated in computer history allowed to participate
    > in this group?


    You're no fun in a Usenet discussion...

    > On a serious note, my problem showed up when I tried to parse/write XML
    > code that came from a third-party application.
    > So I have no idea what to expect, since the application does not specify the
    > encoding


    But a text file is, in general, useless without a specification of its
    character encoding.

    > (or at least I don't know how to extract it).


    It's not normally something that one can "extract" in any formal way
    from the datastream itself; it's a piece of meta-data that goes along
    with the data. However, with some samples and some knowledge of
    context, someone could well offer a hypothesis.

    Perhaps if you'd show the data in context (accompanied for example by
    a hexadecimal dump of the bytes), someone could offer a suggestion
    about what it is.
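
    One way to produce such a dump, sketched under the assumption that the
    record has been read as raw bytes (for example from a handle opened
    with the :raw layer). The helper hex_dump and the sample string are
    made up for the illustration and are not the poster's data.

    ```perl
    use strict;
    use warnings;

    # Render each byte of a byte string as two lowercase hex digits.
    sub hex_dump {
        my ($bytes) = @_;
        return join ' ', map { sprintf '%02x', $_ } unpack('C*', $bytes);
    }

    # Hypothetical record: "caf" followed by the two UTF-8 bytes for e-acute.
    my $sample = "caf\xc3\xa9";

    print hex_dump($sample), "\n";   # prints: 63 61 66 c3 a9
    ```

    Posting output like this next to the text as it appears on screen gives
    readers something concrete to base a guess on.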

    > These wide characters just showed up in some records.


    That's not a very definite description of symptoms, you know. I think
    we could have guessed that for ourselves based on your previous
    presentation. I for one was hoping to see something more definite in
    the way of an exhibit.

    > There is another problem.
    >
    > My code passes extracted XML strings to another application as counted
    > strings. It seems that the Perl length function returns an incorrect result
    > when these "wide" characters are present.


    I'd have to guess that the Perl length function returns what it's
    documented to return, but that you're expecting something different.
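
    To make the probable mismatch concrete: on a character string, length
    counts characters, not bytes. The sketch below assumes that the
    receiving application wants a count of UTF-8 bytes, which is only a
    guess since the thread does not say; it uses the core Encode module,
    and the sample string is invented.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);

    # Seven characters: "na", i-diaeresis, "ve", a space and U+2603.
    my $chars = "na\x{EF}ve \x{2603}";

    # Encode explicitly to get the byte string that would go over the wire.
    my $bytes = encode('UTF-8', $chars);

    printf "characters:  %d\n", length($chars);   # 7
    printf "UTF-8 bytes: %d\n", length($bytes);   # 10
    ```

    If the other application counts bytes, the count has to be taken from
    the encoded string, and that same encoded string is what should be
    sent.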

    > Again, please pardon my ignorance


    Lack of knowledge (ignorance) is NOT the issue here, and is a
    perfectly normal and acceptable state of being, and (I think I can
    speak for many another here) is one of the reasons why we come to
    Usenet to share what we know. The *problem* is that you aren't
    showing us any working, so we don't know exactly what you're trying,
    we don't know exactly what results you are getting, we don't know what
    you expected the answer to be, and so we can't really offer any
    definite help.

    If you haven't tried it yet I'd suggest
    http://www.perldoc.com/perl5.8.0/pod/perluniintro.html
    and then
    http://www.perldoc.com/perl5.8.0/pod/perlunicode.html
    with particular reference to #Byte-and-Character-Semantics

    But most of all to
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html

    have fun
     
    Alan J. Flavell, Aug 4, 2003
    #6
  7. Alan J. Flavell wrote:
    > On Mon, Aug 4, Yuri Shtil continued in TOFU style:

    [...]
    > The *problem* is that you aren't
    > showing us any working, so we don't know exactly what you're trying,
    > we don't know exactly what results you are getting, we don't know what
    > you expected the answer to be, and so we can't really offer any
    > definite help.
    >
    > If you haven't tried it yet I'd suggest
    > http://www.perldoc.com/perl5.8.0/pod/perluniintro.html
    > and then
    > http://www.perldoc.com/perl5.8.0/pod/perlunicode.html
    > with particular reference to #Byte-and-Character-Semantics
    >
    > But most of all to
    > http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html


    I'd like to add http://www.catb.org/~esr/faqs/smart-questions.html to that
    list.

    jue
     
    Jürgen Exner, Aug 5, 2003
    #7