Imager::QRCode-ing octet sequences vs. zbarimg(1)

Discussion in 'Perl Misc' started by Ivan Shmakov, Mar 13, 2013.

  1. Ivan Shmakov

    Ivan Shmakov Guest

    [AIUI, discussion of Perl modules is more appropriate for [...].
    Yet, it appears to be abandoned, thus cross-posting to [...].
    Cross-posting to news:alt.barcodes, too, just in case.]

    I wonder if QR codes are suitable for encoding arbitrary octet
    sequences (AKA 8-bit data)? I've tried the following Perl code,
    but it appears that the resulting transformations aren't "8-bit
    clean." Somehow, I suspect an Imager::QRCode fault, although
    zbarimg(1) may be responsible. (Unfortunately, the Perl module
    itself doesn't provide a decoder.)

    Any idea what may be going on?


    (The leading 51522d436f64653a and the trailing 0a after
    "Decoded:" are "QR-Code:" and a newline, respectively. In the
    first example, the first three octets in the output, 621d4f,
    appear to match the input. Incidentally, the fourth octet has
    its most significant bit set.)

    $ perl \
    89br96tnpoogun68sfh1jkj1sb.perl # "use bytes;" commented out
    Blob: 621d4f87d3ae92b60932c96b7f81f3a916faff9b03ae54f97d8163987dc8733df1bd8f8b92fb5317657ee2a0a97eed1f12423cdbfa1a73b3166a39cb4b1c0f43
    Image: 123 by 123
    Decoded: 51522d436f64653a621d4fc287c393c2aec292c2b60932c3896b7fc281c3b3c2a916c3bac3bfc29b03c2ae54c3b97dc28163c2987dc388733dc3b1c2bdc28fc28bc292c3bb5317657ec3a2c2a0c2a97ec3ad1f12423cc39bc3ba1a73c2b3166a39c38b4b1c0f430a
    scanned 1 barcode symbols from 1 images in 0.02 seconds

    $ perl \
    89br96tnpoogun68sfh1jkj1sb.perl # "use bytes;" in place
    Blob: 8abdab3e25ae4e44fbc50d9aedcadfb34b1eb959f78ca306bff1182f00024d1ca9e5d7db8827fdd4ab8169a18130cc3de3b31da82150bff080fe57d591f909cf
    Image: 99 by 99
    Decoded: 51522d436f64653ac28ac2bdc2ab3e25c2ae4e44c3bbc3850dc29ac3adc38ac39fc2b34b1ec2b959c3b7c28cc2a306c2bfc3b1182f0a
    scanned 1 barcode symbols from 1 images in 0.02 seconds

    $ LC_ALL=C perl \
    89br96tnpoogun68sfh1jkj1sb.perl # "use bytes;" in place
    Blob: aba7c3b1e7721a22660308e7a7a7f6cfdb48b18fb2143d823021ece0bb2dde2ed0fe2d4b06fb56c4167e867a1f0ef4f495a46a6efb2ce76621fb58b5bd817605
    Image: 123 by 123
    Decoded: 51522d436f64653ac2abc2a7c383c2b1c3a7721a22660308c3a7c2a7c2a7c3b6c38fc39b48c2b1c28fc2b2143dc2823021c3acc3a0c2bb2dc39e2ec390c3be2d4b06c3bb56c384167ec2867a1f0ec3b4c3b4c295c2a46a6ec3bb2cc3a76621c3bb58c2b5c2bdc28176050a
    scanned 1 barcode symbols from 1 images in 0.03 seconds

    $ cat < 89br96tnpoogun68sfh1jkj1sb.perl
    use bytes;
    use common::sense;
    use English;

    require Imager::QRCode;
    require IPC::Open2;

    ## Read $len (default: 24) random octets from urandom(4)
    sub rand_blob (;$) {
        my ($len) = @_;
        $len //= 24;
        open (my $f, "<", "/dev/urandom")
            or die ($OS_ERROR);
        binmode ($f);
        my $s;
        die ($OS_ERROR)
            unless (read ($f, $s, $len) == $len);
        ## .
        return ($s);
    }

    my $blob
        = rand_blob (64);
    print ("Blob: ", unpack ("H*", $blob), "\n");

    my $qr
        = Imager::QRCode->new (qw (mode 8-bit casesensitive 1));
    my $img
        = $qr->plot ($blob);
    print ("Image: ", $img->getwidth (),
           " by ", $img->getheight (), "\n");

    ## Pipe the image through zbarimg(1) and read back the decoded line
    my ($in, $out);
    my $pid
        = IPC::Open2::open2 ($in, $out, qw (zbarimg -- -))
        or die ($OS_ERROR);
    binmode ($in);
    binmode ($out);

    $img->write ("fh" => $out, "type" => "pnm")
        or die ($img->errstr ());
    close ($out);
    my $dec
        = <$in>;
    print ("Decoded: ", unpack ("H*", $dec), "\n");
    Ivan Shmakov, Mar 13, 2013

  2. Ivan Shmakov

    [Dropping [...] and [...] from [...].]

    Indeed, I've read the documentation. It was my understanding
    that, in a nutshell, the "bytes" pragma makes Perl operate
    strictly on octet sequences for its strings, instead of allowing
    either strings of octets /or/ strings of Unicode characters.

    Frankly, I do not see any harm in using this pragma /provided/
    that the code doesn't switch it on and off at will.

    The question of which setting the loaded modules use remains
    open, but for the specific example I've given (which uses no
    text-processing modules) I'd expect the chances of running into
    issues to be quite low.

    ? I may be having a bit too much Lisp background, but I've
    always considered something_that_one_can_read to be a way better
    identifier for a global than, say, ~.

    Besides, there's a chance that the code I write will be read by
    someone not quite knowing Perl.

    Is there a practical reason to forgo the compile-time arguments'
    type checking they offer? For me, code that fails to compile is
    better than code that suddenly dies after running for hours.
    (Which is still better than the code that dies at the wrong
    place; or doesn't die, but silently gives a wrong result.)

    ACK, thanks. (Although, my guess is that even if urandom(4) is
    worse than random(4), Perl's rand is worse, randomness-wise,
    Ivan Shmakov, Mar 13, 2013

  3. [...]

    Prototypes are really only useful for 'compile-time argument checking'
    in code which circumvents the 'individual' ways Perl operators deal
    with their arguments by always putting them in brackets. Otherwise,
    you'll be adding similar 'individual ways to deal with arguments' to your
    subroutines as a (probably undesired) side effect and this might
    confuse people who expect all subroutines to behave identically when not
    bracketing the arguments.

    NB: I'm using prototypes for compile-time checking as well but usually
    remove them from any code I post here.
    Rainer Weikusat, Mar 13, 2013
  4. Ivan Shmakov

    Are (parentheses) meant here, specifically? (As per the
    Wiktionary entry, brackets may be {curly}, (round), [square], or
    even <angle />.)

    FWIW, I /always/ use parentheses for the function (subroutine)
    arguments in my Perl code. Thus, the point is not to use
    prototypes in the module's "public" interface?
    ACK, thanks! And now I see it's explained (although perhaps
    without the amount of warnings this issue seems to deserve) in

    Ivan Shmakov, Mar 13, 2013
  5. Issue I forgot about (mentioning it clearly): Prototypes also enforce
    an evaluation context for arguments, eg

    perl -e 'sub blah { print $_[0], "\n";} @bla = qw(3 2); blah(@bla)'

    This prints 3 because 3 is the first element of @bla but

    perl -e 'sub blah($) { print $_[0], "\n";} @bla = qw(3 2); blah(@bla)'

    this prints 2 because it evaluates @bla in scalar context which yields
    the number of elements in it.
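
    The two one-liners above can be combined into a single script. This
    is only a minimal sketch; the subroutine names first_arg and
    first_arg_scalar are made up for illustration.

```perl
use strict;
use warnings;

# No prototype: the array is flattened into the argument list.
sub first_arg { return $_[0] }

# ($) prototype: the single argument is evaluated in scalar context,
# so an array yields its element count.
sub first_arg_scalar ($) { return $_[0] }

my @bla = qw(3 2);

# Single quotes below, since "$)" would interpolate inside double quotes.
print 'no prototype:  ', first_arg(@bla), "\n";         # first element
print '($) prototype: ', first_arg_scalar(@bla), "\n";  # element count
```

    Note that the prototype only takes effect for calls compiled after
    the subroutine has been declared.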
    Rainer Weikusat, Mar 13, 2013
  6. My opinion on this is that I don't care about being able to emulate
    the IMHO too idiosyncratic way in which the syntax of different
    built-in Perl operators has been 'hand-optimized' and that I also
    don't care about the fact itself. Further, I can live with the minor
    nuisance that passing an array (or a hash) to a subroutine with a
    prototype enforcing scalar context does not result in passing a list
    of arguments to this subroutine, especially considering that this
    doesn't consistently work for built-in operators as well, eg, sprintf,
    when I get at least some kind of compile-time checking of subroutine
    calls in return. I also usually use a Makefile to run perl -cw
    -Mstrict on any changed Perl file of even the most remotely
    non-trivial 'Perl project' prior to attempting any 'runtime
    testing'. Considering that no such checking is possible for method
    calls, this may not really be a worthwhile tradeoff but rather a habit
    of mine I carried over from C. But I'm not convinced of this yet.

    OTOH, this is decidedly not the opinion of the people who removed the
    "if it looks like a function, it will work like a function" statement
    from the Perl documentation (or who agreed that this would be a good
    idea). Since these also generally don't believe that people do
    something like 'separate compile time checks' at all and can get
    pretty much arbitrarily nasty when being confronted with opinions they
    don't approve of, I - as I already wrote - usually just delete any
    prototypes when I include code 'from other sources' in postings to the
    Rainer Weikusat, Mar 13, 2013
  7. Some more reference to 'clean up', I guess ...
    Rainer Weikusat, Mar 14, 2013
  8. So, why didn't you just point out that I was wrong and quote the
    passage in question? I searched for this a while ago, after stumbling
    over all kinds of other 'strange modifications' of texts I remembered,
    ultimately triggered by the decision to remove the various 'OO
    tutorial texts' from the Perl distribution in favor of strongly
    suggesting that no one should want to learn about Perl OO, that
    people who use it are not exactly sane (something like 'You can find
    the reference documentation in ..., in case you have to maintain code
    written in this style') and that everybody should just download this
    or that (or maybe another) CPAN module and didn't find
    it. Consequently, I assumed that it had been removed because of
    'political incorrectness' as well.
    Rainer Weikusat, Mar 14, 2013
  9. Originally, you didn't. You posted the results of searching for some
    string in a number of perl documentation files. The result was a
    couple of half sentences coming from various files, among them being
    the perl56delta and perl561delta texts whose relevance for current
    versions of Perl is zero.
    This is, as I already posted here some time ago, what the perl 5.16.0
    changes document has been claiming since 2012, the corresponding text

    Removed Documentation
    Old OO Documentation

    The old OO tutorials, perltoot, perltooc, and perlboot, have
    been removed. The perlbot (bag of object tricks) document has
    been removed as well.

    The online version is available here:

    Rainer Weikusat, Mar 14, 2013
  10. Ivan Shmakov

    ACK. I've got three more questions, however:

    * how do I ensure that a value passed to my function is an octet
    sequence? (IOW, doesn't contain a code over \xFF);

    * how do I ensure that a non-ASCII octet is never considered to
    be a member of, say, the [[:alpha:]] set? as in the following
    code (although, perhaps, of questionable value):

    use common::sense;
    $_ = pack ("H*", "23456789abcdef");
    s/[[:alpha:]]/\xFF/g;
    print (unpack ("H*", $_), "\n");
    ## => 23ffff89abffff ; thus, it assumes \xCD, \xEF to be alphabetical;

    * is the "It breaks encapsulation" comment in bytes(3perl)
    really justified? if the function in question was designed to
    operate on octet sequences, and not character strings, then
    it's an error for the caller to supply it a character string
    in the first place.

    Given the amount of punctuation typically found in Perl code,
    I'd say that it'd be a very weak argument. Rather, I'd expect
    such a construct to completely fall out of one's vision,
    /thanks/ to it being composed entirely of punctuation.
    Whatever is the language, it gets divided as long as it lives.
    There're stylistic choices one has to stick to, and (unless I'm
    tweaking someone else's code) my choice is to use English.

    Why, we may just as well argue about the number of spaces used
    to indent nested code blocks! (Or should these be ASCII HT's?)
    Or about the use of DocBook for documentation vs. POD.

    I've seen "solutions" to this kind of "problem," such as those
    implemented by the designers of Python and Go. And the only
    thing that comes to my mind is the old saying (paraphrased):
    "If programmers are so smart, why aren't they walking in

    Impressive! (Although I've had to "use feature qw (say);".)

    ACK, thanks!
    Ivan Shmakov, Mar 14, 2013
  11. Ivan Shmakov

    ... Or it may not. It's definitely worth checking out.

    ACK, thanks! With qw (level L margin 0 size 2) being added to
    the parameters, the code now gives (also using $ zbarimg --raw):

    Blob: ffffffffffffffffffffffffffffffffff
    Image: 42 by 42
    Decoded: c3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bfc3bf0a
    scanned 1 barcode symbols from 1 images in 0.05 seconds

    Thus, unless there's some magic in the resulting QR code saying
    that it's an ISO-8859-1-encoded string (I'm not familiar with QR
    encoding, so can't tell if it's a sensible guess), zbarimg(1)
    is indeed to blame, and perhaps the underlying library, too.
    In this case, there'd indeed be some benefit from using the
    smallest-possible image. OTOH, I do not expect the problem
    of interoperability to arise anytime soon.
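
    The pattern in the "Decoded:" lines can be reproduced without any
    barcode involved: reading the octets as ISO-8859-1 text and writing
    that text out as UTF-8 gives the same hex dump (less the trailing
    0a). A minimal sketch using the core Encode module; the helper name
    latin1_as_utf8_hex is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: treat the octets as ISO-8859-1 text, then
# serialise that text as UTF-8 and return it as a hex dump.
sub latin1_as_utf8_hex {
    my ($octets) = @_;
    my $chars = decode("ISO-8859-1", $octets);
    return unpack("H*", encode("UTF-8", $chars));
}

my $blob = "\xff" x 17;    # same value as the all-ff "Blob:" line
print latin1_as_utf8_hex($blob), "\n";
```

    This says nothing about where in the zbarimg(1) pipeline the
    recoding happens, only that the output is consistent with it.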

    Ivan Shmakov, Mar 14, 2013
  12. What I was referring to was something like this,

    or this

    which could be pulled together more succinctly as 'Someone being
    convinced of $something which seems totally wrong could mean that a
    piece of information necessary for understanding $something correctly
    is missing'.
    Rainer Weikusat, Mar 14, 2013
  13. As asked I would prefer the answer

    croak "value is not an octet sequence" if $value =~ m/[^\0-\xFF]/;

    I.e., if I want to ensure that a value matches the specified interface,
    I want the program to complain loudly if the specs are violated. I do
    not usually want to mask the error by silently discarding data. There
    are situations where it is appropriate, but this should be a conscious
    decision, not the default.
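
    Wrapped into a small subroutine, the check might look like the
    following sketch. The name assert_octets is hypothetical; the regex
    and the message are the ones given above.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: complain loudly unless every character of the
# value fits into a single octet (code points 0 through 0xFF).
sub assert_octets {
    my ($value) = @_;
    croak "value is not an octet sequence" if $value =~ m/[^\0-\xFF]/;
    return $value;
}

assert_octets("\x00\x7f\xff");                     # passes silently
my $ok = eval { assert_octets("\x{263A}"); 1 };    # wide character: croaks
print $ok ? "accepted\n" : "rejected: $@";
```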

    Peter J. Holzer, Mar 17, 2013
  14. Ivan Shmakov

    That was a dumb question, indeed. (Although, as it was already
    pointed out, die () if ($x =~ m/[^\0-\xff]/); is actually what
    I've asked for.)
    ACK, thanks! And now that Debian Wheezy (which I happen to use)
    provides 5.14...
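
    For the [[:alpha:]] question, one option on Perl 5.14 or later is
    the /a regex modifier, which restricts the POSIX classes to ASCII.
    The sketch below contrasts it with /u (Unicode rules) on the same
    octets as in the earlier example; the helper name mask_alpha is made
    up for illustration.

```perl
use strict;
use warnings;
use 5.014;    # the /a and /u regex modifiers need Perl 5.14 or later

# Hypothetical helper: replace every [[:alpha:]] character with \xFF
# under the given rules ("a" = ASCII-only classes, anything else =
# Unicode rules) and return the result as a hex dump.
sub mask_alpha {
    my ($octets, $rules) = @_;
    if ($rules eq "a") { $octets =~ s/[[:alpha:]]/\xFF/ga }
    else               { $octets =~ s/[[:alpha:]]/\xFF/gu }
    return unpack("H*", $octets);
}

my $octets = pack("H*", "23456789abcdef");
print "Unicode rules: ", mask_alpha($octets, "u"), "\n";
print "ASCII rules:   ", mask_alpha($octets, "a"), "\n";
```

    Under /a only the two ASCII letters (\x45, \x67) are replaced, while
    \xCD and \xEF are left alone.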

    Now, that's a valid point.

    Ivan Shmakov, Mar 17, 2013
  15. Ivan Shmakov

    ... Indeed it does, which made me file Debian Bug#703234 [1].

    Now, however, given that the Wikipedia article mentions
    ISO-8859-1 as the default (?) encoding for 8-bit QR codes, the
    issues zbarimg(1) and Barcode::ZBar have may be considered

    Taking into account that different symbologies may (and do) use
    different character to code mappings, it may be sensible for
    libzbar to recode the barcode read into a UTF-8 string. Better
    still is that Perl supports UTF-8 as its native character string
    representation. What's wrong, however, is that the UTF-8 string
    returned by libzbar to Perl is not properly marked as such, thus
    resulting in the observed (and incorrect) behavior.

    (The obvious workaround is to Encode::decode_utf8 () the
    symbol's data returned by ->get_data ().)
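
    To get back the original octets (rather than a character string),
    the decoded text can additionally be re-encoded as ISO-8859-1. A
    minimal sketch with the core Encode module, operating on a plain
    byte string instead of a real Barcode::ZBar symbol; the helper name
    recover_octets is made up, and the ISO-8859-1 step assumes the
    recoding described above.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode the UTF-8 bytes into characters, then map
# each character back to a single ISO-8859-1 octet. Both steps croak on
# input that does not fit.
sub recover_octets {
    my ($utf8_bytes) = @_;
    my $chars = decode("UTF-8", $utf8_bytes, Encode::FB_CROAK());
    return encode("ISO-8859-1", $chars, Encode::FB_CROAK());
}

# The "Decoded:" payload of the all-ff example, without the trailing 0a.
my $decoded = pack("H*", "c3bf" x 17);
print unpack("H*", recover_octets($decoded)), "\n";
```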

    OTOH, zbarimg(1) should probably respect the current locale's
    encoding, instead of using UTF-8 unconditionally.


    Ivan Shmakov, Mar 17, 2013
  16. Ivan Shmakov

    (Thanks for the comments regarding ZBar, BTW. I'm yet to check
    its sources myself, but I've also discovered that it behaves
    strangely not only for the octets having the most significant
    bit set, but for the "plain old" \x0D = \r just as well.)

    BTW, there's a longstanding bug filed at the CPAN RT [2] (along
    with a patch.) However, it appears to be filed against
    libwww-perl, while it actually belongs to Net-HTTP.

    The question is: how do I reassign it?

    Yes. As long as an ideal world is considered, that is.

    There're a few things to note, however. The general problems
    with upstream may include:

    * there's effectively no upstream;

    * the code in the distribution may be extensively modified, or
    improperly built, or be alleged to be; the upstream then may
    discourage the users of "non-authorized" builds to report bugs
    directly to them; consider, e. g.:

    --cut: --
    *** DON'T USE the foo2zjs package from:

    Ubuntu, SUSE, Mandrake/Manrivia, Debian, RedHat, Fedora, Gentoo,
    Xandros, EEE PC, Linpus, MacOSX, or BSD!

    *** Download it here and follow the directions below.
    --cut: --

    (or the Joerg Schilling, albeit sufficiently different, case);

    * the issue may indeed be specific to the distribution's build;
    (naturally, building from the upstream sources for every bug
    I report just to check that it wasn't introduced by the
    packagers is hardly an option.)

    Personally, I tend to prefer either the Debian BTS, or the
    CPAN RT, for these make it possible to file bugs via email,
    /and/ are better compatible with Lynx (which happens to be my
    primary browser) than most of the other BTS currently in use.
    (I'm particularly fond of RT, although the version installed at
    CPAN has certain surprising issues when it comes to the
    compatibility with non-ECMAScript-enabled browsers.)

    Alas, even for the Perl modules, the CPAN RT is not always the
    preferred bug tracker. Consider, e. g.:

    --cut: --
    Please report issues via github at
    --cut: --

    Lastly, given the developer- and user-base of Debian (especially
    if the derivatives are included), I'd not call it "random."
    That being said, I tend to agree that when the D-M in charge
    fails to forward the request to the upstream, the reporter
    generally should try to do it him- or herself.

    (OTOH, even if D-M forwards the request, it may not have the
    desired effect. Consider, e. g., Debian Bug#691221 [3].)


    Ivan Shmakov, Mar 30, 2013
  17. Ivan Shmakov

    (Not necessarily so.)

    ... Which only makes it more surprising that it wasn't already
    dealt with. (Especially given the simplicity of the patch.)

    Depending on the goals, it may or may not make sense to ever get
    involved with the latest development version.

    For instance, I'm occasionally employed by a local university,
    to carry over certain computer-related courses (mostly
    short-term.) Should I discover an issue while preparing for
    them, I'm most likely to report it to the developers. However,
    distracting myself to write a patch -- which is unlikely to be
    incorporated into the distribution I'll use (and recommend to
    the students) by the time the courses will start -- may bring no
    good to the courses themselves. In this case, clearly
    documenting the issue and providing a work-around for the
    students to use may constitute a better solution.

    Similarly, while maintaining a few hosts under my
    responsibility, I'd try to stick to the distribution-provided
    software whenever possible, preferably the "stable" branch.
    Given that patches other than security fixes won't generally be
    accepted into Debian "stable," and that there're typically a
    couple of years between releases...

    Yet, indeed, I've made a few contributions to some Git HEADs.
    (Most recently libtasn1, IIRC.)
    The best thing about Debian is that it's a community-based
    project. (Which was the reason for me to choose it in the first
    place.) Basically, the only privileges that the Debian
    Developer status conveys are: to upload, and to vote.

    Essentially, anyone (careful enough not to disrupt the
    established order) is welcome to do this (or any other, for that
    matter) part of the job. Why, (taking a glance over the latest
    upstream stable releases) I've just forwarded Debian Bug#700617
    and #700618 to CPAN RT#84467 and #84468, respectively.

    (Hopefully, I did the thing right; this time.)
    Indeed, these are set correctly in the current META.json.
    My point is that GitHubs come and go, but the code remains.
    Certainly, I'd prefer a service that could be easily "cloned,"
    such as a Usenet newsgroup, a Git archive, or similar.

    The Perl-based App::SD was intended to be just such a system.
    Alas, it has seen virtually no development from mid-2011 to
    late-2012. The situation seems to be slowly improving, though.
    Ivan Shmakov, Apr 6, 2013
  18. Ivan Shmakov

    [The particular example given is relevant to a longstanding bug
    in Net::HTTP, thus cross-posting to [...]. Omitting the latter
    from Followup-To:, though.]

    As there seems to be no progress on this one, I've had to
    finally learn the necessary magic for CPAN to patch that
    (trivial) bug for me on each installation attempt.

    The first part of the incantation is altering $CPAN::Config, to
    which I've added "patches_dir":

    --cut: ~/.cpan/CPAN/ --
    my $cpan_home
    = ($ENV{"CPAN"}
    // ($ENV{"HOME"} . "/.cpan"));
    $CPAN::Config = {
    'patches_dir' => $cpan_home . q (/patches),
    'prefs_dir' => $cpan_home . q (/prefs),
    --cut: ~/.cpan/CPAN/ --

    The "prefs_dir" value was already there, and it's the directory
    to which I've added the following YAML data:

    ### 6yy1cmawx4oqu5kdx7qks77n3n.yml -*- YAML -*-
    ## Patch the IPv6 support into Net::HTTP
    ---
    match:
      module: "Net::HTTP"
    patches:
      - "cpmz4z7w7toa3mk6bi4rmp66n8.patch"
    ### 6yy1cmawx4oqu5kdx7qks77n3n.yml ends here

    The patch itself goes to the "patches_dir" as specified above
    (it's the same as the one given at [2], yet with an earlier
    $VERSION within the context section of the diff):

    --- lib/Net/ 2011-11-21 20:23:21.000000000 +0000
    +++ lib/Net/ 2012-01-08 18:13:21.000000000 +0000
    @@ -5,8 +5,13 @@
     
     $VERSION = "6.02";
     unless ($SOCKET_CLASS) {
    -    eval { require IO::Socket::INET } || require IO::Socket;
    -    $SOCKET_CLASS = "IO::Socket::INET";
    +    if (eval { require IO::Socket::INET6 }) {
    +        $SOCKET_CLASS = "IO::Socket::INET6";
    +    } else {
    +        eval { require IO::Socket::INET }
    +            || require IO::Socket;
    +        $SOCKET_CLASS = "IO::Socket::INET";
    +    }
     }
     require Net::HTTP::Methods;
     require Carp;

    Voil`a! The $ cpan Net::HTTP command that followed resulted in
    the fixed version of the module being installed. (Even though
    there was an expected "fuzz" warning from patch(1).)
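
    The selection logic the patch introduces can also be tried in
    isolation, e. g., to see which class a given installation would end
    up with. This is only a sketch mirroring the patched branch, not
    code from Net::HTTP itself; the name pick_socket_class is made up.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the patched logic: prefer
# IO::Socket::INET6 when it can be loaded, otherwise fall back to the
# core IO::Socket::INET.
sub pick_socket_class {
    return "IO::Socket::INET6" if eval { require IO::Socket::INET6; 1 };
    require IO::Socket::INET;
    return "IO::Socket::INET";
}

print "socket class: ", pick_socket_class(), "\n";
```

    IO::Socket::INET6 is not a core module, so the result depends on
    what is installed locally.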

    Ivan Shmakov, Jun 28, 2013