How to read from a file with both text and binary

Discussion in 'Perl Misc' started by Brian, Jun 26, 2006.

  1. Brian

    Brian Guest

    I am processing print job files. Each one starts with a plain-text Start
    banner, followed by the data to be printed. That data can be either plain
    text or PCL (binary), and I need to tell which because the two are
    processed differently. I tried the -T and -B file tests, but -T returns
    TRUE, I guess because of the plain-text header. The PCL data always starts
    with the string ^[E^[... so I tried searching for that, but to no avail,
    with the following code:

    while ( <SRCFILE> ) {
        if ( $_ =~ /\^\[E\^\[/ ) {    # Does this line contain ^[E^[
            $pclFlag = 1;
            print " PCL File detected\n";
        }
    }

    Here is an example of part of a print job file containing PCL data:

    * PRINT TIME: 13:30:38
    *
    * PRINT DATE: 23 JUN 2006
    *
    * PRINT NAME: TA104002
    *
    * SYSTEM: MVSA
    *
    *
    *
    **START*****START*****START*****START*****START*****START*****START*****START**
    ^[E^[&u300D^[*v1O^[*v0N^[*c00001D^[)s64W@^B)$7^A^^¦^U^BÿªÃðÄðÇãñðãñåñððó÷^[*c0E^[(s17W^D^N^A^A^Ax^[*c1E^[(s17W^D^N^A^A^Ax^[*c2E^[(s17W^D
    ^N^A^A^Ax^[*c3E^[(s17W^D^N^A^A^Ax^[*c4E^[(s17W^D^N^A^A^Ax^[*c5E^[(s17W^D^N^A^A^Ax^[*c6E^[(s17W^D^N^A^A^Ax^[*c7E^[(s17W^D^N^A^A^Ax^[*c8E^[
    (s17W^D^N^A^A^Ax^[*c9E^[(s17W^D^N^A^A^Ax^[*c10E^[
     
    Brian, Jun 26, 2006
    #1

  2. tuser

    tuser Guest

    Brian wrote:
    > I am processing print job files. They all contain a Start banner with plain
    > text then following the banner comes the data to be printed. This data can
    > either be plain text or PCL (binary). I need to be able to determine what
    > format the data is in because the processing is different. I tried using -T
    > and -B on the file, but Perl returns -T as TRUE, I guess because of the
    > plain text header. The PCL always starts with the string ^[E^[... so I
    > tried searching for that, but to no avail with the following code:
    >
    > while ( <SRCFILE> ) {
    > if ( $_ =~ /\^\[E\^\[/ ) { # Does this line contain ^[E^[
    > $pclFlag = 1;
    > print " PCL File detected\n";
    > }
    > }
    >
    > Here is an example of part of a print job file containing PCL data:
    >
    > * PRINT TIME: 13:30:38
    > *
    > * PRINT DATE: 23 JUN 2006
    > *
    > * PRINT NAME: TA104002
    > *
    > * SYSTEM: MVSA
    > *
    > *
    > *
    > **START*****START*****START*****START*****START*****START*****START*****START**
    > ^[E^[&u300D^[*v1O^[*v0N^[*c00001D^[)s64W@^B)$7^A^^¦^U^BÿªÃðÄðÇãñðãñåñððó÷^[*c0E^[(s17W^D^N^A^A^Ax^[*c1E^[(s17W^D^N^A^A^Ax^[*c2E^[(s17W^D
    > ^N^A^A^Ax^[*c3E^[(s17W^D^N^A^A^Ax^[*c4E^[(s17W^D^N^A^A^Ax^[*c5E^[(s17W^D^N^A^A^Ax^[*c6E^[(s17W^D^N^A^A^Ax^[*c7E^[(s17W^D^N^A^A^Ax^[*c8E^[
    > (s17W^D^N^A^A^Ax^[*c9E^[(s17W^D^N^A^A^Ax^[*c10E^[


    I took your print job file literally, i.e. I assumed the sequence "^[E^["
    is five separate characters: "^", then "[", then "E", then another "^",
    and a final "[".

    With that input, no surprise, the match "if ( $_ =~ /\^\[E\^\[/ )" works
    fine.
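
    To illustrate the point, here is a minimal sketch comparing the original
    pattern against two strings: the text as it appears on screen, and the
    bytes the file would hold if "^[" is really one control character. The
    variable names are made up for this example, and the second string
    assumes the control character is ESC (octal 033).

```perl
use strict;
use warnings;

# The pattern from the original post: it looks for a literal caret and bracket.
my $literal_re = qr/\^\[E\^\[/;

# The sequence as an editor or pager displays it: five printable characters.
my $as_displayed = '^[E^[&u300D';

# Assumption: what the file actually holds if "^[" is the single ESC byte.
my $as_stored = "\033E\033&u300D";

printf "displayed text: %s\n", $as_displayed =~ $literal_re ? 'match' : 'no match';
printf "stored bytes:   %s\n", $as_stored    =~ $literal_re ? 'match' : 'no match';
```

    Under that assumption the first line reports a match and the second does
    not, which would explain why the search fails on the real file.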

    However, I suspect that the two-character sequence "^[" is not meant
    literally but is caret notation for a single control character (most
    likely ESC, octal 033, written "\033" in Perl).

    You could try the following instead:
    if ( $_ =~ /\033E\033/ )
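
    Put into a complete routine, the suggestion might look like the sketch
    below. The sub name is_pcl_job and its path argument are hypothetical,
    and the code assumes "^[" really is the ESC byte. It reads the file in
    binary mode and stops at the first hit.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the file contains ESC "E" ESC, else 0.
sub is_pcl_job {
    my ($path) = @_;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # read raw bytes, with no newline or encoding translation

    my $found = 0;
    while ( my $line = <$fh> ) {
        if ( $line =~ /\033E\033/ ) {    # ESC (octal 033), "E", ESC
            $found = 1;
            last;                        # no need to read the rest of the job
        }
    }
    close $fh;

    return $found;
}
```

    A caller could then write: my $pclFlag = is_pcl_job($file); and branch on
    the result.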
     
    tuser, Jun 26, 2006
    #2

  3. A. Sinan Unur

    "tuser" wrote:

    > Brian wrote:
    >> I am processing print job files.


    ....

    >> Here is an example of part of a print job file containing PCL data:
    >>
    >> * PRINT TIME: 13:30:38
    >> *
    >> * PRINT DATE: 23 JUN 2006
    >> *
    >> * PRINT NAME: TA104002
    >> *
    >> * SYSTEM: MVSA
    >> *
    >> *
    >> *
    >> **START*****START*****START*****START*****START*****START*****START*****START**
    >> ^[E^[&u300D^[*v1O^[*v0N^[*c00001D^[)s64W@^B)$7^A^^¦^U^BÿªÃðÄ


    ....

    > However, I suspect that the 2-character sequence "^[" is not to be
    > taken literally, but really stands for one single control character
    > (possibly an octal "\033").
    >
    > You could try the following instead:
    > if ( $_ =~ /\033E\033/ )


    You are most likely correct that ^[ stands for the escape character. In
    that case, you can also use \e instead of the octal code.
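
    As a quick check of that equivalence, the sketch below prints the
    character code for each spelling and then tests a sample string with the
    \e form. It assumes an ASCII-based platform, and the sample string is
    made up from the start of the data shown in the original post.

```perl
use strict;
use warnings;

# Three spellings of the same character on an ASCII-based platform.
my @spellings = ( "\e", "\033", "\x1B" );
printf "%d\n", ord $_ for @spellings;    # each line prints 27

# Illustrative sample: the start of the PCL data with real ESC bytes.
my $sample = "\eE\e&u300D";
print "PCL prefix found\n" if $sample =~ /^\eE\e/;
```

    The leading ^ in the last pattern is the regex start-of-string anchor, not
    part of the caret notation, so it only matches when the data begins with
    the sequence.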

    Sinan
    --
    A. Sinan Unur <>
    (remove .invalid and reverse each component for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Jun 26, 2006
    #3
