my own perl "dos->unix"/"unix->dos"

Discussion in 'Perl Misc' started by Robert Wallace, Jan 21, 2004.

  1. i made a dos to unix, unix to dos program catered for my own purposes.

    it seems to work fine. small program, 26 lines

    anyone see any potential problems with it? do you have a better way to
    do it?
    my next step is to slip in some code to automatically "detect" whether
    it's a unix or dos file.

    #!/usr/bin/perl -w
    use strict;

    #setting based on filename argument.
    # if sym link is dos-unix or if sym link is unix-dos
    my ($from, $to);
    my $option=substr($0,rindex($0,'/')+1,length($0)-1); # $0 gives full
    path. this code gives just filename
    if ($option eq "dos-unix"){
    $from="\015\012";
    $to ="\012";
    } elsif ($option eq "unix-dos"){
    $from="[^\015]\012";
    $from="[^\015]\012";
    $to ="\015\012";
    } else {
    print "only sym links \"unix-dos\" and \"dos-unix\" are allowed\n";
    }

    foreach my $file (@ARGV){
    my @output;
    open READ, $file or die "Could not open file '$file' $!";
    while (<READ>){
    $_=~s/\Q$from/$to/;
    push @output, $_;
    }
    close (READ);

    open WRITE, ">$file" or die "Could not open output file '$file' $!";
    print WRITE @output;
    close (READ);
    }
     
    Robert Wallace, Jan 21, 2004
    #1

  2. Robert Wallace wrote:

    > i made a dos to unix, unix to dos program catered for my own purposes.

    ....
    > anyone see any potential problems with it? do you have a better way to
    > do it?


    > #!/usr/bin/perl -w
    > use strict;
    >
    > #setting based on filename argument.
    > # if sym link is dos-unix or if sym link is unix-dos
    > my ($from, $to);
    > my $option=substr($0,rindex($0,'/')+1,length($0)-1); # $0 gives full


    If you want your script to work on computers without symlinks, you might
    want to handle this using command line options. Also, the line below your
    comment is not related to what the comment says. If you keep this
    commenting style, your comments and code can get badly out of sync as the
    script grows.
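
    A minimal sketch of the command-line-option approach, using the core
    Getopt::Long module. The option names --to-unix and --to-dos and the
    sub name parse_direction are hypothetical; the original script only
    looks at the name it was invoked under.

    ```perl
    use strict;
    use warnings;
    use Getopt::Long qw(GetOptionsFromArray);

    # Hypothetical interface: exactly one of --to-unix / --to-dos, then files.
    # Returns the same mode strings the original script uses, plus the files.
    sub parse_direction {
        my @args = @_;
        my ($to_unix, $to_dos);
        GetOptionsFromArray(\@args, 'to-unix' => \$to_unix, 'to-dos' => \$to_dos)
            or die "usage: $0 --to-unix|--to-dos file...\n";
        die "choose exactly one of --to-unix or --to-dos\n"
            unless $to_unix xor $to_dos;
        return (($to_unix ? 'dos-unix' : 'unix-dos'), @args);
    }

    # Typical use at the top of the script:
    #   my ($option, @files) = parse_direction(@ARGV);
    ```

    With this, the rest of the script can keep testing $option exactly as
    before, and no symlinks are needed.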

    > if ($option eq "dos-unix"){
    > $from="\015\012";
    > $to ="\012";
    > } elsif ($option eq "unix-dos"){
    > $from="[^\015]\012";
    > $from="[^\015]\012";


    Why the repeated assignment to $from?

    > $to ="\015\012";
    > } else {
    > print "only sym links \"unix-dos\" and \"dos-unix\" are allowed\n";
    > }
    >
    > foreach my $file (@ARGV){
    > my @output;
    > open READ, $file or die "Could not open file '$file' $!";
    > while (<READ>){
    > $_=~s/\Q$from/$to/;
    > push @output, $_;
    > }
    > close (READ);


    I would actually open a temp file (there is a module to do that safely
    and cleanly), write my output to that file, and only delete the original
    and rename the temp after everything is successfully completed. That
    would keep the script's memory usage constant regardless of file size.


    > open WRITE, ">$file"
    > or die "Could not open output file '$file' $!";
    > print WRITE @output;
    > close (READ);
    > }


    Ahem, if an error happens while you are writing to or closing the file,
    you will have already clobbered the original. I don't see your script
    asking the user if it is OK to nuke his file out of existence if the
    conversion fails. Again, using a temporary file would handle this.
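
    A sketch of the temp-file approach, assuming the module meant here is
    File::Temp (the post does not name it). The sub name dos_to_unix_safely
    is made up for illustration, and only the dos-to-unix direction is shown.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);
    use File::Basename qw(dirname);

    # Convert CRLF to LF line by line, so memory use does not grow with the
    # file. The original is replaced only after the temp file is written
    # and closed successfully.
    sub dos_to_unix_safely {
        my ($file) = @_;
        open my $in, '<', $file or die "Could not open '$file': $!";
        binmode $in;

        # Same directory as the original, so rename() stays on one filesystem.
        my ($out, $tmpname) = tempfile(DIR => dirname($file), UNLINK => 0);
        binmode $out;

        while (my $line = <$in>) {
            $line =~ s/\015\012/\012/g;
            print {$out} $line or die "Write to '$tmpname' failed: $!";
        }
        close $out or die "Could not close '$tmpname': $!";
        close $in;

        rename $tmpname, $file
            or die "Could not rename '$tmpname' to '$file': $!";
        return;
    }
    ```

    Note that the replacement file gets the temp file's permissions, and
    that a failure part-way leaves the temp file behind; a fuller version
    would copy the mode bits and clean up on error.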

    First, do no harm.

    Sinan.
    --
    A. Sinan Unur
    (reverse each component for email address)
     
    A. Sinan Unur, Jan 21, 2004
    #2

  3. A. Sinan Unur wrote:
    > Robert Wallace wrote:


    >> $from="[^\015]\012";
    >> $from="[^\015]\012";

    >
    > Why the repeated assignment to $from?



    To make doubly sure that it is set to the correct value?


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jan 21, 2004
    #3
  4. Tad McClellan wrote:
    >
    > A. Sinan Unur wrote:
    > > Robert Wallace wrote:

    >
    > >> $from="[^\015]\012";
    > >> $from="[^\015]\012";

    > >
    > > Why the repeated assignment to $from?

    >
    > To make doubly sure that it is set to the correct value?
    >

    copy and paste from a terminal problem.

    copy and paste, scroll down, copy and paste.
    what's the big deal. sheesh...
     
    Robert Wallace, Jan 21, 2004
    #4
  5. On Wed, 21 Jan 2004, Robert Wallace wrote:

    > anyone see any potential problems with it? do you have a better way to
    > do it?


    > } elsif ($option eq "unix-dos"){
    > $from="[^\015]\012";
    > $to ="\015\012";
    > }


    > while (<READ>){
    > $_=~s/\Q$from/$to/;
    > push @output, $_;
    > }


    Are you sure you don't want lookbehinds here? I haven't tried your code,
    but it looks to me like you're going to end up replacing both \012 and
    whatever character came before it with \015\012. I think you want to
    search for \012 that was not preceded by \015, but don't actually match
    (and therefore replace) whatever character did precede it. That involves
    lookbehinds, I believe.
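
    A minimal sketch of the lookbehind version described above. The sub
    name unix_to_dos is illustrative, and the /g flag is an addition so
    that a string holding several line endings is handled too.

    ```perl
    use strict;
    use warnings;

    # Put a CR in front of every LF that does not already have one.
    # (?<!\015) is a zero-width negative lookbehind: it inspects the
    # preceding character without making it part of the match.
    sub unix_to_dos {
        my ($text) = @_;
        $text =~ s/(?<!\015)\012/\015\012/g;
        return $text;
    }
    ```

    Because nothing before the \012 is consumed, the character preceding
    it survives, and a bare \012 at the very start of a line is converted
    as well.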

    Paul Lalli
     
    Paul Lalli, Jan 21, 2004
    #5
  6. Robert Wallace wrote:
    > #!/usr/bin/perl -w
    > use strict;


    use warnings;

    is better than -w.

    >
    > #setting based on filename argument.
    > # if sym link is dos-unix or if sym link is unix-dos
    > my ($from, $to);
    > my $option=substr($0,rindex($0,'/')+1,length($0)-1); # $0 gives full


    Use File::Spec or File::Basename to do this sort of thing.
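
    For example, with File::Basename (a core module) the substr/rindex
    line could be sketched as follows; the sub name invoked_as is
    illustrative only.

    ```perl
    use strict;
    use warnings;
    use File::Basename qw(basename);

    # Return just the file name part of a path such as $0.
    sub invoked_as {
        my ($path) = @_;
        return basename($path);
    }

    # In the script: my $option = invoked_as($0);
    ```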

    > path. this code gives just filename


    I try and keep code wrapped to a reasonable line length; part of doing
    this means not putting long comments on the ends of lines.

    > if ($option eq "dos-unix"){
    > $from="\015\012";
    > $to ="\012";
    > } elsif ($option eq "unix-dos"){
    > $from="[^\015]\012";
    > $from="[^\015]\012";
    > $to ="\015\012";
    > } else {
    > print "only sym links \"unix-dos\" and \"dos-unix\" are allowed\n";


    Use qq{}.
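
    That is, qq{} behaves like a double-quoted string but lets the inner
    quotes go unescaped. A small sketch (the sub name usage_message is
    illustrative):

    ```perl
    use strict;
    use warnings;

    # Same text as the original print, without the backslashed quotes.
    sub usage_message {
        return qq{only sym links "unix-dos" and "dos-unix" are allowed\n};
    }

    # In the script: print usage_message();
    ```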

    > }
    >
    > foreach my $file (@ARGV){
    > my @output;
    > open READ, $file or die "Could not open file '$file' $!";


    Use lexical filehandles, and scope them rather than explicitly closing
    where you can.

    open my $READ, $file or die...;
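
    A slightly fuller sketch of the same idea. The three-argument form of
    open is an addition here, not part of the advice above; it keeps
    characters such as '>' in a file name from being read as a mode. The
    sub name read_lines is illustrative.

    ```perl
    use strict;
    use warnings;

    # Return the lines of $file. No explicit close is needed: $fh is
    # closed automatically when it goes out of scope at the end of the sub.
    sub read_lines {
        my ($file) = @_;
        open my $fh, '<', $file
            or die "Could not open file '$file': $!";
        my @lines = <$fh>;
        return @lines;
    }
    ```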

    > while (<READ>){
    > $_=~s/\Q$from/$to/;
    > push @output, $_;
    > }
    > close (READ);
    >
    > open WRITE, ">$file" or die "Could not open output file '$file' $!";
    > print WRITE @output;
    > close (READ);


    ITYM close (WRITE)?

    > }


    Ben

    --
    I've seen things you people wouldn't believe: attack ships on fire off the
    shoulder of Orion; I've watched C-beams glitter in the darkness near the
    Tannhauser Gate. All these moments will be lost, in time, like tears in rain.
    Time to die. |-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-|
     
    Ben Morrow, Jan 21, 2004
    #6
  7. Paul Lalli wrote:
    > On Wed, 21 Jan 2004, Robert Wallace wrote:
    >>
    >>} elsif ($option eq "unix-dos"){
    >> $from="[^\015]\012";
    >> $to ="\015\012";
    >>}
    >>
    >> while (<READ>){
    >> $_=~s/\Q$from/$to/;
    >> push @output, $_;
    >> }

    >
    > Are you sure you don't want lookbehinds here? I haven't tried your
    > code, but it looks to me like you're going to end up replacing both
    > \012 and whatever character came before it with \015\012. I think
    > you want to search for \012 that was not preceded by \015, but
    > don't actually match (and therefore replace) whatever character did
    > precede it. That involves lookbehinds, I believe.


    Or capturing:

    $from = "([^\015])\012";

    ....

    $_ =~ s/$from/$1$to/;

    (the \Q metacharacter quoting does not seem correct here, so I dropped it)
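
    A runnable sketch of the capturing variant, which also shows why \Q
    gets in the way: with \Q the brackets and caret are matched literally
    instead of forming a character class. The sub name
    unix_to_dos_capture is illustrative.

    ```perl
    use strict;
    use warnings;

    my $from = "([^\015])\012";   # one non-CR character, captured, then LF
    my $to   = "\015\012";

    # Re-insert the captured character ($1) in front of the new line ending.
    sub unix_to_dos_capture {
        my ($line) = @_;
        $line =~ s/$from/$1$to/;
        return $line;
    }

    # With \Q the pattern is taken as literal text and never matches
    # an ordinary line.
    my $quoted_matches = ("abc\012" =~ /\Q$from/) ? 1 : 0;
    ```

    One caveat: the capture needs a character in front of the \012, so an
    empty line (a lone \012) is left unconverted. The lookbehind mentioned
    in the quoted text does not have that limitation.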

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Jan 21, 2004
    #7
  8. On Wed, 21 Jan 2004 10:24:10 -0500, Robert Wallace wrote:

    >i made a dos to unix, unix to dos program catered for my own purposes.
    >
    >it seems to work fine. small program, 26 lines


    Have you ever heard of a (programmer's) virtue called laziness?
    Whenever I'm too lazy to remember which is which and I have no
    particular requirement other than converting to the native convention
    of the OS I'm using, I resort to

    perl -lpi -e '' <file(s)>

    Well that is under Linux, under Win I have to do

    perl -lpi.bak -e "" <file(s)>

    and

    perl -lpi.bak -e "BEGIN{@ARGV=map glob($_),@ARGV}" <file(s)>

    if I need to use wildcards. (yes, I know there's an AS executable that
    does shell wildcard expansion!)

    My program is 1 line only! It is not equivalent to yours though...
    ;-)
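
    For readers unfamiliar with the switches: per perlrun, -p wraps the
    (empty) program in a read/print loop, -l chomps each line and sets $\
    so that print appends a newline again, and -i edits the files in
    place. The sketch below only emulates the -l/-p part on a list of
    strings; the sub name emulate_lp is made up for illustration.

    ```perl
    use strict;
    use warnings;

    # Emulate what `perl -lp -e ''` does to each input line.
    sub emulate_lp {
        my (@lines) = @_;
        my $out = '';
        for my $line (@lines) {
            chomp $line;             # -l: strip the record separator ("\n")
            $out .= $line . "\n";    # -l: print appends $\ ("\n") again
        }
        return $out;
    }
    ```

    At this level only the trailing "\n" is touched, so a "\015" before
    it passes through unchanged. Any conversion to the native convention
    therefore has to come from the platform's I/O layer (text-mode CRLF
    translation on Windows), which may be one reason the one-liner is
    described as not equivalent to the original script.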


    HTH,
    Michele
    --
    # This prints: Just another Perl hacker,
    seek DATA,15,0 and print q... <DATA>;
    __END__
     
    Michele Dondi, Jan 22, 2004
    #8