One-liner removing duplicate lines

Damien Wyart · Oct 5, 2005

Hello,

Converting from Perl to Ruby, I am trying to find an equivalent to this
Perl one-liner, which removes duplicate lines from a file (without sorting
it first):

perl -ne'$s{$_}++||print' infile >outfile

I guess uniq method could be used, but I can't find how.

Many thanks in advance,

Ryan Leavengood · Oct 5, 2005

I tried creating a version that mimics the Perl one (because Ruby also
has the -n option), but in the end this seemed easier (and much more
readable):

ruby -e "puts IO.readlines(ARGV[0]).uniq" infile > outfile

So you are right about using uniq.

Ryan

Stefan Lang · Oct 5, 2005

ruby -e "puts IO.readlines(ARGV[0]).uniq" infile > outfile

or:
ruby -e 'puts ARGF.readlines.uniq' infile > outfile

Eric Mahurin · Oct 5, 2005

Here is a pretty close translation that does what you want:

ruby -ne 's||={};s[$_]||print;s[$_]=true'

Ryan Leavengood · Oct 5, 2005

Just for the sake of comparison, here is the more "Perl-like" version:

ruby -ne "s||={};s[$_]||print;s[$_]=1" infile > outfile

Maybe some Ruby golfers can shorten it some more, but since Ruby lacks
some of the more terse (and obfuscating) features of Perl, it may not
be possible.

Ryan

James Edward Gray II · Oct 5, 2005

ruby -e "puts IO.readlines(ARGV[0]).uniq" infile > outfile

That slurps the file though, of course, so mind your memory
requirements.

Here's a more direct translation (untested):

ruby -ne 'BEGIN { $lines = Hash.new(0) }; print if ($lines[$_] += 1) == 1' infile > outfile

James Edward Gray II
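
The memory point above can be made concrete with a short script. This is only a sketch, not anything posted in the thread: the method name dedupe_stream is made up for illustration, and it uses Set from the standard library in place of a hash. It reads one line at a time, so only the distinct lines are held in memory rather than the whole file.

```ruby
require "set"

# Hypothetical helper (name chosen for this example only).
# Copies lines from `input` to `output`, skipping any line already seen.
def dedupe_stream(input, output)
  seen = Set.new
  input.each_line do |line|
    # Set#add? returns nil when the element is already in the set,
    # so the line is printed only the first time it appears.
    output.print(line) if seen.add?(line)
  end
end
```

Saved in a file (say, a hypothetical dedupe.rb) with a final line `dedupe_stream(ARGF, $stdout)`, it could be run as `ruby dedupe.rb infile > outfile`. Memory use still grows with the number of distinct lines.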

Simon Kröger · Oct 5, 2005

Damien said:
I guess uniq method could be used, but I can't find how.

True,

File.open('outfile', 'w') { |out| out << IO.readlines('infile').uniq.join }

cheers

Simon

Stefan Lang · Oct 5, 2005

ruby -ne 'a||={};a[$_]||=(print;1)' infile > outfile

Vincent Foley · Oct 5, 2005

How about the uniq(1) program? uniq infile > outfile

Simon Kröger · Oct 5, 2005

ruby -ne 'a||={};a[$_]||=print|1' infile > outfile

cheers

Simon

Simon Kröger · Oct 5, 2005

ruby -ne 'a||={};a[$_]||=!print' infile > outfile

Louis J Scoras · Oct 5, 2005

How about the uniq(1) program? uniq infile > outfile

He doesn't want to sort the file first =)

Gyoung-Yoon Noh · Oct 5, 2005

ruby -ne 'BEGIN{$s={}};$s[$_]=nil;END{puts $s.keys}' infile > outfile

Louis J Scoras · Oct 5, 2005

Actually, to do this with straight-up Unix tools, you'd need something like
this (it probably doesn't work perfectly in all cases; play with the sed part
at the end):

$ cat -n input | sort -k2 | uniq -f1 | sort | sed -e 's/^ *[0-9]*\t//' > output
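
The same number-the-lines, sort, dedupe, re-sort idea can be sketched in Ruby. This is an illustration only: the method name dedupe_by_decorating is made up, and the comments mapping each step to a stage of the shell pipeline are approximate rather than exact equivalents.

```ruby
# Hypothetical helper (name chosen for this example only).
# Tags each line with its position, sorts by content, keeps the first
# entry of each run of equal lines, then restores the original order.
def dedupe_by_decorating(lines)
  # Roughly `cat -n`: pair every line with its index.
  numbered = lines.each_with_index.map { |line, index| [index, line] }

  # Roughly `sort -k2`: order by content; the index breaks ties so the
  # earliest occurrence comes first within each group.
  by_content = numbered.sort_by { |index, line| [line, index] }

  # Roughly `uniq -f1`: keep one entry per run of identical content.
  firsts = by_content.chunk_while { |a, b| a[1] == b[1] }.map(&:first)

  # Roughly `sort | sed`: back to input order, then strip the index.
  firsts.sort_by { |index, _line| index }.map { |_index, line| line }
end
```

For an array held in memory, Array#uniq does the same job directly; the decorated version is mostly useful for seeing what the pipeline is doing.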

William James · Oct 6, 2005

awk '!a[$0]++' infile >outfile

Jeremy Kemper · Oct 6, 2005

awk '!a[$0]++' infile >outfile

My head a splode. Old school.

Regards,
jeremy

Devin Mullins · Oct 6, 2005

Jeremy said:
My head a splode. Old school.

Seriously. Awe.

Here's a different way. Not a golf-winner, but maybe more Rubyish?
ruby -e"o=nil; ARGF.each {|l| puts l or o=l unless o==l}" infile > outfile

Devin
Or disgust. Not sure.
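
One thing worth checking about the previous-line comparison above: it catches a duplicate only when it immediately follows its twin. The sketch below contrasts that with a seen-hash approach. Both method names are made up for illustration and are not from the thread.

```ruby
# Hypothetical helper: drops a line only when it equals the line
# immediately before it (the previous-line comparison).
def drop_adjacent_duplicates(lines)
  previous = nil
  lines.each_with_object([]) do |line, kept|
    kept << line unless line == previous
    previous = line
  end
end

# Hypothetical helper: drops a line when it has appeared anywhere
# earlier in the input (the seen-hash approach).
def drop_all_duplicates(lines)
  seen = {}
  lines.select do |line|
    first_time = !seen.key?(line)
    seen[line] = true
    first_time
  end
end

p drop_adjacent_duplicates(%w[a a b a])  # => ["a", "b", "a"]
p drop_all_duplicates(%w[a a b a])       # => ["a", "b"]
```

The two agree whenever equal lines are always next to each other, for example after sorting.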

Damien Wyart · Oct 6, 2005

William James said:
awk '!a[$0]++' infile >outfile

This one is very nice, thanks! I had an Awk version which was slightly
longer.

Damien Wyart · Oct 6, 2005

Vincent Foley said:
How about the uniq(1) program? uniq infile > outfile

uniq(1) on its own is not enough: it only drops adjacent duplicates, so you
have to run sort(1) first, and then the initial order of the lines is lost.
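
A small Ruby illustration of the ordering point, with sample data invented for the example: sorting before removing duplicates yields sorted output, whereas Array#uniq keeps the first occurrence of each line in its original position.

```ruby
# Sample data invented for this example.
lines = ["pear\n", "apple\n", "pear\n", "fig\n", "apple\n"]

# Roughly what sorting first and then removing duplicates would give.
sorted_then_unique = lines.sort.uniq

# Array#uniq keeps first occurrences in their original order.
order_preserving = lines.uniq

p sorted_then_unique  # => ["apple\n", "fig\n", "pear\n"]
p order_preserving    # => ["pear\n", "apple\n", "fig\n"]
```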

Damien Wyart · Oct 6, 2005

Many thanks to everyone who responded; the answers are very interesting
and enlightening!
