# uniq without sort <-------------- GURU NEEDED

Discussion in 'C Programming' started by gnuist006@gmail.com, Jan 25, 2008.

1. ### Guest

This is a tough problem, and needs a guru.

I know it is very easy to find uniq or non-uniq lines if you scramble
all of them and sort them. It's trivial:

echo -e "a\nc\nd\nb\nc\nd" | sort | uniq

$ echo -e "a\nc\nd\nb\nc\nd"
a
c
d
b
c
d

$ echo -e "a\nc\nd\nb\nc\nd" | sort | uniq
a
b
c
d

So it is TRIVIAL with sort.

I want uniq without sorting the initial order.

The algorithm is this. For every line, check whether an identical line
appears anywhere above it. If so, ignore it. If not, output it. I am
sure I could spend some time writing this in C. But what is the
solution using the shell? This way I get output that preserves the
order of first occurrence. It is needed in many problems.
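
The "look above" algorithm described here can be sketched almost literally in the shell. This is only a minimal illustration, not a tuned solution: the function name `uniq_first_seen` is made up for this example, it assumes `mktemp` and a `grep` that supports `-F`, `-x` and `-q` are available, and it rescans every line seen so far for each new line, so it is quadratic in the number of distinct lines.

```shell
# Hypothetical helper (name invented for this sketch): print each line
# of standard input only the first time it appears, keeping input order.
uniq_first_seen() {
    # Temporary file holding every distinct line seen so far.
    seen=$(mktemp) || return 1
    while IFS= read -r line; do
        # -F: fixed string, -x: match the whole line, -q: no output.
        if ! grep -Fxq -e "$line" "$seen"; then
            printf '%s\n' "$line"            # first occurrence: output it
            printf '%s\n' "$line" >> "$seen" # and remember it
        fi
    done
    rm -f "$seen"
}

printf 'a\nc\nd\nb\nc\nd\n' | uniq_first_seen
```

For the sample input used above this prints a, c, d, b, one per line, which is the order of first occurrence.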

Thanks to the star who can help
gnuist

, Jan 25, 2008

2. ### Michele DondiGuest

On Thu, 24 Jan 2008 18:45:24 -0800 (PST), wrote:

>I want uniq without sorting the initial order.
>
>The algorithm is this. For every line, check whether an identical line
>appears anywhere above it. If so, ignore it. If not, output it. I am
>sure I could spend some time writing this in C. But what is the
>solution using the shell? This way I get output that preserves the
>order of first occurrence. It is needed in many problems.

In shell I don't know. In Perl it's well known to be as trivial as

perl -ne 'print unless $saw{$_}++' file

(And it's not even the most golfed down solution!)
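
The Perl one-liner works because the post-increment yields 0 (false) the first time a line is seen and a positive count afterwards. For readers who want to stay with standard shell tools, the same idea can be sketched with awk. The wrapper name `uniq_keep_order` is made up for this example, and the sketch assumes the distinct lines fit in memory.

```shell
# Hypothetical wrapper (name invented for this sketch): print each line
# the first time it is seen, preserving input order.
uniq_keep_order() {
    # seen[$0]++ evaluates to 0 on the first occurrence of a line, so
    # !seen[$0]++ is true exactly once per distinct line, and awk's
    # default action (print the line) runs only then.
    awk '!seen[$0]++' "$@"
}

printf 'a\nc\nd\nb\nc\nd\n' | uniq_keep_order
```

With the sample input from the question this prints a, c, d, b, one per line. File names can be passed as arguments instead of piping, just as with the Perl version.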

Michele

Michele Dondi, Jan 29, 2008