How to sort a bitset vector quickly?

Discussion in 'C++' started by halfsearch2@gmail.com, May 27, 2005.

  1. Guest

    Hi all,
    I want to sort a bitset vector ( bit 1 > bit 0 ), while it seems bitset
    doesn't have < or > operation. I tried to use a loop to compare each
    bit one by one but it's unbearably slow. Any suggestion?
    , May 27, 2005
    #1
    1. Advertising

  2. <> wrote in message
    news:...
    > Hi all,
    > I want to sort a bitset vector ( bit 1 > bit 0 ), while it seems
    > bitset
    > doesn't have < or > operation. I tried to use a loop to compare each
    > bit one by one but it's unbearably slow. Any suggestion?



    You could count the number of zeros and construct a new bitset, with
    the number of zeros you counted before and then the rest of ones (exact
    number is determined by the size of the original bitset minus the number
    of zeros).

    hth
    --
    jb

    (reply address in rot13, unscramble first)
    Jakob Bieling, May 27, 2005
    #2
    1. Advertising

  3. * :
    > I want to sort a bitset vector ( bit 1 > bit 0 ), while it seems bitset
    > doesn't have < or > operation. I tried to use a loop to compare each
    > bit one by one but it's unbearably slow. Any suggestion?


    Assuming this is a std::vector< std::bitset<n> >:

    std::sort( v.begin(), v.end(), bitsetLess<n> );

    where 'bitsetLess' is e.g.

    template< unsigned n >
    bool bitsetLess( std::bitset<n> const& left, std::bitset<n> const& right )
    {
    return left.to_ulong() < right.to_ulong();
    }

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    Alf P. Steinbach, May 27, 2005
    #3
  4. Wiseguy Guest

    scribbled on the stall wall:
    > Hi all,
    > I want to sort a bitset vector ( bit 1 > bit 0 ), while it seems bitset
    > doesn't have < or > operation. I tried to use a loop to compare each
    > bit one by one but it's unbearably slow. Any suggestion?
    >


    WHY?!

    the whole purpose of bitsets is that they are generally position
    dependent and lose their meaning when reordered.

    rethink your reason for doing this.


    --
    dual 2.8Ghz Xeon; 2GB RAM; 500GB ATA-133; nVidia powered
    Linux 2.6.10; glibc-2.3.5; vendor neutral home-brewed installation

    ----anything after this line is ANNOYING CRAP that the newsserver adds ----
    ---directly contact newsfeeds and ISPs that piggy back them to complain----


    ----== Posted via Newsfeeds.Com - Unlimited-Uncensored-Secure Usenet News==----
    http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups
    ----= East and West-Coast Server Farms - Total Privacy via Encryption =----
    Wiseguy, May 27, 2005
    #4
  5. Jeff Flinn Guest

    "Wiseguy" <> wrote in message
    news:42976e94$...
    > scribbled on the stall wall:
    >> Hi all,
    >> I want to sort a bitset vector ( bit 1 > bit 0 ), while it seems bitset
    >> doesn't have < or > operation. I tried to use a loop to compare each
    >> bit one by one but it's unbearably slow. Any suggestion?
    >>

    >
    > WHY?!
    >
    > the whole purpose of bitsets is that they are generally position
    > dependent and lose their meaning when reordered.
    >
    > rethink your reason for doing this.


    Perhaps the OP means std::vector< std::bitset<N> >?

    Jeff Flinn
    Jeff Flinn, May 27, 2005
    #5
  6. Guest

    Hi Alf,
    Your code reflects what I really want to do, thanks. :)
    The problem is each bitset contains more than 32*10 bits
    , May 27, 2005
    #6
  7. * :
    >
    > Your code reflects what I really want to do, thanks. :)
    > The problem is each bitset contains more than 32*10 bits


    So?

    In the comparision function just loop over the bits till the first
    difference.

    Alternatively use to_string(), but although that might in practice
    be efficient enough it _feels_ so inefficient that no self-respecting
    C++ programmer would use it... ;-)


    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    Alf P. Steinbach, May 27, 2005
    #7
  8. Guest

    Alf P. Steinbach wrote:
    > In the comparision function just loop over the bits till the first
    > difference.
    >

    This is exactly what I did, it's too slow. The vector contains more
    than 10 million bitsets, each bitset contains more than 32*10 bits.
    , May 27, 2005
    #8
  9. [About sorting a vector of bitsets]

    * :
    > * Alf P. Steinbach wrote:
    > > In the comparision function just loop over the bits till the first
    > > difference.
    > >

    > This is exactly what I did, it's too slow. The vector contains more
    > than 10 million bitsets, each bitset contains more than 32*10 bits.


    Ah. It's time for the Big Revelation (TM). Tweaking the comparision
    function will at best buy you a small constant improvement, whereas
    optimizing the _algorithm_ and/or _data structure_ can yield really
    impressive results.

    I think what you've related so far indicates you have a good handle
    on the C++ side of things, but that's probably not where the problem
    is. There are a number of techniques that can be applied at the C++
    level but I fear that if I mention them they'll spur you to blindly
    move ahead on a path that doesn't lead to any real solution. So:

    What's it all for, i.e. what's the original problem that this huge
    vector of bitsets, and sorting of it, is intended to solve?


    Cross-posted to [comp.programming], where this probably belongs! ;-)

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    Alf P. Steinbach, May 27, 2005
    #9
  10. rossum Guest

    On 27 May 2005 12:59:29 -0700, wrote:

    >
    >
    >Alf P. Steinbach wrote:
    >> In the comparision function just loop over the bits till the first
    >> difference.
    >>

    >This is exactly what I did, it's too slow. The vector contains more
    >than 10 million bitsets, each bitset contains more than 32*10 bits.


    Some thoughts:

    1 Can you fit all these bitsets into real memory at one time? If not
    then you may want to look at doing an in-memory sort of a chunk of the
    bitsets that will fit into memory and then merging the chunks to
    produce the final result.

    2 How about a radix sort? At each stage partition the current
    in-memory chunk by looking at a single bit only. The do a
    divide-and-conquer on each half of the result, again looking at only
    one bit.

    After first pass looking at bit[0]:

    0xxxxxxx
    ..
    ..
    0xxxxxxx
    1xxxxxxx
    ..
    ..
    1xxxxxxx

    After second pass, looking at bit[1]:

    00xxxxxx
    ..
    ..
    00xxxxxx
    01xxxxxx
    ..
    ..
    01xxxxxx
    10xxxxxx
    ..
    ..
    10xxxxxx
    11xxxxxx
    ..
    ..
    11xxxxxx

    and so on.


    rossum



    The ultimate truth is that there is no ultimate truth
    rossum, May 27, 2005
    #10
  11. <> wrote in message
    news:...
    > Alf P. Steinbach wrote:
    >> In the comparision function just loop over the bits till the first
    >> difference.
    >>

    > This is exactly what I did, it's too slow. The vector contains more
    > than 10 million bitsets, each bitset contains more than 32*10 bits.



    Unfortunately, std::bitset is not intended as a 'big int' library,
    and does not support (efficient) numeric computations.
    If I were you I would eventually write a custom bitset class,
    and optimize the representation (e.g. endianness, separate
    storage of exponent, etc) based on the typical distribution
    of values you use.
    That is, unless one of the existing 'bigint' packages can do
    [but those often use heap-allocated storage, which may cause
    performance problems in your case].

    This said, have you tried a simple sorting based on to_ulong,
    just to verify that performance will be at least satisfying
    if you find an efficient comparison method?


    I hope this helps,
    Ivan
    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
    Ivan Vecerina, May 28, 2005
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    8
    Views:
    1,895
    Csaba
    Feb 18, 2006
  2. Thomas Kowalski
    Replies:
    4
    Views:
    385
    Thomas Kowalski
    Sep 22, 2006
  3. KevinSimonson
    Replies:
    31
    Views:
    1,381
  4. Navin
    Replies:
    1
    Views:
    674
    Ken Schaefer
    Sep 9, 2003
  5. Ninds
    Replies:
    14
    Views:
    711
    W Karas
    Dec 3, 2012
Loading...

Share This Page