rope vs. string

Discussion in 'C++' started by James Aguilar, Mar 27, 2005.

  1. Hey all,

    I have a program that runs the Manber-Myers suffix array search algorithm
    with longest common prefix values. I need an efficient way to represent a
    sequential array of characters to be passed into the algorithm. Here are
    the features I need for the representation:

    * Must be randomly accessible in constant time
    * Must be capable of efficiently storing strings of more than 10 million
    characters

    Here are some plusses:

    * Dereferencing an iterator should be fast, not just constant time, but
    quick constant time, since it will be happening often (at least (size of
    pattern) + log(size of text) times)
    * Incrementing an iterator should be similarly fast

    It would also be useful if this representation provided an efficient method
    of extracting all the characters for the problem from the standard input
    (again, no less than 5 million characters, randomly distributed among A, C,
    G, and T).

    Now, here's the question: which is better for this problem, a string or a
    rope. If a rope would be best, can you offer any guidance as to ways to
    make the use of the rope as fast and painless as possible (if that is the
    preferable representation).

    - JFA1
     
    James Aguilar, Mar 27, 2005
    #1
    1. Advertising

  2. "James Aguilar" <> wrote in message
    news:d2532a$g4q$...
    Hi James,
    > I have a program that runs the Manber-Myers suffix array search algorithm
    > with longest common prefix values. I need an efficient way to represent a
    > sequential array of characters to be passed into the algorithm. Here are
    > the features I need for the representation:
    >
    > * Must be randomly accessible in constant time
    > * Must be capable of efficiently storing strings of more than 10 million
    > characters
    >
    > Here are some plusses:
    >
    > * Dereferencing an iterator should be fast, not just constant time, but
    > quick constant time, since it will be happening often (at least (size of
    > pattern) + log(size of text) times)
    > * Incrementing an iterator should be similarly fast
    >
    > It would also be useful if this representation provided an efficient
    > method of extracting all the characters for the problem from the standard
    > input (again, no less than 5 million characters, randomly distributed
    > among A, C, G, and T).
    >
    > Now, here's the question: which is better for this problem, a string or a
    > rope. If a rope would be best, can you offer any guidance as to ways to
    > make the use of the rope as fast and painless as possible (if that is the
    > preferable representation).


    The implementation of rope::iterator is somewhat more complex than for
    string (where it is basically a pointer), and will introduce an overhead.

    The only true advantage of a rope (i.e. the non-standard rope class from
    the SGI STL) is that it supports efficient *editing* of a small portion
    of the string (not just replacing, but also inserting/deleting/splicing
    subsequences within the original string). Unless this is relevant to
    what you need to do, there is no point in using rope instead of string.

    The caveat with std::string, depending on how the platform you use
    implements the standard library, is to try to avoid unnecessary copies
    of the data structure (be careful to pass the strings as references
    whenever possible, to avoid object copies in case your implementation
    does not use reference-counting).

    In case top performance is critical, passing the input as a file
    and mapping it into memory (with platform-specific mmap/MapViewOfFile),
    will most likely be the fastest approach at run-time.

    Note also that your algorithm could be templated on the iterator type,
    allowing it to be used on any type of source data range.


    I hope this helps,
    Ivan
    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
    Brainbench MVP for C++ <> http://www.brainbench.com
     
    Ivan Vecerina, Mar 27, 2005
    #2
    1. Advertising

  3. Oh! Thanks for the help. I didn't realize the rope was not in -the- STL.
    I got the sense from the site that I was reading that it was. If it's not
    standard, I'm not going to mess with it, as it is only for a homework
    assignment. Thank you so much for your advice nonetheless.

    - JFA1
     
    James Aguilar, Mar 27, 2005
    #3
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?ISO-8859-1?Q?S=F6ren?=

    sgi rope experiences?

    =?ISO-8859-1?Q?S=F6ren?=, Jul 29, 2003, in forum: C++
    Replies:
    0
    Views:
    613
    =?ISO-8859-1?Q?S=F6ren?=
    Jul 29, 2003
  2. castironpi

    rope class (heavyweight string)

    castironpi, Sep 2, 2008, in forum: Python
    Replies:
    0
    Views:
    330
    castironpi
    Sep 2, 2008
  3. Rustom Mody

    rope python-emacs problem

    Rustom Mody, Oct 15, 2008, in forum: Python
    Replies:
    2
    Views:
    339
    rustom
    Oct 16, 2008
  4. Giles Bowkett

    "more than enough rope"

    Giles Bowkett, Dec 5, 2006, in forum: Ruby
    Replies:
    7
    Views:
    133
    M. Edward (Ed) Borasky
    Dec 6, 2006
  5. John Miller

    Rope Data Structures

    John Miller, Aug 22, 2007, in forum: Ruby
    Replies:
    8
    Views:
    190
    Daniel Berger
    Oct 20, 2007
Loading...

Share This Page