c-style string vs std::string

Actually implementing a good hash table is not trivial, for two reasons.

Firstly, there is the decision of which kind of hash table to use. Unlike
e.g. red-black trees, there are many different possible hash table
implementations, and no single one is optimal in all cases. Which type of
hash table is best may actually depend on what kind of data is inserted
into it.

fair enough, one has to decide on the type of hash table, ...
but they are not all that complicated code-wise, and are typically not
all that large (in terms of code). a factor worth considering, though, is
the table size (too small will not be very effective, or will fail,
depending on the variety, and too large may waste memory or actually
perform worse than a smaller table).

however, "rules of thumb" are generally reasonably effective.

Secondly, a naive hashing function may have hidden surprises. It might
seem to work like a charm in all test cases... but then there may be some
pathological input (not even necessarily chosen deliberately to break the
hash table, but possibly natural input from somewhere) that triggers a
weakness in the hashing function. This usually presents itself as the vast
majority of the elements ending up in only a fraction of the hash table
positions; in other words, many elements sharing the same hash value. Such
a pathological situation might not be discovered in testing, but only when
a client somewhere uses it with some unexpected input.

I have had generally good success with polynomial hashes (they generally
show better collision behavior than the more traditional shift+xor hashes).

there are also some nifty (almost magical) properties of prime numbers
when used in hash functions and similar (note: Mersenne and non-Mersenne
primes exhibit different properties).


found this:
http://en.wikipedia.org/wiki/Universal_hashing

one of the algos mentioned is similar to one I often use.

however, I developed my variant on my own, initially mostly by fiddling
and similar...

The major inefficiency in std::set and std::map is the memory allocation
that happens with each element. However, this inefficiency can be greatly
alleviated by using a fast memory allocator (which all STL containers
support via their allocator parameter). Such an allocator can make those
containers faster, even by an order of magnitude or so.

Of course using a ready-made implementation of a data container (be it
std::set, std::map or their unordered variants in the new standard) also
reduces the amount of bug-hunting in the program, which is always a great
asset.

possibly, but I have generally not had too many problems with bugs WRT
hashing; OTOH, this is also one of those things one may end up
internalizing through frequent use.