C Programming
Container library (continued)
[QUOTE="jacob navia, post: 3998627"]
Any container will add *some* complexity and overhead to the data it stores. A container object in C will never be as fast as managing each individual datum by hand, i.e. giving each datum an address and managing it individually, as is done in assembly language.

A container allows for scalability precisely by simplifying data management. So, we pay an overhead. How much of an overhead?

Let's take lcc-win's strings. They encapsulate a String object, reduced to the bare essentials:

    typedef struct _StringA {
        size_t count;     // Elements
        char *content;
        size_t capacity;  // Allocated space
    } String;

In a 64 bit system, an 8 character string will take 24+8 --> 32 bytes, i.e. an overhead of 300%. In a 32 bit system we would have 12+8 --> 20 bytes, an overhead of 150%.

Let's compare this to Java's string class. There, a character string of 8 characters will take 64 bytes, and there is nothing that the programmer can do about it.

In C++ the size of a string of 8 characters appears to be 40 bytes. I used the following program:

    D:\temp>type str.cpp
    #include <string>
    #include <iostream>
    using namespace std;
    int main(void)
    {
        string m("12345678");
        cout << sizeof(m) << endl;
    }

This outputs

    40

In a C container library, it is possible to design a "small string type" (maybe in Java too, I do not know). That type can be restricted to strings shorter than 65535 characters (probably 99.999999% of the strings used in a program). If we do that, we can reduce the overhead from 24 to 16 bytes: two unsigned shorts (4 bytes), 4 bytes of padding, and the 8 byte pointer. The alignment requirements in 64 bits still haunt us.

If we get rid of the pointer, however, and store the characters in a structure with a variable "tail" as introduced by C99, we can curtail the alignment requirements and we have an overhead of just 4 bytes: two unsigned shorts that specify the length and the capacity, followed by the actual data.
In C (99) we would have

    typedef struct _smallString {
        unsigned short count;     // Elements
        unsigned short capacity;  // Allocated space
        char contents[];
    } smString;

In this case the overhead for the same 8 character string is 4 bytes, i.e. only 50%.

By making the workings of the container apparent, C has the advantage of making the programmer aware of what he/she is paying for each container. The big problem in Java and other Java-like languages (like C#) is that programmers are not used to (and are not even supposed to) design their own data types; they are expected to reuse some package that will do the job, without caring about possible overhead costs.

For a container library in C, fighting overhead and offering smaller versions of data types that have less overhead will be an important point.

In a typical Java heap we have tens or hundreds of thousands, even millions, of live collections. [1] Java heaps have grown from 500 MB to 2-3 GB now, without supporting more features or users. It is increasingly common to require 1 GB of memory just to support a few hundred users, saving 500K of session state PER USER, or requiring 2 MB for a text index per simple document, or creating 100K temporary objects per web hit.

The consequences are clear: scalability disappears. At several thousand users, the Java solution will require more than the 16 GB of installed RAM and will start swapping, killing performance. Power usage goes up, and more machines need to be bought to support the bloat. With more machines come more communications, more overhead, etc.

It is common to propose here that C can't be used for normal workstation applications. I am convinced that this is wrong. C can be used for web servers and web applications, and with a reasonable container library it could have three "killer arguments" for its use: scalability, low overhead, performance.

jacob (yes, I am biased)

[1] [URL]http://domino.research.ibm.com/comm/research_people.nsf/pages/sevitsky.pubs.html/$FILE/oopsla08%20memory-efficient%20java%20slides.pdf[/URL]
[/QUOTE]