I am working on this during the commute, and I prefer to polish the
code to some degree before mindlessly compiling it. I am sharing this
code with the group while it is being developed to let you clowns
participate, not for you to make unfounded judgements on competence.
Kernighan counts it as a virtue that Dennis Ritchie completed the so-
called "regular expression" processor in Beautiful Code in an hour,
and this attitude results in a culture of undocumented slapdash, or
programmers who have no life. Life is short, art is long.
The current code will be posted at the bottom of this reply.
not a good idea. The preprocessor doesn't really understand C
and this can lead to problems. Much better to define type aliases
using the typedef keyword that C provides for this purpose.
I'd suggest
typedef struct ptrRopeNode * ROPE;
except it's usually a bad idea to hide pointers in typedefs
and if it's a typedef I wouldn't use UC. Hence
typedef struct ptrRopeNode Rope;
and I'd probably use another typedef to hide the struct (about
half of clc will disagree though)
This suggestion makes sense, because I agree that the preprocessor
fucks things up. I will revise the next edition to use typedefs as you
suggest. This is why I am sharing this code with this group: to
improve the toxic atmosphere of this group by letting people make
suggestions without impugning their professional credentials (unless
their combined stupidity and malice exceeds an upper bound).
and I still think the abstraction may be poor. Does your application
Don't know if the abstraction can be much better while staying in C
(and that's my point).
(you must have something in mind) need to distinguish strings from
ropes? Is this a generic string class or does the application
deal with "ropes" (which I guess are huge easy to edit strings).
Text editor application?
[...] It hasn't been tested,
I'd strongly suggest compiling *and* testing before posting.
If you write the tests at the same time as the code it is quite
painless
(cf. TDD- Though Don't Drink - the Kool Aid)
As I note above, as a full time Hong Kong teacher who works a six day
week, I work on this during the commute. As I note above, this is a
participatory project for this group.
I am past caring about a "professional reputation", since I don't
choose to work as a programmer for money any more, and the normalized
deviance of this group, and of programming groups in meat space, is
the reason.
with my suggested typedef this would be
Rope *ptrStringNode =
though better would be
Rope *rope =
Very nice.
or (I don't like this but it matches the style of the rest of the
program).
Rope *ptrRope =
adding comments to a mess doesn't make it less of a mess.
Break it down into separate statements. Heavy commenting
is often a sign of poor modularisation (not something to tell
people or they stop writing the comments...).
I need to show the parameters of the call. I agree that heavy
commenting by people who can't write sucks. I can write.
/* perhaps like this */
if (uintSegments <= 1)
Rope *rope = mkRopeNode
(ptrSourceChars,
lngStartIndex,
uintSegmentLength,
ptrParent, NULL, NULL);
else
if (uintSegments < 2)
Rope *rope = mkRopeNode
(ptrSourceChars + SEGMENT_MAX_LENGTH,
lngStartIndex + SEGMENT_MAX_LENGTH,
MIN (uintSegmentLength - SEGMENT_MAX_LENGTH,
uintSegmentLength),
ptrParent,
ptrLeft =
mkRopeNode
(ptrSourceChars, lngStartIndex,
SEGMENT_MAX_LENGTH, NULL, NULL, NULL),
NULL);
else
Rope *rope = mkRopeNode
(ptrSourceChars + SEGMENT_MAX_LENGTH,
lngStartIndex + SEGMENT_MAX_LENGTH,
MIN(uintSegmentLength - SEGMENT_MAX_LENGTH,
uintSegmentLength),
ptrParent,
ptrLeft =
mkRopeNode (ptrSourceChars, lngStartIndex,
SEGMENT_MAX_LENGTH, NULL,
NULL, NULL),
ptrRight =
mkString
(NULL,
ptrSourceChars +
uintDoubleSegmentLength,
lngStartIndex +
uintDoubleSegmentLength,
uintSegmentLength -
uintDoubleSegmentLength));
[I may have introduced bugs but it illustrates the idea]
Yes, and unlike Richard Heathfield, I won't use the bugs to impugn
your professional reputation.
It's still messy but it tries to break things into steps.
I'd use yet more temporary variables. I don't like that ptrRight
ptrLeft assignment in the function call. They seem candidates to be
assigned to outside the function call.
OK, I prefer expressions to statements. I should have learned Lisp
more thoroughly and stayed with it.
I haven't had the time to see if the revised statement compiles since
I am adding comments and improvements to the rest of the code, but the
above is meant, in the spirit of Knuth's definition, to communicate my
intention to human beings [...]
which, I submit, you failed to do.
That's not true. You've been able to make useful suggestions and
rewrite the Godzilla statement in your preferred way.
Anyway, here is the code right now. It hasn't been compiled and it
hasn't been tested. It needs a routine to determineStringLength()
which will use recursion and which will be quite simple to write (add
the length of the left segment to the length passed, zero at the top
level of recursion, and call yourself for the right hand side).
But to make such a routine elegant, it turns out that each node needs
to have the length of all nodes below it, so this entails the revision
of other code.
// ***************************************************************
// * *
// * Rope implementation *
// * *
// * This program implements strings, called ropes, which can *
// * contain any character and can be of any length.
*
// * *
// * This code is best viewed using a monospace font such as *
// * Courier New, in a line length of 67 characters. *
// * *
// * The rest of this header addresses these topics: *
// * *
// * *
// * * The "rope" as represented in this code *
// * * Configuring this code for testing *
// * *
// * *
// * THE "ROPE" AS REPRESENTED IN THIS CODE -------------------- *
// * *
// * The rope is defined as the binary tree of its nodes, where *
// * each node points to up to SEGMENT_MAX_LENGTH characters of *
// * the string. Each node of the binary tree contains: *
// * *
// * *
// * * A pointer to the start of the segment *
// * *
// * * The node type: *
// * *
// * + ENUnodeType.child indicates that the node is a *
// * child node of another node.
*
// * *
// * + ENUnodeType.ancestorLengthKnown indicates that the*
// * node has no parent AND that the length of the *
// * entire string is at this time, known. *
// * *
// * + ENUnodeType.ancestorLengthTooLarge indicates that *
// * the node has no parent AND that the length of the *
// * entire string exceeds a maximum value. *
// * *
// * By default, this maximum value is 2^63-1, the *
// * maximum length of an unsigned "long long" less a *
// * margin to allow for testing the length. The *
// * maximum value, however, can be set to smaller *
// * values for testing. *
// * *
// * + ENUnodeType.ancestorLengthUnknown indicates that *
// * the node has no parent AND that the length of the *
// * entire string is at this time, fully unknown. *
// * *
// * * The 64 bit unsigned string length. Note that this *
// * value may be unusable: *
// * *
// * + The length may be undetermined: here the node *
// * type will be ENUnodeType.ancestorLengthUnknown *
// * *
// * + The length may exceed long-long precision: here *
// * the node type will be *
// * ENUnodeType.ancestorLengthUnknown *
// * *
// * + The node may be a child node: here the node type *
// * will be ENUnodeType.child *
// * *
// * * The start index of the rope segment (substring) *
// * within the full string; note that in rare cases this*
// * is unknown when the string is longer than 2^64-1 *
// * characters, and the segment is located on the right *
// * of the string, this value may also be unknown. Here *
// * it is set to -1 (UNKNOWNLENGTH). *
// * *
// * * The length of the rope segment (substring) *
// * *
// * * The address of the parent of the node or NULL *
// * *
// * * The address of the left child of the node (holding *
// * characters to the left of the segment) or NULL *
// * *
// * * The address of the right child of the node (holding *
// * characters to the right of the segment) or NULL *
// * *
// * *
// * CONFIGURING THIS CODE FOR TESTING ------------------------- *
// * *
// * The following symbols may be set to configure a test of this*
// * code. *
// * *
// * *
// * * SEGMENT_MAX_LENGTH: maximum length of a rope string *
// * segment *
// * *
// * * STRING_MAX_RECORDABLE_LENGTH: maximum recordable *
// * length of a string, beyond which the string length *
// * is officially "unknown". To test whether strings *
// * whose lengths are not known, this symbol may be set *
// * to a small value *
// * *
// * * TEST_STRING_LENGTH: length of the string used to *
// * test this code *
// * *
// * *
// * C H A N G E R E C O R D --------------------------------- *
// * DATE PROGRAMMER DESCRIPTION OF CHANGE *
// * -------- ---------- --------------------------------- *
// * 08 14 09 Nilges Started version 1 *
// * *
// * *
// * I S S U E S ----------------------------------------------- *
// * DATE POSTER DESCRIPTION AND RESOLUTION *
// * -------- ---------- --------------------------------- *
// * 08 14 09 Nilges To be implemented: use extra *
// * precision arithmetic for unbounded*
// * strings *
// * *
// * *
// ***************************************************************
// ***** Standard includes ***************************************
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <stddef.h>
// ***** Configuration *******************************************
#define SEGMENT_MAX_LENGTH 32
#define STRING_MAX_RECORDABLE_LENGTH 1024 // Set to 2^63-1 for
// actual use
#define TEST_STRING_LENGTH 2000
// ***** Macros **************************************************
// --- Statement macro to handle errors
#define ERRORHANDLER(strMessage) \
{ \
printf("%s\n", (strMessage)); abort(); \
}
// --- Expression macro to perform malloc
#define MALLOC_(bytesNeeded, cast) \
( (cast)(malloc((bytesNeeded))) )
// --- Statement macro to perform malloc with error checking
#define MALLOC(ptr, bytesNeeded, cast, purpose)
{ \
if (((ptr) = MALLOC_((bytesNeeded),(cast))) == NULL)) \
{ \
printf \
("Can't get storage: purpose was: %s\n", (purpose)); \
errorHandler("Malloc failed"); \
} \
}
// --- Symbols
#define KNOWNLENGTH 0
#define UNKNOWNLENGTH -1
#define ABSURDLENGTH -2
// ***** Rope definition *****************************************
#define ROPE struct TYPropeNode *
enum ENUnodeType
{
child,
ancestorLengthKnown,
ancestorLengthTooLarge,
ancestorLengthUnknown
};
struct TYPropeNode
{
enum ENUnodeType enuNodeType;
char * ptrNodeCharacters;
unsigned long long ulngStringLength;
unsigned long long ulngSegmentStartIndex;
unsigned int uintSegmentLength;
ROPE ptrParent;
ROPE ptrLeft;
ROPE ptrRight;
};
// ***** Procedures **********************************************
// ---------------------------------------------------------------
// Create a rope node
//
// To create the Ancestor node, pass a non-null ptrParent and a
// lngStringLength:
//
// * 0 or positive indicates a known length
//
// * -1 (UNKNOWNLENGTH) indicates a fully unknown length
//
// * -2 (ABSURDLENGTH) indicates an unknown length greater
// than the maximum recordable length
//
//
ROPE mkRopeNode
(char * ptrNodeCharacters,
long long lngStringLength
long long lngSegmentStartIndex,
unsigned int uintSegmentLength,
ROPE ptrParent,
ROPE ptrLeft,
ROPE ptrRight)
{
ROPE ptrNewNode;
MALLOC(ptrNewNode, sizeof(ROPE), ROPE, "Get a rope node")
(*ptrNewNode).ptrNodeCharacters = ptrNodeCharacters;
if (ptrParent == NULL)
{
if (lngStringLength >= 0)
{
(*ptrNewNode).enuNodeType =
ENUnodeType.ancestorLengthKnown;
} else
{
if (lngStringLength == UNKNOWNLENGTH)
{
(*ptrNewNode).enuNodeType =
ENUnodeType.ancestorLengthUnknown;
} else
{
if (lngStringLength == ABSURDLENGTH)
{
(*ptrNewNode).enuNodeType =
ENUnodeType.ancestorLengthTooLarge;
} else
{
errorHandler("Invalid string length");
}
}
}
} else
{
(*ptrNewNode).enuNodeType = ENUnodeType.child;
}
(*ptrNewNode).ulngStringLength = lngStringLength;
(*ptrNewNode).lngSegmentStartIndex = lngSegmentStartIndex;
(*ptrNewNode).uintSegmentLength = uintSegmentLength;
(*ptrNewNode).ptrParent = ptrParent;
(*ptrNewNode).ptrLeft = ptrLeft;
(*ptrNewNode).ptrRight = ptrRight;
return ptrNewNode;
}
// ---------------------------------------------------------------
// Make string
//
//
ROPE mkString
(ROPE ptrParent,
char * ptrSourceChars,
unsigned int lngStartIndex,
unsigned int uintSegmentLength)
{
unsigned int uintOffset;
ROPE ptrLeft = NULL;
ROPE ptrRight = NULL;
unsigned int uintSegments;
unsigned int uintDoubleSegmentLength;
if (uintSegmentLength <= SEGMENT_MAX_LENGTH)
uintSegments = 1;
else
{
if (uintSegmentLength
<=
(uintDoubleSegmentLength = SEGMENT_MAX_LENGTH << 1))
uintSegments = 2;
else
uintSegments = 3;
}
ROPE ptrStringNode =
mkRopeNode
(// ----- Pointer to source chars -----------------------
(uintSegments > 1
?
ptrSourceChars + SEGMENT_MAX_LENGTH
:
ptrSourceChars),
// ----- Parent segment start index --------------------
(lngStartIndex
+
(uintSegments > 1
?
SEGMENT_MAX_LENGTH
:
0)),
// ----- Parent segment length (screw you Heathfield) --
(uintSegments > 1
?
MIN(uintSegmentLength - SEGMENT_MAX_LENGTH,
uintSegmentLength)
:
uintSegmentLength),
// ----- Parent's own parent or null -------------------
ptrParent,
// ----- Left child node exists when segments > 1 ------
(uintSegments > 1
?
ptrLeft = mkRopeNode
(ptrSourceChars,
lngStartIndex,
SEGMENT_MAX_LENGTH,
NULL, NULL, NULL)
:
NULL),
// ----- Right child node exists when segments > 1 ----
(uintSegments > 2
?
(ptrRight = mkString
(NULL,
ptrSourceChars
+
uintDoubleSegmentLength,
lngStartIndex
+
uintDoubleSegmentLength,
uintSegmentLength
-
uintDoubleSegmentLength))
:
NULL);
if (ptrStringNode != NULL)
{
(*ptrStringNode).ptrLeft = ptrLeft;
(*ptrStringNode).ptrRight = ptrRight;
if (ptrLeft != NULL)
(*((*ptrStringNode).ptrLeft)).ptrParent =
ptrStringNode;
if (ptrRight != NULL)
(*((*ptrStringNode).ptrRight)).ptrParent =
ptrStringNode;
}
return ptrStringNode;
}
// ---------------------------------------------------------------
// Find rope segment's ancestor (top of rope tree)
//
//
ROPE findRopeAncestor(ROPE ptrRopeNode)
{
if (ptrToNode == NULL)
ERRORHANDLER
("Null parameter passed to findRopeAncestor()");
if ((*ptrRopeNode).enuNodeType != ENUnodeType.child)
return ptrRopeNode;
while((*ptrRopeNode).ptrParent != NULL)
ptrRopeNode = (*ptrRopeNode).ptrParent;
return ptrRopeNode;
}
// ---------------------------------------------------------------
// Determine rope length
//
//
void determineRopeLength(ROPE ptrToNode,
unsigned long long ulngLength)
{
// Will check for overflow (old length plus length of left
// segment out of bounds), then, if no overflow, will add
// the left segment length to the length of the right subtree
}
// ---------------------------------------------------------------
// Rope to string length
//
//
unsigned long long rope2StringLength
(ROPE ptrToNode, int * intPtrStatus)
{
if (ptrToNode == NULL)
ERRORHANDLER
("Null parameter passed to rope2StringLength()");
ptrToNode = findRopeAncestor(ptrToNode);
if ((*ptrToNode).enuNodeType
==
ENUnodeType.ancestorLengthUnknown)
determineRopeLength(ptrToNode);
if ((*ptrToNode).enuNodeType
==
ENUnodeType.ancestorLengthKnown)
{
(*intPtrStatus) = KNOWNLENGTH;
return (*ptrToNode).ulngStringLength;
}
if ((*ptrToNode).enuNodeType
==
ENUnodeType.ancestorLengthUnknown)
{
(*intPtrStatus) = UNKNOWNLENGTH;
return 0;
} else
{
if ((*ptrToNode).enuNodeType
==
ENUnodeType.ancestorLengthTooLarge)
{
(*intPtrStatus) = UNKNOWNLENGTH;
return 0;
} else
{
ERRORHANDLER("Invalid node type found")
}
}
}
char* rope2String(ROPE ptrToNode)
{
char * ptrStringCopy =
(char *)malloc(rope2StringLength(ptrToNode));
if (ptrStringCopy == NULL) return NULL;
}
int main()
{
printf("Unlimited strings in basic C");
// ***** Insert test code *****
return 0;
}