reformatting strings and memory issue

D

David Jacques

I have to reformat a string from the form

"SRID=4269;POINT(-90.673 69.4310000006199)"

to

GeometryFromText('POINT (-141.095 68.5430000006417)',4269) );

I have a function to do this by taking the input string as char* and
returning
another char* as the result string. This function is called repeatedly in a
loop
(once for each table record). However the second time this function is
called,
the memory is not allocated. I figure out the size of the output string by
the
length of the inputs + 1 for the null char. I can't issue free() within the
function
since that memory is the char* object I am returning. Anyone have an idea
what
is going on ?


char* BuildGeometryInsert(char* in_geom_columndata)
{
int sSRID;
char sGeomPrefix[] = "GeometryFromText('";
char* pchr1;
char* pchr2;
int Length;
char* sGeometry;

pchr1 = strrchr(in_geom_columndata,'=');
pchr1++;
sSRID = atoi(pchr1);
pchr2 = strrchr(in_geom_columndata,';');
pchr2++;
Length = (strlen(sGeomPrefix) + strlen(pchr2)+1);
sGeometry = (char*)calloc(Length,sizeof(char));
if (sGeometry == NULL)
{
printf("\nFATAL ERROR: Cannot allocate memory for sGeometry\n");
printf(" Exit now...\n");
exit(0);
}
sprintf(sGeometry,"%s%s%',%d",sGeomPrefix,pchr2,sSRID);
return sGeometry;
}
 
J

Jens.Toerring

David Jacques said:
I have to reformat a string from the form
"SRID=4269;POINT(-90.673 69.4310000006199)"

GeometryFromText('POINT (-141.095 68.5430000006417)',4269) );

That doesn't exactly look like what you're trying to do below, but I
guess you just did got something mixed up here - otherwise there
would have to be some transformations of the numbers after "POINT".
I have a function to do this by taking the input string as char* and
returning another char* as the result string. This function is called
repeatedly in a loop (once for each table record). However the second
time this function is called, the memory is not allocated. I figure
out the size of the output string by the length of the inputs + 1 for
the null char.

No, you don't, see below.
char* BuildGeometryInsert(char* in_geom_columndata)
{
int sSRID;
char sGeomPrefix[] = "GeometryFromText('";
char* pchr1;
char* pchr2;
int Length;
char* sGeometry;
pchr1 = strrchr(in_geom_columndata,'=');

It's prudent to check that strrchr() didn't return NULL. (Why do
you use strrchr() instead of strchr(), BTW?)
pchr1++;
sSRID = atoi(pchr1);
pchr2 = strrchr(in_geom_columndata,';');
pchr2++;
Length = (strlen(sGeomPrefix) + strlen(pchr2)+1);

That's not enough memory since you also try to write the number
stored in sSRID plus some extra characters (quotes, a comma and
probably also two parentheses and a semicolon, but which you
probably just forgot in the format string) into the string.
sGeometry = (char*)calloc(Length,sizeof(char));
if (sGeometry == NULL)
{
printf("\nFATAL ERROR: Cannot allocate memory for sGeometry\n");
printf(" Exit now...\n");
exit(0);
}
sprintf(sGeometry,"%s%s%',%d",sGeomPrefix,pchr2,sSRID);
return sGeometry;
}
Regards, Jens
 
D

David Jacques

Oops...

I used the wrong cut and paste for the example.
The number pairs should be identical as below:

input : "SRID=4269;POINT(-90.673 69.4310000006199)"
output : "GeometryFromText('POINT (-90.673 69.4310000006199)',4269) ); "

As for strchr() vs strrchr() I guess its academic since each character
I search for to split the string ('=' and ';' ) occurs only once in the
input string anyway.

I'll look at the extra characters issue. It does look line I need to extend
the length
by the # of digits in SRID + 6 . However even if the space allocated is too
small,
shouldn't that cause a run-time error in sprintf, not calloc. calloc is what
is not
succeeding.


David Jacques said:
I have to reformat a string from the form
"SRID=4269;POINT(-90.673 69.4310000006199)"

GeometryFromText('POINT (-141.095 68.5430000006417)',4269) );

That doesn't exactly look like what you're trying to do below, but I
guess you just did got something mixed up here - otherwise there
would have to be some transformations of the numbers after "POINT".
I have a function to do this by taking the input string as char* and
returning another char* as the result string. This function is called
repeatedly in a loop (once for each table record). However the second
time this function is called, the memory is not allocated. I figure
out the size of the output string by the length of the inputs + 1 for
the null char.

No, you don't, see below.
char* BuildGeometryInsert(char* in_geom_columndata)
{
int sSRID;
char sGeomPrefix[] = "GeometryFromText('";
char* pchr1;
char* pchr2;
int Length;
char* sGeometry;
pchr1 = strrchr(in_geom_columndata,'=');

It's prudent to check that strrchr() didn't return NULL. (Why do
you use strrchr() instead of strchr(), BTW?)
pchr1++;
sSRID = atoi(pchr1);
pchr2 = strrchr(in_geom_columndata,';');
pchr2++;
Length = (strlen(sGeomPrefix) + strlen(pchr2)+1);

That's not enough memory since you also try to write the number
stored in sSRID plus some extra characters (quotes, a comma and
probably also two parentheses and a semicolon, but which you
probably just forgot in the format string) into the string.
sGeometry = (char*)calloc(Length,sizeof(char));
if (sGeometry == NULL)
{
printf("\nFATAL ERROR: Cannot allocate memory for sGeometry\n");
printf(" Exit now...\n");
exit(0);
}
sprintf(sGeometry,"%s%s%',%d",sGeomPrefix,pchr2,sSRID);
return sGeometry;
}
Regards, Jens
 
C

Charlie Gordon

David Jacques said:
I have to reformat a string from the form

"SRID=4269;POINT(-90.673 69.4310000006199)"

to

GeometryFromText('POINT (-141.095 68.5430000006417)',4269) );
there seems to be an extra closing parenthesis ?

C is notoriously bad at doing this kind of textual stuff: a one line awk script
would do fine.
I have a function to do this by taking the input string as char* and
returning
another char* as the result string. This function is called repeatedly in a
loop
(once for each table record). However the second time this function is
called,
the memory is not allocated. I figure out the size of the output string by
the
length of the inputs + 1 for the null char. I can't issue free() within the
function
since that memory is the char* object I am returning. Anyone have an idea
what
is going on ?


char* BuildGeometryInsert(char* in_geom_columndata)
{
int sSRID;
char sGeomPrefix[] = "GeometryFromText('";
useless, inefficient, use const char *sGeomPrefix = "GeometryFromText('";
char* pchr1;
char* pchr2;
int Length;
char* sGeometry;

pchr1 = strrchr(in_geom_columndata,'=');
why strrchr() instead of strchr()? also should check pchr1 != NULL
pchr1++;
sSRID = atoi(pchr1);
pchr2 = strrchr(in_geom_columndata,';');
same as above.
pchr2++;
Length = (strlen(sGeomPrefix) + strlen(pchr2)+1);
not large enough: forgot the space needed for the ID and the punctuation.
sGeometry = (char*)calloc(Length,sizeof(char));
who is going to free() this buffer ?
if (sGeometry == NULL)
{
printf("\nFATAL ERROR: Cannot allocate memory for sGeometry\n");
printf(" Exit now...\n");
exit(0);
}
sprintf(sGeometry,"%s%s%',%d",sGeomPrefix,pchr2,sSRID);
format string is hard to understand, missing closing parentheses, why not use:
snprintf(sGeometry, Length, "GeometryFromText('%s', %d);", pchr2,sSRID);
after painfully computing Length.
or use a better approach using asprintf() (*) :
char *p;
asprintf(&p, "GeometryFromText('%s', %d);", pchr2,sSRID);
if (p == NULL) {
/* complain about memory allocation failure */
} else {
return p;
}
return sGeometry;
}

The problem with your approach is that you do not clearly specify whose
responsibility it is to free() the pointer allocated by you function. The
pseudo spec is very confusing. You should define what you pass the function and
how the function returns the result.
A more afficient approach would be to pass the function a buffer large enough to
hold the result string along with its size, and use snprintf into that directly.

From your description, I assume you intend the following behaviour :

FILE *fin, *fout;
char buf[1024]; /* some palatable maximum size */
char buf2[1024]; /* some palatable maximum size */
fin = fopen(input_filename, "r");
fout = fopen(output_filename, "w");
/* fopen failure tests omitted */
while (fgets(buf, sizeof(buf), fin)) {
char *p;
if ((p = strrchr(buf, '\n')) != NULL)
*p = '\0';
BuildGeometryInsert(buf, buf2, sizeof(buf2));
fprintf(fout, "%s\n", buf2);
}
fclose(fout);
fclose(fin);

the "reformatting" function could return an error code, error handling is
omitted here.

-------------

(*) As for the sissies about to complain on non standard functions like
asprintf(), here is an almost portable implementation that you can add to your
project :

#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>

int asprintf(char **ptr, const char *fmt, ...);
int vasprintf(char **ptr, const char *fmt, va_list ap);

int asprintf(char **ptr, const char *fmt, ...) {
va_list ap;
int res;

va_start(ap, fmt);
res = vasprintf(ptr, fmt, ap);
va_end(ap);

return res;
}

int vasprintf(char **ptr, const char *fmt, va_list ap) {
int res, size;

res = vsnprintf(NULL, 0, fmt, ap);
if (res < 0) {
*ptr = NULL;
} else {
*ptr = malloc(res + 1);
if (*ptr) {
res = vsnprintf(*ptr, res + 1, fmt, ap);
}
}
return res;
}

portability issues are:
- reuse of ap argument twice in vasprintf()
- assumes correct return value from snprintf()
 
J

Jens.Toerring

David Jacques said:
I'll look at the extra characters issue. It does look line I need to extend
the length by the # of digits in SRID + 6 . However even if the space
allocated is too small, shouldn't that cause a run-time error in sprintf,
not calloc. calloc is what is not succeeding.

No, unfortunately that's not how it typically works. Once you write into
memory that doesn't belong to you can't count on anything anymore. You
may (or may not) write over some other data (or, on some machines, also
parts of your program). But quite often that has no immediate effect but
only creates a delayed problem, i.e. something that fails only when the
data you overwrote are actually needed, which might be a long time after
you destroyed them. It's quite common that such problems result in crashes
in malloc(), free() etc, since these functions are often implemented in
a way that bookkeeping information is stored between the allocated
blocks of memory, so if you write past the end of a such a block you
can destroy these important data and in a later call of malloc() or
free() etc. things suddenly fail.

Regards, Jens
 
C

Chris Croughton

No, unfortunately that's not how it typically works. Once you write into
memory that doesn't belong to you can't count on anything anymore. You
may (or may not) write over some other data (or, on some machines, also
parts of your program). But quite often that has no immediate effect but
only creates a delayed problem, i.e. something that fails only when the
data you overwrote are actually needed, which might be a long time after
you destroyed them. It's quite common that such problems result in crashes
in malloc(), free() etc, since these functions are often implemented in
a way that bookkeeping information is stored between the allocated
blocks of memory, so if you write past the end of a such a block you
can destroy these important data and in a later call of malloc() or
free() etc. things suddenly fail.

The dmalloc wrapper can help with that, especially if you enable 'guard'
areas (if you set it to output everything it (a) takes a very long time
but (b) often indicates exactly what is writing to where). Since it
does memory stuff it is system specific (to POSIX/Unix systems,
including Cygwin). See http://sourceforge.net/projects/dmalloc/ (and
several distributions have it available as a binary). There are also
similar things available for Windows but I haven't used them.

Chris C
 
J

jjr2004a

I have to reformat a string from the form
"SRID=4269;POINT(-90.673 69.4310000006199)"

to

GeometryFromText('POINT (-141.095 68.5430000006417)',4269) );

char* BuildGeometryInsert(char* in_geom_columndata)
{
int sSRID;
char sGeomPrefix[] = "GeometryFromText('";
char* pchr1;
char* pchr2;
int Length;
char* sGeometry;

pchr1 = strrchr(in_geom_columndata,'=');
pchr1++;
sSRID = atoi(pchr1);
pchr2 = strrchr(in_geom_columndata,';');
pchr2++;
Length = (strlen(sGeomPrefix) + strlen(pchr2)+1);
sGeometry = (char*)calloc(Length,sizeof(char));
if (sGeometry == NULL)
{
printf("\nFATAL ERROR: Cannot allocate memory for sGeometry\n");
printf(" Exit now...\n");
exit(0);
}
sprintf(sGeometry,"%s%s%',%d",sGeomPrefix,pchr2,sSRID);
return sGeometry;
}

The are two main issues here:

1. You not allocated enough space since you haven't accounted for the
length of SSRID in your length calculation.

2. You have not placed a '\0' at the end of your char data.

Other issues:

1. Why use atoi if your just going to convert back to char* again?
You can eliminate sprintf (and issue #2 above) if you manipulate
char* data via strcpy, strncpy, strcat, etc.
 
D

David Jacques

I've shelved the need for this function since I discovered that
the database table can accept the geometry column data in the form
"SRID=XXXX;GeometryType(coordinatePair1,.....coordinatePairN)"

So I was following the syntax of the SQL statement from the PostGIS
manual a tad too literally. However everyone's advice did help for
other parts of this project.



jjr2004a said:
I have to reformat a string from the form

"SRID=4269;POINT(-90.673 69.4310000006199)"

to

GeometryFromText('POINT (-141.095 68.5430000006417)',4269) );

char* BuildGeometryInsert(char* in_geom_columndata)
{
int sSRID;
char sGeomPrefix[] = "GeometryFromText('";
char* pchr1;
char* pchr2;
int Length;
char* sGeometry;

pchr1 = strrchr(in_geom_columndata,'=');
pchr1++;
sSRID = atoi(pchr1);
pchr2 = strrchr(in_geom_columndata,';');
pchr2++;
Length = (strlen(sGeomPrefix) + strlen(pchr2)+1);
sGeometry = (char*)calloc(Length,sizeof(char));
if (sGeometry == NULL)
{
printf("\nFATAL ERROR: Cannot allocate memory for sGeometry\n");
printf(" Exit now...\n");
exit(0);
}
sprintf(sGeometry,"%s%s%',%d",sGeomPrefix,pchr2,sSRID);
return sGeometry;
}

The are two main issues here:

1. You not allocated enough space since you haven't accounted for the
length of SSRID in your length calculation.

2. You have not placed a '\0' at the end of your char data.

Other issues:

1. Why use atoi if your just going to convert back to char* again?
You can eliminate sprintf (and issue #2 above) if you manipulate
char* data via strcpy, strncpy, strcat, etc.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,754
Messages
2,569,521
Members
44,995
Latest member
PinupduzSap

Latest Threads

Top