Memory allocation problem

Discussion in 'C Programming' started by Bob, Mar 23, 2006.

  1. Bob

    Bob Guest

    I have been working on the following program. The goal is to have a
    tokenizing routine that avoids some of the problems of strtok(), the
    comments should explain the features.

    This runs fine on Solaris/gcc, but crashes when run from VC++ (C mode).
    The problem occurs after the first true reallocation (second pass
    through main loop). The little debug part at the end of the loop prints
    one time, it crashes before a second is displayed. Seems likely that
    something is trashing the heap but I haven't spotted it.

    Code follows:

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>


    void cleanup(char **array, size_t n)
    {
        if (array != 0)
        {
            unsigned int i;
            for (i = 0; i <= n; i++)
            {
                free(array[i]);
            }

            free(array);
        }
    }

    /* Divides a string into tokens
    Does not alter input string
    Single delimiter char
    Does not merge adjacent delimiters
    Returns null-ptr-terminated array of substring pointers
    */

    char** tokenize (const char *instring, int delimiter)
    {
        char **strarray = 0;
        char **tmp = 0;
        char *tok = 0;
        const char *start = instring;
        const char *end = 0;
        int done = 0;
        int n = 0;
        size_t len = 0;
        int i;

        if (start == 0)
        {
            return 0;
        }

        while (!done)
        {
            end = strchr(start, delimiter);

            if (end == 0) /* end of string */
            {
                len = strlen(start);
                done = 1;
            }
            else
            {
                len = end - start;
            }

            n++; /* reflects num elements in array */

            /* allocates one extra pointer for null */
            tmp = realloc(strarray, n+1);

            if (tmp == 0) /* allocation failed */
            {
                cleanup(strarray, n);
                strarray = 0;
                done = 1;
            }
            else
            {
                strarray = tmp;
                tok = malloc(len+1);
                if (tok == 0)
                {
                    cleanup(strarray, n);
                    strarray = 0;
                    done = 1;
                }
                else
                {
                    strncpy(tok, start, len);
                    tok[len] = 0;
                    strarray[n-1] = tok;
                    strarray[n] = 0;
                }
            }

            start = end + 1;

            /* debug code intermediate state of array */
            for(i=0;strarray[i];i++)
                puts(strarray[i]);
        }

        return strarray;
    }


    int main(void)
    {
        char *s1 = "one and two and three";
        char **arr;
        int i = 0;

        arr = tokenize(s1, ' ');

        if (arr == 0)
        {
            puts("Error");
        }
        else
        {
            while (arr[i] != 0)
            {
                puts(arr[i]);
                i++;
            }
        }

        return 0;
    }
     
    Bob, Mar 23, 2006
    #1

  2. Bob

    santosh Guest

    Bob wrote:
    > I have been working on the following program. The goal is to have a
    > tokenizing routine that avoids some of the problems of strtok(), the
    > comments should explain the features.
    >
    > This runs fine on Solaris/gcc, but crashes when run from VC++ (C mode).
    > The problem occurs after the first true reallocation (second pass
    > through main loop). The little debug part at the end of the loop prints
    > one time, it crashes before a second is displayed. Seems likely that
    > something is trashing the heap but I haven't spotted it.
    >
    > Code follows:
    >
    > #include <stdio.h>
    > #include <string.h>
    > #include <stdlib.h>
    >
    >
    > void cleanup(char **array, size_t n)
    > {
    > if (array != 0)
    > {
    > unsigned int i;
    > for (i = 0; i <= n; i++)
    > {
    > free(array[i]);
    > }
    >
    > free(array);
    > }
    > }
    >
    > /* Divides a string into tokens
    > Does not alter input string
    > Single delimiter char
    > Does not merge adjacent delimiters
    > Returns null-ptr-terminated array of substring pointers
    > */
    >
    > char** tokenize (const char *instring, int delimiter)


    Why not declare delimiter as just char?

    > {
    > char **strarray = 0;
    > char **tmp = 0;
    > char *tok = 0;
    > const char *start = instring;
    > const char *end = 0;
    > int done = 0;
    > int n = 0;
    > size_t len = 0;
    > int i;
    >
    > if (start == 0)
    > {
    > return 0;
    > }
    >
    > while (!done)
    > {
    > end = strchr(start, delimiter);
    >
    > if (end == 0) /* end of string */


    Not quite. If delimiter is not a null character and it doesn't occur in
    start, strchr will also return a null pointer.

    > {
    > len = strlen(start);
    > done = 1;
    > }
    > else
    > {
    > len = end - start;
    > }


    All the above logic will fail unless delimiter is a null character or,
    if it is not, *does* occur in start.
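
    As a point of reference, standard C specifies that strchr returns a null pointer when the character does not occur in the string, and that the terminating null character counts as part of the string for the search. The sketch below shows how the two branches of the quoted code each yield a token length. The helper name token_length is hypothetical and is not part of the posted program.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the token that starts at `start` and
   ends at the next `delimiter`, or at the end of the string if the
   delimiter does not occur. */
size_t token_length(const char *start, int delimiter)
{
    const char *end = strchr(start, delimiter);

    if (end == NULL) {
        /* delimiter not found: the token runs to the end of the string */
        return strlen(start);
    }
    /* delimiter found; for '\0' this is the terminator itself */
    return (size_t)(end - start);
}
```

    Called as token_length("three", ' '), the null-pointer branch is taken and the whole remaining string is treated as the last token.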
     
    santosh, Mar 23, 2006
    #2

  3. Bob

    CBFalconer Guest

    Bob wrote:
    >
    > I have been working on the following program. The goal is to have
    > a tokenizing routine that avoids some of the problems of strtok(),
    > the comments should explain the features.
    >
    > This runs fine on Solaris/gcc, but crashes when run from VC++ (C
    > mode). The problem occurs after the first true reallocation
    > (second pass through main loop). The little debug part at the end
    > of the loop prints one time, it crashes before a second is
    > displayed. Seems likely that something is trashing the heap but I
    > haven't spotted it.
    >
    > Code follows:


    ... code snipped ...

    Try the following:

    /* ------- file toksplit.h ----------*/
    #ifndef H_toksplit_h
    # define H_toksplit_h

    # ifdef __cplusplus
    extern "C" {
    # endif

    #include <stddef.h>

    /* copy over the next token from an input string, after
    skipping leading blanks (or other whitespace?). The
    token is terminated by the first appearance of tokchar,
    or by the end of the source string.

    The caller must supply sufficient space in token to
    receive any token, Otherwise tokens will be truncated.

    Returns: a pointer past the terminating tokchar.

    This will happily return an infinity of empty tokens if
    called with src pointing to the end of a string. Tokens
    will never include a copy of tokchar.

    released to Public Domain, by C.B. Falconer.
    Published 2006-02-20. Attribution appreciated.
    */

    const char *toksplit(const char *src, /* Source of tokens */
                         char tokchar,    /* token delimiting char */
                         char *token,     /* receiver of parsed token */
                         size_t lgh);     /* length token can receive */
                                          /* not including final '\0' */

    # ifdef __cplusplus
    }
    # endif
    #endif
    /* ------- end file toksplit.h ----------*/

    /* ------- file toksplit.c ----------*/
    #include "toksplit.h"

    /* copy over the next token from an input string, after
    skipping leading blanks (or other whitespace?). The
    token is terminated by the first appearance of tokchar,
    or by the end of the source string.

    The caller must supply sufficient space in token to
    receive any token, Otherwise tokens will be truncated.

    Returns: a pointer past the terminating tokchar.

    This will happily return an infinity of empty tokens if
    called with src pointing to the end of a string. Tokens
    will never include a copy of tokchar.

    A better name would be "strtkn", except that is reserved
    for the system namespace. Change to that at your risk.

    released to Public Domain, by C.B. Falconer.
    Published 2006-02-20. Attribution appreciated.
    */

    const char *toksplit(const char *src, /* Source of tokens */
                         char tokchar,    /* token delimiting char */
                         char *token,     /* receiver of parsed token */
                         size_t lgh)      /* length token can receive */
                                          /* not including final '\0' */
    {
        if (src) {
            while (' ' == *src) src++; /* skip leading blanks */

            while (*src && (tokchar != *src)) {
                if (lgh) {
                    *token++ = *src;
                    --lgh;
                }
                src++;
            }
            if (*src && (tokchar == *src)) src++;
        }
        *token = '\0';
        return src;
    } /* toksplit */

    #ifdef TESTING
    #include <stdio.h>

    #define ABRsize 6 /* length of acceptable token abbreviations */

    int main(void)
    {
        char teststring[] = "This is a test, ,, abbrev, more";

        const char *t, *s = teststring;
        int i;
        char token[ABRsize + 1];

        puts(teststring);
        t = s;
        for (i = 0; i < 4; i++) {
            t = toksplit(t, ',', token, ABRsize);
            putchar(i + '1'); putchar(':');
            puts(token);
        }

        puts("\nHow to detect 'no more tokens'");
        t = s; i = 0;
        while (*t) {
            t = toksplit(t, ',', token, 3);
            putchar(i + '1'); putchar(':');
            puts(token);
            i++;
        }

        puts("\nUsing blanks as token delimiters");
        t = s; i = 0;
        while (*t) {
            t = toksplit(t, ' ', token, ABRsize);
            putchar(i + '1'); putchar(':');
            puts(token);
            i++;
        }
        return 0;
    } /* main */

    #endif
    /* ------- end file toksplit.c ----------*/

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
    More details at: <http://cfaj.freeshell.org/google/>
    Also see <http://www.safalra.com/special/googlegroupsreply/>
     
    CBFalconer, Mar 23, 2006
    #3
  4. Bob

    Pedro Graca Guest

    Bob wrote:
    > I have been working on the following program. The goal is to have a
    > tokenizing routine that avoids some of the problems of strtok(), the
    > comments should explain the features.
    >
    > This runs fine on Solaris/gcc, but crashes when run from VC++ (C mode).
    > The problem occurs after the first true reallocation (second pass
    > through main loop). The little debug part at the end of the loop prints
    > one time, it crashes before a second is displayed. Seems likely that
    > something is trashing the heap but I haven't spotted it.


    Try valgrind.

    $ valgrind --tool=memcheck ./a.out
    ==13772== Memcheck, a memory error detector for x86-linux.
    ==13772== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al.
    ==13772== Using valgrind-2.4.0, a program supervision framework for x86-linux.
    ==13772== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
    ==13772== For more details, rerun with: -v
    ==13772==
    ==13772== Invalid write of size 4
    ==13772== at 0x80485D3: tokenize (foo.c:124)
    ==13772== by 0x8048681: main (foo.c:144)
    ==13772== Address 0x1BA51028 is 0 bytes inside a block of size 2 alloc'd

    [biiiiiiiiiiiig snip]

    ==13772== ERROR SUMMARY: 16 errors from 8 contexts (suppressed: 13 from 1)
    ==13772== malloc/free: in use at exit: 11 bytes in 3 blocks.
    ==13772== malloc/free: 4 allocs, 1 frees, 13 bytes allocated.
    ==13772== For counts of detected errors, rerun with: -v
    ==13772== searching for pointers to 3 not-freed blocks.
    ==13772== checked 77824 bytes.
    ==13772==
    ==13772== LEAK SUMMARY:
    ==13772== definitely lost: 8 bytes in 2 blocks.
    ==13772== possibly lost: 0 bytes in 0 blocks.
    ==13772== still reachable: 3 bytes in 1 blocks.
    ==13772== suppressed: 0 bytes in 0 blocks.
    ==13772== Use --leak-check=full to see details of leaked memory.

    > Code follows:

    [snip]
    > char** tokenize (const char *instring, int delimiter)
    > {

    [snip]
    > /* allocates one extra pointer for null */
    > tmp = realloc(strarray, n+1);


    BANG!

    Try this instead

    tmp = realloc(strarray, (n+1) * sizeof *tmp);
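
    To illustrate the byte-count point, here is a minimal sketch of a single growth step written as a standalone helper. The name append_pointer and its signature are hypothetical and do not appear in the posted program; the sketch only assumes that the array is kept null-pointer-terminated, as the original comments describe.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: appends `item` to a null-pointer-terminated array
   that currently holds `*count` elements. Returns 0 on success, or -1 if
   realloc fails, in which case the array is left untouched. */
int append_pointer(char ***array, size_t *count, char *item)
{
    /* realloc takes a size in bytes: existing elements, the new element,
       and the terminating null pointer, each sizeof(char *) bytes wide */
    char **tmp = realloc(*array, (*count + 2) * sizeof *tmp);

    if (tmp == NULL) {
        return -1;
    }
    tmp[*count] = item;
    tmp[*count + 1] = NULL;
    *array = tmp;
    *count += 1;
    return 0;
}
```

    Writing the element size as sizeof *tmp keeps the expression correct even if the element type of the array changes later.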


    --
    If you're posting through Google read <http://cfaj.freeshell.org/google>
     
    Pedro Graca, Mar 23, 2006
    #4
  5. Bob

    Flash Gordon Guest

    Bob wrote:
    > I have been working on the following program. The goal is to have a
    > tokenizing routine that avoids some of the problems of strtok(), the
    > comments should explain the features.
    >
    > This runs fine on Solaris/gcc, but crashes when run from VC++ (C mode).
    > The problem occurs after the first true reallocation (second pass
    > through main loop). The little debug part at the end of the loop prints
    > one time, it crashes before a second is displayed. Seems likely that
    > something is trashing the heap but I haven't spotted it.
    >
    > Code follows:
    >
    > #include <stdio.h>
    > #include <string.h>
    > #include <stdlib.h>
    >
    >
    > void cleanup(char **array, size_t n)
    > {
    > if (array != 0)
    > {
    > unsigned int i;
    > for (i = 0; i <= n; i++)


    This looks dubious. Normally in C you have
        for (i = 0; i < n; i++)
    In your case, since you null-terminate your array of pointers, I would use:
        for (i = 0; array[i]; i++)

    Then get rid of the extra parameter. This would also allow you to call
    it from main to free things up without counting it first.
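
    A minimal sketch of that suggestion, assuming the array always ends with a null pointer, including while it is only partly built. The name cleanup_terminated is hypothetical; it is used here only to avoid confusion with the two-parameter cleanup in the quoted code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Frees every string in a null-pointer-terminated array, then the array
   itself. No element count is needed because the terminating null
   pointer marks the end. A null `array` is accepted and ignored. */
void cleanup_terminated(char **array)
{
    size_t i;

    if (array == NULL) {
        return;
    }
    for (i = 0; array[i] != NULL; i++) {
        free(array[i]);
    }
    free(array);
}
```

    With this shape, main could call cleanup_terminated(arr) once it has finished printing the tokens.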

    > {
    > free(array[i]);
    > }
    >
    > free(array);
    > }
    > }
    >
    > /* Divides a string into tokens
    > Does not alter input string
    > Single delimiter char
    > Does not merge adjacent delimiters
    > Returns null-ptr-terminated array of substring pointers
    > */
    >
    > char** tokenize (const char *instring, int delimiter)
    > {
    > char **strarray = 0;
    > char **tmp = 0;
    > char *tok = 0;
    > const char *start = instring;
    > const char *end = 0;
    > int done = 0;
    > int n = 0;
    > size_t len = 0;
    > int i;
    >
    > if (start == 0)
    > {
    > return 0;
    > }
    >
    > while (!done)
    > {
    > end = strchr(start, delimiter);
    >
    > if (end == 0) /* end of string */
    > {
    > len = strlen(start);
    > done = 1;
    > }
    > else
    > {
    > len = end - start;
    > }
    >
    > n++; /* reflects num elements in array */
    >
    > /* allocates one extra pointer for null */
    > tmp = realloc(strarray, n+1);


    realloc takes a count of bytes; it doesn't know it is allocating space
    for pointers.
    tmp = realloc(strarray, (n+1) * sizeof *tmp);

    > if (tmp == 0) /* allocation failed */
    > {
    > cleanup(strarray, n);

    cleanup(strarray, n-1);

    > strarray = 0;
    > done = 1;
    > }
    > else
    > {
    > strarray = tmp;
    > tok = malloc(len+1);


    OK, here you may need the +1 to allow space for the null termination.
    I've not checked the algorithm.

    > if (tok == 0)
    > {
    > cleanup(strarray, n);
    > strarray = 0;
    > done = 1;
    > }
    > else
    > {
    > strncpy(tok, start, len);
    > tok[len] = 0;
    > strarray[n-1] = tok;
    > strarray[n] = 0;
    > }
    > }
    >
    > start = end + 1;
    >
    > /* debug code intermediate state of array */
    > for(i=0;strarray[i];i++)
    > puts(strarray[i]);
    >
    >
    > }
    >
    > return strarray;
    >
    > }
    >
    >
    > int main(void)
    > {
    > char *s1 = "one and two and three";
    > char **arr;
    > int i = 0;
    >
    > arr = tokenize(s1, ' ');
    >
    > if (arr == 0)
    > {
    > puts("Error");
    > }
    >
    > else
    > {
    > while (arr[i] != 0)
    > {
    > puts(arr[i]);
    > i++;
    > }
    > }


    You ought to clean up here.

    > return 0;
    > }

    --
    Flash Gordon, living in interesting times.
    Web site - http://home.flash-gordon.me.uk/
    comp.lang.c posting guidelines and intro:
    http://clc-wiki.net/wiki/Intro_to_clc
     
    Flash Gordon, Mar 23, 2006
    #5
  6. Bob

    Bob Guest

    Pedro Graca wrote:
    > Bob wrote:


    > > /* allocates one extra pointer for null */
    > > tmp = realloc(strarray, n+1);

    >
    > BANG!
    >
    > Try this instead
    >
    > tmp = realloc(strarray, (n+1) * sizeof *tmp);



    Oh boy, that was bad. Can you tell I've been away working in C++ for a
    while? I forgot all about the size thing! Thanks much.
     
    Bob, Mar 23, 2006
    #6
  7. On Thu, 23 Mar 2006-0500,CBFalconer<> wrote:
    >Bob wrote:
    >> I have been working on the following program. The goal is to have
    >> a tokenizing routine that avoids some of the problems of strtok(),
    >> the comments should explain the features.
    >>
    >> This runs fine on Solaris/gcc, but crashes when run from VC++ (C
    >> mode). The problem occurs after the first true reallocation
    >> (second pass through main loop). The little debug part at the end
    >> of the loop prints one time, it crashes before a second is
    >> displayed. Seems likely that something is trashing the heap but I
    >> haven't spotted it.
    >>
    >> Code follows:

    >
    >... code snipped ...
    >
    >Try the following:


    ... code snipped ...

    Try the following (I think it is better and faster than all of your
    token-splitting routines):

    ; nasmw -f obj this_file.asm
    ; bcc32 file.c this_file.obj

    section _DATA public align=4 class=DATA use32

    global _tkn_line
    global _delimita

    array times 258 dd 0

    section _TEXT public align=1 class=CODE use32

    ; void delimita(char* delimitatori)
    ; stores the delimiter *characters* for the tokens
    ; used by tkn_line (passed as a string)
    ; s=0j, 4b, 8a, 12ra, 16@delimitatori
    _delimita:
    push eax
    push ebx
    push edi
    %define @delimitatori esp+16
    mov eax, array
    mov edi, 0
    .c0:
    mov dword[eax+4*edi], 0
    inc edi
    cmp edi, 255
    jbe .c0
    mov edi, [@delimitatori]
    xor ebx, ebx
    .c1:
    cmp byte[edi], 0
    je .cf
    mov bl, [edi]
    mov dword[eax+4*ebx], 1
    inc edi
    jmp short .c1
    .cf:
    %undef @delimitatori
    pop edi
    pop ebx
    pop eax
    ret

    ; int tkn_line(char** v, char* buf, int limit)
    ; splits the line pointed to by "buf" into at most "limit"
    ; tokens and terminates each one in place with '\0'.
    ; it also makes one pointer of "v" point to the first character
    ; of each token and returns the number of tokens found, or -1
    ; if the line holds more than "limit" tokens
    ; s=0j, 4i, 8b, 12ra, 16@v, 20@buf, 24@limit
    _tkn_line:
    push ebx
    push esi
    push edi
    %define @v esp+16
    %define @buf esp+20
    %define @limit esp+24
    xor edi, edi
    cmp dword[@v], 0
    je .fn
    cmp dword[@buf], 0
    je .fn
    cmp dword[@limit], 0
    jle .fn
    mov esi, [@buf]
    xor ebx, ebx
    .a0:
    mov bl, [esi]
    cmp dword[array+4*ebx], 0
    je .a1
    inc esi
    jmp short .a0
    .a1:
    cmp ebx, 0
    je .fn
    cmp edi, [@limit]
    jne .a2
    mov edi, -1
    jmp short .fn
    .a2:
    mov eax, [@v]
    mov dword[eax+4*edi], esi
    inc edi
    .a3:
    mov bl, [esi]
    cmp ebx, 0
    je .a4
    cmp dword[array+4*ebx], 0
    jne .a4
    inc esi
    jmp short .a3
    .a4:
    cmp ebx, 0
    je .fn
    mov byte[esi], 0
    inc esi
    jmp short .a0
    .fn:
    mov eax, edi
    %undef @v
    %undef @buf
    %undef @limit
    pop edi
    pop esi
    pop ebx
    ret

    #include <stdio.h>

    static const char *pc_pro = "[insert tokens (EOF for end)]> ";

    /* set the token delimiters */
    void delimita(char* delimitatori);

    /* splits the string in "buf" into tokens reachable
    through "v" */
    int tkn_line(char** v, char* buf, int limit);

    int skip_line(FILE* pf)
    {int c; while( (c=fgetc(pf))!='\n' && c!=EOF ); return c;}

    #if 1
    #define BUFSIZ1 32
    #else
    #define BUFSIZ1 BUFSIZ
    #endif

    int main(void)
    {char buf[BUFSIZ1] = {0}, *a[32], *pc;
    struct you *p;
    int cv, i;
    /////////////////////////////////
    delimita(" \t+*\n");
    printf("delimitatori=%s BSZ=%u\n",
    "<SPACE><TAB>+*<NEW_LINE>", (unsigned)BUFSIZ1);
    la0:;
    while (1)
    {printf("%s", pc_pro); fflush(stdout);
    buf[BUFSIZ1-2]=0; /* at most BUFSIZ1-1 chars */
    if( fgets(buf, BUFSIZ1, stdin )!=0 )
    {if(buf[BUFSIZ1-2]!=0 && buf[BUFSIZ1-2]!='\n')
    {printf("Linea troppo lunga\n");
    if(feof(stdin)) return 0;
    if( skip_line(stdin) == EOF )
    return 0;
    goto la0;
    }

    cv=tkn_line(a, buf, 8);
    if(cv<=0 || cv>8)
    {printf("Parametri non corretti\n");
    goto la;
    }
    for(i=0; i<cv; ++i)
    printf("%s^^", a[i]);
    printf("\n");
    }// if fgets
    la:;
    if(feof(stdin)) return 0;
    }//while
    }
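
    For readers who do not read x86 assembly, the table-driven technique in the routines above can be sketched in portable C. This is an approximation written from the assembly comments and control flow, not the poster's own C code. The names delim_table_init and split_in_place are hypothetical, and the table is passed as a parameter here instead of being a static array.

```c
#include <stddef.h>

/* Marks every character of `delims` in a 256-entry lookup table.
   Entry 0 (the string terminator) is never marked. */
void delim_table_init(unsigned char table[256], const char *delims)
{
    size_t i;

    for (i = 0; i < 256; i++) {
        table[i] = 0;
    }
    for (; *delims != '\0'; delims++) {
        table[(unsigned char)*delims] = 1;
    }
}

/* Splits `buf` in place: each delimiter that ends a token is overwritten
   with '\0' and a pointer to the start of the token is stored in `v`.
   Returns the number of tokens found, 0 for invalid arguments, or -1 if
   the line holds more than `limit` tokens. */
int split_in_place(char **v, char *buf, int limit,
                   const unsigned char table[256])
{
    int count = 0;
    char *p = buf;

    if (v == NULL || buf == NULL || limit <= 0) {
        return 0;
    }
    for (;;) {
        /* skip any run of delimiters before the next token */
        while (*p != '\0' && table[(unsigned char)*p]) {
            p++;
        }
        if (*p == '\0') {
            return count;
        }
        if (count == limit) {
            return -1;          /* another token exists but v is full */
        }
        v[count++] = p;
        /* advance to the delimiter or terminator that ends the token */
        while (*p != '\0' && !table[(unsigned char)*p]) {
            p++;
        }
        if (*p == '\0') {
            return count;
        }
        *p++ = '\0';
    }
}
```

    Unlike the tokenizer in the original post, this approach modifies the input buffer and merges adjacent delimiters, so it is closer in behaviour to strtok() than to the routine the thread started with.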
     
    RSoIsCaIrLiIoA, Mar 24, 2006
    #7
  8. On Fri, 24 Mar 2006 10:52:14 +0100, RSoIsCaIrLiIoA <> wrote:

    >On Thu, 23 Mar 2006-0500,CBFalconer<> wrote:
    >>Bob wrote:
    >>> I have been working on the following program. The goal is to have
    >>> a tokenizing routine that avoids some of the problems of strtok(),
    >>> the comments should explain the features.
    >>>
    >>> This runs fine on Solaris/gcc, but crashes when run from VC++ (C
    >>> mode). The problem occurs after the first true reallocation
    >>> (second pass through main loop). The little debug part at the end
    >>> of the loop prints one time, it crashes before a second is
    >>> displayed. Seems likely that something is trashing the heap but I
    >>> haven't spotted it.
    >>>
    >>> Code follows:

    >>
    >>... code snipped ...
    >>
    >>Try the following:

    >
    >... code snipped ...
    >
    >Try the following: (i think it is better and more fast than you all
    >routines for break tokens )


    ; nasmw -f obj this_file.asm
    ; bcc32 file.c this_file.obj

    section _DATA public align=4 class=DATA use32

    global _tkn_line
    global _delimita

    section _TEXT public align=1 class=CODE use32

    ; void delimita(int* 256_ints, char* delimitatori)
    ; 256_ints = pointer to an array of at least 256 ints
    ; delimitatori = the individual token-delimiter characters
    ;
    ; stores the token-delimiter *characters*
    ; in the 256_ints array used by tkn_line
    ; s=0j, 4b, 8a, 12ra, 16@256_ints, 20@delimitatori
    _delimita:
    push eax
    push ebx
    push edi
    %define @256_ints esp+16
    %define @delimitatori esp+20
    mov eax, [@256_ints]
    mov edi, 0
    .c0:
    mov dword[eax+4*edi], 0
    inc edi
    cmp edi, 255
    jbe .c0
    mov edi, [@delimitatori]
    xor ebx, ebx
    .c1:
    cmp byte[edi], 0
    je .cf
    mov bl, [edi]
    mov dword[eax+4*ebx], 1
    inc edi
    jmp short .c1

    .cf:
    %undef @256_ints
    %undef @delimitatori
    pop edi
    pop ebx
    pop eax
    ret

    ; int tkn_line(char** v, int limit, int* 256_ints)
    ; v = points to an array of at least "limit"+1 pointers to char;
    ; the last pointer v[limit] points to the string to be
    ; split into tokens.
    ; limit = the number of tokens allowed;
    ; the array v must have at least limit+1 elements
    ; 256_ints = array of at least 256 ints that *must* be
    ; filled in before the first use of tkn_line()
    ; by calling delimita()
    ; Splits the line pointed to by "v[limit]" into at most
    ; "limit" tokens and terminates each one in place with '\0'.
    ; It also makes one pointer point to the first character of each
    ; token and returns the number of tokens found.
    ; if any of v, v[limit], 256_ints is NULL it returns 0
    ; if limit<=0 it returns 0
    ; if it reaches the pointer v[limit-1] and tokens still remain,
    ; it stores the position it has reached in v[limit]
    ; and returns -1
    ; s=0j, 4i, 8r, 12b, 16ra, 20@v, 24@limit, 28@256_ints
    _tkn_line:
    push ebx
    push edx
    push esi
    push edi
    %define @v esp+20
    %define @limit esp+24
    %define @256_ints esp+28
    xor edi, edi
    mov eax, [@v]
    mov ebx, [@limit]
    cmp eax, 0
    je .fn
    cmp ebx, 0
    jle .fn
    mov esi, [eax+4*ebx]
    mov edx, [@256_ints]
    cmp esi, 0
    je .fn
    cmp edx, 0
    je .fn
    xor ebx, ebx
    .a0:
    mov bl, [esi]
    cmp dword[edx+4*ebx], 0
    je .a1
    inc esi
    jmp short .a0
    .a1:
    cmp ebx, 0
    je .fn
    cmp edi, [@limit]
    jne .a2
    mov dword[eax+4*edi], esi
    mov edi, -1
    jmp short .fn
    .a2:
    mov dword[eax+4*edi], esi
    inc edi
    .a3:
    mov bl, [esi]
    cmp ebx, 0
    je .fn
    cmp dword[edx+4*ebx], 0
    jne .a4
    inc esi
    jmp short .a3
    .a4:
    mov byte[esi], 0
    inc esi
    jmp short .a0
    .fn:
    mov eax, edi
    %undef @v
    %undef @limit
    %undef @256_ints
    pop edi
    pop esi
    pop edx
    pop ebx
    ret


    #include <stdio.h>

    static const char *pc_pro = "[insert tokens (EOF for end)]> ";

    #ifdef __cplusplus
    extern "C" {
    #endif

    /* set the token delimiters */
    void delimita(int* a256array, char* delimitatori);

    int tkn_line(char** v, int limit, int* a256array);

    #ifdef __cplusplus
    }
    #endif


    int skip_line(FILE* pf)
    {int c; while( (c=fgetc(pf))!='\n' && c!=EOF ); return c;}

    #if 1
    #define BUFSIZ1 32
    #else
    #define BUFSIZ1 BUFSIZ
    #endif

    int main(void)
    {char buf[BUFSIZ1]={0}, *a[32];
    int pc[258], cv, i, z;
    /*/////////////////////////////*/
    if(sizeof(int)!=4)
    {printf("interi di dimensione non prevista\n");
    return 0;
    }
    delimita(pc, " \t+*\n");
    printf("delimitatori=%s BSZ=%u\n",
    "<SPACE><TAB>+*<NEW_LINE>", (unsigned)BUFSIZ1);
    la0:;
    while (1)
    {printf("%s", pc_pro); fflush(stdout);
    buf[BUFSIZ1-2]=0; /* at most BUFSIZ1-1 chars */
    if( fgets(buf, BUFSIZ1, stdin )!=0 )
    {if(buf[BUFSIZ1-2]!=0 && buf[BUFSIZ1-2]!='\n')
    {printf("Linea troppo lunga\n");
    if(feof(stdin)) return 0;
    if( skip_line(stdin) == EOF )
    return 0;
    goto la0;
    }
    a[8]=buf;
    la2:; cv =tkn_line(a, 8, pc);
    if(cv==0)
    {printf("Parametri zero\n");
    goto la;
    }
    else printf("\nParametri=%d\n", cv);
    z=(cv==-1? 8: cv);
    for(i=0; i<z; ++i)
    printf("%s^^", a[i]);
    if(cv==-1) goto la2;
    printf("\n");
    }// if fgets
    la:;
    if(feof(stdin)) return 0;
    }//while
    }
     
    RSoIsCaIrLiIoA, Mar 24, 2006
    #8
  9. ; nasmw -f obj this_file.asm
    ; bcc32 file.c this_file.obj

    section _DATA public align=4 class=DATA use32

    global _tkn_line
    global _delimita

    section _TEXT public align=1 class=CODE use32

    ; void delimita(int* 256_ints, char* delimitatori)
    ; 256_ints = pointer to an array of at least 256 ints
    ; delimitatori = the individual token-delimiter characters
    ; stores the token-delimiter *characters*
    ; in the 256_ints array used by tkn_line
    ; s=0j, 4b, 8a, 12ra, 16@256_ints, 20@delimitatori
    _delimita:
    push eax
    push ebx
    push edi
    %define @256_ints esp+16
    %define @delimitatori esp+20
    mov eax, [@256_ints]
    mov edi, 0
    .c0:
    mov dword[eax+4*edi], 0
    inc edi
    cmp edi, 255
    jbe .c0
    mov edi, [@delimitatori]
    xor ebx, ebx
    .c1:
    cmp byte[edi], 0
    je .cf
    mov bl, [edi]
    mov dword[eax+4*ebx], 1
    inc edi
    jmp short .c1
    .cf:
    %undef @256_ints
    %undef @delimitatori
    pop edi
    pop ebx
    pop eax
    ret

    ; int tkn_line(char** v, int limit, int* 256_ints)
    ; v = points to an array of *at least* "limit"+1 pointers to
    ; char; the last pointer v[limit] points to the string to be
    ; split into tokens.
    ; limit = the number of tokens allowed;
    ; the array v must have at least limit+1 elements
    ; 256_ints = array of at least 256 ints that *must* be
    ; filled in before the first use of tkn_line()
    ; by calling delimita()
    ; Splits the line pointed to by "v[limit]" into at most
    ; "limit" tokens and terminates each one in place with '\0'.
    ; It also makes one pointer of "v" point to the first character
    ; of each token and returns the number of tokens found.
    ; if any of v, v[limit], 256_ints is NULL it returns 0
    ; if limit<=0 it returns 0
    ; if it reaches the pointer v[limit-1] and tokens still remain,
    ; it stores the position it has reached in v[limit]
    ; and returns -1
    ; s=0j, 4i, 8r, 12b, 16ra, 20@v, 24@limit, 28@256_ints
    _tkn_line:
    push ebx
    push edx
    push esi
    push edi
    %define @v esp+20
    %define @limit esp+24
    %define @256_ints esp+28
    xor edi, edi
    mov eax, [@v]
    mov ebx, [@limit]
    cmp eax, 0
    je .fn
    cmp ebx, 0
    jle .fn
    mov esi, [eax+4*ebx]
    mov edx, [@256_ints]
    cmp esi, 0
    je .fn
    cmp edx, 0
    je .fn
    xor ebx, ebx
    .a0:
    mov bl, [esi]
    cmp dword[edx+4*ebx], 0
    je .a1
    inc esi
    jmp short .a0
    .a1:
    cmp ebx, 0
    je .fn
    cmp edi, [@limit]
    jne .a2
    mov dword[eax+4*edi], esi
    mov edi, -1
    jmp short .fn
    .a2:
    mov dword[eax+4*edi], esi
    inc edi
    .a3:
    mov bl, [esi]
    cmp ebx, 0
    je .fn
    cmp dword[edx+4*ebx], 0
    jne .a4
    inc esi
    jmp short .a3
    .a4:
    mov byte[esi], 0
    inc esi
    jmp short .a0
    .fn:
    mov eax, edi
    %undef @v
    %undef @limit
    %undef @256_ints
    pop edi
    pop esi
    pop edx
    pop ebx
    ret


    #include <stdio.h>

    static const char *pc_pro = "[insert tokens (EOF for end)]> ";

    #ifdef __cplusplus
    extern "C" {
    #endif
    /* set the token delimiters */
    void delimita(int* a256array, char* delimitatori);

    int tkn_line(char** v, int limit, int* a256array);
    #ifdef __cplusplus
    }
    #endif


    int skip_line(FILE* pf)
    {int c; while( (c=fgetc(pf))!='\n' && c!=EOF ); return c;}

    #if 1
    #define BUFSIZ1 32
    #else
    #define BUFSIZ1 BUFSIZ
    #endif

    int main(void)
    {char buf[BUFSIZ1]={0}, *a[32];
    int pc[258], cv, i, z;
    /*/////////////////////////////*/
    if(sizeof(int)!=4)
    {printf("interi di dimensione non prevista\n");
    return 0;
    }
    delimita(pc, " \t+*\n");
    printf("delimitatori=%s BSZ=%u\n",
    "<SPACE><TAB>+*<NEW_LINE>", (unsigned)BUFSIZ1);
    la0:;
    while (1)
    {printf("%s", pc_pro); fflush(stdout);
    buf[BUFSIZ1-2]=0; /* at most BUFSIZ1-1 chars */
    if( fgets(buf, BUFSIZ1, stdin )!=0 )
    {if(buf[BUFSIZ1-2]!=0 && buf[BUFSIZ1-2]!='\n')
    {printf("Linea troppo lunga\n");
    if(feof(stdin)) return 0;
    if( skip_line(stdin) == EOF )
    return 0;
    goto la0;
    }
    a[8]=buf; z=0;
    do{cv =tkn_line(a, 8, pc);
    if(cv==0)
    {printf("Parametri zero\n");
    goto la;
    }
    else if(z==0)
    printf("Parametri iniziali=%d\n", cv);
    z=(cv==-1? 8: cv);
    for(i=0; i<z; ++i)
    printf("%s^^", a[i]);
    }while(cv==-1);
    printf("\n");
    }// if fgets
    la:;
    if(feof(stdin)) return 0;
    }//while
    }
     
    RSoIsCaIrLiIoA, Mar 25, 2006
    #9
  10. Bob

    Jordan Abel Guest

    On 2006-03-24, RSoIsCaIrLiIoA <> wrote:
    > On Thu, 23 Mar 2006-0500,CBFalconer<> wrote:
    >>Bob wrote:
    >>> I have been working on the following program. The goal is to have
    >>> a tokenizing routine that avoids some of the problems of strtok(),
    >>> the comments should explain the features.
    >>>
    >>> This runs fine on Solaris/gcc, but crashes when run from VC++ (C
    >>> mode). The problem occurs after the first true reallocation
    >>> (second pass through main loop). The little debug part at the end
    >>> of the loop prints one time, it crashes before a second is
    >>> displayed. Seems likely that something is trashing the heap but I
    >>> haven't spotted it.
    >>>
    >>> Code follows:

    >>
    >>... code snipped ...
    >>
    >>Try the following:

    >
    > ... code snipped ...
    >
    > Try the following: (i think it is better and more fast than you all
    > routines for break tokens )


    Except for the fact that it's i386 asm, and commented in Italian. Either
    one of those is likely to cause problems on its own; together, why do
    you bother?
     
    Jordan Abel, Mar 25, 2006
    #10
  11. Bob

    santosh Guest

    Jordan Abel wrote:
    > On 2006-03-24, RSoIsCaIrLiIoA <> wrote:
    > > On Thu, 23 Mar 2006-0500,CBFalconer<> wrote:
    > >>Bob wrote:
    > >>> I have been working on the following program. The goal is to have
    > >>> a tokenizing routine that avoids some of the problems of strtok(),
    > >>> the comments should explain the features.
    > >>>
    > >>> This runs fine on Solaris/gcc, but crashes when run from VC++ (C
    > >>> mode). The problem occurs after the first true reallocation
    > >>> (second pass through main loop). The little debug part at the end
    > >>> of the loop prints one time, it crashes before a second is
    > >>> displayed. Seems likely that something is trashing the heap but I
    > >>> haven't spotted it.
    > >>>
    > >>> Code follows:
    > >>
    > >>... code snipped ...
    > >>
    > >>Try the following:

    > >
    > > ... code snipped ...
    > >
    > > Try the following: (i think it is better and more fast than you all
    > > routines for break tokens )

    >
    > Except for the fact that it's i386 asm, and commented in italian. Either
    > one of those is likely to cause problems on their own, together - why do
    > you bother?


    To be fair, he *did* append what is presumably a C version of his
    assembly code, though it's so obfuscated that I didn't bother trying to
    read or compile it.
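
For comparison, a minimal portable C sketch of the tokenizer Bob described at the top of the thread (input string left unmodified, a single delimiter character, adjacent delimiters not merged, null-pointer-terminated result). This is an illustration only, not Bob's code and not any fix posted in the thread; the name free_tokens and the loop structure are assumptions made for the example. Note that realloc is given a size in bytes, (count + 2) * sizeof *tokens, where the posted code passed n+1.

```c
#include <stdlib.h>
#include <string.h>

/* Frees a null-pointer-terminated token array (hypothetical helper). */
void free_tokens(char **tokens)
{
    size_t i;

    if (tokens != NULL) {
        for (i = 0; tokens[i] != NULL; i++)
            free(tokens[i]);   /* free each token, then the array itself */
        free(tokens);
    }
}

/* Splits instring on a single delimiter character without modifying it.
   Adjacent delimiters yield empty tokens.  Returns a null-pointer-terminated
   array of malloc'd strings, or NULL on a null input or allocation failure. */
char **tokenize(const char *instring, int delimiter)
{
    char **tokens = NULL;
    size_t count = 0;
    const char *start = instring;

    if (instring == NULL)
        return NULL;

    for (;;) {
        /* A '\0' delimiter is treated as "no delimiter": one token. */
        const char *end = (delimiter != '\0') ? strchr(start, delimiter) : NULL;
        size_t len = (end != NULL) ? (size_t)(end - start) : strlen(start);
        /* Room for count+1 tokens plus the terminating null pointer. */
        char **tmp = realloc(tokens, (count + 2) * sizeof *tokens);

        if (tmp == NULL) {
            free_tokens(tokens);   /* old array is still valid and terminated */
            return NULL;
        }
        tokens = tmp;

        tokens[count] = malloc(len + 1);
        if (tokens[count] == NULL) {
            free_tokens(tokens);   /* the NULL slot terminates the array */
            return NULL;
        }
        memcpy(tokens[count], start, len);
        tokens[count][len] = '\0';
        count++;
        tokens[count] = NULL;      /* keep the array terminated at all times */

        if (end == NULL)
            break;
        start = end + 1;
    }
    return tokens;
}
```

With this sketch, tokenize("a,,b", ',') yields "a", "", "b" followed by a null pointer, and the caller releases the result with free_tokens().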
     
    santosh, Mar 25, 2006
    #11
  12. santosh opined:
    > Jordan Abel wrote:
    >> On 2006-03-24, RSoIsCaIrLiIoA <> wrote:
    >> > On Thu, 23 Mar 2006-0500,CBFalconer<> wrote:
    >> >>Bob wrote:
    >> >>> I have been working on the following program. The goal is to
    >> >>> have a tokenizing routine that avoids some of the problems of
    >> >>> strtok(), the comments should explain the features.
    >> >
    >> > Try the following: (i think it is better and more fast than you
    >> > all routines for break tokens )

    >>
    >> Except for the fact that it's i386 asm, and commented in italian.
    >> Either one of those is likely to cause problems on their own,
    >> together - why do you bother?

    >
    > To be fair, he *did* append a C version (presumably), of his assembly
    > code, though it's so obfuscated, I didn't bother trying to read or
    > compile it.


    And I thought, and hoped, that RSoIsCaIrLiIoA was in everybody's kill
    file.

    --
    BR, Vladimir

    When a fellow says, "It ain't the money but
    the principle of the thing," it's the money.
    -- Kim Hubbard
     
    Vladimir S. Oka, Mar 25, 2006
    #12
  13. Bob

    CBFalconer Guest

    Jordan Abel wrote:
    > On 2006-03-24, RSoIsCaIrLiIoA <> wrote:
    >

    .... snip ...
    >>
    >> Try the following: (i think it is better and more fast than you
    >> all routines for break tokens )

    >
    > Except for the fact that it's i386 asm, and commented in italian.
    > Either one of those is likely to cause problems on their own,
    > together - why do you bother?


    --

    [ASCII-art sign: PLEASE DO NOT FEED THE TROLLS. Thank you, Management]

    fix (vb.): 1. to paper over, obscure, hide from public view; 2.
    to work around, in a way that produces unintended consequences
    that are worse than the original problem. Usage: "Windows ME
    fixes many of the shortcomings of Windows 98 SE". - Hutchison
     
    CBFalconer, Mar 25, 2006
    #13
  14. On 25 Mar 2006 07:51:51 GMT, in comp.lang.c , Jordan Abel
    <> wrote:

    >On 2006-03-24, RSoIsCaIrLiIoA <> wrote:


    stuff

    >- why do you bother?


    --

    [ASCII-art sign: Please do not feed the Trolls]
     
    Mark McIntyre, Mar 25, 2006
    #14