Knowing the exact length of a string in a char array?

Discussion in 'C Programming' started by alberto, Jun 17, 2006.

  1. alberto

    alberto Guest

    Hi. I'm a newbie in the C language. I have a binary file with many
    character arrays of 50 characters each, defined as

    char array[50];

    In some cases, many of these 50 characters are not used. How can I
    find out how many characters are really being used in each array?

    Thanks
    Alberto
     
    alberto, Jun 17, 2006
    #1

  2. alberto

    Bill Pursell Guest

    alberto wrote:
    > Hi. Im newbie in C language. I have a binary file with many character
    > arrays of 50 character defined as
    >
    > char array[50]
    >
    > But in some cases, many of these 50 characters are not being used. I
    > would like to know how could I know how many characters are really being
    > used in each array ?



    What do you mean by "used"? Do you mean that they are non-zero?
    The following will give you a count of how many of the first characters
    are non-zero.

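    #include <stdio.h>
    #include <string.h>

    #define BLOCK_SIZE 50

    int
    main(void)
    {
        char buf[BLOCK_SIZE + 1];
        int count = 0;

        /* Extra terminator so strlen() stops even if a block has no '\0'. */
        buf[BLOCK_SIZE] = 0;
        /* Read one BLOCK_SIZE-byte block at a time from stdin. */
        while (fread(buf, sizeof *buf, BLOCK_SIZE, stdin) == BLOCK_SIZE)
            /* strlen() returns size_t, so cast it to match the %lu format. */
            printf("Block %d used %lu bytes\n", ++count,
                   (unsigned long)strlen(buf));
        return 0;
    }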
     
    Bill Pursell, Jun 17, 2006
    #2

  3. alberto writes:
    > Hi. Im newbie in C language. I have a binary file with many character
    > arrays of 50 character defined as
    >
    > char array[50]
    >
    > But in some cases, many of these 50 characters are not being used. I
    > would like to know how could I know how many characters are really
    > being used in each array ?


    If you can define what you mean by "used", the answer will be obvious.
    If you can't, there is no answer.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Jun 17, 2006
    #3
  4. alberto

    alberto Guest

    Bill Pursell escribió:
    > alberto wrote:
    >> Hi. Im newbie in C language. I have a binary file with many character
    >> arrays of 50 character defined as
    >>
    >> char array[50]
    >>
    >> But in some cases, many of these 50 characters are not being used. I
    >> would like to know how could I know how many characters are really being
    >> used in each array ?

    >
    >
    > What do you mean by "used"? Do you mean that they are non-zero?
    > The following will give you a count of how many of the first characters
    > are non-zero.
    >
    > #include <stdlib.h>
    > #include <stdio.h>
    > #include <string.h>
    >
    > #define BLOCK_SIZE 50
    > int
    > main(void)
    > {
    > char buf[BLOCK_SIZE+1];
    > int count=0;
    >
    > buf[BLOCK_SIZE] = 0;
    > while(fread(buf, sizeof *buf, BLOCK_SIZE, stdin) == BLOCK_SIZE)
    > printf("Block %d used %d bytes\n", ++count,
    > strlen(buf));
    > }
    >

    Thanks for your answer. Yes, I mean that if I have the array

    char buf[50];

    then sometimes it will contain only 20 characters that are not '\0'. I
    want to know how many such characters there are, because I want to build
    a dynamically allocated linked list in which each node holds only the
    characters of the 50-char static array that are really used (not '\0').
    So I must know that number before calling calloc or malloc.
     
    alberto, Jun 17, 2006
    #4
  5. alberto

    alberto Guest

    Keith Thompson escribió:
    > alberto <> writes:
    >> Hi. Im newbie in C language. I have a binary file with many character
    >> arrays of 50 character defined as
    >>
    >> char array[50]
    >>
    >> But in some cases, many of these 50 characters are not being used. I
    >> would like to know how could I know how many characters are really
    >> being used in each array ?

    >
    > If you can define what you mean by "used", the answer will be obvious.
    > If you can't, there is no answer.
    >

    I explained that in my previous message.
    For example, I declare this array:

    char arr[50];

    I want to store in it whatever the user types on the keyboard, read with
    scanf (for example, their name). Sometimes the user will type 20
    characters, and other times 40...

    The same applies if the array comes from a file. How do I know exactly
    how many characters the user typed?
     
    alberto, Jun 17, 2006
    #5
  6. alberto wrote:
    > Keith Thompson escribió:

    -snip-

    >> If you can define what you mean by "used", the answer will be obvious.
    >> If you can't, there is no answer.
    >>

    > I told on previous massage.
    > For example, I declare this array:
    >
    > char arr[50];
    >
    > And I want to put on that variable what user write on the keyboard with
    > scanf function (for example, his name). But some times the user will
    > type 20 characters, and other times will type 40 characters...
    >
    > The same if the array would be on a file. But how do I know exactly how
    > many characters typed the user ?


    I think you should look at strlen from <string.h>... Isn't that what you
    meant by "used"?


    Best regards
    Martin Jørgensen

    --
    ---------------------------------------------------------------------------
    Home of Martin Jørgensen - http://www.martinjoergensen.dk
     
    Martin Jørgensen, Jun 17, 2006
    #6
  7. alberto

    Michael Mair Guest

    alberto schrieb:
    > Keith Thompson escribió:
    >> alberto <> writes:
    >>
    >>> Hi. Im newbie in C language. I have a binary file with many character
    >>> arrays of 50 character defined as
    >>>
    >>> char array[50]
    >>>
    >>> But in some cases, many of these 50 characters are not being used. I
    >>> would like to know how could I know how many characters are really
    >>> being used in each array ?

    >>
    >>
    >> If you can define what you mean by "used", the answer will be obvious.
    >> If you can't, there is no answer.
    >>

    > I told on previous massage.
    > For example, I declare this array:
    >
    > char arr[50];
    >
    > And I want to put on that variable what user write on the keyboard with
    > scanf function (for example, his name). But some times the user will
    > type 20 characters, and other times will type 40 characters...
    >
    > The same if the array would be on a file. But how do I know exactly how
    > many characters typed the user ?


    You restricted yourself to a binary file, thus you do not know
    how many characters are "used" -- you read 50 and then determine
    how many are "used".
    If you change your notion to "the file contains zero-terminated
    character sequences (vulgo: strings) each of which is no longer
    than 50 characters including the terminator", you can read up
    to 50 characters, stop earlier when you encounter '\0' and know
    how many characters you read ("used") up to that point.
    This is curiously close to "the text file contains lines of up
    to 50 characters including the newline character". C provides
    a function to deal with this case: fgets(). You can use fgets()
    also to read from stdin. If you want to be on the safe side,
    you can check whether the user really entered/the file really
    contained such a character sequence: The last character of the
    string's "content" must be a '\n'.
    Especially for reading from stdin, you often are restricted to
    line-buffered input, so you get the user's input only after
    the user hit return.
    strlen() can be used to determine the number of characters plus
    the '\n'.
    If you do not need the '\n', you can overwrite it with '\0'.
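
    A minimal sketch of the fgets() approach described above, assuming
    text input with one entry per line. The name read_field and the
    FIELD_SIZE macro are made up for this illustration and are not part of
    any library.

```c
#include <stdio.h>
#include <string.h>

#define FIELD_SIZE 50   /* assumed: up to 49 characters plus the terminator */

/*
 * Read one line from fp into buf with fgets().
 * Returns the number of characters before the newline, or -1 on
 * end-of-file or read error. If a complete line was read, the trailing
 * '\n' is overwritten with '\0'.
 */
long read_field(FILE *fp, char buf[FIELD_SIZE])
{
    size_t len;

    if (fgets(buf, FIELD_SIZE, fp) == NULL)
        return -1;

    len = strlen(buf);              /* counts the '\n' if it was read */
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';          /* drop the newline */
    /* Otherwise the line was longer than the buffer, or the input
       ended without a newline; the caller may want to treat that
       as an error. */

    return (long)len;
}
```

    Calling read_field(stdin, arr) after prompting the user would give
    the number of characters typed, without the newline.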


    Cheers
    Michael
    --
    E-Mail: Mine is an /at/ gmx /dot/ de address.
     
    Michael Mair, Jun 17, 2006
    #7
  8. alberto

    alberto Guest

    Michael Mair escribió:
    > alberto schrieb:
    >> Keith Thompson escribió:
    >>> alberto <> writes:
    >>>
    >>>> Hi. Im newbie in C language. I have a binary file with many character
    >>>> arrays of 50 character defined as
    >>>>
    >>>> char array[50]
    >>>>
    >>>> But in some cases, many of these 50 characters are not being used. I
    >>>> would like to know how could I know how many characters are really
    >>>> being used in each array ?
    >>>
    >>>
    >>> If you can define what you mean by "used", the answer will be obvious.
    >>> If you can't, there is no answer.
    >>>

    >> I told on previous massage.
    >> For example, I declare this array:
    >>
    >> char arr[50];
    >>
    >> And I want to put on that variable what user write on the keyboard
    >> with scanf function (for example, his name). But some times the user
    >> will type 20 characters, and other times will type 40 characters...
    >>
    >> The same if the array would be on a file. But how do I know exactly
    >> how many characters typed the user ?

    >
    > You restricted yourself to a binary file, thus you do not know
    > how many characters are "used" -- you read 50 and then determine
    > how many are "used".
    > If you change your notion to "the file contains zero-terminated
    > character sequences (vulgo: strings) each of which is no longer
    > than 50 characters including the terminator", you can read up
    > to 50 characters, stop earlier when you encounter '\0' and know
    > how many characters you read ("used") up to that point.
    > This is curiously close to "the text file contains lines of up
    > to 50 characters including the newline character". C provides
    > a function to deal with this case: fgets(). You can use fgets()
    > also to read from stdin. If you want to be on the safe side,
    > you can check whether the user really entered/the file really
    > contained such a character sequence: The last character of the
    > string's "content" must be a '\n'.
    > Especially for reading from stdin, you often are restricted to
    > line-buffered input, so you get the user's input only after
    > the user hit return.
    > strlen() can be used to determine the number of characters plus
    > the '\n'.
    > If you do not need the '\n', you can overwrite it with '\0'.
    >
    >
    > Cheers
    > Michael

    Thanks for the info. Yes, the array of chars is a member of a struct
    stored in a binary file. I think I should read each "register" (record)
    of the file, determine the real length of the char array field, and then
    call calloc or malloc with that size. Correct?
    Thanks
    Alberto
     
    alberto, Jun 17, 2006
    #8
  9. alberto

    Malcolm Guest

    "alberto" <> wrote in message
    > tnx for info. Yes, the array of chars is a member of a struct and stored
    > in a binery file. I think I should read each "register" of the file, and
    > then try to determine the real lenght of the field "array of char" and
    > after that call calloc or malloc function properly. Correct ?
    > Tnx
    > Alberto


    That's right.

    struct record
    {
    char *astring;
    };

    Will store a string of any length, if you allocate the "astring" member with
    malloc().
    It is very debatable whether there will be any benefit over a fixed buffer
    size of 50 bytes. Normally malloc uses a few bytes for overhead, and then
    you have the extra run-time and complication of calling the allocation
    function and freeing it.
    However if the strings were several kilobytes in length, this would be the
    only realistic option.
    --
    Buy my book 12 Common Atheist Arguments (refuted)
    $1.25 download or $7.20 paper, available www.lulu.com/bgy1mm
     
    Malcolm, Jun 17, 2006
    #9
  10. Malcolm said:

    <snip>

    > struct record
    > {
    > char *astring;
    > };
    >
    > Will store a string of any length, if you allocate the "astring" member
    > with malloc().


    Nonsense, Malcolm. Care to try again?

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at above domain (but drop the www, obviously)
     
    Richard Heathfield, Jun 17, 2006
    #10
  11. alberto

    Michael Mair Guest

    alberto schrieb:
    > Michael Mair escribió:
    >> alberto schrieb:
    >>> Keith Thompson escribió:
    >>>> alberto <> writes:
    >>>>
    >>>>> Hi. Im newbie in C language. I have a binary file with many character
    >>>>> arrays of 50 character defined as
    >>>>>
    >>>>> char array[50]
    >>>>>
    >>>>> But in some cases, many of these 50 characters are not being used. I
    >>>>> would like to know how could I know how many characters are really
    >>>>> being used in each array ?
    >>>>
    >>>> If you can define what you mean by "used", the answer will be obvious.
    >>>> If you can't, there is no answer.
    >>>>
    >>> I told on previous massage.
    >>> For example, I declare this array:
    >>>
    >>> char arr[50];
    >>>
    >>> And I want to put on that variable what user write on the keyboard
    >>> with scanf function (for example, his name). But some times the user
    >>> will type 20 characters, and other times will type 40 characters...
    >>>
    >>> The same if the array would be on a file. But how do I know exactly
    >>> how many characters typed the user ?

    >>
    >> You restricted yourself to a binary file, thus you do not know
    >> how many characters are "used" -- you read 50 and then determine
    >> how many are "used".
    >> If you change your notion to "the file contains zero-terminated
    >> character sequences (vulgo: strings) each of which is no longer
    >> than 50 characters including the terminator", you can read up
    >> to 50 characters, stop earlier when you encounter '\0' and know
    >> how many characters you read ("used") up to that point.
    >> This is curiously close to "the text file contains lines of up
    >> to 50 characters including the newline character". C provides
    >> a function to deal with this case: fgets(). You can use fgets()
    >> also to read from stdin. If you want to be on the safe side,
    >> you can check whether the user really entered/the file really
    >> contained such a character sequence: The last character of the
    >> string's "content" must be a '\n'.
    >> Especially for reading from stdin, you often are restricted to
    >> line-buffered input, so you get the user's input only after
    >> the user hit return.
    >> strlen() can be used to determine the number of characters plus
    >> the '\n'.
    >> If you do not need the '\n', you can overwrite it with '\0'.

    >
    > tnx for info. Yes, the array of chars is a member of a struct and stored
    > in a binery file. I think I should read each "register" of the file, and
    > then try to determine the real lenght of the field "array of char" and
    > after that call calloc or malloc function properly. Correct ?


    This depends. As this is a newsgroup about C, use code to convey
    what you are talking about -- this makes clear what you are
    trying to do.
    If I understood you correctly, you have a binary file which
    contains a number of data "registers" which you read in as
    struct:
    struct myFooRegister {
        ....
        char bar[50];
        ....
    };
    Now, you want to "repackage" the data to another internal format,
    say
    struct myBaz {
        ....
        char *qux;
        ....
    };
    You could have gone with "char qux[50];" as well but have
    reason to believe that the additional effort of allocating and
    freeing memory is well spent as you really need the memory[*].
    Now, you want to have
    struct myBaz baz;
    char *baz_qux = malloc(strlen(foo_reg.bar) + 1);
    if (baz_qux != NULL) {
        strcpy(baz_qux, foo_reg.bar);
    }
    else {
        /* Your error handling here */
    }
    baz.qux = baz_qux;
    Is this what you mean?
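
    The fragment above can be wrapped into a compilable sketch. The two
    structs are cut down to the one member each that matters here (the
    elided members are simply left out), and the function name repack is
    hypothetical. It assumes the bar field really holds a '\0'-terminated
    string.

```c
#include <stdlib.h>
#include <string.h>

/* Cut-down versions of the structs from the post: only the members
   needed for the illustration are kept. */
struct myFooRegister {
    char bar[50];
};

struct myBaz {
    char *qux;
};

/*
 * Copy the string in foo_reg->bar into newly allocated memory of exactly
 * the needed size and store the pointer in baz->qux.
 * Assumes bar contains a '\0' within its 50 bytes.
 * Returns 0 on success, -1 if malloc() fails (baz->qux is then NULL).
 */
int repack(struct myBaz *baz, const struct myFooRegister *foo_reg)
{
    baz->qux = malloc(strlen(foo_reg->bar) + 1);
    if (baz->qux == NULL)
        return -1;

    strcpy(baz->qux, foo_reg->bar);
    return 0;
}
```

    The caller owns baz->qux afterwards and has to free() it when the
    struct is no longer needed.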

    Cheers
    Michael

    [*] For fifty bytes, it usually is not worth it. Say you have
    25 bytes "used" on average. You get at least an additional
    sizeof (char*) bytes for the pointer itself, some memory for
    the system's internal allocation data, and maybe the system
    gives you always 64 bytes, 256 bytes, or 1K to make sure that
    there is no "odd-sized hole" in the memory and that you can
    realloc() without having to copy the contents.
    You _can_ of course write some memory handling of your own
    to take care of that but it probably is not worth the overhead.

    However, it certainly is a good exercise :)
    --
    E-Mail: Mine is an /at/ gmx /dot/ de address.
     
    Michael Mair, Jun 17, 2006
    #11
  12. alberto

    Joe Wright Guest

    alberto wrote:
    > Hi. Im newbie in C language. I have a binary file with many character
    > arrays of 50 character defined as
    >
    > char array[50]
    >
    > But in some cases, many of these 50 characters are not being used. I
    > would like to know how could I know how many characters are really being
    > used in each array ?
    >
    > Thanks
    > Alberto


    Assuming a binary file consisting of 'char array[50]' elements, and
    assuming further that the data in the elements are strings, you can
    fseek the file on 50-byte boundaries, fread 50 bytes into a memory array
    and then run strlen() on the array.
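
    A rough sketch of that fseek/fread/strlen sequence, under the same
    assumption that the file holds nothing but 50-byte char arrays. The
    function name record_length and the RECORD_SIZE macro are invented
    for the example.

```c
#include <stdio.h>
#include <string.h>

#define RECORD_SIZE 50   /* each element in the file is char array[50] */

/*
 * Return the string length of record number `index` (0-based) in a
 * binary file consisting only of RECORD_SIZE-byte char arrays,
 * or -1 if the seek or the read fails.
 */
long record_length(FILE *fp, long index)
{
    char buf[RECORD_SIZE + 1];

    if (fseek(fp, index * RECORD_SIZE, SEEK_SET) != 0)
        return -1;
    if (fread(buf, 1, RECORD_SIZE, fp) != RECORD_SIZE)
        return -1;

    buf[RECORD_SIZE] = '\0';   /* guard in case the record has no '\0' */
    return (long)strlen(buf);
}
```

    The file should be opened in binary mode ("rb") so that byte offsets
    computed this way are meaningful.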

    --
    Joe Wright
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
     
    Joe Wright, Jun 17, 2006
    #12
  13. alberto

    Joe Estock Guest

    alberto wrote:
    > Hi. Im newbie in C language. I have a binary file with many character
    > arrays of 50 character defined as
    >
    > char array[50]
    >
    > But in some cases, many of these 50 characters are not being used. I
    > would like to know how could I know how many characters are really being
    > used in each array ?
    >
    > Thanks
    > Alberto


    I presume you are wanting to know how many bytes were read from the
    file. If this is the case then you can simply store the return value of
    fread. This will tell you how many bytes it actually read.
     
    Joe Estock, Jun 17, 2006
    #13
  14. Richard Heathfield writes:
    > Malcolm said:
    >
    > <snip>
    >
    >> struct record
    >> {
    >> char *astring;
    >> };
    >>
    >> Will store a string of any length, if you allocate the "astring" member
    >> with malloc().

    >
    > Nonsense, Malcolm. Care to try again?


    It makes sense if you assume that "allocate a pointer" is shorthand
    for "allocate an array and set the pointer to point to its first
    element". (But I'm not sure I see the point of wrapping it in a
    structure.)

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Jun 17, 2006
    #14
  15. Keith Thompson said:

    > Richard Heathfield <> writes:
    >> Malcolm said:
    >>
    >> <snip>
    >>
    >>> struct record
    >>> {
    >>> char *astring;
    >>> };
    >>>
    >>> Will store a string of any length, if you allocate the "astring" member
    >>> with malloc().

    >>
    >> Nonsense, Malcolm. Care to try again?

    >
    > It makes sense if you assume that "allocate a pointer" is shorthand
    > for "allocate an array and set the pointer to point to its first
    > element".


    It still won't store a string of *any* length, though. There are infinitely
    many strings. Of these, infinitely many are infinitely long, and Malcolm
    could reasonably argue that he didn't mean those, but if we discount them,
    there remain infinitely many strings that are finite in length but still so
    very very long that they are unrepresentable within a single universe, let
    alone a single machine.

    > (But I'm not sure I see the point of wrapping it in a
    > structure.)


    That, at least, makes sense as a first cut which is heading in a direction
    such as this:

    struct elastistring
    {
    char *astring;
    size_t curlen;
    size_t maxlen;
    };

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at above domain (but drop the www, obviously)
     
    Richard Heathfield, Jun 17, 2006
    #15
  16. On Sat, 17 Jun 2006 12:39:15 -0500, Joe Estock wrote:

    >alberto wrote:
    >> Hi. Im newbie in C language. I have a binary file with many character
    >> arrays of 50 character defined as
    >>
    >> char array[50]
    >>
    >> But in some cases, many of these 50 characters are not being used. I
    >> would like to know how could I know how many characters are really being
    >> used in each array ?
    >>
    >> Thanks
    >> Alberto

    >
    >I presume you are wanting to know how many bytes were read from the
    >file. If this is the case then you can simply store the return value of
    >fread. This will tell you how many bytes it actually read.


    fread is not a string oriented function. In the absence of an I/O
    error or an end of data condition, it will always read the requested
    number of bytes, regardless of how many are '\0'. Additionally,
    unless fread is called with the second argument set to 1, the return
    value is the number of objects read, not the number of bytes.
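
    To make the point about the return value concrete, here is a small
    sketch with two hypothetical wrappers (read_as_bytes and
    read_as_record are names chosen for this example). Both read the same
    50 bytes; only the way fread counts them differs.

```c
#include <stdio.h>

#define RECORD_SIZE 50

/* Object size 1, count RECORD_SIZE: the return value is a byte count.
   It is RECORD_SIZE for a full record no matter how many bytes are '\0'. */
size_t read_as_bytes(FILE *fp, char buf[RECORD_SIZE])
{
    return fread(buf, 1, RECORD_SIZE, fp);
}

/* Object size RECORD_SIZE, count 1: the return value is the number of
   whole records read, so it is 1 on success and 0 for a short read. */
size_t read_as_record(FILE *fp, char buf[RECORD_SIZE])
{
    return fread(buf, RECORD_SIZE, 1, fp);
}
```

    Neither value says anything about where the string inside the record
    ends; that still takes strlen() or a search for '\0'.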


    Remove del for email
     
    Barry Schwarz, Jun 18, 2006
    #16
  17. On Sat, 17 Jun 2006 10:50:53 +0200, alberto wrote:

    >Bill Pursell escribió:
    >> alberto wrote:
    >>> Hi. Im newbie in C language. I have a binary file with many character
    >>> arrays of 50 character defined as
    >>>
    >>> char array[50]
    >>>
    >>> But in some cases, many of these 50 characters are not being used. I
    >>> would like to know how could I know how many characters are really being
    >>> used in each array ?

    >>
    >>
    >> What do you mean by "used"? Do you mean that they are non-zero?
    >> The following will give you a count of how many of the first characters
    >> are non-zero.
    >>
    >> #include <stdlib.h>
    >> #include <stdio.h>
    >> #include <string.h>
    >>
    >> #define BLOCK_SIZE 50
    >> int
    >> main(void)
    >> {
    >> char buf[BLOCK_SIZE+1];
    >> int count=0;
    >>
    >> buf[BLOCK_SIZE] = 0;
    >> while(fread(buf, sizeof *buf, BLOCK_SIZE, stdin) == BLOCK_SIZE)
    >> printf("Block %d used %d bytes\n", ++count,
    >> strlen(buf));
    >> }
    >>

    >tnbx for your answer. Yes, I mean that if I have the array
    >
    >char buf[50]
    >
    >then some times it will contain 20 characters not '\0' I want to know
    >the amount of these characters, because I want to do a linked dynamic
    >list with pointers and each node should have the "string" of the static
    >array of 50 chars with really characters not '\0', so I must know the
    >number and after that use calloc or malloc functions


    Binary files often contain '\0' bytes in positions other than the end
    of strings. Is the data in your binary file truly strings or is it
    possible that there could be imbedded '\0' bytes and more data in buf
    that follows? If they are strings, why are you processing the file in
    binary mode?
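
    One way to check for that situation is sketched below. It uses
    memchr() rather than strlen() so the scan never leaves the 50-byte
    field, and then looks for non-zero bytes after the first '\0'. The
    names used_length and has_data_after_terminator are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define FIELD_SIZE 50

/* Number of bytes before the first '\0', or FIELD_SIZE if the field
   contains no '\0' at all. */
size_t used_length(const char field[FIELD_SIZE])
{
    const char *nul = memchr(field, '\0', FIELD_SIZE);

    return nul ? (size_t)(nul - field) : FIELD_SIZE;
}

/* Returns 1 if any non-zero byte follows the first '\0', which suggests
   the field is not a plain zero-padded string; returns 0 otherwise. */
int has_data_after_terminator(const char field[FIELD_SIZE])
{
    size_t i;

    for (i = used_length(field); i < FIELD_SIZE; i++)
        if (field[i] != '\0')
            return 1;
    return 0;
}
```

    A result of FIELD_SIZE from used_length() means the field has no
    terminator, so it must not be passed to strlen() or strcpy() as is.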


    Remove del for email
     
    Barry Schwarz, Jun 18, 2006
    #17
  18. Barry Schwarz writes:
    [...]
    > Binary files often contain '\0' bytes in positions other than the end
    > of strings. Is the data in your binary file truly strings or is it
    > possible that there could be imbedded '\0' bytes and more data in buf
    > that follows? If they are strings, why are you processing the file in
    > binary mode?


    Strings are arrays of char terminated by '\0'. A '\0' character
    normally wouldn't appear in a text file. (Text files contain lines,
    which are often read into strings.)

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Jun 18, 2006
    #18
  19. alberto

    Malcolm Guest

    "Richard Heathfield" <> wrote
    >>
    >> It makes sense if you assume that "allocate a pointer" is shorthand
    >> for "allocate an array and set the pointer to point to its first
    >> element".

    >
    > It still won't store a string of *any* length, though. There are
    > infinitely
    > many strings. Of these, infinitely many are infinitely long, and Malcolm
    > could reasonably argue that he didn't mean those, but if we discount them,
    > there remain infinitely many strings that are finite in length but still
    > so
    > very very long that they are unrepresentable within a single universe, let
    > alone a single machine.
    >

    An infinite string isn't a string, because C strings are NUL-terminated and
    an infinite array has no terminating member.
    I do take the point about long strings. If a size_t won't hold the length of
    my string (the Encyclopedia Britannica, not a contrived example in any way)
    then what is the point of it?

    I recommend for C2006 an arbitrary-precision representation of size_t.
    --
    Buy my book 12 Common Atheist Arguments (refuted)
    $1.25 download or $7.20 paper, available www.lulu.com/bgy1mm
     
    Malcolm, Jun 18, 2006
    #19
  20. alberto

    johnny Guest

    Barry Schwarz escribió:
    > On Sat, 17 Jun 2006 12:39:15 -0500, Joe Estock
    > <> wrote:
    >
    >> alberto wrote:
    >>> Hi. Im newbie in C language. I have a binary file with many character
    >>> arrays of 50 character defined as
    >>>
    >>> char array[50]
    >>>
    >>> But in some cases, many of these 50 characters are not being used. I
    >>> would like to know how could I know how many characters are really being
    >>> used in each array ?
    >>>
    >>> Thanks
    >>> Alberto

    >> I presume you are wanting to know how many bytes were read from the
    >> file. If this is the case then you can simply store the return value of
    >> fread. This will tell you how many bytes it actually read.

    >
    > fread is not a string oriented function. In the absence of an I/O
    > error or an end of data condition, it will always read the requested
    > number of bytes, regardless of how many are '\0'. Additionally,
    > unless fread is called with the second argument set to 1, the return
    > value is the number of objects read, not the number of bytes.
    >
    >
    > Remove del for email

    The file is a binary file containing some structs, and one field of
    the struct is

    char array[50];

    which I must "cut" at the first '\0' character to know which characters
    are really used.

    So I think fread can be used here.
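
    As a sketch of how that could look end to end: read one struct at a
    time with fread, cut the field at the first '\0', and keep an
    exact-size copy in a linked list node. The thread never shows the real
    struct, so the record layout below (an int id plus the array) is purely
    an assumption, as are the names node, load_list and free_list.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed record layout; only the char array[50] field is known. */
struct record {
    int  id;
    char array[50];
};

struct node {
    char        *text;   /* exact-size copy of the used part of array */
    struct node *next;
};

/* Release every node of the list and the string it owns. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head->text);
        free(head);
        head = next;
    }
}

/*
 * Read records from fp until a read fails and build a list in file order.
 * Returns the head of the list, or NULL if nothing could be loaded.
 * On allocation failure the nodes built so far are returned.
 */
struct node *load_list(FILE *fp)
{
    struct record rec;
    struct node *head = NULL, **tail = &head;

    while (fread(&rec, sizeof rec, 1, fp) == 1) {
        /* Cut at the first '\0' without reading past the field. */
        const char *nul = memchr(rec.array, '\0', sizeof rec.array);
        size_t len = nul ? (size_t)(nul - rec.array) : sizeof rec.array;
        struct node *n = malloc(sizeof *n);

        if (n == NULL)
            break;
        n->text = malloc(len + 1);
        if (n->text == NULL) {
            free(n);
            break;
        }
        memcpy(n->text, rec.array, len);
        n->text[len] = '\0';
        n->next = NULL;
        *tail = n;          /* append to keep file order */
        tail = &n->next;
    }
    return head;
}
```

    Reading whole structs this way only works if the file was written by a
    program using the same struct layout, padding and byte order.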
     
    johnny, Jun 18, 2006
    #20
