Request critique of first program

Discussion in 'C Programming' started by cs, Sep 2, 2007.

  1. cs

    cs Guest

    Hi,

    I'm new to C and would appreciate any feedback on the following
    program, asplit, which splits a file into 2 new files, putting a
    certain number of lines in the first file, and all the rest in the
    second file.

    Any comments as to non-portability, stylistic infelicities, outright
    bugs or anything else would be very much appreciated.

    It compiles fine using gcc 4.1.2, and --std=c99 -pedantic -Wall, with
    no warnings, and seems to work.

    There are two files, a header and a source file.

    ## Begin asplit.h ##

    #ifndef ASPLIT_H_
    #define ASPLIT_H_

    enum e_status { SUCCESS, ERROR_FILE_INPUT, ERROR_FILE_OUTPUT_A,
    ERROR_FILE_OUTPUT_B };

    typedef enum e_status status;

    typedef struct {
    int status;
    long int num_rem_lines;
    } asplit_result;

    /**
    * Asymmetrically splits file named in_file into 2 files, out_file_a,
    and
    * out_file_b, putting num_lines (must be non-negative) lines in the
    first
    * file, and the rest in the second file.
    *
    * Any of the filenames may be "-", indicating stdin in the case of
    in_file,
    * or stdout in the case of out_file_a or out_file_b.
    *
    * Returns a pointer to an asplit_result struct (unless unable to
    allocate,
    * in which case it returns NULL), which shows success or failure and
    the
    * number of lines that were included in 2nd file if successful.
    * Possible status values are given by the e_status enumeration in
    this
    * header file. Success is indicated by SUCCESS, and each of the 3
    different
    * failure states is indicated by one of the enum values beginning
    with
    * ERROR_, indicating which file was unable to be opened.
    *
    * If the status is SUCCESS, then the num_rem_lines value indicates
    * the number of lines remaining after the first file was filled,
    * all of which were entered into the second file.
    *
    * The client is responsible for memory management of the result
    struct.
    *
    */
    asplit_result * asplit(const char *in_file, const char *out_file_a,
    const char *out_file_b,
    const long int num_lines);

    #endif /*ASPLIT_H_*/

    ## End asplit.h ##

    ## Begin asplit.c ##

    #include <stdio.h>
    #include <stdlib.h>

    #include <string.h>

    #include <getopt.h>

    #include "asplit.h"

    char *copy_string(const char *const s);

    int close_file(FILE * f);

    FILE * open_infile(const char * fname);

    FILE * open_outfile(const char *fname);

    /* Name under which program was invoked. */
    char *program_name;

    void usage(int status) {

    if (status != EXIT_SUCCESS)
    fprintf(stderr, "Try `%s -h' for more information.\n",
    program_name);
    else {
    printf(
    "\n\
    Usage: %s [OPTION] -i IN_FILE -a OUT_FILE_A -b OUT_FILE_B -n NUM_LINES
    \n\n\
    ",
    program_name);
    puts("\
    Asymmetrically split IN_FILE into 2 files, putting the first NUM_LINES
    \n\
    into OUT_FILE_A and the rest into OUT_FILE_B.\n\
    \n\
    -i input file, IN_FILE\n\
    -a name of first output file, OUT_FILE_A\n\
    -b name of second output file, OUT_FILE_B\n\
    -n the number of lines to put in OUT_FILE_A\n\
    \n\
    -h display this help and exit\n\
    -v verbose output\n\
    ");
    puts("\
    \n\
    When IN_FILE is -, use standard input; when OUT_FILE_A or OUTFILE_B is
    -,\n\
    use standard output.\n\
    ");
    printf(
    "\
    \n\
    Examples:\n\n\
    %s -n 100 -i - -a output1 -b output2\n\
    Read from standard input, writing first 100 lines to output1
    file and\n\
    all following lines to output2.\n\n\
    %s -n 1000 -i infile.txt -a output1.txt -b -\n\
    Read from infile.txt, writing the first 1000 lines to
    output1.txt and\n\
    all following lines to standard output.\n\n\
    %s -n 1 -i - -a - -b -\n\
    Copy standard input to standard output.\n\n\
    \n",
    program_name, program_name,
    program_name);
    puts("\nReturns 0 on success, and 1 on failure.\n");
    printf("\nReport bugs to .
    \n\n");
    }
    exit(status);
    }

    asplit_result * asplit(const char *in_file, const char *out_file_a,
    const char *out_file_b,
    const long int num_lines) {

    long int curr_line_num;
    int curr_chr;
    FILE *in_f, *out_f;
    asplit_result *result;

    size_t size = sizeof(asplit_result);
    result = malloc(size);
    if (result == NULL) {
    return NULL;
    } else {
    result->num_rem_lines = 0;
    }

    /* Open in_file if not "-", returning unsuccessfully if unable
    to open. */
    if ((in_f = open_infile(in_file)) == NULL) {
    result->status = ERROR_FILE_INPUT;
    return result;
    }

    /* If num_lines is 0, then nothing goes into the first file.*/
    if (num_lines == 0) {
    /*
    * So open second file if not "-", returning
    unsuccessfully
    * if unable to open.
    */
    if ((out_f = open_outfile(out_file_b)) == NULL) {
    close_file(in_f);
    result->status = ERROR_FILE_OUTPUT_B;
    return result;
    }
    } else {
    /*
    * If num_lines is greater than 0, then open first
    file, returning
    * unsuccessfully if unable to open.
    */
    if ((out_f = open_outfile(out_file_a)) == NULL) {
    close_file(in_f);
    result->status = ERROR_FILE_OUTPUT_A;
    return result;
    }
    }

    /* Initial line number is 0, and increments after each newline
    is read. */
    curr_line_num = 0;

    /* While input file still has characters: */
    while ((curr_chr = getc(in_f)) != EOF) {

    /* Output char to current output file. */
    putc(curr_chr, out_f);

    /*
    * If we read a newline, and if we've just hit the
    maximum
    * allowed in the first file, we must close the first
    file, and
    * open the second file, returning unsuccessfully if
    unable to open.
    */
    if (curr_chr == '\n'&& ++curr_line_num == num_lines) {
    close_file(out_f);
    if ((out_f = open_outfile(out_file_b)) ==
    NULL) {
    close_file(in_f);
    result->status = ERROR_FILE_OUTPUT_B;
    return result;
    }
    }
    }
    /* Close input and output files if necessary. */
    close_file(in_f);
    close_file(out_f);

    /* Return the number of lines output to the second file. */
    result->status = 0;
    result->num_rem_lines = (curr_line_num < num_lines) ? 0 :
    (curr_line_num - num_lines);
    return result;

    }

    int main(int argc, char **argv) {

    program_name = *argv;

    char *input_file = NULL;
    char *output_file_a = NULL;
    char *output_file_b = NULL;

    int num_lines = -1;
    int verbose = 0;
    asplit_result *result = NULL;

    int ich;

    while ((ich = getopt(argc, argv, "hvi:a:b:n:")) != EOF) {
    switch (ich) {
    case 'h':
    usage(0);
    break;
    case 'i':
    input_file = (char *) copy_string(optarg);
    break;
    case 'a':
    output_file_a = (char *) copy_string(optarg);
    break;
    case 'b':
    output_file_b = (char *) copy_string(optarg);
    break;
    case 'n':
    num_lines = atoi(optarg);
    if (num_lines < 0)
    printf("-n arg must be at least 0, not
    %d.\n", num_lines);
    break;
    case 'v':
    verbose = 1;
    break;
    default:
    usage(1);
    break;
    }
    }

    /* Verify that all required arguments were submitted. */
    if (input_file == NULL|| output_file_a == NULL|| output_file_b
    == NULL
    || num_lines < 0)
    usage(1);


    /* Do the actual split. */
    result = asplit(input_file, output_file_a, output_file_b,
    num_lines);

    if (result == NULL) {
    fprintf(stderr, "Error: asplit returned NULL result.");
    exit(1);
    }

    /* Determine whether it terminated successfully, and handle
    accordingly. */
    switch (result->status) {
    case SUCCESS:
    if (verbose)
    printf("Output %li lines to 2nd file.\n",
    result->num_rem_lines);
    exit(0);
    case ERROR_FILE_INPUT:
    fprintf(stderr, "Unable to open input file: %s\n",
    input_file);
    exit(1);
    case ERROR_FILE_OUTPUT_A:
    fprintf(stderr, "Unable to open output file: %s\n",
    output_file_a);
    exit(1);
    case ERROR_FILE_OUTPUT_B:
    fprintf(stderr, "Unable to open output file: %s\n",
    output_file_b);
    exit(1);
    default:
    fprintf(stderr, "Error: unexpected return value %d\n", result-
    >status);

    exit(1);
    }

    }

    /**
    * Create a copy of the string referenced by the given pointer,
    returning
    * a pointer to the newly created copy, or NULL if unable to malloc
    memory
    * for the copy or original pointer was null or the string was empty.
    *
    * The client is responsible for memory management of the new string.
    */
    char *copy_string(const char *str_orig) {

    char *str_copy = NULL;

    if (str_orig != NULL && *str_orig != 0) {
    size_t size = strlen(str_orig) + 1;

    str_copy = malloc(size);

    if (str_copy != NULL)
    memcpy(str_copy, str_orig, size);
    }

    return str_copy;

    }

    /**
    * Close the file referenced by the given file pointer if it is not
    one of
    * the standard system files.
    *
    * Returns 0 if the file was actually closed, and 1 if not closed
    */
    int close_file(FILE * f) {
    if (f != stdin&& f != stdout&& f != stderr) {
    fclose(f);
    return 0;
    }
    return 1;
    }

    /**
    * Open the input file (mode 'r') referenced by the given pointer,
    * returning a pointer to the FILE. If fname is "-", return stdin.
    */
    FILE * open_infile(const char *const fname) {
    return strcmp(fname, "-") == 0 ? stdin : fopen(fname, "r");
    }

    /**
    * Open the output file (mode 'w') referenced by the given pointer,
    * returning a pointer to the FILE. If fname is "-", return stdout.
    */
    FILE * open_outfile(const char *const fname) {
    return strcmp(fname, "-") == 0 ? stdout : fopen(fname, "w");
    }

    ## End asplit.c ##

    I did browse the FAQ already and found a couple of things to improve,
    which I already did, but any additional feedback would be appreciated.

    Cheers,

    calvin
     
    cs, Sep 2, 2007
    #1
    1. Advertising

  2. cs

    cs Guest

    Ouch, I wish google groups had a preview. I see immediately that I've
    mixed tabs and spaces, so the code looks awful. My apologies for not
    catching that first.
     
    cs, Sep 2, 2007
    #2
    1. Advertising

  3. cs

    cs Guest

    Here is the source file with all tabs changed to spaces, so I hope it
    renders cleanly now:

    #include <stdio.h>
    #include <stdlib.h>

    #include <string.h>

    #include <getopt.h>

    #include "asplit.h"

    char *copy_string(const char *const s);

    int close_file(FILE * f);

    FILE * open_infile(const char * fname);

    FILE * open_outfile(const char *fname);

    /* Name under which program was invoked. */
    char *program_name;

    void usage(int status) {

    if (status != EXIT_SUCCESS)
    fprintf(stderr, "Try `%s -h' for more information.\n",
    program_name);
    else {
    printf("\n\
    Usage: %s [OPTION] -i IN_FILE -a OUT_FILE_A -b OUT_FILE_B -n NUM_LINES
    \n\n\
    ", program_name);
    puts("\
    Asymmetrically split IN_FILE into 2 files, putting the first NUM_LINES
    \n\
    into OUT_FILE_A and the rest into OUT_FILE_B.\n\
    \n\
    -i input file, IN_FILE\n\
    -a name of first output file, OUT_FILE_A\n\
    -b name of second output file, OUT_FILE_B\n\
    -n the number of lines to put in OUT_FILE_A\n\
    \n\
    -h display this help and exit\n\
    -v verbose output\n\
    ");
    puts("\
    \n\
    When IN_FILE is -, use standard input; when OUT_FILE_A or OUTFILE_B is
    -,\n\
    use standard output.\n\
    ");
    printf("\
    \n\
    Examples:\n\n\
    %s -n 100 -i - -a output1 -b output2\n\
    Read from standard input, writing first 100 lines to output1
    file and\n\
    all following lines to output2.\n\n\
    %s -n 1000 -i infile.txt -a output1.txt -b -\n\
    Read from infile.txt, writing the first 1000 lines to
    output1.txt and\n\
    all following lines to standard output.\n\n\
    %s -n 1 -i - -a - -b -\n\
    Copy standard input to standard output.\n\n\
    \n", program_name, program_name, program_name);
    puts("\nReturns 0 on success, and 1 on failure.\n");
    printf("\nReport bugs to .\n\n");
    }
    exit(status);
    }

    asplit_result * asplit(const char *in_file, const char *out_file_a,
    const char *out_file_b, const long int
    num_lines) {

    long int curr_line_num;
    int curr_chr;
    FILE *in_f, *out_f;
    asplit_result *result;

    size_t size = sizeof(asplit_result);
    result = malloc(size);
    if (result == NULL) {
    return NULL;
    } else {
    result->num_rem_lines = 0;
    }

    /* Open in_file if not "-", returning unsuccessfully if unable to
    open. */
    if ((in_f = open_infile(in_file)) == NULL) {
    result->status = ERROR_FILE_INPUT;
    return result;
    }

    /* If num_lines is 0, then nothing goes into the first file.*/
    if (num_lines == 0) {
    /*
    * So open second file if not "-", returning unsuccessfully
    * if unable to open.
    */
    if ((out_f = open_outfile(out_file_b)) == NULL) {
    close_file(in_f);
    result->status = ERROR_FILE_OUTPUT_B;
    return result;
    }
    } else {
    /*
    * If num_lines is greater than 0, then open first file,
    returning
    * unsuccessfully if unable to open.
    */
    if ((out_f = open_outfile(out_file_a)) == NULL) {
    close_file(in_f);
    result->status = ERROR_FILE_OUTPUT_A;
    return result;
    }
    }

    /* Initial line number is 0, and increments after each newline
    char. */
    curr_line_num = 0;

    /* While input file still has characters: */
    while ((curr_chr = getc(in_f)) != EOF) {

    /* Output char to current output file. */
    putc(curr_chr, out_f);

    /*
    * If we read a newline, and if we've just hit the maximum
    * allowed in the first file, we must close the first file,
    and
    * open the second file, returning unsuccessfully if unable to
    open.
    */
    if (curr_chr == '\n'&& ++curr_line_num == num_lines) {
    close_file(out_f);
    if ((out_f = open_outfile(out_file_b)) == NULL) {
    close_file(in_f);
    result->status = ERROR_FILE_OUTPUT_B;
    return result;
    }
    }
    }
    /* Close input and output files if necessary. */
    close_file(in_f);
    close_file(out_f);

    /* Return the number of lines output to the second file. */
    result->status = 0;
    if (curr_line_num < num_lines)
    result->num_rem_lines = 0;
    else
    result->num_rem_lines = curr_line_num - num_lines;

    return result;

    }

    int main(int argc, char **argv) {

    program_name = *argv;

    char *input_file = NULL;
    char *output_file_a = NULL;
    char *output_file_b = NULL;

    int num_lines = -1;
    int verbose = 0;
    asplit_result *result = NULL;

    int ich;

    while ((ich = getopt(argc, argv, "hvi:a:b:n:")) != EOF) {
    switch (ich) {
    case 'h':
    usage(0);
    break;
    case 'i':
    input_file = (char *) copy_string(optarg);
    break;
    case 'a':
    output_file_a = (char *) copy_string(optarg);
    break;
    case 'b':
    output_file_b = (char *) copy_string(optarg);
    break;
    case 'n':
    num_lines = atoi(optarg);
    if (num_lines < 0)
    printf("-n arg must be at least 0, not %d.\n",
    num_lines);
    break;
    case 'v':
    verbose = 1;
    break;
    default:
    usage(1);
    break;
    }
    }

    /* Verify that all required arguments were submitted. */
    if (input_file == NULL || output_file_a == NULL || output_file_b
    == NULL
    || num_lines < 0)
    usage(1);


    /* Do the actual split. */
    result = asplit(input_file, output_file_a, output_file_b,
    num_lines);

    if (result == NULL) {
    fprintf(stderr, "Error: asplit returned NULL result.");
    exit(1);
    }

    /* Determine whether it terminated successfully, and handle
    accordingly.*/
    switch (result->status) {
    case SUCCESS:
    if (verbose)
    printf("Output %li lines to 2nd file.\n", result-
    >num_rem_lines);

    exit(0);
    case ERROR_FILE_INPUT:
    fprintf(stderr, "Unable to open input file: %s\n",
    input_file);
    exit(1);
    case ERROR_FILE_OUTPUT_A:
    fprintf(stderr, "Unable to open output file: %s\n",
    output_file_a);
    exit(1);
    case ERROR_FILE_OUTPUT_B:
    fprintf(stderr, "Unable to open output file: %s\n",
    output_file_b);
    exit(1);
    default:
    fprintf(stderr, "Error: unexpected return value: %d\n",
    result->status);
    exit(1);
    }

    }

    /**
    * Create a copy of the string referenced by the given pointer,
    returning
    * a pointer to the newly created copy, or NULL if unable to malloc
    memory
    * for the copy or original pointer was null or the string was empty.
    *
    * The client is responsible for memory management of the new string.
    */
    char *copy_string(const char *str_orig) {

    char *str_copy = NULL;

    if (str_orig != NULL && *str_orig != 0) {
    size_t size = strlen(str_orig) + 1;

    str_copy = malloc(size);

    if (str_copy != NULL)
    memcpy(str_copy, str_orig, size);
    }

    return str_copy;

    }

    /**
    * Close the file referenced by the given file pointer if it is not
    one of
    * the standard system files.
    *
    * Returns 0 if the file was actually closed, and 1 if not closed
    */
    int close_file(FILE * f) {
    if (f != stdin&& f != stdout&& f != stderr) {
    fclose(f);
    return 0;
    }
    return 1;
    }

    /**
    * Open the input file (mode 'r') referenced by the given pointer,
    * returning a pointer to the FILE. If fname is "-", return stdin.
    */
    FILE * open_infile(const char *const fname) {
    return strcmp(fname, "-") == 0 ? stdin : fopen(fname, "r");
    }

    /**
    * Open the output file (mode 'w') referenced by the given pointer,
    * returning a pointer to the FILE. If fname is "-", return stdout.
    */
    FILE * open_outfile(const char *const fname) {
    return strcmp(fname, "-") == 0 ? stdout : fopen(fname, "w");
    }
     
    cs, Sep 2, 2007
    #3
  4. On Sep 2, 1:24 pm, cs <> wrote:
    > I'm new to C and would appreciate any feedback on the
    > following program, asplit, which splits a file into 2 new
    > files, putting a certain number of lines in the first file,
    > and all the rest in the second file.
    >
    > Any comments as to non-portability, stylistic infelicities,
    > outright bugs or anything else would be very much
    > appreciated.
    >
    > It compiles fine using gcc 4.1.2, and --std=c99 -pedantic
    > -Wall, with no warnings, and seems to work.
    >
    > There are two files, a header and a source file.
    >
    > ## Begin asplit.h ##
    >
    > #ifndef ASPLIT_H_
    > #define ASPLIT_H_


    I prefer...

    #ifndef h_asplit_h
    #define h_asplit_h

    > enum e_status { SUCCESS, ERROR_FILE_INPUT, ERROR_FILE_OUTPUT_A,
    > ERROR_FILE_OUTPUT_B };


    Capitalised identifiers that begin with E are reserved if
    <errno.h> is included. So you are robustness with those names.

    > typedef enum e_status status;
    >
    > typedef struct {
    > int status;
    > long int num_rem_lines;
    > } asplit_result;
    >

    <snip>
    > asplit_result * asplit(const char *in_file, const char


    You can return a struct as is. You don't need to dynamically
    allocate it, thereby placing a burdon on the caller to free
    it.

    In any case, it's more usual to return a simple int and let
    the caller pass in a pointer as an output parameter...

    int asplit( const char *in_file,
    const char *out_file_a,
    const char *out_file_b,
    long num_lines,
    long *rem_lines );

    It's very unusual to dynamically allocate an object to
    return when it's just a status.

    > *out_file_a, const char *out_file_b, const long int
    > num_lines);


    There's no point declaring num_lines to be a const.

    > #endif /*ASPLIT_H_*/
    >
    > ## End asplit.h ##
    >
    > ## Begin asplit.c ##
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    > #include <string.h>
    >
    > #include <getopt.h>


    This is not a C standard header. [It may be POSIX, I can't
    recall.]

    > #include "asplit.h"
    >
    > char *copy_string(const char *const s);
    > int close_file(FILE * f);
    > FILE * open_infile(const char * fname);
    > FILE * open_outfile(const char *fname);
    >
    > /* Name under which program was invoked. */
    > char *program_name;


    Unless you intent to change the string that's pointed to,
    you should make that const char *.

    > void usage(int status) {


    I prefer...

    void usage(FILE *fp)
    {
    fprintf(fp, ...);
    fprintf(fp, ...);
    fprintf(fp, ...);
    ...
    exit(fp == stderr ? EXIT_FAILURE : EXIT_SUCCESS);
    }

    That allows usage() to give the same usage message to
    stdout when the user invokes the program only to ask
    for help (in which case the message should be written
    to stdout, not stderr.) YMMV.

    <snip>
    > printf(
    > "\n\


    It's generally not good style to use line splicing to
    break up string literals. [Translations of line breaks
    when porting code to other systems can turn line splices
    into syntax errors.] You should just use ordinary string
    literal concatination...

    printf( "Hello"
    " World\n" );

    Note that there is a length limit on string literals.
    [I can't recall it, something like 509 characters.]

    cosnt char *out_file_a,
    <snip>
    > input_file = (char *) copy_string(optarg);


    The cast is redundant since copy_string is declared as
    returning a char * already. If you need to cast a
    return value for an assignment, it can be a sign that
    you're doing something wrong. Hence, such a cast is
    often viewed as immediately suspicious.

    In any case, I'm not sure why you don't just parse
    argv yourself.

    <snip>
    >
    > exit(1);


    1 is not a portable value to pass to exit. Use 0,
    EXIT_SUCCESS or EXIT_FAILURE.

    <snip>
    > int close_file(FILE * f) {
    > if (f != stdin&& f != stdout&& f != stderr) {
    > fclose(f);


    You should check the status of fclose. It can fail for
    output files if the implementation (or the os) can't
    flush the buffer.

    > return 0;
    > }
    > return 1;
    > }


    --
    Peter
     
    Peter Nilsson, Sep 2, 2007
    #4
  5. cs

    cs Guest

    Thanks for the feedback, Peter. Further comments interspersed below.

    On Sep 1, 9:12 pm, Peter Nilsson <> wrote:
    > On Sep 2, 1:24 pm, cs <> wrote:
    >
    > > #ifndef ASPLIT_H_
    > > #define ASPLIT_H_

    >
    > I prefer...
    >
    > #ifndef h_asplit_h
    > #define h_asplit_h


    Thanks. I copied that style from some GNU source, but I like your
    style better myself too.

    > Capitalised identifiers that begin with E are reserved if
    > <errno.h> is included. So you are robustness with those names.


    Ah, I wasn't aware of that. I changed them to begin with 'F', since
    there seem to be no similar restriction there based on <http://
    web.archive.org/web/20040209031039/http://oakroadsystems.com/tech/c-
    predef.htm#identF>).

    > > asplit_result * asplit(const char *in_file, const char

    >
    > You can return a struct as is. You don't need to dynamically
    > allocate it, thereby placing a burdon on the caller to free
    > it.


    If I understand you correctly, you mean to return the struct rather
    than a pointer, and to initialize it when it is declared. I made that
    change, before getting rid of the struct as you suggested below.

    > In any case, it's more usual to return a simple int and let
    > the caller pass in a pointer as an output parameter...
    >
    > int asplit( const char *in_file,
    > const char *out_file_a,
    > const char *out_file_b,
    > long num_lines,
    > long *rem_lines );
    >
    > It's very unusual to dynamically allocate an object to
    > return when it's just a status.


    Ah, I see. That is much simpler.

    > > *out_file_a, const char *out_file_b, const long int
    > > num_lines);

    >
    > There's no point declaring num_lines to be a const.


    You mean there's no point in having the function prototype declare the
    num_lines to be const, since it doesn't matter in the prototype,
    right? I still want it in the implementation, because I'd like a
    compile-time error (if possible) if I should mistakenly try to change
    num_lines inside the function.

    > > #include <getopt.h>

    >
    > This is not a C standard header. [It may be POSIX, I can't
    > recall.]


    Thanks. I didn't realize that was non-standard. I guess I'll replace
    it to keep it standard.

    > > /* Name under which program was invoked. */
    > > char *program_name;

    >
    > Unless you intent to change the string that's pointed to,
    > you should make that const char *.


    It's not intended to change, of course, so I'll make it const.

    > > void usage(int status) {

    >
    > I prefer...
    >
    > void usage(FILE *fp)
    > ...


    That's an improvement. Thanks.

    > It's generally not good style to use line splicing to
    > break up string literals....
    > You should just use ordinary string literal concatination....


    That's another style choice that I copied from some GNU code. I wasn't
    aware of the potential for problems, and will use the concatenation
    format in the future.

    > Note that there is a length limit on string literals.
    > [I can't recall it, something like 509 characters.]


    I just looked it up to confirm, and ANSI does indeed specify 509 as
    the max. (c99, incidentally, bumps it to 4095.)

    > > input_file = (char *) copy_string(optarg);

    >
    > The cast is redundant since copy_string is declared as
    > returning a char * already.


    Right you are. I think they may have been required in some earlier
    version, before I added copy_string.

    > In any case, I'm not sure why you don't just parse
    > argv yourself.


    Yeah, I will change it. I was originally planning on supporting long
    args too, and getopt seemed to make that easier, but since I changed
    my mind about long args, and getopt is non-standard, I will just parse
    the args myself.

    > 1 is not a portable value to pass to exit. Use 0,
    > EXIT_SUCCESS or EXIT_FAILURE.


    Updated. I wasn't aware of those macros.

    > > int close_file(FILE * f) {
    > > if (f != stdin&& f != stdout&& f != stderr) {
    > > fclose(f);

    >
    > You should check the status of fclose. It can fail for
    > output files if the implementation (or the os) can't
    > flush the buffer.


    Done.

    Thanks so much for the many definite improvements.

    Cheers,

    Calvin
     
    cs, Sep 2, 2007
    #5
  6. On Sep 2, 6:20 am, cs <> wrote:
    > Thanks for the feedback, Peter. Further comments interspersed below.


    > > > /* Name under which program was invoked. */
    > > > char *program_name;

    >
    > > Unless you intent to change the string that's pointed to,
    > > you should make that const char *.

    >
    > It's not intended to change, of course, so I'll make it const.


    Since you mention GNU previously, you may be interested to know
    that glibc already defines two symbols:
    program_invocation_name and program_invocation_short_name (I
    am on a non-real (ie windows) machine as I write this, so
    I might have those names wrong) that provides the appropriate
    values. Obviously, using them is non-portable.

    Also, w.r.t. argument parsing, you might want to look at gnu's
    argp_parse() instead of getopt(). In fact, I would say
    you have exactly 2 choices: parse the arguments yourself
    or use argp_parse(). getopt() is IMO obsolete.
     
    William Pursell, Sep 2, 2007
    #6
  7. cs

    cs Guest

    On Sep 2, 7:32 am, William Pursell <> wrote:
    >
    > Since you mention GNU previously, you may be interested to know
    > that glibc already defines two symbols:
    > program_invocation_name and program_invocation_short_name (I
    > am on a non-real (ie windows) machine as I write this, so
    > I might have those names wrong) that provides the appropriate
    > values. Obviously, using them is non-portable.


    Thanks for the information. I wasn't aware of them, but it looks like
    they're in errno.h, and they are indeed the names you remembered.

    > Also, w.r.t. argument parsing, you might want to look at gnu's
    > argp_parse() instead of getopt(). In fact, I would say
    > you have exactly 2 choices: parse the arguments yourself
    > or use argp_parse(). getopt() is IMO obsolete.


    I'll remember argp_parse if I have more complex commandline parsing
    needs. It looks like it provides much more than getopt.

    Cheers,

    Calvin
     
    cs, Sep 2, 2007
    #7
  8. On 2 Sep, 05:12, Peter Nilsson <> wrote:
    > On Sep 2, 1:24 pm, cs <> wrote:


    > > I'm new to C and would appreciate any feedback on the
    > > following program, asplit, which splits a file into 2 new
    > > files, putting a certain number of lines in the first file,
    > > and all the rest in the second file.


    why? why not put it all in one file? I only use .h files
    for declarations and macros shared between .c files.


    > > Any comments as to non-portability, stylistic infelicities,
    > > outright bugs or anything else would be very much
    > > appreciated.


    your functions looked rather large

    <snip>

    > > ## Begin asplit.h ##

    >
    > > #ifndef ASPLIT_H_
    > > #define ASPLIT_H_

    >
    > I prefer...
    >
    > #ifndef h_asplit_h
    > #define h_asplit_h


    why? A macro is defined and by the usual conventions
    a macro is in upper case.

    <snip>

    --
    Nick Keighley
     
    Nick Keighley, Sep 3, 2007
    #8
  9. Nick Keighley <> writes:
    [...]
    >> On Sep 2, 1:24 pm, cs <> wrote:
    >> > I'm new to C and would appreciate any feedback on the
    >> > following program, asplit, which splits a file into 2 new
    >> > files, putting a certain number of lines in the first file,
    >> > and all the rest in the second file.

    >
    > why? why not put it all in one file? I only use .h files
    > for declarations and macros shared between .c files.

    [...]

    I don't see any indication that the program is intended to operate on
    C source files.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 3, 2007
    #9
  10. cs

    cs Guest

    On Sep 3, 2:30 am, Nick Keighley <>
    wrote:
    > On 2 Sep, 05:12, Peter Nilsson <> wrote:
    >
    > > On Sep 2, 1:24 pm, cs <> wrote:
    > > > I'm new to C and would appreciate any feedback on the
    > > > following program, asplit, which splits a file into 2 new
    > > > files, putting a certain number of lines in the first file,
    > > > and all the rest in the second file.

    >
    > why? why not put it all in one file? I only use .h files
    > for declarations and macros shared between .c files.


    There are 2 separate issues that are being conflated here, if I
    understand you correctly.

    (1) the program itself is intended to split enormous line-oriented
    text files. E.g., I have some text files where each line is
    independent of the next, and some contain up to about 6 million lines
    or so. I needed to take a sample to prime a database (i.e., run
    ANALYZE in postgresql), before loading the rest, so I wanted an easy
    way of splitting the file into 2 files, 1 containing the first
    100,000, and the second containing the following 5.9 million. The
    standard Linux split program only allows you to specify a number of
    lines, which it then uses to split the file into N/num_lines files, so
    there is no way to specify only 2 files, first containing x lines, and
    second containing N-x lines.

    (2) why did I not put all the source in the asplit.c file and do away
    with the header? I probably could, as it's unlike anybody else will
    use the program, and I might never want to use it elsewhere either.
    But isn't it best to have the option of reusing the macros and
    declarations elsewhere?

    >
    > your functions looked rather large
    >


    Yes, the asplit and main are both pretty large. I'll see if I can
    refactor them and make them smaller and more cohesive.

    > > > ## Begin asplit.h ##

    >
    > > > #ifndef ASPLIT_H_
    > > > #define ASPLIT_H_

    >
    > > I prefer...

    >
    > > #ifndef h_asplit_h
    > > #define h_asplit_h

    >
    > why? A macro is defined and by the usual conventions
    > a macro is in upper case.
    >


    Well, I'll leave that to Peter to answer, but I commented that I
    prefer the second because I don't like dangling underscores for purely
    aesthetic reasons. If upper case is a widely observed macro convention
    though, I'll change them to uppercase. Is there any reason why the
    obvious ASPLIT_H is not suitable? If it must be uppercase, that seems
    the most natural to me, given that the filename is asplit.h.

    Cheers,

    Calvin
     
    cs, Sep 3, 2007
    #10
  11. cs

    cs Guest

    On Sep 3, 1:32 pm, Keith Thompson <> wrote:
    > Nick Keighley <> writes:
    >
    > [...]>> On Sep 2, 1:24 pm, cs <> wrote:
    > >> > I'm new to C and would appreciate any feedback on the
    > >> > following program, asplit, which splits a file into 2 new
    > >> > files, putting a certain number of lines in the first file,
    > >> > and all the rest in the second file.

    >
    > > why? why not put it all in one file? I only use .h files
    > > for declarations and macros shared between .c files.

    >
    > [...]
    >
    > I don't see any indication that the program is intended to operate on
    > C source files.
    >


    Yes, it's not intended for C source files. If I were only splitting
    smallish files like C source files, I'd just use the standard head and
    tail programs, extracting the first part with head and then the rest
    with tail.

    -calvin
     
    cs, Sep 3, 2007
    #11
  12. Nick Keighley <> wrote:
    > Peter Nilsson <> wrote:
    > > > ## Begin asplit.h ##
    > > >
    > > > #ifndef ASPLIT_H_
    > > > #define ASPLIT_H_

    > >
    > > I prefer...
    > >
    > > #ifndef h_asplit_h
    > > #define h_asplit_h

    >
    > why?


    For reasons you snipped. In general, XXXX is more likely to
    be a reserved identifier than h_xxxx_h; particularly in
    contexts that are beyond the control of the header file.

    [Of course, if the committee just standardised #pragma once,
    there wouldn't be an issue.]

    > A macro is defined and by the usual conventions
    > a macro is in upper case.


    Occasionally you need to stop and examine the 'ususal conventions'.
    Some of them have the odd flaw. It's unfortunate that XXXXX_H is,
    as you say, a usual convention.

    Of course, there are alternative conventions like defining
    HDR_XXXX, etc. I just stated what mine was.

    --
    Peter
     
    Peter Nilsson, Sep 3, 2007
    #12
  13. cs <> writes:
    > On Sep 3, 1:32 pm, Keith Thompson <> wrote:
    >> Nick Keighley <> writes:
    >>
    >> [...]>> On Sep 2, 1:24 pm, cs <> wrote:
    >> >> > I'm new to C and would appreciate any feedback on the
    >> >> > following program, asplit, which splits a file into 2 new
    >> >> > files, putting a certain number of lines in the first file,
    >> >> > and all the rest in the second file.

    >>
    >> > why? why not put it all in one file? I only use .h files
    >> > for declarations and macros shared between .c files.

    >>
    >> [...]
    >>
    >> I don't see any indication that the program is intended to operate on
    >> C source files.
    >>

    >
    > Yes, it's not intended for C source files. If I were only splitting
    > smallish files like C source files, I'd just use the standard head and
    > tail programs, extracting the first part with head and then the rest
    > with tail.


    Why don't you do that anyway? It would mean scanning the first
    portion of the file twice; I don't know whether that would be a
    problem for you.

    It would also be easy to do in awk or perl without scanning part of
    the file twice.

    Of course, if you want help with these approaches, you'll need to ask
    in a different newsgroup.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 4, 2007
    #13
  14. cs

    cs Guest

    On Sep 3, 4:49 pm, Keith Thompson <> wrote:
    > cs <> writes:
    >
    > > Yes, it's not intended for C source files. If I were only splitting
    > > smallish files like C source files, I'd just use the standard head and
    > > tail programs, extracting the first part with head and then the rest
    > > with tail.

    >
    > Why don't you do that anyway?


    I was looking for an excuse to use C, as a learning exercise.
    Certainly not because it was the quickest thing to implement.
     
    cs, Sep 4, 2007
    #14
  15. cs <> writes:
    > On Sep 3, 4:49 pm, Keith Thompson <> wrote:
    >> cs <> writes:
    >> > Yes, it's not intended for C source files. If I were only splitting
    >> > smallish files like C source files, I'd just use the standard head and
    >> > tail programs, extracting the first part with head and then the rest
    >> > with tail.

    >>
    >> Why don't you do that anyway?

    >
    > I was looking for an excuse to use C, as a learning exercise.
    > Certainly not because it was the quickest thing to implement.


    That's an excellent reason!

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 4, 2007
    #15
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Fao, Sean

    Critique Request: CheckBoxColumn

    Fao, Sean, Feb 15, 2006, in forum: ASP .Net
    Replies:
    0
    Views:
    523
    Fao, Sean
    Feb 15, 2006
  2. cs

    Request critique of first program

    cs, Sep 2, 2007, in forum: C Programming
    Replies:
    3
    Views:
    284
  3. Zack

    Critique of first python code

    Zack, Feb 8, 2008, in forum: Python
    Replies:
    9
    Views:
    416
    George Sakkis
    Feb 17, 2008
  4. voipfc
    Replies:
    0
    Views:
    140
    voipfc
    Oct 11, 2006
  5. kazack
    Replies:
    3
    Views:
    290
    Tad McClellan
    Oct 22, 2003
Loading...

Share This Page