Challenge: tightest code to find-replace a string

Discussion in 'C Programming' started by DFS, Jun 6, 2014.

  1. DFS

    DFS Guest

    * reads an existing file
    * writes changes to new file
    * counts replacements made by line
    * counts total replacements made
    * no fancy usage of sed!

    I KNOW someone can better my piddly effort below (actually one I found
    online and made mods to):
    #include <stdio.h>
    #include <string.h>

    int findreplace(void)
    {
        int bufferSize = 0x1000;
        int i = 0, k = 0, j = 0;   /* i: line number, k: per-line count, j: total */
        char buffer[bufferSize];
        FILE *inFile = fopen("random_in.txt", "rt");
        FILE *outFile = fopen("random_out.txt", "w+");
        char *find = "46";
        char *replace = "----";

        if(inFile == NULL || outFile == NULL)
        {
            printf("Error opening file(s)");
            return 1;
        }

        printf("Replace '%s' with '%s':\n", find, replace);

        while(fgets(buffer, bufferSize, inFile) != NULL)
        {
            char *stop = NULL;
            char *start = buffer;
            k = 0;
            i++;

            while(1)
            {
                stop = strstr(start, find);

                if(stop == NULL)
                {
                    /* no more matches: write the rest of the line */
                    fwrite(start, 1, strlen(start), outFile);
                    break;
                } else {
                    /* write the text before the match, then the replacement */
                    fwrite(start, 1, stop - start, outFile);
                    fwrite(replace, 1, strlen(replace), outFile);
                    start = stop + strlen(find);
                    k++;
                }
            }

            j += k;
            printf("Line %d: %d replacements made\n", i, k);
        }
        printf("%d replacements made.\n", j);

        fclose(inFile);
        fclose(outFile);
        return 0;
    }

    int main(void) {
        findreplace();
        return 0;
    }


    input (random_in.txt)


    output (random_out.txt)



    $ ./find_replace
    Replace '46' with '----':
    Line 1: 0 replacements made
    Line 2: 2 replacements made
    Line 3: 3 replacements made
    Line 4: 2 replacements made
    Line 5: 1 replacements made
    Line 6: 1 replacements made
    Line 7: 1 replacements made
    Line 8: 0 replacements made
    Line 9: 0 replacements made
    Line 10: 0 replacements made
    10 replacements made.

    DFS, Jun 6, 2014

  2. DFS

    Stefan Ram Guest

    What you wrote does not replace strings that contain
    line breaks or occur at 0x1000 boundaries.
    Stefan Ram, Jun 6, 2014
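
The buffer-boundary limitation can be reproduced with a deliberately tiny buffer. The sketch below is illustrative only: count_with_buffer is a hypothetical helper, not code from the thread, and it assumes matches are counted the same way the fgets/strstr loop in the first post counts them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts occurrences of find the way the original
   loop does, reading the stream in fgets chunks of at most bufsize - 1
   characters. Returns -1 if the buffer cannot be allocated. */
int count_with_buffer(FILE *in, const char *find, size_t bufsize)
{
    char *buffer;
    int count = 0;

    assert(bufsize >= 2 && *find != '\0');
    buffer = malloc(bufsize);
    if (buffer == NULL)
        return -1;

    while (fgets(buffer, (int)bufsize, in) != NULL) {
        const char *start = buffer;
        const char *stop;
        /* strstr only sees one chunk, so a match that straddles
           two chunks is never found. */
        while ((stop = strstr(start, find)) != NULL) {
            count++;
            start = stop + strlen(find);
        }
    }
    free(buffer);
    return count;
}
```

With the input "a46b" and a 3-byte buffer, fgets delivers "a4" and then "6b", so the "46" in the middle is missed; a larger buffer finds it.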

  3. DFS

    Ike Naar Guest

    This could be simplified to

    while (stop = strstr(start, find), stop != NULL)
    {
        fwrite(start, 1, stop - start, outFile);
        fputs(replace, outFile);
        start = stop + strlen(find);
        k++;
    }
    fputs(start, outFile);
    Ike Naar, Jun 6, 2014
  4. DFS

    Noob Guest

    This doesn't "feel" very idiomatic.


    while ((stop = strstr(start, find)) != NULL)

    or even

    while (stop = strstr(start, find))

    The second one raises warnings with most compilers.

    while ((stop = strstr(start, find)))

    may shut them up.
    Noob, Jun 6, 2014
  5. DFS

    DFS Guest


    But the challenge isn't to say what it can't do. It's to show a tighter
    piece of code that does it as well or better.

    Looking forward to your entry!
    DFS, Jun 6, 2014
  6. DFS

    Jorgen Grahn Guest

    But to do that you need to understand what the program is supposed to
    do.

    And by the way, I don't understand what "tight" means. I'd personally
    optimize for memory and I/O use.

    Jorgen Grahn, Jun 6, 2014
  7. But it does disqualify your entry as it doesn't accomplish the stated
    goal. Looking forward to your fix!
    Mark Storkamp, Jun 6, 2014
  8. It reports rather than counts these matches. I would never write a
    function with this spec. because it destroys its usefulness in other
    contexts. A function should do one thing well.

    I'd write a string match/replace function that returns the number of
    matches. If I needed the counts reported by line, I'd write a wrapper
    that adds those.

    int replace_string(const char *match, const char *repl, int stopper,
                       FILE *fi, FILE *fo)
    {
        int nmatches = 0, c;
        const char *mp = match;
        while ((c = fgetc(fi)) != EOF && c != stopper)
            if (c == *mp) {
                if (!*++mp) {
                    fputs(repl, fo);
                    ++nmatches;
                    mp = match;
                }
            }
            else {
                mp = match;
                fputc(c, fo);
            }
        return nmatches;
    }

    Called with stopper == EOF it processes a whole file. Note how removing
    the line buffer actually simplifies the code, whilst also removing an
    unnecessary restriction. It's not uncommon for this to happen (there
    was a recent thread about this).

    Called with stopper == '\n' it processes a line and so this wrapper
    prints the report:

    void replace_string_report(const char *match, const char *repl,
                               FILE *fi, FILE *fo)
    {
        int total_matches = 0, lineno = 0;
        while (!feof(fi)) {
            int nm = replace_string(match, repl, '\n', fi, fo);
            printf("\nLine %d: %d replacements\n", ++lineno, nm);
            total_matches += nm;
        }
        printf("%d replacements\n", total_matches);
    }

    Here's the driver for testing.

    int main(int argc, char **argv)
    {
        if (argc > 2) {
            FILE *fin = argc > 3 ? fopen(argv[3], "r") : stdin;
            FILE *fout = argc > 4 ? fopen(argv[4], "w") : stdout;
            if (fin && fout)
                replace_string_report(argv[1], argv[2], fin, fout);
        }
    }

    Functions that mix tasks that can be logically separated are best
    avoided. Functions with hard-wired file names and strings are, well,
    let's just say, sub-optimal. Students used to say "but it's because I'm
    just testing" but a simple driver like the one above makes testing
    much easier than having the files and strings hard wired.

    Ben Bacarisse, Jun 6, 2014
  9. DFS

    BartC Guest

    The OP's findreplace() function where everything was hard-coded inside it,
    rather than being passed as arguments did grate a little (that would also be
    the first thing I'd change).

    But I wouldn't bother with command line parameters for testing until it's
    finished. Far easier to just write:

    int main(void) {
        replace_string_report("46","----", "random_in.txt", "random_out.txt");
    }

    (Although you'd have to decide whether file names or handles are going to be
    passed. If this is the only find&replace operation on the file, then file
    names are probably more appropriate, although it will need more
    error-checking inside the function.)
    BartC, Jun 6, 2014
  10. /* bug here? fwrite(match, mp-match, 1, of); */
    I think there's a bug in this. Fix untested.
    Malcolm McLean, Jun 6, 2014
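
As I read it, the bug being flagged is that characters consumed during a partial match are dropped when the match later fails. One way to handle that is sketched below. It is not anyone's posted fix: replace_stream is a hypothetical name, the per-line stopper parameter is left out for brevity, and the fallback step is borrowed from Knuth-Morris-Pratt but computed naively, so that inputs such as "446" (searching for "46") are still matched.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of replace_string: copies fi to fo, replacing each
   non-overlapping occurrence of match with repl, and returns the number
   of replacements. Characters held back during a partial match are
   written out again if the match fails. */
int replace_stream(const char *match, const char *repl, FILE *fi, FILE *fo)
{
    size_t mlen = strlen(match);
    size_t held = 0;   /* input chars held back; they equal match[0..held-1] */
    int nmatches = 0, c;

    assert(mlen > 0);
    while ((c = fgetc(fi)) != EOF) {
        /* Mismatch after a partial match: keep the longest tail of the
           held text that is still a prefix of match, emit the rest. */
        while (held > 0 && c != (unsigned char)match[held]) {
            size_t keep = held - 1;
            while (keep > 0 && memcmp(match, match + held - keep, keep) != 0)
                keep--;
            fwrite(match, 1, held - keep, fo);
            held = keep;
        }
        if (c == (unsigned char)match[held]) {
            if (++held == mlen) {        /* full match */
                fputs(repl, fo);
                nmatches++;
                held = 0;
            }
        } else {
            fputc(c, fo);                /* nothing is held back here */
        }
    }
    fwrite(match, 1, held, fo);          /* flush a trailing partial match */
    return nmatches;
}
```

Because the held-back text is always a prefix of match, it never needs its own buffer; it can be re-emitted straight from match.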
  11. DFS

    Jorgen Grahn Guest

    The easiest and most useful is to default to stdin and stdout, just
    like sed(1) does. The second most useful is to emulate Perl's <>
    operator (stdin, or a sequence of named files, including "-" which
    means stdin).

    Jorgen Grahn, Jun 6, 2014
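
A minimal sketch of the stdin-by-default convention, assuming a hypothetical helper named open_input (not from the thread). It covers only the single-file case; looping over several names in the manner of Perl's <> would call it once per argument.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: a missing name or "-" selects stdin, anything
   else is opened for reading. Returns NULL if fopen fails. */
FILE *open_input(const char *name)
{
    /* An empty file name is treated as a caller bug in this sketch. */
    assert(name == NULL || *name != '\0');

    if (name == NULL || strcmp(name, "-") == 0)
        return stdin;
    return fopen(name, "r");
}
```

A driver could then write `FILE *fin = open_input(argc > 3 ? argv[3] : NULL);` and remember not to fclose the stream when it is stdin.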
  12. DFS

    Stefan Ram Guest

    I think this is not a sufficient specification of requirements.

    For example, it mentions »changes« in line two, but before line
    two, it was not said that anything should be changed at all. So
    it is not clear what »changes« refers to.

    And »to count« something is not behavior that is visible from the

    BTW: When given the task

    »replace all the occurrences of "abcabc" by "defdef" in
    "012abcabcabc789" would
    "012defdefdef789" be a correct result? the only correct result?«
    Stefan Ram, Jun 6, 2014
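
The question has teeth because "abcabc" occurs twice in that input, at overlapping positions. The sketch below makes the two readings explicit; count_occurrences is a hypothetical helper, and the choice between the modes is exactly what the specification leaves open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts occurrences of find in s, scanning left to
   right. With allow_overlap == 0 the scan resumes after each match (what
   a strstr-based replace loop does); otherwise it resumes one character
   after the start of each match. */
size_t count_occurrences(const char *s, const char *find, int allow_overlap)
{
    size_t n = 0;
    size_t step = allow_overlap ? 1 : strlen(find);
    const char *p;

    assert(*find != '\0');
    for (p = s; (p = strstr(p, find)) != NULL; p += step)
        n++;
    return n;
}
```

For "012abcabcabc789" this gives 1 non-overlapping match and 2 overlapping ones. A left-to-right, non-overlapping replace therefore yields "012defdefabc789", not "012defdefdef789".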
  13. Actually, this fulfills the requirement pretty much in all cases. All
    cases in which the search string has no occurrences, that is.
    Here's my take at it:

    int findreplace(int searchstring) {
        if (searchstring < 2) {
            return 0;
        } else if (searchstring == 2) {
            return 1;
        } else {
            for (int i = 2; i < searchstring; i++) {
                if ((searchstring % i) == 0) {
                    return 0;
                }
            }
        }
        return 1;
    }


    Ah, the latest and, to this day, most brilliant stunt of our great
    cosmologists: the secret prediction.
    - Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$>
    Johannes Bauer, Jun 6, 2014
  14. Yes, thanks. The unmatched portion needs to be printed on failure.
    Ben Bacarisse, Jun 6, 2014
  15. For a few lines of code you get much greater flexibility in testing.
    Maybe your environment does not make command-line programs easy to run?
    Why would you ever use file names? It's inherently a stream operation,
    so limiting it to named files just makes it clunky, in my view.
    Ben Bacarisse, Jun 6, 2014
  16. DFS

    BartC Guest

    I don't understand streams. I like things to have a beginning and an end,
    and a whole file is a well-understood chunk of data to work on, if it's not
    possible to just work on strings (which would be my approach; then it would
    be independent from files *and* streams).

    Imagine if you were creating some string functions where strings didn't have
    a well-defined end and could conceivably have an unlimited length...
    BartC, Jun 6, 2014
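
A rough sketch of what a strings-only version might look like, independent of files and streams. The name replace_all, its signature, and the two-pass allocation strategy are assumptions made for illustration; nothing like it was posted in the thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string-only version: returns a newly allocated copy of s
   with every non-overlapping occurrence of find replaced by repl
   (scanning left to right), or NULL if allocation fails. If count is not
   NULL it receives the number of replacements. The caller frees the
   result. */
char *replace_all(const char *s, const char *find, const char *repl,
                  size_t *count)
{
    size_t flen = strlen(find), rlen = strlen(repl), n = 0;
    const char *p, *hit;
    char *out, *dst;

    assert(flen > 0);
    /* First pass: count matches so the result size is known up front. */
    for (p = s; (hit = strstr(p, find)) != NULL; p = hit + flen)
        n++;

    out = malloc(strlen(s) - n * flen + n * rlen + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the text before each match, then the replacement. */
    dst = out;
    for (p = s; (hit = strstr(p, find)) != NULL; p = hit + flen) {
        memcpy(dst, p, (size_t)(hit - p));
        dst += hit - p;
        memcpy(dst, repl, rlen);
        dst += rlen;
    }
    strcpy(dst, p);                      /* tail after the last match */
    if (count != NULL)
        *count = n;
    return out;
}
```

The trade-off is that the whole input has to be in memory first, which is the restriction the stream version avoids.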
  17. A stream can have an end. And not all named files do. I don't think
    this is a useful distinction.

    Anyway, if you don't like streams, I see no reason to make you like
    them. I like my way and I imagine you are happy with yours.
    That does not sound like what I meant by "this is a stream operation".
    It certainly does not apply in the case being discussed.
    Ben Bacarisse, Jun 6, 2014
  18. DFS

    Chad Guest

    Ack, my browser refuses to include the quoted text in my reply. Anyhow, having strings hardwired into a function could, in some cases, change or break that function. The one example that comes to mind is a function that adds text to some kind of graphic. If the string name was hardwired in, the computer could possibly interpret that string as a single point on the plane. That could be bad since a piece of text can sometimes span across a line.
    Chad, Jun 7, 2014
  19. DFS

    Stefan Ram Guest

    Typically, a newsreader is used for Usenet access.
    How is the »hello, world« program written, then,
    without a »string hardwired« into the main function?
    drawText( canvas, "hello, world" )

    risks that the computer can interpret »hello, world«
    as »a single point on the plane«?
    Stefan Ram, Jun 7, 2014
  20. DFS

    Chad Guest

    From my limited experience, the problem comes if you view the function as performing some kind of action on a string. From this vantage point, the function would move the string along some line as it executes. Once the function is done, the string would stop at some point. Now if you would let s represent some string, the same thing would happen. However, s would be the entire length of the traversal.
    Chad, Jun 7, 2014