lseek and write question

Discussion in 'C++' started by golden, Nov 16, 2007.

  1. golden

    golden Guest

    Hello,

    I am going to ask a question regarding
    write and lseek. I will provide code at the end of this, but first
    some background.


    I am trying to identify the cause of some latency in writing to disk.
    My user claims that performance is much slower on SAN than on local
    disk. The developer provided me a C++ program that performed a write
    test that confirmed his suspicions. I modified the code to better
    fit my needs, which it does now.


    What I found during the test is that fsync is an expensive operation
    and will block waiting for a confirmation from the disk device. What
    I am trying to understand is the lseek function.


    From what I read, it simply moves the pointer in the file descriptor
    as directed. When I use this lseek function, writes are faster.


    My question is why? When I use the write command, does the pointer
    get reset and on each write, it will search for EOF?


    This is running on a Linux system.


    Thanks in advance:


    #include <sys/types.h>
    #include <sys/time.h>


    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>


    int main(int argc, char **argv)
    {
        struct timeval start, end;
        int ch, fd;
        const char *fname = NULL;
        int bytes = 0;          // size of each write, set with -b
        int ops = 0;            // number of writes, set with -o
        bool dosync = true;     // fsync after every write unless -s is given
        bool doSeek = false;    // seek back to offset 0 after every write if -l is given

        while ((ch = getopt(argc, argv, "b:o:f:sl")) != -1)
            switch (ch) {
            case 'b':
                bytes = atoi(optarg);
                break;
            case 'o':
                ops = atoi(optarg);
                break;
            case 'f':
                fname = optarg;
                break;
            case 's':
                dosync = false;
                break;
            case 'l':
                doSeek = true;
                break;
            }
        argc -= optind;
        argv += optind;

        // Without -f, take the first non-option argument as the file name.
        if (fname == NULL && argc > 0)
            fname = argv[0];
        if (fname == NULL || bytes <= 0 || ops <= 0) {
            fprintf(stderr, "usage: [-s] [-l] -b bytes -o ops -f file\n");
            return 1;
        }

        // Allocate the buffer after -b has been parsed so it has the requested size.
        char *buf = new char[bytes];

        gettimeofday(&start, NULL);

        memset(buf, 0, bytes);
        printf("Processing %d bytes with %d Operations of fsync :\t",
               bytes, dosync ? ops : 1);

        // unlink(fname);
        if ((fd = open(fname, O_RDWR | O_CREAT, 0666)) == -1)
        {
            printf("ERROR: failed to open %s: %s\n", fname, strerror(errno));
            return 1;
        }

        for (int idx = 0; idx < ops; idx++)
        {
            if (write(fd, buf, bytes) != bytes)
            {
                perror("write");
                exit(1);
            }

            if (dosync) {
                if (fsync(fd) != 0)
                {
                    perror("fsync");
                    exit(1);
                }
            }
            if (doSeek)
            {
                if (lseek(fd, (off_t)0, SEEK_SET) == -1)
                {
                    printf("lseek: %s\n", strerror(errno));
                    exit(1);
                }
            }
        }

        // One last sync
        if (fsync(fd) != 0)
        {
            perror("fsync");
            exit(1);
        }
        gettimeofday(&end, NULL);

        long totalSec = end.tv_sec - start.tv_sec;
        long totalUSec = end.tv_usec - start.tv_usec;
        if (totalUSec < 0) {    // borrow one second
            totalUSec += 1000000;
            totalSec--;
        }

        long t = totalSec;
        printf("%ld Hours ", t / (60 * 60));
        t %= (60 * 60);
        printf("%ld Minutes ", t / 60);
        t %= 60;
        printf("%ld.%06ld Seconds ", t, totalUSec);
        printf("%ld.%06ld Seconds\n", totalSec, totalUSec);

        close(fd);
        delete[] buf;
        return 0;
    }
     
    golden, Nov 16, 2007
    #1

  2. golden wrote:
    > I am going to ask a question regarding
    > write and lseek. I will provide code at the end of this, but first
    > some background.
    > [..]
    > What I found during the test is that fsync is an expensive operation
    > and will block waiting for a confirmation from the disk device. What
    > I am trying to understand is the lseek function.
    >
    >
    > From what I read, it simply moves the pointer in the file descriptor
    > as directed. When I use this lseek function, writes are faster.
    >
    >
    > My question is why? When I use the write command, does the pointer
    > get reset and on each write, it will search for EOF?
    >
    >
    > This is running on a Linux system.
    >
    > [..]


    First a nit pick: 'write' is not a command. It's a function. IIRC,
    it's a POSIX function, which isn't really on topic here. Now that's
    out of the way, second, in C++ we'd use the 'fwrite' function (from
    the C Standard Library). Have you tried switching to using 'fwrite'
    instead?

    And the last point: you might want to consider asking in the Linux
    newsgroup since I/O performance depends greatly on the platform, and
    there is no real explanation from the language point of view why
    'write' is so slow without 'lseek'.

    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
     
    Victor Bazarov, Nov 16, 2007
    #2

  3. golden

    golden Guest

    On Nov 16, 3:45 pm, "Victor Bazarov" wrote:
    > golden wrote:
    > > I am going to ask a question regarding
    > > write and lseek. I will provide code at the end of this, but first
    > > some background.
    > > [..]
    > > What I found during the test is that fsync is an expensive operation
    > > and will block waiting for a confirmation from the disk device. What
    > > I am trying to understand is the lseek function.

    >
    > > From what I read, it simply moves the pointer in the file descriptor
    > > as directed. When I use this lseek function, writes are faster.

    >
    > > My question is why? When I use the write command, does the pointer
    > > get reset and on each write, it will search for EOF?

    >
    > > This is running on a Linux system.

    >
    > > [..]

    >
    > First a nit pick: 'write' is not a command. It's a function. IIRC,
    > it's a POSIX function, which isn't really on topic here. Now that's
    > out of the way, second, in C++ we'd use the 'fwrite' function (from
    > the C Standard Library). Have you tried switching to using 'fwrite'
    > instead?
    >
    > And the last point: you might want to consider asking in the Linux
    > newsgroup since I/O performance depends greatly on the platform, and
    > there is no real explanation from the language point of view why
    > 'write' is so slow without 'lseek'.
    >
    > V
    > --
    > Please remove capital 'A's when replying by e-mail
    > I do not respond to top-posted replies, please don't ask


    Thanks... the nitpicking will make me better, so I welcome that. I am
    so used to programming in Perl that "command" comes out automatically.
    I will try fwrite and visit the Linux group. Thanks for the reply.
     
    golden, Nov 17, 2007
    #3
  4. James Kanze

    James Kanze Guest

    On Nov 16, 9:45 pm, "Victor Bazarov" wrote:
    > golden wrote:
    > > I am going to ask a question regarding
    > > write and lseek. I will provide code at the end of this, but first
    > > some background.
    > > [..]
    > > What I found during the test is that fsync is an expensive operation
    > > and will block waiting for a confirmation from the disk device. What
    > > I am trying to understand is the lseek function.


    > > From what I read, it simply moves the pointer in the file descriptor
    > > as directed. When I use this lseek function, writes are faster.


    > > My question is why? When I use the write command, does the pointer
    > > get reset and on each write, it will search for EOF?


    > > This is running on a Linux system.
    > > [..]


    > First a nit pick: 'write' is not a command. It's a function. IIRC,
    > it's a POSIX function, which isn't really on topic here. Now that's
    > out of the way, second, in C++ we'd use the 'fwrite' function (from
    > the C Standard Library). Have you tried switching to using 'fwrite'
    > instead?


    It won't work. He's using a functionality (synchronized
    writing) which isn't available in the standard library. The
    most you can ever guarantee with the standard library (either
    FILE* or iostream) is that the data has been transferred to the
    OS; his call to fsync guarantees that it has been physically
    written on the medium.
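
    To make that distinction concrete, here is a minimal sketch, assuming
    a POSIX system. It shows that even when the data goes through a FILE*,
    the standard library call (fflush) only hands the data to the OS, and
    the POSIX calls fileno and fsync are still needed to wait for the
    device. The name durable_fwrite is a hypothetical helper, not part of
    the posted program, and how strong the fsync guarantee really is
    depends on the filesystem and the storage hardware.

```cpp
#include <cassert>
#include <cstdio>
#include <unistd.h>

// Hypothetical helper: write len bytes through a stdio stream, then force
// them out to the storage device. Returns true on success.
bool durable_fwrite(FILE *fp, const void *data, size_t len)
{
    assert(fp != nullptr);

    // fwrite may only copy the data into the stream's user-space buffer.
    if (fwrite(data, 1, len, fp) != len)
        return false;

    // fflush hands the buffered data to the OS. This is as far as the
    // standard library goes.
    if (fflush(fp) != 0)
        return false;

    // fsync (POSIX, not standard C or C++) blocks until the OS reports
    // that the data has reached the device.
    return fsync(fileno(fp)) == 0;
}
```

    Calling this once per record would be expected to cost about as much
    as the write plus fsync pair in the posted loop, since the blocking
    step is the same fsync.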

    > And the last point: you might want to consider asking in the Linux
    > newsgroup since I/O performance depends greatly on the platform, and
    > there is no real explanation from the language point of view why
    > 'write' is so slow without 'lseek'.


    With regards to his particular question, the answer seems
    obvious (and will probably be the same on any system, anytime he
    doesn't use synchronized writes): because of the seek, he's
    always writing the data at the same place on the disk, which
    means that the system can always reuse the same sector cache,
    and never has to go to disk. Without the seek, he's writing a
    fairly large file, and the system probably won't keep all of the
    cached data around, but will write to disk.

    Is it really surprising that writing a file with one record is
    significantly faster than writing one with ops records (where
    ops is probably fairly large)?
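
    The difference in the amount of data is easy to check. The following
    is a minimal sketch, assuming a POSIX system with a writable /tmp. It
    performs the same write loop with and without the seek back to offset
    0 and reports the resulting file size. The name written_size and the
    scratch file path are assumptions for illustration, not part of the
    posted program.

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: write `ops` records of `bytes` zero bytes to a
// scratch file and return the final file size, or -1 on error. When
// rewind_each_time is true, the file offset is reset to 0 after every
// write, as the -l option of the posted program does.
off_t written_size(int ops, int bytes, bool rewind_each_time)
{
    assert(ops > 0 && bytes > 0);

    char path[] = "/tmp/lseek_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);  // the file is removed once fd is closed

    std::vector<char> buf(bytes, 0);
    bool ok = true;
    for (int i = 0; i < ops && ok; ++i) {
        // write() continues at the current offset and advances it by
        // itself; it does not search for the end of the file.
        if (write(fd, buf.data(), bytes) != bytes)
            ok = false;
        else if (rewind_each_time && lseek(fd, 0, SEEK_SET) == -1)
            ok = false;
    }

    off_t result = -1;
    struct stat st;
    if (ok && fstat(fd, &st) == 0)
        result = st.st_size;
    close(fd);
    return result;
}
```

    With 100 writes of 4096 bytes, written_size(100, 4096, false) gives a
    file of 409600 bytes, while written_size(100, 4096, true) gives 4096
    bytes, because every write lands on the same region of the file.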

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Nov 17, 2007
    #4