Scanf and number formats

Discussion in 'C Programming' started by Vig, Mar 14, 2005.

  1. Vig

    Vig Guest

    Is scanf or any other function capable of reading numbers in the format
    1.2345d-13 where 'd' serves the same role as 'e' usually does in scientific
    notation? This operation is iterated through several times and we really
    would like not to have to read it as a string first or anything like that.

    Thanks
    --
    Vig
     
    Vig, Mar 14, 2005
    #1

  2. In article <d14905$n5s$>,
    Vig <> wrote:
    :Is scanf or any other function capable of reading numbers in the format
    :1.2345d-13 where 'd' serves the same role as 'e' usually does in scientific
    :notation?

    Not scanf(), and not any other standard C library routine that I can think of.

    :This operation is iterated through several times and we really
    :would like not to have to read it as a string first or anything like that.

    Surely the slow part of the operation would be the read from disk?
    Once the input line has been read from disk, it is going to be in
    memory, in which case you can replace the 'd' with 'e' and sscanf()
    the result. All it costs is examining the input line once or twice more
    in memory.
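
A minimal sketch of that replace-then-convert idea follows. The helper names d_to_e and parse_line are hypothetical, and the sketch uses strtod() rather than sscanf() because strtod() reports where each conversion stopped. It assumes the line contains nothing but numbers and separators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: rewrite Fortran-style 'd'/'D' exponent markers
 * to 'e' in place so the standard conversions can parse the line.
 * Assumption: the line holds only numbers and separators; a 'd' in any
 * other text (or in a hex float such as 0x1dp0) would be altered too. */
void d_to_e(char *line)
{
    assert(line != NULL);
    for (char *p = line; *p != '\0'; ++p) {
        if (*p == 'd' || *p == 'D')
            *p = 'e';
    }
}

/* Hypothetical helper: convert up to max numbers from one line.
 * The line is modified in place. Returns how many values were stored. */
size_t parse_line(char *line, double *out, size_t max)
{
    size_t n = 0;
    char *p = line;

    d_to_e(line);
    while (n < max) {
        char *end;
        double v = strtod(p, &end);
        if (end == p)          /* nothing more could be converted */
            break;
        out[n++] = v;
        p = end;
    }
    return n;
}
```

Each input line is touched once for the substitution and once for the conversion, both in memory.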

    If you get stuck, then provided the licensing terms are
    compatible with your legal situation, you could use a slightly
    modified version of glibc's scanf() function.
    --
    I was very young in those days, but I was also rather dim.
    -- Christopher Priest
     
    Walter Roberson, Mar 14, 2005
    #2

  3. Vig

    Vig Guest

    "Walter Roberson" <-cnrc.gc.ca> wrote in message
    news:d14aoa$nj9$...

    > Not scanf(), and not any other standard C library routine that I can think
    > of.

    Yes. Me neither. The only place I've seen the format used is by Fortran
    people. Can anyone confirm that C does not support reading numbers like
    this?


    > Surely the slow part of the operation would be the read from disk?
    > Once the input line has been read from disk, it is going to be in
    > memory, in which case you can replace the 'd' with 'e' and sscanf()
    > the result. All it costs is examining the input line once or twice more
    > in memory.


    Almost everything we read from files is numeric. Currently, each value is
    scanned with %lf unless otherwise specified. Handling the 'd' would mean
    nearly tripling our read time, even for good files that contain no d's.
    Also, I cannot directly replace a d with an e, because scientific notation
    is usually written as 0.123456e+01 while the d form is written as
    1.23456d0 (I am not completely sure, which is why I want C to handle it all
    for me :) )

    --
    Vig
     
    Vig, Mar 14, 2005
    #3
  4. In article <d14c1v$ogl$>,
    Vig <> wrote:
    :Also, I cannot directly replace an e with a d
    :because Scientific notation is usually written as 0.123456e+01 while d is
    :1.23456d0 (I am not completely sure, which is why I want C to handle it all
    :for me :) )

    On output, with C's %e format the argument

    is converted to the style [-]d.ddde±dd, where there is one digit
    before the decimal-point character (which is nonzero if the
    argument is nonzero)

    On input, a string of digits is accepted before the decimal point.
    The sign after the 'e' on input is optional. Thus, 0.123456e+01
    and 1.23456e0 are equivalent [except perhaps in the last bit or two
    when one is at the limit of precision.]
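
A small sketch to check this on a given implementation. The helper name read_double is hypothetical; it simply wraps the %lf conversion mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one double from text using the same %lf
 * conversion that scanf() applies. Returns 1 on success, 0 otherwise. */
int read_double(const char *text, double *out)
{
    assert(text != NULL && out != NULL);
    return sscanf(text, "%lf", out) == 1;
}
```

Passing "0.123456e+01", "1.23456e0" and "123456e-5" to read_double() should give the same value, up to the last-bit caveat above, so a d-to-e substitution does not need to renormalise the mantissa or add a sign to the exponent.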


    :Almost everything we read from files are numbers. Currently, it is scanned
    :with a %lf unless otherwise specified. If we are to handle the problem of
    :the 'd' that would mean almost multiplying our time for reading even good
    :files without d's by 3.

    No, that doesn't follow. The time required to read data from a file is
    largely dominated by the disk I/O rate... modified by operating
    system predictive reads, direct I/O or not, DMA block size, SCSI
    Command Tag Queuing (CTQ), ability of the OS to flip a DMA page
    directly into user space without having to copy it, and so on.

    When you use scanf(), then unless you have specifically turned off
    buffering, the C I/O library will usually [but not promised in the
    standard] fill a block from the I/O subsystem (or I/O cache),
    putting the block into your memory space; the block size is often
    8 KB. Once the block has been read in, scanf() is really just
    reading the data from memory, as if it were using getc() to fetch
    each character. [It has to be that way because you are allowed
    to mix getc() and scanf(), so they both have to read from the
    same input buffer, and it usually isn't worth duplicating the
    logic.] getc() is usually a macro that works with the FILE
    structure.
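
A rough sketch of both points: asking for a fully buffered stream with setvbuf(), and mixing getc() with fscanf() on the same stream. The names open_buffered and read_tagged, and the leading '#' tag, are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: open a file for reading and request a fully
 * buffered stream of the given size. setvbuf() has to be called before
 * any other operation on the stream. The library allocates the buffer
 * because NULL is passed. Returns NULL on failure. */
FILE *open_buffered(const char *path, size_t bufsize)
{
    FILE *fp = fopen(path, "r");
    if (fp != NULL && setvbuf(fp, NULL, _IOFBF, bufsize) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}

/* Hypothetical helper: consume an optional leading '#' with getc(),
 * then read a double with fscanf(). Both calls draw from the same
 * stream buffer, so they can be mixed freely. Returns 1 on success. */
int read_tagged(FILE *fp, double *out)
{
    int c;

    assert(fp != NULL && out != NULL);
    c = getc(fp);
    if (c == EOF)
        return 0;
    if (c != '#')
        ungetc(c, fp);         /* not a tag: hand the character back */
    return fscanf(fp, "%lf", out) == 1;
}
```

For example, open_buffered("data.txt", 8192) would request an 8 KB buffer; the file name is only a placeholder.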

    The slow part of reading is getting the data from disk to your
    program the first time; once there, you could examine the data a
    number of times before the next batch was ready. For example if your
    disk subsystem is SCSI-2 Fast, your disk might be limited to
    20 megabytes per second; on a 2 GHz CPU, you could run 100
    cycles per character and still keep up with the disk.
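
The arithmetic behind that figure, as a tiny sketch. The function name cycles_per_byte is hypothetical, and the model is deliberately crude: it ignores caching, overlap of I/O with computation, and anything else the OS does.

```c
#include <assert.h>

/* Hypothetical helper: CPU cycles available per input byte if the
 * program only has to keep pace with the disk transfer rate. */
double cycles_per_byte(double cpu_hz, double disk_bytes_per_sec)
{
    assert(disk_bytes_per_sec > 0.0);
    return cpu_hz / disk_bytes_per_sec;
}
```

With the numbers above, cycles_per_byte(2e9, 20e6) gives 100, against which a compare-and-store per character is small.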

    If you are sufficiently starved for CPU resources that
    doing a quick scan-and-replace over the buffer is slowing you
    down, then you should probably already have done a bunch
    of work on custom I/O (e.g., using "real time" partitions,
    using a raw partition instead of a block device, using
    scatter-gather buffering, using any available O/S
    facilities to bypass caching; ensuring your input data
    is always a multiple of an I/O page and always reading
    in full blocks instead of going through the per-character
    end-of-buffer checks imposed by getc().) You should not
    presume that a simple scan over the buffer will prove
    to be the limiting speed factor on your program: it
    probably won't.

    Speaking of limiting speed factors: consider having a
    pre-pass program that does nothing other than reading in
    the data and converting it to binary and storing the
    binary as a file with fixed length records. Such a program
    could probably run asynchronously with whatever calculation
    you are doing -- and if you are reading the input file
    multiple times in different programs, you will have
    saved having to convert the ASCII multiple times.
    You will get about a 3:1 compression ratio by converting
    the input to binary.
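
A minimal sketch of such a pre-pass. The names text_to_binary and read_record are hypothetical, and the sketch assumes the input already uses 'e' exponents (or has had the 'd' markers replaced). The records are raw doubles in the host's own format, so the binary file is only good on machines with the same representation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical pre-pass: read whitespace-separated numbers from `in`
 * and write each one to `out` as a raw double (one fixed-length record
 * per value, host byte order). `out` should be opened in binary mode.
 * Returns the number of records written, or -1 on a write error. */
long text_to_binary(FILE *in, FILE *out)
{
    double v;
    long count = 0;

    assert(in != NULL && out != NULL);
    while (fscanf(in, "%lf", &v) == 1) {
        if (fwrite(&v, sizeof v, 1, out) != 1)
            return -1;
        ++count;
    }
    return count;
}

/* Hypothetical reader: records are fixed length, so record `index`
 * can be reached with a single seek. Returns 1 on success. */
int read_record(FILE *bin, long index, double *out)
{
    assert(bin != NULL && out != NULL);
    if (fseek(bin, index * (long)sizeof *out, SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof *out, 1, bin) == 1;
}
```

Later programs then call read_record() or fread() whole arrays, and never pay for the text conversion again.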
    --
    Any sufficiently old bug becomes a feature.
     
    Walter Roberson, Mar 14, 2005
    #4
  5. Vig

    Vig Guest

    "Walter Roberson" <-cnrc.gc.ca> wrote in message
    news:d14hdf$2u8$...

    > :Also, I cannot directly replace an e with a d
    > :because Scientific notation is usually written as 0.123456e+01 while d is
    > :1.23456d0 (I am not completely sure, which is why I want C to handle it all
    > :for me :) )
    >
    > On output, C's e format,
    >
    > is converted to the style [-]d.ddde+dd, where there is one digit
    > before the decimal-point character (which is nonzero if the
    > argument is nonzero)


    Yes... It's pretty silly of me to grumble about convention if the numbers
    will still be read correctly after converting d's to e's.

    > On input, a string of digits is accepted before the decimal point.
    > The sign after the 'e' on input is optional. Thus, 0.123456e+01
    > and 1.23456e0 are equivalent [except perhaps in the last bit or two
    > when one is at the limit of precision.]
    >
    >
    > :Almost everything we read from files are numbers. Currently, it is scanned
    > :with a %lf unless otherwise specified. If we are to handle the problem of
    > :the 'd' that would mean almost multiplying our time for reading even good
    > :files without d's by 3.
    >
    > No, that doesn't follow. The time required to read data from a file is
    > largely dominated by the disk I/O rate... modified by operating
    > system predictive reads, direct I/O or not, DMA block size, SCSI
    > Command Tag Queuing (CTQ), ability of the OS to flip a DMA page
    > directly into user space without having to copy it, and so on.


    Ya... just thinking it out and talking to you has made me remove a lot of
    ridiculous code I had put in place. I think the d to e substitution will
    work, although it will have to be done carefully when I am more awake :)

    > When you use scanf(), then unless you have specifically turned off
    > buffering, the C I/O library will usually [but not promised in the
    > standard] fill a block from the I/O subsystem (or I/O cache),
    > putting the block into your memory space; the block size is often
    > 8 KB. Once the block has been read in, scanf() is really just
    > reading the data from memory, as if it were using getc() to fetch
    > each character. [It has to be that way because you are allowed
    > to mix getc() and scanf(), so they both have to read from the
    > same input buffer, and it usually isn't worth duplicating the
    > logic.] getc() is usually a macro that works with the FILE
    > structure.
    >
    > The slow part of reading is getting the data from disk to your
    > program the first time; once there, you could examine the data a
    > number of times before the next batch was ready. For example if your
    > disk subsystem is SCSI-2 Fast, your disk might be limited to
    > 20 megabytes per second; on a 2 GHz CPU, you could run 100
    > cycles per character and still keep up with the disk.
    >
    > If you are sufficiently starved for CPU resources that
    > doing a quick scan-and-replace over the buffer is slowing you
    > down, then you should probably already have done a bunch
    > of work on custom I/O (e.g., using "real time" partitions,
    > using a raw partition instead of a block device, using
    > scatter-gather buffering, using any available O/S
    > facilities to bypass caching; ensuring your input data
    > is always a multiple of an I/O page and always reading
    > in full blocks instead of going through the per-character
    > end-of-buffer checks imposed by getc().) You should not
    > presume that a simple scan over the buffer will prove
    > to be the limiting speed factor on your program: it
    > probably won't.
    >
    > Speaking of limiting speed factors: consider having a
    > pre-pass program that does nothing other than reading in
    > the data and converting it to binary and storing the
    > binary as a file with fixed length records. Such a program
    > could probably run asynchronously with whatever calculation
    > you are doing -- and if you are reading the input file
    > multiple times in different programs, you will have
    > saved having to convert the ASCII multiple times.
    > You will get about a 3:1 compression ratio by converting
    > the input to binary.


    That is actually a good idea, but I had to stamp it out of my head in about
    10 seconds because I am only fixing a bug right now and there doesn't seem
    to be a possibility of me being able to talk people into this :)

    > Any sufficiently old bug becomes a feature.


    And Vice Versa :)

    Thanks for all the help
    --
    Vig
     
    Vig, Mar 14, 2005
    #5
