Creating a file with $SIZE

Discussion in 'Python' started by k.i.n.g., Mar 12, 2008.

  1. k.i.n.g.

    k.i.n.g. Guest

    Hi All,

    I would like to create files of different sizes, taking the size as user
    input. I need to check the data transfer rates from one network to
    another. In order to do this I will have to create files of different
    sizes and test with them. I am new to Python.

    Thanks in advance.

    KK
    k.i.n.g., Mar 12, 2008
    #1

  2. Chris

    Chris Guest

    On Mar 12, 12:32 pm, "k.i.n.g." <> wrote:
    > Hi All,
    >
    > I would like create files of different size, taking size as user
    > input. I need to check the data transfer rates from one network to
    > another . In order to do this I will have to create files of diff size
    > and work out. I am new to Python
    >
    > Thanks in advance.
    >
    > KK


    Welcome to Python.

    If you just want to create files with random junk from the user input
    then maybe something along these lines would help:

    import sys, random

    def random_junk(number_of_characters):
        tmp = []
        while number_of_characters:
            tmp.append(random.randint(0, 127))
            number_of_characters -= 1
        return ''.join(map(str, tmp))

    if len(sys.argv) < 2:
        sys.exit('Usage: python %s <space separated file sizes>' % sys.argv[0])

    for each_argv in sys.argv[1:]:
        output_file = open(each_argv, 'wb').write(random_junk(int(each_argv)))
    Chris, Mar 12, 2008
    #2

  3. Chris

    Chris Guest

    On Mar 12, 12:52 pm, Chris <> wrote:
    > On Mar 12, 12:32 pm, "k.i.n.g." <> wrote:
    >
    > > Hi All,

    >
    > > I would like create files of different size, taking size as user
    > > input. I need to check the data transfer rates from one network to
    > > another . In order to do this I will have to create files of diff size
    > > and work out. I am new to Python

    >
    > > Thanks in advance.

    >
    > > KK

    >
    > Welcome to Python.
    >
    > If you just want to create files with random junk from the user input
    > then maybe something along these lines would help:
    >
    > import sys, random
    >
    > def random_junk(number_of_characters):
    >     tmp = []
    >     while number_of_characters:
    >         tmp.append(random.randint(0, 127))
    >         number_of_characters -= 1
    >     return ''.join(map(str,tmp))
    >
    > if len(sys.argv) < 2:
    >     sys.exit('Usage:python %s <space seperated
    > filesizes>'%sys.argv[0])
    >
    > for each_argv in sys.argv[1:]:
    >     output_file = open(each_argv,'wb').write(random_junk(each_argv))


    Sorry, meant

    def random_junk(number_of_characters):
        tmp = []
        while number_of_characters:
            tmp.append(chr(random.randint(0, 127)))
            number_of_characters -= 1
        return ''.join(tmp)
    Chris, Mar 12, 2008
    #3
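
    Putting the original script and the chr() correction together, a
    consolidated sketch in the thread's Python 2 style (the int() conversion
    of the command-line argument and the comments are additions; as in the
    post above, each file is simply named after its size):

    import sys, random

    def random_junk(number_of_characters):
        # One random ASCII character (0-127) per requested byte.
        tmp = []
        while number_of_characters:
            tmp.append(chr(random.randint(0, 127)))
            number_of_characters -= 1
        return ''.join(tmp)

    if len(sys.argv) < 2:
        sys.exit('Usage: python %s <space separated file sizes in bytes>' % sys.argv[0])

    for each_argv in sys.argv[1:]:
        size = int(each_argv)            # command-line arguments arrive as strings
        output_file = open(each_argv, 'wb')
        output_file.write(random_junk(size))
        output_file.close()

    Run as, say, python makefiles.py 1024 1048576 (the script name is
    arbitrary) to get a 1 KB and a 1 MB file named after their sizes.
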
  4. k.i.n.g.

    k.i.n.g. Guest

    I think I was not clear with my question, I am sorry. Here is the
    exact requirement.

    We use the dd command on Linux to create a file of the required size. In
    a similar way, on Windows I would like to use Python to take the size of
    the file (50MB, 1GB) as input from the user and create an uncompressed
    file of the size given by the user.

    ex: If the user input is 50M, the script should create a 50MB blank or
    empty file.

    Thank you
    k.i.n.g., Mar 12, 2008
    #4
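
    None of the snippets in the thread parse the unit suffix ('50M', '1GB')
    mentioned in this requirement; one possible sketch of such a conversion
    (the helper name, the accepted suffixes and the use of binary multiples
    are assumptions, not from the thread):

    def parse_size(text):
        """Convert '50M', '1GB' or a plain '4096' into a number of bytes."""
        multipliers = {'K': 1024, 'M': 1024 ** 2, 'G': 1024 ** 3}
        text = text.strip().upper()
        if text.endswith('B'):               # accept both '50M' and '50MB'
            text = text[:-1]
        if text and text[-1] in multipliers:
            return int(text[:-1]) * multipliers[text[-1]]
        return int(text)

    # parse_size('50M')  -> 52428800
    # parse_size('1GB')  -> 1073741824
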
  5. Robert Bossy

    Robert Bossy Guest

    k.i.n.g. wrote:
    > I think I am not clear with my question, I am sorry. Here goes the
    > exact requirement.
    >
    > We use dd command in Linux to create a file with of required size. In
    > similar way, on windows I would like to use python to take the size of
    > the file( 50MB, 1GB ) as input from user and create a uncompressed
    > file of the size given by the user.
    >
    > ex: If user input is 50M, script should create 50Mb of blank or empty
    > file
    >

    def make_blank_file(path, size):
        f = open(path, 'w')
        f.seek(size - 1)
        f.write('\0')
        f.close()

    I'm not sure the f.seek() trick will work on all platforms, so you can:

    def make_blank_file(path, size):
        f = open(path, 'w')
        f.write('\0' * size)
        f.close()

    Cheers,
    RB
    Robert Bossy, Mar 12, 2008
    #5
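
    A usage sketch of the seek approach, in the thread's Python 2 style, with
    a couple of cautious tweaks (opening in binary mode and guarding against a
    zero size are additions; as later posts in the thread point out, whether
    the skipped bytes are physically allocated depends on the filesystem):

    import sys

    def make_blank_file(path, size):
        f = open(path, 'wb')
        if size > 0:
            f.seek(size - 1)      # jump to the last byte...
            f.write('\0')         # ...and write it, so the file length becomes exactly size
        f.close()

    if __name__ == '__main__':
        # e.g.  python blankfile.py test.dat 52428800   (script name is arbitrary)
        make_blank_file(sys.argv[1], int(sys.argv[2]))
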
  6. Matt Nordhoff

    Robert Bossy wrote:
    > k.i.n.g. wrote:
    >> I think I am not clear with my question, I am sorry. Here goes the
    >> exact requirement.
    >>
    >> We use dd command in Linux to create a file with of required size. In
    >> similar way, on windows I would like to use python to take the size of
    >> the file( 50MB, 1GB ) as input from user and create a uncompressed
    >> file of the size given by the user.
    >>
    >> ex: If user input is 50M, script should create 50Mb of blank or empty
    >> file
    >>

    > def make_blank_file(path, size):
    > f = open(path, 'w')
    > f.seek(size - 1)
    > f.write('\0')
    > f.close()
    >
    > I'm not sure the f.seek() trick will work on all platforms, so you can:
    >
    > def make_blank_file(path, size):
    > f = open(path, 'w')
    > f.write('\0' * size)
    > f.close()


    I point out that a 1 GB string is probably not a good idea.

    def make_blank_file(path, size):
        chunksize = 10485760 # 10 MB
        chunk = '\0' * chunksize
        left = size
        fh = open(path, 'wb')
        while left > chunksize:
            fh.write(chunk)
            left -= chunksize
        if left > 0:
            fh.write('\0' * left)
        fh.close()

    > Cheers,
    > RB

    --
    Matt Nordhoff, Mar 12, 2008
    #6
  7. Robert Bossy

    Robert Bossy Guest

    Matt Nordhoff wrote:
    > Robert Bossy wrote:
    >
    >> k.i.n.g. wrote:
    >>
    >>> I think I am not clear with my question, I am sorry. Here goes the
    >>> exact requirement.
    >>>
    >>> We use dd command in Linux to create a file with of required size. In
    >>> similar way, on windows I would like to use python to take the size of
    >>> the file( 50MB, 1GB ) as input from user and create a uncompressed
    >>> file of the size given by the user.
    >>>
    >>> ex: If user input is 50M, script should create 50Mb of blank or empty
    >>> file
    >>>
    >>>

    >> def make_blank_file(path, size):
    >>     f = open(path, 'w')
    >>     f.seek(size - 1)
    >>     f.write('\0')
    >>     f.close()
    >>
    >> I'm not sure the f.seek() trick will work on all platforms, so you can:
    >>
    >> def make_blank_file(path, size):
    >>     f = open(path, 'w')
    >>     f.write('\0' * size)
    >>     f.close()
    >>

    >
    > I point out that a 1 GB string is probably not a good idea.
    >
    > def make_blank_file(path, size):
    >     chunksize = 10485760 # 10 MB
    >     chunk = '\0' * chunksize
    >     left = size
    >     fh = open(path, 'wb')
    >     while left > chunksize:
    >         fh.write(chunk)
    >         left -= chunksize
    >     if left > 0:
    >         fh.write('\0' * left)
    >     fh.close()
    >

    Indeed! Maybe the best choice for chunksize would be the file's buffer
    size... I won't search the docs for how to get the file's buffer size
    because I'm too cool to use that function and prefer the seek() option,
    since it's lightning fast regardless of the size of the file and takes
    next to zero memory.

    Cheers,
    RB
    Robert Bossy, Mar 12, 2008
    #7
  8. k.i.n.g.

    Guest

    On Mar 12, 2:44 pm, Robert Bossy <> wrote:
    > Matt Nordhoff wrote:
    > > Robert Bossy wrote:

    >
    > >> k.i.n.g. wrote:

    >
    > >>> I think I am not clear with my question, I am sorry. Here goes the
    > >>> exact requirement.

    >
    > >>> We use dd command in Linux to create a file with of required size. In
    > >>> similar way, on windows I would like to use python to take the size of
    > >>> the file( 50MB, 1GB ) as input from user and create a uncompressed
    > >>> file of the size given by the user.

    >
    > >>> ex: If user input is 50M, script should create 50Mb of blank or empty
    > >>> file

    >
    > >> def make_blank_file(path, size):
    > >>     f = open(path, 'w')
    > >>     f.seek(size - 1)
    > >>     f.write('\0')
    > >>     f.close()

    >
    > >> I'm not sure the f.seek() trick will work on all platforms, so you can:

    >
    > >> def make_blank_file(path, size):
    > >>     f = open(path, 'w')
    > >>     f.write('\0' * size)
    > >>     f.close()

    >
    > > I point out that a 1 GB string is probably not a good idea.

    >
    > > def make_blank_file(path, size):
    > >     chunksize = 10485760 # 10 MB
    > >     chunk = '\0' * chunksize
    > >     left = size
    > >     fh = open(path, 'wb')
    > >     while left > chunksize:
    > >         fh.write(chunk)
    > >         left -= chunksize
    > >     if left > 0:
    > >         fh.write('\0' * left)
    > >     fh.close()

    >
    > Indeed! Maybe the best choice for chunksize would be the file's buffer
    > size... I won't search the doc how to get the file's buffer size because
    > I'm too cool to use that function and prefer the seek() option since
    > it's lighning fast regardless the size of the file and it takes near to
    > zero memory.
    >
    > Cheers,
    > RB


    But what platforms does it work on / not work on?
    , Mar 12, 2008
    #8
  9. Marco Mariani

    Robert Bossy wrote:

    > Indeed! Maybe the best choice for chunksize would be the file's buffer
    > size... I won't search the doc how to get the file's buffer size because
    > I'm too cool to use that function and prefer the seek() option since
    > it's lighning fast regardless the size of the file and it takes near to
    > zero memory.


    And makes a hole in the file, I suppose, hence the fragmentation.

    The OP explicitly asked for an uncompressed file.
    Marco Mariani, Mar 12, 2008
    #9
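
    On POSIX systems one can check whether the seek trick produced a hole (a
    sparse file) by comparing the apparent size with the blocks actually
    allocated; a small sketch (st_blocks is a POSIX-only stat field,
    conventionally counted in 512-byte units, and the helper name is an
    assumption):

    import os

    def looks_sparse(path):
        st = os.stat(path)
        allocated = st.st_blocks * 512    # st_blocks is in 512-byte units on most systems
        return allocated < st.st_size

    # A file made with the seek trick on a typical Linux filesystem reports far
    # fewer allocated blocks than its apparent size; one filled with writes does not.
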
  10. k.i.n.g.

    Guest

    On Mar 12, 7:37 am, "k.i.n.g." <> wrote:
    > We use dd command in Linux to create a file with of required size.

    If you just want to get your work done, you might consider the cygwin
    dd command.
    Learning to write python is a worthwhile endeavour in any case.
    , Mar 13, 2008
    #10
  11. k.i.n.g.

    k.i.n.g. Guest

    On Mar 13, 8:07 am, "" <> wrote:
    > On Mar 12, 7:37 am, "k.i.n.g." <> wrote:
    > > We use dd command in Linux to create a file with of required size.
    >
    > If you just want to get your work done, you might consider the cygwin
    > dd command.
    > Learning to write python is a worthwhile endeavour in any case.


    While I just started learning programming/Python, I got this
    requirement at my workplace. I want to learn Python rather than just get
    things done.

    Thank you all for the solutions, I will try them and let you all know
    about my results.
    k.i.n.g., Mar 13, 2008
    #11
  12. Robert Bossy

    Robert Bossy Guest

    wrote:
    > On Mar 12, 2:44 pm, Robert Bossy <> wrote:
    >
    >> Matt Nordhoff wrote:
    >>
    >>> Robert Bossy wrote:
    >>>
    >>>> k.i.n.g. wrote:
    >>>>
    >>>>> I think I am not clear with my question, I am sorry. Here goes the
    >>>>> exact requirement.
    >>>>>
    >>>>> We use dd command in Linux to create a file with of required size. In
    >>>>> similar way, on windows I would like to use python to take the size of
    >>>>> the file( 50MB, 1GB ) as input from user and create a uncompressed
    >>>>> file of the size given by the user.
    >>>>>
    >>>>> ex: If user input is 50M, script should create 50Mb of blank or empty
    >>>>> file
    >>>>>
    >>>> def make_blank_file(path, size):
    >>>>     f = open(path, 'w')
    >>>>     f.seek(size - 1)
    >>>>     f.write('\0')
    >>>>     f.close()
    >>>>
    >>>> I'm not sure the f.seek() trick will work on all platforms, so you can:
    >>>>
    >>>> def make_blank_file(path, size):
    >>>>     f = open(path, 'w')
    >>>>     f.write('\0' * size)
    >>>>     f.close()
    >>>>
    >>> I point out that a 1 GB string is probably not a good idea.
    >>>
    >>> def make_blank_file(path, size):
    >>>     chunksize = 10485760 # 10 MB
    >>>     chunk = '\0' * chunksize
    >>>     left = size
    >>>     fh = open(path, 'wb')
    >>>     while left > chunksize:
    >>>         fh.write(chunk)
    >>>         left -= chunksize
    >>>     if left > 0:
    >>>         fh.write('\0' * left)
    >>>     fh.close()
    >>>

    >> Indeed! Maybe the best choice for chunksize would be the file's buffer
    >> size... I won't search the doc how to get the file's buffer size because
    >> I'm too cool to use that function and prefer the seek() option since
    >> it's lighning fast regardless the size of the file and it takes near to
    >> zero memory.
    >>
    >> Cheers,
    >> RB
    >>

    >
    > But what platforms does it work on / not work on?
    >

    Posix. It's been ages since I touched Windows, so I don't know if XP and
    Vista are posix or not.
    Though, as Marco Mariani mentioned, this may create a fragmented file.
    It may or may not be a hindrance depending on what you want to do with
    it, but the circumstances in which this is a problem are quite rare.

    RB
    Robert Bossy, Mar 13, 2008
    #12
  13. Bryan Olson

    Bryan Olson Guest

    k.i.n.g. wrote:
    > I think I am not clear with my question, I am sorry. Here goes the
    > exact requirement.
    >
    > We use dd command in Linux to create a file with of required size. In
    > similar way, on windows I would like to use python to take the size of
    > the file( 50MB, 1GB ) as input from user and create a uncompressed
    > file of the size given by the user.
    >
    > ex: If user input is 50M, script should create 50Mb of blank or empty
    > file


    You mean all zero bytes? Python cannot guarantee that the system
    will not compress such a file. For testing data transfer rates,
    random data is usually a better choice.


    --
    --Bryan
    Bryan Olson, Mar 13, 2008
    #13
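
    A chunked writer along the lines Bryan suggests, using incompressible
    data, might look like this sketch (os.urandom is an assumed choice of
    random source; it is available on both Linux and Windows, though it is
    slower than writing zeros):

    import os

    def make_random_file(path, size, chunksize=1048576):   # 1 MB chunks
        fh = open(path, 'wb')
        left = size
        while left > 0:
            n = min(chunksize, left)
            fh.write(os.urandom(n))    # random bytes defeat transparent compression
            left -= n
        fh.close()
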
  14. Bryan Olson

    Bryan Olson Guest

    Robert Bossy wrote:
    > wrote:
    >> Robert Bossy wrote:
    >>> Indeed! Maybe the best choice for chunksize would be the file's buffer
    >>> size...


    That bit strikes me as silly.

    >>> I won't search the doc how to get the file's buffer size because
    >>> I'm too cool to use that function and prefer the seek() option since
    >>> it's lighning fast regardless the size of the file and it takes near to
    >>> zero memory.

    >>
    >> But what platforms does it work on / not work on?
    >>

    > Posix.


    Posix is on the does-work side, just to be clear.

    http://www.opengroup.org/onlinepubs/000095399/functions/fseek.html

    > It's been ages since I touched Windows, so I don't know if XP and
    > Vista are posix or not.


    I tried on WinXP, with both an NTFS and FAT32 disk, and it worked
    on both.

    I found some Microsoft documentation noting: "On some
    platforms, seeking past the end of a file and then doing a write
    operation results in undefined behavior."

    http://msdn2.microsoft.com/en-us/library/system.io.filestream.seek(VS.71).aspx


    > Though, as Marco Mariani mentioned, this may create a fragmented file.
    > It may or may not be an hindrance depending on what you want to do with
    > it, but the circumstances in which this is a problem are quite rare.


    Writing zeros might also create a fragmented and/or compressed file.
    Using random data, which is contrary to the stated requirement but
    usually better for the stated application, will prevent compression but
    not prevent fragmentation.

    I'm not entirely clear on what the OP is doing. If he's testing
    network throughput just by creating this file on a remote server,
    the seek-way-past-end-then-write trick won't serve his purpose.
    Even if the filesystem has to write all the zeros, the protocols
    don't actually send those zeros.


    --
    --Bryan
    Bryan Olson, Mar 14, 2008
    #14
  15. Robert Bossy

    Robert Bossy Guest

    Bryan Olson wrote:
    > Robert Bossy wrote:
    >
    >> wrote:
    >>
    >>> Robert Bossy wrote:
    >>>
    >>>> Indeed! Maybe the best choice for chunksize would be the file's buffer
    >>>> size...
    >>>>

    >
    > That bit strikes me as silly.
    >

    The size of the chunk should be as small as possible in order to minimize
    memory consumption. However, below the buffer size you'll end up filling
    the buffer anyway before actually writing to disk.


    >> Though, as Marco Mariani mentioned, this may create a fragmented file.
    >> It may or may not be an hindrance depending on what you want to do with
    >> it, but the circumstances in which this is a problem are quite rare.
    >>

    >
    > Writing zeros might also create a fragmented and/or compressed file.
    > Using random data, which is contrary to the stated requirement but
    > usually better for stated application, will prevent compression but
    > not prevent fragmentation.
    >
    > I'm not entirely clear on what the OP is doing. If he's testing
    > network throughput just by creating this file on a remote server,
    > the seek-way-past-end-then-write trick won't serve his purpose.
    > Even if the filesystem has to write all the zeros, the protocols
    > don't actually send those zeros.

    Amen.

    Cheers,
    RB
    Robert Bossy, Mar 14, 2008
    #15
  16. Bryan Olson

    Bryan Olson Guest

    Robert Bossy wrote:
    > Bryan Olson wrote:
    >> Robert Bossy wrote:
    >>>> Robert Bossy wrote:
    >>>>> Indeed! Maybe the best choice for chunksize would be the file's buffer
    >>>>> size...

    >>
    >> That bit strikes me as silly.
    >>

    > The size of the chunk must be as little as possible in order to minimize
    > memory consumption. However below the buffer-size, you'll end up filling
    > the buffer anyway before actually writing on disk.


    First, which buffer? The file library's buffer is of trivial size,
    a few KB, and if we wanted to save even that we'd use os.open and
    have no such buffer at all. The OS may set up a file-specific
    buffer, but again those are small, and we could fill our file much
    faster with larger writes.

    Kernel buffers/pages are dynamically assigned on modern operating
    systems. There is no particular buffer size for the file if you mean
    the amount of kernel memory holding the written data. Some OS's
    do not buffer writes to disk files; the write doesn't return until
    the data goes to disk (though they may cache it for future reads).

    To fill the file fast, there's a large range of reasonable sizes
    for writing, but user-space buffer size - typically around 4K - is
    too small. 1 GB is often disastrously large, forcing paging to and
    from disk to access the memory. In this thread, Matt Nordhoff used
    10 MB; a fine size today, and probably for several years to come.

    If the OP is writing to a remote disk file to test network
    throughput, there's another size limit to consider. Network file-
    system protocols do not stream very large writes; the client has to
    break a large write into several smaller writes. NFS version 2 had
    a limit of 8 KB; version 3 removed the limit by allowing the server
    to tell the client the largest size it supports. (Version 4 is now
    out, in hundreds of pages of RFC that I hope to avoid reading.)


    --
    --Bryan
    Bryan Olson, Mar 14, 2008
    #16
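
    To see the effect of the write size in practice, one can time the same
    total amount of data written with different chunk sizes; a rough sketch
    in the thread's Python 2 style (the particular sizes, the temporary
    filename and the use of time.time() are arbitrary choices, not from the
    thread):

    import os, time

    def time_fill(path, total, chunksize):
        chunk = '\0' * chunksize
        start = time.time()
        fh = open(path, 'wb')
        written = 0
        while written < total:        # may overshoot by part of a chunk; fine for a comparison
            fh.write(chunk)
            written += chunksize
        fh.close()
        elapsed = time.time() - start
        os.remove(path)
        return elapsed

    # for size in (4096, 65536, 1048576, 10485760):
    #     print 'chunk %8d bytes: %.2f s' % (size, time_fill('tmp.dat', 100 * 1024 * 1024, size))
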
  17. k.i.n.g.

    Guest

    Quoting Bryan Olson <>:

    > Robert Bossy wrote:
    > > Bryan Olson wrote:
    > >> Robert Bossy wrote:
    > >>>> Robert Bossy wrote:
    > >>>>> Indeed! Maybe the best choice for chunksize would be the file's buffer
    > >>>>> size...
    > >>
    > >> That bit strikes me as silly.
    > >>

    > > The size of the chunk must be as little as possible in order to minimize
    > > memory consumption. However below the buffer-size, you'll end up filling
    > > the buffer anyway before actually writing on disk.

    >
    > First, which buffer? The file library's buffer is of trivial size,
    > a few KB, and if we wanted to save even that we'd use os.open and
    > have no such buffer at all. The OS may set up a file-specific
    > buffer, but again those are small, and we could fill our file much
    > faster with larger writes.
    >
    > Kernel buffers/pages are dynamically assigned on modern operating
    > systems. There is no particular buffer size for the file if you mean
    > the amount of kernel memory holding the written data. Some OS's
    > do not buffer writes to disk files; the write doesn't return until
    > the data goes to disk (though they may cache it for future reads).
    >
    > To fill the file fast, there's a large range of reasonable sizes
    > for writing, but user-space buffer size - typically around 4K - is
    > too small. 1 GB is often disastrously large, forcing paging to and
    > from disk to access the memory. In this thread, Matt Nordhoff used
    > 10MB; fine size today, and probably for several years to come.
    >
    > If the OP is writing to a remote disk file to test network
    > throughput, there's another size limit to consider. Network file-
    > system protocols do not steam very large writes; the client has to
    > break a large write into several smaller writes. NFS version 2 had
    > a limit of 8 KB; version 3 removed the limit by allowing the server
    > to tell the client the largest size it supports. (Version 4 is now
    > out, in hundreds of pages of RFC that I hope to avoid reading.)


    Wow. That's a lot of knowledge in a single post. Thanks for the information, Bryan.

    Cheers,
    RB
    , Mar 15, 2008
    #17
  18. Gabriel Genellina

    On Wed, 12 Mar 2008 09:37:58 -0200, k.i.n.g. <> wrote:

    > We use dd command in Linux to create a file with of required size. In
    > similar way, on windows I would like to use python to take the size of
    > the file( 50MB, 1GB ) as input from user and create a uncompressed
    > file of the size given by the user.


    The equivalent command on Windows would be:

    fsutil file createnew filename size

    --
    Gabriel Genellina
    Gabriel Genellina, Mar 16, 2008
    #18
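
    If shelling out to the Windows tool is acceptable, the fsutil command can
    be driven from Python with the subprocess module; a sketch (the wrapper
    name is an assumption, and fsutil typically needs administrative
    privileges):

    import subprocess

    def create_file_windows(path, size_in_bytes):
        # fsutil file createnew <filename> <size> creates a file of that size
        # whose contents read as zeros.
        subprocess.check_call(['fsutil', 'file', 'createnew', path, str(size_in_bytes)])

    # create_file_windows('test.dat', 52428800)   # 50 MB
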
