writing binary files

Discussion in 'Perl Misc' started by mermadak, Aug 4, 2005.

  1. mermadak

    mermadak Guest

    I am trying to convert an ANSI encoded ASCII text file to a binary file. I
    have looked at the b2a_qp( data[, quotetabs, istext, header]) function at
    http://aspn.activestate.com/ASPN/docs/ActivePython/2.3/python/lib/module-binascii.html
    but I am not sure if it will do what I need it to or how to set it up to
    take the data.

    Also, the part of this that really makes it an issue is that the data is
    coming off of a DOS machine (so endian is a concern here right?) and is a
    rather large text file with a ton of scientific data points (from 500k to
    5MB files).

    Any help would be greatly appreciated.

    Thanks,
    Dennis Aust
     
    mermadak, Aug 4, 2005
    #1

  2. mermadak

    John Bokma Guest

    "mermadak" <> wrote:

    > I am trying to convert an ANSI encoded ASCII text file to a binary
    > file. I have looked at the b2a_qp( data[, quotetabs, istext, header])
    > function at
    > http://aspn.activestate.com/ASPN/docs/ActivePython/2.3/python/lib/module-binascii.html
    > but I am not sure if it will do what I need it to or how to set it up
    > to take the data.


    Python... hmmm....

    > Also, the part of this that really makes it an issue is that the data
    > is coming off of a DOS machine (so endian is a concern here right?)
    > and is a rather large text file with a ton of scientific data points
    > (from 500k to 5MB files).


    So basically you want to convert numbers in a text file to some short
    binary notation?

    5MB... you are aware that the current year is 2005? :)

    --
    John Small Perl scripts: http://johnbokma.com/perl/
    Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
     
    John Bokma, Aug 4, 2005
    #2

  3. mermadak

    mermadak Guest

    "John Bokma" <> wrote in message
    news:Xns96A8BA198CF8castleamber@130.133.1.4...
    >> Also, the part of this that really makes it an issue is that the data
    >> is coming off of a DOS machine (so endian is a concern here right?)
    >> and is a rather large text file with a ton of scientific data points
    >> (from 500k to 5MB files).

    >
    > So basically you want to convert numbers in a text file to some short
    > binary notation?


    Exactly... any ideas?

    > 5MB... you are aware that the current year is 2005? :)


    Does that mean 5MB shouldn't be a problem???
    I originally tried writing a program to simply manipulate these files in my
    native programming languages of VB and C++, which would hang due to the size
    of these files. I finally found a PERL script that would handle parsing this
    much data.

    Dennis Aust
     
    mermadak, Aug 4, 2005
    #3
  4. mermadak

    John Bokma Guest

    "mermadak" <> wrote:

    >
    > "John Bokma" <> wrote in message
    > news:Xns96A8BA198CF8castleamber@130.133.1.4...
    >>> Also, the part of this that really makes it an issue is that the
    >>> data is coming off of a DOS machine (so endian is a concern here
    >>> right?) and is a rather large text file with a ton of scientific
    >>> data points (from 500k to 5MB files).

    >>
    >> So basically you want to convert numbers in a text file to some short
    >> binary notation?

    >
    > Exactly... any ideas?


    Python or Perl, since your post referred to Python :-D

    >> 5MB... you are aware that the current year is 2005? :)

    >
    > Does that mean 5MB shouldn't be a problem???


    Yup, your computer probably has 100 times as much memory.

    > I originally tried writing a program to simply manipulate these files
    > in my native programming languages of VB and C++ which would hang due
    > to the size of these files.


    If a C++ program would hang on 5 MB files, how can programs handle 10 MB
    MP3 files, or 700 MB movies?

    > I finally found a PERL script that would


    PERL is not an acronym :)

    > handle parsing this much data.


    Again: 5 MB is not much. My best guess is that you should rethink your
    algorithm(s).

     
    John Bokma, Aug 4, 2005
    #4
  5. mermadak

    mermadak Guest


    > Python or Perl, since your post referred to Python :-D


    Perl... preferably. My point there is that I am grasping at straws at this
    point...

    I looked at the pack function that was also recommended but I am not sure
    how to use it. Could anyone possibly give me an example? Mainly it looks as
    though my data can only contain strings, floating point decimals, or fixed
    point decimals but not a combination thereof. My data is ASCII format but
    would it be considered string data even though a data string may look like
    "2005-08-05, 13:36:06, 3236.453232, 11123.456, 0.0, 21, 224.332" for
    purposes of conversion to raw binary format? Also, the function says it
    calls for a TEMPLATE variable to be passed to it. Is this required? And this
    looks as though it would require a template character to be passed for every
    character in the file??? This seems like it will be very processor intensive
    as well as nearly impractical from a code writing perspective, as I would
    have to build an array of the TEMPLATE characters and then build a
    comparison function to check which character matches the TEMPLATE
    designation and then convert each character to binary at that point. Am I
    way off base here? Just seems like there would be a more practical way to
    achieve this.

    > Yup, your computer probably has 100 times as much memory.


    True... but what does that have to do with process intensity and the
    capabilities of the tools?

    > If a C++ program would hang on 5 MB files, how can programs handle 10 MB
    > MP3 files, or 700 MB movies?


    I agree with your point. Admittedly it was probably due to poor programming.
    I have only been coding for 3 years now and only part time at that. But I
    would be glad to send you the programs I was working on and see if you can
    make them work. ;-) Although, I did finally get that covered with Perl so it
    is not much of a concern at the moment.

    > Again: 5 MB is not much. My best guess is that you should rethink your
    > algorithm(s).


    Agreed, see above. Thank you for pointing out all of the obvious problems
    here. Perhaps you would be so kind as to make some suggestions on how I
    could actually accomplish this now?
     
    mermadak, Aug 6, 2005
    #5
  6. mermadak

    John Bokma Guest

    "mermadak" <> wrote:

    >
    >> Python or Perl, since your post referred to Python :-D

    >
    > Perl... preferably. My point there is that I am grasping at straws at
    > this point...
    >
    > I looked at the pack function that was also recommended but I am not
    > sure how to use it. Could anyone possibly give me an example? Mainly
    > it looks as though my data can only contain strings, floating point
    > decimals, or fixed point decimals but not a combination thereof. My
    > data is ASCII format but would it be considered string data even
    > though a data string may look like "2005-08-05, 13:36:06, 3236.453232,
    > 11123.456, 0.0, 21, 224.332" for purposes of conversion to raw binary
    > format?


    A better question is: is compression really required? What is causing
    the current problem(s)? I am sure it's not managing 5 MB of data, which
    is close to nothing on a recent PC.

    > Also, the function says it calls for a TEMPLATE variable to be
    > passed to it. Is this required?


    The whole idea of pack is that it packs data according to a TEMPLATE, so
    guess :)

    > And this looks as though it would
    > require a template character to be passed for every character in the
    > file???


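    More or less, yes: one template item per value, not per character, and a
    repeat count can cover a run of values of the same type.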

    > This seems like it will be very processor intensive as well as
    > nearly impractical from a code writing perspective, as I would have to
    > build an array of the TEMPLATE characters and then build a comparison
    > function to check which character matches the TEMPLATE designation and
    > then convert each character to binary at that point. Am I way off base
    > here? Just seems like there would be a more practical way to achieve
    > this.


    Yup, the most practical step is to find the real bottleneck of your
    problem. If you just require compression, use a compression solution.
    Pack indeed needs to "know" what is in the string you want to be packed.
    So if you want to pack a date followed by 3 floats on line 1 and 4
    floats and a fixed number on line 2, you have to provide the correct
    template to pack.
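
To make the template idea concrete, here is a minimal sketch in Python (the original post mentioned it), where `struct` plays roughly the role of Perl's `pack`. The record layout is an assumption based only on the sample line quoted above: a 10-byte date, an 8-byte time, three doubles, one 32-bit integer and one more double. The names `RECORD`, `pack_line` and `unpack_record` are made up for illustration.

```python
import struct

# Assumed layout for lines like the sample in the thread:
# date (10 bytes), time (8 bytes), three doubles, one 32-bit int, one double.
# "<" means little-endian with no padding, so the byte order of the output
# is fixed regardless of the machine that runs the script.
RECORD = struct.Struct("<10s8s3did")


def pack_line(line):
    """Convert one comma-separated text line into a fixed-size binary record."""
    date, time, a, b, c, n, d = [field.strip() for field in line.split(",")]
    return RECORD.pack(date.encode("ascii"), time.encode("ascii"),
                       float(a), float(b), float(c), int(n), float(d))


def unpack_record(blob):
    """Inverse of pack_line: turn one binary record back into Python values."""
    date, time, a, b, c, n, d = RECORD.unpack(blob)
    return (date.decode("ascii"), time.decode("ascii"), a, b, c, n, d)


sample = "2005-08-05, 13:36:06, 3236.453232, 11123.456, 0.0, 21, 224.332"
packed = pack_line(sample)
```

With this layout each record is 54 bytes, against roughly 60 characters for the text line, so the saving is small. One template describes the whole line, not one template character per character of input.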

    >> Yup, your computer probably has 100 times as much memory.

    >
    > True... but what does that have to do with process intensity and the
    > capabilities of the tools?


    That there shouldn't be any problem reading 5 MB of data into memory and
    using it.

    Regarding pack: if your lines don't follow a fixed format (e.g. a date
    followed by exactly 5 floats, and 2 fixed point numbers), you already have
    to do some parsing in your program. You can use the same parsing set up
    to compress/convert your data to binary. If you only want to use the
    output in Perl, you might consider writing out the compact version using
    Storable.
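
A rough sketch of reusing the parsing step to write a binary file, again in Python and not Perl. It assumes every non-empty line has the same seven fields as the sample line in this thread; `RECORD` and `convert_file` are hypothetical names, and real data would need error handling for malformed lines.

```python
import struct

# Assumed fixed layout: date, time, three doubles, one 32-bit int, one double.
RECORD = struct.Struct("<10s8s3did")


def convert_file(text_path, bin_path):
    """Read a comma-separated text file line by line and write packed records.

    Returns the number of records written. The file is streamed, so memory
    use does not depend on the file size.
    """
    count = 0
    # Text mode translates DOS line endings (CR LF) on reading;
    # "wb" writes the packed bytes unchanged.
    with open(text_path, "r", encoding="ascii") as src, \
            open(bin_path, "wb") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            fields = [f.strip() for f in line.split(",")]
            date = fields[0].encode("ascii")
            time = fields[1].encode("ascii")
            a, b, c = (float(x) for x in fields[2:5])
            dst.write(RECORD.pack(date, time, a, b, c,
                                  int(fields[5]), float(fields[6])))
            count += 1
    return count
```

Calling `convert_file("data.txt", "data.bin")` (both file names are placeholders) would produce a file whose size is the record count times `RECORD.size`.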

    If you have access to the program that creates those "big" files, and
    it's written in Perl, you just have to tweak the output part, since that
    part decides the structure of the output file. If it's not written in
    Perl, you have to create a compatible binary output format (which is not
    that hard). However, I recommend, especially if your files are around 5
    MB, to stick with ASCII. It's human readable :)
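
On the endianness question from the first post: a plain ASCII text file has no byte order, so it only matters once multi-byte binary values are written. A small illustration in Python, under the assumption that the program reading the binary file expects 32-bit integers; the variable names are made up for the example.

```python
import struct
import sys

value = 21

# The same integer written with an explicit byte order.
little = struct.pack("<i", value)   # least significant byte first (x86/DOS order)
big = struct.pack(">i", value)      # most significant byte first
native = struct.pack("=i", value)   # whatever order the current machine uses

# sys.byteorder reports the native order as "little" or "big".
matches_little = (sys.byteorder == "little") == (native == little)
```

Stating the byte order in the template, and not relying on the native one, keeps the output identical no matter which machine does the conversion.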

    >> If a C++ program would hang on 5 MB files, how can programs handle 10 MB
    >> MP3 files, or 700 MB movies?

    >
    > I agree with your point. Admittedly it was probably due to poor
    > programming. I have only been coding for 3 years now and only part
    > time at that. But I would be glad to send you the programs I was
    > working on and see if you can make them work. ;-)


    No problem. I do such things professionally (ie. for money ;-) ). It
    might save you a lot of time and trouble.

    > Although, I did finally
    > get that covered with Perl so it is not much of a concern at the moment.
    >
    >> Again: 5 MB is not much. My best guess is that you should rethink your
    >> algorithm(s).

    >
    > Agreed, see above. Thank you for pointing out all of the obvious
    > problems here. Perhaps you would be so kind as to make some
    > suggestions on how I could actually accomplish this now?


    If handling 5 MB of data is a problem for your program, why is it a
    problem?

     
    John Bokma, Aug 6, 2005
    #6