re-organize original text data then write in files

Discussion in 'Ruby' started by Junhui Liao, Jul 24, 2010.

  1. Junhui Liao

    Junhui Liao Guest

    Dear all,

    Recently I have had to do the following job:
    re-organize the original text data, then write it out to files.
    The original data looks like this (TSV format).

    First line: time_1.1 signal_1.1 time_2.1 signal_2.1 ...
    time_4096.1 signal_4096.1 (4096 pairs in total).
    Second line: time_1.2 signal_1.2 time_2.2 signal_2.2 ...
    time_4096.2 signal_4096.2 (4096 pairs in total).
    .......
    Last line (2048 lines in total): time_1.2048 signal_1.2048 time_2.2048
    signal_2.2048 ... time_4096.2048 signal_4096.2048 (4096 pairs in total).

    What I need to do is:

    Step 0, subtract time_n.1 from every time_n.*. That is to say,
    time_1.1 should be subtracted from time_1.1, time_1.2, ... time_1.2048.
    time_2.1 should be subtracted from time_2.1, time_2.2, ... time_2.2048.
    ....
    time_4096.1 should be subtracted from time_4096.1, time_4096.2, ...
    time_4096.2048.

    Step 1, collect all of the time_k.* and signal_k.* values from every line
    and save them in one file per k, let's say file_k.tsv.
    Namely, time_1.1, signal_1.1, time_1.2, signal_1.2 ......
    time_1.2048, signal_1.2048 should all be saved in file_1.tsv, where the
    first line is time_1.1 signal_1.1, the second line is time_1.2 signal_1.2
    ...... and the last line is time_1.2048 signal_1.2048.

    time_2.1, signal_2.1, time_2.2, signal_2.2 ......
    time_2.2048, signal_2.2048 should all be saved in file_2.tsv, where the
    first line is time_2.1 signal_2.1, the second line is time_2.2 signal_2.2
    ...... and the last line is time_2.2048 signal_2.2048.
    ......
    time_4096.1, signal_4096.1, time_4096.2, signal_4096.2
    ...... time_4096.2048, signal_4096.2048 should all be saved in
    file_4096.tsv, where the first line is time_4096.1 signal_4096.1, the
    second line is time_4096.2 signal_4096.2 ...... and the last line is
    time_4096.2048 signal_4096.2048.
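
    The two steps above can be sketched in Ruby on a tiny made-up data set
    (3 lines with 2 pairs each, instead of 2048 lines with 4096 pairs). This
    is only a sketch under assumptions: the helper name reorganize and the
    sample numbers are hypothetical, everything is held in memory, and the
    result is printed rather than written to file_k.tsv.

```ruby
# Hypothetical helper: rows is an array of lines, each line an array of
# floats laid out as [time_1, signal_1, time_2, signal_2, ...].
def reorganize(rows)
  pairs_per_row = rows.first.size / 2
  # Step 0 reference values: time_k.1 for every k, taken from the first line.
  base_times = Array.new(pairs_per_row) { |k| rows.first[2 * k] }

  # Step 1: table k collects [time_k.j - time_k.1, signal_k.j] for every line j.
  Array.new(pairs_per_row) do |k|
    rows.map { |row| [row[2 * k] - base_times[k], row[2 * k + 1]] }
  end
end

# Tiny made-up input: 3 lines, 2 (time, signal) pairs per line.
rows = [
  [10.0, 1.0, 20.0, 2.0],
  [11.0, 3.0, 22.0, 4.0],
  [12.0, 5.0, 24.0, 6.0]
]

tables = reorganize(rows)
tables.each_with_index do |table, k|
  puts "file_#{k + 1}.tsv"
  table.each { |time, signal| puts "#{time}\t#{signal}" }
end
```

    Each inner table corresponds to one output file, so the first line of
    every table always has time 0.0 after step 0.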


    I have already developed a script in C++, but it cost around 3 hours to
    deal with this job.
    I am completely new to Ruby and Perl, and know only a little Python.
    So, my questions are:

    1, how much time will it cost to do this job in Ruby?
    If the time is less than one and a half hours, then it is worth studying
    for me. I am already attracted by the beautiful Ruby : ) .
    2, Is there any similar example?

    Best regards!
    Junhui
    --
    Posted via http://www.ruby-forum.com/.
     
    Junhui Liao, Jul 24, 2010
    #1

  2. [Note: parts of this message were removed to make it a legal post.]

    On Sat, Jul 24, 2010 at 11:24 AM, Junhui Liao wrote:

    > ...
    > I have already developed a script in C++, but it cost around 3 hours to
    > deal with this job.
    > I am completely new to Ruby and Perl, and know only a little Python.
    > So, my questions are:
    >
    > 1, how much time will it cost to do this job in Ruby?
    > If the time is less than one and a half hours, then it is worth studying
    > for me. I am already attracted by the beautiful Ruby : ) .



    I'm neither a Ruby expert nor an expert programmer, but I have been using
    Ruby (for my own purposes) for over 8 years, and as a thought exercise I
    tried this (not actually running anything), and it took me about 20 to 30
    minutes, *provided* the computer memory is big enough to hold all the data.
    (I couldn't think of an easy way to do what I think you want to do without
    reading in all the data first, modifying it, then writing it out. That, or
    open 4096 files at the same time: neither way seems elegant.)

    And if you can do that in C++ then I'm sure you can probably do it in Ruby,
    Perl, Python, etc, etc. If you can program in C++ then I see no reason why
    you wouldn't be able to program in Ruby, Perl, Python, etc. (It might look
    like C++ rewritten in R, P, P, etc, but so what if you're trying things
    out.)

    Personally, if I didn't have much time, and I wanted to try something out in
    another computer language, I'd go with a language that I knew a little
    about, so in my case that would be Ruby, Pascal, Qbasic (!!!), and - in your
    case - maybe try something quick in Python. (But I'd also encourage you to
    look at Ruby sometime and try it.)

    Maybe it partly depends on what standard methods/functions are available:
    for example, in Ruby you can read a line from a file into a String, and then
    use a builtin method on the String to split it into an array of values using
    a specified delimiter, so in your case a space character? But I'd be very
    surprised if there weren't similar builtins in Perl and Python.
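
    As a small illustration of that builtin, here is a sketch using
    String#split. It assumes the fields are separated by tab characters, as
    the TSV description in the first post suggests; the sample line is made
    up.

```ruby
# Made-up line in the layout from the first post: tab-separated time/signal pairs.
line = "0.5\t1.25\t0.75\t2.5\n"

# String#split with an explicit tab delimiter; chomp drops the trailing newline.
fields = line.chomp.split("\t")

# Convert the strings to floats and group them into [time, signal] pairs.
pairs = fields.map { |field| Float(field) }.each_slice(2).to_a

p fields
p pairs
```

    Float(field) raises an error on malformed input, which may be preferable
    to silently reading a bad field as 0.0.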
     
    Colin Bartlett, Jul 25, 2010
    #2

  3. Junhui Liao

    Junhui Liao Guest


    > (I couldn't think of an easy way to do what I think you want to do
    > without reading in all the data first, modifying it, then writing it
    > out. That, or open 4096 files at the same time: neither way seems
    > elegant.)




    Actually, I developed two versions of the C++ script. One opens 4096
    files at the same time. This cost 3 hours. The other version saves all
    of the data in a big vector, then scans the vector to pick the right
    items to write to the files. This cost 2 hours and 45 minutes. :).
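
    For comparison, the "one open file per output" variant might look
    roughly like this in Ruby. It is a sketch under assumptions, not a
    translation of the C++ script: the helper name split_streaming and the
    sample data are hypothetical, the input is assumed to be tab-separated,
    and with 4096 outputs the number of simultaneously open files may exceed
    the per-process limit on some systems.

```ruby
require "tmpdir"

# Hypothetical helper: streams input_path line by line and appends each
# (time, signal) pair to its own output file in out_dir.
def split_streaming(input_path, out_dir)
  outputs = nil     # one open File per pair index, created on the first line
  base_times = nil  # time_k.1 for every k, taken from the first line

  File.foreach(input_path) do |line|
    values = line.chomp.split("\t").map { |field| Float(field) }
    pairs = values.each_slice(2).to_a

    base_times ||= pairs.map(&:first)
    outputs ||= pairs.each_index.map do |k|
      File.open(File.join(out_dir, "file_#{k + 1}.tsv"), "w")
    end

    pairs.each_with_index do |(time, signal), k|
      outputs[k].puts("#{time - base_times[k]}\t#{signal}")
    end
  end
ensure
  # Close every output file, even if a line failed to parse.
  outputs&.each(&:close)
end

# Tiny made-up input: 3 lines, 2 pairs per line.
Dir.mktmpdir do |dir|
  input = File.join(dir, "input.tsv")
  File.write(input, "10.0\t1.0\t20.0\t2.0\n11.0\t3.0\t22.0\t4.0\n12.0\t5.0\t24.0\t6.0\n")
  split_streaming(input, dir)
  puts File.read(File.join(dir, "file_2.tsv"))
end
```

    Only one input line is held in memory at a time, at the price of many
    small writes spread over many files.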


    >
    > Personally, if I didn't have much time, and I wanted to try something
    > out in another computer language, I'd go with a language that I knew a
    > little about, so in my case that would be Ruby, Pascal, Qbasic (!!!),
    > and - in your case - maybe try something quick in Python. (But I'd also
    > encourage you to look at Ruby sometime and try it.)



    Thanks a lot for your encouragement. I have already started to read
    about Ruby, since the language is very simple and beautiful, whether or
    not it works for my case (but I hope it does).



    >
    > Maybe it partly depends on what standard methods/functions are
    > available: for example, in Ruby you can read a line from a file into a
    > String, and then use a builtin method on the String to split it into an
    > array of values using a specified delimiter, so in your case a space
    > character?





    This is exactly the kind of comment I need: what knowledge is necessary
    and sufficient to do my job, and whether there are some special and
    powerful methods or approaches for this kind of task.
    Even better would be an example that is very close to my case. I can get
    the details by reading book(s) or googling.

    Anyway, thanks a lot for your reply!
    Best !
    Junhui
     
    Junhui Liao, Jul 25, 2010
    #3
  4. I'm putting this at the top of my post because I think the basic problem
    here may be intensive numeric calculations, and - even more so - disk (input
    and) output of about 16 Mi values x N bytes each, where N is 8 bytes (? for
    floating point numbers), so about 128 MiB in total, and other people will
    have a better knowledge of some possibly useful links.
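
    The size estimate can be checked with a few lines of Ruby. The line and
    pair counts come from the first post; the 8 bytes per value is an
    assumption (64-bit floating point), and the data stored as text may well
    take a different amount of space.

```ruby
lines = 2048            # lines in the input file (from the first post)
pairs_per_line = 4096   # (time, signal) pairs per line (from the first post)
bytes_per_value = 8     # assumption: each value held as a 64-bit float

values = lines * pairs_per_line * 2
total_bytes = values * bytes_per_value

puts "values: #{values}"
puts "MiB:    #{total_bytes / (1024 * 1024)}"
```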

    On Sun, Jul 25, 2010 at 11:15 PM, Junhui Liao wrote:

    > Actually, I developed two versions of C++ script.
    > One is opening 4096 files at the same time. This cost 3 hours.
    > Another version is saving all of the data in a big vector,
    > then scanning the vector to pick the right items to write
    > in files. This cost 2 hours and 45 minutes. :).
    >

    Sorry - in my post I misunderstood what you meant by "cost". I think it is
    (very?) unlikely that any Ruby (or Perl or Python, etc?) program will run
    faster than your C++ scripts. Where Ruby (or Python - I'm not so sure about
    Perl, I haven't used it) does have an advantage is that I think development
    may be quicker. So there are trade-offs. (Incidentally, I'm not an expert,
    but those timings suggest to me that the major processing cost may be in
    writing the results out to disk, so changing the language for all or part of
    the processing is unlikely to make a large difference?)

    But I'm open to correction: there are people who have used Ruby for fairly
    intensive large data sets processing, but my understanding is that they use
    a mixture of Ruby as "glue" with any intensive calculations in C, etc. For
    example, from some limited experience I have the speed of Ruby reading
    strings of bytes in from files is similar to the speed of Java or compiled
    Pascal, but for calculating CRCs of files the speed of pure Ruby calculating
    the CRCs once the bytes had been read in was much slower than Java or compiled
    Pascal: so I used Ruby (or rather JRuby) to read in the strings of bytes
    from the files, and then called Java code from Ruby to calculate the CRC
    from the bytes. Overall the speed of this was similar to a pure Java or pure
    compiled Pascal program.

    Piet Hut and Jun Makino have been using Ruby to model dense star clusters.
    (Note that this is something I know nothing about! I'm just intrigued by the
    underlying principle of using Ruby for intensive numerical calculations by
    developing in Ruby without worrying about speed by using smaller unrealistic
    models, and then using more realistic models by translating part (or all!)
    of the Ruby code to a faster language.)

    http://www.kira.org/index.php?option=com_content&task=view&id=124&Itemid=154
    ...MODEST is the new name for the Stellar Dynamics workshop. It stands for:
    MOdeling DEnse STellar systems
    ...
    The basic idea is to start a kind of N-body wikipedia, as a group's process ...
     
    Colin Bartlett, Jul 27, 2010
    #4
