New guy... Introduction and first question on some direction.

Discussion in 'Ruby' started by Oscar Gonzalez, Dec 8, 2005.

  1. Hi everyone. I'm new to these forums. I am a sysadmin in California and
    I'm learning Ruby. I've been working on automated web application
    testing using WATIR and I really like this language. It's the first one
    I've actually continued learning after the "hello world" example.

    It even made me want to write something on my own outside of work, and
    here are the basics of the project... I just don't even know where to
    start. I can't seem to search for the right terms, so I can't
    find modules that would help me out.

    -- sorry for the length of the post --

    I want to merge 2 data txt files together.
    - Each file has sections and subsections.
    - Each section and subsection has data that may or may not be in both
    files.
    - The data that is in both files may be slightly different and in this
    case I need it to be "magically merged together" if it's within a certain
    arbitrary range
    - The data that is in both files that is outside of the range I mention
    above needs to be considered "new" data for the resulting file...
    - I'm not good with regular expressions but I can learn if that is part of
    the solution.


    I was going to post 2 samples of the files I want to merge but the post
    would have been over 4 pages long! Do you guys think the "needs" I've
    posted are enough to point me in the right direction?

    --
    Posted via http://www.ruby-forum.com/.
     
    Oscar Gonzalez, Dec 8, 2005
    #1

  2. Oscar Gonzalez wrote:
    > Hi everyone. I'm new to these forums. I am a sysadmin in California and
    > I'm learning Ruby. I've been working on automated web application
    > testing using WATIR and I really like this language. It's the first one
    > I've actually continued learning after the "hello world" example.


    That's great news! Welcome aboard.

    > It even made me want to write something on my own outside of work, and
    > here are the basics of the project... I just don't even know where to
    > start. I can't seem to search for the right terms, so I can't
    > find modules that would help me out.
    >
    > -- sorry for the length of the post --
    >
    > I want to merge 2 data txt files together.
    > - Each file has sections and subsections.
    > - Each section and subsection has data that may or may not be in both
    > files.
    > - The data that is in both files may be slightly different and in this
    > case I need it to be "magically merged together" if it's within a certain
    > arbitrary range
    > - The data that is in both files that is outside of the range I
    > mention above needs to be considered "new" data for the resulting
    > file...
    > - I'm not good with regular expressions but I can learn if that
    > is part of the solution.
    >
    >
    > I was going to post 2 samples of the files I want to merge but the
    > post would have been over 4 pages long! Do you guys think the "needs"
    > I've posted are enough to point me in the right direction?


    It's difficult to give hints as we don't know much yet. Please post
    enough of those files that we can see how sections and subsections
    are delimited.

    From what I know so far: you might want to have classes Section and
    SubSection with obvious meaning. I don't know whether there's some kind
    of optimization possible with your data but in the worst case you'll need
    O(n*m) effort to compare all possible pairs of SubSections. Also the
    algorithm to decide whether they are close or not might be tricky.
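
    A minimal sketch of that idea, assuming each subsection boils down to a
    name plus one numeric value, and that "close" means the names match and
    the values differ by no more than a tolerance. The Struct fields, the
    close? and close_pairs helpers, and the 0.5 tolerance are made up for
    illustration, since the real file layout is not known at this point.

```ruby
# Hypothetical model: a Section holds SubSections, each with a name and one
# numeric value. The real data may need more fields.
SubSection = Struct.new(:name, :value)
Section    = Struct.new(:name, :subsections)

# Assumed rule: two subsections are "close" when their names match and their
# values differ by no more than the tolerance.
def close?(a, b, tolerance)
  a.name == b.name && (a.value - b.value).abs <= tolerance
end

# Worst case O(n*m): every subsection of one section is compared with every
# subsection of the other.
def close_pairs(left, right, tolerance)
  left.subsections.product(right.subsections).select { |a, b| close?(a, b, tolerance) }
end

first  = Section.new("s1", [SubSection.new("a", 10.0), SubSection.new("b", 20.0)])
second = Section.new("s1", [SubSection.new("a", 10.4), SubSection.new("b", 25.0)])

pairs = close_pairs(first, second, 0.5)
pairs.each { |a, b| puts "#{a.name}: #{a.value} ~ #{b.value}" }
```

    If the data can be indexed by name first (for example with a Hash), most
    of the pairwise comparisons can probably be skipped.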

    Kind regards

    robert
     
    Robert Klemme, Dec 8, 2005
    #2

  3. Oscar Gonzalez

    ako... Guest

    hello,

    1. what should happen if the files have different structure (different
    set of sections/subsections)?

    2. please define "data", "magically merged together", "range".

    konstantin
     
    ako..., Dec 8, 2005
    #3
  4. Re: New guy... Introduction and first question on some direct

    akonsu wrote:
    > hello,
    >
    > 1. what should happen if the files have different structure (different
    > set of sections/subsections)?
    >
    > 2. please define "data", "magically merged together", "range".


    Thanks for the responses guys... Here's a little more info based on
    them.

    The sections and subsections and data are defined by {} and [] and
    values... for example.

    DataGroup = {
    [1] = {
    [1] = {
    ["dataidentifier"] = {
    [1] = {
    ["type"] = 1,
    ["x"] = 45.5,
    ["count"] = 1,
    ["image"] = 4,
    ["y"] = 18.8,
    },
    [1] = {
    ["type"] = 1,
    ["x"] = 21.5,
    ["count"] = 5,
    ["image"] = 4,
    ["y"] = 31.8,
    },
    },
    [2] = {
    ["dataidentifier2"] = {
    [1] = {
    ["type"] = 1,
    ["x"] = 74.5,
    ["count"] = 1,
    ["image"] = 3,
    ["y"] = 11.8,
    },
    [1] = {
    ["type"] = 1,
    ["x"] = 27.5,
    ["count"] = 5,
    ["image"] = 3,
    ["y"] = 36.8,
    },
    },
    -----------------

    Disregard the last piece of my post where I said I wanted to merge data
    within a range. After reviewing what I want to do, this is no longer the
    case. I only want to merge data if the value is the same.

    Take for example the "x" and "y" values above... For the first section
    where there is "dataidentifier", both of my files have that section. If
    the "x" and "y" values are the same, I want to just add up the values
    under "count". If the "x" and "y" values are different, then I just
    need the resulting file to show the section with a separate
    dataidentifier entry for each x and y pair. Obviously these "x" and
    "y" values are coordinates.

    Maybe if I explain where I'm coming from it will make more sense. Say
    I'm looking for widgets at x and y, and I need to record how many I find
    and where.

    Scenario 1.
    I find 2 of the widgets at 15,18. That gets recorded into file 1.
    I find 1 of the widgets at the same location, 15,18. That gets recorded
    into file 2.

    Scenario 2.
    I find 2 widgets at location 21,30. This goes into File 1
    I find 2 contraptions at location 10,23. This goes into File 1
    I find 3 widgets at location 13,40. This goes into File 2

    On scenario 1, I want the resulting file to show that I found a total of
    3 items at location 15,18.

    On scenario 2, I want the resulting file to show that I found 2 widgets
    at location 21,30, 3 widgets at location 13,40 and 2 contraptions at
    location 10,23.

    Is what I want to do better explained now? There is obviously the
    complication (I think) of the coordinates having decimal points. I
    can't avoid this. I also can't avoid writing the 2 files; this is why I
    am working on this. I know it would be ideal to have just a single data
    file where everything gets written to but this is beyond my control for
    this project...
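
    The two scenarios above can be sketched as a hash lookup, assuming the
    records have already been read from both files. The Finding struct and
    merge_findings are hypothetical names used only for this illustration.
    Decimal coordinates work as hash keys here because the rule is exact
    equality, so 15.5 in one file only matches 15.5 in the other.

```ruby
# Hypothetical record: what was found, where, and how many.
Finding = Struct.new(:item, :x, :y, :quantity)

# Sum quantities for findings that share the same item and exactly the same
# coordinates; everything else is carried over as its own entry.
def merge_findings(*files)
  totals = Hash.new(0)
  files.flatten(1).each { |f| totals[[f.item, f.x, f.y]] += f.quantity }
  totals.map { |(item, x, y), quantity| Finding.new(item, x, y, quantity) }
end

file1 = [Finding.new("widget", 15.0, 18.0, 2),
         Finding.new("widget", 21.0, 30.0, 2),
         Finding.new("contraption", 10.0, 23.0, 2)]
file2 = [Finding.new("widget", 15.0, 18.0, 1),
         Finding.new("widget", 13.0, 40.0, 3)]

merged = merge_findings(file1, file2)
merged.each { |f| puts "#{f.quantity} #{f.item}(s) at #{f.x},#{f.y}" }
```

    Keying on the item name plus both coordinates keeps widgets and
    contraptions apart even if they turn up at the same spot.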

    I really appreciate how fast you guys replied and I hope I'm helping you
    help me. :)

     
    Oscar Gonzalez, Dec 8, 2005
    #4
  5. Oscar Gonzalez

    ako... Guest

    i would write a parser for these files. represent the contents as a set
    of hashes/arrays. if you have control over the format of the files, you
    might want to make them simpler so that you won't have to write a real
    parser.
     
    ako..., Dec 8, 2005
    #5
  6. Re: New guy... Introduction and first question on some direct

    akonsu wrote:
    > i would write a parser for these files. represent the contents as a set
    > of hashes/arrays. if you have control over the format of the files, you
    > might want to make them simpler so that you won't have to write a real
    > parser.


    Well I don't have control over the format of the files...

    I'll try to look into the parser thing... the thing is this is the first
    time I've done any real coding so I don't even know what to look for in the
    Ruby libraries to help me with this... What modules are out there that
    can help me, or what are some keywords I should use to search for this?
    And are there any Ruby parsers out there that I can look at? And this
    seems like a complex project... am I taking on too big of a project for
    a beginner?


     
    Oscar Gonzalez, Dec 8, 2005
    #6
  7. Re: New guy... Introduction and first question on some direct

    > Well I don't have control over the format of the files...

    If it helps at all... I believe the syntax of the files I'm parsing is
    Lua based.


     
    Oscar Gonzalez, Dec 8, 2005
    #7
  8. Oscar Gonzalez

    Steve Litt Guest

    On Thursday 08 December 2005 02:13 pm, Oscar Gonzalez wrote:
    > akonsu wrote:
    > > hello,
    > >
    > > 1. what should happen if the files have different structure (different
    > > set of sections/subsections)?
    > >
    > > 2. please define "data", "magically merged together", "range".

    >
    > Thanks for the responses guys... Here's a little more info based on
    > them.
    >
    > The sections and subsections and data are defined by {} and [] and
    > values... for example.
    >
    > DataGroup = {
    > [1] = {
    > [1] = {
    > ["dataidentifier"] = {
    > [1] = {
    > ["type"] = 1,
    > ["x"] = 45.5,
    > ["count"] = 1,
    > ["image"] = 4,
    > ["y"] = 18.8,
    > },
    > [1] = {
    > ["type"] = 1,
    > ["x"] = 21.5,
    > ["count"] = 5,
    > ["image"] = 4,
    > ["y"] = 31.8,
    > },
    > },
    > [2] = {
    > ["dataidentifier2"] = {
    > [1] = {
    > ["type"] = 1,
    > ["x"] = 74.5,
    > ["count"] = 1,
    > ["image"] = 3,
    > ["y"] = 11.8,
    > },
    > [1] = {
    > ["type"] = 1,
    > ["x"] = 27.5,
    > ["count"] = 5,
    > ["image"] = 3,
    > ["y"] = 36.8,
    > },
    > },



    If you can count on indentation like you have above, the easy way might be to
    run it through the OutlineParser object of Node.rb
    (http://www.troubleshooters.com/projects/Node.rb/index.htm). Once the data is
    in a Node tree instead of a file, you can use Walker objects and simple
    callbacks to massage the data and then output it in any form you'd like,
    including XML or SQL.

    If you cannot count on the indentation, you could remove all indentation with
    a simple sed script, then run a Ruby program to convert every opening brace
    to a new level of indentation and convert every closing brace to a previous
    level of indentation, and then run that conversion through Node.rb's parser.
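
    A rough sketch of such a brace-to-indent step, done entirely in Ruby
    instead of sed plus Ruby. It assumes every opening brace ends its line
    and every closing brace starts its line, as in the sample above. The
    name reindent_by_braces is made up for this example and is not part of
    Node.rb.

```ruby
# Strip whatever indentation is there, then re-indent every line with one
# tab per open brace. Lines starting with "}" step back one level first.
def reindent_by_braces(text)
  depth = 0
  lines = text.each_line.map(&:strip).reject(&:empty?).map do |line|
    depth -= 1 if line.start_with?("}") && depth > 0
    indented = ("\t" * depth) + line
    depth += 1 if line.end_with?("{")
    indented
  end
  lines.join("\n")
end

sample = <<~LUA
  DataGroup = {
  [1] = {
  ["x"] = 45.5,
  ["y"] = 18.8,
  },
  }
LUA

puts reindent_by_braces(sample)
```

    The braces themselves are left in place, so the output is still the
    same data, just with indentation that mirrors the nesting.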

    SteveT

    Steve Litt
    http://www.troubleshooters.com
     
    Steve Litt, Dec 8, 2005
    #8
  9. Oscar Gonzalez

    ako... Guest

    a parser is a big project for a beginner. parsing is a process of
    translating a text stream into a memory representation of the contents
    of the stream. to do that, you will have to be able to split the stream
    into chunks called tokens, and then check if the combination of these
    tokens is valid, that is if this combination corresponds to the so
    called grammar for your language. there is a theory behind all that. if
    your file was simpler and for example had each line precisely
    identifying a data item like this for example:

    /1/1/dataidentifier/1/type = 1

    then you could just scan the file line by line and get all you need.
    there are tools used to generate parsers. the original ones are called
    lex and yacc. lex would split your stream into tokens, and yacc would
    check if the resulting sequence of tokens satisfies the grammar. i am
    not sure if there are parser generators for ruby, although it is
    comparatively easy to write them because they are based on a sound
    theory.
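
    to make the token idea concrete, here is a small sketch of the first
    half (splitting text into tokens) using StringScanner from ruby's
    standard library. the token names (:string, :number, :name, :punct) and
    the tokenize method are made up for this example, and it only covers the
    characters seen in the sample data, not the full lua syntax.

```ruby
require 'strscan'

# Split Lua-style table text into [type, text] pairs. Checking whether the
# token sequence is valid (the grammar part) is not done here.
def tokenize(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next                                   # whitespace separates tokens
    elsif (t = scanner.scan(/"[^"]*"/))
      tokens << [:string, t[1..-2]]          # drop the surrounding quotes
    elsif (t = scanner.scan(/-?\d+(?:\.\d+)?/))
      tokens << [:number, t]
    elsif (t = scanner.scan(/[A-Za-z_]\w*/))
      tokens << [:name, t]
    elsif (t = scanner.scan(/[{}\[\]=,]/))
      tokens << [:punct, t]
    else
      raise "unexpected character at offset #{scanner.pos}: #{scanner.getch.inspect}"
    end
  end
  tokens
end

p tokenize('["x"] = 45.5,')
```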

    hope this helps.
    konstantin
     
    ako..., Dec 8, 2005
    #9
  10. Oscar Gonzalez

    Bill Guindon Guest

    On 12/8/05, Oscar Gonzalez <> wrote:
    > akonsu wrote:


    How accurate is this example? Just wondering if the mockup has
    copy/paste errors. More below...

    > DataGroup = {
    > [1] = {
    > [1] = {
    > ["dataidentifier"] = {
    > [1] = {
    > ["type"] = 1,
    > ["x"] = 45.5,
    > ["count"] = 1,
    > ["image"] = 4,
    > ["y"] = 18.8,
    > },
    > [1] = {


    Does the [1] really repeat here, or should this be [2] (or some other number)?

    > ["type"] = 1,
    > ["x"] = 21.5,
    > ["count"] = 5,
    > ["image"] = 4,
    > ["y"] = 31.8,
    > },


    Should there be a '}' here to close off the 'dataidentifier'?

    > },
    > [2] = {
    > ["dataidentifier2"] = {
    > [1] = {
    > ["type"] = 1,
    > ["x"] = 74.5,
    > ["count"] = 1,
    > ["image"] = 3,
    > ["y"] = 11.8,
    > },
    > [1] = {
    > ["type"] = 1,
    > ["x"] = 27.5,
    > ["count"] = 5,
    > ["image"] = 3,
    > ["y"] = 36.8,
    > },
    > },


    I'm assuming any missing '}' here would be at the end of the file.

    If my guesses are right, it wouldn't be too tough to convert this
    quickly with something along the lines of:

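    require 'pp'

    # Read the whole dump, drop the leading "DataGroup = ", and turn Lua keys
    # such as ["x"] = or [1] = into Ruby hash keys ("x" =>, "1" =>).
    text = File.read('some.log')
    text.gsub!(/DataGroup = /, '')
    text.gsub!(/\["?(.*?)"?\] =/, '"\1" =>')

    # eval runs the text as Ruby code, so only use it on files you trust.
    datagroup = eval(text)

    pp datagroup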
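
    A variation on the same idea that avoids eval, offered only as a sketch:
    rewrite the text into JSON and let Ruby's json library parse it. It
    assumes the file holds a single table assignment, that keys look like
    ["name"] or [1], and that string values contain no brackets or escaped
    quotes. The name lua_table_to_hash is made up for this example. As with
    the eval version, a repeated key such as [1] ... [1] ends up with only
    the last entry.

```ruby
require 'json'

# Turn a Lua-style table dump into JSON text, then parse it without eval.
def lua_table_to_hash(text)
  json = text.sub(/\A\s*\w+\s*=\s*/, '')           # drop the leading "DataGroup = "
  json = json.gsub(/\["?(.*?)"?\]\s*=/, '"\1":')   # ["x"] = and [1] = become "x": and "1":
  json = json.gsub(/,(\s*\})/, '\1')               # JSON does not allow trailing commas
  JSON.parse(json)
end

sample = <<~LUA
  DataGroup = {
    [1] = {
      ["x"] = 45.5,
      ["count"] = 1,
    },
  }
LUA

p lua_table_to_hash(sample)
```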


    --
    Bill Guindon (aka aGorilla)
     
    Bill Guindon, Dec 8, 2005
    #10
  11. Re: New guy... Introduction and first question on some direct

    Well that's a lot of info so I have to digest it. I'll post back as
    soon as I have a better grasp of your responses... However I do not want
    to use Lua for this because I want to learn Ruby... I don't think it's a
    problem that the data is in Lua syntax; from what I can see, it doesn't
    matter what format the data is in. It seems to be a matter of finding a
    pattern and being able to merge the data from two files into one.

    For the purpose of being accurate on the sample, I've posted the file on
    my site so maybe if you see the actual file I'm working with you'll get
    a better idea of what I want.

    http://www.muychingon.com/gatherer.txt



     
    Oscar Gonzalez, Dec 8, 2005
    #11
  12. Oscar Gonzalez

    Bill Guindon Guest

    On 12/8/05, Oscar Gonzalez <> wrote:
    > Well that's a lot of info so I have to digest it. I'll post back as
    > soon as I have a better grasp of your responses... However I do not want
    > to use Lua for this because I want to learn Ruby... I don't think it's a
    > problem that the data is in Lua syntax; from what I can see, it doesn't
    > matter what format the data is in. It seems to be a matter of finding a
    > pattern and being able to merge the data from two files into one.
    >
    > For the purpose of being accurate on the sample, I've posted the file on
    > my site so maybe if you see the actual file I'm working with you'll get
    > a better idea of what I want.
    >
    > http://www.muychingon.com/gatherer.txt


    Ok, seems I was right about the format. Don't peek if you want to
    solve it on your own ;)

    http://www.mvgo.com/anarchy/lua.rb.txt

    a fun little ruby quiz (I _think_ I got it right).

    --
    Bill Guindon (aka aGorilla)
     
    Bill Guindon, Dec 9, 2005
    #12
  13. Oscar Gonzalez

    Steve Litt Guest

    On Thursday 08 December 2005 06:39 pm, Oscar Gonzalez wrote:
    > Well that's a lot of info so I have to digest it. I'll post back as
    > soon as I have a better grasp of your responses... However I do not want
    > to use Lua for this because I want to learn Ruby... I don't think it's a
    > problem that the data is in Lua syntax; from what I can see, it doesn't
    > matter what format the data is in. It seems to be a matter of finding a
    > pattern and being able to merge the data from two files into one.
    >
    > For the purpose of being accurate on the sample, I've posted the file on
    > my site so maybe if you see the actual file I'm working with you'll get
    > a better idea of what I want.
    >
    > http://www.muychingon.com/gatherer.txt


    Below my sig is a 45-line program using Node.rb that converts the file into
    Node objects, each with a name and value. You can see how Walker objects and
    callback routines work. In order to output your chosen format (which I didn't
    completely understand), you'd probably need to create a couple more Walkers
    and a couple more callback routines.

    This program assumes consistent indentation. If that cannot be assumed, you
    need to either do something else (maybe what Bill Guindon suggested), or
    create a tiny brace-to-indent converter and then run the result through my
    program.

    HTH

    SteveT

    Steve Litt
    http://www.troubleshooters.com



    #!/usr/bin/ruby
    require "Node.rb"

    class Callbacks
      # Print each node indented by its level; values are shown for leaves only.
      def cb_look_data(checker, level)
        print "\t" * level
        print "Name = ", checker.name
        print ", Value = ", checker.value unless checker.firstchild
        print "\n"
      end

      # Split each raw line into a name (the bracketed key) and a value.
      def cb_get_fields(checker, level)
        # Lines holding a closing brace carry no data of their own.
        if checker.value =~ /\s*}/
          checker.deleteSelf()
        end
        checker.value.gsub!(/,\s*$/, "")
        checker.value.strip!
        checker.value =~ /\[([^\]]*)\]/
        checker.name = $1 if $1
        if level == 1
          checker.value =~ /(.*)\s*=/
          checker.name = $1 if $1
        end

        checker.value =~ /=\s*(.*)/
        checker.value = $1 if $1
        checker.value = "" if checker.value == "{"
      end
    end


    cb = Callbacks.new() # INSTANTIATE CALLBACKS OBJECT

    #### PARSE THE FILE
    parser = OutlineParser.new()
    head = parser.parse("/home/slitt/gatherer.txt")

    #### PARSE THE NODE TREE NODES INTO NAME AND VALUE FIELDS
    walker = Walker.new(head, cb.method(:cb_get_fields), nil)
    walker.walk()

    #### PRINT THE NAME FIELDS FOR CONTAINERS,
    #### AND NAME AND VALUE FIELDS FOR LEAF LEVELS
    walker = Walker.new(head, cb.method(:cb_look_data), nil)
    walker.walk()
     
    Steve Litt, Dec 9, 2005
    #13
  14. Re: New guy... Introduction and first question on some direct

    Steve Litt <> wrote:
    > If you can count on indentation like you have above, the easy way might be to
    > run it through the OutlineParser object of Node.rb
    > (http://www.troubleshooters.com/projects/Node.rb/index.htm). Once the data is


    Very nice piece of software indeed.

    martin
     
    Martin DeMello, Dec 9, 2005
    #14
  15. Oscar Gonzalez

    Steve Litt Guest

    On Friday 09 December 2005 09:47 am, Martin DeMello wrote:
    > Steve Litt <> wrote:
    > > If you can count on indentation like you have above, the easy way might
    > > be to run it through the OutlineParser object of Node.rb
    > > (http://www.troubleshooters.com/projects/Node.rb/index.htm). Once the
    > > data is

    >
    > Very nice piece of software indeed.
    >
    > martin


    Thanks Martin,

    It should be nice. I've written it in three different languages so far :)

    I use VimOutliner (http://www.vimoutliner.org) to create tab indented
    outlines, and find that Node.[pm py rb] makes processing outlines trivial for
    substantial jobs, and doable for arduous ones (like converting an outline
    into a menu system).

    Thanks for the compliment.

    SteveT

    Steve Litt
    http://www.troubleshooters.com
     
    Steve Litt, Dec 9, 2005
    #15