Can someone point me in the right direction? Pipe Delimited Files & Hashes...

Discussion in 'Ruby' started by Samantha, Mar 7, 2007.

  1. Samantha

    Samantha Guest

    Hello all,

    I have a file that is not a normal csv or tab delimited file. It is
    delimited with the pipe | character.

    I Googled and found someone had posted how to parse it... That's all
    fine and well, but now I need to figure out how I'm going to either (a)
    import it into a MySQL database or (b) put it in some sort of container
    that will let me access each field (like an array or a hash).

    If I go through the hash, I'll probably need to assign each field in the
    file a key, and then populate it. I'm having a hard time figuring this out.

    If I go the route of MySQL I know that I'll probably end up using
    ActiveRecord.


    This is what I have so far. I got the concept from someone's blog and
    then modified it a bit to see what it was doing and what it would do
    in relation to my file. What this does, obviously, is put out the
    info, with a new line between each field, and an extra linefeed
    between each row... (each group - boy, am I articulate today or WHAT?!)

    File.open("i put the filename here").each do |record|
      record.split("|").each do |field|
        field.chomp!
        puts field
      end
    end


    So, what I want it to do, is say I have the following fields in the |
    delimited file:

    category | subcategory | description

    How do I make that stuff into a hash? I should probably start out small
    by putting it into a hash first, and then figure out how to deal with it
    in MySQL.

    If someone could point me in the right direction, of possible libraries
    that would help or the such, I'd love to go read there and study on it
    and try to figure it out. Not asking for answers, just asking for
    resources. :)

    Thanks,

    Samantha

    http://www.babygeek.org/

    "Beware when the great God lets loose a thinker on this planet. Then all
    things are at risk."
    --Ralph Waldo Emerson
    Samantha, Mar 7, 2007
    #1

  2. On 07.03.2007 16:33, Samantha wrote:
    > So, what I want it to do, is say I have the following fields in the |
    > delimited file:
    >
    > category | subcategory | description
    >
    > How do I make that stuff into a hash? I should probably start out small
    > by putting it into a hash first, and then figure out how to deal with it
    > in MySQL.
    >
    > If someone could point me in the right direction, of possible libraries
    > that would help or the such, I'd love to go read there and study on it
    > and try to figure it out. Not asking for answers, just asking for
    > resources. :)


    You pretty much got it already. Just add this:

    FIELDS = [:category, :subcategory, :description]
    db = []

    File.foreach("in") do |line|
      line.chomp!
      rec = {}
      FIELDS.zip(line.split("|")) do |name, val|
        rec[name] = val if val
      end
      db << rec
    end

    # work with db

    (untested)
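
    For reference, here is a minimal, self-contained sketch of the same
    idea that can be run without an input file. The sample rows are
    invented for illustration, parse_record is a hypothetical helper
    name, and unlike the snippet above it keeps missing trailing fields
    as nil rather than skipping them.

```ruby
# Field names come from the thread; the sample rows are made up.
FIELDS = [:category, :subcategory, :description].freeze

# Turn one pipe-delimited line into a Hash keyed by FIELDS.
def parse_record(line)
  values = line.chomp.split("|")
  # zip pairs each field name with its value; absent values become nil
  FIELDS.zip(values).to_h
end

sample = "Books|Programming|A fun book about Ruby\nBooks|Cooking|Recipes\n"

# One Hash per input line, collected into an Array.
db = sample.each_line.map { |line| parse_record(line) }

db.each do |rec|
  puts "#{rec[:category]} / #{rec[:subcategory]}: #{rec[:description]}"
end
```

    To read a real file instead, something like
    File.foreach("in").map { |line| parse_record(line) } should build
    the same array of hashes.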

    Kind regards

    robert
    Robert Klemme, Mar 7, 2007
    #2

  3. Re: Can someone point me in the right direction? Pipe Delimited Files & Hashes...

    On Mar 7, 2007, at 9:33 AM, Samantha wrote:

    > This is what I have so far that I found from someone's blog (I got
    > the concept from someone's blog...


    > File.open("i put the filename here").each do |record|
    >   record.split("|").each do |field|
    >     field.chomp!
    >     puts field
    >   end
    > end
    >
    >
    > So, what I want it to do, is say I have the following fields in the
    > | delimited file:
    >
    > category | subcategory | description
    >
    > How do I make that stuff into a hash? I should probably start out
    > small by putting it into a hash first, and then figure out how to
    > deal with it in MySQL.


    Let me try giving you this little hint and see if it's enough:

    >> cat, sub, des = "Books|Programming|A fun book about Ruby".split("|")
    => ["Books", "Programming", "A fun book about Ruby"]
    >> cat
    => "Books"
    >> sub
    => "Programming"
    >> des
    => "A fun book about Ruby"

    Don't be shy if you need more help!
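
    Building on that hint, here is a small sketch of going from the
    split values to a hash. It assumes the data may have spaces around
    the pipes, as in the "category | subcategory | description" example;
    the sample line and the variable names record and fields are made up
    for illustration.

```ruby
# Hypothetical sample line with spaces around the delimiters.
line = "Books | Programming | A fun book about Ruby\n"

# Split on the pipe, then trim surrounding whitespace from each piece.
cat, sub, des = line.chomp.split("|").map(&:strip)

# Assign each value to a named key.
record = { category: cat, subcategory: sub, description: des }
p record

# By default split drops trailing empty fields; a negative limit keeps them.
fields = "Books|Programming|".split("|", -1)
p fields
```

    The negative limit matters only if the last column can be empty;
    without it, an empty final field simply disappears from the array.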

    > If someone could point me in the right direction, of possible
    > libraries that would help or the such, I'd love to go read there
    > and study on it and try to figure it out. Not asking for answers,
    > just asking for resources. :)


    The standard CSV library will do the parsing for you, but this case
    looks very simple so you probably don't need it. However, if the
    first row of the file has the field names, it might be worth looking
    at FasterCSV which will build the Hashes for you:

    http://rubyforge.org/projects/fastercsv/

    Hope that helps.

    James Edward Gray II
    James Edward Gray II, Mar 7, 2007
    #3
  4. Re: Can someone point me in the right direction? Pipe Delimited Files & Hashes...

    On 3/7/07, Samantha wrote:
    >
    > If I go through the hash, I'll probably need to assign each field in the
    > file a key, and then populate it. I'm having a hard time figuring this out.


    Do the hash first, since that will give you a good feel for how things
    work, but if you put this into actual production, I'd consider using
    the ArrayFields library:

    http://www.codeforpeople.com/lib/ruby/arrayfields/arrayfields-3.6.0/README

    It's a useful tool to have in your kit.
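
    As a rough illustration of why named-field access is handy, here is
    a sketch using Ruby's built-in Struct. This is not the ArrayFields
    API, just a standard-library stand-in that allows access by name or
    by position; Entry is a hypothetical name and the sample line is
    made up.

```ruby
# Entry is a hypothetical name; the field names come from the thread.
Entry = Struct.new(:category, :subcategory, :description)

line = "Books|Programming|A fun book about Ruby"

# Splat the split values into the Struct's positional arguments.
entry = Entry.new(*line.split("|"))

puts entry.category   # access by name
puts entry[2]         # access by position, like an array
p entry.to_h          # convert to a Hash when one is needed
```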

    martin
    Martin DeMello, Mar 7, 2007
    #4
  5. Samantha

    Samantha Guest

    James Edward Gray II wrote:
    > Let me try giving you this little hint and see if it's enough:
    >
    > >> cat, sub, des = "Books|Programming|A fun book about Ruby".split("|")
    > => ["Books", "Programming", "A fun book about Ruby"]
    > >> cat
    > => "Books"
    > >> sub
    > => "Programming"
    > >> des
    > => "A fun book about Ruby"
    >
    > Don't be shy if you need more help!
    >

    Thank you! I really appreciate it. :)
    Again, I'll have to dig through and figure things out. Apparently I
    need another cup of coffee this morning for my synapses to fire properly.
    >> If someone could point me in the right direction, of possible
    >> libraries that would help or the such, I'd love to go read there and
    >> study on it and try to figure it out. Not asking for answers, just
    >> asking for resources. :)

    >
    > The standard CSV library will do the parsing for you, but this case
    > looks very simple so you probably don't need it. However, if the
    > first row of the file has the field names, it might be worth looking
    > at FasterCSV which will build the Hashes for you:

    Thanks! The first row does not have field names, however, I'm sure I
    could add them.

    Thanks again, James - this community is always so helpful!
    >
    > http://rubyforge.org/projects/fastercsv/
    >
    > Hope that helps.
    >
    > James Edward Gray II
    >
    >
    >



    --
    Samantha

    Samantha, Mar 7, 2007
    #5
  6. Samantha

    Samantha Guest

    Martin DeMello wrote:
    > On 3/7/07, Samantha wrote:
    >>
    >> If I go through the hash, I'll probably need to assign each field in the
    >> file a key, and then populate it. I'm having a hard time figuring
    >> this out.

    >
    > Do the hash first, since that will give you a good feel for how things
    > work, but if you put this into actual production, I'd consider using
    > the ArrayFields library:
    >
    > http://www.codeforpeople.com/lib/ruby/arrayfields/arrayfields-3.6.0/README
    >
    > It's a useful tool to have in your kit
    >
    > martin
    >
    >


    Thanks, Martin! I will check that out.

    --
    Samantha

    Samantha, Mar 7, 2007
    #6
  7. Re: Can someone point me in the right direction? Pipe Delimited Files & Hashes...

    On Mar 7, 2007, at 10:09 AM, Samantha wrote:

    > James Edward Gray II wrote:
    >>> If someone could point me in the right direction, of possible
    >>> libraries that would help or the such, I'd love to go read there
    >>> and study on it and try to figure it out. Not asking for
    >>> answers, just asking for resources. :)

    >>
    >> The standard CSV library will do the parsing for you, but this
    >> case looks very simple so you probably don't need it. However, if
    >> the first row of the file has the field names, it might be worth
    >> looking at FasterCSV which will build the Hashes for you:

    > Thanks! The first row does not have field names, however, I'm sure
    > I could add them.


    Actually, FasterCSV plans for that too. I should have said that.
    Here's a taste:

    >> require "rubygems"
    => false
    >> require "faster_csv"
    => true
    >> csv = <<END_CSV
    Book|Programming|Good Stuff.
    Book|Home Improvement|Sounds like work.
    END_CSV
    => "Book|Programming|Good Stuff.\nBook|Home Improvement|Sounds like work.\n"
    >> FCSV.parse(csv, :col_sep => "|", :headers => [:cat, :sub, :des]) do |row|
    ?>   p row.to_hash
    >> end
    {:sub=>"Programming", :cat=>"Book", :des=>"Good Stuff."}
    {:sub=>"Home Improvement", :cat=>"Book", :des=>"Sounds like work."}
    => nil

    Hope that helps.
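
    For readers on Ruby 1.9 or later: FasterCSV became the standard csv
    library, so roughly the same thing can be sketched without the extra
    gem. This assumes the csv library is available to require (on very
    recent Ruby versions it may need to be installed as a gem); the
    variable names data and rows are illustrative.

```ruby
require "csv"

# Same sample data as in the session above.
data = <<~END_CSV
  Book|Programming|Good Stuff.
  Book|Home Improvement|Sounds like work.
END_CSV

# Passing an Array as :headers names the columns without consuming the
# first line as a header row; :col_sep switches the delimiter to a pipe.
rows = CSV.parse(data, col_sep: "|", headers: [:cat, :sub, :des]).map(&:to_h)

rows.each { |row| p row }
```

    For a file on disk, CSV.foreach accepts the same col_sep and headers
    options and yields one row at a time.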

    James Edward Gray II
    James Edward Gray II, Mar 7, 2007
    #7
  8. Samantha

    Samantha Guest

    Robert Klemme wrote:
    >
    > You pretty much got it already. Just add this:
    >
    > FIELDS = [:category, :subcategory, :description]
    > db = []
    >
    > File.foreach("in") do |line|
    >   line.chomp!
    >   rec = {}
    >   FIELDS.zip(line.split("|")) do |name, val|
    >     rec[name] = val if val
    >   end
    >   db << rec
    > end
    >
    > # work with db
    >
    > (untested)
    >
    > Kind regards
    >
    > robert
    >
    >

    Thank you, Robert! I'm going to need to figure out what each thing
    does. :)

    I really want to make sure I grok everything. Again, thanks!

    --
    Samantha

    Samantha, Mar 7, 2007
    #8