Counting Tabs and splitting by that number

Discussion in 'Ruby' started by Nick Bo, Sep 28, 2008.

  1. Nick Bo

    Nick Bo Guest

    Basically, I have a document that I open and read line by line. I have
    to split each line up into two arrays and then into a hash, from which I
    need to get output like this:

    application/activemessage has no extensions
    application/andrew-inset has extensions ez
    application/applefile has no extensions
    application/atom has extensions atom
    application/atomcat+xml has extensions atomcat
    application/atomicmail has no extensions
    application/atomserv+xml has extensions atomsrv
    application/batch-SMTP has no extensions
    application/beep+xml has no extensions
    application/cals-1840 has no extensions

    I have determined that if a line contains no tabs, then that entry has
    no extensions. So I put an if statement at the start of the loop to see
    whether the line contains a tab; if not, it saves false as the value for
    the position I am at in the each loop.

    file.each_line do |line|
      next if line[0] == ?#
      next if line == "\n"
      string = line
      if string.include?("\t") == false
        mimeValue = false
        mimeKey = string.split
      else
        # THIS IS WHERE MY ISSUE IS NOW
        mimeKey, mimeValue = string.split("\t\t\t")
      end
    end

    My problem now is that the number of tabs separating the two parts
    varies: one line may have 3 tabs, another may have 5, and one might have
    just 1. So I am in a rut. How do I determine how many tabs are in the
    line (the string variable) so that I can split the two parts into their
    appropriate arrays? I was thinking I could do some kind of recursion
    that tests for a tab and, if it finds one, adds 1 to a count, so that I
    could then do something like

    mimeKey, mimeValue = string.split(#{tabCount}*("\t"))

    I know there is a lot in my message, so here is a summary:

    HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t
    --
    Posted via http://www.ruby-forum.com/.
     
    Nick Bo, Sep 28, 2008
    #1

  2. Nick Bo wrote:
    > Basically i have a document which I am opening and then i am reading

    (...)
    >
    > I know there is a lot in my message, so here is a summary:
    >
    > HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t


    Split on \t anyway and dump all empty results, like this:

    str = "beep+xml\t\t\t atom"
    res = str.split("\t").reject{|item|item.empty?}
    p res
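
A minimal sketch of the same split-and-reject idea, written with double-quoted strings so that `\t` is a real tab character. The sample string and the extra `strip` step are assumptions added for illustration, not part of the original post:

```ruby
# Double quotes so that \t is interpreted as a tab character.
str = "beep+xml\t\t\t atom"

# Split on every single tab, then drop the empty strings
# that are left between consecutive tabs.
parts = str.split("\t").reject { |item| item.empty? }

# Optionally trim stray spaces around each field.
parts = parts.map { |item| item.strip }

p parts
```

In a single-quoted string, `\t` stays a literal backslash followed by `t`, so the quoting style matters when testing this against real file content.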

    hth,

    Siep
    --
    Posted via http://www.ruby-forum.com/.
     
    Siep Korteling, Sep 28, 2008
    #2

  3. Nick Bo

    Guest

    On Sun, Sep 28, 2008 at 5:52 PM, Nick Bo <> wrote:
    > #THIS IS WHERE MY ISSUE IS NOW
    > mimeKey, mimeValue = string.split("\t\t\t")
    >
    > My problem now is that the number of tabs separating the two parts
    > varies: one line may have 3 tabs, another may have 5, and one might
    > have just 1.
    >
    > mimeKey, mimeValue = string.split(#{tabCount}*("\t"))
    >
    > I know there is a lot in my message, so here is a summary:
    >
    > HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t


    Your tabs are consecutive and you don't actually care how many there are?
    string.split(/\t+/)
    ?
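
As a rough sketch of how the `/\t+/` split could fit into the loop from the first post: the sample lines below are made up to stand in for the real file, the names `lines` and `mime_types` are hypothetical, and the limit argument of 2 is an added assumption to keep everything after the first run of tabs in one field.

```ruby
# Hypothetical sample lines standing in for the real file; the fields are
# separated by a varying number of tabs, and some lines have no tab at all.
lines = [
  "# a comment line\n",
  "application/activemessage\n",
  "application/andrew-inset\t\t\tez\n",
  "\n",
  "application/atom\t\t\t\t\tatom\n",
  "application/atomcat+xml\tatomcat\n"
]

mime_types = {}

lines.each do |line|
  line = line.chomp
  next if line.empty? || line.start_with?("#")

  # /\t+/ treats any run of tabs as a single separator, so the number of
  # tabs never has to be counted. With no tab in the line, value is nil.
  key, value = line.split(/\t+/, 2)
  mime_types[key] = value
end

mime_types.each do |type, extensions|
  if extensions
    puts "#{type} has extensions #{extensions}"
  else
    puts "#{type} has no extensions"
  end
end
```

With a real file, `file.each_line` would take the place of `lines.each`; the rest of the loop would stay the same under these assumptions.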
     
    , Sep 28, 2008
    #3
  4. Nick Bo

    Nick Bo Guest

    Incorrect. If I do it that way and there are 5 tabs between the two
    parts I want to separate, then I get 4 blank strings, giving me a total
    of 6 array elements.
    eg = "abcdefg \t\t\t\t\t hi"
    eg.split("\t") --> ["abcdefg ", "", "", "", "", " hi"]
    eg.split("/\t+/") just gives me ["abcdefg \t\t\t\t\t hi"] because it
    doesn't match the pattern given to split at all, so the whole string
    becomes a single array element.
    --
    Posted via http://www.ruby-forum.com/.
     
    Nick Bo, Sep 29, 2008
    #4
  5. Bill Kelly

    Bill Kelly Guest

    From: "Nick Bo" <>
    >
    > eg = "abcdefg \t\t\t\t\t hi"
    > eg.split("\t") --> ["abcdefg ", "", "", "", "", " hi"]
    > eg.split("/\t+/") just gives me ["abcdefg \t\t\t\t\t hi"] because it
    > doesn't match the pattern given to split at all, so the whole string
    > becomes a single array element.


    Huh?

    >> eg = "abcdefg \t\t\t\t\t hi"

    => "abcdefg \t\t\t\t\t hi"
    >> eg.split(/\t+/)

    => ["abcdefg ", " hi"]


    Regards,

    Bill
     
    Bill Kelly, Sep 29, 2008
    #5
  6. Nick Bo

    Nick Bo Guest

    Bill Kelly wrote:
    > From: "Nick Bo" <>
    >>
    >> eg = "abcdefg \t\t\t\t\t hi"
    >> eg.split("\t") --> ["abcdefg ", "", "", "", "", " hi"]
    >> eg.split("/\t+/") just gives me ["abcdefg \t\t\t\t\t hi"] because it
    >> doesn't match the pattern given to split at all, so the whole string
    >> becomes a single array element.

    >
    > Huh?
    >
    >>> eg = "abcdefg \t\t\t\t\t hi"

    > => "abcdefg \t\t\t\t\t hi"
    >>> eg.split(/\t+/)

    > => ["abcdefg ", " hi"]
    >
    >
    > Regards,
    >
    > Bill


    It wouldn't give me the two; I so wish it did. But I found a way around
    it. This is my solution, and it works perfectly:
    eg = "abcdefg \t\t\t\t\t\t hi"
    splitArray = eg.split("\t")
    splitArray.delete("")  # delete works in place; do not reassign its return value

    # inside the loop
    arrayKey = splitArray[0]
    arrayValue = splitArray[1]
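
A note on `Array#delete`, sketched under the assumption that the goal is to keep the array with its empty strings removed: `delete` changes the array in place and returns the removed item (or `nil` if nothing matched), so assigning its result back to the same variable replaces the array. The names `parts` and `returned` are only for illustration.

```ruby
eg = "abcdefg \t\t\t\t\t\t hi"
parts = eg.split("\t")

# Array#delete removes every matching element from the array itself and
# returns the removed item, not the array.
returned = parts.delete("")

p returned  # the removed item
p parts     # the array, now without the empty strings

array_key = parts[0]
array_value = parts[1]
```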

    Thanks for everyone's help
    --
    Posted via http://www.ruby-forum.com/.
     
    Nick Bo, Sep 29, 2008
    #6
  7. Mark Thomas

    Mark Thomas Guest


    > It wouldn't give me the two; I so wish it did. But I found a way around
    > it. This is my solution, and it works perfectly:
    > eg = "abcdefg \t\t\t\t\t\t hi"
    > splitArray = eg.split("\t")
    > splitArray.delete("")


    IMO, the regex solution is better

    splitArray = eg.split(/\t+/)

    I think you put the regex in quotes, which turns it into a plain string.
    Leave the quotes out.
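
To illustrate the difference, here is a small sketch comparing the quoted and unquoted forms on the example string from earlier in the thread; the variable names `as_string` and `as_regex` are made up for the comparison.

```ruby
eg = "abcdefg \t\t\t\t\t hi"

# A quoted pattern is an ordinary string: split looks for the literal
# sequence slash, tab, plus, slash, which does not occur in eg.
as_string = eg.split("/\t+/")

# An unquoted /.../ literal is a Regexp: any run of one or more tabs
# acts as a single separator.
as_regex = eg.split(/\t+/)

p as_string
p as_regex
```

The quoted form returns the whole string as a single element, while the regex form returns the two parts.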

    -- Mark.
     
    Mark Thomas, Sep 29, 2008
    #7