
  • Thread starter Johnathan Smith

Johnathan Smith


I'm writing a class which so far:

- reads a text file
- and prints out the number of Tags it encounters

However, now I want to create an empty hash every time it encounters a
Tag line, and if it encounters any other field, put the field and the
value in the hash, using the field name as the key.

I'm unsure of how to go about this.
Any help or pseudo code would be greatly appreciated.
My code is provided below.


txt file

Tag: ref1
Type: Book
Author: Little, S R

Tag: ref2
Type: Journal
Author: Smith, J

Tag: ref3
Type: Conference Paper
Author: Williams, M

ruby file:

require 'getoptlong'

opts = GetoptLong.new(
  ['--style', '-n', GetoptLong::NO_ARGUMENT],
  ['--database', '-i', GetoptLong::REQUIRED_ARGUMENT]
)

$linecount = 0

opts.each do |opt, arg|
  case opt
  when '--style'
    require arg
  when '--database'
    # process options
    File.open('reference.txt').each do |line|
      if line =~ /^tag:/i
        $linecount += 1
      end
    end
    puts $linecount
  end
end

Robert Klemme

2007/12/5 said:

Why do you repost the same question when you got answers already? Are
those answers not propagated to ruby-forum?



Johnathan Smith

I wasn't sure I had received the answers I was looking for. I was
provided with a few different bits of code, which I'm very grateful for
but was unable to get working. One of the solutions provided in the
previous thread I will post below. I'm not provided with any output at
all. I want to maintain the tag count as well as create the hash. If
you could point out my errors I'd be extremely grateful.


require 'getoptlong'

opts = GetoptLong.new(
  ['--style', '-n', GetoptLong::NO_ARGUMENT],
  ['--database', '-i', GetoptLong::REQUIRED_ARGUMENT]
)

$linecount = 0

opts.each do |opt, arg|
  case opt
  when '--style'
    require arg
  when '--database'
    # process options
    File.open('reference.txt').each do |line|
      if line =~ /^tag:/i
        $linecount += 1
      end
    end
    puts $linecount
  end
end

linecount = 0
results = []
hash = {}
File.open('reference.txt').each do |line|
  m = line.match(/^(\w+):\s*([\w+,\s]+)$/)
  unless m
    results << hash unless hash.empty?
    hash = {}
    linecount += 1
  else
    hash[m[1]] = m[2].chomp
  end
end


If you don't understand or don't find an answer sufficient, say so in
the same thread, no one will be mad. But if you start a new one, then
people following the old one aren't helped if you get the answer in
the new one.

Assuming that your text file is always in the format you listed, this
may be what you're after...

database = {}
File.open('reference.txt') { | handle |
  tags = handle.read.split("\n\n")
  for tag in tags
    key, type, author = tag.split("\n")
    database[key[5..-1]] = [type[6..-1], author[9..-1]]
  end
}
p database

Given your sample data, this will output the following...

{"ref1"=>["Book", "Little, S R"], "ref2"=>["Journal", "Smith, J"],
"ref3"=>["Conference Paper", "Williams, M"]}

And the number of tags seen is just "database.length".


Johnathan Smith


Thanks, I've got it to print out the hash,
although I'm getting a weird output where it's taking the first letter
off the surname, e.g.:
ruby main.rb
{"ref1"=>["Book", "ittle, S R"], "ref2"=>["Journal", "mith, J"]

Can you see in my code where I'm going wrong?
Also, is it possible to print the different references on new lines?

I'll provide my code below.

File.open('reference.txt').each do |line|
  if line =~ /^tag:/i
    $linecount += 1
  end
end
puts $linecount
File.open('reference.txt') { | handle |
  tags = handle.read.split("\n\n")
  for tag in tags
    key, type, author = tag.split("\n")
    database[key[5..-1]] = [type[6..-1], author[9..-1]]
  end
}
p database



The loop at the top of your code that increments $linecount is
pointless now: $linecount is equal to database.length once the
database hash has been filled.

As for the clipped surname, it should have been author[8..-1] instead
of author[9..-1], sorry about that.

I suggest that you go through one of the many ruby guides/tutorials
listed here: http://www.ruby-lang.org/en/documentation/
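
On printing each reference on its own line: here is a minimal sketch,
assuming the database hash has the tag => [type, author] shape built by
the snippet earlier in the thread. The sample literal and the output
format are only illustrative, not something from the original code.

```ruby
# Hypothetical literal in the same shape the earlier snippet builds,
# so this runs without reference.txt.
database = {
  "ref1" => ["Book", "Little, S R"],
  "ref2" => ["Journal", "Smith, J"],
  "ref3" => ["Conference Paper", "Williams, M"]
}

# Build one line of text per reference instead of relying on `p database`,
# which prints the whole hash on a single line.
lines = database.map do |tag, (type, author)|
  "#{tag}: #{type} (#{author})"
end

# puts prints each array element on its own line.
puts lines
```

Any other format string would do just as well; the point is only that
you iterate over the hash and print one entry at a time.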


Johnathan Smith


Yeah, I have been reading many tutorials but was unable to grasp how to
achieve what I was trying to do.

One more thing I was wondering:

If I was to extend some of the information in the .txt file:

Tag: ref1
Type: Book
Author: Smith, J
Publisher: New Books
Chapter: 12

Could I adapt the code so "any other information" is put into the hash
rather than having set information like author or type?



Yes, both of the pieces of code I wrote for you originally handle this
case just fine. Please read through the code and understand it. You
asked for pseudo-code originally, and I actually gave you fully-
functioning code. Now you just need to understand how to use it and
its output.


Sorry about that Phrogz. I didn't read the other thread, I just
glanced at it briefly the other day. I didn't realize you had already
answered this fully.



Absolutely no need for an apology. Helping people isn't a competition
of who can answer first. And, if I choose to stop helping someone
because I've done as much as I'm willing to do, there's certainly no
etiquette that says you shouldn't step in.

I'm glad you helped Johnathan get further towards his solution than I
did, and certainly I don't care that you used your own code versus mine.

But I appreciate the sentiment. :)

Johnathan Smith

Hey guys,

Sorry for the whole mix-up! I appreciate all the help I've received,
especially as I'm new to Ruby! I'm also new to the forum and should have
tried to stick it out with the first thread! So I hope I haven't offended
any of you by this.

That aside, I really hope you can help me resolve this problem.

Jordan, the code you gave me worked successfully. However, it just deals
with the two fields I gave you, author and type. It's partially my
fault for not explaining properly, because I hoped it could deal with any
information after the tag. Is this possible to achieve?

Again, sorry for the mix-up.


Nah! Any mistake you might make, we've already done that and worse
(probably several times). :)

Sure it is. The code I gave you can be expanded, but it was mainly
just for you to get started with, because it is fragile (as you saw
from my mis-post). Even with the correct "author[8..-1]", if you
accidentally leave out the space after the colon in an "Author: Name"
line, it will still clip the first letter off the name. ([number..number]
means to grab a smaller section from a larger container, and -1 has the
special meaning of the last item; so author[8..-1] means to get the
characters from index 8, counting from 0, to the end.)
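
To make the slicing concrete, here is a small sketch using the Author
line from the sample data. The variable names and the second string
(with the space left out) are made up for illustration.

```ruby
with_space    = "Author: Little, S R"
without_space = "Author:Little, S R"   # hypothetical line with the space left out

# "Author: " is 8 characters long, so index 8 is the first letter of the name.
sliced_ok = with_space[8..-1]

# Starting one character too late clips the surname (the mis-post).
off_by_one = with_space[9..-1]

# With the space missing, the "correct" offset 8 is now one too many.
sliced_clipped = without_space[8..-1]

puts sliced_ok        # Little, S R
puts off_by_one       # ittle, S R
puts sliced_clipped   # ittle, S R
puts with_space[-1]   # R  (-1 is the last character)
```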

What you really want is something more flexible, that will allow for
those kinds of mistakes and still do the right thing in almost every
case. That is a regular expression. It lets you give it a pattern
which has certain rules, and matches a string in different ways based
on those rules. And this is just what Phrogz' code is doing:

# we'll get rid of linecount since you can
# get that information directly from your hash

# and we'll store the hashes as subhashes
# underneath a main one, keyed by "Tag:" value

database = {}

File.open('reference.txt') { | handle |

  last_tag = nil # we'll use this to keep track of
                 # the last tag we've seen, see below

  handle.each { | line |

    # the following regexp says to match:
    # ^ = starting at the first char
    # (\w+) = 1 or more "word" characters (letters, digits, underscore)
    # : = followed by a literal colon
    # \s* = then zero or more "space" characters
    # ([\w+,\s]+) = 1 or more "word/space/,/+" characters
    # $ = with nothing else until the end of line
    # any string meeting that criteria is a match and m is
    # the match data, however if the string doesn't match
    # e.g., a blank line, m is nil

    m = line.match(/^(\w+):\s*([\w+,\s]+)$/)

    if m # if m is a match (i.e., not nil)

      if m[1] == 'Tag' # we want to add a key to the hash
                       # since this is a "Tag:" line
                       # when we used the () in the regexp
                       # that told ruby we only care about
                       # those chars, and we can access
                       # them in the match object with m[n]
                       # i.e, m[1] means whatever part of the
                       # line matched (\w+)

        last_tag = m[2].chomp
        database[last_tag] = {} # make a subhash as the value
                                # now we have, e.g.,
                                # database["ref1"] = {}
      else
        # it's not a "Tag:" line, so we add it as key, value
        # to the subhash we created for the last tag key we
        # created in the hash

        database[last_tag][m[1].downcase] = m[2].chomp
        # e.g., database["ref1"]["type"] = "Journal"
      end
    end
  }
}

# now print it
print "# of Tags: ", database.length, "\n\n"
print "Contents: ", database.inspect, "\n\n"

puts database["ref2"]["author"]


This outputs:

# of Tags: 3

Contents: {"ref1"=>{"author"=>"Little, S R", "type"=>"Book"},
"ref2"=>{"author"=>"Smith, J", "type"=>"Journal"},
"ref3"=>{"author"=>"Williams, M", "type"=>"Conference Paper"}}

Smith, J
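
To check that the same idea copes with extra fields such as Publisher
and Chapter, here is a self-contained sketch. It is not the code from
the thread: the sample text is a stand-in for reference.txt, and the
value pattern is loosened to (.+) so values may contain any characters,
which is an assumption about what you want.

```ruby
# Hypothetical sample text standing in for reference.txt, with extra fields.
sample = <<~TEXT
  Tag: ref1
  Type: Book
  Author: Smith, J
  Publisher: New Books
  Chapter: 12

  Tag: ref2
  Type: Journal
  Author: Little, S R
TEXT

database = {}
last_tag = nil

sample.each_line do |line|
  # Field name, colon, optional whitespace, then the rest of the line.
  m = line.match(/^(\w+):\s*(.+)$/)
  next unless m              # skip blank or malformed lines

  field, value = m[1], m[2].strip
  if field == 'Tag'
    last_tag = value
    database[last_tag] = {}  # start a new subhash for this reference
  elsif last_tag             # ignore fields that appear before any Tag
    database[last_tag][field.downcase] = value
  end
end

puts database["ref1"]["publisher"]   # New Books
puts database["ref1"]["chapter"]     # 12
puts database.length                 # 2
```

Nothing in the loop names Type or Author, so any "Field: value" line
after a Tag ends up in that tag's subhash.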

No worries :)


Johnathan Smith

Thanks mate.

I really appreciate it, and it's making sense!
Although I have one concern: at the end of the output it prints J, Smith

Can this be avoided?

Thanks again

Johnathan Smith

Although I have one concern: at the end of the output it prints J, Smith

Oh wait... I've solved it.
Guess it shows I should actually have a go at it properly.

