"yield" and "old-way iteration"

Bjarke Walling · Nov 28, 2006

Hi,

I am new to Ruby, but I find the language easy to learn and use.
However I have some code that I think could be written smarter.

I am writing a lexer and parser for a small language I have created.
The first part, splitting some input into tokens, was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed by how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by having the first lexer yield tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

My code is like this:

class FirstParse
  def initialize
    ...
  end
  def each
    ... yield tokens ...
  end
end

class SecondParse
  def initialize
    @tokens = []
    @index = 0
    first_parse = FirstParse.new
    for token in first_parse
      @tokens.add token
    end
  end
  def current
    @tokens[@index]
  end
  def next
    @index++
    self.current
  end
  def read_structure1
    ... read structures ...
  end
  def read_structure2
    ... read structures ...
  end
end

Am I being too "Java'ish", or what do you think? It is not a big problem
since the code works, but do I really need to load the tokens into an
array first?

- Bjarke Walling

dblack · Nov 29, 2006

Hi --

Hi,

I am new to Ruby, but I find the language easy to learn and use.
However I have some code that I think could be written smarter.

I am writing a lexer and parser for a small language I have created.
The first part splitting some input into tokens was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by using the first lexer yielding tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

My code is like this:

class FirstParse
def initialize
...
end
def each
... yield tokens ...
end
end

class SecondParse
def initialize
@tokens = []
@index = 0
first_parse = FirstParse.new
for token in first_parse
@tokens.add token
end
end
def current
@tokens[@index]
end
def next
@index++

That won't do what you want. There's no ++ operator in Ruby, so
@index++ followed by self.current is read as @index + +self.current.
You'll want to do:

@index += 1

self.current
end
def read_structure1
... read structures ...
end
def read_structure2
... read structures ...
end
end

Am I being too "Java'ish", or what do you think? It is not a big problem
since the code works, but do I really need to load the tokens into an
array first?

I would say that if you're going to load the tokens into an array,
don't do it by yielding; do it by returning an array. (All the code
that follows is very sketchy and just intended to illustrate the broad
picture.)

class FirstParse
  attr_reader :tokens
  def initialize
    # put all tokens in @tokens array
  end
end

class SecondParse
  def initialize
    @tokens = FirstParse.new.tokens
  end
end

If you want to, you can just yield from the first parse to the second
parse.

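class FirstParse
  def each
    # read_token is a hypothetical helper that returns the next token
    # from the stream, or nil when the stream is exhausted
    while (token = read_token)
      yield token
    end
  end
end

class SecondParse
  def initialize
    FirstParse.new.each do |token|
      # do something with token here -- don't save it
    end
  end
end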

That way you don't have to maintain your own pointer (though you could
add it back in if you needed it for some other reason).

David

--
David A. Black | (e-mail address removed)
Author of "Ruby for Rails" [1] | Ruby/Rails training & consultancy [3]
DABlog (DAB's Weblog) [2] | Co-director, Ruby Central, Inc. [4]
[1] http://www.manning.com/black | [3] http://www.rubypowerandlight.com
[2] http://dablog.rubypal.com | [4] http://www.rubycentral.org

Vidar Hokstad · Nov 29, 2006

Bjarke said:
I am writing a lexer and parser for a small language I have created.
The first part splitting some input into tokens was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by using the first lexer yielding tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

You're not providing much context. I am assuming that you want the
current/next approach because your parser will pull tokens, presumably
because you're using recursive descent or another top-down parsing
method.

If that's what you are doing, and you want to stick with that (as
opposed to switching to a bottom-up parser), then you're dealing with a
classic "inversion of control" problem.

I don't really think making the lexer yield tokens buys you much over
just having the parser call a method on the lexer that tokenizes and
returns the next token as a normal return value. In other words, if
you're using a top-down parsing method, you really want to consider
making your parser pull tokens from the lexer, instead of having the
lexer push tokens to the parser, which is what you are doing when you
use yield.
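
A minimal sketch of the pull style described above, in which the parser asks the lexer for one token at a time. The class name PullLexer, the method name next_token, and the token pattern are all assumptions made for illustration; they are not taken from Bjarke's code.

```ruby
require 'strscan'

# Hypothetical pull-style lexer: the caller requests one token per call.
class PullLexer
  def initialize(input)
    @scanner = StringScanner.new(input)
  end

  # Returns the next token as a String, or nil at end of input.
  def next_token
    @scanner.skip(/\s+/)               # ignore whitespace between tokens
    return nil if @scanner.eos?
    @scanner.scan(/\d+|\w+|./)         # a number, a word, or one other character
  end
end

lexer = PullLexer.new("x = 42 + y")
lexer.next_token   # => "x"
lexer.next_token   # => "="
lexer.next_token   # => "42"
```

Here the parser stays in control of when the next token is read, so no intermediate array and no yield are needed.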

However, if you want to stick to using yield, you can use "Generator"
(see http://ruby-doc.org/core/classes/Generator.html) to invert the
control and let you "pull" tokens from your yield'ing lexer without
having to go via an array.
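
The Generator library mentioned above shipped with Ruby 1.8. On later Rubies, Enumerator offers comparable external iteration through its next and peek methods. The following is a rough sketch under that assumption; YieldingLexer and PullParser are hypothetical stand-ins for FirstParse and SecondParse, and the whitespace tokenizer is only a placeholder.

```ruby
# Hypothetical stand-in for the yielding lexer from the original post.
class YieldingLexer
  include Enumerable

  def initialize(input)
    @input = input
  end

  # Push-style: yields each whitespace-separated token to the block.
  def each
    return to_enum(:each) unless block_given?
    @input.scan(/\S+/) { |token| yield token }
  end
end

# Pulls tokens from the yielding lexer without collecting them first.
class PullParser
  def initialize(lexer)
    @tokens = lexer.each   # an Enumerator wrapping the lexer's each
  end

  # Current token without advancing, or nil at end of input.
  def current
    @tokens.peek
  rescue StopIteration
    nil
  end

  # Advance past the current token and return the new current token.
  def next
    @tokens.next
    current
  rescue StopIteration
    nil
  end
end

parser = PullParser.new(YieldingLexer.new("let x = 1"))
parser.current   # => "let"
parser.next      # => "x"
```

The lexer keeps its simple yield-based each, while the parser gets the current/next interface from the original question.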

Vidar

Eric Hodel · Nov 29, 2006

I am writing a lexer and parser for a small language I have created.
The first part splitting some input into tokens was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by using the first lexer yielding tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

My code is like this:

class FirstParse

  include Enumerable

  def initialize
    ...
  end
  def each
    ... yield tokens ...
  end
end

class SecondParse
  def initialize(tokens)
    @index = 0
    @tokens = tokens
  end
  def current
    @tokens[@index]
  end
  def next
    @index += 1
    self.current
  end
  def read_structure1
    ... read structures ...
  end
  def read_structure2
    ... read structures ...
  end
end

parser = SecondParse.new FirstParse.new.to_a

Passing in an Array of tokens makes it easier to test, too.
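
As a rough illustration of the testing point, the parser can be exercised with a hand-written token list and no lexer at all. The sketch below fills in SecondParse just enough to run; the sample tokens are made up, and they stand in for whatever FirstParse.new.to_a would return.

```ruby
# SecondParse reduced to the cursor methods, taking its tokens as an argument.
class SecondParse
  def initialize(tokens)
    @index = 0
    @tokens = tokens
  end

  def current
    @tokens[@index]
  end

  def next
    @index += 1
    current
  end
end

# Hand-written tokens (an assumption) replace the real lexer output.
parser = SecondParse.new(%w[let x = 1])
first  = parser.current   # => "let"
second = parser.next      # => "x"
```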

Bjarke Walling · Nov 29, 2006

Thank you for all your replies!

The real problem is that I am not that skilled in writing a parser, I
think.

It might be the "inversion of control" problem I am experiencing. The
problem is that my FirstParse class pushes tokens using "yield", and in
my SecondParse I want to pull tokens, decide what to do with them, and
pull the next one when I'm ready for it. I don't know how to write
SecondParse another way without the code becoming too complex, but I
have a book on grammars and languages. I read the parts on regular
expressions and on Turing machines, but skipped the grammars part. I
have to read at least some of it.

But for now I solve it by loading the tokens directly into an array in
the first parse.

- Bjarke Walling
