"yield" and "old-way iteration"

Bjarke Walling

Hi,

I am new to Ruby, but I find the language easy to learn and use.
However I have some code that I think could be written smarter.

I am writing a lexer and parser for a small language I have created.
The first part splitting some input into tokens was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by using the first lexer yielding tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

My code is like this:

class FirstParse
  def initialize
    ...
  end
  def each
    ... yield tokens ...
  end
end

class SecondParse
  def initialize
    @tokens = []
    @index = 0
    first_parse = FirstParse.new
    for token in first_parse
      @tokens.add token
    end
  end
  def current
    @tokens[@index]
  end
  def next
    @index++
    self.current
  end
  def read_structure1
    ... read structures ...
  end
  def read_structure2
    ... read structures ...
  end
end

Am I being too "Java'ish", or what do you think? It is not a big problem
since the code works, but do I really need to load the tokens into an
array first?

- Bjarke Walling
 
dblack

Hi --

Hi,

I am new to Ruby, but I find the language easy to learn and use.
However I have some code that I think could be written smarter.

I am writing a lexer and parser for a small language I have created.
The first part splitting some input into tokens was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by using the first lexer yielding tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

My code is like this:

class FirstParse
  def initialize
    ...
  end
  def each
    ... yield tokens ...
  end
end

class SecondParse
  def initialize
    @tokens = []
    @index = 0
    first_parse = FirstParse.new
    for token in first_parse
      @tokens.add token
    end
  end
  def current
    @tokens[@index]
  end
  def next
    @index++

That won't parse :) There's no ++ operator in Ruby; you'll want to
do:

    @index += 1

    self.current
  end
  def read_structure1
    ... read structures ...
  end
  def read_structure2
    ... read structures ...
  end
end

Am I being too "Java'ish", or what do you think? It is not a big problem
since the code works, but do I really need to load the tokens into an
array first?

I would say that if you're going to load the tokens into an array,
don't do it by yielding; do it by returning an array. (All the code
that follows is very sketchy and just intended to illustrate the broad
picture.)

class FirstParse
  attr_reader :tokens
  def initialize
    # put all tokens in @tokens array
  end
end

class SecondParse
  def initialize
    @tokens = FirstParse.new.tokens
  end
end

If you want to, you can just yield from the first parse to the second
parse.

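class FirstParse
  def each
    # next_token is a placeholder for "get token from stream";
    # it is assumed to return nil once the input is exhausted
    while (token = next_token)
      yield token
    end
  end
end

class SecondParse
  def initialize
    FirstParse.new.each do |token|
      # do something with token here -- don't save it
    end
  end
end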

That way you don't have to maintain your own pointer (though you could
add it back in if you needed it for some other reason).


David

--
David A. Black | (e-mail address removed)
Author of "Ruby for Rails" [1] | Ruby/Rails training & consultancy [3]
DABlog (DAB's Weblog) [2] | Co-director, Ruby Central, Inc. [4]
[1] http://www.manning.com/black | [3] http://www.rubypowerandlight.com
[2] http://dablog.rubypal.com | [4] http://www.rubycentral.org
 
Vidar Hokstad

Bjarke said:
I am writing a lexer and parser for a small language I have created.
The first part splitting some input into tokens was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by using the first lexer yielding tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

You're not providing much context. I am assuming that you want the
current/next approach because your parser will pull tokens, presumably
because you're using recursive descent or another top-down parsing
method.

If that's what you are doing, and you want to stick with that (as
opposed to switching to a bottom-up parser), then you're dealing with a
classic "inversion of control" problem.

I don't really think having the first lexer yield tokens buys you
much over just making the parser call methods in the lexer to tokenize
and return the tokens as a normal method call. In other words, if
you're using a top-down parsing method, you really want to consider
making your parser pull tokens from the lexer, instead of having the
lexer push tokens to the parser, which is what you are doing when you
use yield.
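A minimal sketch of that pull style, under some loud assumptions: the
class names Lexer and Parser, the method name next_token, and the toy
token pattern are all made up for illustration and are not taken from
Bjarke's code. The lexer uses StringScanner from Ruby's standard
library and hands out one token per call, so the parser never needs an
intermediate array.

```ruby
require 'strscan'

# Hypothetical pull-style lexer: the parser asks for one token at a time.
class Lexer
  TOKEN = /\d+|[-+*\/()]/   # assumed toy grammar: integers and operators

  def initialize(input)
    @scanner = StringScanner.new(input)
  end

  # Returns the next token as a String, or nil at end of input.
  def next_token
    @scanner.skip(/\s+/)
    return nil if @scanner.eos?
    @scanner.scan(TOKEN) or raise "unexpected input at position #{@scanner.pos}"
  end
end

# Hypothetical parser front end offering the "current" and "next" methods
# asked for in the original post.
class Parser
  attr_reader :current

  def initialize(lexer)
    @lexer = lexer
    @current = @lexer.next_token   # prime the first token
  end

  # Advance to the next token and return it (nil once input is exhausted).
  def next
    @current = @lexer.next_token
  end
end

parser = Parser.new(Lexer.new("12 + 3"))
puts parser.current   # 12
puts parser.next      # +
```

The read_structure methods would then call current and self.next as
they recognise each construct.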

However, if you want to stick to using yield, you can use "Generator"
(see http://ruby-doc.org/core/classes/Generator.html) to invert the
control and let you "pull" tokens from your yield'ing lexer without
having to go via an array.
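A sketch of that inversion, with one substitution stated plainly:
Generator shipped with Ruby 1.8 and is not part of later standard
libraries, so this example uses the built-in Enumerator instead, whose
next and peek methods give the same kind of external iteration. The
constructor argument and the toy token pattern are assumptions made
for illustration only.

```ruby
# Hypothetical stand-in for the yielding lexer from the thread.
class FirstParse
  include Enumerable

  def initialize(input)
    @input = input
  end

  # Push-style interface: yields one token at a time.
  # Without a block it returns an Enumerator over the same tokens.
  def each
    return enum_for(:each) unless block_given?
    @input.scan(/\d+|[-+*\/()]/) { |token| yield token }
  end
end

class SecondParse
  def initialize(lexer)
    @tokens = lexer.each   # an Enumerator; no intermediate array
  end

  # The token the parser is looking at, or nil at end of input.
  def current
    @tokens.peek
  rescue StopIteration
    nil
  end

  # Consume the current token and return the new current one.
  def next
    @tokens.next
    current
  rescue StopIteration
    nil
  end
end

parser = SecondParse.new(FirstParse.new("1 + (23 * 4)"))
puts parser.current   # 1
puts parser.next      # +
```

The lexer keeps its simple yield-based each, and the parser pulls
tokens only when it is ready for them.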

Vidar
 
Eric Hodel

I am writing a lexer and parser for a small language I have created.
The first part splitting some input into tokens was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by using the first lexer yielding tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

My code is like this:

class FirstParse

  include Enumerable

  def initialize
    ...
  end
  def each
    ... yield tokens ...
  end
end

class SecondParse
  def initialize(tokens)
    @index = 0
    @tokens = tokens
  end
  def current
    @tokens[@index]
  end
  def next
    @index += 1
    self.current
  end
  def read_structure1
    ... read structures ...
  end
  def read_structure2
    ... read structures ...
  end
end

parser = SecondParse.new FirstParse.new.to_a

Passing in an Array of tokens makes it easier to test, too.
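As a rough illustration of the testing point: once the parser takes
its tokens as an argument, a test can build it from a literal array
and never touch the lexer. The token values below are invented for the
example, and the read_structure methods are left out.

```ruby
class SecondParse
  def initialize(tokens)
    @tokens = tokens
    @index = 0
  end

  def current
    @tokens[@index]
  end

  def next
    @index += 1
    current
  end
end

# Hypothetical token list; no lexer is involved.
parser = SecondParse.new(%w[let x = 1])
puts parser.current   # let
puts parser.next      # x
```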
 
Bjarke Walling

Thank you for all your replies!

The real problem is that I am not that skilled in writing a parser, I
think.

It might be the "inversion of control" problem I am experiencing. The
problem is that my FirstParse class pushes tokens using "yield", while in
SecondParse I want to pull tokens, decide what to do with them, and pull
the next one when I'm ready for it. I don't know how to write SecondParse
another way without the code becoming too complex, but I have a book
on grammars and languages. I read the parts on regular expressions and
on Turing machines, but skipped the part on grammars. I have to read at
least some of it :)

But for now I solve it by loading the tokens directly into an array in
the first parse.

- Bjarke Walling
 
