Simon Strandgaard
I am pleased to announce the release of version 0.8 of my
Bidirectional Iterator package.
download as TGZ:
http://rubyforge.org/frs/download.php/703/iterator-0.8.tar.gz
download as ZIP:
http://rubyforge.org/frs/download.php/704/iterator-0.8.zip
documentation:
http://aeditor.rubyforge.org/iterator/
What Is It?
===========
A collection of external iterators that can move both forward
and backward. You may already know iterators from Java.
The main methods are #next, #has_next? and #current.
It is carefully unit-tested and is used by my regexp
engine, where it is indirectly exercised by more than 2000 tests.
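To make the idea of an external, bidirectional iterator concrete, here is a minimal sketch over a plain Array. It is not the package's implementation: the class name SketchArrayIterator and the backward methods #prev and #has_prev? are assumptions made for illustration; only #next, #has_next? and #current are named in this announcement.

```ruby
# Hypothetical stand-in for illustration; not a class from the package.
class SketchArrayIterator
  def initialize(items)
    @items = items
    @pos = 0                 # index of the element #current returns
  end

  # True while an element exists at the current position.
  def has_next?
    @pos < @items.size
  end

  # True while an element exists before the current position (assumed name).
  def has_prev?
    @pos > 0
  end

  # Element at the current position; does not move the iterator.
  def current
    @items[@pos]
  end

  # Step one element forward.
  def next
    raise "already at end" unless has_next?
    @pos += 1
    self
  end

  # Step one element backward (assumed name).
  def prev
    raise "already at start" unless has_prev?
    @pos -= 1
    self
  end
end

iter = SketchArrayIterator.new(%w(a b c))

forward = []
while iter.has_next?
  forward << iter.current
  iter.next
end

backward = []
while iter.has_prev?
  iter.prev
  backward << iter.current
end

p forward   # ["a", "b", "c"]
p backward  # ["c", "b", "a"]
```

The caller drives the loop and can change direction at any point, which is what an internal iterator such as #each cannot offer.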
Changes Since Last Release
==========================
For my regexp engine I needed primitives to iterate through
Unicode strings (UTF8, UTF16BE, UTF16LE), both forward and
backward. You never know when you will need to search backward
through a huge UTF8 file. I have put much effort into
ensuring that decoding is robust to malformed data, so that
when we encounter malformed data we do not eat bytes of the
next, possibly valid, codepoint. Backward decoding is
equally robust to malformed data.
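The "do not eat bytes of the next codepoint" rule can be sketched as follows. This is a simplified forward-only decoder written for this note, not the package's code: the method names sketch_decode_utf8_at and sketch_decode_utf8 are hypothetical, malformed input is reported as nil (an assumption), and checks for overlong forms and surrogates are left out.

```ruby
# Hypothetical helper, not part of the package.
# Decodes one codepoint starting at byte index i.
# Returns [codepoint_or_nil, index_to_resume_at].
def sketch_decode_utf8_at(bytes, i)
  b0 = bytes[i]
  if b0 < 0x80
    return [b0, i + 1]                 # plain ASCII
  elsif (b0 & 0xE0) == 0xC0
    need, cp = 1, b0 & 0x1F            # lead byte of a 2-byte sequence
  elsif (b0 & 0xF0) == 0xE0
    need, cp = 2, b0 & 0x0F            # lead byte of a 3-byte sequence
  elsif (b0 & 0xF8) == 0xF0
    need, cp = 3, b0 & 0x07            # lead byte of a 4-byte sequence
  else
    return [nil, i + 1]                # stray continuation or invalid lead byte
  end

  need.times do |k|
    b = bytes[i + 1 + k]
    # Truncated or broken sequence: report it, but resume AT the
    # offending byte so a valid codepoint starting there survives.
    return [nil, i + 1 + k] if b.nil? || (b & 0xC0) != 0x80
    cp = (cp << 6) | (b & 0x3F)
  end
  [cp, i + 1 + need]
end

# Decodes a whole byte array; malformed spots show up as nil.
def sketch_decode_utf8(bytes)
  out = []
  i = 0
  while i < bytes.size
    cp, i = sketch_decode_utf8_at(bytes, i)
    out << cp
  end
  out
end

good = [1000, 40].pack('U*').bytes
bad  = [0xE2, 0x82, 0x28]              # 3-byte lead, one continuation, then "("

p sketch_decode_utf8(good)  # [1000, 40]
p sketch_decode_utf8(bad)   # [nil, 40]
```

In the malformed case the "(" (codepoint 40) is still decoded, because the decoder stops at the first byte that does not fit the sequence instead of consuming it.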
Speaking of files: I have also added an iterator that
can traverse a file, both forward and backward.
The interface is still the same.
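One way such a file iterator could work is to keep a byte position and seek before each read. The sketch below is an assumption about the approach, not the package's file iterator: SketchIOByteIterator, #prev and #has_prev? are hypothetical names, and a StringIO stands in for a File so the example runs without touching the disk.

```ruby
require 'stringio'

# Hypothetical class for illustration; works on any seekable IO
# that responds to #size, #seek and #getbyte (File, StringIO).
class SketchIOByteIterator
  def initialize(io)
    @io = io
    @size = io.size
    @pos = 0                 # byte offset that #current reads
  end

  def has_next?
    @pos < @size
  end

  def has_prev?
    @pos > 0
  end

  # Byte at the current offset (an Integer), or nil at the end.
  def current
    @io.seek(@pos)
    @io.getbyte
  end

  def next
    @pos += 1
    self
  end

  def prev
    @pos -= 1
    self
  end
end

iter = SketchIOByteIterator.new(StringIO.new("abc"))
iter.next while iter.has_next?         # run to the end of the data

backward = []
while iter.has_prev?
  iter.prev
  backward << iter.current.chr
end
p backward  # ["c", "b", "a"]
```

Seeking for every byte is slow on a real file; a practical version would read blocks into a buffer and only seek when the position leaves the buffered range.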
Some Examples
=============
#
# example 1. map on String.each_line
#
require 'iterator'
str = %w(a b c).join("\n")
iterator = Iterator::Continuation.new(str, :each_line)
result = iterator.map{|i|"<<#{i}>>"}
iterator.close
p result # ["<<a\n>>", "<<b\n>>", "<<c>>"]
#
# example 2. multiway each
#
require 'iterator'
data_a = %w(a b c d)
data_b = (0..3)
ia = Iterator::Continuation.new(data_a, :each)
ib = Iterator::Continuation.new(data_b, :each)
result = []
while ia.has_next? and ib.has_next?
  result << ia.current
  result << ib.current
  ia.next
  ib.next
end
ia.close
ib.close
p result # ["a", 0, "b", 1, "c", 2, "d", 3]
#
# example 3. decode UTF8 encoded string
#
require 'iterator'
str = [1000, 50000, 40, 999, 30000].pack('U*')
byte_iterator = str.create_iterator
res = Iterator::DecodeUTF8.new(byte_iterator).to_a
p res # [1000, 50000, 40, 999, 30000]
Call For Help
=============
Do you know about Asian text encodings (Big5, EUC, SJIS, KOI)?
If so, please contact me.
BTW: if you have ideas for improvement, please speak up.