$& write-protected?

Simon Strandgaard

I am working on a regexp engine written entirely in Ruby.
I wish to test it against the Rubicon test suite, which uses $& and $1-$9
heavily.

How do I assign my regexp-result to $& ?


irb(main):001:0> $& = "test"
SyntaxError: compile error
(irb):1: Can't set variable $&
from (irb):1
irb(main):002:0>
 
ts

"S" == Simon Strandgaard <none> writes:

S> How do I assign my regexp-result to $& ?

$& is read-only

S> irb(main):001:0> $& = "test"

It depends on what you want to do

/test/ =~ "test" # $& ==> "test"
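
To illustrate the line above: it is a successful match that populates these variables. A small sketch (the sample string and pattern are only illustrative, not from the original post):

```ruby
# The match operator fills in $~ and the read-only variables derived from it.
if /(t)(es)/ =~ "a test!"
  whole  = $&   # the matched text
  before = $`   # text before the match
  after  = $'   # text after the match
  first  = $1   # first group
  last   = $+   # last group that matched
end

p [whole, before, after, first, last]
# prints ["tes", "a ", "t!", "t", "es"]
```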
 
Simon Strandgaard

ts said:
S> How do I assign my regexp-result to $& ?

$& is read-only

S> irb(main):001:0> $& = "test"

It depends on what you want to do

/test/ =~ "test" # $& ==> "test"

That will work. But I also need to do assignment to $1 - $9, $', $`, $+

Any ideas how to fake this?

Any rationale why they are write-protected?
 
ts

"S" == Simon Strandgaard <none> writes:

S> Any ideas how to fake this?

Well, it's easy to do for $' and $`, no idea for the rest $1, ...

S> Any rationale why they are write-protected?

These variables are the result of a regexp match against a string: it seems
nonsensical to change this result.
 
Simon Strandgaard

S> Any ideas how to fake this?

Well, it's easy to do for $' and $`, no idea for the rest $1, ...

S> Any rationale why they are write-protected?

These variables are the result of a regexp match against a string: it seems
nonsensical to change this result.

Agreed, these values should only contain output from a regexp.
But paradoxically, I am developing my own regexp engine!

I want my regexp engine to be compatible with Ruby's, therefore I want to
exercise the engine with the Rubicon test suite. But Rubicon uses $& and
$1-$9. I can substitute all occurrences of $& and $1-$9 in Rubicon.

If assignment to $& and $1-$9 is impossible, then my regexp engine
will never be compatible with Ruby's engine :-(

[RCR] remove write-protection of $&, $1-$9, $', $`, $+
or perhaps let them stay write-protected, but let assignment via const_set
become possible?
 
ts

S> I want my regexp engine to be compatible with Ruby's, therefore I want to
S> exercise the engine with the Rubicon testsuite. But rubicon uses $& and
S> $1-$9. I can substitute all occurrences of $& and $1-$9 in rubicon.

Be careful with this test suite: it was designed for the *current* regexp
engine. This means, for example, that these tests must be modified for
Oniguruma, because Oniguruma will give different results.

S> [RCR] remove write-protection of $&, $1-$9, $', $`, $+

No need for this: just write a C extension which does this, and use this
extension *only* for testing your regexp engine.

Nobody, except you, needs to modify these variables.


Guy Decoux
 
Yukihiro Matsumoto

Hi,

In message "Re: $& write-protected?"

|[RCR] remove write-protection of $&, $1-$9, $', $`, $+
|or perhaps let them stay write-protected, but let assignment via const_set
|become possible?

If you update $~ (the match result), $&, $1, etc. would follow
accordingly. So that the better RCR should be to create MatchData
without built-in regular expression match, I think.

matz.
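
A minimal sketch of the behaviour described here: `$~` itself is assignable (it accepts a MatchData or nil), and the read-only variables are derived from it. The strings and patterns below are only illustrative.

```ruby
saved = /(b)(c)/.match("abcd")   # MatchData from an earlier match
/x/ =~ "xyz"                     # a later match overwrites $~
later = $&                       # now refers to the later match

$~ = saved                       # assigning $~ is allowed
restored = [$`, $&, $', $1, $2]  # all follow the restored MatchData

p later, restored
# prints "x" and ["a", "bc", "d", "b", "c"]
```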
 
Simon Strandgaard

S> I want my regexp engine to be compatible with Ruby's, therefore I want to
S> exercise the engine with the Rubicon testsuite. But rubicon uses $& and
S> $1-$9. I can substitute all occurrences of $& and $1-$9 in rubicon.

Be careful with this test suite: it was designed for the *current* regexp
engine. This means, for example, that these tests must be modified for
Oniguruma, because Oniguruma will give different results.

I have about 150 test cases that I run against both Ruby's current regexp
engine and my own engine. So far my engine is fully compatible.
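
A differential check of this kind might look roughly like the sketch below. The `candidate` lambda is a hypothetical placeholder for the engine under test (here it simply delegates to the built-in engine so that the example runs), and the three cases are invented for illustration.

```ruby
# Normalize a MatchData (or nil) into a plain, comparable array.
to_result = lambda do |md|
  md && [md.pre_match, md[0], md.post_match, *md.captures]
end

# Reference: Ruby's built-in engine.
reference = lambda do |pattern, string|
  to_result.call(Regexp.new(pattern).match(string))
end

# Hypothetical engine under test. This placeholder delegates to the
# built-in engine only so the sketch runs; a real engine would go here.
candidate = lambda do |pattern, string|
  reference.call(pattern, string)
end

cases = [["a(b*)c", "xabbcy"], ["(x)|(y)", "y"], ["z", "abc"]]

# Collect every case where the two engines disagree.
failures = cases.reject do |pattern, string|
  candidate.call(pattern, string) == reference.call(pattern, string)
end

puts "#{cases.size - failures.size} of #{cases.size} cases agree"
```

Comparing plain arrays rather than `$&` and friends sidesteps the write-protection question entirely, at the cost of rewriting the tests.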

Perhaps I should also run the tests against Oniguruma.
Do you know if Oniguruma can coexist with Ruby's current regexp engine?

S> [RCR] remove write-protection of $&, $1-$9, $', $`, $+

No need for this: just write a C extension which does this, and use this
extension *only* for testing your regexp engine.

Nobody, except you, needs to modify these variables.

Great, so it *is* possible.. I will look at it tomorrow (C++SWIG).
 
ts

S> Perhaps I should also run the tests against Oniguruma.

Yes, as I said previously, Oniguruma will give you different results.

S> Do you know if Oniguruma can coexist with Ruby's current regexp engine?

Well, you can have two different rubies: one with Oniguruma, the other with
the GNU regexp

S> Great, so it *is* possible.. I will look at it tomorrow (C++SWIG).
                                                               ^^^^

no need for SWIG in this case :)


Guy Decoux
 
Simon Strandgaard

In message "Re: $& write-protected?"

|[RCR] remove write-protection of $&, $1-$9, $', $`, $+
|or perhaps let them stay write-protected, but let assignment via const_set
|become possible?

If you update $~ (the match result), $&, $1, etc. would follow
accordingly. So that the better RCR should be to create MatchData
without built-in regular expression match, I think.

Good idea for an RCR. I vote yes immediately for a pure MatchData class ;-)
 
Simon Strandgaard

S> Great, so it *is* possible.. I will look at it tomorrow (C++SWIG).
                                                               ^^^^
no need for SWIG in this case :)

Maybe overkill.

My first thought was to extern 'last_match_getter' from 're.c' and
just supply my own last_match_setter. But then I noticed the functions
are 'static' (private).

How do you think I should implement this?


server> gcc assign.c -I/usr/home/neoneye/install/ruby-1.8.1
/usr/lib/crt1.o: In function `_start':
/usr/lib/crt1.o(.text+0x81): undefined reference to `main'
/var/tmp//ccu9UXrr.o: In function `Init_Assign':
/var/tmp//ccu9UXrr.o(.text+0x17): undefined reference to `last_match_getter'
/var/tmp//ccu9UXrr.o(.text+0x21): undefined reference to `rb_define_virtual_variable'
server> expand -t4 assign.c
#include <ruby.h>

extern VALUE last_match_getter();

static void last_match_setter(VALUE val) {
    /* TODO */
}

void Init_Assign() {
    rb_define_virtual_variable("$&", last_match_getter, last_match_setter);
}
server>
 
ts

S> My first thought was to extern 'last_match_getter' from 're.c' and
S> just supply my own last_match_setter. But then I noticed the functions
S> are 'static' (private).

See the struct RMatch in re.h: str is the string, regs are the
registers. Ruby uses the registers to retrieve $&, $1, ...

See for example rb_reg_match_pre() ($`), rb_reg_match_post() ($'),
rb_reg_nth_match() ($&, $1, ...)

If you change the struct RMatch, this change will be reflected in $&,
$1, ...
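
The same relationship can be observed from Ruby without touching the C struct. The sketch below is only a Ruby-level view (MatchData#offset reports offsets into the matched string); the sample string is illustrative, and it does not modify RMatch itself.

```ruby
str = "alphabetacharlie"
md  = /b(e)(ta)/.match(str)

# Each register is a [start, end) offset pair into the matched string.
offsets = (0...md.size).map { |n| md.offset(n) }

# The variables can be recomputed from the string plus those offsets.
pre    = str[0...md.begin(0)]               # same as md.pre_match  ($`)
post   = str[md.end(0)..-1]                 # same as md.post_match ($')
groups = offsets.map { |s, e| str[s...e] }  # same as md.to_a ($&, $1, $2)

p offsets, pre, post, groups
```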


Guy Decoux
 
Simon Strandgaard

S> My first thought was to extern 'last_match_getter' from 're.c' and
S> just supply my own last_match_setter. But then I noticed the functions
S> are 'static' (private).

See the struct RMatch in re.h: str is the string, regs are the
registers. Ruby uses the registers to retrieve $&, $1, ...

See for example rb_reg_match_pre() ($`), rb_reg_match_post() ($'),
rb_reg_nth_match() ($&, $1, ...)

If you change the struct RMatch, this change will be reflected in $&,
$1, ...

I bail out; maybe I will look at it tomorrow. I hoped it wouldn't be this
complicated (I'm tired). I will follow the other approach and substitute
all occurrences of $& and $1-$9 instead.
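
A rough sketch of what such a substitution could look like, assuming the test code is rewritten to read from an explicit match object. The helper name `rewrite_match_globals` and the variable name `m` are hypothetical, and the rewrite is purely textual.

```ruby
# Naive textual rewrite: $& becomes m[0], $1..$9 become m[1]..m[9].
# It does not parse Ruby, so occurrences inside string literals or
# comments are rewritten too, and it only looks at a single digit.
def rewrite_match_globals(source, var = "m")
  source.gsub(/\$(&|[1-9])/) do
    index = $1 == "&" ? 0 : $1.to_i
    "#{var}[#{index}]"
  end
end

line = 'assert_equal("bc", $&); assert_equal("b", $1)'
puts rewrite_match_globals(line)
# prints assert_equal("bc", m[0]); assert_equal("b", m[1])
```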

Thanks for sharing your wisdom.


BTW: Is there a reason that functions not found in Ruby's .h files often
have a 'static' tag, so that it's impossible to extern them?

Would it make any sense to remove all 'static' tags?
 
Aredridel

If you update $~ (the match result), $&, $1, etc. would follow
accordingly. So that the better RCR should be to create MatchData
without built-in regular expression match, I think.

Hear, hear!

I was trying to add some features to the regex engine and couldn't for
similar reasons. (I was trying to leverage the built-in syntax for
regexes, but to match against tree-shaped structures... Don't ask.)

I eventually gave up because I couldn't create MatchData.

Ari
 
Sabby and Tabby

Simon Strandgaard said:
That will work. But I also need to do assignment to $1 - $9, $', $`, $+

class MatchData
  def MatchData.[](pre, match, post, *captures)
    str = [pre, match, post] * ''
    match = Regexp.quote(match.dup)

    captures.collect! do |cap|
      c = Regexp.quote(cap)
      i = match.index(c) or raise ArgumentError,
        "Not a backreference: #{cap.inspect}"
      match[i + c.size, 0] = ')'
      match.slice!(0, i) + '('
    end

    str =~ /#{captures * '' + match}(?=.{#{post.size}}\z)/m
    $~
  end
end

$~ = MatchData[*%w/alpha beta charlie beta be e ta t a/]

p [$`, $&, $', *$~.captures]
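
One detail worth noting about the snippet above: `$~` is local to the current method frame, which is presumably why the helper returns the MatchData and the caller performs the assignment. A short sketch of that scoping (the method names are illustrative only):

```ruby
def match_inside_method
  /b/ =~ "abc"
  $&                          # "b" inside this method
end

def build_match
  /b/.match("abc")            # hand the MatchData back to the caller
end

$~ = nil
inner = match_inside_method   # the callee saw its own match
outer_after_call = $&         # still nil: the callee's match did not leak out

$~ = build_match              # explicit assignment in the caller's frame
outer_after_assign = $&       # now "b"

p inner, outer_after_call, outer_after_assign
```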
 
Simon Strandgaard

Simon Strandgaard said:
That will work. But I also need to do assignment to $1 - $9, $', $`, $+
[snip fake MatchData class]
$~ = MatchData[*%w/alpha beta charlie beta be e ta t a/]

p [$`, $&, $', *$~.captures]

Whew, nice. I was starting to believe that this was simply impossible.
Surprise of the day. I am grateful for your help. Thanks.
 
