Anchored Regexp gets stalled or hangs

rnicz

Dear rubyists!

I dare to post this question even though there have been many
false regexp bug reports lately.

I tried to build the following regexp:

a=/^-{150}/

and it turns out that this expression hangs the Ruby interpreter.
I've checked that the re{m,n} syntax allows, by design, up to
32766 repetitions. An expression such as

/-{32766}/

works fine, and

/-{32767}/

produces the error message:

'too big quantifier in {,}: /-{32767})/'

But when the regexp is anchored at the front, you cannot specify more
than 127 repetitions:

/^-{127}/

is ok, but

/^-{128}/

hangs the interpreter. Is this a bug or not?
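
For anyone who wants to check the 127/128 boundary on their own build, here is a minimal sketch. The `probe` helper and its 2-second limit are made up for illustration and are not part of any library. It compiles the anchored pattern at run time and matches it against a string of that many dashes. Note that on an interpreter that really hangs inside the regexp engine, the timeout may not be able to interrupt the match, so treat this as a convenience rather than a guarantee.

```ruby
require "timeout"

# Hypothetical helper: try one anchored repetition count and report what happened.
# `limit` is an arbitrary number of seconds chosen for this sketch.
def probe(count, limit = 2)
  subject = ("-" * count) + "x"
  Timeout.timeout(limit) do
    re = Regexp.new("^-{#{count}}")   # same shape as /^-{128}/, built at run time
    re =~ subject ? :matched : :no_match
  end
rescue Timeout::Error
  :timed_out
rescue RegexpError => e
  "compile error: #{e.message}"
end

# Counts on both sides of the boundary described above.
[127, 128, 150].each { |n| puts "^-{#{n}} => #{probe(n)}" }
```

On a build without the problem, all three counts should report `matched`.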
 
Joel VanderWerf

rnicz said:
/^-{128}/

hangs interpreter. Is it a bug or not?

Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's
something that Oniguruma fixes.
 
Simon Strandgaard

Joel said:
Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's
something that Oniguruma fixes.

I can reproduce this problem in 1.8.1. I guess this is a bug in the GNU
engine.


This is showing off, but my own regexp engine can deal with /^-{128}/:

bash-2.05b$ irb
irb(main):001:0> re = NewRegexp.new("^-{128}")
=> #<NewRegexp:0x403a50f0 @source="^-{128}", @scanner=#<Scanner:0x403a50b4
@root=#<Root:0x403a4e5c @number_of_captures=2,
@node=#<ScannerHierarchy::Capture:0x403a4de4
@succ=#<ScannerHierarchy::Anchor:0x403a4d80 @anchor_type=:line_begin,
@succ=#<ScannerHierarchy::BeginMatch:0x403a4d44
@succ=#<ScannerHierarchy::RepeatGreedy:0x403a4d6c @max=128,
@succ=#<ScannerHierarchy::Capture:0x403a4df8
@succ=#<ScannerHierarchy::Last:0x403a4e34>, @register=1>, @min=128,
@index=nil, @pattern=#<ScannerHierarchy::Inside:0x403a4d1c @succ=EndPattern,
@set=#<RangeSet:0x403a4f24 @codepoints=[45]>>>>>, @register=0>,
@parser=+-Sequence
+-Anchor line_begin
+-Repeat greedy{128,128}
+-Inside set="-">>>
irb(main):002:0> re.match(("-"*128)+"x")
=> #<NewMatchData:0x403fb068 @captures=[],
@matched_string="--------------------------------------------------------------------------------------------------------------------------------",
@positions=[[0, 128]], @post_match="x",
@string="--------------------------------------------------------------------------------------------------------------------------------x",
@match_array=["--------------------------------------------------------------------------------------------------------------------------------"],
@pre_match="", @length=128, @offset=0>
irb(main):003:0> re.match(("-"*127)+"x")
=> nil
irb(main):004:0> puts re.tree
+-Sequence
+-Anchor line_begin
+-Repeat greedy{128,128}
+-Inside set="-"
=> nil
irb(main):005:0>


(sorry for showing off)
 
Jamis Buck

Joel said:
Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's
something that Oniguruma fixes.

I'm using 1.8.2 with Oniguruma, and it does not hang. I'm guessing it's
something with the legacy regexp engine.

- Jamis
 
ts

r> I've tried to make following regexp

r> a=/^-{150}/

try this

uln% diff -u regex.c~ regex.c
--- regex.c~ 2004-10-27 04:46:51.000000000 +0200
+++ regex.c 2004-11-19 12:25:39.000000000 +0100
@@ -1011,8 +1011,8 @@
 {
   int mcnt;
   int max = 0;
-  char *p = start;
-  char *pend = end;
+  unsigned char *p = start;
+  unsigned char *pend = end;
   char *must = 0;
 
   if (start == NULL) return 0;
uln%


uln% ./ruby -e 'a = "-" * 152; /^-{150}/ =~ a; p $&.size'
150
uln%
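
The patch only changes two pointers from `char *` to `unsigned char *`, so one plausible reading (an assumption here, the thread does not spell it out) is that a byte holding the repeat count was being read as a signed value. The sketch below does not touch regex.c at all; it only shows how the same byte values look when read as signed versus unsigned 8-bit integers, which lines up with 127 working and 128 failing. The `as_signed_byte` helper is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper: reinterpret an unsigned byte value (0..255) as a signed 8-bit integer.
def as_signed_byte(n)
  [n].pack("C").unpack("c").first   # "C" = unsigned 8-bit, "c" = signed 8-bit
end

# 127 fits in a signed char; 128 and 150 wrap around to negative values.
[127, 128, 150].each do |count|
  puts "#{count} read as unsigned char: #{count}, read as signed char: #{as_signed_byte(count)}"
end
```

Where `char` is signed, any count above 127 stored in a single byte would come back negative, which is the kind of value that could confuse a scanning loop.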




Guy Decoux
 
rnicz

That's great: the first answer after 4 minutes, and a patch within 12 hours.

Thank you rubyists!
 
