No way of looking for a regexp match starting from a particular point in a string?

  • Thread starter Kenneth McDonald

Robert Dober

On 6/4/07 said:
On 6/3/07, Kenneth McDonald said:
I'm probably just missing something obvious, but I haven't found a way
to match a regular expression against only part of a string, in
particular only past a certain point of a string, as a way of finding
successive matches. Of course, one could do a match against a string,
take the substring past that match and do a match against the substring,
and so on, to find all of the matches for the string, but that could be
very expensive for very large strings.
I'm aware of the String.scan method, but that doesn't work for me
because it doesn't return MatchData instances.
What I want is just something like regexp.match(string, n),

Hmm, apart from using #scan and #index with $~ as indicated, I do not
think that there is a performance penalty if you do

rg.match(string[n..-1])

How can that be? You have to create a whole new String.
Beating a dead man, Tom? As mentioned, I had a terrible slip into C in my
reasoning, no idea why :(
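
A minimal sketch of the #index-with-$~ approach mentioned above, assuming the pattern is searched from a character offset with String#index(regexp, offset) and the MatchData is read back through Regexp.last_match (the same data as $~, which Ruby keeps local to the current method and thread). The helper name each_match_data is hypothetical, not part of Ruby:

```ruby
# Collect a MatchData for every successive match without slicing the string.
# each_match_data is a made-up helper name for illustration only.
def each_match_data(string, regexp)
  matches = []
  pos = 0
  # String#index with a regexp sets $~ / Regexp.last_match on success.
  while pos <= string.length && string.index(regexp, pos)
    m = Regexp.last_match
    matches << m
    # Continue after the match; step one further on an empty match
    # so the loop cannot get stuck at the same position.
    pos = m.end(0) > m.begin(0) ? m.end(0) : m.end(0) + 1
  end
  matches
end

each_match_data("a1b22c333", /\d+/).each do |m|
  puts "#{m[0]} at #{m.begin(0)}"
end
```

Because the search always runs against the original string, the offsets reported by each MatchData are positions in that string rather than in a substring.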
 
dblack

Hi --

Is $~ thread safe?

Too bad it has to be done this way (though my library will hide it). I first
looked at Ruby several years ago, and at that time, didn't go further with it
because it was too PERLish for me. (PERL was great for its time, but speaking
as someone who actually had to maintain a lot of PERL code, it's actually a
pretty grotty language). One of the things that brought me back to Ruby was
the fact that an effort was being made to move Ruby away from its PERLisms.
But I guess it'll take a while longer...

The best thing is really just to use Ruby without thinking about Perl.
They're very different languages, and get mentioned in the same breath
far too often.


David

--
Q. What is THE Ruby book for Rails developers?
A. RUBY FOR RAILS by David A. Black (http://www.manning.com/black)
(See what readers are saying! http://www.rubypal.com/r4rrevs.pdf)
Q. Where can I get Ruby/Rails on-site training, consulting, coaching?
A. Ruby Power and Light, LLC (http://www.rubypal.com)
 
Robert Klemme

On 6/3/07, Kenneth McDonald wrote:
I'm probably just missing something obvious, but I haven't found a way
to match a regular expression against only part of a string, in
particular only past a certain point of a string, as a way of finding
successive matches. Of course, one could do a match against a string,
take the substring past that match and do a match against the substring,
and so on, to find all of the matches for the string, but that could be
very expensive for very large strings.

I'm aware of the String.scan method, but that doesn't work for me
because it doesn't return MatchData instances.

What I want is just something like regexp.match(string, n),

Hmm, apart from using #scan and #index with $~ as indicated, I do not
think that there is a performance penalty if you do

rg.match(string[n..-1])

How can that be? You have to create a whole new String.
Beating a dead man, Tom? As mentioned, I had a terrible slip into C in my
reasoning, no idea why :(
If that can be avoided in the internal implementation, then adding an
optional offset index to #match is not an unreasonable idea.

Robert, actually string[n..-1] is cheaper than you might assume: I
believe the new string shares the char buffer with the old string, so
you basically just get a new String object with a different offset - the
large bit (the char data) is not copied.
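
To make the two options concrete: Ruby 1.9 and later (newer than this thread) accept a start position as an optional second argument to Regexp#match, which is the regexp.match(string, n) form asked for. The sketch below compares it with matching against a slice; the sample string and variable names are made up for illustration.

```ruby
string = "foo=1, bar=22"   # made-up sample data
n = 6                      # start searching at this character offset
rg = /(\w+)=(\d+)/

sliced = rg.match(string[n..-1])  # offsets are relative to the slice
direct = rg.match(string, n)      # Ruby 1.9+: offsets are relative to string

puts "sliced: #{sliced[0]} at #{sliced.begin(0)}"
puts "direct: #{direct[0]} at #{direct.begin(0)}"
```

Both calls find the same text, but only the second reports positions in the original string; with the slice, n has to be added back by hand.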

Kind regards

robert
 
Robert Dober

On 04.06.2007 13:28, Robert Dober wrote:
Robert, actually string[n..-1] is cheaper than you might assume: I
believe the new string shares the char buffer with the old string, so
you basically just get a new String object with a different offset - the
large bit (the char data) is not copied.
I am afraid that this is not true anymore when the slice is passed as
a formal parameter, the data has to be copied :(

irb(main):011:0> def change(x)
irb(main):012:1> x << "changed"
irb(main):013:1> end
=> nil
irb(main):014:0> a="abcdef"
=> "abcdef"
irb(main):015:0> change(a[1..2])
=> "bcchanged"
irb(main):016:0> a
=> "abcdef"

Cheers
Robert
 
Robert Klemme

Copying in this case is not caused by using the string as a parameter
but by appending to it.

I thought this thread was about /scanning/ which is a read only
operation. Did I miss something?
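
Whether slicing is actually expensive for read-only scanning can be measured rather than guessed. The following is only a rough sketch under stated assumptions: the helper names via_slices, via_offset and seconds are hypothetical, the pattern is assumed never to match an empty string, and the two-argument Regexp#match requires Ruby 1.9 or later. Timings depend on the Ruby version and implementation, so none are claimed here.

```ruby
# Walk successive matches by slicing off the part already scanned.
def via_slices(string, regexp)
  found = []
  offset = 0
  while (m = regexp.match(string[offset..-1]))
    found << [offset + m.begin(0), m[0]]  # convert back to absolute position
    offset += m.end(0)
  end
  found
end

# Walk successive matches by passing a start position (Ruby 1.9+).
def via_offset(string, regexp)
  found = []
  pos = 0
  while (m = regexp.match(string, pos))
    found << [m.begin(0), m[0]]
    pos = m.end(0)
  end
  found
end

# Wall-clock seconds taken by the given block.
def seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

text = "word " * 5_000
puts "slices: #{seconds { via_slices(text, /\w+/) }}"
puts "offset: #{seconds { via_offset(text, /\w+/) }}"
```

Both helpers only read the string, so if slices share the character buffer as suggested, the gap between the two timings should stay small as the input grows.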

Kind regards

robert
 
Robert Dober

No, you did not. Theoretically it might work like this:

def change(x)
  x << "changed"  # copy on write
end

a = "some string"
b = a[1..3]       # shallow copy
b << "changed"    # copy on write
a << "changed"    # no copy of course

but do you think it does? Note that the object must have state to know
when and how to copy the underlying data. I am about to read string.c,
but it is quite complicated and I have some work to do :(.

Cheers
Robert
 
