regexp exclusion search - find matches NOT ending with a string?

BrendanC

I have the following text in a file:

1 a1.html
2 b.doc
3 c.xml
4 d.tiff
5 e.jpeg
6 f.html
....

I need a regexp to match all lines except those ending in
".html" - iow - I want lines 2-5 above. I believe this may require a
negative lookbehind match. I tried the following but Ruby (1.8) gives
an undefined sequence error:

$(?<!\.html) # <---- this seems to work with other engines

Before you jump on the Ruby version: I also tested this at
http://www.rubyxp.com/ and got an invalid expression error (FYI, that
site tests with Ruby 1.9). Any ideas/alternatives?

TIA,
BC
 
Xavier Noria

The easiest path is to negate that it matches, say for instance:

if filename !~ /\.html\z/
  # non-HTML here
end
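
Applied to the whole list of lines from the question, the same negated match could look roughly like this. It is only a sketch: the helper name non_html_lines and the sample data are made up for illustration, and the lines are chomped first because \z would otherwise see the trailing newline instead of ".html".

```ruby
# Hypothetical helper (not from the thread): keep only the lines that
# do not end in ".html".
def non_html_lines(lines)
  # chomp first so a trailing newline does not hide the extension from \z
  lines.map { |line| line.chomp }.select { |line| line !~ /\.html\z/ }
end

# Sample data mirroring the question; lines end in "\n" as they would
# when read from a file.
sample = ["1 a1.html\n", "2 b.doc\n", "3 c.xml\n",
          "4 d.tiff\n", "5 e.jpeg\n", "6 f.html\n"]

puts non_html_lines(sample)
```

With the six sample lines this should print lines 2 through 5.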

-- fxn
 
David A. Black

Hi --

I would probably do:

lines.reject {|line| line =~ /html$/ }
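
One detail worth checking with the pattern above: /html$/ has no dot, so it would also reject a name such as "g.xhtml". A small sketch of the difference follows; the method names and the extra "7 g.xhtml" line are made up for illustration and are not part of the original data.

```ruby
# Hypothetical helpers comparing the pattern with and without the dot.
def reject_html(lines)
  # $ matches at the end of a line, so a trailing "\n" is not a problem
  lines.reject { |line| line =~ /\.html$/ }
end

def reject_html_loose(lines)
  # no dot: anything ending in the letters "html" is rejected
  lines.reject { |line| line =~ /html$/ }
end

sample = ["1 a1.html", "2 b.doc", "6 f.html", "7 g.xhtml"]

p reject_html(sample)
p reject_html_loose(sample)
```

The first call keeps "7 g.xhtml" and the second drops it, so which pattern is right depends on whether such names can occur in the file.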


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
Training! Intro to Ruby, with Black & Kastner, September 14-17
(More info: http://rubyurl.com/vmzN)
 
Robert Dober

Xavier and David gave good advice.
If however you really have to have a matching regex

%r($(?<!\.html)\z) # is that what you meant above?

works fine. I believe that you can install Oniguruma on 1.8 as a gem
for that purpose.
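
If installing Oniguruma is not an option, a negative lookahead may be enough, since lookahead (unlike lookbehind) is understood by the 1.8 regexp engine. The sketch below shows both forms side by side; the constant and method names are made up for illustration, the lookbehind version needs Ruby 1.9 or later, and both assume the line has already been chomped.

```ruby
# Negative lookahead: from the start of the string, assert that the text
# does not run up to a final ".html", then consume the whole line.
NOT_HTML_LOOKAHEAD = /\A(?!.*\.html\z).*\z/

# Negative lookbehind at the end of the string (Oniguruma, Ruby 1.9+).
NOT_HTML_LOOKBEHIND = /(?<!\.html)\z/

# Hypothetical helper: true when the chomped line matches the given pattern.
def keeps?(pattern, line)
  !(line.chomp =~ pattern).nil?
end

["1 a1.html\n", "2 b.doc\n", "6 f.html\n"].each do |line|
  puts "#{line.chomp}: lookahead=#{keeps?(NOT_HTML_LOOKAHEAD, line)} " \
       "lookbehind=#{keeps?(NOT_HTML_LOOKBEHIND, line)}"
end
```

Without the chomp, a string such as "a.html\n" would slip through the lookbehind version, because the five characters before the end of the string are then "html\n" rather than ".html".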
HTH
Robert



--
Toutes les grandes personnes ont d’abord été des enfants, mais peu
d’entre elles s’en souviennent.

All adults have been children first, but not many remember.

[Antoine de Saint-Exupéry]
 
Rob Biedenharn

Glenn said:
Is the Ruby regular expression syntax documented anywhere?

I was attempting to use a look-behind, but it's not supported.

The syntax is not documented in the RegExp rdocs, and I haven't seen a
site that spells out all the nitty-gritty details and pokes into the
dark corners.

I'm looking for the Ruby equivalent of:
http://www.tcl.tk/man/tcl8.5/TclCmd/re_syntax.htm
http://docs.python.org/library/re.html#regular-expression-syntax
http://perldoc.perl.org/perlre.html

Does it exist?


You could try the Regular Expressions section of the Standard Types
chapter of Programming Ruby. Be advised that this is the online
version of the 1st edition that is now 8 years old. Since you seem to
be using a version 1.8.x of Ruby, the Regexp parts are going to be
mostly the same.

http://www.ruby-doc.org/docs/ProgrammingRuby/

-Rob

Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
 
Robert Dober

For Oniguruma I found this most helpful
http://manual.macromates.com/en/regular_expressions#regular_expressions
Nice one

Cheers
Robert
 
7stud --

BrendanC said:
I have the following text in a file:

1 a1.html
2 b.doc
3 c.xml
4 d.tiff
5 e.jpeg
6 f.html
....

I need a regexp to match all lines except those ending in
".html" - iow - I want lines 2-5 above.

Some alternate means to the same end:

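IO.foreach("data.txt") do |line|

  # 1: compare the text after the last dot
  if line.chomp.split(".")[-1] != "html"
    puts line
  end

  # 2: the four characters just before the trailing newline
  if line[-5, 4] != "html"
    print line
  end

  # 3: same slice as a range; the exclusive range (...) leaves out the newline
  if line.slice(-5...-1) != "html"
    print line
  end

  puts
end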

--output:--
2 b.doc
2 b.doc
2 b.doc

3 c.xml
3 c.xml
3 c.xml

4 d.tiff
4 d.tiff
4 d.tiff

5 e.jpeg
5 e.jpeg
5 e.jpeg
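
Two more variants in the same spirit that do not depend on the position of the trailing newline, as a rough sketch. The method names are made up for illustration; String#end_with? should be available from Ruby 1.8.7 on as far as I know, and File.extname treats the whole line as if it were a file name, which is an assumption about the data.

```ruby
# Hypothetical helpers, not from the thread.

# Variant A: plain suffix test on the chomped line.
def html_by_suffix?(line)
  line.chomp.end_with?(".html")
end

# Variant B: let File.extname pick out the extension.
def html_by_extname?(line)
  File.extname(line.chomp) == ".html"
end

lines = ["1 a1.html\n", "2 b.doc\n", "3 c.xml\n", "6 f.html"]

lines.each do |line|
  puts line unless html_by_suffix?(line) || html_by_extname?(line)
end
```

Both checks chomp the line first, so they behave the same whether or not the last line of the file ends in a newline.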
 
Brian Candler

Glenn said:
Is the Ruby regular expression syntax documented anywhere?

I was attempting to use a look-behind, but it's not supported.

The syntax is not documented in the RegExp rdocs

In my opinion, documentation is Ruby's weakest aspect by far - and the
deficiency has gotten substantially worse with ruby 1.9.

Best available information is in third-party books, which presumably
have reverse-engineered from the source code. I fairly often resort to
irb to check behaviour is what I want, and have on occasions had to
resort to reading the source.
 
