Newbie regex question about \w

Jeff Cohen

I'm trying to parse ruby files to find all the class definitions in the
file. For each line in the file, I thought I could use the following to
pull out the class name:

\bclass\b(\w+)\b

so then $1 would give me the class name.

But it doesn't work:

irb(main):001:0> line = "class Article < MyBaseClass"
=> "class Article < MyBaseClass"
irb(main):002:0> line =~ /\bclass\b(\w+)\b/
=> nil

I think I narrowed down the problem to my use of \w, but I can't
understand why.

For extra credit, anybody know how I can make sure I can ignore comments
and quoted strings? I want to make sure I ignore these things:

if option_exists # handle class options

as well as

puts "You are in a class by yourself"

But those are advanced... if I can just get the first one working I'll
be grateful!

Thanks,
Jeff
 
James Edward Gray II

I'm trying to parse ruby files to find all the class definitions in the
file. For each line in the file, I thought I could use the following to
pull out the class name:

\bclass\b(\w+)\b

so then $1 would give me the class name.

You're close, you just forgot to allow for some space between class
and the name. A boundary is a zero-width assertion, so it's not enough:

\bclass\s+(\w+)\b
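As a quick check of the difference, here is a minimal sketch that runs both patterns against the line from the original post. The constant and method names are made up for illustration.

```ruby
# The pattern from the question: \b consumes nothing, so (\w+) has to start
# right after "class", where the next character is a space.
ORIGINAL_PATTERN = /\bclass\b(\w+)\b/

# The suggested fix: \s+ consumes the whitespace between "class" and the name.
FIXED_PATTERN = /\bclass\s+(\w+)\b/

# Returns the captured class name, or nil when the pattern does not match.
def class_name_with(pattern, line)
  match = pattern.match(line)
  match && match[1]
end

line = "class Article < MyBaseClass"
p class_name_with(ORIGINAL_PATTERN, line)
p class_name_with(FIXED_PATTERN, line)
```

The first call returns nil and the second returns "Article".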

Hope that helps.

James Edward Gray II
 
Jeff Cohen

James said:
You're close, you just forgot to allow for some space between class
and the name. A boundary is a zero-width assertion, so it's not enough:

Awesome. Thanks a lot.

Jeff
 
Logan Capaldo

I'm trying to parse ruby files to find all the class definitions in the
file. For each line in the file, I thought I could use the following to
pull out the class name:

\bclass\b(\w+)\b

so then $1 would give me the class name.

But it doesn't work:

irb(main):001:0> line = "class Article < MyBaseClass"
=> "class Article < MyBaseClass"
irb(main):002:0> line =~ /\bclass\b(\w+)\b/
=> nil

I think I narrowed down the problem to my use of \w, but I can't
understand why.

For extra credit, anybody know how I can make sure I can ignore comments
and quoted strings? I want to make sure I ignore these things:

if option_exists # handle class options

as well as

puts "You are in a class by yourself"

But those are advanced... if I can just get the first one working I'll
be grateful!

Thanks,
Jeff

Your \w is right. \b doesn't work the way you think it does, though.
It is a zero-width assertion, so it doesn't consume anything, i.e.:

"  <-- \b matches just before the 'c'
c
l
a
s
s  <-- \b matches between the 's' and the space
   <-- the space itself doesn't match \w, so (\w+) fails here
A
r
t
i
c
l
e
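One way to see where \b actually matches is to substitute a visible marker at every boundary. This is only an illustrative sketch, and the method name is hypothetical.

```ruby
# Inserts "|" at every position where \b matches. Because \b is zero-width,
# no characters of the input are replaced or removed.
def mark_boundaries(text)
  text.gsub(/\b/, "|")
end

puts mark_boundaries("class Article")
```

This prints |class| |Article|. The space is still sitting between two boundaries, so something like \s+ has to consume it before (\w+) can match the name.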
 
dblack

Hi --

For extra credit, anybody know how I can make sure I can ignore comments
and quoted strings? I want to make sure I ignore these things:

if option_exists # handle class options

as well as

puts "You are in a class by yourself"

But those are advanced... if I can just get the first one working I'll
be grateful!

You might try this:

/^\s*class\s+(\w+)/

which will only match "class ..." at the beginning of a line or with
only spaces to its left. It's certainly not impossible to get false
positives or negatives this way, but in the normal course of a
normally-written Ruby program file it should be close to 100%.

Don't forget, though, that you might get "::" in a class name, like
this:

module M
end

class M::C
end

and just going for \w+ will give you the module name, not the class
name.
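A possible way to combine the line-anchored pattern with the "::" caveat is to let the capture include namespace segments. The sketch below assumes names of the form A::B::C and does not handle a leading "::", heredocs, or other unusual layouts. The method name and sample source are made up for illustration.

```ruby
# Anchored at the start of a line (only whitespace allowed before "class"),
# capturing a name that may contain "::" separators.
CLASS_DEFINITION = /^\s*class\s+(\w+(?:::\w+)*)/

# Returns every class name the pattern finds in a string of Ruby source.
def class_names(source)
  source.scan(CLASS_DEFINITION).flatten
end

sample = <<~'SOURCE'
  module M
  end

  class M::C
  end

  if option_exists # handle class options
  puts "You are in a class by yourself"
  class Article < MyBaseClass
  end
SOURCE

p class_names(sample)
```

For this sample the result is ["M::C", "Article"]. The comment and the quoted string are skipped only because "class" is not the first word on those lines, so this is still a heuristic rather than a parser.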


David

--
David A. Black
(e-mail address removed)

"Ruby for Rails", from Manning Publications, coming April 2006!
http://www.manning.com/books/black
 
Simon Kröger

[...]
especially with heredocs and such. I would take a look at rdoc and see
if you can't manipulate it to get a list of classes for you.

Maybe this is about regexps and I'm totally off, but what about:

---------------------------------------------
before = Object.constants
require 'sqlite3' # put your file here
after = Object.constants

p(after - before)
---------------------------------------------

output:

["NKF", "Deprecated", "SQLite3", "Base64", "Kconv", "ParseDate"]


cheers

Simon
 
Logan Capaldo

[...]
especially with heredocs and such. I would take a look at rdoc
and see if you can't manipulate it to get a list of classes for you.

Maybe this is about regexps and I'm totally off, but what about:

---------------------------------------------
before = Object.constants
require 'sqlite3' # put your file here
after = Object.constants

p(after - before)
---------------------------------------------

output:

["NKF", "Deprecated", "SQLite3", "Base64", "Kconv", "ParseDate"]


cheers

Simon

Clever!

before = Object.constants
require 'file'
after = Object.constants
files_classes = (after - before).select { |x| Class === Object.const_get(x) }
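Here is a self-contained sketch of the same before/after idea. It writes a small hypothetical file to a temp location so there is something to load. The helper name, the Sketch* constants, and the use of load instead of require are assumptions made for the example. On current Rubies Object.constants returns symbols, not the strings shown in the output above.

```ruby
require "tempfile"

# Hypothetical file contents: one class, one module, and one plain constant.
SKETCH_SOURCE = <<~RUBY
  class SketchArticle; end
  module SketchHelpers; end
  SKETCH_LIMIT = 10
RUBY

# Loads the file at `path` and returns the names of the top-level classes it
# added, by diffing Object.constants before and after the load.
def classes_defined_by(path)
  before = Object.constants
  load path
  added = Object.constants - before
  added.select { |name| Object.const_get(name).is_a?(Class) }
end

new_classes = Tempfile.create(["sketch", ".rb"]) do |file|
  file.write(SKETCH_SOURCE)
  file.flush
  classes_defined_by(file.path)
end

p new_classes
```

Note that this executes the file, so it only suits code you are willing to run, and it sees only top-level constants. Classes nested inside a module would need a recursive walk.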
 
