a prototype ctags format generator for rdoc


Sam Roberts

Why? Because I want things tagged, like qualified names, that would
require implementing a fairly complex Ruby parser, and rdoc already is
a Ruby parser.

To try it out, put tags_generator.rb into rdoc/generators/, somewhere
in your Ruby load path, then run:

rdoc -f tags mylib/
exuberant-ctags -R mylib
mv tags tags.ctags
sort tags.ctags tags.rdoc > tags

Currently, it only generates qualified method tags. These are tags that
Exuberant Ctags does NOT support for Ruby (they would be turned on by
--extra=+q if they were supported), so there shouldn't be duplicates.

Try it out. Qualified tags allow you to do:

:ta RDoc.<TAB>

and see all the methods and classes in RDoc, choose a nested class, keep
tabbing, see all the methods in that class, etc. It allows you to use
your tags to drill down through the tree to the exact method you want,
even to discover the methods, in a way.

Exuberant Ctags does this for C++-like languages (Java, etc.), but its
Ruby support doesn't. I added some discussion of this over on the wiki
(maybe not a good page?):

http://www.rubygarden.org/ruby?VimRubySupport

Because this uses rdoc, it tags methods like == and [] that Exuberant
Ctags doesn't get. Also, rdoc knows to map Vpim.Icalendar.Address.new to
the 'def initialize' of that class!

For me, personally, lack of qualified tags in an OO language makes tags
almost useless: how many classes are going to have #initialize, or
#to_s? It's been bugging me for a while now that I'm slower navigating
Ruby code than C code; this might help a bit.
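
For readers who haven't looked inside a tags file: each line the generator writes has four tab-separated fields (tag name, file, a search pattern, and a kind letter). Here is a minimal sketch of building one such line. `escape_pattern` and `tag_line` are hypothetical helper names, not part of rdoc or of the generator below, the file name is made up, and escaping backslashes and slashes is my assumption about what the pattern field needs (the generator below escapes nothing).

```ruby
# Hypothetical helpers, not part of rdoc or the generator in this thread.

# Escape backslashes and forward slashes so the source text can sit
# between the / delimiters of a tags-file search pattern.
def escape_pattern(source_line)
  source_line.gsub(/[\\\/]/) { |ch| "\\#{ch}" }
end

# Build one tags-file line: name, file, search pattern, kind letter.
def tag_line(name, file, source_line, kind)
  "#{name}\t#{file}\t/^#{escape_pattern(source_line)}$/;\"\t#{kind}"
end

# 'vpim/address.rb' is an invented path, used only for illustration.
puts tag_line('Vpim.Icalendar.Address.new', 'vpim/address.rb',
              '    def initialize(field)', 'F')
```

A method named / is the obvious case where the unescaped form would end the pattern early.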

What do you all think? Useful to anybody else? Worth pushing forward?

Cheers,
Sam

Note for Dave:

A problem I'm having is that I'd like to add support for qualified
classes and modules, but that means recovering the text from the line
they were defined on in order to build the ctags regex. The source
text/tokens are kept around for methods, but they aren't kept around for
class/module definitions, and I don't think they're kept around for
constants either, which would also be cool to tag. And maybe even
attributes?
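
One possible workaround, sketched here only as an idea: re-read the source file and look for the line that opens the class or module. `definition_line` is a hypothetical helper, not an rdoc API. It would pick the wrong line when two classes with the same simple name live in one file, and it assumes the definition is written as a plain `class Name` or `module Name` line.

```ruby
# Hypothetical helper, not part of rdoc: find the source line that opens
# a class or module, so it can serve as the search pattern of a tag.
def definition_line(lines, simple_name)
  pattern = /^\s*(class|module)\s+(\w+::)*#{Regexp.escape(simple_name)}\b/
  lines.find { |line| line =~ pattern }
end

# Made-up source text, standing in for File.readlines on a real file.
source = <<~RUBY
  module Vpim
    class Icalendar
      class Address
      end
    end
  end
RUBY

puts definition_line(source.lines.map(&:chomp), 'Address').inspect
```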

----- tags_generator.rb -----
require 'ftools'

require 'rdoc/options'
require 'rdoc/template'
require 'rdoc/markup/simple_markup'
require 'rdoc/markup/simple_markup/to_flow'
require 'cgi'

require 'rdoc/ri/ri_cache'
require 'rdoc/ri/ri_reader'
require 'rdoc/ri/ri_writer'
require 'rdoc/ri/ri_descriptions'

require 'pp'

module RDoc
  class ClassModule
    def file_name
      if @parent.class == TopLevel
        @parent.file_absolute_name
      else
        @parent.file_name
      end
    end
  end

  class AnyMethod
    # Collect all the tokens for the method, from the beginning of the
    # line with the identifier, to the end of that line.
    def decl_string
      src = ''
      break_on_nl = false
      if @token_stream
        @token_stream.each do |t|
          next unless t
          case t
          when RubyToken::TkCOMMENT then ;
          when RubyToken::TkNL
            break if break_on_nl
            src = ''
          else
            src << t.text
          end
          break_on_nl = true if RubyToken::TkIDENTIFIER === t
        end
        if false
          puts "----------------------"
          pp @token_stream
          puts "+++"
          puts src
          puts "----------------------"
        end
      end
      src
    end
  end
end

module Generators
  class TAGSGenerator
    def TAGSGenerator.for(options)
      new(options)
    end

    class <<self
      protected :new
    end

    # Set up a new tags generator. All we do here is record the options
    # and open the output file.

    def initialize(options) #:not-new:
      @options = options

      # FIXME - make this a command-line option
      @gen_qualified = true

      # FIXME - make this a command-line option
      @output = File.open("tags.rdoc", 'w')

      #pp options
    end

    def generate(toplevels)
      # pp toplevels

      RDoc::TopLevel.all_classes_and_modules.each do |cls|
        process_class(cls)
      end
    end

    def process_class(from_class)
      generate_class_info(from_class)

      # now recurse into this class's constituent classes
      from_class.each_classmodule do |mod|
        process_class(mod)
      end
    end

    def generate_class_info(cls)
      # pp cls

      # FIXME:
      # - when generating qualified classes, generate the intermediate qualified as well, so
      #   all of these:
      #     Outer.Middle.Inner.a_method
      #     Middle.Inner.a_method
      #     Inner.a_method
      #     a_method

      # Only used by the commented-out class/module tags below.
      if RDoc::NormalModule === cls
        tag_type = 'm'
      else
        tag_type = 'c'
      end

      # This won't work, don't have enough information to reconstruct the search string.
      # @output.puts tag = "#{cls.name}\t#{cls.file_name}\t/^class *#{cls.name}/;\"\t#{tag_type}"

      # if @gen_qualified && cls.name != cls.full_name
      #   @output.puts "#{cls.full_name.gsub('::', '.')}\t#{cls.file_name}\t/^class *#{cls.name}/;\"\t#{tag_type}"
      # end

      cls.method_list.each do |m|
        # 'F' marks singleton (class) methods, 'f' instance methods.
        if m.singleton
          tag_type = 'F'
        else
          tag_type = 'f'
        end

        tag = "#{m.name}\t#{cls.file_name}\t/^#{m.decl_string}$/;\"\t#{tag_type}"
        # @output.puts tag
        if @gen_qualified
          @output.puts "#{cls.full_name.gsub('::', '.')}.#{tag}"
        end
      end

      # It would be great to tag attributes and constants
      # cls.attributes.each do |a|
      # cls.constants.each do |c|
    end

  end
end
 

Sam Roberts

Second version fixes a bug wherein methods were being assumed to be in
the same file as their enclosing class/module.

Also adds more tags, so:

module Foo::Bar::Baz
  def []
  end
end

will generate tags for

Baz.[]
Bar.Baz.[]
Foo.Bar.Baz.[]
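
The expansion amounts to taking every trailing slice of the '::'-separated name. A small standalone sketch of that idea follows; `qualifiers` is a hypothetical helper name, and the generator itself does the same thing inline rather than through a method like this.

```ruby
# Hypothetical helper: every trailing slice of a '::'-separated name,
# joined with '.', shortest first.
def qualifiers(full_name)
  path = full_name.split('::')
  (1..path.length).map { |n| path.last(n).join('.') }
end

# Prints one qualified tag name per line for the [] method above.
qualifiers('Foo::Bar::Baz').each { |q| puts "#{q}.[]" }
```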

Usage is the same. I'll probably make it do the merging itself soon, to
make this simpler.
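
Until then, the merge step could be done from Ruby instead of sort. This is a rough sketch and not what the generator does today. `merge_tags` is a hypothetical helper, and I'm assuming plain byte-order sorting, which is what Ruby's String comparison gives, is what the editor wants from a tags file.

```ruby
# Hypothetical helper: merge the lines of several tags files into one
# sorted, duplicate-free list.
def merge_tags(*line_lists)
  line_lists.flatten.map(&:chomp).reject(&:empty?).uniq.sort
end

# Made-up sample lines standing in for File.readlines('tags.ctags')
# and File.readlines('tags.rdoc').
ctags_lines = ["to_s\tfoo.rb\t/^  def to_s$/;\"\tf\n"]
rdoc_lines  = ["Foo.to_s\tfoo.rb\t/^  def to_s$/;\"\tf\n"]

puts merge_tags(ctags_lines, rdoc_lines)
```

With real files, the result would be joined with newlines and written out as tags.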

Cheers,
Sam

----- tags_generator.rb -----
require 'ftools'

require 'rdoc/options'
require 'rdoc/template'
require 'rdoc/markup/simple_markup'
require 'rdoc/markup/simple_markup/to_flow'
require 'cgi'

require 'rdoc/ri/ri_cache'
require 'rdoc/ri/ri_reader'
require 'rdoc/ri/ri_writer'
require 'rdoc/ri/ri_descriptions'

require 'pp'

module RDoc
  class ClassModule
    # FIXME - I don't think this works: methods need not be in the same
    # file as their enclosing class/module.
    def file_name
      if TopLevel === @parent
        @parent.file_absolute_name
      else
        @parent.file_name
      end
    end
  end

  class AnyMethod
    # Collect the source line that declares the method, and the filename
    # taken from rdoc's "# File ..., line N" comment token.
    def decl_string_and_file
      src = ''
      filename = nil
      break_on_nl = false
      if @token_stream
        @token_stream.each do |t|
          next unless t
          case t
          when RubyToken::TkCOMMENT
            # TkCOMMENT.text is "# File vpim/maker/vcard.rb, line 29"
            if t.text =~ /# File (.*), line \d+/
              filename = $1
            end
          when RubyToken::TkNL
            break if break_on_nl
            src = ''
          else
            src << t.text
          end
          break_on_nl = true if RubyToken::TkIDENTIFIER === t
        end
        if false
          puts "----------------------"
          pp @token_stream
          puts "+++"
          puts src
          puts "----------------------"
        end
      end
      [ src, filename ]
    end
  end
end

module Generators

  class TAGSGenerator

    # Generators may need to return specific subclasses depending
    # on the options they are passed. Because of this
    # we create them using a factory

    def TAGSGenerator.for(options)
      new(options)
    end

    class <<self
      protected :new
    end

    # Set up a new tags generator. All we do here is record the options
    # and open the output file.

    def initialize(options) #:not-new:
      @options = options

      # TODO - make this a command-line option
      @gen_qualified = true

      # TODO - make this a command-line option
      @gen_unqualified = false

      # TODO - make this a command-line option
      @output = File.open("tags.rdoc", 'w')

      # TODO - make this a command-line option
      @dump = nil # File.open("rdoc.dump", 'w')

      # TODO - make this a command-line option
      @verbose = nil

      #pp options
    end

    def generate(toplevels)
      # This takes over 8 minutes on vPim! Wow!
      PP.pp( toplevels, @dump ) if @dump

      RDoc::TopLevel.all_classes_and_modules.each do |cls|
        process_class(cls)
      end
    end

    def process_class(from_class)
      generate_class_info(from_class)

      # now recurse into this class's constituent classes
      from_class.each_classmodule do |mod|
        process_class(mod)
      end
    end

    def generate_class_info(cls)
      # When generating qualified names, the intermediate qualifiers are
      # generated as well, so a_method gets all of these:
      #   Outer.Middle.Inner.a_method
      #   Middle.Inner.a_method
      #   Inner.a_method
      #   a_method (only if @gen_unqualified)

=begin
      # TODO: can't do classes and modules, we don't have the original text tokens to reconstruct
      # the tag's regex.
      if RDoc::NormalModule === cls
        tag_type = 'm'
      else
        tag_type = 'c'
      end

      @output.puts tag = "#{cls.name}\t#{cls.file_name}\t/^class *#{cls.name}/;\"\t#{tag_type}"

      if @gen_qualified && cls.name != cls.full_name
        @output.puts "#{cls.full_name.gsub('::', '.')}\t#{cls.file_name}\t/^class *#{cls.name}/;\"\t#{tag_type}"
      end
=end

      cls.method_list.each do |m|
        # 'F' marks singleton (class) methods, 'f' instance methods.
        if m.singleton
          tag_type = 'F'
        else
          tag_type = 'f'
        end

        decl_string, decl_file = m.decl_string_and_file

        puts "Tagging: #{m.name} in: #{cls.full_name} from: #{decl_file}" if @verbose

        tag = "#{m.name}\t#{decl_file}\t/^#{decl_string}$/;\"\t#{tag_type}"

        if @gen_unqualified
          @output.puts tag
        end
        if @gen_qualified
          # Emit one tag per trailing slice of the class path:
          # Inner, Middle.Inner, Outer.Middle.Inner.
          path = cls.full_name.split('::')
          (1..path.length).each do |elements|
            qualifier = path[-elements, elements].join('.')

            puts " ..#{qualifier}" if @verbose

            @output.puts "#{qualifier}.#{tag}"
          end
        end
      end

      # TODO: It would be great to tag attributes and constants
      # cls.attributes.each do |a|
      # cls.constants.each do |c|
    end

  end
end
 

Doug Kearns

On Sun, Nov 14, 2004 at 08:31:50AM +0900, Sam Roberts wrote:

> For me, personally, lack of qualified tags in an OO language makes tags
> almost useless: how many classes are going to have #initialize, or
> #to_s? It's been bugging me for a while now that I'm slower navigating
> Ruby code than C code; this might help a bit.
>
> What do you all think?

Excellent!

> Useful to anybody else?

Yes.

> Worth pushing forward?

Yes, please.

<snip>

Thanks,
Doug
 
