> The core of RCM would allow this kind of SCM as it knows nothing about
> files, only entities and their names. In the case of files, the name
> would be the file name, and the entity the content of the file. But in
> the same way, you could version classes and methods.

Good. But the important thing here is that if your entities are small
enough, you don't need to worry about automatic merging or diffing
within the content of the entity. You only need to compare different
content versions for equality. This makes a *huge* difference to the
complexity of the SCM. So it's not really about whether you could
implement something method-based on top of something file-based
(clearly you could), but about how much work and complexity you can
avoid by doing it method-based from the start.
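To make the "equality only" point concrete, here is a minimal sketch. It assumes a snapshot is just a Hash from entity name to content; the method name `entity_changes` and the naming scheme ("A#b" and so on) are made up for illustration and are not part of RCM or Monticello.

```ruby
# Compare two snapshots entity by entity. No diffing inside an entity:
# each one is either identical, added, removed, or changed as a whole.
def entity_changes(old_snapshot, new_snapshot)
  (old_snapshot.keys | new_snapshot.keys).each_with_object({}) do |name, changes|
    before = old_snapshot[name]
    after = new_snapshot[name]
    next if before == after # equal content: nothing to record

    changes[name] =
      if before.nil? then :added
      elsif after.nil? then :removed
      else :changed
      end
  end
end

# Hypothetical snapshots of class A at two points in time.
v1 = { "A#b" => "def b\nend", "A::C" => "C = 4" }
v2 = { "A#b" => "def b\n  42\nend", "A::C" => "C = 4", "A.a" => "attr_accessor :a" }
p entity_changes(v1, v2)
```

The whole comparison is a hash walk plus string equality, which is where the saving in complexity comes from.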

> Or how would you split this file as methods, classes etc.?
>
>   class A
>     attr_accessor :a
>     def b
>     end
>     C = 4
>   end
>
> You can't simply reorder the statements inside the class!

That's definitely the most challenging problem of this approach. To
start with you could have an entity (maybe represent it as a singleton
method called "initialize" on the class, if you like) that would
gather up all of the "loose" code inside the class def. In most
cases, I think just keeping this in a single block at the top of the
class def would be ok. However, you would definitely need to provide
some structure for cases where ordering was important. One TSTTCPW
(The Simplest Thing That Could Possibly Work)
strategy would be to use RDoc-like comments to mark off blocks of code
that should be treated as individual entities, combined with a general
inclination on the part of the SCM to keep entities in the same order
where possible. So:

  class A
    #RCM
    attr_accessor :a
    #/RCM
    def b
    end
    #RCM
    C = 4
    #/RCM
  end
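As a rough illustration of how an SCM might read those markers, here is a minimal sketch. It assumes the markers appear exactly as "#RCM" and "#/RCM" on lines of their own, and that everything outside them is grouped into unmarked chunks in source order; `split_marked_blocks` is a hypothetical name, and the sketch makes no attempt to parse methods properly.

```ruby
# Split the body of a class definition into an ordered list of
# [kind, lines] pairs, where kind is :marked (between #RCM and #/RCM)
# or :unmarked (everything else, e.g. method definitions).
def split_marked_blocks(source)
  entities = []
  inside_marker = false
  source.each_line(chomp: true) do |line|
    text = line.strip
    next if text.empty?

    case text
    when "#RCM"
      inside_marker = true
      entities << [:marked, []]
    when "#/RCM"
      inside_marker = false
    else
      # Start a new unmarked chunk unless we are already collecting one.
      unless inside_marker || (entities.last && entities.last[0] == :unmarked)
        entities << [:unmarked, []]
      end
      entities.last[1] << text
    end
  end
  entities
end

body = <<~RUBY
  #RCM
  attr_accessor :a
  #/RCM
  def b
  end
  #RCM
  C = 4
  #/RCM
RUBY
p split_marked_blocks(body)
```

Because the result is an Array, the original ordering of the blocks is preserved, which is the property the markers are there to protect.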

> Do you know if Codeville is usable for C code? Or is a very regular
> object structure (packages, classes, methods) required?

Codeville doesn't require any structure, it's line-based. Monticello
2 uses the Codeville approach (or at least I think it's the same; we
came to it independently) of a 2-way merge + independent ancestry info
for each fine-grained entity, but in Codeville's case the entity is
the line whereas in Monticello's case it's the method. I do think the
idea works much *better* if there's a regular structure (how do you
uniquely identify a line to attach ancestry to it?) but I think it's
still worth looking at even for a more language-agnostic system.
One thing that's really cool about this, though I don't know if
Codeville itself takes advantage of it, is that you can do a perfect
merge having *only* the two versions you're merging plus their
metadata. So a formal repository becomes completely unnecessary: you
never need to look up any previous history. You could build this as
an extension to RubyGems, for example: each gem version would carry
around enough info to let it be merged into any other gem version, and
they could just be emailed or FTP'd or whatever around without any
special protocol. It's about as distributed as it gets.
Monticello 1 did this as well, but with a Monotone-style three-way
merge, which means that you still need enough of a repository to find
and retrieve the common ancestor.
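To illustrate the two-way merge with per-entity ancestry described above, here is a minimal sketch. It is not Codeville's or Monticello's actual algorithm. It assumes each entity version carries a unique id and the full set of ids it descends from (its own included), and it ignores deletions; `EntityVersion`, `merge_entity`, and `merge_snapshots` are hypothetical names.

```ruby
require 'set'

# One version of one entity, plus the ids of every version it descends
# from (including itself). This is the "metadata" that travels with it.
EntityVersion = Struct.new(:id, :content, :ancestry)

# Merge two versions of the same entity using only their own metadata.
def merge_entity(ours, theirs)
  if theirs.ancestry.include?(ours.id)
    theirs # theirs already contains our version in its history
  elsif ours.ancestry.include?(theirs.id)
    ours # ours already contains their version in its history
  elsif ours.content == theirs.content
    # Independent edits that happen to agree: keep one, join histories.
    EntityVersion.new(ours.id, ours.content, ours.ancestry | theirs.ancestry)
  else
    :conflict # neither descends from the other and the contents differ
  end
end

# Merge two snapshots (entity name => EntityVersion). An entity present
# on only one side is taken as-is (deletions are not modelled here).
def merge_snapshots(ours, theirs)
  (ours.keys | theirs.keys).each_with_object({}) do |name, merged|
    a = ours[name]
    b = theirs[name]
    merged[name] = (a && b) ? merge_entity(a, b) : (a || b)
  end
end

base   = EntityVersion.new("v1", "def b\nend", Set["v1"])
edited = EntityVersion.new("v2", "def b\n  42\nend", Set["v1", "v2"])
other  = EntityVersion.new("v3", "def b\n  :x\nend", Set["v1", "v3"])

p merge_entity(base, edited).id   # edited descends from base
p merge_entity(edited, other)     # diverged from base in different ways
```

Nothing here looks up a repository or a common ancestor: the two versions and their ancestry sets are the only inputs, which is what would let versions be passed around by email or FTP.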
Cheers,
Avi