Extension Language for a Text Editor

Nikolai Weibull · Oct 9, 2003

* Simon Strandgaard said:
Yes this isn't going to be easy. I am the type who likes
hairy problems.

Remember that there is more than syntax to consider. Both semantics and
actually doing the matching. You can alter the syntax without altering
the semantics and matching. (I may be expressing this incorrectly
though),
nikolai

P.S.
What I'm saying here is that even if you change the syntax, you don't
have to make it any less powerful, or reimplement it from the start.
D.S.

Nikolai Weibull · Oct 9, 2003

* Dale Martenson said:
You may want to look at the VIM's use of Ruby for writing
extensions. Documentation available inside of VIM via ":help
ruby".

Yes, but have you actually seen anyone use it? Or any of the other
extension languages available (Tcl, Perl, Python)?

For an up-to-date copy of VIM see: http://www.vim.org

Hm, have you not seen me post on the Vim lists? I'm an avid user of
Vim. Thanks anyway ;-),
nikolai

Nikolai Weibull · Oct 9, 2003

* Gavin Sinclair said:
This capability of Vim doesn't seem to be heavily used. I certainly
don't use it. Partially because the integration is a bit lacking (but
you could do Ryan's aligner very easily - but then there's an
excellent and well-supported vim plugin for that anyway). Partially
because I couldn't be bothered.

The problem with Vim is that it has an interface to every possible
language, so none of them becomes the standard. VimL is the standard of
course, but it would be great if we could force a better language as
well. (By "better" I of course mean _bigger_ right? ;-)
nikolai

Nikolai Weibull · Oct 9, 2003

* Seth Kurtzberg said:
Well, for one thing, doing it in lisp would be reinventing the wheel, as
emacs has been around for ages.

Yes, this is true, and one of the strongest points for not going down
that road.

One main advantage that I see to ruby is that one could modify existing
code and add new code much more easily if the modifier doesn't have a
lisp background. lisp code is horribly unreadable.

This is, however, only an opinion, not a truth. Lisp code may be
unreadable to you, and many who have never gotten used to it. But
there's no point in saying that it would be easier to modify others'
code, simply because it wasn't written in Lisp. With Lisp you get 40
years of coding standards build-up. I bet that people able to speak
Lisp without starting to lisp (hahaha, so funny) are very able to modify
other people's code. Probably even more so than for many other
languages, as Lisp is, after all, very clean and simple in many
respects. I'm guessing that everyone hates Lisp because of Common Lisp.
Common Lisp is large and hard to grasp, whereas there are many
alternatives that are small and simple to comprehend.

If you do want to use a functional language Haskell is a much better
choice, and would in fact be interesting in this context.

Yes, in fact Haskell would be great as I am studying at Gothenburg
University, which is one of the two or three major hubs of Haskell
research and development. I do, however, feel that Haskell is too
complex and, as you should also feel (as a Ruby user), too restrictive.
Typing is one of its strong selling points, and I'm not a fan of typing.
nikolai

Nikolai Weibull · Oct 9, 2003

* Robert Klemme said:
Since others have commented on various aspects of your posting I'll only
throw my 2c at this:
Why do you want to change it? I find it quite flexible and expressive
(especially when using flag "x"). And especially if you want to attract
rubyists to use TREC (The Ruby Editor to Come), then you should ensure
that their knowledge of ruby regexps is not obsoleted, since rx's will
likely play an important role in user defined extensions!

Hehe. Yeah, that is true. So we'd need two separate constructs.

You can always define methods in Kernel like this:

irb(main):002:0* module Kernel
irb(main):003:1> def rx(str)
irb(main):004:2> puts "building rx from '#{str}'"
irb(main):005:2> end
irb(main):006:1> end
=> nil
irb(main):007:0>
irb(main):008:0* rx %q{(foo)+ \s+ (\w+)}
building rx from '(foo)+ \s+ (\w+)'
=> nil
irb(main):009:0> rx '(foo)+ \s+ (\w+)'
building rx from '(foo)+ \s+ (\w+)'
=> nil
irb(main):010:0>

Which isn't too bad IMHO.

No, it isn't actually. Thank you for pointing this out. This could
actually be the answer to my question. I'll have to think about this.
I had totally forgotten that we had very restrictive single-quoting
contexts (and even the %q{}). Thanks,
nikolai

P.S.
I realize now that that last part may seem a bit sarcastic, but it
isn't, I promise.
D.S.
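
The `rx` helper in the irb session above only prints its argument. A minimal sketch of one that returns a usable Regexp might look like the following. Passing `Regexp::EXTENDED` (the "x" flag Robert mentions) is this sketch's assumption, not something the thread settled on; it lets whitespace in the single-quoted pattern serve as layout.

```ruby
# Hypothetical helper: build a Regexp from a single-quoted pattern string.
# Regexp::EXTENDED is the /x flag, so whitespace in the pattern is ignored.
module Kernel
  def rx(str)
    Regexp.new(str, Regexp::EXTENDED)
  end
end

pattern = rx %q{(foo)+ \s+ (\w+)}
match = pattern.match("foofoo   bar")
puts match[2] if match
```

Because %q{} and single quotes leave backslashes alone, the pattern text reaches `Regexp.new` exactly as typed.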

Simon Strandgaard · Oct 9, 2003

Yes, I have seen it, and tried it (see below).

Great. Thanks for trying it

[snip]

where is the homepage ? ;-)

eh, it was included in the end of the mail (marked [3]) ;-)
http://www.pcppopper.org/code/win/sled/

http://sourceforge.net/projects/slackedit

This is very old. Do not use ;-)

Ok.. I have just downloaded the source, but cannot extract it:

server> unzip slackedit-src.zip
Archive: slackedit-src.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of slackedit-src.zip or
slackedit-src.zip.zip, and cannot find slackedit-src.zip.ZIP, period.
server> md5 slackedit-src.zip
MD5 (slackedit-src.zip) = 45346944242e2b3edb418f6f9d76d776
server> ll
total 172
-rw------- 1 neoneye neoneye 175584 Oct 9 21:51 slackedit-src.zip
server>

No, the plan was to port it to Gtk/GLib, but that never happened.

Ok

Yeah, it is very nice and simply done. I like your code. I see you've
read your GOF well ;-) (especially the first chapter I presume ;-). I
do, however, feel that you've gone into it a bit too much. I mean, it's
fine to represent lines as objects, but individual characters? It gets a
bit big at that point ;-)

The first and second incarnations of AEditor were coded in C++, and had a
memory-frugal data structure. I was too focused on using as
little memory as possible while still having an efficient program.
I have spent too many years on this without actually producing anything.
Now I follow the saying "make it work, make it right, make it fast".
I am in the 'make it work/right' phase at the moment.
Speed and memory consumption don't matter yet.

Actually I got the GoF book this summer.. I don't know how I could live
without it. I am not inspired by its suggested design.

Well, if you take a look at the undo/redo code for Wily and Sam, they
are actually not very large. Using the Command pattern is very clean
and OO though.

I have just read the fine documentation for Sam. It covers many
interesting issues. The protocol is fascinating, but I don't like its use
of the Command pattern for undo/redo. I have experienced the pitfall
of using the Command pattern this wrong way myself. The Memento pattern
should be used instead, though in some cases a reverse operation can be
sufficient.
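
To make the Memento idea concrete, here is a minimal sketch of snapshot-based undo. The class and method names are hypothetical and are not taken from AEditor, Sam, or Wily; it saves the whole text before each change, which is simple but costly for large files.

```ruby
# Hypothetical sketch: undo via saved snapshots (Memento) rather than
# inverse commands. Positions are assumed to lie within the text.
class SnapshotBuffer
  attr_reader :text

  def initialize(text = "")
    @text = text
    @undo_stack = []
  end

  def insert(pos, str)
    remember
    @text = @text[0, pos] + str + @text[pos..-1].to_s
  end

  def delete(pos, len)
    remember
    @text = @text[0, pos] + @text[(pos + len)..-1].to_s
  end

  # Restore the most recent snapshot; does nothing if there is none.
  def undo
    @text = @undo_stack.pop unless @undo_stack.empty?
    self
  end

  private

  # Save a copy of the current state before each change.
  def remember
    @undo_stack.push(@text.dup)
  end
end

buf = SnapshotBuffer.new("hello world")
buf.delete(5, 6)
buf.insert(5, "!")
buf.undo.undo
puts buf.text
```

The editing methods never need to know how to reverse themselves, which is the attraction over inverse commands; a real editor would probably snapshot only the changed region.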

Intuitive in general, yes. But is it intuitive in this context? Are
there any obvious gains from using it? Can you foresee syntax
definitions and indentation definitions being easier to write in Ruby
than in LISP? Why? How?

I know very little about emacs.. I have no clue about LISP vs Ruby.
I want the user's AEditor dotfile to be Ruby code, something like:

server> cat .aeditor
language("ruby") {
  tabsize=2
  fold_tags=["#[", "#]"]
  help = HelpRI.new  # using 'ri' for context help
  key(F1){|context| help.lookup(context.current_word) }
  key(F9){|context| system("ruby -d #{context.filename}") }
}
language("cplusplus") {
  tabsize=4
  fold_tags=["//[", "//]"]
  key(F9){|context| system("make"); system("a.out") }
}
server>
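
A rough sketch of how a dotfile like this could be evaluated, assuming a hypothetical `LanguageConfig` class and a top-level `language` method (neither is AEditor's actual API). One wrinkle: inside a Ruby block, `tabsize=2` assigns a local variable instead of calling a setter, so the sketch uses plain method calls such as `tabsize 2`.

```ruby
# Hypothetical sketch of a dotfile DSL; none of these names are AEditor's API.
class LanguageConfig
  attr_reader :name, :keys

  def initialize(name)
    @name = name
    @settings = {}
    @keys = {}
  end

  # Called as `tabsize 2`; with no argument it reads the value back.
  def tabsize(value = nil)
    value.nil? ? @settings[:tabsize] : (@settings[:tabsize] = value)
  end

  def fold_tags(open = nil, close = nil)
    open.nil? ? @settings[:fold_tags] : (@settings[:fold_tags] = [open, close])
  end

  # Remember a block to run when the named key is pressed.
  def key(name, &action)
    @keys[name] = action
  end
end

LANGUAGES = {}

def language(name, &block)
  config = LanguageConfig.new(name)
  config.instance_eval(&block)   # run the block with config as self
  LANGUAGES[name] = config
end

language("ruby") {
  tabsize 2
  fold_tags "#[", "#]"
  key(:F9) { |context| "ruby -d #{context}" }
}

puts LANGUAGES["ruby"].tabsize
puts LANGUAGES["ruby"].keys[:F9].call("demo.rb")
```

Here the F9 action just returns the command string so the sketch stays side-effect free; a real binding would hand it to `system`.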

[snip LISP]

I guess that, in the end, I'm just pushing Lisp here because I am
interested in it and want to give it a try. Ruby is great, but I
already know Ruby ;-). I'm trying to get someone to push me back into
the Ruby camp :-D.

I prefer StandardML over LISP. Functional languages have other kinds of
constructions which have their advantages. StandardML has
pattern-matching; for instance, 'gcd' looks like this:

fun gcd(0,n) = n
| gcd(m,n) = gcd(n mod m,m);

First line matches on a first argument which is zero.
Second line matches otherwise.

I have implemented a few things in SML: raytracer, www-search-engine,
compiler. I like Ruby over SML (but SML is nice).

Yeah, but I want the language that makes it simpler to move as much out
to the extension language as possible.

I don't think the choice of an extension language has any impact on how
many lines of code end up in the core.

Yeah, I tried opening this rather large text file (In The Beginning Was
The Command Line) and it started thrashing my hard-drive just loading
it (a file of perhaps 200K?). Since it required 130MB to load, it started
swapping.

Yes, it consumes much memory. AEditor works acceptably with relatively
small files. I am in the process of designing a new better/flexible/smarter/
faster/... datastructure.

It's a general editor, of course aimed at programming. Proportional
text is for wimps ;-)

Agree.. Same goals here.

What features do you plan: word-wrap, folding, refactoring ?

Ryan Pavlik · Oct 10, 2003

Yes, that, as I see it, is Emacs' biggest win. The big problem I see
with Vim is that it hasn't undergone a major overhaul. It is basically
Emacs written in C now.

Not really. It's vi with lots of hacks to do more stuff added. So
much for being a tiny, bloat-free editor. ;-)

Vim is, in my opinion, probably the best editor that exists right
now. It is, however, going to reach a point where adding new
features will demand some sort of rewrite. With Emacs, things like
these are easy to alter. Anyway, Vim is extensible, it is, however,
not _changeable_. By that I mean that, if you want to change the way
folding works, you must rewrite the core of Vim.

Actually that'd be the other way around. vim is changeable... you can
alter the C core... but it's not extensible: you can't really add
extensions. _This_ is what causes feature creep and bloat. Not
having 20 million extensions that do everything imaginable. Being
able to add extensions means you're able to _remove_ them. Rolling
stuff into the core means you've got a bloated editor.

Somewhat ironic.

Now, I don't want to start an editor war here. This could very easily
turn into one, and that would totally defeat the point. Therefore,
let's look at emacs and vim in terms of the interface concepts they
embody, such as interactivity, modality, extensibility, etc, rather
than implementation faults. Implementation is irrelevant since you're
implementing something new.

So basically, whether or not you like a modal editor, go for the form of
editor of your choice. I've seriously considered switching to vim simply
because of the ruby support. (A few things have kept me from doing
this, but they were merely technical issues.)

And, as Bram pointed out in some interview, that means altering
basically every file in the Vim distribution. To Vim's salvation
comes the ability to easily define new syntax definitions and
indentation definitions, which one has to agree, are a lot easier to
create and alter than with Emacs (Emacs being perhaps more powerful
though).

Not really, the emacs syntax bits are pretty trivial to do. They look
worse than they are; I was hacking around on the ruby-mode, and it was
much easier to modify than it appears.

Now, this isn't really emacs vs vim; it's just a matter of philosophy.
You could trivially add code in emacs to do vim-ish syntax
descriptions, and I'm sure you could add code in vim to do more
scripted syntax.

The point to note is that it'd be a lot less work to convert
simplified syntax into complex syntax than it is to add scripted
syntax later.

[stuff about Ruby libraries and such]

Right. They're there, people can write extensions that interface to
the web, or whatever.

Yes, but I don't want an Operating System. I guess, to a certain degree
the library you get helps you, but it can also detract you from the
central topic, namely editing text.

I know it's a typical joke to say "emacs isn't an editor, it's an OS",
but remember---it's a joke. Don't build your editor based on jokes.

In truth the emacs core is very tight: it provides a buffer-oriented
language and interface elements. Because it's general, people have
written lots of stuff, some of which is quite silly (tetris, web
browser, etc.), but is also a testament to flexibility. You can be
sure that if you can make it play tetris, you can also make it edit
text in any conceivable way.

Of course this isn't limited to emacs-style editing. Plenty of
editors are scriptable; you could implement scripting capabilities and
provide any interface you want. Same deal. Flexibility, however, is
something worth considering.

Yeah, OK. I see your point. It is, however, very easy to alter to fit
your own needs. Change some global variables and you can make it work
for almost anything. I can't tell, but I'm guessing your code in Ruby
wouldn't allow for this?

Actually the ruby version and the lisp version are almost identical,
API-wise. I just wrote a function in ruby that took a string and a
regexp, and returned the aligned string. The fact the emacs one goes
through and replaces buffer text is the only real difference.
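
The aligner itself is not shown in the thread, so the following is only a guess at its shape: a function that takes a string and a regexp and returns the aligned string. The name `align` and the padding rule (line up the first match on each line) are assumptions.

```ruby
# Hypothetical aligner: pad each line so the first match of `regexp`
# starts in the same column. Lines without a match are left unchanged.
def align(text, regexp)
  lines = text.split("\n")
  columns = lines.map { |line| line =~ regexp }
  target = columns.compact.max
  return text if target.nil?

  lines.zip(columns).map { |line, col|
    col ? line[0, col].ljust(target) + line[col..-1] : line
  }.join("\n")
end

puts align("a = 1\nlonger = 2\nx", /=/)
```

An editor binding would then only need to fetch the selected text, call `align`, and write the result back into the buffer, which matches the "only real difference" described above.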

I'm not trying to contradict you, only point out that Emacs extensions
are, as opposed to Vim extensions, very flexible and well thought
out. This does, of course, not mean that it can't be done in Ruby.
I just get the feeling that Lisp excels at this.

Well. I definitely agree about the emacs extensions vs vim ones.
This is mostly philosophical IMO, as stated above. What makes
extensions more flexible is not the language, exactly, but the
philosophy you have for extensions. If it's "you can run a script on
the text" (like the ruby vim module), you don't get a lot to work
with. If it's "you can hook into every conceivable part of the
editor", or (like the sawfish window manager) "all higher-level
functionality is implemented in the scripting language", you have
extreme flexibility.

That said, one of the nice things about ruby is that it has taken a
_lot_ of direction from Lisp. There are _many_ little things that
just make it more convenient to use (closures, for instance).
Convenience when writing extensions is greatly appreciated by
users. ;-)

[stuff about functional vs. OO being more well suited for editing
text]

I don't really find that. I don't think functional programming is any
easier for editor-related tasks. I'm not even sure how you would come
up with such an assumption.

My real point was that having OO around doesn't really help either. It
doesn't add anything. Sure, you can make classes like Buffer and Window,
but where is the real gain?

The first gain is elegance. Uniformity. You know where to look when
you want a method to alter Window; you look in the Window class. You
don't look for all the functions that may or may not have "-window" in
them. (This is a major issue I've had with both sawfish and emacs;
the nomenclature is not uniform. This may be considered purely
implementational, but people tend to be lazy, or they've approached
something from an unusual angle, and things break down.)

The second is extensibility. If I want to alter how the buffer is
handled, I have to look for hooks (if they're provided) and make sure
to set all the right global variables (ugh) and change the keys, etc.
With the object-oriented approach, I just subclass Buffer and override
what I need. (Or perhaps I merely make a new interface class.) It's
much more straightforward, and again, more elegant.

I have tried to envision some OO structure for implementing Emacs
like Major/Minor Modes and such, but I haven't been able to come to
any satisfactory results. I mean, how is a Major Mode an object?
Really?

This is because you're trying to do a direct transliteration of emacs
to OOP. This doesn't work. Step back and see first what you're
trying to accomplish, and then design an object structure that handles
it, while sticking to the _philosophy_ of emacs ("script everything").

Just off the top of my head, you don't want modes, first off. They're
too constraining. The major problem is that you can't have multiple
"major" modes without a lot of hackery (if at all). What you ask
yourself is: what do modes provide? The answer: buffer behavior.

What might be better: behavior objects. You have a class hierarchy to
modify various aspects, and you put these together to build your
favorite buffer behavior. This way you could have multiple syntaxes
per buffer, for instance.

Configuration would simply be changing the object properties. For
instance, if you get a hilighter object as 'syntax', you might:

:
syntax.comment_color = BLUE;
syntax.variable_color = RED;
:

The 'syntax' object could be a subclass of a more general Formatter,
and you may even have a subclass per language, where you have the
ability to implement specifics if necessary. A basic SyntaxHilighter
could provide simple regexp hilights that everyone would use anyway.
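
One way the Formatter / SyntaxHilighter hierarchy could look, as a sketch only. The class names follow the paragraph above, but every method name (`spans`, `rule`) and the span format are assumptions made for illustration.

```ruby
# Hypothetical class layout following the thread; not an existing API.
class Formatter
  # Return a list of [start, length, style] spans for a line of text.
  def spans(line)
    []
  end
end

class SyntaxHilighter < Formatter
  attr_accessor :comment_color, :variable_color

  def initialize
    @comment_color = :blue
    @variable_color = :red
    @rules = []
  end

  # Register a regexp and a block that picks the color at format time.
  def rule(regexp, &color)
    @rules << [regexp, color]
  end

  # First match of each rule on the line, in rule order.
  def spans(line)
    @rules.map { |regexp, color|
      m = regexp.match(line)
      m ? [m.begin(0), m[0].length, color.call] : nil
    }.compact
  end
end

# A per-language subclass only has to declare its rules.
class RubyHilighter < SyntaxHilighter
  def initialize
    super
    rule(/#.*/)  { comment_color }
    rule(/@\w+/) { variable_color }
  end
end

syntax = RubyHilighter.new
syntax.comment_color = :green
p syntax.spans("@count = 1 # total")
```

Since each rule looks its color up when `spans` runs, changing a property such as `comment_color` takes effect without rebuilding the rules.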

I guess it has a syntax definition, a separate keybinding mapping,
an indentation callback, maybe something else? I just don't feel it
adds anything though. I am, of course open to suggestions ;-).

The main thing I'd suggest is to look at what things _mean_, and what
they _could_ be... don't necessarily mimic the way things are
implemented right now.

Yeah, this is true. Ruby would be well suited for this I do
believe. But note that Emacs C core isn't very small ;-)

The emacs C core includes elisp, which isn't exactly fair; count all
the lines of vim and perl, and you won't find it too small either. ;-)

Yeah, this would be easy to do as well. There is, of course, the
inherent risk of not being portable enough. Vim supports this in a way,
and I have never seen it used to date.

True, but ruby extensions tend to be fairly portable. Don't worry
about enforcing this. If people want it, they'll port it. It'd
likely be a rare occurrence if it happens at all. (Maybe some really fast
text munger; I dunno.)

[stuff about regular expressions]

Well, to be blunt, whatever you come up with won't be as popular or
useful as the existing regular expressions, just because they'll be a
nonstandard replacement of something already very common. PCRE
regexps are extremely flexible and well-known.

As useful? Please, my dear sir, there has to be something better than
the way we describe regular expressions now.

Better? Perhaps. I would first want to know what is wrong with
existing regular expressions. (Now, I dislike the overuse of regexps,
and tend to avoid them in my code, but when you're writing editor code
or parsers it's a different story.) Consider:

1. Do you feel they could be improved in form alone, or
functionality?

2. What form should they take, if not the current form?

3. What function should they accomplish, that they do not
currently accomplish?

4. Is the action you're trying to take actually something you
should be using regexps for, or is there a better way?

At least for searching text. The syntax we have today for regular
expressions is basically the same, only extended, as that that Ken
Thompson uses in his 1968 paper on it. Or that of _real_ regular
expressions long before it. And remember, real regular expressions
only have * (Kleene star) and no +.

Just because something is old doesn't mean it's bad. Text is text.
That hasn't changed since 1968 or 1908 or 1809 or long before.

Only the content.

The method of handling it has gotten more complex, because we've gone
from purely academic uses to actual everyday useful uses.

There has to be a simpler syntax that can be useful for
interactive text search-and-replaces.

There might be. There are simpler syntaxes, but they usually trade
off on functionality. There are GUI regexp builders.

I'm definitely a fan of throwing things out the window when they need
to be. But the first thing to determine is what you're trying to
accomplish, and how throwing things away lets you accomplish it.

Look at Vim, Emacs, and Perl (and thus, basically, Ruby)'s syntax.
They are all extensions of this, adding new short cryptic ways of
saying things that you often don't need, and if you did you wouldn't
want to do it that way anyway. The real example of how it has
gotten out of hand is the overuse of backslash (\). It is
everywhere. <snip more obscure things>

There are really two things at play here. The first is commonality.
Everyone uses backslashes for escaping, from C to bash. You need to
escape things, so backslash is as good as any, and everyone is
familiar with it. This is good.

Obscure feature creep is less good. I am not a fan of Perl's "a
kitchen sink for every occasion" philosophy (don't get me started
here), but if you start getting tons of special cases, maybe your
instincts are right: maybe we need something new. But maybe it's not
regular expressions, either.

Nah OK. You've got a point. But, as with most free software, this
one's for me ;-). If anyone wants to tag along later on, fine. But I
won't care if no one is interested, Emacs and Vim are fine editors.
Even notepad has its uses. It can, for example, tell you if a file is
smaller or greater than 65535 bytes very easily ;-).

Actually this is good, I think. If you're writing an editor to be
popular, it probably won't. If you're writing it to be useful to you,
it'll probably be popular with others who have similar needs.

I have, perhaps, failed to describe the real win here. (Alas, I
realize I forgot to mention it.) As you perhaps know, Vim, and most
other UNIX software, operate on a line-by-line basis. This restriction
would not impede the command language I'm contemplating. If you take a
look at the Sam editor[2], this is its main selling point, and this is
another one I want to include.

Not sure what you're getting at here, but it seems interesting...

Nono, they don't do string escapes. \n in a regex (//) means match a
newline, not substitute this for 0x0a. So, you don't have to quote it
with an extra backslash to get that meaning.

<snip>

Hmm, I think I see what you mean. You could just use '' though, which
does it for you:

irb> s = '\n'
=> "\\n"

and to match a backslash itself:
"\\\\"
which is horrendous. In a Ruby regex, /\\/ suffices.

Yes, this is a problem I've had with elisp regexps. Too many \\'s.
I've seen lines like \\\\\\\\\\\\\\\\\\\\ before. Really painful to
read.
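
A small illustration of the escaping difference discussed above, using only core Ruby. A single-quoted string keeps the backslash, a double-quoted one interprets it, and a pattern built from a string needs twice the backslashes of a regex literal.

```ruby
single = '\n'   # two characters: a backslash and the letter n
double = "\n"   # one character: a newline

puts single.length
puts double.length

# Matching one literal backslash:
from_string = Regexp.new("\\\\")   # the string itself holds two backslashes
literal     = /\\/                 # the same pattern as a regex literal
puts from_string.source == literal.source
puts 'C:\temp' =~ literal
```

The string form has to survive two rounds of unescaping (string, then regexp), which is where the runs of backslashes in elisp patterns come from.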

ttyl,

Robert Klemme · Oct 10, 2003

Hehe. Yeah, that is true. So we'd need two separate constructs.

Yes, that might be better.

No, it isn't actually. Thank you for pointing this out. This could
actually be the answer to my question. I'll have to think about this.
I had totally forgotten that we had very restrictive single-quoting
contexts (and even the %q{}). Thanks,
nikolai

You're welcome.

P.S.
I realize now that that last part may seem a bit sarcastic, but it
isn't, I promise.
D.S.

No offense taken.

Kind regards

robert

Robert Klemme · Oct 10, 2003

Nikolai Weibull said:
To Vim's salvation comes the ability to easily
define new syntax definitions and indentation definitions, which one has
to agree, are a lot easier to create and alter than with Emacs (Emacs
being perhaps more powerful though).

Which is a usual tradeoff: more flexibility typically introduces greater
complexity.

My real point was that having OO around doesn't really help either. It
doesn't add anything. Sure, you can make classes like Buffer and Window,
but where is the real gain? I have tried to envision some OO structure for
implementing Emacs like Major/Minor Modes and such, but I haven't been
able to come to any satisfactory results. I mean, how is a Major Mode
an object? Really? I guess it has a syntax definition, a separate
keybinding mapping, an indentation callback, maybe something else? I
just don't feel it adds anything though. I am, of course open to
suggestions ;-).

Well, the real power comes into play if you find a *reasonable* definition
of an abstract major mode class whose interface provides everything you
need from a major mode. Then you can introduce a clean separation of
concerns between the modes and the framework, i.e. the framework can deal
with modes in a uniform way and the modes encapsulate those things that
are special for them. These could be, among others, indentation algorithms
(IMHO classes in their own right), coloring schemes (maybe as well a
whole set of classes), tab conversion rules (think of makefiles which
don't like their tabs converted), mode documentation (short and long),
mode keybindings... You can have a mode instance per file storing current
file related status. All these seem to me to obviously point to modes as
a class hierarchy (possibly with related class hierarchies).
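
A sketch of what such an abstract mode class might look like. All names here (`Mode`, `indent_for`, `expand_tabs?`, `mode_for`) are hypothetical, and the indentation rule is deliberately naive; the point is only that the framework talks to one interface while each mode encapsulates its own specifics.

```ruby
# Hypothetical interface; method names are illustrative only.
class Mode
  attr_reader :filename

  def initialize(filename)
    @filename = filename   # one mode instance per file
  end

  def name;         self.class.name; end
  def expand_tabs?; true;            end
  def keybindings;  {};              end

  # Indentation (in columns) for a new line, given the previous line.
  def indent_for(previous_line)
    previous_line[/\A */].length
  end
end

class MakefileMode < Mode
  # Makefiles need literal tabs, so never convert them.
  def expand_tabs?; false; end
end

class RubyMode < Mode
  def indent_for(previous_line)
    base = super
    previous_line =~ /\b(def|class|module|do)\b/ ? base + 2 : base
  end
end

# The framework only talks to the Mode interface.
def mode_for(filename)
  case File.basename(filename)
  when "Makefile" then MakefileMode.new(filename)
  when /\.rb\z/   then RubyMode.new(filename)
  else Mode.new(filename)
  end
end

mode = mode_for("lib/editor.rb")
puts mode.indent_for("  def save")
puts mode_for("Makefile").expand_tabs?
```

Indentation and coloring could later be split out into their own class hierarchies, as suggested above, with the mode simply holding references to them.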

As useful? Please, my dear sir, there has to be something better than
the way we describe regular expressions now.

Well, there might be. But consider that so many people are used to defining
regular expressions the way we do it today (apart from those that have to
use Word regular expressions) that any other format could indeed be
worse for the sole reason of habituation.

having to move my hand to the upper right corner of my
keyboard all the time is a real pain.

? I have a key named "Esc" at that place. What kind of keyboard do you
have? :-))

Even notepad has its uses. It can, for example, tell you if a file is
smaller or greater than 65535 bytes very easily ;-).

:-))

Anyway, I wish you good luck for this project. To me it sounds
promising.

Regards

robert

Frank Schmitt · Oct 10, 2003

Simon Strandgaard said:
This method requires typecasting if you want to do something like this:

Derived1 *a = new Derived1();
Derived1 *b = dynamic_cast<Derived1*>(a->clone());

Covariant return types are unfortunately a relatively new thing in the C++
world.

class Derived1: public Base {
public:
    // Covariant return type: overrides Base::clone() but returns Derived1*.
    virtual Derived1* clone() { return new Derived1(*this); }
};

Derived1 *a = new Derived1();
Derived1 *b = a->clone();

Arg. Yes, of course I meant clone() to have covariant return types.

this

Compiling is expensive on my Pentium350 which is the fastest machine I have.
GCC2.96 was quite fast... I have not yet seen any fast result with GCC3.

Ah.. yes. GCC3 is definitely quite slow - but the Intel compiler is even
slower :-(

I like templates, I like many things in C++. If I had a faster machine, then
there would be no problems

At work I use a 2.0 GHz P4, and even on that machine compile times are
annoying.

Thanks for your insights
frank

Linus Sellberg · Oct 10, 2003

Actually I got the GoF book this summer.. I don't know how I could live
without it. I am not inspired by its suggested design.

What book is this?

Mike Stok · Oct 10, 2003

What book is this?

The "Gang of Four"'s Design Patterns book. See
http://www.c2.com/cgi/wiki?GangOfFour or
http://www.awl.com/cseng/titles/0-201-63361-2

Hope this helps,

Mike

Simon Strandgaard · Oct 10, 2003

What book is this?

GoF == Gang of Four == 4 authors with lots of success

title: Design Patterns
subtitle: Elements of Reusable Object-Oriented Software
authors: Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides.

covers 23 fundamental design patterns.

Great book, but expensive.

Nikolai Weibull · Oct 10, 2003

* Ryan Pavlik said:
Not really. It's vi with lots of hacks to do more stuff added. So
much for being a tiny, bloat-free editor. ;-)

It was not to be taken literally ;-). Vim ain't Emacs yet.

Actually that'd be the other way around. vim is changeable... you can
alter the C core... but it's not extensible: you can't really add
extensions. _This_ is what causes feature creep and bloat. Not
having 20 million extensions that do everything imaginable. Being
able to add extensions means you're able to _remove_ them. Rolling
stuff into the core means you've got a bloated editor.

Well, we had already used a different meaning for extensible, so I used
changeable ;-).

Somewhat ironic.

Yes, definitely. That is the problem really. You can't really remove
or add features 'at runtime'. So, instead of a small editor, you get
this quasi-small one, depending on what features you want.

Now, I don't want to start an editor war here. This could very easily
turn into one, and that would totally defeat the point. Therefore,
let's look at emacs and vim in terms of the interface concepts they
embody, such as interactivity, modality, extensibility, etc, rather
than implementation faults. Implementation is irrelevant since you're
implementing something new.

Yes, you're right. I don't want that. Yet, it is interesting to
compare the design decisions that went into the rather different editor
concepts.

So basically, whether or not you like a modal editor, go for the form of
editor of your choice. I've seriously considered switching to vim simply
because of the ruby support. (A few things have kept me from doing
this, but they were merely technical issues.)

You're welcome to come aboard ;-). I'm the maintainer of the Vim indent
script, together with Gavin Sinclair. It is, in my opinion, better than
the one that comes with Emacs, even though I guess Matz wrote that one
;-).

Not really, the emacs syntax bits are pretty trivial to do. They look
worse than they are; I was hacking around on the ruby-mode, and it was
much easier to modify than it appears.

OK. That's good to hear ;-). Sadly, I've never had time to really get
into Emacs Major Mode definitions, since they look rather threatening
;-).

Now, this isn't really emacs vs vim; it's just a matter of philosophy.
You could trivially add code in emacs to do vim-ish syntax
descriptions, and I'm sure you could add code in vim to do more
scripted syntax.

The point to note is that it'd be a lot less work to convert
simplified syntax into complex syntax than it is to add scripted
syntax later.

Yes.

I know it's a typical joke to say "emacs isn't an editor, it's an OS",
but remember---it's a joke. Don't build your editor based on jokes.

Haha. No, of course not ;-). I was merely reusing the joke ;-).

In truth the emacs core is very tight: it provides a buffer-oriented
language and interface elements. Because it's general, people have
written lots of stuff, some of which is quite silly (tetris, web
browser, etc.), but is also a testament to flexibility. You can be
sure that if you can make it play tetris, you can also make it edit
text in any conceivable way.

Hehe, true, true. Yeah, I'm not saying that this isn't good. But these
are perhaps not features you need to include in the main distribution,
cluttering up menus and configuration displays.

Of course this isn't limited to emacs-style editing. Plenty of
editors are scriptable; you could implement scripting capabilities and
provide any interface you want. Same deal. Flexibility, however, is
something worth considering.

Yes, anything should be possible. I'm planning on putting the whole
command language outside the C core and into the scripting language
domain. This seems best. It will be a good test of the API as well,
since you won't be able to do anything if even the most simple editing
commands won't work ;-).

[tab aligning]

Actually the ruby version and the lisp version are almost identical,
API-wise. I just wrote a function in ruby that took a string and a
regexp, and returned the aligned string. The fact the emacs one goes
through and replaces buffer text is the only real difference.

OK. Yeah, I've been rewriting some small Emacs-Lisp snippets in Ruby,
and I must say, Ruby is winning so far.

Well. I definitely agree about the emacs extensions vs vim ones.
This is mostly philosophical IMO, as stated above. What makes
extensions more flexible is not the language, exactly, but the
philosophy you have for extensions. If it's "you can run a script on
the text" (like the ruby vim module), you don't get a lot to work
with. If it's "you can hook into every conceivable part of the
editor", or (like the sawfish window manager) "all higher-level
functionality is implemented in the scripting language", you have
extreme flexibility.

Yes! Precisely what I was trying to say! This is what I want from my
editor. If you only get the "you can run a script on the text"-thing,
as you say, you will only wind up with dirty hacks. If, however, you
get to get into the guts of things, you can make wonderful things. It's
like a doctor operating. It's hard to do without breaking skin. (Well,
OK, there are ways, but still, the analogy works ;-)

That said, one of the nice things about ruby is that it has taken a
_lot_ of direction from Lisp. There are _many_ little things that
just make it more convenient to use (closures, for instance).
Convenience when writing extensions is greatly appreciated by
users. ;-)

Yeah, I've realized this in the last few weeks that I've been looking at
Lisp. But Ruby has a more 'familiar' syntax, and it is probably a big
winning.

[me whining about there not being any reason to use classes/OOP]

The first gain is elegance. Uniformity. You know where to look when
you want a method to alter Window; you look in the Window class. You
don't look for all the functions that may or may not have "-window" in
them. (This is a major issue I've had with both sawfish and emacs;
the nomenclature is not uniform. This may be considered purely
implementational, but people tend to be lazy, or they've approached
something from an unusual angle, and things break down.)

This sounds reasonable.

The second is extensibility. If I want to alter how the buffer is
handled, I have to look for hooks (if they're provided) and make sure
to set all the right global variables (ugh) and change the keys, etc.
With the object-oriented approach, I just subclass Buffer and override
what I need. (Or perhaps I merely make a new interface class.) It's
much more straightforward, and again, more elegant.
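
A minimal sketch of the "subclass Buffer and override" idea, under stated assumptions: the `Buffer` class, its `insert` method, and `TabExpandingBuffer` are all hypothetical names invented for illustration, not an API from any existing editor.

```ruby
# Hypothetical core class; a real editor would expose something richer.
class Buffer
  attr_reader :text

  def initialize(text = "")
    @text = text.dup
  end

  # Insert `string` at character offset `pos`.
  def insert(pos, string)
    @text.insert(pos, string)
    self
  end
end

# An extension changes behaviour by overriding a method instead of
# hunting for hooks or setting global variables.
class TabExpandingBuffer < Buffer
  TAB_WIDTH = 8 # assumed default, purely illustrative

  def insert(pos, string)
    super(pos, string.gsub("\t", " " * TAB_WIDTH))
  end
end

buf = TabExpandingBuffer.new("end")
buf.insert(0, "\tputs 1\n")
puts buf.text
```

Everything that already calls `insert` picks up the new behaviour without knowing the subclass exists, which is the gain being argued for here.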

OK. This, I suppose, is a question of paradigm. I do, however, prefer
the way you describe in many ways.

This is because you're trying to do a direct transliteration of emacs
to OOP. This doesn't work. Step back and see first what you're
trying to accomplish, and then design an object structure that handles
it, while sticking to the _philosophy_ of emacs ("script everything").

Hm, maybe you're right. I said this after reviewing the Tcl/Python
plugins for Vim and came to the conclusion that there weren't many
winnings. But perhaps they've been too influenced by the thinking you
describe. I'll keep this in mind.

Just off the top of my head, you don't want modes, first off. They're
too constraining. The major problem is that you can't have multiple
"major" modes without a lot of hackery (if at all). What you ask
yourself is: what do modes provide? The answer: buffer behavior.

What might be better: behavior objects. You have a class hierarchy to
modify various aspects, and you put these together to build your
favorite buffer behavior. This way you could have multiple syntaxes
per buffer, for instance.

Configuration would simply be changing the object properties. For
instance, if you get a hilighter object as 'syntax', you might:

:
syntax.comment_color = BLUE;
syntax.variable_color = RED;
:

The 'syntax' object could be a subclass of a more general Formatter,
and you may even have a subclass per language, where you have the
ability to implement specifics if necessary. A basic SyntaxHilighter
could provide simple regexp hilights that everyone would use anyway.
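
To make the hierarchy above concrete, here is a minimal sketch. The class names `Formatter` and `SyntaxHilighter` come from the post; `RubyHilighter`, the `rules` and `spans` methods, and the use of symbols for colors are assumptions made up for this example.

```ruby
# General base class: a formatter turns text into colored spans.
class Formatter
  # Returns an array of [start_offset, length, color] triples.
  def spans(text)
    []
  end
end

# Basic regexp-driven hilighter that language-specific classes build on.
class SyntaxHilighter < Formatter
  attr_accessor :comment_color, :variable_color

  def initialize
    @comment_color = :blue
    @variable_color = :red
  end

  # Subclasses supply [regexp, color] pairs (hypothetical protocol).
  def rules
    []
  end

  def spans(text)
    found = []
    rules.each do |pattern, color|
      text.scan(pattern) do
        m = Regexp.last_match
        found << [m.begin(0), m[0].length, color]
      end
    end
    found.sort_by { |span| span[0] }
  end
end

# One subclass per language, overriding only the specifics.
class RubyHilighter < SyntaxHilighter
  def rules
    [[/#.*$/, comment_color], [/@\w+/, variable_color]]
  end
end

syntax = RubyHilighter.new
syntax.comment_color = :green # configuration is just setting a property
p syntax.spans("@x = 1 # set x")
```

A buffer could then hold several such objects at once, which is one way to get the "multiple syntaxes per buffer" mentioned earlier.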

Hm, yeah. This sounds reasonable.

The main thing I'd suggest is to look at what things _mean_, and what
they _could_ be... don't necessarily mimic the way things are
implemented right now.

The emacs C core includes elisp, which isn't exactly fair; count all
the lines of vim and perl, and you won't find it too small either. ;-)

Hehe, no. I know, that was, in a way, my point too.

Yeah, this would be easy to do as well. There is, of course, the
inherent risk of not being portable enough. Vim supports this in a way,
and I have never seen it used to date.

True, but ruby extensions tend to be fairly portable. Don't worry
about enforcing this. If people want it, they'll port it. It'd
likely be a rare occurrence if it happens at all. (Maybe some really fast
text munger; I dunno.)

Agreed.

[regexen]

Better? Perhaps. I would first want to know what is wrong with
existing regular expressions. (Now, I dislike the overuse of regexps,
and tend to avoid them in my code, but when you're writing editor code
or parsers it's a different story.) Consider:

1. Do you feel they could be improved in form alone, or
functionality?

Yes. I do feel the form could be greatly improved. The functionality
doesn't really need any modification though. Maybe take a few steps
back and review what we have now and what we really need to still be
able to do good and wonderful things with them.

2. What form should they take, if not the current form?

Hehe, this is, in a sense, the topic of my master's thesis, and I do not
have an answer yet. Maybe there isn't one; we'll see ;-).

3. What function should they accomplish, that they do not
currently accomplish?

It's not about the function, it's about the use. Use should be simpler.
Function will follow. I mean, search[-and-replace] is the main use of
course, and this should be made simpler. There is also a need for many
other things, such as syntax highlighting and such, but we'll see what
this requires.

4. Is the action you're trying to take actually something you
should be using regexps for, or is there a better way?

Well, search[-and-replace] demand them. Maybe syntax-highlighting and
indenting and such can be made different perhaps, but it all comes down
to parsing of some sort.

Just because something is old doesn't mean it's bad. Text is text.
That hasn't changed since 1968 or 1908 or 1809 or long before.
Only the content.

No, but to extend it as if it were still 1968 is bad.

There might be. There are simpler syntaxes, but they usually trade
off on functionality. There are GUI regexp builders.

Not necessarily, you simply make things that are unusual 'harder' to
express and more usual things easier.

I'm definitely a fan of throwing things out the window when they need
to be. But the first thing to determine is what you're trying to
accomplish, and how throwing things away lets you accomplish it.

Yes, agreed.

There are really two things at play here. The first is commonality.
Everyone uses backslashes for escaping, from C to bash. You need to
escape things, so backslash is as good as any, and everyone is
familiar with it. This is good.

Yes, but it is also bad that everyone uses it. Then you wind up with
the Emacs "\\\\" pattern for matching a backslash. They are used for
too many things and in too many places.

Obscure feature creep is less good. I am not a fan of Perl's "a
kitchen sink for every occasion" philosophy (don't get me started
here), but if you start getting tons of special cases, maybe your
instincts are right: maybe we need something new. But maybe it's not
regular expressions, either.

True, this will need investigation.

Actually this is good, I think. If you're writing an editor to be
popular, it probably won't. If you're writing it to be useful to you,
it'll probably be popular with others who have similar needs.

Yes, precisely. Let's hope so ;-).

I have, perhaps, failed to describe the real winning here. (Alas, I
realize I forgot to mention it.) As you perhaps know, Vim, and most
other UNIX software, operate on a line-by-line basis. This restriction
would not impede the command language I'm contemplating. If you take a
look at the Sam editor[2], this is its main selling point, and this is
another one I want to include.

Not sure what you're getting at here, but it seems interesting...

That regular expressions and the commands that utilize them often think
in 'everything-is-a-line' terms. If we get away from that, maybe we can
get commands that use regular expressions in a totally different way
from today.

[stuff about \n being a nuisance]

Hmm, I think I see what you mean. You could just use '' though, which
does it for you:

irb> s = '\n'
=> "\\n"

Yeah, someone else also pointed this out. I had totally forgotten about
it. Silly me ;-)
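
A small sketch of what single quotes do and do not buy you in Ruby. The variable names are illustrative only. Note that `\\` is still an escape inside single quotes, so a literal backslash in a string-built pattern needs the same doubling either way; the regexp literal form is the one that saves a level.

```ruby
double = "\n" # one character: a newline
single = '\n' # two characters: a backslash followed by "n"

puts double.length
puts single.length

# Three ways to build a pattern that matches one literal backslash.
from_literal = /\\/                  # one level of escaping
from_string  = Regexp.new("\\\\")    # string escaping plus regexp escaping
from_single  = Regexp.new('\\\\')    # single quotes still collapse \\ to \

# All three compile to the same pattern source.
puts from_literal.source == from_string.source
puts from_literal.source == from_single.source

# 'a\b' is the three characters a, backslash, b.
puts 'a\b' =~ from_literal
```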

Yes, this is a problem I've had with elisp regexps. Too many \\'s.
I've seen lines like \\\\\\\\\\\\\\\\\\\\ before. Really painful to
read.

Hehe, yes, precisely.

ttyl,

I look forward to your response ;-), you have been very helpful so far,
nikolai

P.S.
I'm feeling more and more inclined to use Ruby over a Lisp dialect. You
have all managed to sway me in the right direction it seems ;-).
Thanks.
D.S.

Charles Hixson · Oct 10, 2003

Nikolai said:
...

Well, to be blunt, whatever you come up with won't be as popular or
useful as the existing regular expressions, just because they'll be a
nonstandard replacement of something already very common. PCRE
regexps are extremely flexible and well-known.

As useful? Please, my dear sir, there has to be something better than
the way we describe regular expressions now. At least for searching
text. The syntax we have today for regular expressions is basically the
same, only extended, as that that Ken Thompson uses in his 1968 paper on
it. Or that of _real_ regular expressions long before it. And
remember, real regular expressions only have * (Kleene star) and no +.
There has to be a simpler syntax that can be useful for interactive text
search-and-replaces. Look at Vim, Emacs, and Perl (and thus,
basically, Ruby)'s syntax. They are all extensions of this, adding new
short cryptic ways of saying things that you often don't need, and if
you did you wouldn't want to do it that way anyway. The real example of
how it has gotten out of hand is the overuse of backslash (\). It is
everywhere. Having to move my hand to the upper right corner of my
keyboard all the time is a real pain.
Of course we'll have to see if I'm actually able to come up with
anything better. It's probably not going to be as easy as I'd like to
suggest here. However, look at the Perl 6 Apocalypse 5[1] to see one way
of moving away from cryptic (?:...) metasyntaxes.
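
As a small illustration of the point about the Kleene star: `+` adds no matching power, since `a+` can always be written as `aa*`. The sketch below only checks that equivalence on a few sample strings using Ruby's regexp engine; it is not a proof.

```ruby
# `a+` is shorthand for `aa*`: both accept exactly the same strings.
plus = /\Aa+\z/
star = /\Aaa*\z/

samples = ["", "a", "aaa", "ab"]
samples.each do |s|
  # Print whether each pattern accepts the whole string.
  puts "#{s.inspect}: plus=#{!!(s =~ plus)} star=#{!!(s =~ star)}"
end
```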

I've seen one (1) way of improving regexp's. It involved using graphic
representations while creating them. After expression creation, it was
rendered into text (and if you knew what you wanted, you could just type
it in). I haven't seen any actual language improvements that weren't in
some way isomorphic. (I.e., you can use pretty graphics for each of the
inserted characters, and you might do something to make typing the
escape character easier, but improving the semantics ... I haven't seen
any better options. And improving the syntax... possibly if you switch
to unicode...but then how do you enter it?

Nah OK. You've got a point. But, as with most free software, this
one's for me ;-). If anyone wants to tag along later on, fine. But I

So you won't be interested in the graphic editor. I've got a vague idea
of how much additional work that would be. It was in a commercial
product on the Mac, but I don't think it's being made any more. (Either
Nisus or Qued/M ... probably Qued/M, but if it's Nisus, I seem to
remember that the feature went away in a later version... too complex
for many of their potential customers, perhaps.)

won't care if no one is interested, Emacs and Vim are fine editors.
Even notepad has its uses. It can, for example, tell you if a file is
smaller or greater than 65535 bytes very easily ;-).
I have, perhaps, failed to describe the real winning here. (Alas, I
realize I forgot to mention it.) As you perhaps know, Vim, and most
other UNIX software, operate on a line-by-line basis. This restriction...

--
::: name: Nikolai Weibull :: aliases: pcp / lone-star / aka :::
::: born: Chicago, IL USA :: loc atm: Gothenburg, Sweden :::
::: page: www.pcppopper.org :: fun atm: gf,lps,ruby,lisp,war3 :::
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}

Not to discourage you, but have you looked at NEdit? It doesn't have a
full scripting language, but it has some nice pattern recognition
mechanisms. And it's GPL. (OTOH, I've never gotten their source to
compile...they don't use a standard make system, but something of their
own creation that seems to me to have problems.)

Brett H. Williams · Oct 10, 2003

You're welcome to come aboard ;-). I'm the maintainer of the Vim indent
script, together with Gavin Sinclair. It is, in my opinion, better than
the one that comes with Emacs, even though I guess Matz wrote that one
;-).

Thanks for maintaining the Ruby vim indent script (Gavin also). I use it
constantly.

One advantage I do see that the vim indent script has over emacs mode is the
matching colorization of def and end, which is a nice feature.

That said, having used both emacs mode and the vim indent script, I must beg to
differ that it is better than emacs mode. There are a couple of annoying
things about the current vim indent script for Ruby.

These things are annoying enough that <gasp> I have a binding that will call
out to Emacs to indent code. I don't use it that often, but I do have it.

This is my biggest annoyance:

foo = some_method(arg1, arg2, arg3, "some long argument that eats space",
"some other stuff", arg, etc)

Emacs mode will indent the "some other stuff" according to the ( on the
previous line).

If I indent it by hand, then the line following also tries to line up with it.
This seems to be common to more than just ruby's indent script, so maybe it's
difficult (perl's does the same). However, the C indent seems to do pretty
well, even if it doesn't line up exactly how I'd like. It's at least indented,
and the next line after the indent snaps back to where I expect:

int main() {
    folifdsij("Helkjfkdlsjfkldjskajfkld",
            "blah");
    bar();
    return 1;
}

This will be the last time I complain about this though, I promise. Maybe I
should just learn vim indents and try and figure it out myself. I'm just
surprised this doesn't bother more people.

Nikolai Weibull · Oct 10, 2003

* Charles Hixson said:
I've seen one (1) way of improving regexp's. It involved using graphic
representations while creating them. After expression creation, it was
rendered into text (and if you knew what you wanted, you could just type
it in). I haven't seen any actual language improvements that weren't in
some way isomorphic. (I.e., you can use pretty graphics for each of the
inserted characters, and you might do something to make typing the
escape character easier, but improving the semantics ... I haven't seen
any better options. And improving the syntax... possibly if you switch
to unicode...but then how do you enter it?

Hm, yeah, this is probably not what I'm looking for. It's a nice idea
perhaps, but it can easily become something for children to play with,
rather than for a programmer to utilize.

So you won't be interested in the graphic editor. I've got a vague idea
of how much additional work that would be. It was in a commercial
product on the Mac, but I don't think it's being made any more. (Either
Nisus or Qued/M ... probably Qued/M, but if it's Nisus, I seem to
remember that the feature went away in a later version... too complex
for many of their potential customers, perhaps.)

Hehe, probably ;-). I wonder how many 'user-friendly' things get
dropped since they prove not to be what they set out to be. Look at
Windows. They're adding and removing things all the time. Does anyone
remember the '<--- Click Here To Start' thing that scrolled across the
taskbar when you started it up? Man, was that stupid or what? You have
a button that says 'Start' and then you get an arrow that tells you that
if you click it, you 'Start'. My mom would have gotten that.

Not to discourage you, but have you looked at NEdit? It doesn't have a
full scripting language, but it has some nice pattern recognition
mechanisms. And it's GPL. (OTOH, I've never gotten their source to
compile...they don't use a standard make system, but something of their
own creation that seems to me to have problems.)

Yes. And let me tell you, it's crap. Sorry, I don't want to heat up
the discussion here, but in my eyes, NEdit just isn't very good. It's
an attempt to bring Windows like editors to Unix. It has a lot of weird
design decisions that I just don't like (and neither do many of the
people I've spoken to about it at the Computer Technics (whatever that's
really called in English) Department at Uni). It's just not powerful
enough. Vim _is_ powerful enough, but it still feels wrong somehow.
Anyway,
nikolai

Nikolai Weibull · Oct 10, 2003

* Brett H. Williams said:
Thanks for maintaining the Ruby vim indent script (Gavin also). I use
it constantly.

You're welcome.

One advantage I do see that the vim indent script has over emacs mode
is the matching colorization of def and end, which is a nice feature.

Ah, this is the syntax definitions' doing. Thank Doug Kearns for his
great efforts at getting it right.

[Vim is worse than Emacs]

These things are annoying enough that <gasp> I have a binding that
will call out to Emacs to indent code. I don't use it that often, but
I do have it.

OMG!

This is my biggest annoyance:

foo = some_method(arg1, arg2, arg3, "some long argument that eats space",
"some other stuff", arg, etc)

Emacs mode will indent the "some other stuff" according to the ( on
the previous line).

OK. You should head on over to
http://rubyforge.org/projects/vim-ruby/
and get the latest version of it. It does all that. Better than Emacs
hopefully. This was the version I was referring to.

If I indent it by hand, then the line following also tries to line up
with it. This seems to be common to more than just ruby's indent
script, so maybe it's difficult (perl's does the same).

Yes, this is true. A lot of the indent scripts for Vim are rather
basic. I hope many of mine (I maintain almost a quarter of them) at
least do something that works most of the time.

However, the C indent seems to do pretty well, even if it doesn't line
up exactly how I'd like. It's at least indented, and the next line
after the indent snaps back to where I expect:

int main() {
    folifdsij("Helkjfkdlsjfkldjskajfkld",
            "blah");
    bar();
    return 1;
}

Hm. You should read up on 'cindent' (:h 'cindent') and 'cinoptions'.
There's a lot you can change with the C indenter. It's built into the
editor, so it works a bit differently than the others as well.

I'm just surprised this doesn't bother more people.

Yes, I often wonder why people don't come with more suggestions. I
guess many are so fascinated that their editor even does anything for
them that they don't see the real possibilities.
nikolai

Brett H. Williams · Oct 10, 2003

* Brett H. Williams said:

Thanks for maintaining the Ruby vim indent script (Gavin also). I use
it constantly.

You're welcome.

One advantage I do see that the vim indent script has over emacs mode
is the matching colorization of def and end, which is a nice feature.

Ah, this is the syntax definitions doing. Thank Doug Kearns for his
great efforts at getting it right.

[Vim is worse than Emacs]

"Vim is worse than Emacs" is a poor paraphrase. But I know what you
mean.

OK. You should head on over to
http://rubyforge.org/projects/vim-ruby/
and get the latest version of it. It does all that. Better than Emacs
hopefully. This was the version I was referring to.

Oh thank you thank you. It does what I want now. I wish I'd been slower
posting then I would've seen your later post urging us to try it. I hadn't
looked at things since the 6.2 release.

Thank you. I see matchit is working correctly as well--I was relying on
Ned Konz's version to get this.

Here is the first problem that we found:

def somemethod(something)
  return Array.new() unless @var.class == Array
end

The use of the .class method confuses the syntax. If you use the
deprecated type() method, this problem goes away.

If I notice anything else I'll be sure to provide feedback. If I can get
to the point where a gg=G can be relied on as I used to rely on Emacs, I'll
be extremely happy indeed.

Charles Hixson · Oct 10, 2003

Nikolai said:
...

Yes. And let me tell you, it's crap. Sorry, I don't want to heat up
the discussion here, but in my eyes, NEdit just isn't very good. It's
an attempt to bring Windows like editors to Unix. It has a lot of weird
design decisions that I just don't like (and neither do many of the
people I've spoken to about it at the Computer Technics (whatever that's
really called in English) Department at Uni). It's just not powerful
enough. Vim _is_ powerful enough, but it still feels wrong somehow.
Anyway,
nikolai
...

Interesting. The thing I really like about it is that it's easy to
define a new language in. (Just last month I defined D [Digital Mars D
from Walter {Bright ??}.]) Now they do this with pattern recognition,
largely regular expressions, which don't suffice for complex parsing,
but are pretty good. Good enough to make editing a lot easier. And
because of that I always have it installed. If kate knows about it,
then I prefer kate, largely because its windows are cleaner. But the
incremental search works much better in NEdit. I'll grant you that it's
been around a long time (I suspect that the Windows editors were taken
from it rather than the converse), and it's had a few different people
maintaining it with different styles, so the code-base probably isn't
too clean, and certainly won't match recent theories of program design.
And the dialogs are the old form, and not as responsive as the newer
ones that are adapted to either Gnome or KDE. But I definitely prefer
using it to vim or gvim. I've tried. (In text mode, I prefer
vim...well, actually vi, but that's vim these days. But I don't spend
any time there these last several years.)
