Re: Meta-Characters, Special Characters

Discussion in 'Java' started by xah@xahlee.org, May 30, 2007.

  1. Guest

    Will (aka weber) wrote:
    «
    [about the various ways to input or represent keystrokes and or non-
    printable characters in Emacs]

    As far as I can see in all those situations entering meta-characters
    is
    addressed in a different way which I find confusing, e.g.:
    a) <key> _or_ C-q <key>
    b) C-q C-[, C-q C-m, C-q C-j, C-q C-i
    c) \e, \r, \n, \t
    d) (define-key [(meta c) (control c) (tab c)] "This is confusing!")
    »

    None of this complexity is intrinsic.

    Will wrote:
    «a) <key> _or_ C-q <key>»

    The C-q (that is, holding down the Control key and typing q) is the
    keyboard shortcut that invokes the command quoted-insert. It is a
    general way to input any non-printable character. This
    facility usually doesn't exist in other text editors. In popular text
    editors such as Microsoft Word or Mac applications, you usually bring
    up a window showing all the special characters, then press a button to
    insert the char you want.

    « b) C-q C-[, C-q C-m, C-q C-j, C-q C-i»

    In this, C-q is the keyboard shortcut that invokes the command
    quoted-insert, which inserts a literal character for whatever key
    you type next. So, for example, C-q followed by the
    tab key will insert the non-printable character tab.

    When speaking of non-printable characters, the context is a character
    set standard. Implicitly, we are talking about ASCII, and this applies
    to emacs. Now, in ASCII, there are 33 non-printable characters (codes
    0 through 31, plus 127). Each of these is given a standard
    abbreviation, and several representations for different purposes. For
    example, ASCII 13 is the “Carriage return” character, with
    abbreviation CR, and ^M as its control-key-input representation. (M is
    the 13th letter of the English alphabet.)

    For the full detail, look at the table here:
    http://en.wikipedia.org/wiki/Ascii

    (Note: Emacs also has a general way to input non-printable characters
    of the unicode standard. See
    Emacs and Unicode Tips
    http://xahlee.org/emacs/emacs_n_unicode.html
    )

    « c) \e, \r, \n, \t »

    This is an ad hoc set of input and display representations for a few
    non-printable characters. This set was started by the motherfucking unix
    tech geeking morons, and, spreading as freely and quickly as cigarettes
    given to children, has today reached many languages (Perl, Java, C++,
    C#, Python, JavaScript ...) and is a de facto standard. The damage
    is to such a degree that the general concept of unprintable
    characters, their representation, and their method of input, all
    treated in one systematic, simple way, is not in the consciousness of
    average industrial programmers.

    I do not know the history of these display representations.
    (Hopefully someone does.) My guess is that part of the reason for them
    is that the unix text editor vi doesn't have a general way to input
    non-printable chars.

    « d) (define-key [(meta c) (control c) (tab c)] "This is confusing!")
    »

    This is the only part of the complexity in our context that we can
    blame on emacs's design. Emacs has several ways to represent keystrokes
    for defining shortcuts. The variety comes mostly from historical
    reasons, combined with the influence of the unix mentality “Why Change
    when it ain't broken”.

    Note here that keystroke combinations and sequences are not the same
    as, and cannot be mapped to, the input or representation of characters
    in a character set such as ASCII. For example, the F1 key on the vast
    majority of keyboards isn't a character. So when you have an
    editor with a language, such as emacs, that allows users to define
    arbitrary keystroke sequences, you necessarily have to come up with a
    system to represent keystrokes. So, this complexity is an intrinsic
    complexity.

    (Side note: An easy way to understand intrinsic vs extraneous
    complexity is to think: “My god, why is math so complex? God must have
    fucked up in its design.” The gist is that certain things are
    inherently complex by nature, while others carry extraneous complexity
    that is artificially created by lousy design or evolution. As a
    concrete example in computing, languages like Lisp are in general very
    well designed. Due to their simplicity and near absence of artificial
    complexity, programmers are immediately exposed to much of the
    intrinsic complexity of computing. Whereas languages like C and its
    litters such as C++, Java, C#, Perl etc, created by the unix
    motherfuckers, are filled to the brim with artificial complexity due
    to tremendous laziness, ignorance, and lies.)

    For various ways to represent keyboard shortcuts, see
    http://xahlee.org/emacs/keyboard_shortcuts.html

    For the unix mentality “Why Change when it ain't broken”, see
    http://xahlee.org/UnixResource_dir/writ/aint_broken.html

    We, as software creators, must not have unix's “why change when it
    ain't broken” attitude. Emacs itself, although far better thought
    out than the majority of software, has nevertheless acquired much
    baggage in its 30 or so years. I would recommend that we start an
    effort to eliminate some of this outdated baggage. Please see:

    “The Modernization of Emacs”
    http://xahlee.org/emacs/modernization.html

    Xah

    ∑ http://xahlee.org/


    On May 29, 5:58 am, Will <> wrote:
    > Hi,
    >
    > how can I find an overview on how to enter meta-characters
    > (e.g. esc, return, linefeed, tab, ...)
    > a) in a regular buffer
    > b) in the minibuffer when using standard search/replace-functions
    > c) in the minibuffer when using search/replace-functions using regular
    > expressions
    > d) in the .emacs file when defining keybindings
    >
    > As far as I can see in all those situations entering meta-characters is
    > addressed in a different way which I find confusing, e.g.:
    > a) <key> _or_ C-q <key>
    > b) C-q C-[, C-q C-m, C-q C-j, C-q C-i
    > c) \e, \r, \n, \t
    > d) (define-key [(meta c) (control c) (tab c)] "This is confusing!")
    >
    > Furthermore, they are displayed in a different way,e.g.
    > - actual, visible layout
    > - ^E, ^M, ^L, ^I
    > - Octals
    >
    > I would be happy about pages summarizing such information.
    > Any references available?
    >
    > Thanks in advance,
    >
    > Will
     
    , May 30, 2007
    #1

  2. Guest

    The following is a modified and extended version of the previous
    article.
    The HTML formatted version is available at
    http://xahlee.org/emacs/keystroke_rep.html

    The Confusion of Emacs's Keystroke Representation

    Xah Lee, 2007-05-29

    Someone wrote:
    «
    [about the various ways to input or represent keystrokes and or
    non-printable characters in Emacs]

    As far as I can see in all those situations entering meta-characters
    is
    addressed in a different way which I find confusing, e.g.:
    a) <key> _or_ C-q <key>
    b) C-q C-[, C-q C-m, C-q C-j, C-q C-i
    c) \e, \r, \n, \t
    d) (define-key [(meta c) (control c) (tab c)] "This is confusing!")
    »

    None of this complexity is intrinsic, except your item d. Your first
    item:

    «C-q <key>»

    The C-q (that is, holding down the Control key and typing q) is the
    keyboard shortcut that invokes the command quoted-insert. After this
    command is invoked, the next key press on your keyboard makes
    emacs insert the character represented by that key, and suppresses
    that key's normal function.

    For example, suppose you are doing string replacement and you want to
    replace tabs by returns. When emacs prompts you to type a string to
    replace, you can't just press the tab key, because the normal function
    of the tab key at that prompt is to do a command completion (and in
    other applications, it usually switches you to the next input field).
    So, here you can do C-q first, then press the tab key. Similarly, you
    can't type the return key and expect it to insert a return character,
    because normally the return key will activate the OK button or “end of
    input”.

    This input mechanism usually doesn't exist in other text editors. In
    popular text editors such as Microsoft Word or Mac applications, you
    usually bring up a window showing all the special characters, then
    press a button to insert the char you want.

    «C-q C-[, C-q C-m, C-q C-j, C-q C-i»

    In this, C-q is the keyboard shortcut that invokes the command
    quoted-insert, which inserts a literal character for whatever key
    you type next. So, for example, C-q followed by the
    tab key will insert the non-printable character “tab”.

    The C-[, C-m, C-j etc key-press combinations (holding down the Control
    key while pressing “[”, “m”, “j”) are methods to input non-printable
    characters that may not have a corresponding key on the keyboard.

    For example, suppose you want to do string replacement, replacing
    Carriage Return (ASCII 13) by Line Feed (ASCII 10). Depending on
    your operating system and keyboard, your keyboard usually has a
    key that corresponds to just one of these characters. But with this
    method of inputting non-printable characters, you can type any
    of them directly.
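    The same replacement can also be written in elisp instead of typed at
    the prompt. What follows is only a sketch: the function name
    my-cr-to-lf is made up for illustration, and it assumes you want every
    Carriage Return in the current buffer replaced.

```elisp
;; Hypothetical helper (the name is an assumption, not part of Emacs).
(defun my-cr-to-lf ()
  "Replace each Carriage Return (ASCII 13) with Line Feed (ASCII 10)."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    ;; (string 13) is the one-character string holding Carriage Return.
    (while (search-forward (string 13) nil t)
      ;; The two t arguments keep case as is and insert the text literally.
      (replace-match (string 10) t t))))
```

    Interactively, the equivalent is M-% C-q C-m RET C-q C-j RET.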

    When speaking of non-printable characters, implied in the context is
    some standard character set. Implicitly, we are talking about ASCII,
    and this applies to emacs. Now, in ASCII, there are 33 non-printable
    characters (codes 0 through 31, plus 127). Each of these is given a
    standard abbreviation, and several representations for different
    purposes. For example, ASCII 13 is the “Carriage return” character,
    with abbreviation CR, and ^M as its control-key-input representation
    (M being the 13th letter of the English alphabet). Control-m is the
    conventional means to input the character, and the conventional way
    to write a control key combination is the caret “^” followed by the
    character.
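    The relation between the letter and the code can be checked in elisp.
    This is a small sketch, not anything built into emacs: the helper name
    my-control-code is made up, and it only covers the ASCII range.

```elisp
;; Hypothetical helper: the name is made up for this illustration.
(defun my-control-code (char)
  "Return the ASCII code produced by holding Control and typing CHAR."
  ;; Keeping only the low 5 bits maps M (77) to 13, J (74) to 10, and so on.
  (logand (upcase char) 31))

(my-control-code ?m)   ; => 13, Carriage Return, written ^M
(my-control-code ?j)   ; => 10, Line Feed, written ^J
(my-control-code ?i)   ; => 9, Tab, written ^I
(my-control-code ?\[)  ; => 27, Escape, written ^[

;; Emacs's own read syntax for the same character agrees:
(= ?\C-m 13)           ; => t
```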

    For the full detail, look at the table in the wikipedia article
    ASCII: http://en.wikipedia.org/wiki/Ascii

    In general, the practical issues involved for a non-printable
    character, in the context of a programming language for text editing,
    are: its display representation, its input method, and the display
    representation for the character's input method.

    (Note: Emacs also has a general way to input non-printable and/or
    non-typable characters of the unicode standard. See Emacs and Unicode
    Tips: http://xahlee.org/emacs/emacs_n_unicode.html )

    «\e, \r, \n, \t »

    This is an ad hoc set of input and display representations for a few
    non-printable characters. This set was started by the motherfucking unix
    tech geeking morons, and, spreading as freely and quickly as cigarettes
    given to children, has today reached many languages (Perl, Java, C++,
    C#, Python, JavaScript ...) and is a de facto standard. The damage
    is to such a degree that the general concept of unprintable
    characters, their representation, and their method of input, all
    treated in one systematic, simple way, is not in the consciousness of
    average industrial programmers.

    I do not know the history of these display representations. My
    guess is that part of the reason for them is that the unix text editor
    vi doesn't have a general way to input and/or represent non-printable
    chars. Other reasons are that these particular non-printable chars are
    needed far more frequently in text/string manipulation among
    programming languages, that the backslash representations are somewhat
    more intuitive, and that processing backslashed characters as a “string
    escape” mechanism works better as a representation inside strings for
    programming languages than the representation of prefixing a caret
    “^”.

    «
    (global-set-key (kbd "M-a") 'func-name) ; meta a
    (global-set-key (kbd "C-a") 'func-name) ; control a
    (global-set-key [f2] 'func-name) ; F2 key
    (global-set-key [kp-2] 'func-name) ; the 2 key on the number keypad
    (global-set-key [M-f2] 'func-name) ; meta f2
    (global-set-key [(meta shift a)] 'func-name) ; Meta shift a (capital A)
    (global-set-key [?\C-x ?a] 'func-name) ; control x, followed by a
    (global-set-key [?\C-x f2] 'func-name) ; control x, followed by f2
    [This is confusing!]
    »

    This is elisp code to define keyboard shortcuts. This is the only
    part of the complexity in our context that we can blame on emacs's
    design.

    Emacs today has several rather confusing ways for keystroke
    representation, mostly for historical reasons. For example, the
    need to keep compatibility between Emacs and XEmacs. Another
    reason is that elisp the language uses integers to represent
    printable characters. So, for example, the number 97 in lisp's
    keystroke code also means the keystroke “a”. These mostly historical
    reasons are exacerbated by the influence of the unix mentality “Why
    Change when it ain't broken”.
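    The point about integers is easy to see by evaluating a few forms.
    This is just a sketch: it binds the key in a throwaway keymap so no
    real binding is touched, and the choice of the command ignore is
    arbitrary.

```elisp
;; A character literal is just an integer.
(= ?a 97)             ; => t
(char-to-string 97)   ; => "a"

;; So a key sequence can be written with the integer or with a string.
;; A scratch keymap is used here so no real binding is touched.
(let ((map (make-sparse-keymap)))
  (define-key map [97] 'ignore)   ; bind the key "a" by its character code
  (lookup-key map "a"))           ; => ignore
```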

    Note here that keystroke combinations and sequences are not the same
    as, and cannot be mapped to, the input or representation of characters
    in a character set such as ASCII. For example, the F1 key on the vast
    majority of keyboards isn't a character. The Alt modifier key isn't a
    character, nor is it the function of one of ASCII's non-printable
    characters. The keys on the number keypad need a different
    representation than the ones on the main keyboard section.

    So, this means that when you have an editor with a language, such as
    emacs, that allows users to define arbitrary keystroke combinations and
    sequences, you necessarily have to come up with a system to represent
    keystrokes. So, this complexity is an intrinsic complexity.
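    To see that a function key has no character behind it, one can compare
    how elisp treats the two kinds of keys. A short sketch, using only
    built-in functions:

```elisp
;; A printable key is a character, that is, an integer.
(characterp ?a)                   ; => t
;; A function key has no character code; Emacs names it with a symbol.
(characterp 'f1)                  ; => nil
(equal (kbd "<f1>") [f1])         ; => t

;; key-description prints a key sequence in the notation used in the manual.
(key-description [f1])            ; => "<f1>"
(key-description [?\C-x f2])      ; => "C-x <f2>"
(key-description [kp-2])          ; => "<kp-2>"
```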

    (Side note: An easy way to understand intrinsic vs extraneous
    complexity is to think: “My god, why is math so complex? God must have
    fucked up in its design.” The gist is that certain things are
    inherently complex by nature, while others carry extraneous complexity
    that is artificially created by lousy design or historical baggage.
    As a concrete example in computing, languages like Lisp are in general
    very well designed. Due to their simplicity and near absence of
    artificial complexity, programmers are immediately exposed to much of
    the intrinsic complexity of computing. Whereas languages like C and its
    litters such as C++, Java, C#, Perl etc, and in general software in
    unix, created by the unix motherfuckers, are filled to the brim with
    artificial complexity due to mostly laziness/hack, ignorance, and
    lies.)

    For various ways to represent keystrokes in emacs, see How to Define
    Keyboard Shortcuts in Emacs:
    http://xahlee.org/emacs/keyboard_shortcuts.html

    For the unix mentality “Why Change when it ain't broken”, see Why
    Change when it ain't broken:
    http://xahlee.org/UnixResource_dir/writ/aint_broken.html

    We, as software creators, must not have unix's “why change when it
    ain't broken” attitude. Emacs itself, although far better thought
    out than the majority of software, has nevertheless acquired much
    baggage in its 30 or so years. I would recommend that we start an
    effort to eliminate some of this outdated baggage. Please see: The
    Modernization of Emacs: http://xahlee.org/emacs/modernization.html

    Xah

    ∑ http://xahlee.org/
     
    , May 31, 2007
    #2

  3. Ingo Menger Guest

    On 30 May, 01:29, wrote:
    > Will (aka weber) wrote:
    >
    > «
    > [about the various ways to input or represent keystrokes and or non-
    > printable characters in Emacs]
    >
    > As far as I can see in all those situations entering meta-characters
    > is
    > addressed in a different way which I find confusing, e.g.:
    > a) <key> _or_ C-q <key>
    > b) C-q C-[, C-q C-m, C-q C-j, C-q C-i
    > c) \e, \r, \n, \t
    > d) (define-key [(meta c) (control c) (tab c)] "This is confusing!")
    > »
    >
    > None of this complexity is intrinsic.
    >
    > Will wrote:
    >
    > «a) <key> _or_ C-q <key>»
    >
    > The C-q (or, pressing the Control key down then type q) is the
    > keyboard shortcut to invoke the command quoted-insert. It is a
    > general way to allow you to input any non-printable characters. This
    > facility usually doesn't exist in other text editors. In popular text
    > editors such as Microsoft Word or Mac applications, you usually bring
    > up a window showing all the special characters, then press a button to
    > insert the char you want.
    >
    > « b) C-q C-[, C-q C-m, C-q C-j, C-q C-i»
    >
    > In this, the C-q is the keyboard shortcut to invoke the command quoted-
    > insert, which will insert a literal character of whatever character
    > you can type on your keyboard. So, for example, C-q followed by the
    > tab key will insert the non-printable character tab.
    >
    > When speaking of non-printable characters, the context is a character
    > set standard. Implicitly, we are talking about ASCII, and this applies
    > to emacs. Now, in ASCII, there are about 30 non-printable characters.
    > Each of these is given a standard abbreviation, and several
    > representations for different purposes. For example, ASCII 13 is the
    > "Carriage return" character, with abbr code CR, and ^M as its control-
    > key-input representation. (M being the 13th of the English alphabet)
    >
    > For the full detail, look at the table here: http://en.wikipedia.org/wiki/Ascii
    >
    > (Note: Emacs also has a general way to input non-printable characters
    > of the unicode standard. See
    > Emacs and Unicode Tips http://xahlee.org/emacs/emacs_n_unicode.html
    > )
    >
    > « c) \e, \r, \n, \t »
    >
    > This is a ad-hoc set of input and display representation for a few non-
    > printable characters. This set is started by the motherfucking unix
    > tech geeking morons, and by its free and speedy nature as cigarette
    > given to children, today has spread to many languages (Perl, Java,
    > C++, C#, Python, JavaScript ...) and is a de facto standard. The damage
    > is to such a degree that the general concept of unprintable
    > characters, their representation, and their method of input, all
    > treated in one systematic, simple way, are not in the consciousness of
    > average industrial programers.


    At least not in yours, it seems. You do not understand that \n is not
    a way to enter the newline character, but a way to name the newline
    character without actually using it right now.
    The difference between using a character and mentioning (i.e. speaking
    about) a character has not come to your mind yet, has it?

    > I do not know the history of these display representations. (hopefully
    > someone will) It is my guess, that part of the reason for these, is
    > that the unix text editor vi, doesn't have a general way to input non-


    Type Ctrl+V when vi is in input mode and then type the character you
    want.
    But note that, in most languages, the string literals
    "Xah Lee
    knows not much"
    and
    "Xah Lee\n knows not much"
    are very different. In fact, some languages will not even recognize
    the first one as string literal.
    This, again, has to do with the fact that string literals are a way
    to *mention* characters that the compiled program will later *use*.
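    In Emacs Lisp, for instance, the difference between mentioning and
    using can be checked directly. This is only a small sketch, and the
    strings are arbitrary examples:

```elisp
;; "\n" mentions the newline: two typed characters, one character in the string.
(length "\n")             ; => 1
(aref "\n" 0)             ; => 10, Line Feed
;; Doubling the backslash mentions a backslash followed by the letter n.
(length "\\n")            ; => 2
;; One extra space after the escape makes a different string.
(equal "a\nb" "a\n b")    ; => nil
```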
     
    Ingo Menger, May 31, 2007
    #3
