Re: Other notes

Discussion in 'Python' started by Jp Calderone, Dec 29, 2004.

  1. Jp Calderone

    Jp Calderone Guest

    On Wed, 29 Dec 2004 12:38:02 -0600, Mike Meyer <> wrote:
    >Jp Calderone <> writes:
    >
    > > On Wed, 29 Dec 2004 11:42:00 -0600, Mike Meyer <> wrote:
    > >> writes:
    > >>
    > >> > @infix
    > >> > def interval(x, y): return range(x, y+1) # 2 parameters needed
    > >> >
    > >> > This may allow:
    > >> > assert 5 interval 9 == interval(5,9)
    > >>
    > >> I don't like the idea of turning words into operators. I'd much rather
    > >> see something like:

    > >
    > > Really? I like "not", "and", "or", "is", and "in". It would not be nice
    > > if they were replaced with punctuation.

    >
    > They can't be turned into operators - they already are.
    >


    They weren't operators at some point (if necessary, select this point
    prior to the creation of the first programming language). Later, they
    were. Presumably in the interim someone turned them into operators.

    > > This aside, not even Python 3.0 will be flexible enough to let you define
    > > an infix decorator. The language developers are strongly against supporting
    > > macros, which is what an infix decorator would amount to.

    >
    > Could you please explain how allowing new infix operators amounts to
    > supporting macros?


    You misread - I said "what an infix decorator would amount to". Adding
    new infix operators is fine and in no way equivalent to macros.

    As you said in your reply to Steve Holden in this thread, one way
    @infix could be supported is to allow the decorator to modify the
    parser's grammar. Doesn't that sound like a macro to you?

    >
    > > Now, they might be convinced to add a new syntax that makes a function
    > > into an infix operator. Perhaps something like this:
    > >
    > > def &(..)(x, y):
    > >     return range(x, y + 1)

    >
    > And while you're at it, explain how this method of defining new infix
    > operators differs from using decorators in such a way that it doesn't
    > amount to supporting macros.


    Simple. You can't do anything except define a new infix operator with
    the hypothetical "def &( <operator> )" syntax. With real macros, you can
    define new infix operators, along with any other syntactic construct your
    heart desires.
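
    (As an aside, here is a rough sketch of how an infix-style call can
    already be faked in today's Python with nothing but operator
    overloading -- the Infix class name and the use of "|" as delimiters
    are illustrative choices, not anything proposed in this thread:

    class Infix(object):
        """Wrap a two-argument function for quasi-infix use: 5 |interval| 9."""
        def __init__(self, func):
            self.func = func
        def __ror__(self, left):             # handles  left | self
            return Infix(lambda right: self.func(left, right))
        def __or__(self, right):             # handles  self | right
            return self.func(right)

    interval = Infix(lambda x, y: range(x, y + 1))
    assert (5 |interval| 9) == range(5, 10)

    No parser or grammar change is involved; the price is the extra "|"
    delimiters.)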

    Jp
     
    Jp Calderone, Dec 29, 2004
    #1

  2. Jp Calderone

    Mike Meyer Guest

    Jp Calderone <> writes:

    > On Wed, 29 Dec 2004 12:38:02 -0600, Mike Meyer <> wrote:
    >>Jp Calderone <> writes:
    >> > This aside, not even Python 3.0 will be flexible enough to let you define
    >> > an infix decorator. The language developers are strongly against supporting
    >> > macros, which is what an infix decorator would amount to.

    >>
    >> Could you please explain how allowing new infix operators amounts to
    >> supporting macros?

    >
    > You misread - I said "what an infix decorator would amount to". Adding
    > new infix operators is fine and in no way equivalent to macros.


    You misread, I said "allowing new infix operators amounts to supporting
    macros?"

    >> > Now, they might be convinced to add a new syntax that makes a function
    >> > into an infix operator. Perhaps something like this:
    >> >
    >> > def &(..)(x, y):
    >> >     return range(x, y + 1)

    >>
    >> And while you're at it, explain how this method of defining new infix
    >> operators differs from using decorators in such a way that it doesn't
    >> amount to supporting macros.

    >
    > Simple. You can't do anything except define a new infix operator with
    > the hypothetical "def &( <operator> )" syntax. With real macros, you can
    > define new infix operators, along with any other syntactic construct your
    > heart desires.


    You failed to answer the question. We have two proposed methods for
    adding new infix operators. One uses decorators, one uses a new magic
    syntax for infix operators. Neither allows you to do anything except
    declare new infix operators. For some reason you haven't explained yet,
    you think that using decorators to declare infix operators would
    amount to macros, yet using a new syntax wouldn't. Both require
    modifying the grammar of the language accepted by the parser. How is
    it that one such modification "amounts to macros", whereas the other
    doesn't?

    <mike
    --
    Mike Meyer <> http://www.mired.org/home/mwm/
    Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
     
    Mike Meyer, Dec 30, 2004
    #2

  3. Jp Calderone

    Guest

    Thank you to all the gentle people who have given me some comments, and
    sorry for bothering you...

    Doug Holton:

    >This will also likely never appear in Python.


    I know, that's why I've defined it "wild".


    >Also, you might request the NetLogo and StarLogo developers to support
    Jython (in addition to Logo) scripting in their next version<

    I was suggesting the idea of adding the complex data structures of
    NetLogo (patches, etc) to the normal Python.

    -------------

    Andrew Dalke:

    >(BTW, it needs to be 1 .. 12 not 1..12 because 1. will be interpreted
    as the floating point value "1.0".)<

    Uhm, I have to fix my ignorance about parsers.
    Cannot a second "." after the first tell that the first "." isn't in
    the middle of a floating point number?


    >In Pascal it works because you also specify the type and Pascal has an
    incr while Python doesn't.<

    The Pascal succ function (that works on chars and all integer types) is
    often useful (it's easy to define it in Python too).
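
    For example, a minimal sketch (treating characters by code point is my
    assumption of what a Pascal-like succ should do here, not something
    defined in this thread):

    def succ(x):
        # Successor of an int, or of a single character by code point,
        # roughly mimicking Pascal's succ for ordinal types.
        if isinstance(x, str):
            return chr(ord(x) + 1)
        return x + 1

    assert succ(7) == 8
    assert succ("a") == "b"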


    >>This may allow: assert 5 interval 9 == interval(5,9)

    >Maybe you could give an example of when you need this in real life?<


    Every time you have a function with 2 parameters, you can choose to use
    it infix.


    >Does this only work for binary or is there a way to allow unary or
    other n-ary (including 0-ary?) functions?<

    It's for binary functions only.
    For unary functions the normal fun(x) syntax is good enough, and for
    n-ary functions the syntax becomes unnecessarily complex, so you can
    use the normal function syntax again.


    >But to someone with C experience or any language which derives its
    formatting string from C, Python's is easier to understand than your
    Pascal one.<

    You can be right, but "understanding" and "being already used to
    something" are still two different things :)


    >A Python view is that there should be only one obvious way to do a
    task. Supporting both C and Pascal style format strings breaks that.<

    Okay.


    >I don't think Pascal is IEEE enough.<


    Well, if you assign that FP number inside the program (and not as a
    constant) Delphi 4 too gives a more "correct" answer ^_^ (I don't
    know/remember why there are such differences inside Delphi between
    constants and variables).


    >note also that the Pascal-style formatting strings are less capable
    than Python's,<

    I know...


    >though few people use features like<


    Right.


    >A real-life example would also be helpful here.<


    (I was parsing trees for Genetic Programming).
    People here can probably suggest 10 alternative ways to do what I
    was trying to do :)
    Like list comprehensions, that suggestion of mine cannot solve new kinds
    of problems; it's just another way of doing things.


    >What does map(len, "Blah", level = 200) return?<


    Well:
    "c" == "c"[0][0][0][0][0][0]
    map(len, "blah") == [1, 1, 1, 1]
    So I think that answer is still [1, 1, 1, 1].


    >You need to learn more about the Pythonic way of thinking of things.
    The usual solution for this is to have "level = None".<

    To Steven Bethard I've suggested an alternative syntax (using a
    level-ed flatten with a standard map command).


    >There's also tmPython.<


    Thank you, the screenshots are quite nice :)
    http://dkbza.org/11.0.html

    --------------

    Steven Bethard:

    >Since I can't figure it out intuitively (even with examples), I don't
    think this syntax is any less inscrutable than '%<width>.<decimals>f'.<

    Well, I haven't given a textual explanation, but just few examples.


    >My suspicion is that you're just biased by your previous use of
    Pascal.<

    This is possible.


    >This packs two things into map -- the true mapping behaviour (applying
    a function to a list) and the flattening of a list.<

    Okay, then this is my suggestion for the syntax of an iterable
    xflatten (a rough sketch follows the parameter list):
    xflatten(sequence, level=-1, monadtuple=False, monadstr=True,
    safe=False)

    - level lets you specify the flattening level:
    level=0 no flattening.
    level=1 the flattening is applied to the first and second level.
    Etc.
    And like in the indexing of lists:
    level=-1 (default) means the flattening is applied down to the leaves.
    level=-2 flattens up to the pre-leaves.
    Etc.
    - monadtuple (default False): if True, tuples are monads (atoms).
    - monadstr (default True): if False, then strings with len>1 are
    sequences too.
    - safe (default False): if True, it checks (with something like an
    iterative isrecursive) for recursive references inside the sequence.
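
    A rough sketch of such an xflatten as a generator (this covers only the
    common cases -- non-negative levels and the default level=-1; the "safe"
    recursion check and the other negative levels are left out, and the
    exact semantics are my reading of the description above):

    def xflatten(sequence, level=-1, monadtuple=False, monadstr=True):
        """Lazily flatten nested sequences, at most `level` levels deep
        (level=-1 means flatten all the way down to the leaves)."""
        def is_atom(x):
            if isinstance(x, str):
                return monadstr or len(x) <= 1
            if isinstance(x, tuple):
                return monadtuple
            return not hasattr(x, "__iter__")
        def walk(seq, depth):
            for item in seq:
                if depth == 0 or is_atom(item):
                    yield item
                else:
                    for sub in walk(item, depth - 1):
                        yield sub
        return walk(sequence, level)

    assert list(xflatten([1, [2, [3, 4]], "hi"])) == [1, 2, 3, 4, "hi"]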


    >(Also, Google for flatten in the python-list -- you should find a
    recent thread about it.)<

    I've discussed it too in the past :)
    http://groups-beta.google.com/group/comp.lang.python/browse_thread/thread/d0ba195d98f35f66/


    >and that your second example gains nothing over<


    Right, but maybe with that you can unify the def and lambda into just
    one thing ^_^


    >def foo(x):
    >    globals()['y'] = globals()['y'] + 2
    >Not exactly the same syntax, but pretty close.


    Thank you for this idea.


    >I'll second that. Please, "Bearophile", do us the courtesy of checking
    [...] before posting another such set of questions. While most of the
    people on this list are nice enough to answer your questions anyway,
    the answers are already out there for at least half of your questions,
    if you would do us the courtesy of checking first.<

    I've read many documents, articles and PEPs, but I'm still new, so
    I've missed many things. I'm sorry... I'm doing my best.

    -----------

    Terry J. Reedy

    >I also suggest perusing the archived PyDev (Python Development mailing
    list) summaries for the last couple of years (see python.org). Every
    two weeks, Brett Cannon has condensed everything down to a few pages.<

    Okay, thank you.

    Bear hugs,
    Bearophile
     
    , Jan 6, 2005
    #3
  4. wrote:
    > Andrew Dalke:
    >>(BTW, it needs to be 1 .. 12 not 1..12 because 1. will be interpreted
    >> as the floating point value "1.0".)<

    > Uhm, I have to fix my ignorance about parsers.
    > Cannot a second "." after the first tell that the first "." isn't in
    > the middle of a floating point number?


    Python uses an LL(1) parser. From Wikipedia:
    """ LL(1) grammars, although fairly restrictive, are very popular because the
    corresponding LL parsers only need to look at the next token to make their
    parsing decisions."""

    >>>This may allow: assert 5 interval 9 == interval(5,9)

    >>Maybe you could give an example of when you need this in real life?<

    > Every time you have a function with 2 parameters, you can choose to use
    > it infix.


    But why would you want to? What advantage does this give over the standard
    syntax? Remember, in the Python philosophy there should be one obvious way
    to do it, and preferably only one. Adding a whole other way of calling
    functions complicates things without adding much advantage. Especially so
    because you suggest it be used only for binary, i.e. two-parameter,
    functions.
     
    Timo Virkkala, Jan 6, 2005
    #4
  5. Jp Calderone

    Steve Holden Guest

    Timo Virkkala wrote:

    > wrote:
    >
    >> Andrew Dalke:
    >>
    >>> (BTW, it needs to be 1 .. 12 not 1..12 because 1. will be interpreted
    >>> as the floating point value "1.0".)<

    >>
    >> Uhm, I have to fix my ignorance about parsers.
    >> Cannot a second "." after the first tell that the first "." isn't in
    >> the middle of a floating point number?

    >
    >
    > Python uses an LL(1) parser. From Wikipedia:
    > """ LL(1) grammars, although fairly restrictive, are very popular
    > because the corresponding LL parsers only need to look at the next token
    > to make their parsing decisions."""
    >

    Indeed, but if ".." is defined as an acceptable token then there's
    nothing to stop a strict LL(1) parser from disambiguating the cases in
    question. "Token" is not the same thing as "character".

    >>>> This may allow: assert 5 interval 9 == interval(5,9)
    >>>
    >>> Maybe you could give an example of when you need this in real life?<

    >>
    >> Every time you have a function with 2 parameters, you can choose to use
    >> it infix.

    >
    >
    > But why would you want to? What advantage does this give over the
    > standard syntax? Remember, in Python philosophy, there should be one
    > obvious way to do it, and preferably only one. Adding a whole another
    > way of calling functions complicates things without adding much
    > advantage. Especially so because you suggest it is only used for binary,
    > i.e. two-parameter functions.


    This part of your comments I completely agree with. However, we are used
    to people coming along and suggesting changes to Python on
    comp.lang.python. Ironically it's often those with less experience of
    Python who suggest it should be changed to be more like some other language.

    One of the things I like best about c.l.py is its (almost) unfailing
    politeness to such posters, often despite long stream-of-consciousness
    posts suggesting fatuous changes (not necessarily the case here, by the
    way). The level of debate is so high, and so rational, that the change
    requesters are often educated as to why their suggested changes wouldn't
    be helpful or acceptable, and having come to jeer they remain to
    whitewash, to use an analogy from "Tom Sawyer" [1].

    All in all a very pleasant change from "F*&% off and die, noob".

    regards
    Steve

    [1]: http://www.cliffsnotes.com/WileyCDA/LitNote/id-2,pageNum-10.html
    --
    Steve Holden http://www.holdenweb.com/
    Python Web Programming http://pydish.holdenweb.com/
    Holden Web LLC +1 703 861 4237 +1 800 494 3119
     
    Steve Holden, Jan 6, 2005
    #5
  6. Jp Calderone

    Andrew Dalke Guest

    Me
    >>>> (BTW, it needs to be 1 .. 12 not 1..12 because 1. will be interpreted
    >>>> as the floating point value "1.0".)<


    Steve Holden:
    > Indeed, but if ".." is defined as an acceptable token then there's
    > nothing to stop a strict LL(1) parser from disambiguating the cases in
    > question. "Token" is not the same thing as "character".


    Python's tokenizer is greedy and doesn't take part in the
    lookahead. When it sees 1..12 the longest match is for "1.",
    which is a float. What remains is ".12". That also gets tokenized
    as a float. <float> <float> is not allowed in Python, so the
    parser raises a compile-time SyntaxError exception:

    >>> 1..12

    File "<stdin>", line 1
    1..12
    ^
    SyntaxError: invalid syntax
    >>>


    Consider the alternative of "1..a". Again "1." is tokenized
    as a float. What remains is ".a". The longest match is "."
    with "a" remaining. Then the next token is "a". The token
    stream looks like
    <float 1.0><dot><name "a">
    which gets converted to the same thing as
    getattr(1.0, "a")

    That is legal syntax, but floats don't have an "a" attribute,
    so Python raises an AttributeError at run-time.

    >>> 1..a

    Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    AttributeError: 'float' object has no attribute 'a'
    >>>


    Here's an attribute that does exist:

    >>> 1..__abs__

    <method-wrapper object at 0x547d0>
    >>>



    Because of the greedy lexing it isn't possible to do
    "1.__abs__" to get the __abs__ method of an integer.
    That's because the token stream is

    <float 1.0><name "__abs__">

    which is a syntax error.

    >>> 1.__abs__

    File "<stdin>", line 1
    1.__abs__
    ^
    SyntaxError: invalid syntax
    >>>


    One way to avoid that is to use "1 .__abs__". See the
    space after the "1"? The tokenizer for this case creates

    <integer 1><dot><name "__abs__">


    which creates code equivalent to getattr(1, "__abs__") and
    is valid syntax:

    >>> 1 .__abs__

    <method-wrapper object at 0x54ab0>
    >>>


    Another option is to use parentheses: (1).__abs__

    I prefer this latter option because the () is easier to
    see than a space. But I prefer using getattr even more.

    Andrew
     
    Andrew Dalke, Jan 6, 2005
    #6
  7. On Thu, 06 Jan 2005 19:24:52 GMT, Andrew Dalke <> wrote:

    >Me
    >>>>> (BTW, it needs to be 1 .. 12 not 1..12 because 1. will be interpreted
    >>>>> as the floating point value "1.0".)<

    >
    >Steve Holden:
    >> Indeed, but if ".." is defined as an acceptable token then there's
    >> nothing to stop a strict LL(1) parser from disambiguating the cases in
    >> question. "Token" is not the same thing as "character".

    >
    >Python's tokenizer is greedy and doesn't take part in the
    >lookahead. When it sees 1..12 the longest match is for "1."
    >

    But it does look ahead to recognize += (i.e., it doesn't generate two
    successive also-legal tokens of '+' and '=')
    so it seems it should be a simple fix.

    >>> import tokenize, StringIO
    >>> for t in tokenize.generate_tokens(StringIO.StringIO('a=b+c; a+=2; x..y').readline): print t

    ...
    (1, 'a', (1, 0), (1, 1), 'a=b+c; a+=2; x..y')
    (51, '=', (1, 1), (1, 2), 'a=b+c; a+=2; x..y')
    (1, 'b', (1, 2), (1, 3), 'a=b+c; a+=2; x..y')
    (51, '+', (1, 3), (1, 4), 'a=b+c; a+=2; x..y')
    (1, 'c', (1, 4), (1, 5), 'a=b+c; a+=2; x..y')
    (51, ';', (1, 5), (1, 6), 'a=b+c; a+=2; x..y')
    (1, 'a', (1, 7), (1, 8), 'a=b+c; a+=2; x..y')
    (51, '+=', (1, 8), (1, 10), 'a=b+c; a+=2; x..y')
    (2, '2', (1, 10), (1, 11), 'a=b+c; a+=2; x..y')
    (51, ';', (1, 11), (1, 12), 'a=b+c; a+=2; x..y')
    (1, 'x', (1, 13), (1, 14), 'a=b+c; a+=2; x..y')
    (51, '.', (1, 14), (1, 15), 'a=b+c; a+=2; x..y')
    (51, '.', (1, 15), (1, 16), 'a=b+c; a+=2; x..y')
    (1, 'y', (1, 16), (1, 17), 'a=b+c; a+=2; x..y')
    (0, '', (2, 0), (2, 0), '')

    Regards,
    Bengt Richter
     
    Bengt Richter, Jan 6, 2005
    #7
  8. Jp Calderone

    Andrew Dalke Guest

    Bengt Richter:
    > But it does look ahead to recognize += (i.e., it doesn't generate two
    > successive also-legal tokens of '+' and '=')
    > so it seems it should be a simple fix.


    But that works precisely because of the greedy nature of tokenization.
    Given "a+=2" the longest token it finds first is "a" because "a+"
    is not a valid token. The next token is "+=". It isn't just "+"
    because "+=" is valid. And the last token is "2".

    Compare to "a+ =2". In this case the tokens are "a", "+", "=", "2"
    and the result is a syntax error.

    > >>> for t in tokenize.generate_tokens(StringIO.StringIO('a=b+c; a+=2; x..y').readline):print t

    > ...


    This reinforces what I'm saying, no? Otherwise I don't understand
    your reason for showing it.

    > (51, '+=', (1, 8), (1, 10), 'a=b+c; a+=2; x..y')


    As I said, the "+=" is found as a single token, and not as two
    tokens merged into __iadd__ by the parser.

    After some thought I realized that a short explanation may be helpful.
    There are two stages in parsing a data file, at least in the standard
    CS way of viewing things. First, tokenize the input. This turns
    characters into words. Second, parse the words into a structure.
    The result is a parse tree.

    Both steps can do a sort of look-ahead. Tokenizers usually only look
    ahead one character. These are almost invariably based on regular
    expressions. There are many different parsing algorithms, with
    different tradeoffs. Python's is an LL(1) parser. The (1) means it
    can look ahead one token to resolve ambiguities in a language.
    (The LL is part of a classification scheme which summarizes how
    the algorithm works.)

    Consider if 1..3 were to be legal syntax. Then the tokenizer
    would need to note the ambiguity that the first token could be
    a "1." or a "1". If "1.", then the next token could be a "."
    or a ".3". In fact, here is the full list of possible choices

    <1.> <.> <3> same as getattr(1., 3)
    <1> <.> <.> <3> not legal syntax
    <1.> <.3> not legal syntax
    <1> <..> <3> legal with the proposed syntax.

    Some parsers can handle this ambiguity, but Python's
    deliberately does not. Why? Because people also find
    it tricky to resolve ambiguity (hence problems with
    precedence rules). After all, should 1..2 be interpreted
    as 1. . 2 or as 1 .. 2? What about 1...2? (Is it 1. .. 2,
    1 .. .2 or 1. . .2 ?)
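
    To make the greedy behaviour concrete, here is a toy regex-based
    tokenizer (a sketch only -- not CPython's tokenizer; the token names
    and patterns are mine). With a FLOAT pattern present, "1..12" comes
    out as two floats, which is exactly the problem above; drop FLOAT and
    let a later pass fold <INT> <DOT> <INT> back into a float, and "1..12"
    would instead tokenize as <INT> <DOTDOT> <INT>:

    import re

    TOKEN_RE = re.compile(r"""
        (?P<FLOAT>  \d+\.\d* | \.\d+ )    # greedy float: "1." or ".12"
      | (?P<DOTDOT> \.\. )
      | (?P<INT>    \d+ )
      | (?P<DOT>    \. )
      | (?P<NAME>   [A-Za-z_]\w* )
      | (?P<WS>     \s+ )
    """, re.VERBOSE)

    def tokens(text):
        # Greedy left-to-right scan: at each position take the longest
        # match among the alternatives above, skipping whitespace.
        pos, out = 0, []
        while pos < len(text):
            m = TOKEN_RE.match(text, pos)
            if m is None:
                raise SyntaxError("bad character at column %d" % pos)
            if m.lastgroup != "WS":
                out.append((m.lastgroup, m.group()))
            pos = m.end()
        return out

    print(tokens("1..12"))    # [('FLOAT', '1.'), ('FLOAT', '.12')]
    print(tokens("1 .. 12"))  # [('INT', '1'), ('DOTDOT', '..'), ('INT', '12')]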


    Andrew
     
    Andrew Dalke, Jan 7, 2005
    #8
  9. Jp Calderone

    Steve Holden Guest

    Andrew Dalke wrote:

    > Bengt Richter:
    >
    >>But it does look ahead to recognize += (i.e., it doesn't generate two
    >>successive also-legal tokens of '+' and '=')
    >>so it seems it should be a simple fix.

    >
    >
    > But that works precisely because of the greedy nature of tokenization.
    > Given "a+=2" the longest token it finds first is "a" because "a+"
    > is not a valid token. The next token is "+=". It isn't just "+"
    > because "+=" is valid. And the last token is "2".
    >

    [...]

    You're absolutely right, of course, Andrew, and personally I don't think
    that this is worth trying to fix. But the original post I responded to
    was suggesting that an LL(1) grammar couldn't disambiguate "1." and
    "1..3", which assertion relied on a slight fuzzing of the lines between
    lexical and syntactical analysis that I didn't want to leave unsharpened.

    The fact that Python's existing tokenizer doesn't allow multi-character
    tokens beginning with a dot after a digit (roughly speaking) is what
    makes the proposed syntax so hard to accommodate.

    regards
    Steve
    --
    Steve Holden http://www.holdenweb.com/
    Python Web Programming http://pydish.holdenweb.com/
    Holden Web LLC +1 703 861 4237 +1 800 494 3119
     
    Steve Holden, Jan 7, 2005
    #9
  10. On Fri, 07 Jan 2005 06:04:01 GMT, Andrew Dalke <> wrote:

    >Bengt Richter:
    >> But it does look ahead to recognize += (i.e., it doesn't generate two
    >> successive also-legal tokens of '+' and '=')
    >> so it seems it should be a simple fix.

    >
    >But that works precisely because of the greedy nature of tokenization.

    So what happens if you apply greediness to a grammar that has both . and ..
    as legal tokens? That's the point I was trying to make. The current grammar
    unfortunately IMHO tries to tokenize floating point numbers, which creates a problem
    for both numbers and (if you don't isolate it with surrounding spaces) the .. token.

    There would UIAM be no problem recognizing 1 .. 2 but 1..2 has the problem that
    the current tokenizer recognizes 1. as number format. Otherwise the greediness would
    work to solve the 1..2 "problem."

    IMHO it is a mistake to form floating point at the tokenizer level, and a similar mistake
    follows at the AST level in using platform-specific floating point constants (doubles) to
    represent what really should preserve full representational accuracy to enable translation
    to other potential native formats e.g. if cross compiling. Native floating point should IMO
    not be formed until the platform to which it is supposed to be native is identified.

    IOW, I think there is a fix: keep tokenizing greedily and tokenize floating point as
    a sequence of integers and operators, and let <integer><dot><integer> be translated by
    the compiler to floating point, and <integer><dotdot><integer> be translated to the
    appropriate generator expression implementation.


    >Given "a+=2" the longest token it finds first is "a" because "a+"
    >is not a valid token. The next token is "+=". It isn't just "+"
    >because "+=" is valid. And the last token is "2".
    >

    Exactly. Or I am missing something? (which does happen ;-)

    >Compare to "a+ =2". In this case the tokens are "a", "+", "=", "2"
    >and the result is a syntax error.

    Sure.
    >
    >> >>> for t in tokenize.generate_tokens(StringIO.StringIO('a=b+c; a+=2; x..y').readline):print t

    >> ...

    >
    >This reinforces what I'm saying, no? Otherwise I don't understand
    >your reason for showing it.

    It does reinforce your saying the matching is greedy, yes. But that led me to
    think that .. could be recognized without a problem, given a grammar fix.
    >
    >> (51, '+=', (1, 8), (1, 10), 'a=b+c; a+=2; x..y')

    >
    >As I said, the "+=" is found as a single token, and not as two
    >tokens merged into __iadd__ by the parser.

    No argument.
    >
    >After some thought I realized that a short explanation may be helpful.
    >There are two stages in parsing a data file, at least in the standard
    >CS way of viewing things. First, tokenize the input. This turns
    >characters into words. Second, parse the words into a structure.
    >The result is a parse tree.
    >
    >Both steps can do a sort of look-ahead. Tokenizers usually only look
    >ahead one character. These are almost invariably based on regular
    >expressions. There are many different parsing algorithms, with
    >different tradeoffs. Python's is a LL(1) parser. The (1) means it
    >can look ahead one token to resolve ambiguities in a language.
    >(The LL is part of a classification scheme which summarizes how
    >the algorithm works.)
    >
    >Consider if 1..3 were to be legal syntax. Then the tokenizer
    >would need to note the ambiguity that the first token could be
    >a "1." or a "1". If "1." then then next token could be a "."
    >or a ".3". In fact, here is the full list of possible choices
    >
    > <1.> <.> <3> same as getattr(1., 3)
    > <1> <.> <.> 3 not legal syntax
    > <1.> <.3> not legal syntax
    > <1> <..> <3> legal with the proposed syntax.
    >

    Right, but a grammar fix to handle floating point "properly" (IMHO ;-)
    would resolve that. Only the last would be legal at the tokenizer stage.

    >Some parsers can handle this ambiguity, but Python's
    >deliberately does not. Why? Because people also find

    I'm not sure what you mean. If it has a policy of greedy matching,
    that does handle some ambiguities in a particular way.

    >it tricky to resolve ambiguity (hence problems with
    >precedence rules). After all, should 1..2 be interpreted
    >as 1. . 2 or as 1 .. 2? What about 1...2? (Is it 1. .. 2,

    [1] "1 .. 2": plainly (given my grammar changes ;-)
    [2] "1. .. 2": as plainly not, since <1> would not greedily accept '.' to make <1.>

    >1 .. .2 or 1. . .2 ?)

    [3] "1 .. .2": no, because greed would recognize an ellipsis <...>
    [4] "1. . .2": ditto. Greedy tokenization would produce <1> <...> <2> for the compiler.

    Regards,
    Bengt Richter
     
    Bengt Richter, Jan 7, 2005
    #10
  11. Jp Calderone

    Nick Coghlan Guest

    Bengt Richter wrote:
    > IOW, I think there is a fix: keep tokenizing greedily and tokenize floating point as
    > a sequence of integers and operators, and let <integer><dot><integer> be translated by
    > the compiler to floating point, and <integer><dotdot><integer> be translated to the
    > appropriate generator expression implementation.


    That would be:

    <int-literal><dot><int-literal> -> float(<int-literal> + "." + <int-literal>)
    <int-literal><dot><identifier> -> getattr(int(<int-literal>), <identifier>)
    <int-literal><dot><dot><int-literal> -> xrange(<int-literal>, <int-literal>)

    However, the problem comes when you realise that 1e3 is also a floating point
    literal, as is 1.1e3.
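
    A rough sketch of that folding as a post-tokenization pass (the function
    name, the (kind, text) token format and the token kind names are
    assumptions for illustration; the range endpoints are kept as a plain
    pair rather than building an xrange, and exponent literals like 1e3 and
    1.1e3 are deliberately not handled, which is exactly the complication
    noted above):

    def fold_numbers(toks):
        """Fold <INT><DOT><INT> into a float literal and <INT><DOTDOT><INT>
        into a range literal; everything else (e.g. <INT><DOT><NAME>, the
        attribute-access case) passes through untouched."""
        out, i = [], 0
        while i < len(toks):
            kind, text = toks[i]
            nxt = [k for k, _ in toks[i + 1:i + 3]]
            if kind == "INT" and nxt == ["DOT", "INT"]:
                out.append(("FLOAT_LIT", float(text + "." + toks[i + 2][1])))
                i += 3
            elif kind == "INT" and nxt == ["DOTDOT", "INT"]:
                out.append(("RANGE_LIT", (int(text), int(toks[i + 2][1]))))
                i += 3
            else:
                out.append((kind, text))
                i += 1
        return out

    assert fold_numbers([("INT", "1"), ("DOTDOT", ".."), ("INT", "12")]) == \
           [("RANGE_LIT", (1, 12))]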

    Cheers,
    Nick.

    --
    Nick Coghlan | | Brisbane, Australia
    ---------------------------------------------------------------
    http://boredomandlaziness.skystorm.net
     
    Nick Coghlan, Jan 8, 2005
    #11
  12. On Sat, 08 Jan 2005 18:22:53 +1000, Nick Coghlan <> wrote:

    >Bengt Richter wrote:
    >> IOW, I think there is a fix: keep tokenizing greedily and tokenize floating point as
    >> a sequence of integers and operators, and let <integer><dot><integer> be translated by
    >> the compiler to floating point, and <integer><dotdot><integer> be translated to the
    >> appropriate generator expression implementation.

    >
    >That would be:
    >
    ><int-literal><dot><int-literal> -> float(<int-literal> + "." + <int-literal>)
    ><int-literal><dot><identifier> -> getattr(int(<int-literal>), <identifier>)
    ><int-literal><dot><dot><int-literal> -> xrange(<int-literal>, <int-literal>)
    >
    >However, the problem comes when you realise that 1e3 is also a floating point
    >literal, as is 1.1e3.
    >

    Ok, that requires a little more imagination ;-)

    I think it can be solved, but I haven't explored it all the way through ;-)

    The key seems to be to condition the recognition of tokens as if recognizing
    an old-style, potentially floating point number, but emitting number-relevant
    separate tokens so long as there are no embedded spaces. When a number ends,
    emitting an end marker should permit the compiler to deal with the various
    compositions.

    We still aren't looking ahead more than one, but we are carrying context, just
    as we do to accumulate digits of an integer or characters of a name, but the
    context may continue and determine what further tokens are emitted. E.g. the
    'e' in the embedded numeric context becomes <fexp> rather than a name. In the
    following, <eon> ::= end of number token

    1.1 -> <1><dot><1><eon>
    1 .1 -> <1><eon><dot><1><eon>
    1.e1 -> <1><dot><fexp><1><eon>
    1 .e1 -> <1><eon><dot><e1>
    1.2e3 -> <1><dot><2><fexp><3><eon>
    1..2 -> <1><eon><doubledot><2><eon>
    1. .2 -> <1><dot><eon><dot><2><eon> (syntax error)
    1 ..2 -> <1><eon><doubledot><2><eon>

    I just have the feeling that there is a solution, whatever the hiccups ;-)

    Regards,
    Bengt Richter
     
    Bengt Richter, Jan 8, 2005
    #12