This is not a question which really helps in understanding C++
better, nor is it a good job interview question (well, not unless
you are applying for a job which involves writing a compiler),
but I think it's interesting nevertheless:
C++ is very hard to parse because its grammar is not
context-free. Give an example (one full statement, i.e. a full
expression ending in a semicolon) of valid code which cannot be
unambiguously tokenized without knowing the environment in which
the line of code appears (i.e. everything else in the same
compilation unit). In other words, it would be possible to
tokenize the statement in at least two completely different
ways, and both ways could be valid C++ code (in the proper
environment).
(Note that tokenizing a statement doesn't require understanding
the semantics of the expression, i.e. it's not necessary to
know, e.g., whether some type name has been declared earlier.
Tokenizing simply means that the statement is divided into its
constituent tokens, each token having a well-defined type, e.g.
"identifier", "unary operator", "binary operator", "opening
parenthesis", etc.)
The problem with that is that the question is ambiguous: what do
you mean by a token? (As for your "well-defined type", that's a
meaningless statement until you know how the compiler internals
are implemented.)
Formally, C++ defines tokens so that you can always "tokenize"
with at most one character look-ahead (is the next character
part of this token, or not), and no context. Practically,
internally, it's impossible to parse C++ if you don't separate
symbols into names of types, names of templates, and other, and
I imagine that most compilers treat these as separate tokens.
Similarly, it's probably advantageous to distinguish between the >
which closes a template argument list and the > which is the
greater-than operator; with the new standard, I suspect that the
simplest implementation would also distinguish between a >> which
closes two template argument lists and the right shift operator.
(Formally, that >> is a single token which is then remapped to
two, but if you know that the context would allow the remapping,
you could do it immediately in the tokenizing phase.)
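
To make that concrete, here is a small sketch of the kind of
context dependence I mean. The characters "T * p;" split into the
same four formal tokens in both namespaces below; what changes is
whether T names a type (so the statement is a declaration) or an
object (so it is an expression). The names as_declaration,
as_expression, Value, calls, nested_size and shifted are made up
for the illustration, and the >> case assumes a C++11 or later
compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Context 1: T names a type, so "T * p;" declares p as a pointer to T.
namespace as_declaration {
    struct T {};
    bool run() {
        T * p;              // declaration of p, type T*
        p = nullptr;
        return p == nullptr;
    }
}

// Context 2: T and p name objects, so the same characters form an
// expression statement which calls the overloaded operator*.
namespace as_expression {
    struct Value {};
    int calls = 0;          // counts calls, to make the effect observable
    Value operator*(Value, Value) { ++calls; return Value{}; }
    int run() {
        Value T{}, p{};
        T * p;              // expression: invokes operator*
        return calls;
    }
}

// ">>" closing two template argument lists (C++11 and later) ...
std::size_t nested_size() {
    std::vector<std::vector<int>> grid(3);
    return grid.size();
}

// ... versus ">>" as the right shift operator.
int shifted() { return 256 >> 2; }
```

A compiler front end which hands the parser a "type name" token
for T in the first namespace and a plain "identifier" token in the
second is doing exactly the kind of context-dependent
classification described above.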
So formally, there aren't any, but internally, there could be,
and in fact, probably are. (In practice, I would be very
surprised if there were any compilers which didn't use context
to return different token types for type names, template names
and other symbols; as long as >> cannot be used to close two
templates, I expect that that's the only case in most compilers,
so presumably, that's what you were looking for.)
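
As a rough sketch of what such a context-dependent tokenizer could
look like (this is not taken from any real compiler; Kind, Token,
Context and tokenize are hypothetical names, and the scanner is
deliberately simplified to identifiers and single-character
punctuation), the scanner looks each identifier up in tables which
the parser would fill in as it processes declarations:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class Kind { TypeName, TemplateName, Identifier, Punct };

struct Token {
    Kind kind;
    std::string text;
};

// Hypothetical symbol tables; a real parser would update these as it
// sees declarations, and would also have to handle scopes.
struct Context {
    std::set<std::string> types;
    std::set<std::string> templates;
};

// Splits src into tokens, classifying each identifier by looking it
// up in ctx. Everything that is not an identifier or white space is
// returned as a one-character Punct token (a simplification).
std::vector<Token> tokenize(const std::string& src, const Context& ctx) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            Kind kind = Kind::Identifier;
            if (ctx.types.count(word)) {
                kind = Kind::TypeName;
            } else if (ctx.templates.count(word)) {
                kind = Kind::TemplateName;
            }
            out.push_back({kind, word});
        } else {
            out.push_back({Kind::Punct, std::string(1, src[i])});
            ++i;
        }
    }
    return out;
}
```

With "a" registered in Context::types, tokenize("a * b;", ctx)
returns a TypeName as its first token; with empty tables, the same
text starts with a plain Identifier. The character boundaries of
the tokens are identical in both cases; only the token type
differs.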