Endless loop "--(end of buffer or a NULL)" in Flex++

张扬 Santa

I have recently been developing an SQL parser with Bison++ and Flex++ under
Windows. The SQL statements we parse may contain Chinese characters in
strings. Sometimes the Chinese characters are parsed successfully,
but most often the lexer fails, goes into an endless loop, and
prints "--(end of buffer or a NULL)" continuously.

Since we are parsing statements from a std::string rather than from an
input stream, we define some macros in our lexer.l. The INPUT_CODE macro
is defined as:

%define INPUT_CODE \
result = 0; \
while (result < max_size && pos < inputText.length()) { \
buffer[result] = inputText[pos]; \
pos++; \
result++; \
} \
return 0;


inputText is a member of Lexer in which we store the SQL
statement to be parsed, and "pos" is an int member of Lexer that
stores the number of characters we have lexed so far.


Could somebody please tell us how to deal with the endless loop of
"--(end of buffer or a NULL)"? Thank you very much! :)
 
s0suk3

First of all, a couple of stylistic suggestions:

Your macro has a few odd properties generally considered bad
programming practices: it requires variables, members or macros named
"result", "max_size", "pos", "inputText" and "buffer" to be defined at
the time it's called. Also, it causes the calling function to return,
which can make your code look confusing. Finally, it consists of
several statements, which can cause the program to behave differently
from what you intended (for example, if it appears after an if
statement with no braces) and/or leave the program with a syntax error
after preprocessing (if it appears after an if statement with no
braces that's followed by an else clause).
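
To make the if-without-braces hazard concrete, here is a minimal C++ sketch. The macro names BUMP_TWICE_UNSAFE and BUMP_TWICE_SAFE are invented for illustration and have nothing to do with Flex++. The do { ... } while (0) wrapper is a common idiom for multi-statement macros, not something proposed in this thread.

```cpp
// Hypothetical multi-statement macro: expands to two separate statements.
#define BUMP_TWICE_UNSAFE(x) ++(x); ++(x)

// Same body wrapped so that it expands to exactly one statement.
#define BUMP_TWICE_SAFE(x) do { ++(x); ++(x); } while (0)

int unsafe_demo(bool cond) {
    int n = 0;
    if (cond)
        BUMP_TWICE_UNSAFE(n);  // only the first ++ is guarded by the if
    return n;
}

int safe_demo(bool cond) {
    int n = 0;
    if (cond)
        BUMP_TWICE_SAFE(n);    // both increments are guarded by the if
    return n;
}
```

With cond set to false, unsafe_demo returns 1 because the second increment runs unconditionally, while safe_demo returns 0. Putting an else after the unsafe call would not compile at all, since the else would no longer be attached to the if.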

It would be better if you created a separate function (perhaps an
inline) that dealt with its own local variables, and that took any
needed variables (such as inputText and buffer) as arguments.
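
A rough sketch of what such a function could look like, assuming the same copying logic as the macro. The name fill_buffer and its signature are hypothetical; how it would be called from the Flex++ input hook is not covered in this thread and is not shown here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies up to max_size bytes of `input`, starting at
// offset `pos`, into `buffer`. Advances `pos` past the copied bytes and
// returns how many bytes were copied (0 once the input is exhausted).
inline std::size_t fill_buffer(const std::string& input, std::size_t& pos,
                               char* buffer, std::size_t max_size) {
    std::size_t copied = 0;
    while (copied < max_size && pos < input.length()) {
        buffer[copied] = input[pos];
        ++pos;
        ++copied;
    }
    return copied;
}
```

Everything the function reads or writes is visible in its parameter list, and the caller decides what to do with the returned count instead of being forced to return.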

As for the infinite loop: If the problem is really the while loop that
this macro executes, its termination depends on the value of max_size
and of the value returned by the length() method or function pointer
in the inputText object...

But then you say that the program starts printing "--(end of buffer or
a NULL)", which suggests that the problem is elsewhere (since the
piece of code you posted isn't printing that anywhere).

Sebastian
 
Martin Ambuhl

张扬 Santa said:
I have recently been developing an SQL parser with Bison++ and Flex++ under
Windows. The SQL statements we parse may contain Chinese characters in
strings. Sometimes the Chinese characters are parsed successfully,
but most often the lexer fails, goes into an endless loop, and
prints "--(end of buffer or a NULL)" continuously.

Even though SQL, Bison, Flex, and Windows are all off-topic, I had hopes
that you had a C question, then I found next:
Since we are parsing statements from a std::string rather than from an
input stream, we define some macros in our lexer.l.

'std::string' is a syntax error in C. Whatever language you are using,
it is not C. I strongly suspect that a programmer who doesn't know what
language he is using ought to look into some profession that rewards
incompetence, like bank management.
 
张扬 Santa

I'm sorry, I didn't notice that this group is intended for C. And yes, my
macro looks strange, but I have to write it like this because
Flex++ requires it.

Thanks for your comments.
 
s0suk3

Martin Ambuhl said:
'std::string' is a syntax error in C. Whatever language you are using,
it is not C. I strongly suspect that a programmer who doesn't know what
language he is using ought to look into some profession that rewards
incompetence, like bank management.

I think he does know what language he is using; I think he just
(unknowingly) posted to the wrong newsgroup (which, incidentally,
discusses a closely related language). Furthermore, his query was about
an "endless loop", and since basic loop constructs behave the same way
in both C and C++, his question is also relevant to C.

In any event, being that rude isn't very helpful.

Sebastian
 
Kenny McCormack

Martin Ambuhl said:
Even though SQL, Bison, Flex, and Windows are all off-topic, I had hopes
that you had a C question, then I found next:

'std::string' is a syntax error in C. Whatever language you are using,
it is not C. I strongly suspect that a programmer who doesn't know what
language he is using ought to look into some profession that rewards
incompetence, like bank management.

Rewards it handsomely, in fact.
 
CBFalconer

张扬 Santa said:
I have recently been developing an SQL parser with Bison++ and Flex++
under Windows. The SQL statements we parse may contain Chinese
characters in strings. Sometimes the Chinese characters are parsed
successfully, but most often the lexer fails, goes into an endless
loop, and prints "--(end of buffer or a NULL)" continuously.

Bison++ and Flex++, also Windows, and SQL are all off-topic here,
where we deal with the C language as defined by the various C
standards and K & R.

Since we are parsing statements from a std::string rather than from an
input stream, we define some macros in our lexer.l. The INPUT_CODE
macro is defined as:

In addition std::string has nothing to do with the C language. You
may want comp.lang.c++, but the above off-topic considerations may
be a problem there.
.... snip ...

inputText is a member of Lexer in which we store the SQL
statement to be parsed, and "pos" is an int member of Lexer that
stores the number of characters we have lexed so far.

The care and feeding of 'Lexer' is also off-topic here. If you
hadn't mentioned Windows I would recommend comp.unix.programmer;
however, Windows is another area entirely. Perhaps something to do
with GNU?
 
