subscript and superscript

shahid

Hello,
I want to write subscripts and superscripts in C++. Is there
any compiler which does this? Also, tell me the latest compiler. Currently
I am using Turbo C++ v3.0.
 
Juha Nieminen

shahid said:
I want to write subscripts and superscripts in C++. Is there
any compiler which does this?

Here you go:

#include <iostream>

int main()
{
    std::cout << "subscripts and superscripts" << std::endl;
}
 
Pascal J. Bourguignon

shahid said:
I want to write subscripts and superscripts in C++. Is there
any compiler which does this? Also, tell me the latest compiler. Currently
I am using Turbo C++ v3.0.

Well, we don't know what you mean.

In most current implementations of Common Lisp, we can use any Unicode
character in symbols, so we can write subscript and superscript
characters in Common Lisp. For example:

C/USER[6]> (defun d¹ (f dx) (lambda (x) (/ (- (funcall f (+ x dx)) (funcall f x)) dx)))
D¹
C/USER[7]> (defun d² (f dx) (d¹ (d¹ f dx) dx))
D²
C/USER[8]> (funcall (d² (lambda (x) (+ (* x x) (* 2 x) 1)) 0.001) 1)
1.9073485
C/USER[9]>


But ISTR that the C++ standard doesn't allow random Unicode characters
(not even accented letters) in identifiers, only
[A-Za-z_][A-Za-z0-9_]*.

Instead of writing d¹₂, you can write d_sup1_sub2.
 
James Kanze

Here you go:
#include <iostream>
int main()
{
    std::cout << "subscripts and superscripts" << std::endl;
}

Maybe:
std::cout << "$x_i + a^2$" ;
is closer to what he is looking for:). Or maybe not; as Victor
says (and your ironic response is meant to point out), it's not
really clear what he's looking for. However:

The compiler reads a linear sequence of printable characters;
there is no concept of subscript or superscript at the compiler
level (but a compiler which read from a source which did have
subscripts and superscripts could map something like $x^2$ to
"x[2]").

The library only supports reading and writing from linear
sequences of bytes---or characters, if the file is opened in
text mode. (Literally, char's, regardless of the mode.) The
exact meaning of those bytes is more or less implementation
defined---on some of my machines (but not all), writing a
two byte sequence with the values 0x78, 0xB2 will result in "x²"
(if that doesn't display correctly, it's a small letter x
followed by a superscript 2). More generally, however, the
output will have to be either directly in the PDL understood by
the output support (printer, etc.), or in some intermediate
language interpreted by a special program which knows how to
drive various devices. In other words, he should output
PostScript, LaTeX, or whatever. In which case, of course, he'll
have to conform to whatever that language requires. (My
suggestion, "$x^2$", is of course TeX or LaTeX.)

And if he's outputting directly to a GUI terminal, he will have
to use the primitives of the GUI library; these usually provide
some means of positioning graphic text in the lower level
containers. (The only GUI library I'm familiar with is Swing,
and that's in Java, not C++. But I presume the principles are
pretty universal: you have a couple of text components, which
provide various ways of displaying pure text, but you can also
do your own graphics on a "raw" component, with graphic
primitives for drawing text in various sizes and orientations at
a specific place in the component.)
 
Pascal J. Bourguignon

James Kanze said:
Here you go:
#include <iostream>
int main()
{
    std::cout << "subscripts and superscripts" << std::endl;
}

Maybe:
std::cout << "$x_i + a^2$" ;
is closer to what he is looking for:). Or maybe not; as Victor
says (and your ironic response is meant to point out), it's not
really clear what he's looking for. However:

The compiler reads a linear sequence of printable characters;
there is no concept of subscript or superscript at the compiler
level (but a compiler which read from a source which did have
subscripts and superscripts could map something like $x^2$ to
"x[2]").

Another way to map would be, as in the NASA programming language HAL/S
http://history.nasa.gov/computers/Appendix-II.html
to write expressions over several lines:

E: 2
M: std::cout << x + a ;
S: i

it would be 'trivial' to write a preprocessor to scan sources and
convert them to "pure" C++

std::cout << x+pow(a,2);

E: -2i
E: e
M: std::cout << x + a ;
E: 2
S: n
S: i
 
shahid

Well, I want to write things like the power of something. For
example, A's power is 3, but in C++ we write it as A3. Is there any way to
write the 3 above the A, or below it?
 
James Kanze

Well, I want to write things like the power of something. For
example, A's power is 3, but in C++ we write it as A3. Is there
any way to write the 3 above the A, or below it?

In C++, we write "a to the power of 3" as "std::pow( a, 3 )". The
same (modulo the std::) as in C, Fortran, Basic, Ada and most
other programming languages. (In Lisp, it's "(expt a 3)", and in
Forth something like "a 3 f**", but those are more or less
exceptions.)

C++, like all other programming languages (or almost
all---Pascal's example would seem to be an exception), reads a
linear sequence of characters. There's no subscript,
superscript, italics or different fonts (blackboard bold, any
one?). And only a limited number of characters. So much of
your mathematical notation gets rearranged: a[i] (or a_i)
instead of a with a subscript i, and pow( a, 2 ) instead of a
with a superscript 2. That corresponds to the technology that
was available then, and even today, the program editors I know
don't support things like subscripts and superscripts, so you
couldn't enter them into the code even if the compiler
supported them.

Note that this latter point is general, and doesn't affect just
code editors. For this reason, most mathematical texts are
actually written in LaTeX (or maybe even plain TeX), in which _
is the subscript operator, and ^ the superscript operator (and
there are tons of other operators).

If you're interested in using mathematical notation in your
code, and seeing it as mathematical notation when you print the
program, I'd suggest looking into CWEB, a (La)TeX-based
preprocessor for C and C++.
 
James Kanze

Well we don't know what you mean.
In most current implementations of Common Lisp, we can use any
Unicode character in symbols, so we can write subscript and
superscript characters in Common Lisp.

In standard C++, you can also use any Unicode character. But
only in strings or in user symbols, and in user symbols, it must
be a character considered "alphanumeric" (according to the
UnicodeData.txt file).
For example:
C/USER[6]> (defun d¹ (f dx) (lambda (x) (/ (- (funcall f (+ x dx)) (funcall f x)) dx)))
D¹
C/USER[7]> (defun d² (f dx) (d¹ (d¹ f dx) dx))
D²
C/USER[8]> (funcall (d² (lambda (x) (+ (* x x) (* 2 x) 1)) 0.001) 1)
1.9073485
C/USER[9]>
But ISTR that the C++ standard doesn't allow random Unicode characters
(not even accented letters) in identifiers, only
[A-Za-z_][A-Za-z0-9_]*.

That's wrong. Regrettably, a lot of implementations don't
implement the standard in this regard, however.
Instead of writing d¹₂, you can write d_sup1_sub2.

I'm not sure, but I think that d¹₂ would be legal. (I'd have to
verify whether superscript 1 and subscript 2 are classified by
Unicode as digits.)

There are two issues, however. The first is that the standard
doesn't impose any specific character encoding in input; only
that the characters in the basic character set are present (and
not even that if you're willing to use trigraphs). So Unicode
ends up being formally specified using "universal character
names" (\uxxxx and \Uxxxxxxxx). The input mapping is
implementation defined, however, and it is clearly the intent
that on a platform which supports Unicode (including UTF-8), the
compiler should map all non-ASCII characters to the
corresponding universal character names (which it can represent
internally as a single four byte value, e.g. in Unicode). Not
very many compilers conform in this respect, however, and
even fewer respect the intent.
 
osmium

James Kanze said:
In standard C++, you can also use any Unicode character. But
only in strings or in user symbols, and in user symbols, it must
be a character considered "alphanumeric" (according to the
UnicodeData.txt file).

But the OP should note that the Turbo C++ compiler he is using is *not*
standard C++; it predates the standard by quite a bit.
 
Daniel Pitts

osmium said:
But the OP should note that the Turbo C++ compiler he is using is *not*
standard C++; it predates the standard by quite a bit.
If I recall correctly, it doesn't even support templates.
 
Pascal J. Bourguignon

James Kanze said:
In standard C++, you can also use any Unicode character. But
only in strings or in user symbols, and in user symbols, it must
be a character considered "alphanumeric" (according to the
UnicodeData.txt file).
[...]
I'm not sure, but I think that d¹₂ would be legal. (I'd have to
verify whether superscript 1 and subscript 2 are classified by
Unicode as digits.)

There are two issues, however. The first is that the standard
doesn't impose any specific character encoding in input; only
that the characters in the basic character set are present (and
not even that if you're willing to use trigraphs). So Unicode
ends up being formally specified using "universal character
names" (\uxxxx and \Uxxxxxxxx). The input mapping is
implementation defined, however, and it is clearly the intent
that on a platform which supports Unicode (including UTF-8), the
compiler should map all non-ASCII characters to the
corresponding universal character names (which it can represent
internally as a single four byte value, e.g. in Unicode). Not
very many compilers conform in this respect, however, and
even fewer respect the intent.

Well then, if there's something in the standard, it's up to users
like Shahid to ask compiler vendors for these features. We may
hope for some progress on this front.

In 1990 (just to mention how advanced they were at the time), the
NeXTSTEP compiler accepted sources in .rtf format, so you could write
your code in "Rich Text", with fonts, colors, etc. (The driver would
just convert the RTF to ASCII and proceed.)
 
James Kanze

James Kanze <[email protected]> writes:

[...]
Well then, if there's something in the standard, it's up to
users like Shahid to ask compiler vendors for these
features. We may hope for some progress on this front.

Maybe. I haven't seen much progress on export.
In 1990 (just to mention how advanced they were at the time), the
NeXTSTEP compiler accepted sources in .rtf format, so you
could write your code in "Rich Text", with fonts, colors, etc.
(The driver would just convert the RTF to ASCII and proceed.)

That's cool.

According to the C++ standard, how the compiler maps "physical
source file characters" to the "basic source character set" is
implementation defined. So the next step would be to map
superscripts to std::pow, and subscripts to []. I'm pretty sure
such an implementation would be legal. And at least for numeric
applications, the possibilities are interesting, to say the
least---mapping a capital Greek sigma to a call to
std::accumulate, etc.
 
Alf P. Steinbach

* James Kanze:
[...]
Well then, if there's something in the standard, it's up to
users like Shahid to ask compiler vendors for these
features. We may hope for some progress on this front.

Maybe. I haven't seen much progress on export.
In 1990 (just to mention how advanced they were at the time), the
NeXTSTEP compiler accepted sources in .rtf format, so you
could write your code in "Rich Text", with fonts, colors, etc.
(The driver would just convert the RTF to ASCII and proceed.)

That's cool.

According to the C++ standard, how the compiler maps "physical
source file characters" to the "basic source character set" is
implementation defined. So the next step would be to map
superscripts to std::pow, and subscripts to []. I'm pretty sure
such an implementation would be legal. And at least for numeric
applications, the possibilities are interesting, to say the
least---mapping a capital Greek sigma to a call to
std::accumulate, etc.

Unfortunately, with current compiler technology -- after all, we're only about
50 years on in that game and one can't expect much in just 50 years -- the
problem is not how to get beyond Latin-1, but rather, how to be able to use
Latin-1, as opposed to just plain ASCII, in our C++ programs' string constants.

For example, with MinGW g++ 3.4.5, the current version, the following *does not
compile* when the source code is in Latin-1:

L"blåbærsyltetøy" // Norwegian for "blueberry jam"

This may be surprising to some because apparently g++ handles non-ASCII Latin-1
characters just fine in narrow character literals.

However, the reason they "work" for narrow characters is a bug in the compiler,
where it doesn't recognize the source code bytes as invalid UTF-8 (with a wide
character literal it's forced to attempt a conversion to UTF-16 and chokes).

Save the source code as UTF-8 without a BOM and that compiler is happy, of course,
but then the source code is not portable (e.g. MSVC won't eat it, just spits it
out), and for a console application the executable is then useless, because the
Windows command interpreter's UTF-8 codepage doesn't work.

One very inefficient solution is to preprocess the source code to pure ASCII.

E.g., the following (not optimized at all, optimizations should be obvious but
do not affect the total efficiency very much) program does that:


<code>
#include <iomanip>      // std::setfill, std::setw
#include <iostream>
#include <locale.h>     // setlocale
#include <stdlib.h>     // abort, mbtowc

wchar_t unicodeFrom( char c )
{
    wchar_t wc;

    int const returnValue = mbtowc( &wc, &c, 1 );
    if( returnValue == -1 )
    {
        abort();        // mbtowc failed.
    }
    return wc;
}

bool isInAsciiRange( char c )
{
    typedef unsigned char UChar;
    return (UChar(c) < 0x80);
}

int const outsideLiteral = 0;
int const afterPrefix    = 1;
int const inWideLiteral  = 2;
int const inEscape       = 3;

struct State
{
    int     current;
    char    terminator;
};

void onOutsideLiteralChar( char c, State& state )
{
    if( c == 'L' )
    {
        state.current = afterPrefix;
    }
}

void onAfterPrefixChar( char c, State& state )
{
    if( c == '\'' || c == '\"' )
    {
        state.terminator = c;  state.current = inWideLiteral;
    }
    else
    {
        state.current = outsideLiteral;
    }
}

void onInWideLiteralChar( char c, State& state )
{
    if( c == '\\' )
    {
        state.current = inEscape;
    }
    else if( c == state.terminator )
    {
        state.current = outsideLiteral;
    }
}

void onInEscapeChar( char c, State& state )
{
    state.current = inWideLiteral;
}

int main()
{
    using namespace std;

    char    c;
    State   state;

    setlocale( LC_ALL, "" );    // Affects mbtowc translation.
    cout << hex << uppercase << setfill( '0' );
    cout.sync_with_stdio( false );
    state.current = outsideLiteral;
    while( cin.get( c ) )
    {
        if( state.current != inWideLiteral || isInAsciiRange( c ) )
        {
            cout << c;
        }
        else
        {
            cout << "\\u" << setw( 4 ) << unsigned( unicodeFrom( c ) );
        }

        switch( state.current )
        {
        case outsideLiteral:    onOutsideLiteralChar( c, state );   break;
        case afterPrefix:       onAfterPrefixChar( c, state );      break;
        case inWideLiteral:     onInWideLiteralChar( c, state );    break;
        case inEscape:          onInEscapeChar( c, state );         break;
        }
    }
}
</code>


To use this preprocessing properly the source should first be preprocessed via
the C/C++ preprocessor, i.e., compilation is then a pipeline of three processes.

Which, I suspect, due to the amount of text generated by the C/C++
preprocessor, is very inefficient.

So, I contend that before asking compiler vendors to support the full range of
characters required by the C++ standard, we should ask them to support Latin-1.


Cheers,

- Alf
 
Juha Nieminen

Pascal said:
But ISTR that the C++ standard doesn't allow random Unicode characters
(not even accented letters) in identifiers, only
[A-Za-z_][A-Za-z0-9_]*.

AFAIK, you are wrong, and the C++ standard *does* allow alphanumeric
Unicode characters in identifiers.

Of course whether your *compiler* supports them is a different story.
I don't know how many C++ compilers in existence implement 100% of the
current C++ standard, but I would be surprised if there were more than one.
 
James Kanze

* James Kanze:
According to the C++ standard, how the compiler maps "physical
source file characters" to the "basic source character set" is
implementation defined. So the next step would be to map
superscripts to std::pow, and subscripts to []. I'm pretty sure
such an implementation would be legal. And at least for numeric
applications, the possibilities are interesting, to say the
least---mapping a capital Greek sigma to a call to
std::accumulate, etc.
Unfortunately, with current compiler technology -- after
all, we're only about 50 years on in that game and one can't
expect much in just 50 years -- the problem is not how to
get beyond Latin-1, but rather, how to be able to use Latin-1,
as opposed to just plain ASCII, in our C++ programs' string
constants.
For example, with MinGW g++ 3.4.5, the current version, the
following *does not compile* when the source code is in
Latin-1:
L"blåbærsyltetøy" // Norwegian for "blueberry jam"
This may be surprising to some because apparently g++ handles
non-ASCII Latin-1 characters just fine in narrow character
literals.

I suspect that much of the problem here isn't compiler
technology, but the fact that there are so many code sets. Why
should the compiler recognize Latin-1, when I'm using UTF-8 on
my machines at home (but Latin-1 at work)? The largest single
problem for the compiler writer, I suspect, is deciding what
encoding the input is using, or defining how to decide.

Note that it's a problem that can't necessarily be solved by a
simple compiler option (although that would already be a major
step forward). Suppose you send me a library, that I compile
with my code. My code here is in UTF-8, but your include file
will be in Latin-1; the compiler has to switch encodings on the
fly. Which isn't difficult in itself, provided it has some way
of knowing what to switch to.
However, the reason they "work" for narrow characters is a bug
in the compiler, where it doesn't recognize the source code
bytes as invalid UTF-8 (with a wide character literal it's
forced to attempt a conversion to UTF-16 and chokes).
Save the source code as UTF-8 no BOM and that compiler is
happy, of course, but then, the source code is not portable
(e.g. MSVC won't eat it, just spits it out) and for a console
application the executable is then useless, because the
Windows command interpreter's UTF-8 codepage doesn't work.
One very inefficient solution is to preprocess the source code
to pure ASCII.

Using UCN's. I think (I'm not sure) part of the idea behind
UCN's was that IDE's would mask them. You'd type in
"blåbærsyltetøy", and the system would save it to the file as
"bl\u00e5b\u00e6rsyltet\u00f8y" (but of course displaying what
you typed in, even when you reread the file). But I don't know
of any IDE's that do that, and of course, once you start
throwing in third party tools (or Unix shells, or just about
anything else)...
 
