utf8 -> ascii in c language??

chunhui_true

i have a class, it can read one line(\r\n ended) from string,when i
read line from utf8 string i can't get any thing!
maybe i should conversion utf8 to ascii??there is any function can
conversion utf8 to ascii? very thanks to your help!!
 
Jack Klein

i have a class, it can read one line(\r\n ended) from string,when i
read line from utf8 string i can't get any thing!

There is no such thing as a "class" in C. Perhaps you are looking for
comp.lang.c++ down the hall.

Note that both C and C++ are case sensitive languages, so you had
better learn how to use the shift key on your keyboard.

maybe i should conversion utf8 to ascii??there is any function can
conversion utf8 to ascii? very thanks to your help!!

There is no such thing as "ascii". There is ASCII, named for the
American Standard Code for Information Interchange. But neither C nor
C++ defines or requires any specific character set, neither ASCII nor
UTF8 nor any other.

So most likely comp.lang.c++ won't want your question either. You
need a group that supports your particular compiler/operating system
combination, but since you posted through Google I have no information
on which to suggest one to you.

And you need to learn how to use your shift key. And the space bar.
 
Chris Williams

chunhui_true said:
i have a class, it can read one line(\r\n ended) from string,when i
read line from utf8 string i can't get any thing!
maybe i should conversion utf8 to ascii??there is any function can
conversion utf8 to ascii? very thanks to your help!!

I would recommend writing the bytes of each string out as a number and
as a text value so that you can see them. You might find that this is a
case where it isn't as difficult as it seems.
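
A minimal sketch of that suggestion, in C. The function name format_bytes and the sample command are made up for illustration; the idea is only to show each byte as a hex number next to its text form, so that anything outside printable ASCII stands out.

```c
#include <stddef.h>

/* Format len bytes of buf as "HH HH ... |text|" into dst.
   Bytes outside printable ASCII (0x20-0x7E) are shown as '.' in the
   text part.  dst needs room for 4 * len + 3 chars, including the
   terminating NUL.  Returns the number of characters written, or 0
   if dst is too small. */
size_t format_bytes(char *dst, size_t dstsize, const void *buf, size_t len)
{
    static const char hex[] = "0123456789ABCDEF";
    const unsigned char *p = buf;   /* unsigned: 0x80-0xFF stay positive */
    size_t i, n = 0;

    if (dstsize < 4 * len + 3)
        return 0;
    for (i = 0; i < len; i++) {
        dst[n++] = hex[p[i] >> 4];
        dst[n++] = hex[p[i] & 0x0F];
        dst[n++] = ' ';
    }
    dst[n++] = '|';
    for (i = 0; i < len; i++)
        dst[n++] = (p[i] >= 0x20 && p[i] <= 0x7E) ? (char)p[i] : '.';
    dst[n++] = '|';
    dst[n] = '\0';
    return n;
}
```

For the 8 bytes of a hypothetical "CWD é\r\n" command sent as UTF-8, this yields "43 57 44 20 C3 A9 0D 0A |CWD ....|": the ASCII part is untouched, and the accented letter shows up as the two bytes C3 A9.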

-Chris
 
chunhui_true

Thanks for your suggestion. My English is very poor, and this is my
first time using newsgroups, so I don't know the culture here. With
time I think we will be able to communicate well.
 
Peter Nilsson

Jack said:
There is no such thing as a "class" in C.

But there is such a thing in object oriented programming.

There is no such thing as "ascii". There is ASCII, named for the
American Standard Code for Information Interchange. But neither C nor C++
define or require any specific character set, neither ASCII nor
UTF8 nor any other.

More importantly, ASCII characters remain unchanged under UTF8,
so the question is likely ill-formed to begin with.

Nonetheless, it's certainly possible to convert UTF8 sequences back
to their original character codes in C. However, there is no specific
standard function for this purpose alone.
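
As a rough sketch of what such a conversion could look like when written by hand: the function name utf8_decode is hypothetical, and the code assumes the modern form of UTF-8 (sequences of at most four bytes, code points up to U+10FFFF), which is stricter than the earliest descriptions of the format.

```c
#include <stddef.h>

/* Decode one UTF-8 sequence from s (at most len bytes) into *cp.
   Returns the number of bytes consumed (1 to 4), or 0 if the input is
   truncated, malformed, overlong, or encodes a surrogate. */
size_t utf8_decode(const unsigned char *s, size_t len, unsigned long *cp)
{
    size_t n, i;
    unsigned long c, min;

    if (len == 0)
        return 0;
    if (s[0] < 0x80) {                    /* plain ASCII: one byte, same value */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 110xxxxx: two-byte sequence */
        n = 2; c = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 1110xxxx: three-byte sequence */
        n = 3; c = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 11110xxx: four-byte sequence */
        n = 4; c = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                         /* stray continuation or invalid lead */
    }
    if (len < n)
        return 0;
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)        /* continuation must be 10xxxxxx */
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;
    *cp = c;
    return n;
}
```

Note the first branch: a byte below 0x80 decodes to itself, which is the sense in which ASCII characters remain unchanged under UTF8.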
 
Villy Kruse

Nonetheless, it's certainly possible to convert UTF8 sequences back
to their original character codes in C. However, there is no specific
standard function for this purpose alone.

The standard(?) mbtowc could do that; that is, it could convert
UTF-8, which is a multibyte format, to Unicode, which is a wide character
format, and the first 128 values of Unicode are identical to US-ASCII.
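
A hedged sketch of that idea follows. The helper name decode_mb is made up; mbtowc itself is standard C, but which multibyte encoding it understands depends on the current locale, so this only handles UTF-8 if the program has first selected a UTF-8 locale with setlocale (for example setlocale(LC_CTYPE, ""), on a system whose default locale is UTF-8). Whether the resulting wchar_t values are Unicode code points is also implementation-defined.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Convert the multibyte string s to wide characters, one mbtowc() call
   per character, storing at most max of them in out.
   Returns the number of characters stored, or (size_t)-1 if s contains
   a sequence that is invalid in the current locale's encoding. */
size_t decode_mb(const char *s, wchar_t *out, size_t max)
{
    size_t left = strlen(s);
    size_t count = 0;
    int n;

    mbtowc(NULL, NULL, 0);              /* reset any shift state */
    while (left > 0 && count < max) {
        n = mbtowc(&out[count], s, left);
        if (n <= 0)                     /* -1: invalid or incomplete sequence */
            return (size_t)-1;
        s += n;
        left -= (size_t)n;
        count++;
    }
    return count;
}
```

In the default "C" locale this is only guaranteed to handle the basic character set, so a plain ASCII command such as "QUIT" converts to four wide characters, while bytes above 0x7F may be rejected.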

Villy
 
chunhui_true

Do ASCII characters really remain unchanged under UTF-8? If they are
unchanged, why can't I printf them to the screen?
I use libcap to capture FTP commands from the Ethernet. One class
collects all the packets and buffers them as a stream; another class
then reads lines (ending in \r\n) from that buffer. Each line I read
from the buffer is one command.
When I use CuteFTP, I can get all the commands and printf them to the
screen. But when I use IE for FTP, I see one command, "set utf8 on", and
after that I can't printf the following commands to the screen. Should I
convert UTF-8 to ASCII? :(
 
Villy Kruse

Do ASCII characters really remain unchanged under UTF-8? If they are
unchanged, why can't I printf them to the screen?

They certainly do. Latin-1 characters, however, don't, except for the
ASCII subset. Also, Unicode and ASCII have the same code values
for the ASCII subset of Unicode.

See the description of UTF-8 in, for example, RFC 2044:


|
| Network Working Group F. Yergeau
| Request for Comments: 2044 Alis Technologies
| Category: Informational October 1996
|
|
| UTF-8, a transformation format of Unicode and ISO 10646
|
| Status of this Memo
|
| This memo provides information for the Internet community. This memo
| does not specify an Internet standard of any kind. Distribution of
| this memo is unlimited.
|
| Abstract
|
| The Unicode Standard, version 1.1, and ISO/IEC 10646-1:1993 jointly
| define a 16 bit character set which encompasses most of the world's
| writing systems. 16-bit characters, however, are not compatible with
| many current applications and protocols, and this has led to the
| development of a few so-called UCS transformation formats (UTF), each
| with different characteristics. UTF-8, the object of this memo, has
| the characteristic of preserving the full US-ASCII range: US-ASCII
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| characters are encoded in one octet having the usual US-ASCII value,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| and any octet with such a value can only be an US-ASCII character.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| This provides compatibility with file systems, parsers and other
| software that rely on US-ASCII values but are transparent to other
| values.
| [...]
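
The property underlined above (an octet below 0x80 is always a US-ASCII character, and never part of a longer sequence) is what makes a crude "UTF-8 to ASCII" filter easy to write. The following is only a sketch under that assumption: the name utf8_to_ascii is made up, the conversion is lossy, and it does not validate the input.

```c
#include <stddef.h>

/* Copy len bytes of UTF-8 text from src to dst, keeping US-ASCII bytes
   as they are and replacing every non-ASCII character with a single '?'.
   dst must have room for len + 1 bytes.  Returns the output length.
   Stray continuation bytes in malformed input are silently dropped. */
size_t utf8_to_ascii(char *dst, const char *src, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        unsigned char b = (unsigned char)src[i];

        if (b < 0x80)
            dst[n++] = (char)b;        /* US-ASCII: same value in UTF-8 */
        else if ((b & 0xC0) != 0x80)
            dst[n++] = '?';            /* lead byte: one '?' per character */
        /* continuation bytes (10xxxxxx) produce no output */
    }
    dst[n] = '\0';
    return n;
}
```

With the 9 bytes of "café.txt" encoded as UTF-8, the result is "caf?.txt". The command verbs and the \r\n terminator of an FTP line pass through untouched, since they are all US-ASCII.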



Villy
 
chunhui_true

Oh, thanks!
Since it says:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| This provides compatibility with file systems, parsers and other
| software that rely on US-ASCII values but are transparent to other
| values.
Does that mean my program, which parses the old ASCII FTP commands, can
also parse the UTF-8 commands? But then why can't I read a line from the
buffer after "set utf8 on"?
 
Peter Nilsson

chunhui_true said:
...
Does that mean my program, which parses the old ASCII FTP commands, can
also parse the UTF-8 commands? But then why can't I read a line from the
buffer after "set utf8 on"?

Do you have a question on the ISO C language?

Comp.lang.c is the wrong forum for (repeatedly) asking questions
about utf8 and ftp commands.

Try comp.programming (say) instead, or a platform specific newsgroup
that caters for your current development tools. Also note that you
are much more likely to get useful responses by posting sample code
that exhibits the problems you're having.
 
SM Ryan

# Oh, thanks!
# Since it says:
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# | This provides compatibility with file systems, parsers and other
# | software that rely on US-ASCII values but are transparent to other
# | values.
# Does that mean my program, which parses the old ASCII FTP commands, can
# also parse the UTF-8 commands? But then why can't I read a line from the
# buffer after "set utf8 on"?

Because readline is not transparent to character codes 0x80 - 0xFF? You'd
have to examine the source; lots of older programs used the 'fact' that
characters were only seven bits and used the extra bit as a flag.
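
For comparison, here is a sketch of a line finder that is transparent to those codes. It is not the poster's readline, whose source was never shown; the name find_line and its interface are assumptions. The points it illustrates are that the buffer is treated as unsigned char, no bit is masked off, and only the two bytes \r and \n are given any meaning.

```c
#include <stddef.h>

/* Look for the first CRLF-terminated line in buf[0..len).
   On success, returns the line length (CRLF excluded) and stores the
   offset of the byte after the CRLF in *next.  Returns (size_t)-1 if
   no complete line has been buffered yet.  Bytes 0x80-0xFF are passed
   through like any other data. */
size_t find_line(const unsigned char *buf, size_t len, size_t *next)
{
    size_t i;

    for (i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') {
            *next = i + 2;
            return i;
        }
    }
    return (size_t)-1;
}
```

Two related traps are worth checking in older code: storing bytes in plain char, which may be signed so that 0xC3 becomes negative, and passing such a value straight to isprint() or the other <ctype.h> functions, which is undefined behaviour unless it is first converted to unsigned char.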
 
