my CGI script receives UTF-8 strings, like "0xE2 0x82 0xAC" for the Euro
symbol.
Then it would be best to use at least Perl 5.8.0 ...
However, the unicode for this symbol is 0x20AC.
"Unicode" is an abstract concept - an identification of particular
characters with particular integer numbers ("code points" in the
Unicode character set). In order to actually _use_ those abstract
Unicode characters, it's necessary to have a way of representing them.
utf-8 is one particular way of representing them (and it just happens
to be Perl's own internal representation of Unicode, although you
don't need to know that in order to use it). Your writing 0x20AC (or,
as the Unicode folks would write it, U+20AC) is just another way of
giving a concrete representation to the abstract character. None of
them is "Unicode" per se: all of them are representations of Unicode.
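
To make the distinction concrete, here is a minimal sketch (assuming Perl
5.8.0 or later, where the Encode module ships with the core) that starts
from the code point and asks for its utf-8 representation. The variable
names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One abstract character, named by its code point (U+20AC, EURO SIGN).
my $euro = chr(0x20AC);

# Its utf-8 representation is a sequence of three bytes.
my $utf8_bytes = encode('UTF-8', $euro);
my $hex = join ' ', map { sprintf('0x%02X', ord($_)) } split //, $utf8_bytes;

print length($euro), " character, ", length($utf8_bytes), " bytes: $hex\n";
```

This should print "1 character, 3 bytes: 0xE2 0x82 0xAC", i.e. the same
bytes the CGI script reports receiving.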
How can I convert from UTF-8 to Unicode?
utf-8 already _is_ (a representation of) Unicode.
I'd like to do sth like:
if( $str =~ m/\x{20AC}/ ){
Yup, that's another way of representing Unicode: it's Perl's way of
writing a "wide character" in source code.
Perhaps you could be a bit more precise about how this script
"receives" Unicode characters. Is it reading them directly from a
file (then it's easy in 5.8.0, you just open the file with :utf8), or
is it that you've decoded some HTML form submission data, and got
yourself a string of bytes which contains some utf-8 representations
of characters?
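
For the first case (reading from a file), a rough sketch of what opening
with :utf8 looks like in 5.8.0 or later. The temporary file and its
contents are made up purely so that the example is self-contained; in a
real script you would open your own file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a sample file holding the raw utf-8 bytes for the Euro sign.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print {$out} "price: \xE2\x82\xAC 10\n";
close($out) or die "Cannot close $path: $!";

# Reading through the :utf8 layer turns those bytes into characters.
open(my $in, '<:utf8', $path) or die "Cannot open $path: $!";
my $line = <$in>;
close($in);

# The three bytes are now one character, so \x{20AC} can match it.
my $found = $line =~ /\x{20AC}/ ? 1 : 0;
print "Euro sign found\n" if $found;
```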
If it's the latter, and you really have to handle this yourself by
hand (it appears that recent versions of CGI.pm handle it for you, but
I have to admit to not trying that myself yet), then I think you want
pack() with a template of U0, as others have said.
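
As an alternative to the pack() approach (this is a swapped-in technique,
not the one suggested above), the Encode module that comes with 5.8.0 can
do the same conversion explicitly. A minimal sketch, assuming the bytes
have already been extracted from the form data into a scalar; $bytes and
$str are illustrative names:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw bytes as they might arrive from a decoded form submission.
my $bytes = "\xE2\x82\xAC";

# Interpret the bytes as utf-8, giving a one-character string.
my $str = decode('UTF-8', $bytes);

if ($str =~ m/\x{20AC}/) {
    print "The string contains a Euro sign\n";
}
```

After decode(), $str holds characters rather than bytes, so length($str)
is 1 and the \x{20AC} pattern matches.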
but first, I have to convert "0xE2 0x82 0xAC" to Unicode, of course...
Sort-of; but I'd still recommend taking a bit of time out to study
relevant parts of
http://www.perldoc.com/perl5.8.0/pod/perluniintro.html and then
http://www.perldoc.com/perl5.8.0/pod/perlunicode.html
to get a firmer understanding of what's going on, and how it's meant
to be used.