WindAndWaves said:
I have an index file, which I would like to load quickly, but it also
contains some Japanese, Russian, Chinese, etc. characters (links
pointing to translations of the page).
Ideally, we would use language negotiation (a protocol for selecting
content based on the language preferences in the browser and information
on existing versions in the server) for sending the user the best
alternative available. But this is unreliable, since many people have
incorrect language settings in their browsers, so a multilingual index
file is indeed needed on a multilingual site.
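To make the negotiation idea concrete, here is a minimal server-side sketch: parse the browser's Accept-Language header (its format and q-values come from the HTTP specification) and pick the best match from the translations you actually have. The `negotiate` function name, the sample set of available languages, and the default are my own illustrative choices, not part of any standard API.

```python
def negotiate(accept_language, available, default="en"):
    """Return the best-matching language tag from `available`, or `default`.

    `accept_language` is the raw Accept-Language header value, e.g.
    "ja,en-US;q=0.8,en;q=0.5".  `available` is a set of language tags
    the server has translations for.
    """
    prefs = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        if ";q=" in piece:
            tag, q = piece.split(";q=", 1)
            try:
                quality = float(q)
            except ValueError:
                quality = 0.0
        else:
            tag, quality = piece, 1.0  # no q-value means q=1.0
        prefs.append((quality, tag.strip().lower()))

    # Try the user's preferences from highest q-value down; fall back to
    # the primary subtag ("en-US" -> "en") before giving up on a tag.
    for _, tag in sorted(prefs, reverse=True):
        if tag in available:
            return tag
        primary = tag.split("-")[0]
        if primary in available:
            return primary
    return default

print(negotiate("ja,en-US;q=0.8,en;q=0.5", {"en", "ja", "ru"}))  # ja
```

Even with a correct negotiator like this, the point above stands: if the browser's settings are wrong, the server picks the wrong language, which is why the visible multilingual index remains necessary.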
Now, I could either double
the file in size by saving it as Unicode or I could use the
codes to specify the characters that I need.
You can use either of the methods, but please note that using Unicode
does not double the file size. Well, sometimes it might, but normally it
won't. In UTF-8, each ASCII character takes just one octet (byte), just
as in a pure ASCII file. Other characters take two or more octets each,
but if your document (including HTML markup, which uses ASCII only) is
dominantly ASCII characters, the increase in file size won't be big, and
the file will probably be a little smaller than a version that uses
character references. (After all, &#1234; is seven octets.)
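You can check the arithmetic yourself; this sketch (the sample strings are my own) counts the octets each form takes:

```python
# -*- coding: utf-8 -*-
# In UTF-8, ASCII characters are one octet each, so a pure-ASCII string
# is the same size as in a plain ASCII file:
print(len("Hello".encode("utf-8")))      # 5 octets for 5 characters

# A non-ASCII character like Cyrillic A with diaeresis (U+04D2) takes
# two octets in UTF-8:
print(len("\u04d2".encode("utf-8")))     # 2

# The numeric character reference for the same character is the literal
# seven-character string "&#1234;", i.e. seven octets:
print(len("&#1234;".encode("utf-8")))    # 7
```

So each non-ASCII character costs a couple of octets in UTF-8 versus seven or more as a reference, which is why the UTF-8 version of a mostly-ASCII page tends to come out smaller.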
PS: does anyone know of any programs / online applications that can
translate characters into these codes (numeric character references)?
There are many of them, for different platforms. See
http://www.alanwood.net/unicode/utilities_editors.html
(the page is about Unicode editors, which let you work with UTF-8 in
general, but many of them also have an output mode that uses character
references).
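If you have Python at hand, you don't even need a separate tool: the standard `xmlcharrefreplace` error handler emits a numeric character reference for anything outside the target encoding. The sample string below is my own.

```python
# Encode to ASCII, replacing every non-ASCII character with its
# decimal numeric character reference (&#NNNN;).
text = "Links: \u65e5\u672c\u8a9e"  # "Links: " followed by Japanese for "Japanese"
print(text.encode("ascii", "xmlcharrefreplace").decode("ascii"))
# Links: &#26085;&#26412;&#35486;
```

This produces decimal references; if you prefer hexadecimal (`&#xNNNN;`) or named entities, a dedicated editor from the page above is the easier route.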