Wes Groleau
I have a huge file with information about Chinese characters. But
instead of the character, each line starts with the Unicode hex,
e.g., U+AC34
It would be trivial to use awk or perl to write a long script containing
the substitution for each line, but then every line
would have to be checked against every sub, for an N² processing time.
Not good for 36K lines.
What I tried to do instead was to use the hex value to compute the
character, for linear (N) processing time.
But my not-as-clever-as-I-thought method didn't work:
iMac:Anki wgroleau$ perl -CSD -p -i -e \
's:(U\+[A-F0-9]{4})(\s):\1\2\N{\1}\2:g;' \
/tmp/Chars_Info.txt
Unknown charname '\1' at -e line 1.
Deprecated character in \N{...}; marked by <-- HERE in \N{\<-- HERE 1}
at -e line 1.
I suspect "there's more than one way" to do it,
but a perl guru I am definitely not.