Valhalla Legends Forums Archive | Battle.net Bot Development References | Battle.net UTF-8 Information

Author | Message | Time
Skywing
According to research done by Kp and backed by experiments the two of us carried out last night, Starcraft 1.10 introduced UTF-8 encoding for all text transmitted to/from Battle.net. Presently the server does not properly encode/decode this text when relaying to/from legacy clients, so you'll be stuck with using US English when chatting to them for now.

Background for those who don't know about it:
UTF-8 is a method for encoding Unicode characters as 8-bit sequences. You can use the Win32 WideCharToMultiByte and MultiByteToWideChar functions to translate things to and from UTF-8. 7-bit ASCII characters (<127) aren't specially encoded; however, anything above 127 must be encoded.
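As a rough illustration of the encoding rule, here is a minimal portable C++ sketch that encodes a single Unicode code point as UTF-8. The helper name AppendUtf8 is hypothetical and is not one of the routines mentioned in this post; it only assumes the standard UTF-8 bit layout. Code points 0 through 127 (0x7F) inclusive pass through as one unchanged byte, and everything from 128 (0x80) upward becomes a multi-byte sequence.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not from the original post): appends the UTF-8
// encoding of one Unicode code point to `out`.
// Returns false for surrogates and for values above 0x10FFFF.
bool AppendUtf8(std::uint32_t cp, std::string& out)
{
    if (cp <= 0x7F) {
        // 7-bit ASCII, including 127: stored as a single unchanged byte.
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        // Two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        // Surrogate values are not valid characters on their own.
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return false;
        }
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        return false;
    }
    return true;
}
```

For example, calling AppendUtf8 with 0x7F appends the single byte 0x7F, while 0x80 appends the two bytes 0xC2 0x80.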

C++ bot developers may find the UTF-8 conversion routines which I originally wrote for my MSN Messenger client useful to handle this. It would be advisable to deal natively in Unicode and remove the to-ANSI translation step, as this represents a potential loss in information.

This may pose a problem for many of the chat encryption/obfuscation schemes in use, as extended ASCII characters must be checked for after UTF-8 processing, and not before.
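To show why the order matters, here is a hedged C++ sketch that decodes UTF-8 into code points first and only then looks for characters above 127. DecodeUtf8 and CountNonAscii are hypothetical names used for illustration, not part of any bot or of the conversion routines mentioned above. A single character such as U+00E9 arrives on the wire as the two bytes 0xC3 0xA9, so a check on the raw bytes sees two "extended" values where the decoded text has only one.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decodes UTF-8 text into Unicode code points,
// appending them to `out`. Returns false on malformed input
// (code points decoded before the error remain in `out`).
bool DecodeUtf8(const std::string& in, std::vector<std::uint32_t>& out)
{
    // Smallest code point allowed for each sequence length (rejects overlong forms).
    static const std::uint32_t kMin[] = {0x0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead <= 0x7F)               { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else return false;  // stray continuation byte or invalid lead byte

        if (in.size() - i <= extra) {
            return false;  // sequence is cut off
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                return false;  // not a continuation byte
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < kMin[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;  // overlong form, out of range, or surrogate
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}

// Counts decoded characters that lie above 7-bit ASCII.
std::size_t CountNonAscii(const std::vector<std::uint32_t>& cps)
{
    std::size_t n = 0;
    for (std::uint32_t cp : cps) {
        if (cp > 0x7F) {
            ++n;
        }
    }
    return n;
}
```

A scheme that scans for extended characters would run its check on the output of DecodeUtf8 rather than on the received bytes.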
May 3, 2003, 5:06 AM
tA-Kane
[quote author=Skywing link=board=17;threadid=1218;start=0#msg9026 date=1051938399]7-bit ASCII characters (<127) aren't specially encoded; however, anything above 127 must be encoded.[/quote]What do you do with a character whose value is 127 then?
May 3, 2003, 7:06 PM
Skywing
[quote author=tA-Kane link=board=17;threadid=1218;start=0#msg9062 date=1051988801]
[quote author=Skywing link=board=17;threadid=1218;start=0#msg9026 date=1051938399]7-bit ASCII characters (<127) aren't specially encoded; however, anything above 127 must be encoded.[/quote]What do you do with a character whose value is 127 then?
[/quote]
http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=UTF-8
http://www.cl.cam.ac.uk/~mgk25/unicode.html
Google is an excellent resource.
May 3, 2003, 7:50 PM
Camel
[quote author=tA-Kane link=board=17;threadid=1218;start=0#msg9062 date=1051988801]
[quote author=Skywing link=board=17;threadid=1218;start=0#msg9026 date=1051938399]7-bit ASCII characters (<127) aren't specially encoded; however, anything above 127 must be encoded.[/quote]What do you do with a character whose value is 127 then?
[/quote]
i would go ahead and assume that it's the first 2^7 chars that don't need to be encoded; that would put 127 in the category of non-encoded
May 4, 2003, 2:33 AM
tA-Kane
[quote author=Camel link=board=17;threadid=1218;start=0#msg9113 date=1052015635]i would go ahead and assume that it's the first 2^7 chars that dont need to be encoded[/quote]I had assumed as much, since that would make more sense. By the way, it's (2^7)-1, otherwise 128 would be included.
May 4, 2003, 9:04 PM
Camel
no it wouldn't
i said the first 128 characters
0 is the first, 127 is the 128th
May 5, 2003, 2:14 AM
