"消息包含敏感字符!" (Message contains sensitive characters!)
Check out the...
Censorship of the Day
These are recently retrieved, decrypted lists of the keywords that Sina UC downloads when starting up.
- List1 (retrieved 7/17/2016)
- List2 (retrieved 7/17/2016)
- List3 (retrieved 7/17/2016)
- List4 (retrieved 7/17/2016)
- List5 (retrieved 7/17/2016)
Censorship Analysis
Sina UC 8.3.4.22616 has two sources of keyword lists: built-in and
downloaded. Both sources are stored as JSON that is encrypted with
Blowfish in ECB mode under the 16-byte key 'H177UC09VI67KASI' and then
encoded as an ASCII hex string. The built-in source is embedded in
IMResource.dll. The downloaded source is retrieved from
http://im.sina.com.cn/fetch_keyword.php?ver=8.3.4.22616.
Each source contains five lists of keywords (downloaded list retrieved
8/25/2011):
- List1: censors one-on-one text chat, group text chat, usernames (replacing the username with a UID#), and mood indications.
- List2: censors usernames (replacing the username with a UID#) and mood indications.
- List3: has no purpose in the version analyzed.
- List4: censors one-on-one text chat.
- List5: censors group text chat.
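The list-to-context mapping above can be written down as a small lookup table. The following is a minimal sketch, not the client's actual logic: the context names and the function find_censored_keyword are hypothetical, and plain case-sensitive substring matching is an assumption, since the analysis does not describe how the client matches keywords.

```python
# Which keyword lists apply to which kind of user-supplied text,
# transcribed from the list descriptions above. List3 is unused.
# The context names are hypothetical labels chosen for this sketch.
LISTS_BY_CONTEXT = {
    "one_on_one_chat": (1, 4),
    "group_chat": (1, 5),
    "username": (1, 2),
    "mood": (1, 2),
}


def find_censored_keyword(text, context, keyword_lists):
    """Return the first keyword found in text for this context, or None.

    keyword_lists maps a list number (1-5) to an iterable of keywords.
    Assumption: case-sensitive substring matching; the real matching
    rule used by the client is not documented in the analysis.
    """
    for number in LISTS_BY_CONTEXT[context]:
        for keyword in keyword_lists.get(number, ()):
            if keyword and keyword in text:
                return keyword
    return None
```

With placeholder data such as {4: ["beta"]}, a message containing "beta" would match in the one-on-one context but not in the group context, because List4 is consulted only for one-on-one chat.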
Decryption
The first Python program, decrypt.py, reads an ASCII hex string from
stdin and writes the decrypted JSON plaintext to stdout. The second
program, dejson.py, takes a JSON file foo and, for each keyword list i
in foo, writes a file named foo followed by i that contains that
list's keywords, one per line.
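The two steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the source of decrypt.py or dejson.py: it assumes the third-party PyCryptodome package for Blowfish, trailing NUL-byte padding, UTF-8 plaintext, and a top-level JSON array holding one array of strings per keyword list. None of these details is confirmed by the analysis, and the function names are hypothetical.

```python
import json

# 16-byte Blowfish key quoted in the analysis above.
KEY = b"H177UC09VI67KASI"


def decrypt_hex(hex_text, key=KEY):
    """Hex-decode the input, then decrypt it with Blowfish in ECB mode."""
    # Assumption: PyCryptodome is installed (pip install pycryptodome).
    from Crypto.Cipher import Blowfish

    ciphertext = bytes.fromhex(hex_text.strip())
    cipher = Blowfish.new(key, Blowfish.MODE_ECB)
    plaintext = cipher.decrypt(ciphertext)
    # Assumptions: padding is trailing NUL bytes and the text is UTF-8.
    return plaintext.rstrip(b"\x00").decode("utf-8")


def split_keyword_lists(json_text):
    """Return {i: [keywords, ...]} with lists numbered from 1."""
    data = json.loads(json_text)
    # Assumption: top-level JSON array of arrays of strings.
    return {i: list(words) for i, words in enumerate(data, start=1)}


def write_keyword_files(path, lists):
    """Write list i to the file path + str(i), one keyword per line."""
    for i, words in lists.items():
        with open(f"{path}{i}", "w", encoding="utf-8") as out:
            out.write("\n".join(words) + "\n")
```

Chaining the functions, for example write_keyword_files("foo", split_keyword_lists(decrypt_hex(hex_text))), would produce files foo1, foo2, and so on if the assumptions about padding and JSON layout hold.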
This material is based upon work supported by the National Science
Foundation under Grant No. 0844880. Any opinions, findings, and
conclusions or recommendations expressed in this material are those of
the author(s) and do not necessarily reflect the views of the National
Science Foundation.