Bush hid the facts

"Bush hid the facts" is a common name for a bug in Microsoft Windows that causes text encoded in ASCII to be interpreted as if it were UTF-16LE, resulting in garbled text. When the string "Bush hid the facts", without quotes, was typed into a new Notepad document that was then saved, closed, and reopened, the nonsensical sequence of Chinese characters "畂桳栠摩琠敨映捡獴" appeared instead. While "Bush hid the facts" is the sentence most commonly used to trigger the error, other strings also trigger it, for example "hhhh hhh hhh hhhhh" or "this app can break", and even "a " or "z!".

The bug occurs when the string is passed to the Win32 charset detection function IsTextUnicode. IsTextUnicode guesses that the text is Unicode if the "hi byte" (the bytes at odd indexes) changes three times less than the "low byte". If so, it returns IS_TEXT_UNICODE_STATISTICS, and the application then incorrectly interprets the text as UTF-16LE.

The bug had existed since IsTextUnicode was introduced with Windows NT 3.5 in 1994, but it was not discovered until early 2004. Many text editors and tools exhibit this behavior on Windows because they use IsTextUnicode to determine the encoding of text files. As of Windows Vista, Notepad has been modified to use a different detection algorithm that does not exhibit the bug, but IsTextUnicode remains unchanged in the operating system, so any other tools that use the function are still affected.
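The misreading described above can be reproduced outside Windows by decoding the ASCII bytes of the sentence as UTF-16LE. The Python sketch below does that, and also includes a toy version of the "hi byte changes three times less than the low byte" test. The function names (`reinterpret_as_utf16le`, `byte_variation`, `looks_like_utf16le`) are hypothetical, and measuring "change" as the sum of absolute differences between consecutive bytes is an assumption made for illustration; this is not the actual Windows implementation.

```python
def reinterpret_as_utf16le(text: str) -> str:
    """Encode text as ASCII, then decode the same bytes as UTF-16LE.

    Each pair of ASCII bytes becomes one 16-bit code unit, so the text
    must have an even number of characters to decode cleanly.
    """
    raw = text.encode("ascii")
    return raw.decode("utf-16-le")


def byte_variation(values: bytes) -> int:
    """Sum of absolute differences between consecutive byte values."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def looks_like_utf16le(raw: bytes) -> bool:
    """Toy heuristic (assumption, not the real Win32 code).

    Guess UTF-16LE when the bytes at odd indexes (the "hi bytes") vary
    less than one third as much as the bytes at even indexes.
    """
    low = raw[0::2]   # even indexes: low byte of each UTF-16LE code unit
    high = raw[1::2]  # odd indexes: high byte of each UTF-16LE code unit
    return 3 * byte_variation(high) < byte_variation(low)


if __name__ == "__main__":
    sentence = "Bush hid the facts"
    print(reinterpret_as_utf16le(sentence))
    print(looks_like_utf16le(sentence.encode("ascii")))
```

Under these assumptions, the sentence trips the toy test because spaces and letters alternate in a way that makes the even-indexed bytes swing widely while the odd-indexed bytes stay close together, whereas a string such as "Hello World!" does not.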

Workarounds

Several workarounds exist for this bug:

This article is derived from Wikipedia and licensed under CC BY-SA 4.0.
