
‘Trojan Source’ flaw could result in covert app poisoning


A newly disclosed vulnerability in the way source code is compiled could put enterprises at risk of upstream attacks.

A pair of researchers from Cambridge University in the UK said that a condition dubbed "Trojan Source" allows attackers to insert malicious source code which can evade detection by security reviewers.

Researchers Nicholas Boucher and Ross Anderson said the problem is that snippets of code can appear one way to a human reading the source and then behave in a completely different way once compiled.

“Since the 1960s, researchers have investigated formal methods to mathematically prove that a compiler’s output correctly implements the source code supplied to it,” the researchers explained.

“Many of the discrepancies between source code logic and compiler output logic stem from compiler optimizations, about which it can be difficult to reason.”

In particular, the researchers found that the Unicode bidirectional algorithm (Bidi) can be manipulated to hide potentially malicious code. Intended to allow interoperability between left-to-right languages (such as English or Russian) and right-to-left languages (such as Arabic and Hebrew), Bidi control characters allow the display order of text to be switched as needed.
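For readers unfamiliar with these control characters, the short Python sketch below lists the nine directional formatting characters defined by the Unicode Bidi algorithm, using only the standard `unicodedata` module. The function name `describe_bidi_controls` is an assumption made for illustration and is not taken from the researchers' work.

```python
import unicodedata

# The nine Bidi embedding, override and isolate controls, written as escapes
# so that they remain visible in this listing.
BIDI_CONTROLS = "\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069"

def describe_bidi_controls():
    """Return (code point, Bidi class, Unicode name) for each control character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), unicodedata.name(ch))
        for ch in BIDI_CONTROLS
    ]

for code_point, bidi_class, name in describe_bidi_controls():
    print(code_point, bidi_class, name)
```

None of these characters has a visible glyph, which is why they are easy to miss when reading code on screen.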

What the duo discovered was that in some cases the Bidi control characters can also be concealed within the source code. This allows the appearance of the source code to be manipulated in a way that would likely evade detection when a reviewer conducts quality or security checks.

In some cases, the manipulation would change the way instructions are executed, such as "early return" attacks that end the operation prematurely. In other cases, the Bidi manipulation would allow entire chunks of code (such as security measures or input validation) to be read as comments and not executed.
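A minimal Python sketch of an early-return pattern is shown here, loosely modeled on the kind of attack described above. The function `subtract_funds`, the `bank` dictionary and the helper `run_demo` are hypothetical names chosen for illustration. The Bidi character is written as an escape and the code is built as a string, so the hidden character stays visible in this listing.

```python
# Hypothetical example: all names here are invented for illustration.
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE, kept as an escape so it is visible

# In logical (parser) order the docstring closes right after RLI, and a
# separate ";return" statement follows on the same line.
SOURCE = (
    "def subtract_funds(bank, account, amount):\n"
    "    ''' Subtract funds from bank account then " + RLI + "''' ;return\n"
    "    bank[account] -= amount\n"
    "    return\n"
)

def run_demo():
    """Compile the crafted source, call it, and return the resulting balance."""
    namespace = {}
    exec(SOURCE, namespace)
    bank = {"alice": 100}
    namespace["subtract_funds"](bank, "alice", 30)
    return bank["alice"]

print(run_demo())  # prints 100: the subtraction line is never reached
```

In an editor that applies the Bidi algorithm, the closing quotes and the `;return` may be displayed as though they were part of the docstring text, so a reviewer would see what looks like an ordinary comment followed by a working subtraction.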

The result is a way for attackers to deliberately inject vulnerabilities into a project’s source code, potentially creating an upstream attack such as the 2020 SolarWinds supply chain attack. Even more disturbing, the Bidi instructions persist through copy-and-paste operations, meaning code-sharing sites could be targeted to infect multiple applications and services via poisoned source code.

There are some mitigations in place. The researchers noted that a handful of modern languages such as Rust and C++ contain syntax checks that would catch some of the described techniques, but not all. Scripting languages such as Python and SQL would be open to an even wider variety of attack techniques.

The Cambridge duo said a number of mitigations were put in place ahead of the flaw's disclosure. Both GitHub and BitBucket now highlight the use of Bidi characters within source code, and Visual Studio Code, Emacs, and Rust have put measures in place to detect and warn of Bidi control characters.
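Teams that want a similar warning in their own tooling could start from something like the Python sketch below. It is a rough illustration under stated assumptions, not the check used by any of the products named above, and the function name `find_bidi_controls` is hypothetical. It flags any character whose Unicode Bidi class is one of the embedding, override or isolate controls.

```python
import unicodedata

# Bidi classes of the directional formatting characters (embeddings,
# overrides, isolates and their terminators).
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def find_bidi_controls(text):
    """Return (line, column, Unicode name) for each Bidi control in text.

    Lines and columns are 1-based; lines are split on "\n" only.
    """
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES:
                findings.append((line_no, col, unicodedata.name(ch)))
    return findings

sample = "ok = 1\nx = 'a\u202eb'\n"
for line_no, col, name in find_bidi_controls(sample):
    print(f"line {line_no}, column {col}: {name}")
```

A check like this could run in a pre-commit hook or a CI job. Projects that legitimately contain right-to-left text in strings or comments would need an allowlist rather than a blanket rejection.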

Developers are still advised to keep a close eye on their source code, particularly snippets that have been copied from a shared repository.
