Tizen content scanning and app obfuscation

By Nathan Willis
June 12, 2013

Tizen Dev Con 2013

At the 2013 Tizen Developer Conference in San Francisco, there was a range of security talks examining different facets of hardening the mobile platform. Last week, we examined the Smack framework that implements access control for system resources. There were also sessions that explored the problem of protecting the device at higher levels of the system stack. Sreenu Pilluta spoke about guarding against malware delivered via the Internet, and Roger Wang offered an unusual proposal for obfuscating JavaScript applications themselves: by compiling them.

Content, secure

Pilluta is an engineer at anti-virus software vendor McAfee. As he explained, Tizen device vendors are expected to manage their own "app stores" through which users install safe applications, but that leaves a lot of avenues for malicious content unblocked. Email, web pages, and media delivery services can all download content from untrusted sources that might contain a dangerous payload. Pilluta described Tizen's Content Security Framework (CSF), a mechanism designed to let device vendors add pluggable virus- and malware-scanning software to their Tizen-based products.

The CSF itself provides a set of APIs that other components can use to scan two distinct classes of content: downloaded data objects and remote URLs. The actual engines that perform the scanning are plugins to CSF, and are expected to be added to Tizen by device vendors. Security engines come in two varieties: Scan Engines (for data) and Site Engines (for URLs). Scan Engines inspect content and are designed to retrieve malware-matching patterns from the vendor, as is typical of PC virus-scanning programs today. Site Engines use a reputation system, in which the vendor categorizes URLs and creates block list policies by category (e.g., gambling, pornography, or spyware).
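To make the division of labor concrete, here is a minimal Python sketch of a pluggable scanning framework in the spirit of what Pilluta described. It is not the CSF API: every class, method, signature name, and host name below is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlparse


@dataclass
class ScanEngine:
    """Hypothetical data scanner: matches byte patterns supplied by the vendor."""
    signatures: dict[str, bytes] = field(default_factory=dict)

    def update_signatures(self, new_signatures: dict[str, bytes]) -> None:
        # Stand-in for retrieving fresh malware patterns from the vendor.
        self.signatures.update(new_signatures)

    def scan_data(self, data: bytes) -> Optional[str]:
        # Return the name of the first matching signature, or None if clean.
        for name, pattern in self.signatures.items():
            if pattern in data:
                return name
        return None


@dataclass
class SiteEngine:
    """Hypothetical URL checker: vendor-assigned categories plus a block policy."""
    categories: dict[str, str] = field(default_factory=dict)  # host -> category
    blocked: set[str] = field(default_factory=set)            # blocked categories

    def check_url(self, url: str) -> Optional[str]:
        # Return the blocked category for this URL's host, or None if allowed.
        host = urlparse(url).hostname or ""
        category = self.categories.get(host)
        return category if category in self.blocked else None


# A device vendor would plug in real engines; these are toy instances.
scan_engine = ScanEngine()
scan_engine.update_signatures({"Test.Pattern": b"EVIL-PAYLOAD"})
site_engine = SiteEngine(
    categories={"casino.example": "gambling", "news.example": "news"},
    blocked={"gambling", "spyware"},
)

print(scan_engine.scan_data(b"..EVIL-PAYLOAD.."))
print(site_engine.check_url("http://casino.example/play"))
```

The two engine types stay separate because they answer different questions: one inspects bytes already on the device, the other judges a location before anything is fetched.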

Applications dictate when the scanning is performed, Pilluta said, which is intentionally a decision left up to the vendor. Some might choose to scan a page before loading it at all, while others might load the page but scan it before executing any JavaScript. It is also up to the application what to do when infected content is found. The rationale is that the application can provide a more context-aware response to the user, and do so within the expected bounds of the user interface, rather than popping up an imposing and unfamiliar warning notification from a component the user was unaware even existed.

The CSF scanning APIs are high-level and event-driven, which Pilluta said allowed applications to call them cooperatively. For example, an email client could call the Site Engine to scan a URL inside of an email message and the Scan Engine on a file attachment. Similarly, the email client could call the Site Engine on a URL clicked upon to be opened in the browser. Old-fashioned scanning methods that use "deep hooks" into the filesystem would make this sort of cooperation difficult, he said.
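The email-client scenario might look roughly like the following Python sketch. It is an assumption-laden illustration, not CSF code: the callback types, the Email class, and the toy engines are all hypothetical, and the calls are synchronous for brevity where the real APIs were described as event-driven.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical scanner callbacks: each returns a short reason string when the
# content should be flagged, or None when it looks clean.
UrlChecker = Callable[[str], Optional[str]]
DataScanner = Callable[[bytes], Optional[str]]


@dataclass
class Email:
    links: list[str]
    attachments: dict[str, bytes]  # file name -> contents


def review_email(msg: Email, check_url: UrlChecker, scan_data: DataScanner) -> list[str]:
    """Return in-app warnings; the application decides how to present them."""
    warnings = []
    for url in msg.links:
        reason = check_url(url)
        if reason is not None:
            warnings.append(f"Link disabled ({reason}): {url}")
    for name, data in msg.attachments.items():
        reason = scan_data(data)
        if reason is not None:
            warnings.append(f"Attachment quarantined ({reason}): {name}")
    return warnings


# Toy engines standing in for vendor-supplied plugins.
def toy_check_url(url: str) -> Optional[str]:
    return "spyware" if "bad.example" in url else None


def toy_scan_data(data: bytes) -> Optional[str]:
    return "Test.Pattern" if b"EVIL-PAYLOAD" in data else None


msg = Email(
    links=["http://bad.example/free", "http://lwn.net/"],
    attachments={"notes.txt": b"hello", "invoice.bin": b"..EVIL-PAYLOAD.."},
)
for warning in review_email(msg, toy_check_url, toy_scan_data):
    print(warning)
```

Because the client receives plain results rather than a system-level alert, it can fold them into its own message view, which is the context-aware behavior the framework is meant to allow.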

The APIs are also designed to provide flexibility to application authors. For example, the Site Engine API is not tied to the Tizen Web Runtime or even to the system's HTTP stack. Thus, an application that uses its own built-in HTTP proxy can still take advantage of the CSF to scan URLs without re-implementing the scanner.

Ultimately, CSF is a framework that device makers will take advantage of, each in its own way. Presumably commercial vendors will offer virus scanning engines to interested OEMs, but consumers will likely not see any of them until a Tizen product hits the market. The flexible framework also seems designed to support HTTP-driven services like downloadable media and game content, which are frequently the most-cited examples of why companies want to see Tizen in devices like smart TVs and car dashboards.

CSF is an open source contribution to the Tizen platform, although one would reasonably expect McAfee to also develop scanning engines to offer to device vendors and mobile providers. As the CSF begins to take shape in products coming to market, it will be interesting to see if there are also any open source scanning engines, either in the Tizen reference code or produced by third parties. One would hope so, since malware detection is a concern for everyone, not just commercial device makers.

JavaScript app protection

In contrast to Pilluta's talk, Wang was not presenting a component of the Tizen architecture; rather, he was showing the progress he has made on a personal effort that he hopes will appeal to independent application developers. The issue he tackled was protecting JavaScript applications against reverse-engineering. While that is not an issue for developers of open source apps, building tools to simplify the process on an open platform like Tizen could have implications further down the road. Wang is a developer for Intel working on the Tizen platform, although this particular project is a personal side-effort.

In the past, he said, JavaScript was primarily used for incidental page features and other such low-value scripts, but today JavaScript applications implement major functionality, and HTML5-driven platforms like Tizen should offer developers a way to protect their code against theft and reverse-engineering. There are a number of techniques already in use that side-step the issue, such as separating the core functionality out into a server-side component, or building the business model around the value of user-contributed data. But these approaches do not work for "pure" client-side JavaScript apps.

Most app developers rely on an obfuscation system to "minify" JavaScript that they want to obscure from prying eyes. Obfuscation removes easily-understood function and variable names, and changes the formatting to make the code delivered difficult to understand. The most popular obfuscator, he noted, was Yahoo's YUI Compressor (which has other beneficial features like removing dead code), followed by the Google Closure Compiler, and UglifyJS. But obfuscators still produce JavaScript which is delivered to the client browser or web runtime and can ultimately be reverse-engineered.
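A toy illustration of what such a tool does, written in Python for brevity: the sketch below strips comments, renames identifiers, and squeezes out whitespace using regular expressions. It is only a rough approximation under stated assumptions; real minifiers parse the JavaScript properly, and the function name and rename table here are made up for the example.

```python
import re


def toy_minify(source: str, renames: dict[str, str]) -> str:
    """Naive regex-based minifier; unsafe for code with "//" inside strings."""
    # Drop // line comments.
    source = re.sub(r"//[^\n]*", "", source)
    # Replace whole-word identifiers with short, meaningless names.
    for old, new in renames.items():
        source = re.sub(rf"\b{re.escape(old)}\b", new, source)
    # Collapse every run of whitespace to a single space.
    source = re.sub(r"\s+", " ", source).strip()
    # Remove the spaces around common punctuation and operators.
    source = re.sub(r"\s*([{}();,=+*])\s*", r"\1", source)
    return source


original = """
// Compute the total price including tax.
function totalPrice(netAmount, taxRate) {
    var taxAmount = netAmount * taxRate;
    return netAmount + taxAmount;
}
"""
renames = {"totalPrice": "a", "netAmount": "b", "taxRate": "c", "taxAmount": "d"}
minified = toy_minify(original, renames)
print(minified)  # function a(b,c){var d=b*c;return b+d;}
```

The output is harder to skim, but it is still ordinary JavaScript with its structure intact, which is the weakness Wang pointed out.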

The other major approach found in practice today is encryption, in which the app is downloaded by the device and placed in encrypted storage by the installer. Typically either the initial download is conducted over a secure channel (e.g., HTTPS) or the download is done in the clear and the installation program encrypts the app when it is installed. Both have weaknesses, Wang said. Someone can dump the HTTP connection if it is unencrypted and intercept the app, but a skilled attacker could also run a man-in-the-middle attack against HTTPS. Ultimately, he concluded, there is always dumping from memory, so encryption is an approach that will always get broken one way or another.

Although there are a few esoteric approaches out there, such as writing one's app in another language and then compiling it to JavaScript (a practice Wang said was out-of-scope for the talk since he was addressing the concerns of JavaScript coders), most people simply "lawyer up" and apply licensing terms that forbid examining the app. That may not work in every jurisdiction, he said, and even when it does, it is expensive.

Wang's experiment takes a different approach entirely: compiling the JavaScript app to machine code, just like one does with a native app. The technique works by exploiting the difference between a platform's web runtime (which does not allow the user to inspect or save HTML content) and the web browser. A developer can work in JavaScript, then deploy the app as a binary. The platform would have to support this approach, both in the installer and in the web runtime, however, and developers would need to rebuild their apps for each HTML5 platform.

Wang has implemented the technique as an experimental feature of node-webkit, his app runtime derived from Chromium and Node.js. It compiles a JavaScript app using the V8 JavaScript engine's "snapshot" feature. Snapshots dump the engine's heap, and thus contain all of the created objects and Just-In-Time (JIT) compiled functions. In Chromium, snapshots are used to cache contexts for performance reasons; the node-webkit compiler simply saves them. The resulting binaries can then be executed by WebKit's JSC utility.
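The general idea of shipping a compiled artifact instead of source text can be sketched with a loose analogy in Python. This is not V8 and not a heap snapshot: it serializes Python bytecode with the standard marshal module, whereas a V8 snapshot captures the engine's heap. The function name and pricing rule are invented for the example.

```python
import marshal

source = """
def secret_discount(price_cents):
    # Proprietary pricing rule the author would rather not ship as text.
    return price_cents * 85 // 100
"""

# "Build" step: compile the source text to a code object and serialize it.
code = compile(source, "<app>", "exec")
blob = marshal.dumps(code)

# "Deploy" step: only the blob is shipped; the runtime loads and executes it.
namespace = {}
exec(marshal.loads(blob), namespace)
print(namespace["secret_discount"](1000))  # 850

# Comments do not survive compilation, but identifier names do.
print(b"Proprietary" in blob)      # False
print(b"secret_discount" in blob)  # True
```

As with a snapshot, the blob is tied to the runtime that produced it: marshal data is specific to a Python version, much as a developer would need to rebuild for each platform.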

There are, naturally, limitations. V8 snapshots are created very early in the execution process, so some DOM objects (such as window) have not yet been created when the snapshot is taken. On the wiki entry for the feature, Wang suggests a few ways to work around this issue. A second limitation is that the snapshot tool will throw an error if the JavaScript app is too large; Wang suggests splitting the app up if this poses a practical problem. A third is that the resulting binary runs significantly slower than JavaScript executed in the runtime.

He has been exploring other techniques for extending the idea, such as using the Crankshaft optimizer. Crankshaft is an alternative to the JavaScript compiler currently used in V8. At the moment, using Crankshaft's compiler can generate code that runs faster, Wang said, but it takes significantly longer to compile, and it requires "training Crankshaft on your code."

Wang has defined an additional field for the package.json file that defines Tizen HTML5 applications; "snapshot" : "snapshot.bin" can be used to point to compiled JavaScript apps and test them with node-webkit. He is still in the process of working out the API required to connect JSC to the Tizen web runtime, however. The feature is not currently slated to become part of the official Tizen platform.
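As a rough illustration of how a runtime or build tool might consume that field, the Python sketch below parses a manifest and resolves the snapshot location. Only the "snapshot" key comes from the talk; the "name" and "main" values, the helper function, and the choice to resolve the path relative to the app directory are assumptions made for this example.

```python
import json
from pathlib import PurePosixPath
from typing import Optional

# Example manifest. "snapshot" is the field described above; the "name" and
# "main" values are made up for this illustration.
manifest_text = """
{
  "name": "demo-app",
  "main": "index.html",
  "snapshot": "snapshot.bin"
}
"""


def snapshot_path(manifest_json: str, app_dir: str) -> Optional[str]:
    """Return the path of the compiled snapshot named in the manifest, if any."""
    manifest = json.loads(manifest_json)
    snapshot = manifest.get("snapshot")
    if snapshot is None:
        return None  # plain JavaScript app, nothing precompiled
    # Assumption: the snapshot file lives inside the app's own directory.
    return str(PurePosixPath(app_dir) / snapshot)


print(snapshot_path(manifest_text, "/opt/apps/demo"))  # /opt/apps/demo/snapshot.bin
```

Treating the field as optional keeps ordinary, uncompiled apps working unchanged.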

Obfuscating JavaScript by any means is a controversial subject. To many in the free software world, it is seen as a technique to prevent users from studying and modifying the software on their systems. Bradley Kuhn of the Software Freedom Conservancy lambasted it at SCALE 2013, for instance. Then again, obfuscation is not required to make a JavaScript app non-free; as the Free Software Foundation notes, licensing can do that alone. Still, it is likely that compiling JavaScript apps to machine code offers a tantalizing measure of protection to quite a few proprietary software vendors, beyond the attack-resistance of traditional obfuscation techniques.

Many users, of course, are purely pragmatic about mobile apps: they use what is available, free software or otherwise. But as the FSF points out, unobfuscated JavaScript, while it may be non-free, can still be read and modified. Perhaps the longer-term concern about obfuscation or compiling to machine code is that a device vendor could automate the technique on its mobile app store. But automated or manual, the prospect of building JavaScript compilation into Tizen did appear to ruffle several feathers at Tizen Dev Con; audience members asked about the project during the Q&A sections of several later talks. Nevertheless, for the foreseeable future, Wang's effort remains a side project of an experimental nature.

[The author wishes to thank the Linux Foundation for travel assistance to Tizen Dev Con.]

"protect their code against theft and reverse-engineering"

Posted Jun 13, 2013 20:19 UTC (Thu) by debacle (subscriber, #7114)

Thanks, Roger Wang, and thanks to the organizers of the conference!

It is surely an interesting task, and may have positive results for the JavaScript community in general.

I wonder, however, since when it is possible to steal code. One might copy it, but the original owner of the code still has it. And on what ethical basis does Mr Wang want to prevent me from analysing code that is supposed to run on my computer or my telephone?

To really prevent any form of reverse engineering, I suggest that Mr Wang promote free software and put well-documented source code in the public.

Tizen content scanning and app obfuscation

Posted Jun 14, 2013 20:32 UTC (Fri) by massimiliano (subscriber, #3048)

I do not want to touch the ethical issues (hiding information, trying to prevent reverse engineering...), but even just technically the techniques described seem bad ideas to me.

On one hand, compiling javascript ahead of time and expecting reasonable performance is an uphill battle at best, and currently next to impossible. And it could be possible only with a major engineering effort in compiler technology (note that I work at Google on the V8 engine team).

Javascript is not statically typed, and modern JITs rely on type information optimistically inferred at run time to produce efficient code. In this environment "deoptimizing" (throwing away optimized code and resuming the execution with "regular" code) is something that happens regularly when, at run time, the optimized code needs to deal with a value that was following a different "pattern". Producing optimized code guaranteed not to be deoptimized ahead of time is just not possible, unless one restricts oneself to a language subset like asm.js.

And here comes the second problem: if one is willing to restrict oneself to such a language subset, that looks really similar to machine code, then why not bite the bullet and adopt a technology like NaCl (or its portable variant)? Technically, using javascript for this is not the best solution.

Of course in this world technical choices are often done for reasons that are not technical at all...

Tizen content scanning and app obfuscation

Posted Jun 17, 2013 10:56 UTC (Mon) by nix (subscriber, #2304)

> Of course in this world technical choices are often done for reasons that are not technical at all

Quite. Hence the *other* talk, in which McAfee attempts to give away a razor so it can sell people expensive razor blades that slow down their whole system, as a halfassed defence against a problem that simply *does not affect* properly secured systems in the first place!