Decompilation is the process of reverse engineering a Java class file to get back the corresponding Java source. The original source code is not recovered; instead, the output is an equivalent of the original which, if compiled, should yield a class file that behaves the same as the one that was reverse engineered.
As a simple example of the difference with original code, any source code generated through decompilation will not include any of the comments that may have been available in the original source.
Decompilation is therefore an approximate process. However, when compared with older programming languages, e.g. C or C++, Java decompilation provides the closest approximation to the original source code. This is because the Java class file structure includes extra information such as the (human readable) variable, method and class names, byte code to source code line number mapping, and similar data for debugging purposes. All of these improve the accuracy of the decompilation process.
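To illustrate how much naming information survives compilation, the following minimal sketch uses reflection to list the field and method names stored in a compiled class. The Invoice class and the RetainedNamesDemo helper are hypothetical names invented for this example. A decompiler reads the class file directly rather than using reflection, but the names it recovers come from the same stored data.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class used only for illustration.
final class Invoice {
    private int quantity;
    private double unitPrice;

    double computeTotal() {
        return quantity * unitPrice;
    }
}

final class RetainedNamesDemo {

    /** Returns the sorted names of the fields declared in the given class. */
    static List<String> fieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            // Skip compiler- or tool-generated members.
            if (!field.isSynthetic()) {
                names.add(field.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Returns the sorted names of the methods declared in the given class. */
    static List<String> methodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (!method.isSynthetic()) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // All of these names are read back from the compiled class, not from the source.
        System.out.println("class:   " + Invoice.class.getSimpleName());
        System.out.println("fields:  " + fieldNames(Invoice.class));
        System.out.println("methods: " + methodNames(Invoice.class));
    }
}
```

Field, method and class names are always present in the class file because the JVM needs them for linking. Local variable names, by contrast, are typically available only when the class was compiled with debug information.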
The Importance of Decompilation
Let’s say you are writing a Java application that depends on a third-party library which is a commercial product. The vendor of the product has not shipped the source code along with the product.
Assume that you have encountered a bug during development that is traced back to a specific class in the third-party library. The normal procedure in this case is to raise a ticket with the vendor, describe the problem in detail, and wait for a patch to be released.
However, your product development also has an extremely tight schedule for milestone delivery. The vendor might also be taking a long time to respond to the tickets that you have raised. In such cases, you would probably want to patch the offending class yourself. This would strictly be a temporary fix till the vendor rolls out an official patch or a new revision.
Since the original source was not shipped with the product, the only way you can create the patch is to:
a) Decompile the offending class into an equivalent source file.
b) Make appropriate modifications to the source file.
c) Compile the source file to a class file.
d) Patch the library by replacing the original class file with the new one that you have created.
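Step d) can be sketched in a few lines of Java when the library is shipped as a JAR file. The example below is only an illustration under stated assumptions: JarPatcher and replaceEntry are hypothetical names, the library is assumed to be an ordinary unsigned JAR, and the recompiled class is assumed to be available as a byte array. It copies every entry of the original archive into a new archive and substitutes the new bytes for the one entry being patched.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any decompiler or vendor tool.
final class JarPatcher {

    /**
     * Copies originalJar to patchedJar, writing replacement in place of the
     * entry named entryName (e.g. "com/example/Offending.class").
     * Returns true if the entry was found and replaced.
     */
    static boolean replaceEntry(Path originalJar, Path patchedJar,
                                String entryName, byte[] replacement) throws IOException {
        boolean replaced = false;
        try (ZipFile in = new ZipFile(originalJar.toFile());
             OutputStream fileOut = Files.newOutputStream(patchedJar);
             ZipOutputStream out = new ZipOutputStream(fileOut)) {
            Enumeration<? extends ZipEntry> entries = in.entries();
            byte[] buffer = new byte[8192];
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // A fresh entry is created so that the old size and CRC are not reused.
                // Note: this simple copy does not preserve timestamps or other metadata.
                out.putNextEntry(new ZipEntry(entry.getName()));
                if (entry.getName().equals(entryName)) {
                    out.write(replacement);
                    replaced = true;
                } else if (!entry.isDirectory()) {
                    try (InputStream data = in.getInputStream(entry)) {
                        int n;
                        while ((n = data.read(buffer)) != -1) {
                            out.write(buffer, 0, n);
                        }
                    }
                }
                out.closeEntry();
            }
        }
        return replaced;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: JarPatcher <original.jar> <patched.jar> <entry-name> <new.class>");
            return;
        }
        byte[] newClass = Files.readAllBytes(Paths.get(args[3]));
        boolean ok = replaceEntry(Paths.get(args[0]), Paths.get(args[1]), args[2], newClass);
        System.out.println(ok ? "Entry replaced." : "Entry not found; archive copied unchanged.");
    }
}
```

Writing to a new archive rather than modifying the original in place keeps the vendor's file intact, which makes it easy to roll back once an official patch arrives. If the JAR is signed, replacing an entry will invalidate its signature.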
Popular Decompilation Tools
Arguably, the most popular Java decompiler ever written was this little utility called JAD. This was a command line utility written in C++. Several graphical shells are available that execute this program behind the scenes while providing the user with a more comfortable interface for source browsing, project management, etc.
Though JAD works well with older Java class files, class files produced by versions of Java later than 1.4 are not supported. Development on the product has also ceased.
DJ takes over from where JAD left the decompiler scene. It can work on class files created with more recent versions of Java. Like JAD, DJ is also a native Windows utility that does not require a JRE to run.
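Whether a class file falls within the range a given decompiler supports can be checked from the file header. Every class file starts with the magic number 0xCAFEBABE, followed by a two-byte minor version and a two-byte major version; major version 48 corresponds to Java 1.4 and 49 to Java 5. The following is a minimal sketch of such a check. ClassFileVersion and its methods are hypothetical names chosen for this example and are not part of JAD or DJ.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for inspecting the class file header.
final class ClassFileVersion {

    static final int MAGIC = 0xCAFEBABE;
    static final int JAVA_1_4_MAJOR = 48;

    /** Reads the major version number from the header of a class file. */
    static int readMajorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != MAGIC) {
            throw new IOException("Not a Java class file");
        }
        data.readUnsignedShort();        // minor version, not needed here
        return data.readUnsignedShort(); // major version
    }

    /** True if the class file targets Java 1.4 or an earlier release. */
    static boolean isJava14OrOlder(int majorVersion) {
        return majorVersion <= JAVA_1_4_MAJOR;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ClassFileVersion <file.class>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int major = readMajorVersion(in);
            System.out.println("major version: " + major);
            System.out.println("Java 1.4 or older: " + isJava14OrOlder(major));
        }
    }
}
```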
Protection against Decompilation
Strictly speaking, you cannot prevent a Java class from being decompiled, unless you convert the byte code (generated at the time of code compilation) in the class file to machine code for the underlying hardware and operating system. The following are a few tools that allow you to do this:
- The GNU Compiler for the Java Programming Language (GCJ) can compile Java source code to machine code for Unix platforms. It can also convert Java byte code into machine code. GCJ comes pre-bundled with most flavors of Unix and Linux, or is available for download from public repositories.
- Excelsior Jet provides ‘Ahead of Time’ (AOT) compilation of Java source to native code for the Windows platform. This product supports up to version 7 of the Java programming language.
Conversion to machine code is usually not recommended since it takes away many of the features (especially platform portability) that make Java so attractive as a development platform. A simpler approach is to use a technique called obfuscation.
Simply put, obfuscation is a process that modifies a Java class file to:
a) Replace human readable variable, method and class names with short, cryptic and randomly generated names.
b) Remove all debug information (e.g. byte code to source line number mapping) from the byte code.
The end goal is not to prevent decompilation of the class file, but rather, to make decompilation ineffective since the generated source code is no longer easy to decipher by a human reader.
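The effect of obfuscation can be illustrated with a before-and-after pair. The sketch below is hand-written for illustration and is not the output of any particular obfuscator or decompiler. DiscountCalculator is a hypothetical class, and the class named a shows roughly how the same logic might read once names and debug information are gone.

```java
// Before obfuscation: the names reveal the intent of the code.
final class DiscountCalculator {
    static double applyDiscount(double price, double discountPercent) {
        double discountAmount = price * discountPercent / 100.0;
        return price - discountAmount;
    }
}

// After obfuscation, as a decompiler might render it: the behavior is
// identical, but class, method and variable names carry no meaning.
final class a {
    static double b(double var0, double var2) {
        double var4 = var0 * var2 / 100.0;
        return var0 - var4;
    }
}

final class ObfuscationDemo {
    public static void main(String[] args) {
        // Both calls perform exactly the same computation.
        System.out.println(DiscountCalculator.applyDiscount(200.0, 10.0));
        System.out.println(a.b(200.0, 10.0));
    }
}
```

In a method this small the logic is still easy to follow. The loss of readability becomes significant when hundreds of classes and methods are renamed this way at once.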
Decompiling on the Decline?
In recent years, decompilation has lost its relevance as a popular mechanism to debug and patch a Java application. This is apparent from the smaller number of decompiler tools that are currently available. The reasons for this decline are numerous:
- Java is increasingly becoming more of a server-side technology than a desktop one. Since end users have little or no access to server-side implementation, the scope of decompilation for hacking and other purposes reduces drastically.
- The Java programming language itself is undergoing refinements and incorporating new features at a fast pace. Every new version of the platform comes with some language enhancement or the other. Most decompiler vendors are unable to keep pace with changing technologies on the Java front.
- Open-Source Movement: Most of the popular Java tools, platforms and applications are being offered as open-source for development and proprietary for deployment. Should any error occur during development, the developer has the source code available to quickly identify and provide a patch fix for the error.
- Competing Products: The number of competing products, tools and platforms offering the same set of features is on the rise today. The developer community has benefited from this choice of technology: if something does not work right on one platform, there is the flexibility of switching over immediately to a similar platform from a different vendor.
- Standards Compliance: Applications are written to follow certain standards (in terms of API, communication protocol, etc.) and deployed on top of platforms that support those standards. This widespread acceptance of standards has simplified the process of switching between platforms.
- Better Support: Most popular Java software comes with extremely good support from both the vendor organization and the developer community. Time taken to resolve an issue or to get a fix issued is drastically reduced.
When all else fails in debugging closed-source, commercial Java applications, decompilation can serve as an aid to resolve such issues. By having access to equivalent source code, programmers can identify the causes of errors and decide on alternatives to resolve them. In a crunch situation, the reverse engineered code can be modified and rebuilt as a patch to the original software to resolve critical issues. While the server-side prevalence of Java has reduced the frequent usage of decompilers, commercial desktop application providers should still take sufficient precautions to prevent their products from being reverse engineered.