As announced earlier, I will present some ideas for student theses here. Feel free to comment on them.
Java is one of the most widely adopted programming languages, and its Java Class Library (JCL) is a heavily used, implicitly trusted library on which many mission-critical applications are based. To prevent abuse, Java has a sophisticated security model that ensures the isolation of protected areas inside a program. However, attackers have found, and continue to find, ways to disable the security model, thus rendering it useless.
One effective way of evading the Java security model is to perform operations in native code. Since attackers cannot easily introduce new native libraries during an attack, they are keen to abuse an exploitable part of the native code already used in the JCL itself. Because this native part of the JCL is not small (roughly 800k LOC in Java 1.7), a manual code review looking for security vulnerabilities is hardly an option. Automated methods have to be developed to mitigate the possible threat that the native part of the JCL poses.
Currently, users of the public API of the JCL are largely unaware that most of their method calls will sooner or later result in a native call. Thus, they rely on the JCL to perform any necessary checks or sanitization. The non-native part of the JCL therefore carries the trust of the application developers using it.
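To make this a little more tangible, here is a minimal sketch that uses standard Java reflection to list the native methods a JCL class declares directly. The class name `NativeMethodLister` and the helper `nativeMethodsOf` are made up for illustration. Note that it only sees methods declared as `native` on the given class and says nothing about calls that reach native code transitively.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JCL or of the proposed analysis.
final class NativeMethodLister {

    /** Returns the names of all methods declared as native on the given class. */
    static List<String> nativeMethodsOf(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (Modifier.isNative(method.getModifiers())) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // java.lang.System declares several native methods, e.g. arraycopy.
        System.out.println(nativeMethodsOf(System.class));
    }
}
```

The exact list printed depends on the Java version in use.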
As the JCL and its native part have grown over the years, security reviews have become increasingly difficult to perform purely by hand. Oracle, as one of the larger contributors to Java, runs code analysis tools to aid these reviews. Due to the complex nature of the Java security model and the architecture of the JCL, finding vulnerabilities is a rather tough problem.
In this thesis, an automated static code analysis is to be implemented that evaluates how the possible threat posed by native calls propagates to the public API of the JCL. Assuming that every native method poses the same amount of threat, the propagation of this threat depends solely on the data passed to these methods. Therefore, the analysis will benefit greatly from an elaborate rating of the data, its type, the safeguards on that data, and any treatments applied to it. To combine this information, techniques from data mining, machine learning, and graph or network theory can be applied to the transitive closure of the reverse call graph (around 95,000 methods) of the native methods (around 1,800 methods) in the JCL. After a successful analysis run, it should be possible to determine the risk of calling methods of the JCL. Developers can then take effective countermeasures while processing user input and make educated choices about the classes and methods they use.
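The graph part of this idea can be sketched roughly as follows: invert the call graph and run a breadth-first search from the native methods to find every method that can transitively reach one. This is only an illustration under simplifying assumptions. The class `NativeReachability`, the method names in the toy graph, and the use of call distance as a stand-in for "risk" are all hypothetical; an actual analysis would also need the rating of the data, its type, and its safeguards described above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; methods are identified by plain strings for brevity.
final class NativeReachability {

    /** Inverts caller -> callees edges into callee -> callers edges. */
    static Map<String, Set<String>> reverse(Map<String, Set<String>> callGraph) {
        Map<String, Set<String>> callers = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : callGraph.entrySet()) {
            for (String callee : entry.getValue()) {
                callers.computeIfAbsent(callee, k -> new HashSet<>()).add(entry.getKey());
            }
        }
        return callers;
    }

    /**
     * Breadth-first search over the reverse call graph, starting at the native
     * methods. The result maps every method that can reach a native method to
     * the length of its shortest call chain to one (native methods map to 0).
     */
    static Map<String, Integer> distanceToNative(Map<String, Set<String>> callGraph,
                                                 Set<String> nativeMethods) {
        Map<String, Set<String>> callers = reverse(callGraph);
        Map<String, Integer> distance = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String nativeMethod : nativeMethods) {
            distance.put(nativeMethod, 0);
            queue.add(nativeMethod);
        }
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String caller : callers.getOrDefault(current, Collections.emptySet())) {
                if (!distance.containsKey(caller)) {
                    distance.put(caller, distance.get(current) + 1);
                    queue.add(caller);
                }
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        // Toy call graph with made-up method names.
        Map<String, Set<String>> callGraph = new HashMap<>();
        callGraph.put("Api.read", new HashSet<>(Arrays.asList("Impl.readBytes")));
        callGraph.put("Impl.readBytes", new HashSet<>(Arrays.asList("Impl.read0")));
        callGraph.put("Api.size", new HashSet<>());
        Set<String> nativeMethods = new HashSet<>(Arrays.asList("Impl.read0"));

        System.out.println(distanceToNative(callGraph, nativeMethods));
    }
}
```

In the toy graph, `Api.read` reaches the native method through two calls, while `Api.size` never does and therefore does not appear in the result. Treating every edge equally is the crudest possible choice; weighting edges by how well the passed data is checked would be one way to move towards the rating the thesis asks for.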
You will find the original thesis description on the institute’s website. If you are a CS student at TU Darmstadt and are interested in making this the topic of your Bachelor’s or Master’s thesis, please contact me.