Student thesis project: Charting and characterizing data flows between the Java and the native part of the JCL

The Java Class Library (JCL) is heavily used, since Java is one of the most widely adopted programming languages, and it is an implicitly trusted library on which many mission-critical applications are based. In order to prevent abuse, Java has a sophisticated security model to ensure the isolation of protected areas inside a program. However, attackers have found and continue to find ways to disable the security model, thus rendering it useless.

One way of effectively evading the Java Security Model is to perform operations in native code. Since attackers cannot easily introduce new native libraries during an attack, they are keen to abuse an exploitable part of the native code already provided by the JCL itself. As this is not a small part (roughly 800k LOC in Java 1.7) of the JCL, a manual code review looking for security vulnerabilities is hardly an option. Automated methods have to be developed to mitigate the possible threat the native part of the JCL poses.

When constructing an attack against the Java Security Model using the native part of the JCL, most attackers send specially crafted input through Java methods to the native part. This crafted input might break the native part and thus enable the Java part of the exploit to deactivate the Java Security Model (e.g. CVE-2013-2465) and continue in fully privileged mode. With an applet as the delivery method for the exploit, the number of possible targets easily becomes interesting for an attacker.

javas native threat - dataflows

In this thesis, an automated analysis of the data flows between the VM-controlled and the native part of the JCL has to be created. As it will be hard to cross the language boundary between the VM-controlled and the native part with a single analysis, the analysis may run in two steps: one step analyzing the Java part of the JCL (e.g. with Soot) and another step analyzing the native part of the JCL (e.g. with LLVM). The results of both analysis steps then have to be combined to produce an overall result. A classification scheme has to be developed to characterize data flows depending on their possible exploitability. For instance, some safeguards and input sanitizers might mitigate threats well, while others might not. Additionally, certain data types could be more prone to exploitation than others. Finally, some parameters of the Java Native API might not be accessible to an attacker at all.
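
One practical detail when combining the two analysis steps is matching a Java native method to its C implementation. The following is a minimal sketch, not part of the thesis description, that derives the short symbol name defined by the JNI naming convention. The class JniNames and its method names are hypothetical. The sketch ignores two cases that a real analysis would have to handle: overloaded native methods carry an additional mangled signature suffix, and JCL classes may bind their native methods through RegisterNatives instead of relying on the naming convention.

```java
final class JniNames {

    /** Escapes one name component according to the JNI name-mangling rules. */
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);                  // ASCII letters and digits stay as they are
            } else if (c == '.' || c == '/') {
                sb.append('_');                // package separators become underscores
            } else if (c == '_') {
                sb.append("_1");               // a literal underscore is escaped
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else {
                // Every other character is written as _0 plus four lowercase hex digits.
                sb.append(String.format("_0%04x", (int) c));
            }
        }
        return sb.toString();
    }

    /** Builds the short JNI symbol name for a native method (no signature suffix). */
    static String shortName(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        System.out.println(shortName("java.lang.Object", "hashCode"));
    }
}
```

A name computed this way could serve as the join key between the call sites found on the Java side and the function definitions found on the native side.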

You will find the original thesis description on the institute’s website. If you are a CS student of TU Darmstadt, please contact me if you are interested in making this topic the topic of your Bachelor’s or Master’s thesis.

Java Exploit Library started

Today, Johannes Lerch and I started a public Bitbucket project to collect and categorize example code from Java-related exploits. It should help researchers and students to develop effective detection and prevention techniques against them.

To construct this library of exploits, we gather information from different public analysis reports and attribute the authors of the original reports, so everything can be tracked back.

Feel free to comment, use the exploit code to develop your own countermeasures, or send a pull request if you want to add or correct something.

Link: Java Exploit Library Project

Student thesis project: Modelling the use of native methods in the Java Class Library

As announced earlier, I will present some ideas for student theses here. Feel free to comment on them.

The Java Class Library (JCL) is heavily used, since Java is one of the most widely adopted programming languages, and it is an implicitly trusted library on which many mission-critical applications are based. In order to prevent abuse, Java has a sophisticated security model to ensure the isolation of protected areas inside a program. However, attackers have found and continue to find ways to disable the security model, thus rendering it useless.

One way of effectively evading the Java security model is to perform operations in native code. Since attackers cannot easily introduce new native libraries during an attack, they are keen to abuse an exploitable part of the native code already used in the JCL itself. As this is not a small part (roughly 800k LOC in Java 1.7) of the JCL, a manual code review looking for security vulnerabilities is hardly an option. Automated methods have to be developed to mitigate the possible threat the native part of the JCL poses.

Currently, users of the public API of the JCL are completely oblivious to the fact that most of their method calls will sooner or later result in a native call. Thus, they rely on the JCL to perform any checks or sanitization necessary. The non-native part of the JCL therefore enjoys the trust of the application developers using it.

As the JCL and its native part have grown over the years of its existence, security reviews have become increasingly complicated to perform purely by hand. Oracle, as one of the larger contributors to Java, runs code analysis tools to aid these reviews. Due to the complex nature of the Java security model and the architecture of the JCL, finding vulnerabilities is a rather tough problem.

javas native threat - modelling

In this thesis, an automated static code analysis has to be implemented that is able to evaluate how the possible threat posed by native calls propagates to the public API of the JCL. Assuming that every native method poses the same amount of threat, the propagation of this threat depends solely on the data provided to these methods. Therefore, the analysis will largely benefit from an elaborate rating of the data, its type, the safeguards on that data and possible treatments applied to it. In order to combine this information, techniques from data mining, machine learning, graph or network theory can be applied to the transitive closure of the reverse call graph (around 95,000 methods) of the native methods (around 1,800 methods) in the JCL. After a successful analysis run it should be possible to determine the risk of calling methods of the JCL. Developers can then take effective countermeasures while processing user input and make educated choices on the classes and methods they are using.
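
To illustrate the graph part of this idea, here is a minimal sketch of how the set of methods that can transitively reach a native method could be computed with a breadth-first search over a reverse call graph. The class name ReverseReachability and the method names in the toy graph are made up for illustration. A real analysis would obtain the call graph from a framework such as Soot and would attach a rating to every edge instead of a plain reachability flag.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReverseReachability {

    /**
     * Returns the native methods plus every method that can transitively reach one of them.
     * The map "callers" is the reverse call graph: it maps a method to its direct callers.
     */
    static Set<String> reaching(Map<String, List<String>> callers, Set<String> natives) {
        Set<String> seen = new LinkedHashSet<>(natives);
        Deque<String> work = new ArrayDeque<>(natives);
        while (!work.isEmpty()) {
            String callee = work.poll();
            for (String caller : callers.getOrDefault(callee, List.of())) {
                // Set.add returns false for already visited methods, so cycles terminate.
                if (seen.add(caller)) {
                    work.add(caller);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Toy reverse call graph with hypothetical method names.
        Map<String, List<String>> callers = new HashMap<>();
        callers.put("nativeRead", List.of("Stream.read"));
        callers.put("Stream.read", List.of("Reader.readLine", "Stream.skip"));
        callers.put("Util.format", List.of("Reader.readLine"));
        System.out.println(reaching(callers, Set.of("nativeRead")));
    }
}
```

In the toy graph, Util.format is not part of the result because it never leads to a native call, while Reader.readLine is included because it reaches nativeRead through Stream.read.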

You will find the original thesis description on the institute’s website. If you are a CS student of TU Darmstadt, please contact me if you are interested in making this topic the topic of your Bachelor’s or Master’s thesis.

Code obfuscation in Java

You carefully reviewed your Java source code for things you don’t want to happen?
Well, look again and take a close look at your comments…

What do you think the following example will produce?

public class OMG {
  public static void main(String[] args) throws Exception {
     /*
       \u006a\u0075\u006e\u006b\u0079\u002a\u002f
       \u0053\u0079\u0073\u0074\u0065\u006d\u002e
       \u006f\u0075\u0074\u002e\u002f\u002f\u0078
       \u0070\u0072\u0069\u006e\u0074\u006c\u006e
       \u0028\u0022\u0048\u006f\u0077\u003f\u0022
       \u0029\u003b\u002f\u002a\u0020\u0062\u0079
       \u0040\u006d\u0069\u0068\u0069\u0034\u0032
     */
  }
}

It’s a main method with just some comments in it… seems strange but reasonably harmless, doesn’t it?
Well, it isn’t that harmless…

The output of this is the following:

How?

Before it tokenizes the source, the Java compiler translates ASCII-escaped Unicode characters (\uXXXX) into the real characters they denote. This happens everywhere in the file, including inside comments, so the translated characters may also close a comment block. The code block above thus gets converted into this:

public class OMG {
  public static void main(String[] args) throws Exception {
     /*
       junky*/
       System.
       out.//x
       println
       ("How?"
       );/* by
       @mihi42
     */
  }
}
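
To make the translation step more tangible, here is a minimal sketch that mimics it on a string, so that a reviewer could compare the raw and the translated text of a source file. The class name UnicodeEscapeTranslator is hypothetical, and the logic is a simplified reading of the rule in the Java Language Specification (a backslash only starts an escape if it is preceded by an even number of backslashes), not the compiler's actual implementation.

```java
final class UnicodeEscapeTranslator {

    /** Replaces Unicode escapes (backslash, one or more 'u', four hex digits) by their characters. */
    static String translate(String source) {
        StringBuilder out = new StringBuilder(source.length());
        int i = 0;
        int backslashRun = 0; // contiguous backslashes directly before position i
        while (i < source.length()) {
            char c = source.charAt(i);
            if (c == '\\' && backslashRun % 2 == 0) {
                int j = i + 1;
                while (j < source.length() && source.charAt(j) == 'u') {
                    j++; // the escape may contain several 'u' characters
                }
                if (j > i + 1 && isFourHexDigits(source, j)) {
                    out.append((char) Integer.parseInt(source.substring(j, j + 4), 16));
                    i = j + 4;
                    backslashRun = 0;
                    continue;
                }
            }
            backslashRun = (c == '\\') ? backslashRun + 1 : 0;
            out.append(c);
            i++;
        }
        return out.toString();
    }

    private static boolean isFourHexDigits(String s, int start) {
        if (start + 4 > s.length()) {
            return false;
        }
        for (int k = start; k < start + 4; k++) {
            char h = s.charAt(k);
            boolean hex = (h >= '0' && h <= '9') || (h >= 'a' && h <= 'f') || (h >= 'A' && h <= 'F');
            if (!hex) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the compiler from translating the escapes in this literal.
        String hidden = "/* \\u002a\\u002f System.out.println(\"How?\"); /* */";
        System.out.println(translate(hidden));
    }
}
```

If the translated text of a file differs from the raw text inside a comment, that comment deserves a second look.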

According to Joshua J. Drake’s Black Hat article, someone called Michael Schierl reported this first. But as he does not provide a reference and I could not find anything besides Joshua’s article via Google, the credits have to stand unlinked for now.

Edit: At least found his twitter account… 🙂

Java’s Native Threat: Charting Some Unsafe Areas

Whilst my research on the Object-Capability Model continues, I am investigating the native part of the Java Class Library (JCL). This is the part of the JCL that is written in C or C++ and compiled specifically for the target platform and operating system. It contains functions that bind to operating system procedures such as file or network I/O, graphical user interaction or process control. Additionally, functionality with increased performance needs (e.g. reflection, array copies, …) can be found there.

Users of the JCL are usually oblivious of the fact that parts of the execution that they trigger will be performed outside of the Java VM, its safety guarantees and its security model.
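
A quick way to get a feeling for this is to ask the reflection API which methods of a class are declared native. The following sketch assumes nothing beyond the standard reflection classes; the class name NativeMethodLister is hypothetical, and the printed names depend on the JDK version in use.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

final class NativeMethodLister {

    /** Returns the names of all methods declared directly on the class that are marked native. */
    static Set<String> nativeMethodNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isNative(m.getModifiers())) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Two classes that almost every Java program uses, directly or indirectly.
        System.out.println(nativeMethodNames(Object.class));
        System.out.println(nativeMethodNames(System.class));
    }
}
```

This only finds direct declarations. Methods that end up in native code through a chain of Java calls require a call graph analysis.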

The native part of the JCL is undergoing a manual code review as well as automated checks for common mistakes and vulnerabilities. Nevertheless, as Jack Tang pointed out recently, there is an increase in the number of published vulnerabilities that are related to the native part of the JCL.

Typical attack patterns try to disable the Java security manager (or the AccessControlContext) by providing harmful, specially constructed input data to a native method. This is not an easy task in light of operating system countermeasures such as ASLR and DEP.

In the following days I will present ideas for novel approaches to the problem of finding these vulnerabilities or assessing the current risk. These ideas will be available for CS students of TU Darmstadt in the form of bachelor or master thesis topics.

I’m a certified ScrumMaster now!

It’s been awfully quiet here… I know.

I have had some busy months since the start of the year. A lot of tasks needed my full attention and things like blogging suddenly received a low priority. But things are changing for the better and this leaves me more time to talk about the things I do.

The main reason: My team was introducing Scrum at the start of the year and I was one of the driving forces behind that effort. And actually we’re getting good results…

Although it took a lot of work getting it started and establishing the proper rhythm and cycle, we see the benefits in the form of a better and more comfortable communication every day. We actually have a lot of fun. 🙂

Last week, a colleague and I went to a Certified ScrumMaster (CSM) course in Karlsruhe, held by Joseph Pelrine and organised by andrena objects. (And yes, we’re both certified ScrumMasters now.) We were quite impressed by the course and the way Joseph taught us all the little traps and pitfalls other teams went through in the past. It was very interesting to compare our approach to the way other people implement Scrum in their development process.

A lot of work is ahead of us, but we definitely caught a spark there to keep on working on our process.

NHibernate Attributes

I have been using NHibernate for some years now, mainly to integrate an existing database into the new shiny .NET version of the application while it is still being used by a legacy application.
It has served very well as an integration technique, saving me from headaches more than once, especially when moving from MS SQL Server to Oracle to MySQL while attracting more customers to the application.
Thus, I use mapping files which are deployed alongside the application in order to make the odd adjustment for some customers. That doesn’t happen very often, but it does.
Recently I had the chance to start a new application. Besides all of the advantages, writing mapping files CAN get a little bit tiring when creating new data model classes. That’s where NHibernate Mapping Attributes come into play.
The concept is quite straightforward. You add a lot of annotations to your data class and during application startup the library creates the mapping information in memory and configures NHibernate accordingly. It could actually be a little more direct, but it works just fine.

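using System.IO;
using NHibernate.Cfg;
using NHibernate.Mapping.Attributes;
// Build the mapping XML in memory from the attributes and hand it to NHibernate.
Configuration cfg = new Configuration();
HbmSerializer.Default.Validate = true;
MemoryStream ms = HbmSerializer.Default.Serialize(System.Reflection.Assembly.GetExecutingAssembly());
cfg.AddInputStream(ms);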

The argument of the Serialize() method could be any other assembly if you like. For my prototype it was the executing assembly. After that bit, it’s just the usual NHibernate process of creating the SessionFactory and getting the Sessions from the SessionFactory.
Creating a persistent class is just as easy.

using NHibernate.Mapping.Attributes;

[Class(Lazy=false)]
public class Something
{
    [Id, Generator(Class="assigned")]
    public int Id { get; set; }

    [Property]
    public string Name { get; set; }
}

The attributes are quite self-explanatory and if you are used to writing mapping files it is quite natural.

What’s so hard about exceptions?

I keep wondering… Why is it actually so hard to handle exceptions correctly?

There are a few ground rules to follow and still people tend to overcatch or throw far too general exception types. As a recent empirical study suggests, this seems to be a general problem throughout software development. There is a brief, very well written guideline for .NET development that I keep waving at my team. I bet there is a similar guide for Java as well. Any suggestions?

But I think the problem with exceptions is not just a matter of guidelines that should be followed. Not every developer (and certainly not every customer) is comfortable with the idea of errors that may pop up out of nowhere. So errors are silently suppressed for the sake of appearances. But missing functionality cannot be hidden. If it doesn’t save, it doesn’t save and I certainly want to know that…

Logging is one way, but there is no path around exceptions in object-oriented programming. Given that, there is also no way around a fail-fast ideology. If something has gone wrong and I cannot do anything about it, it’s best to tell the world and not to brush it under the carpet. That’s also the best way to learn, by the way.

So, the first thing to establish proper exception handling seems to be the introduction of a culture of openly accepted mistakes. “Anything we find now, we won’t find later on.” should be the new motto.
