Getting to Know You… Towards a Capability Model for Java

Have you ever wondered what system resources the library you are about to include into your software project uses?

When you are developing software and care about the security of your system, you face a dilemma: either you use off-the-shelf software components and don’t know what might happen, or you inspect the off-the-shelf components (which might take as long as a rewrite) and probably miss your deadlines. This is not a very enjoyable situation to be in.

We would like to change that, so we developed a high-level capability inference for Java libraries. It can tell you which system resources a library uses, so you can sleep safely again, knowing that the math library you use will not leak your sensitive data.


Hidden Truths in Dead Software Path

Why is there code in software products that actually never will be executed?

The reasons for this can be technical. Your compiler may introduce dead code into the compiled product as part of its compilation scheme. But dead code can also show you where you as a programmer actually did something wrong.

We developed a method, based on abstract interpretation and built on the OPAL framework, that successfully detects dead software paths, filters out all the technical issues that you couldn’t have taken care of anyway, and shows you just the issues that you might want to fix, depending on their severity.
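To illustrate the kind of finding meant here, consider a minimal, made-up Java example (it is not taken from the paper or from OPAL, and all names are hypothetical). The null check comes after the dereference, so its branch can never be taken, and that dead path points at a real programming mistake:

```java
// Hypothetical example; class and method names are made up for illustration.
final class DeadPathExample {

    // If 'name' were null, name.length() would already have thrown a
    // NullPointerException. The 'name == null' branch below is therefore dead,
    // which hints that the check was meant to come before the dereference.
    static int lengthOrZero(String name) {
        int length = name.length();
        if (name == null) {
            return 0; // never executed
        }
        return length;
    }
}
```

The compiler accepts this code without complaint, which is why an analysis that reasons about possible values is needed to spot it.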


Writing a Brilliant Talk Abstract

When giving a talk or submitting one to a conference, you are often asked to provide an abstract for it. This abstract is quite different from the one you would write for a paper, for instance. However, the two do share some characteristics.

As I am on the program committee of the Frankfurter Entwicklertag (Frankfurt Developer Day) – an industry conference – for the third year in a row, I thought it might be a good idea to provide a small tutorial for producing talk abstracts that let program committees easily assess the quality of your work and also let your work shine in the conference program.


Having a good structure is key to getting your idea across. If you bore your readers, they possibly won’t read until the end. And if you give all the important information away in the first sentence, why should your reader bother to read until the end? So finding the right balance is your task.

A talk abstract is usually longer than a paper abstract. Typical paper abstracts are about 200-250 words. For the Entwicklertag we limit abstracts to 5,000 characters – which is roughly 850 words. You should USE this space… If you just write three sentences, it is hard to tell whether your talk is actually great or not.

As Frank Herbert writes in Dune, “A beginning is a very delicate time.” The first part of your abstract is there to convince people that it is actually worth reading the whole thing.

Always start with a problem statement. Tell your reader what you are trying to solve or what particular thing you are addressing in your talk.
When your reader has understood what the problem actually is, then you need to convince her that it is actually a problem worth solving. This is the motivation. Tell us why we should care. You can either do this by sketching the benefits of this problem being solved or by outlining the cost of ignoring it.

Now that you have caught your reader’s attention, it is time to actually cash in on it and tell them what you are suggesting in your talk. This is the section where you present your approach. Here you can be much more detailed, but be sure not to give everything away… Keep them interested enough to actually listen to your talk, but let the committee know that you know what you’re speaking about. Evidence helps a lot – for instance: GitHub repository? Name it!

Finally, it is time to close. This is also the last chance you have to convince the last doubters that your work is great. So first conclude and quickly recap all the things you told us earlier. Then give us some motivation to come to your talk. What do we take home? Will we be better people? Will we be better programmers?


A good abstract should stand on its own. It should be self-explanatory. You don’t have to mention every detail of your work, but the text has to contain every major point that you make. Be concise and don’t add any fluff… We just want to read about this particular talk and not anything else.

You should be aware of your target audience… Don’t overshoot and rely on things they might not know. But also don’t underestimate what they already know.

Ask yourself while writing: “Would I go to the talk or does the abstract already include everything that I need to know?” Because your ultimate goal should always be getting people to your talk.


… all this only works if you have a great thing to talk about in the first place. It is just a way to get your idea across to your readers, whom you ultimately want to come and enjoy your talk.

Design your Analysis!

Does software engineering matter for your research prototype? Johannes and I published a paper at the latest SOAP Workshop at PLDI and tend to answer “yes, of course!”. We inspected a data flow analysis using the IFDS framework. Even though we only discuss and inspect this particular case in the paper, the argument itself is more general.

When developing static code analyses, we – as in we the researchers or practitioners – tend to forget everything that we learned in our software engineering classes and write code that is very much sequential and not modularized very well. Given that the domain can be pretty challenging and may require intensive testing, it is very surprising to me that we tend to write code that is hard to test in isolation. So what we do most of the time is integration testing: we analyze huge chunks of test code and then check the results against our expectations for those chunks. But this also means that if your analysis target is large, so is your expectation set, and tracing and fixing bugs in your implementation can be a pretty tough task.

We faced this problem while implementing FlowTwist using Heros, which is an implementation of the IFDS framework/solver by Reps et al. The framework forces you to separate your analysis steps into so-called flow functions that represent the treatment of data flow facts at instructions. It thus forces you to design along the lines of the framework, which is a good first step.

However, even a simple flow function implementation has to cover a lot of concerns such as the treatment of assignments, conditionals, field accesses, etc. Therefore, our flow function implementations tended to become incredibly long pieces of code (over 300 LOC) that were hard to read, incredibly hard to explain to other people, hard to test, hard to maintain and impossible to reuse. So plainly a big, big mess with a method header.

Given that we actually teach students not to write code in this manner, we were – let’s say – unsatisfied with the situation and refactored. We quickly observed that we were handling multiple concerns in the same flow function. What took a while was the insight that we had actually scattered these concerns over the four flow function types.

Phases and Propagators in FlowTwist’s design. F is a Fact and U is actually the union operator

So we came up with a new design based on Propagators, each of which factors out one of these concerns into a single neat and tidy class. We quickly became aware that we needed a way of expressing a sequence like “first filter, then do expensive calculations” in order to produce something that would actually work, and introduced phases as a means to organize propagators. The paper goes into the details here.

In the end, the story is that software engineering matters even in your research prototypes and – YES – you should care about it. Believe it or not, sometimes your research actually gets picked up and continued or extended by other people. The likelihood of this happening will dramatically increase if your prototype is easily understandable, testable, maintainable and extendable. Even though I haven’t got any proof for this last very bold sentence, I believe it to be true – and it might even be supported by the gods of software engineering.
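A rough sketch of how propagators and phases could fit together is given below. It is not FlowTwist's or Heros' actual API: the `Propagator` interface, the `PhasedFlowFunction` class and the representation of instructions as plain strings are all assumptions made for illustration. Within a phase, the results of all propagators are unioned; the resulting facts are then fed into the next phase.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not FlowTwist's real API.
// A propagator handles exactly one concern for one incoming fact.
interface Propagator<F> {
    // Returns the facts that hold after the instruction; an empty set kills the fact.
    Set<F> propagate(F fact, String instruction);
}

// Runs phases in order; inside a phase the propagator results are unioned.
final class PhasedFlowFunction<F> {
    private final List<List<Propagator<F>>> phases;

    PhasedFlowFunction(List<List<Propagator<F>>> phases) {
        this.phases = phases;
    }

    Set<F> computeTargets(F source, String instruction) {
        Set<F> current = new LinkedHashSet<>();
        current.add(source);
        for (List<Propagator<F>> phase : phases) {
            Set<F> next = new LinkedHashSet<>();
            for (F fact : current) {
                for (Propagator<F> propagator : phase) {
                    next.addAll(propagator.propagate(fact, instruction)); // union
                }
            }
            current = next; // facts killed by every propagator do not reach later phases
        }
        return current;
    }
}
```

With this shape, a cheap filtering propagator can sit in an early phase and an expensive one in a later phase, and each propagator can be unit tested on its own with a single fact and a single instruction.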

FlowTwist – Efficient Context-Sensitive Inside-Out Taint Analysis

Data flows can embody significant security risks that may remain undetected, as security-critical flows cannot be distinguished from normal usage. Passing an argument to a function that might lead to unwanted, possibly security-critical behavior is called an integrity issue. Attackers can thereby achieve program reactions that the original programmer did not intend, like crashes or data corruption.

Integrity problem: Be careful what someone gives you!


PEAKS – Project looking for heroes

Are you in the final steps of your master’s thesis? I am currently looking for a young researcher (M.Sc. or comparable, preferably in computer science or mathematics, preferably heroes of all kinds) who is interested in a one-year research project in static analysis and security. It is actually a nice way to look into daily scientific work before deciding to do your PhD (or not), or to get a smooth start into your own PhD project if you have already decided.

The project that needs a hero is PEAKS. The name is an acronym for Platform for the Efficient Analysis and Secure Composition of Software Components (and only works as an acronym in German). It tries to solve (or alleviate) the dilemma you are facing when trying to write secure software efficiently using pre-existing software components. Currently, it is pretty hard to limit the libraries you use to the privileges you want to give them. In a common setting these libraries run with the same privileges as their hosting process and can therefore do all the good and all the harm that the process can.

We would like to change that, because we believe this could be solved better!

Following the Principle of Least Authority, software libraries should only be equipped with the privileges necessary for their job. There are two possible ways to achieve this… You could either isolate the components from the rest of the application and only process those requests that are privileged, or you could monitor their behavior and make decisions based on the monitoring.

In PEAKS we follow the latter approach to reach an effective isolation through the monitoring of static properties. We analyze Java libraries and try to find transgressions of the Object-Capability Model. In the OCaps Model, objects in an object-oriented system are both the subject and the privilege of the security system. Or better: they can be used in either role. Here the privileges – called capabilities – can only be transferred from one subject to another in four ways:

  1. Initial conditions – the initial setup of the system grants the capability.
  2. Parenthood – a new object is created. The creator owns the capability to call the object.
  3. Endowment – during the creation of an object the creator endows the newly created object with a subset of its own capabilities.
  4. Introduction – when an object calls another, it can choose to transfer the capabilities to the other object.
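The last three transfer rules map quite directly onto plain Java references, as the following small sketch suggests. All class names (`LogCapability`, `Worker`, `Auditor`, `CapabilityDemo`) are hypothetical, and the wiring done in `run()` stands in for the initial setup of the system.

```java
// Hypothetical names throughout; a sketch of capability transfer via references.

// The capability: whoever holds a reference may append log lines.
final class LogCapability {
    private final StringBuilder sink = new StringBuilder();
    void log(String line) { sink.append(line).append('\n'); }
    String contents() { return sink.toString(); }
}

final class Worker {
    private final LogCapability log; // received by endowment at construction time
    Worker(LogCapability log) { this.log = log; }
    void work() { log.log("worker: working"); }
}

final class Auditor {
    // Received by introduction: the caller passes the reference along with the call.
    void audit(LogCapability log) { log.log("auditor: checked"); }
}

final class CapabilityDemo {
    static String run() {
        LogCapability log = new LogCapability(); // parenthood: the creator holds the new reference
        Worker worker = new Worker(log);         // endowment: the creator passes on one of its capabilities
        worker.work();
        new Auditor().audit(log);                // introduction: transfer as part of a call
        return log.contents();
    }
}
```

In this sketch neither `Worker` nor `Auditor` can log unless somebody hands them the reference, which is exactly the property the transgressions described next undermine.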

It is obvious that some widely used object-oriented programming models such as Java can violate this principle. Through static analysis we try to find these transgressions in libraries. They can be of the following types.

Static Calls

Through static calls you can construct a so-called ambient resource. Capabilities can be shared between subjects that otherwise wouldn’t have access to them. You lose the ability to control the explicit dispensation of capabilities.

However, not every static call is harmful here. In certain circumstances functionally pure methods might be allowed.
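A minimal sketch of the difference, with made-up class names: `AmbientRegistry` exposes mutable state through static methods, so any code that can name the class can reach it without ever being handed a reference, whereas `PureMath.square` touches no state and is the kind of static call that might be tolerated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example classes, for illustration only.
final class AmbientRegistry {
    // Static mutable state: reachable by every class that can name AmbientRegistry.
    private static final List<String> SECRETS = new ArrayList<>();

    static void store(String secret) { SECRETS.add(secret); }
    static List<String> readAll() { return new ArrayList<>(SECRETS); }
}

final class PureMath {
    // Functionally pure: the result depends only on the argument, no state is touched.
    static int square(int x) { return x * x; }
}

final class UntrustedLibrary {
    // Nobody ever passed this class a reference to the secrets,
    // yet the static call gives it access anyway.
    static List<String> snoop() { return AmbientRegistry.readAll(); }
}
```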

Intrusive Reflection

The most prominent and excessively used security feature of an object-oriented system is information hiding. If you break information hiding the inner state of an object can be read or altered. Subjects possessing a  capability can use intrusive reflection to break information hiding and effectively gain all transitively accessible capabilities.
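A short sketch of what breaking information hiding looks like in practice. `Vault` and `ReflectionIntruder` are hypothetical names; the reflection calls themselves (`getDeclaredField`, `setAccessible`, `Field.get`) are standard `java.lang.reflect` API. The sketch assumes both classes live in the same module and that no security policy forbids the access.

```java
import java.lang.reflect.Field;

// Hypothetical example classes, for illustration only.
final class Vault {
    private final String secret; // hidden from other classes by the language
    Vault(String secret) { this.secret = secret; }
}

final class ReflectionIntruder {
    // Holding a reference to the Vault is enough to read its private state.
    static String steal(Vault vault) {
        try {
            Field field = Vault.class.getDeclaredField("secret");
            field.setAccessible(true); // suppresses the Java language access check
            return (String) field.get(vault);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflection was blocked", e);
        }
    }
}
```

If the private field held a reference to another object instead of a string, the intruder would gain that object's capabilities as well, and transitively everything reachable from it.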

Native Calls

When using Java you might not be aware that most of the calls you make sooner or later end up executing code that is directly compiled from C/C++ to native code residing either in the Java Class Library or the software library you are currently using. This code is executed outside the security context of the Java Virtual Machine and can access and alter the VM memory directly. Thus, one of the major requirements of the OCaps model – the unforgeability of references – cannot be guaranteed any more.

You might have seen in the past week that I am offering a few student thesis projects in this area.

The work done in these thesis projects complements PEAKS as it forms a baseline for further library analysis because software libraries usually call JCL methods as well.

Are you interested in being that hero?

The project team already did a lot of preparatory work and we already have a running prototype. Your task will be to develop this prototype into a running proof-of-concept together with the project team, to evaluate it and strengthen the theory behind the approach. Of course publications are part of the job as well.

You should be:

  • a graduate of computer science or mathematics (M.Sc. or comparable)
  • a good software engineer – programming is definitely required here
  • able to talk in terms of security – privilege and protection domain are notions that you understand
  • a team player
  • bright
  • brilliant
  • a fluent speaker of English and (a plus!) German

You are a heroine? I especially encourage female computer scientists or mathematicians to apply for the job!

Please contact me over one of these channels if you are interested in the job.


PEAKS is funded as part of the Software Campus initiative of the German Ministry for Research and Education (BMBF) filed under 01IS12054.

Student thesis project: Classification of native methods of the JCL using an analysis of their implementation

The Java Class Library (JCL) – with Java being one of the most widely adopted programming languages – is a heavily used and implicitly trusted library on which many mission-critical applications are based. In order to prevent abuse, Java has a sophisticated security model to ensure the isolation of protected areas inside a program. However, attackers have found and continue to find several ways to disable the security model, thus rendering it useless.

One way of effectively evading the Java security model is to perform operations in native code. Since attackers cannot easily introduce new native libraries during an attack, they are keen to abuse an exploitable part of the native code already used in the JCL itself. As this is not a small part (roughly 800k LOC in Java 1.7) of the JCL, a manual code review looking for security vulnerabilities is hardly an option. Automated methods have to be developed to mitigate the possible threat the native part of the JCL poses.

Clearly, not every one of the roughly 1,800 native methods in the JCL constitutes a serious risk; some of them might even be completely benign. For instance, a call to might be harmless in contrast to sun.misc.Unsafe.copyMemory. So the potential threat of a native method depends on its treatment of the input data and the resulting expected (or unexpected) side effects it produces (e.g. memory alterations, buffer overflows, …).

Java’s native threat – classification

In this thesis an automated code analysis has to be developed that operates on the native part of the JCL (e.g. with LLVM) and classifies the methods visible to the Java part of the JCL according to their potential threat. Different input data for this classification can be utilized here. Interesting signals might be (but are not limited to) functional purity, direct memory manipulations, pointer arithmetic or type misuses. Basically anything from current exploit literature can be applied here to achieve more precise and meaningful results.
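The analysis itself has to look at the native implementations, which is beyond a short example. What can be sketched is a possible first step on the Java side: enumerating the native methods a class declares, i.e. the entry points that would then need to be classified. `NativeMethodLister` is a hypothetical helper; it only uses standard reflection calls (`Class.getDeclaredMethods`, `Modifier.isNative`).

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: lists the names of native methods declared by a class.
final class NativeMethodLister {
    static Set<String> nativeMethodNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method method : type.getDeclaredMethods()) {
            if (Modifier.isNative(method.getModifiers())) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The exact list depends on the Java version in use.
        System.out.println(nativeMethodNames(Object.class));
    }
}
```

Running this over every class of the JCL would give the list of candidates; deciding which of them are benign is the actual thesis topic.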

You will find the original thesis description on the institute’s website. If you are a CS student of TU Darmstadt, please contact me if you are interested in making this topic the topic of your Bachelor or Master’s thesis.

Analysis of Java Exploit CVE-2013-2460

Lately I’ve been looking closely into the Java security model, as the news coverage has been quite extensive since September 2012. Today I took a closer look at issue CVE-2013-2460.

In principle these exploits all work more or less in the same way. Usually, they come in the form of an applet that can be easily delivered to the possible victims via a web site. These applets then try to break the sandboxing security model of Java. In order to do this, they use functionality of the JCL that is not properly checked. This functionality in turn provides access to security-critical functionality that is used to break applet encapsulation. We call such JCL classes “Confused Deputies”, following the definition of Norm Hardy.

In CVE-2013-2460, these are calls to java.lang.reflect.InvocationHandler, which is an interface.  The used instance has the concrete type sun.tracing.NullProvider, but the interesting method is implemented in its superclass sun.tracing.ProviderSkeleton.

Here, the method Method.invoke is called. This method does not implement any checks whether a public method should be invoked. Therefore, security-critical, caller-sensitive methods can also be called through this proxy.

This was actually fixed after Java 7u21 with the following lines:

Class localClass = paramMethod.getDeclaringClass();
if ((localClass == Provider.class) || (localClass == Object.class))
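The idea behind such a check can be sketched with a small invocation handler of our own. This is not the JDK code: `Greeter` and `CheckedHandler` are hypothetical, and what the real fix does after the `if` is not shown in the snippet above, so throwing a `SecurityException` here is an assumption. The handler only forwards methods declared by the interface it is meant to serve or by `Object`, and refuses any other `Method` object it is handed.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;

// Hypothetical interface the handler is supposed to serve.
interface Greeter {
    String greet(String name);
}

// Hypothetical handler that checks the declaring class before delegating.
final class CheckedHandler implements InvocationHandler {
    private final Object target;

    CheckedHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Class<?> declaring = method.getDeclaringClass();
        // Without this check, any public method could be invoked on the handler's behalf.
        if (declaring != Greeter.class && declaring != Object.class) {
            throw new SecurityException("refusing to invoke " + method);
        }
        return method.invoke(target, args);
    }
}
```

Calling `invoke` directly with, say, the `Method` object for `Runtime.getRuntime` is rejected by this handler, whereas a handler without the check would happily execute it with its own privileges.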

Using this vulnerability three classes are being loaded:

They are then being used to create objects of the type MethodHandle that point to the following methods:


To get this to work, the exploit is using a very mean trick…

Using the InvocationHandler the method MethodHandles.lookup is called. This leaks a reference to the class Lookup, which also falls under the Confused Deputy category.

This class is somewhat infamous for such nice implementations as this:

/* Obtain the external caller class, when called from Lookup.<init> or
 *  a first-level subroutine. */
private static Class<?> getCallerClassAtEntryPoint(boolean inSubroutine) {
    final int CALLER_DEPTH = 4;

    // Stack for the constructor entry point (inSubroutine=false):
    //   0: Reflection.getCC, 1: getCallerClassAtEntryPoint,
    //   2: Lookup.<init>, 3: MethodHandles.*, 4: caller
    // The stack is slightly different for a subroutine of a
    // Lookup.find* method:
    //   2: Lookup.*, 3: Lookup.find*.*, 4: caller
    // Note:  This should be the only use of getCallerClass in this file.

    assert(Reflection.getCallerClass(CALLER_DEPTH-2) == Lookup.class);
    assert(Reflection.getCallerClass(CALLER_DEPTH-1) == (inSubroutine ? Lookup.class : MethodHandles.class));
    return Reflection.getCallerClass(CALLER_DEPTH);
}

And just this small and benign-looking return statement at the end of the method is spreading the misery here. It SHOULD return the calling class but in fact is returning sun.tracing.ProviderSkeleton in this case…

Which is of course highly privileged… Red Alert!
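Why a fixed stack depth is fragile can be shown with a strongly simplified sketch. It does not use the internal `Reflection.getCallerClass`; instead it reads a frame at a fixed index from `Throwable.getStackTrace()`, and all class names are hypothetical. As soon as a trusted class forwards the call, the frame at the fixed depth belongs to that trusted class and not to the code that actually initiated the call.

```java
// Simplified, hypothetical sketch of a caller lookup at a fixed stack depth.
final class FixedDepthLookup {
    // Frame 0 is callerAtFixedDepth, frame 1 is lookup(),
    // frame 2 is whoever called lookup() directly.
    private static String callerAtFixedDepth() {
        return new Throwable().getStackTrace()[2].getClassName();
    }

    static String lookup() {
        return callerAtFixedDepth();
    }
}

// Stands in for a privileged class that forwards calls for others.
final class TrustedDeputy {
    static String lookupOnBehalfOfCaller() {
        return FixedDepthLookup.lookup();
    }
}

// Stands in for the untrusted applet code.
final class UntrustedApplet {
    static String direct() {
        return FixedDepthLookup.lookup(); // reports UntrustedApplet
    }

    static String viaDeputy() {
        return TrustedDeputy.lookupOnBehalfOfCaller(); // reports TrustedDeputy instead
    }
}
```

In the sketch, `viaDeputy()` makes the lookup believe the privileged deputy is the caller, which mirrors how the lookup in the exploit ends up with sun.tracing.ProviderSkeleton.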

Then the usual stuff happens… A fresh ClassLoader is used to load and create a class from a byte array, that of course includes some highly miserable code, namely:


That loaded class of course is privileged as the ProviderSkeleton itself is. The SecurityManager is then disabled and disaster may begin…

You will find the code for this exploit as well as other exploits in the Java Exploit Library I am maintaining with Johannes Lerch.

Also, here is a very interesting report on the exploit provided by Security Explorations, a Polish team concerned with Java security since 2002.

