Category Archives: Static Analysis

Getting to Know You… Towards a Capability Model for Java

Have you ever wondered what system resources the library you are about to include into your software project uses?

When you develop software and care about the security of your system, you face a dilemma: either you use off-the-shelf software components without knowing what they might do, or you inspect them (which might take as long as a rewrite) and probably miss your deadlines. This is not an enjoyable situation to be in.

We would like to change that, so we developed a high-level capability inference for Java libraries. It tells you which system resources a library uses, so you can sleep soundly again, knowing that the math library you use will not leak your sensitive data.


Hidden Truths in Dead Software Path

Why is there code in software products that will never actually be executed?

The reasons can be technical: your compiler may introduce dead code into the compiled product as part of its compilation scheme. But dead code can also show you where you, as a programmer, actually did something wrong.

Using the OPAL framework, we developed a method based on abstract interpretation that detects dead software paths, filters out the purely technical cases that you could not have avoided, and shows you only the issues that you might want to fix, depending on their severity.


Design your Analysis!

Does software engineering matter for your research prototype? Johannes and I published a paper at the latest SOAP Workshop at PLDI, and we tend to answer “yes, of course!”. We inspected a data flow analysis built on the IFDS framework. Although the paper discusses only this particular case, the argument itself is more general.

When developing static code analyses, we (the researchers and practitioners) tend to forget everything we learned in our software engineering classes and write code that is very much sequential and not modularized very well. Given that the domain can be pretty challenging and may require intensive testing, it is very surprising to me that we tend to write code that is hard to test in isolation. So what we do most of the time is integration testing: we analyze huge chunks of test code and then check the results against our expectations for those chunks. But this also means that if your analysis target is large, so is your expectation set, and tracing and fixing bugs in your implementation can be a pretty tough task.

We faced this problem while implementing FlowTwist using Heros, an implementation of the IFDS framework/solver by Reps et al. The framework forces you to separate your analysis steps into so-called flow functions, which describe how data flow facts are treated at individual instructions. It thus forces you to design along the lines of the framework, which is a good first step.
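To make the idea of a flow function a bit more concrete, here is a minimal sketch for a toy taint analysis in which a fact is just the name of a tainted variable. The `FlowFunction` interface below is a simplified stand-in written for this post, and `FlowFunctionSketch` and `assignment` are hypothetical names; none of this is FlowTwist's code or the Heros API.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Simplified stand-in for an IFDS-style flow function (an assumption for
// illustration): given one incoming fact, return the facts that hold
// after the instruction.
interface FlowFunction<F> {
    Set<F> computeTargets(F source);
}

final class FlowFunctionSketch {

    // Flow function for an assignment "lhs = rhs" in a toy taint analysis
    // where a fact is the name of a tainted variable.
    static FlowFunction<String> assignment(String lhs, String rhs) {
        return source -> {
            Set<String> targets = new LinkedHashSet<>();
            if (source.equals(rhs)) {
                // rhs is tainted: it stays tainted and taints lhs as well
                targets.add(rhs);
                targets.add(lhs);
            } else if (!source.equals(lhs)) {
                // unrelated fact: passes through unchanged
                targets.add(source);
            }
            // remaining case: lhs is overwritten, so the fact is killed
            return targets;
        };
    }

    public static void main(String[] args) {
        FlowFunction<String> f = assignment("x", "y");
        System.out.println(f.computeTargets("y")); // y stays tainted, x becomes tainted
        System.out.println(f.computeTargets("x")); // x is overwritten
        System.out.println(f.computeTargets("z")); // z is unaffected
    }
}
```

Even this tiny example already mixes three decisions (generate, kill, pass through) in one lambda.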

However, even a simple flow function implementation has to cover a lot of concerns, such as the treatment of assignments, conditionals, field accesses, etc. Therefore, our flow function implementations tended to become incredibly long pieces of code (over 300 LOC) that were hard to read, incredibly hard to explain to other people, hard to test, hard to maintain and impossible to reuse. So, plainly, a big, big mess with a method header.

Given that we actually teach students not to write code in this manner, we were, let’s say, unsatisfied with the situation and refactored. We quickly observed that we were handling multiple concerns in the same flow function. What took longer was the insight that we had also scattered these concerns across the four flow function types.

Phases and Propagators in FlowTwist’s design. F is a Fact and U is actually the union operator

So we came up with a new design based on Propagators, each of which factors one of these concerns out into a single neat and tidy class. We quickly became aware that we needed a way to express a sequence such as “first filter, then do the expensive calculations” in order to produce something that actually works, so we introduced phases as a means to organize propagators. The paper goes into the details here.
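The following is a rough sketch of how propagators and phases could fit together, not the actual FlowTwist implementation. All names (`Propagator`, `Phase`, `PhasedFlowFunction`, `PropagatorSketch`) and the string-based statements are assumptions made for illustration. It also assumes that the results of all propagators within a phase are unioned and that the output of one phase is the input of the next.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// One concern per class: a propagator maps a single fact to the facts it produces.
interface Propagator<F> {
    Set<F> propagate(F fact, String statement);
}

// A phase groups propagators; its result is the union of their results.
final class Phase<F> {
    private final List<Propagator<F>> propagators;

    Phase(List<Propagator<F>> propagators) {
        this.propagators = propagators;
    }

    Set<F> apply(Set<F> facts, String statement) {
        Set<F> out = new LinkedHashSet<>();
        for (F fact : facts) {
            for (Propagator<F> p : propagators) {
                out.addAll(p.propagate(fact, statement));
            }
        }
        return out;
    }
}

// Runs the phases in order; the output of one phase feeds the next.
final class PhasedFlowFunction<F> {
    private final List<Phase<F>> phases;

    PhasedFlowFunction(List<Phase<F>> phases) {
        this.phases = phases;
    }

    Set<F> computeTargets(F source, String statement) {
        Set<F> facts = Set.of(source);
        for (Phase<F> phase : phases) {
            facts = phase.apply(facts, statement);
            if (facts.isEmpty()) {
                break; // an earlier phase filtered everything; skip later phases
            }
        }
        return facts;
    }
}

final class PropagatorSketch {

    // Toy setup: a fact is a variable name, a statement has the form "lhs = rhs".
    static PhasedFlowFunction<String> taintFunction() {
        // Phase 1 (cheap filter): drop facts marked as sanitized.
        Propagator<String> dropClean =
                (fact, stmt) -> fact.endsWith("#clean") ? Set.of() : Set.of(fact);
        // Phase 2: keep the incoming fact ...
        Propagator<String> keep = (fact, stmt) -> Set.of(fact);
        // ... and taint lhs if rhs is tainted.
        Propagator<String> assign = (fact, stmt) -> {
            String[] parts = stmt.split(" = ");
            return parts.length == 2 && parts[1].equals(fact) ? Set.of(parts[0]) : Set.of();
        };
        return new PhasedFlowFunction<>(List.of(
                new Phase<>(List.of(dropClean)),
                new Phase<>(List.of(keep, assign))));
    }

    public static void main(String[] args) {
        PhasedFlowFunction<String> ff = taintFunction();
        System.out.println(ff.computeTargets("y", "x = y"));       // y and x are tainted
        System.out.println(ff.computeTargets("y#clean", "x = y")); // filtered in phase 1
    }
}
```

In a layout like this, each propagator can be unit tested on its own with a single fact and a single statement, which is exactly what the monolithic flow functions made so hard.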

In the end, the story is that software engineering matters even in your research prototypes and, YES, you should care about it. Believe it or not, sometimes your research actually gets picked up and continued or extended by other people. The likelihood of this happening will increase dramatically if your prototype is easy to understand, test, maintain and extend. Even though I have no proof for this last, very bold sentence, I believe it to be true, and it might even be supported by the gods of software engineering.