Radia

Visualizing Code

Radia is a 3D visualization framework for exploring disassembled binary code*, for the purpose of illuminating vulnerabilities.

Co-created by Zoe Hardisty and Devin Kinch.

*Terminology has been underlined and the definitions can be found at the end of this post.

Radia is a desktop application, created in Unity3D, that visualizes call graphs in immersive 3D space as an explorable force-directed graph. The goal of Radia is to improve visibility into the complexity of compiled code and to augment the available tools for auditing. It does this by highlighting potentially problematic functions, showing the code's interconnections and flow, and grouping nodes via their gravitational pull towards the other functions they interact with. Visualizing these structures lets code auditors use different cognitive processes, including spatial reasoning and pattern recognition, to parse the code at a macro level rather than reading it as text-based output.

The Problem Space

To understand the relationships in the structure of a large block of code, one of the tools at our disposal is to visualize it. The issue with current visualization tools is that their format is too limited to express complex data effectively, especially in the commonly used 2D graphs. They offer limited ways to interact with and explore the data, making it difficult or even impossible to trace interconnections. This prevents users from getting a sense of the entire code structure.

Radia assists this process by providing a visual way to explore the data in an interactive 3D environment.


The Audience

Reverse Engineer Mental Model

Reverse Engineer Empathy Map

Radia is a specialized tool designed for reverse engineers, security researchers and other code auditors to gain more visibility into code. A number of exercises were done to map out my user research and gain insight into Radia’s audience.

 

persona


Research

When I initially created this project I asked the question:

Can immersive virtual reality visualizations of code structures help reverse engineers’ understanding?

To answer this, I conducted secondary research on the effectiveness of being immersed in complex data, and how having a spatial relationship to the body could increase understanding and interpretation.

I found that, for certain types of data, there is a correlation between immersive environments and increased understanding compared with screen-based representations. This includes complex information like scatter plots, network topologies and call graphs.

The initial concept of using Virtual Reality as the primary method of interaction ran up against the current workflows of the target audience. The primary issue is that most people I observed reverse engineering software use multiple displays to work with numerous programs and views of the code at once, which would mean that Radia would have to take over much more of the functionality of disassembler software than was originally intended. VR also does not lend itself to large amounts of text, due to the field of view and the resolution of the headset.

Interactability was also an issue, as this particular user group is adept with programs like IDA Pro and Vi, which rely heavily on keyboard shortcuts and minimal to no mouse usage. I originally wanted to include gestural controls using the Leap Motion to overcome the need for a keyboard; however, the gestural controls were imprecise and limited the options for effectively documenting findings while exploring the visualization. Continuing with VR would necessitate using head tracking for movement and remaining seated to use a keyboard, which worked in testing, but left the users peering down their noses to see the keyboard, or taking the VR headset off and on frequently. I still believe there are opportunities to explore this sort of problem with VR or Augmented Reality (AR) headsets, but felt that concentrating on building a working visualization tool that fits in with the current reverse engineering workflow would generate a better solution.

For information about some of the research that went into Radia, please see my article in the Current Design Research Journal: Issue 06, Radia: Visualizing Code Structures.

Reverse Engineering workflow

 

The Software

Radia is a cross-platform visualization application intended to be used as part of an existing reverse engineering workflow. It was built as companion software to the IDA Pro disassembler, and uses IDA Pro's plug-in architecture to pass the data into Radia as a JSON file. IDA Pro takes a binary code file, such as an application, and converts it to assembly language, making it human readable.

The Graph

The visualizations are created using the node-and-edge model inherent in semantic representations of information like call graphs. The binary code, once disassembled by IDA Pro, is broken down into basic blocks, which are represented as nodes. The edges represent all the cross-referenced calls between the basic blocks made by the program.
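As a rough illustration of the node-and-edge model, the sketch below loads a small JSON document into basic-block nodes and call edges. The JSON layout (the keys blocks, edges, id, name, src and dst) and the block names are hypothetical, invented for this example; it is not the actual export format of the IDA Pro plug-in.

```python
import json
from dataclasses import dataclass, field

# Hypothetical export: all field names are assumptions made for illustration.
SAMPLE_EXPORT = """
{
  "blocks": [
    {"id": 0, "name": "main"},
    {"id": 1, "name": "parse_input"},
    {"id": 2, "name": "strcpy"}
  ],
  "edges": [
    {"src": 0, "dst": 1},
    {"src": 1, "dst": 2}
  ]
}
"""

@dataclass
class Node:
    id: int
    name: str
    outgoing: list = field(default_factory=list)  # ids this node calls (egress)
    incoming: list = field(default_factory=list)  # ids that call this node (ingress)

def load_graph(text):
    """Build a dict of Node objects keyed by id from the JSON text."""
    data = json.loads(text)
    nodes = {b["id"]: Node(b["id"], b["name"]) for b in data["blocks"]}
    for e in data["edges"]:
        nodes[e["src"]].outgoing.append(e["dst"])
        nodes[e["dst"]].incoming.append(e["src"])
    return nodes

graph = load_graph(SAMPLE_EXPORT)
```

Storing both directions on each node keeps the incoming and outgoing calls of any node one lookup away, which is the kind of information a per-node menu would need.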

The data is fed into a force-directed graph algorithm at the heart of Radia, and each node is assigned attractor and repeller forces according to the number of connections it has and the specific groupings it belongs to. When the visualization generates, relational structures form into distinguishable patterns inherent to that piece of code, which provide insight into its functionality.
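To give a feel for how attractor and repeller forces shape a layout, here is a minimal generic force-directed sketch in 3D. It is not Radia's algorithm: it assumes uniform forces (every pair of nodes repels, every edge acts as a spring) and the constants are arbitrary, whereas Radia weights forces by connection count and grouping.

```python
import math
import random

def force_directed_layout(node_ids, edges, steps=500, repulsion=1.0,
                          spring=0.1, step_size=0.05, seed=0):
    """Return {node_id: [x, y, z]} after a fixed number of relaxation steps."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1) for _ in range(3)] for n in node_ids}
    for _ in range(steps):
        force = {n: [0.0, 0.0, 0.0] for n in node_ids}
        # Repeller: every pair of nodes pushes apart (inverse-square falloff).
        for i, a in enumerate(node_ids):
            for b in node_ids[i + 1:]:
                delta = [pa - pb for pa, pb in zip(pos[a], pos[b])]
                dist = max(math.sqrt(sum(d * d for d in delta)), 1e-6)
                push = repulsion / (dist * dist)
                for k in range(3):
                    force[a][k] += push * delta[k] / dist
                    force[b][k] -= push * delta[k] / dist
        # Attractor: each edge pulls its two endpoints together like a spring.
        for a, b in edges:
            for k in range(3):
                pull = spring * (pos[b][k] - pos[a][k])
                force[a][k] += pull
                force[b][k] -= pull
        for n in node_ids:
            for k in range(3):
                # Clamp each coordinate step so very close nodes do not fly apart.
                pos[n][k] += max(-1.0, min(1.0, step_size * force[n][k]))
    return pos

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# A three-node chain: a calls b, b calls c.
nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
layout = force_directed_layout(nodes, edges)
```

In this chain, the connected pairs settle closer together than the unconnected pair, which is the basic mechanism by which related nodes cluster.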

The nodes are also checked against a list of flagged functions that are known to be problematic, and each is assigned a specific node shape according to the most severe issue it contains. Radia also pulls relational data from IDA Pro to show call flow, function names, basic block information, and comments from the original programmers.
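The flagging step could be sketched as a lookup that keeps only the worst match per node. The function names below are C library calls that are commonly flagged in audits, but the severity scores and shape names are invented for this example and do not reflect Radia's actual list or legend.

```python
# Hypothetical severity table: a higher number means a more severe issue.
FLAGGED = {
    "gets": 3,
    "strcpy": 3,
    "sprintf": 2,
    "memcpy": 1,
}

# Hypothetical shape names, one per severity level.
SHAPES = {0: "default", 1: "low", 2: "medium", 3: "high"}

def node_shape(called_functions):
    """Pick a shape from the most severe flagged function a node calls."""
    worst = max((FLAGGED.get(name, 0) for name in called_functions), default=0)
    return SHAPES[worst]

shape = node_shape(["printf", "memcpy", "strcpy"])  # strcpy outranks memcpy
```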

Visual Design

For the visual design I wanted early on to give it a techy, futuristic, sci-fi feel. Futuristic interfaces usually have a predominantly blue and black palette. I migrated away from this paradigm, as it felt clichéd, instead opting for retro-futuristic colours of layered blacks with orange and harder shapes, inspired by the GRiD Compass computer with its orange text on black.

For the shapes I used hexagons throughout, as I like the stability of the shape and the reference to hex, the notation used for the addresses and bytes in disassembled code. It also let me pair 45 and 30 degree angles with it, which I believe builds an interesting aesthetic. I felt that making the UI fit the current trend of super-minimal interfaces would be jarring when the visualization looks like organized chaos at times. Likewise, employing the principles of flat design when I am trying to deal with spatiality feels counterintuitive.

To complement the angular interface I chose the typeface Bender in heavier weights, which carries on the 45 degree angles and the futuristic look. Ubuntu Mono was also used in areas where code text import and export were needed, to avoid spacing issues when passing the information back and forth.

To represent the nodes I chose hard geometric shapes to increase tension and bring awareness to potential issues; bright colours also increase detectability over a large area. Conversely, the Default node is soft and neutral.

Initial node shape sketches
Initial ideation of the shapes that would represent the different types of nodes in the visualization.

Node connector ideation
An investigation of line types used in various types of visualizations.

Line Connector Ideation
The arrow-style lines were ultimately chosen because they clearly represent the code flow.

Node Shape Legend
The current shapes and colours are assigned to each node in the visualization. They are designed to be easily recognizable over a distance, so large visualizations can be easily explored.

Radia Title Screen
Title screen for the application.

Radia Main Menu
Main Menu Screen: From here the user can customize the key mapping, adjust look sensitivity and travel speed, and add the JSON file exported from IDA Pro to be visualized. Currently only the screen launch mode option works, but with further development of Radia and VR hardware the VR option may become feasible.

This is the last build of Radia (0.9.1), showing the menu system that is tied to each node. The menu gives the researcher insight into the incoming and outgoing calls (ingress and egress), the flagged functions picked up on import that could be potential problems and any strings that may be included in the original code to give extra context.

Radia zoomed out view
Distinct patterns can be seen when you fly back from the visualization.


Radia was shown at the Emily Carr University of Art and Design as part of “The Show” as my thesis project.
Radia Exhibition Space

 

Radia was displayed to the public during May 2015 at Emily Carr. The display included a live demo of the software showing the code extracted from the camera on the right, a poster explaining the problem space, solution and terminology, branding elements, and a 3D model made out of paper that mimicked what Radia displays. I custom-built the stand to provide a place to interact with Radia, and to provide security and active cooling for the computer.

I chose the camera to illustrate the need for better code quality, and the number of problems with current Internet of Things (IoT) commercial software. This is an issue that we, as an industry, should remediate.

 

3D Paper Model of Visualization
The model was crafted from paper patterns extracted from the 3D models in the visualization and carefully folded by hand.

 


Keywords & Concepts

Call Graphs

A directed graph (flow graph) that represents calling relationships between subroutines in a computer program. Specifically, each node represents a procedure and each edge (f, g) indicates that procedure f calls procedure g. It is created to illustrate a program for human understanding in a static or dynamic way.

DYNAMIC: Shows a graph of one run of the program.
STATIC: Shows all possible iterations of the program.
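The difference between the two kinds of graph can be shown with plain sets of (f, g) edges. The procedure names and the recorded trace below are hypothetical; the point is only that a dynamic graph from one run is a subset of the static graph.

```python
# Static call graph: every edge (f, g) means "procedure f may call procedure g".
STATIC_EDGES = {
    ("main", "read_config"),
    ("main", "handle_request"),
    ("handle_request", "log_error"),
    ("handle_request", "send_reply"),
}

def dynamic_edges(trace):
    """Collect the distinct (caller, callee) pairs observed in one run."""
    return set(trace)

# Hypothetical trace of one run in which no error was logged.
trace = [
    ("main", "read_config"),
    ("main", "handle_request"),
    ("handle_request", "send_reply"),
    ("handle_request", "send_reply"),  # repeated calls collapse to one edge
]

observed = dynamic_edges(trace)
never_executed = STATIC_EDGES - observed  # edges this run did not exercise
```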

Reverse Engineering

Reverse engineering software is the process of taking binary code (0’s & 1’s) and transforming it into human-readable assembly language. This is done to gain an understanding of how a program was built, its internal functionality and any problems it may contain.

Disassembler

Explores binary programs, for which source code isn’t always available, to create maps of their execution. A disassembler shows the instructions that are executed by the processor in a symbolic representation called assembly language. [1]

Binary or Machine Code

A binary code represents text or computer processor instructions using the binary number system’s two binary digits, 0 and 1. A binary code assigns a bit string to each symbol or instruction.

Assembly Language

A low-level programming language for a computer that generally has a one-to-one correspondence between the assembly language and the machine code instructions. Assembly is converted into machine code by an assembler, and recovered with a disassembler.

Basic Block

A basic block is a maximal sequence of instructions that executes, without branching, from beginning to end. Each basic block therefore has a single entry point (the first instruction in the block) and a single exit point (the last instruction in the block). The first instruction in a basic block is often the target of a branching instruction, while the last instruction is often a branch instruction. [2]
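One common way to find basic blocks is to mark "leaders" (the first instruction, every branch target, and every instruction that follows a branch) and cut the instruction list at each leader. The sketch below applies that idea to a toy instruction list; the mnemonics, addresses and tuple layout are invented for illustration and this is not how IDA Pro represents code internally.

```python
# Toy instruction list: (address, mnemonic, branch_target_or_None).
PROGRAM = [
    (0, "mov", None),
    (1, "cmp", None),
    (2, "jz", 5),     # conditional branch to address 5
    (3, "add", None),
    (4, "jmp", 6),    # unconditional branch to address 6
    (5, "sub", None),
    (6, "ret", None),
]
BRANCHES = {"jz", "jmp", "ret"}

def basic_blocks(program):
    """Split a non-empty linear instruction list into blocks of addresses."""
    leaders = {program[0][0]}                      # the first instruction
    for i, (addr, op, target) in enumerate(program):
        if target is not None:
            leaders.add(target)                    # a branch target
        if op in BRANCHES and i + 1 < len(program):
            leaders.add(program[i + 1][0])         # the instruction after a branch
    blocks, current = [], []
    for addr, op, target in program:
        if addr in leaders and current:            # a leader starts a new block
            blocks.append(current)
            current = []
        current.append(addr)
    if current:
        blocks.append(current)
    return blocks

blocks = basic_blocks(PROGRAM)
# blocks == [[0, 1, 2], [3, 4], [5], [6]]
```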

[1] IDA Pro documentation www.hex-rays.com
[2] Eagle, Chris. The IDA Pro Book: The Unofficial Guide to the World’s Most Popular Disassembler. 2nd ed. No Starch Press. 2011. pg 61. Print.

 

 
