Visualizing version control data
How might a software development team use version control data to augment decision making?
Version control systems let large teams of software developers collaborate on a single project, track changes to it, and avoid interfering with each other’s work. The data these systems generate is not used to its full potential: developers typically consult it to diagnose and fix bugs after they are detected, but few products use it to predict and prevent bugs before they occur.
First, I conducted a comprehensive literature review to learn how version control data is currently used for predictive analytics. That’s how I learned about code churn, an extremely useful metric that is underutilized in current industry solutions.
A metric that counts the lines of code added to, modified in, or deleted from a source code repository
Easy to measure using data generated by version control systems
Measures system volatility and is a useful predictor of software bugs
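As a rough illustration of how churn can be measured from version control data, the sketch below sums added and deleted lines per file from the text that `git log --numstat --format=%h` prints. The function name `churn_per_file` and the sample log are hypothetical and not part of the original project. Git reports a modified line as one deletion plus one addition, so modifications are already covered by the two counts.

```python
from collections import defaultdict


def churn_per_file(numstat_text):
    """Sum added + deleted lines per file from `git log --numstat` output.

    Assumes each change line has the form "<added>\t<deleted>\t<path>".
    """
    churn = defaultdict(int)
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit hashes and blank lines are not change records
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # git prints "-" for binary files, which have no line counts
        churn[path] += int(added) + int(deleted)
    return dict(churn)


# Hypothetical sample: two commits touching two files.
sample = (
    "1a2b3c\n"
    "10\t2\tsrc/app.py\n"
    "3\t0\tREADME.md\n"
    "\n"
    "4d5e6f\n"
    "5\t5\tsrc/app.py\n"
)
result = churn_per_file(sample)
print(result)
```

Files with the highest totals are the most volatile and would be the natural candidates to highlight in a churn visualization.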
Based on the findings of my literature review, I sketched out ideas for how to best visualize code churn and narrowed down my list of potential solutions to the following two graphs:
Advantage: high information density at the attribute level
Trade-off: does not visualize network structure
Advantage: visualizes both network structure and key attributes of each node
Trade-off: does not visualize changes over time
After sketching, I built out the designs using real data sourced from public GitHub repositories. Using the command line, I forked my own copy of a public repository and piped the version history into an Excel file. From there, I manually encoded the visualizations in Sketch.
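The export step could be scripted along these lines. This is a minimal sketch, not the exact commands used in the project: the chosen fields (hash, author, date, subject), the function names, and the sample log line are all assumptions. It converts tab-separated `git log` output into CSV text that Excel can open.

```python
import csv
import io
import subprocess

# Tab-separated fields: short hash, author name, author date, subject line.
LOG_FORMAT = "%h%x09%an%x09%ad%x09%s"


def read_git_log(repo_path):
    """Run `git log` in repo_path and return its output as text."""
    completed = subprocess.run(
        ["git", "-C", repo_path, "log",
         f"--pretty=format:{LOG_FORMAT}", "--date=short"],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout


def log_to_csv(log_text):
    """Convert tab-separated log lines into CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["commit", "author", "date", "subject"])
    for line in log_text.splitlines():
        fields = line.split("\t", 3)  # the subject is always the last field
        if len(fields) == 4:
            writer.writerow(fields)
    return out.getvalue()


# Hypothetical log line, used here so the example runs without a repository.
sample_log = "1a2b3c\tAda\t2020-01-15\tFix parser, add tests\n"
csv_text = log_to_csv(sample_log)
print(csv_text)
```

With a local clone available, `log_to_csv(read_git_log("path/to/repo"))` would produce the full history, which can then be saved with a `.csv` extension and opened in Excel.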
I’ve identified two areas where future research is needed:
Which visualization can users most rapidly and accurately read?
Remote, unmoderated user experience evaluation using survey software
Within-subjects design: each participant reads a graph and answers three questions, then repeats until all graphs have been read
To create a sense of urgency, participants will have to answer each question before a countdown clock runs out
Metrics to capture: task completion, time on task, and task accuracy
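One way the three metrics might be computed from the survey export is sketched below. The `Response` record, its field names, and the sample data are hypothetical. A question counts as completed if the participant answered before the countdown expired, and accuracy and time on task are computed over completed questions only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    graph: str               # which visualization was shown
    answer: Optional[str]    # None when the countdown expired first
    correct_answer: str
    seconds: float           # time spent on the question


def summarize(responses):
    """Return completion rate, accuracy, and mean time on task per graph."""
    summary = {}
    for graph in sorted({r.graph for r in responses}):
        shown = [r for r in responses if r.graph == graph]
        done = [r for r in shown if r.answer is not None]
        correct = [r for r in done if r.answer == r.correct_answer]
        summary[graph] = {
            "completion": len(done) / len(shown),
            "accuracy": len(correct) / len(done) if done else 0.0,
            "mean_seconds": sum(r.seconds for r in done) / len(done) if done else 0.0,
        }
    return summary


# Hypothetical responses for two graphs labeled "A" and "B".
responses = [
    Response("A", "yes", "yes", 10.0),
    Response("A", "no", "yes", 20.0),
    Response("A", "yes", "yes", 30.0),
    Response("A", None, "yes", 45.0),   # timed out
    Response("B", "no", "no", 8.0),
    Response("B", "no", "no", 12.0),
]
summary = summarize(responses)
print(summary)
```

Comparing the per-graph summaries side by side addresses the research question directly: the graph with higher accuracy and lower time on task is the one users read most rapidly and accurately.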
Formative user research
How do users currently understand code churn? What kinds of information are most important to users when monitoring code churn?
One-on-one user interviews with software developers and project managers