Graphiti

Back in college, I stumbled across a great tool called GraphViz. I used it to generate class diagrams from the source of a PHP application I was working on. More recently, I’ve found it useful again for visualizing the runtime relationships between objects. GraphViz can turn a plaintext description of relationships into a graph image. Not “graph” as in “charts and graphs”, but “graph” as in “computer science graph”. Nodes and edges:

Today, we’ll see how to easily generate images like this. Along the way, we’ll see an interesting use case for anonymous objects.

The Dot File Format

The image above can be generated from the following text file, using a GraphViz tool called dot.exe. Note the use of the word ‘digraph’. A graph is anything made up of nodes and edges; a digraph is a graph where the edges have direction from a source node to a destination node.

digraph "Simple" {
  "a" -> "b";
  "b" -> "c";
  "b" -> "d";
}

To create the image from the dot file, run the command: dot.exe Simple.dot -Tpng -o Simple.png.

The dot file format supports a lot of options. For instance, we can customize the original dot file like so:

digraph "Fancy" {
  "b"[shape="box"];
  "a" -> "b";
  "b" -> "c"[label="1", color="red"];
  "b" -> "d";
}

Optional attributes appear in square brackets. Here, we give one of our edges a label, alter the colors, and change one of the nodes from an ellipse to a box:

One can imagine generating these dot files programmatically with a StringBuilder. Doing so would be a little monotonous, especially when the optional attributes like shapes and colors are included. You have to busy yourself with the dot format’s quoting rules as well as C#’s quoting rules. Even when you succeed, you’ll be left with some ugly code. I’d much rather generate the above two dot files like so:

var simple = new Digraph("Simple");

var a = new Node("a");
var b = new Node("b");
var c = new Node("c");
var d = new Node("d");

simple.AddEdge(a, b);
simple.AddEdge(b, c);
simple.AddEdge(b, d);

var dotFileContent = simple.ToString();

var fancy = new Digraph("Fancy");

var a = new Node("a");
var b = new Node("b", new { shape = "box" });
var c = new Node("c");
var d = new Node("d");

fancy.AddEdge(a, b);
fancy.AddEdge(b, c, new { label = 1, color="red" });
fancy.AddEdge(b, d);

var dotFileContent = fancy.ToString();

Implementing the Digraph Class

First, it’ll be handy to define an extension method to apply the dot quoting rules to any .NET value. The double-quote character is special in dot files and must be preceded with a slash. Of course, double-quotes and slashes are also meaningful in C# strings, so we have to double-escape these in the substitution:

public static class QuotationExtensions
{
    public static string Quoted(this object o)
    {
        var s = (o ?? "").ToString();
        return "\"" + s.Replace("\"", "\\\"") + "\"";
    }
}

Next, take a look at the way the optional parameters were supplied in the earlier code sample: new { label = 1, color="red" }. This is an anonymous object. It creates an instance of a class whose name we never see, with read-only properties named “label” and “color”. We pass this to a method that doesn’t know what type is being passed to it, but that’s OK because we can use reflection to inspect such an object. Here’s a helper that can look at these anonymous objects and turn them into an equivalent dictionary, whose ToString produces the dot syntax for attributes:

public class AttributeList
{
    private readonly IReadOnlyDictionary<string, object> _attributes;

    public AttributeList(object attributes)
    {
        _attributes = attributes == null
                          ? new Dictionary<string, object>()
                          : attributes
                                .GetType()
                                .GetProperties()
                                .ToDictionary(x => x.Name, x => x.GetValue(attributes, null));
    }

    public bool Empty
    {
        get { return !_attributes.Any(); }
    }

    public override string ToString()
    {
        var pairs = _attributes.Select(x => string.Format("{0}={1}", x.Key, x.Value.Quoted())).ToArray();

        var result = string.Join(", ", pairs);

        if (result == "")
            return "";

        return "[" + result + "]";
    }
}

I could have instead provided a real named class with all the possible dot properties already defined. Doing so would be useful as it would give the developer statement completion suggestions, but the dot file format is actually quite extensive. In this case, it’s a very reasonable tradeoff to use anonymous objects and reflection for these node and edge attributes. Even with statement completion, you’d find yourself frequently looking up details in the documentation anyway, so the return on investment would be ridiculous.

We can represent a single node in the graph like so:

public class Node
{
    public Node(string name, object attributes = null)
    {
        Name = name;
        Attributes = new AttributeList(attributes);
    }

    public string Name { get; private set; }
    public AttributeList Attributes { get; private set; }

    public override string ToString()
    {
        return string.Format("{0}{1}", Name.Quoted(), Attributes);
    }
}

An Edge is defined by the two Nodes it connects:

public class Edge
{
    public Edge(Node from, Node to, object attributes)
    {
        From = from;
        To = to;
        Attributes = new AttributeList(attributes);
    }

    public Node From { get; private set; }
    public Node To { get; private set; }
    public AttributeList Attributes { get; private set; }

    public override string ToString()
    {
        return string.Format("{0} -> {1}{2}", From.Name.Quoted(), To.Name.Quoted(), Attributes);
    }
}

Armed with all of the above classes, we can finally implement the Digraph class with ease:

public class Digraph
{
    private readonly string _name;
    private readonly HashSet<Node> _nodes = new HashSet<Node>();
    private readonly HashSet<Edge> _edges = new HashSet<Edge>();

    public Digraph(string name)
    {
        _name = name;
    }

    public void AddEdge(Node from, Node to, object attributes = null)
    {
        _nodes.Add(from);
        _nodes.Add(to);
        _edges.Add(new Edge(from, to, attributes));
    }

    public override string ToString()
    {
        var dot = new StringBuilder();

        dot.AppendLine("digraph " + _name.Quoted() + " {");

        foreach (var node in Nodes)
                dot.AppendLine(node);

        foreach (var edge in Edges)
            dot.AppendLine(edge);

        dot.AppendLine("}");

        return dot.ToString();
    }

    private IEnumerable<string> Nodes
    {
        get { return _nodes.Where(n => !n.Attributes.Empty).Select(Line); }
    }

    private IEnumerable<string> Edges
    {
        get { return _edges.Select(Line); }
    }

    private static string Line(object content)
    {
        return string.Format("  {0};", content);
    }
}

Criticism

Since the scope of this sample program is entirely devoted to generating the dot file format, I don’t mind the fact that I mixed state (node and edge definitions) with presentation (string formatting in each ToString override). An earlier implementation put all of the string formatting in the Digraph class, but it was fairly ugly. If the requirements expanded such that we needed to use the node/edge structure for things in addition to presentation in dot format, I would want to move all string formatting out of the Node, Edge, and Digraph classes, having a dedicated class just for traversing a Digraph to produce a dot file.