voidflower

Modelling Small Systems with UML and dot/graphviz

The first (but hopefully not last) geeky instructional-type thing I’ll post.

What is UML, graphviz, dot?

UML stands for Unified Modeling Language. A picture/diagram is used to represent a ‘system’, by which I mean a program in this case. Wikipedia says “UML includes a set of graphical notation techniques to create visual models of software-intensive systems.”

Graphviz is a free set of software that draws graphs from text descriptions. Dot is the language those descriptions are written in (and also the name of the graphviz program that reads them). The graphviz software takes a description of a graph written in dot, and gives you your image in various formats.

Dot is a simple language, and uses intuitive labels and values for things like edges and nodes. The hardest part is thinking of what you need, and finding the correct spelling of the attributes to apply to your node/edge. (But that can be found on the documentation site – and google).

To make any graphs, you need graphviz (there are Windows binaries, and packages in popular Linux distributions). The dot tool is included with graphviz. You also need a text editor. There are GUIs for making graphs (and OSes like OSX have larger programs that use graphviz) but I’m going to be on the command line in Ubuntu.

First graph

There are two types of graphs (that I’ve seen so far and used): directed graphs and undirected graphs. Directed means the edges are arrows, so each one points from one node to another. Undirected means there are no arrows between the nodes, and the path from one node to another goes both ways.
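In dot, the difference comes down to the keyword (digraph vs graph) and the edge operator (-> vs --). Here’s a minimal Python sketch that prints the same edge list both ways. The function name to_dot and the node names are made up for illustration.

```python
def to_dot(name, edges, directed=True):
    """Return DOT source for a list of (tail, head) edges."""
    keyword = "digraph" if directed else "graph"
    edgeop = "->" if directed else "--"
    lines = [f"{keyword} {name} {{"]
    for tail, head in edges:
        lines.append(f"    {tail} {edgeop} {head}")
    lines.append("}")
    return "\n".join(lines)


edges = [("A", "B"), ("B", "C")]
print(to_dot("D", edges, directed=True))   # arrows from A to B to C
print(to_dot("U", edges, directed=False))  # plain lines, no arrowheads
```

Mixing them up (a -- inside a digraph, say) is a syntax error as far as dot is concerned, so it’s worth picking one up front.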

So let’s make a graph now. The hello world of graphs:

digraph G {
    node [ shape="box" ]
    edge [ arrowhead="dot" ]

    Hello [ label="Hello" ]
    World [ label="World", color="blue" ]

    Hello -> World
}


On the command line, I ran this command:

$ dot -Tpng -o helloworld.png helloworld.dot

-T specifies the type of output. I chose a PNG image. Other types include, but are not limited to, PDF, PS, BMP, GIF, JPEG… the list goes on. The -o allows me to specify an explicit output file. Last on the line is the input file – my description of the graph.

and this makes:

[image: Hello -> World]
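If you want the same graph in several formats, you can script the dot command. A small Python sketch, assuming dot is on your PATH. The names dot_command and render are my own, and png, svg and pdf are just three of the formats -T accepts.

```python
import shutil
import subprocess
from pathlib import Path


def dot_command(source, fmt):
    """Build the argv for: dot -T<fmt> -o <source stem>.<fmt> <source>"""
    output = str(Path(source).with_suffix("." + fmt))
    return ["dot", "-T" + fmt, "-o", output, source]


def render(source, formats=("png", "svg", "pdf")):
    """Run dot once per format; print a hint and stop if dot is missing."""
    if shutil.which("dot") is None:
        print("dot not found; install graphviz first")
        return
    for fmt in formats:
        subprocess.run(dot_command(source, fmt), check=True)


# Show the command that would be run for a PNG.
print(" ".join(dot_command("helloworld.dot", "png")))
```

Calling render("helloworld.dot") would then leave helloworld.png, helloworld.svg and helloworld.pdf next to the source file.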

As you may have noticed, I didn’t say anything about where things were placed, or whether Hello is below World or not. The point of this graphing software is to do that stuff for you. It magically figures out (via fancy algorithms) where to put things to occupy the space in the most efficient manner, without having things overlap or look ugly.

The line-by-line:
digraph G says you’re going to make a new digraph (short for directed graph) called G. The node and edge lines set default attributes for all the nodes and edges in this graph. Hello and World are nodes. They have attributes: label, and for World, color. By default, text and outlines are black, and nodes aren’t filled in. color sets the outline, so I’ve changed the outline of World to blue. I could have set other things, like a background colour (fillcolor, together with style="filled") or a text colour (fontcolor). The label goes inside the node. One of the most important lines, Hello -> World, describes the relationship between Hello and World. The edgeop (->) here tells us that Hello points to World. Since I chose “dot” for my arrowhead above, I get a line with a circle at the end. Without relationships, directed graphs are pretty pointless.
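Here’s a rough Python sketch of writing a node statement with a few more attributes: color for the outline, fillcolor plus style="filled" for the background, and fontcolor for the text. node_stmt is a hypothetical helper of mine, not part of graphviz, and it assumes the values contain no double quotes.

```python
def node_stmt(name, **attrs):
    """Format one DOT node statement, e.g. World [ color="blue" ]."""
    if not attrs:
        return name
    pairs = ", ".join(f'{key}="{value}"' for key, value in attrs.items())
    return f"{name} [ {pairs} ]"


# fillcolor only shows up when style="filled" is also set.
print(node_stmt("World", label="World", color="blue",
                style="filled", fillcolor="lightblue", fontcolor="navy"))
```

Paste the printed line into the digraph in place of the World line and the box should come out filled in.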

Play around with all the options and you can make some pretty cool graphs. I made a digraph to model my course plan for university a while ago located here. Each node is a course, and the relationships define prerequisites. I coloured a bunch of nodes the same colours, and those represent the terms I will do those courses in. Other examples of digraphs include things like family trees, maps, fancy to-do lists, graph theory stuff, visualisations of various data structures.

UML and this digraph stuff

So for my last CS246 project, we need to make a UML diagram to fit on an 8.5×11 sheet of paper. We’re handing it in electronically, so it must be done on the computer at some point. I really didn’t feel like writing it on paper, then opening up gimp/paint to draw boxes and lines for 3hrs. I’ve used dot before, so I know it’s much easier to open a text editor and write the descriptions of these graphs. It’s very easy to modify, then just compile in less than a second.

There are 7 classes, and 2 subclasses for this project design. So I need 9 boxes to draw the classes, then shove arrows between them in a way that describes how the classes interact with each other. UML’s boxes are divided into at least 2 sections. So far I’ve only done one shape: the box. Most of the other shapes graphviz offers seem to hold only one label. How do I fit so much stuff into a node?

There is a special shape called a ‘record’. It allows for a special definition of a label that makes sections out of your node. The sections are formed when a pipe (|) character is seen in the label. Read more about the record shape.

digraph C {
    SomeClass [
        shape="record"
        label="{SomeClass|some attributes\l|some operations/methods\l}"
    ]
}

[image: class]

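The description looks messy for that, but the important point here is that the pipes separate the label into sections. The outer braces flip the layout direction, so the sections stack top to bottom instead of sitting side by side. The \l in there means: end the line, and align it to the left.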
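Typing those labels by hand gets fiddly, so here’s a minimal Python sketch that builds one from lists. The names escape and record_label are made up. It assumes the usual three-section class box, and it backslash-escapes the characters that record labels treat specially (braces, pipes and angle brackets).

```python
def escape(text):
    """Backslash-escape the characters that are special in record labels."""
    for ch in "{}|<>":
        text = text.replace(ch, "\\" + ch)
    return text


def record_label(name, attributes, operations):
    """Build a three-section UML class label for shape="record"."""
    def section(rows):
        # Every row ends in \l so it is left-aligned on its own line.
        return "".join(escape(row) + "\\l" for row in rows)

    return "{" + escape(name) + "|" + section(attributes) + "|" + section(operations) + "}"


print(record_label("SomeClass", ["- count : Integer"], ["+ reset() : void"]))
```

The escaping matters as soon as a type like vector<int> shows up in an attribute, since the angle brackets would otherwise be read as part of the record syntax.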

With record, I can make my classes now, and then join them like a normal digraph. I’ll make a sample, more interesting system to model instead of the one I’m doing for my project.

Sample Model – Pokemon

Pokemon is cool. Let’s use pokemon for an example. I’ll model an example party of pokemon and a trainer. The key things to look for here are the common methods (attacks) between pokemon, and the connection between me (the trainer) and my pokemon. You will see the basics of UML and a simple system model in this example.

digraph Party {
    node [ shape="record" ]
    // I want the default arrowhead, so I won't set any
    // global (until changed later) edge descriptions

    Trainer [
        // You can use string concatenation to split strings into smaller chunks
        // party is an attribute, so it goes before the second pipe
        label="{Trainer|- name : String\l- money : Integer\l" +
                 "- party : String[1..*]\l|" +
                 "+ callPokemon( pokemon : Pokemon ) : bool\l" +
                 "+ returnPokemon ( pokemon : Pokemon ) : void\l}"
    ]

    Pokemon [
        label="{Pokemon|- name : String\l- health : Integer\l|" +
                 "+ tackle : Attack\l}"
        color="orange"
    ]

    Charmander [
        label="{Charmander|\l|+ ember : Attack\l}"
        color="red"
    ]

    Vaporeon [
        label="{Vaporeon|\l|+ surf : Attack\l}"
        color="blue"
    ]

    // Finally, the relationships
    // Vaporeon is a Pokemon
    Vaporeon -> Pokemon [ arrowhead="empty" ]

    // So is Charmander
    Charmander -> Pokemon [ arrowhead="empty" ]

    // Trainer can have up to 6 pokemon
    // Each pokemon is caught by only one trainer
    Trainer -> Pokemon [
        label="caught"
        headlabel="1..6"
        taillabel="1"
        arrowhead="none"
    ]
}

Once again:

dot -Tpng -o pokemon.png pokemon.dot

Okay, so that took a long time. I should have been working on something else (see sidenote). It looks like a lot of stuff up there in order to describe an image you could have drawn faster, but for much larger diagrams, this is faster (you can write programs to write dot code for you). The product of all this is:

[image: pokemon]
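On the “programs that write dot code” idea, here’s a short Python sketch that generates one subclass node and one inheritance edge per entry in a table. party_dot is a made-up name, and the Pikachu row is placeholder data I added to show the table growing. It isn’t in the diagram above.

```python
SPECIES = {
    # name: (attack, outline colour)
    "Charmander": ("ember", "red"),
    "Vaporeon": ("surf", "blue"),
    "Pikachu": ("thundershock", "gold"),  # placeholder row
}


def party_dot(species):
    """Return DOT source with a record node and an inheritance edge per species."""
    lines = [
        "digraph Party {",
        '    node [ shape="record" ]',
        '    Pokemon [ label="{Pokemon|- name : String\\l- health : Integer\\l|+ tackle : Attack\\l}" ]',
    ]
    for name, (attack, colour) in species.items():
        lines.append(f'    {name} [ label="{{{name}|\\l|+ {attack} : Attack\\l}}", color="{colour}" ]')
        lines.append(f'    {name} -> Pokemon [ arrowhead="empty" ]')
    lines.append("}")
    return "\n".join(lines)


print(party_dot(SPECIES))
```

Redirect the output to a .dot file and run it through dot as before. Adding a pokemon is then one new row instead of two new blocks.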

As for explaining things, I will assume you now have some knowledge of UML from the Wikipedia page, and understand what the arrows and lines mean as well as the basic attribute types.

The most important segment of the description above is:

    Trainer -> Pokemon [
        label="caught"
        headlabel="1..6"
        taillabel="1"
        arrowhead="none"
    ]

This tells us about the relationship between the Trainer and the multiple pokemon, called ‘caught’. Since a party of pokemon can contain from 1 to 6 pokemon (1 to 6 caught pokemon), I’ve set headlabel to 1..6 at the Pokemon end. Each pokemon can be caught by only one trainer, so taillabel is 1. These are called ‘multiplicities’ in UML. They give the viewer some more detail about how your system/program works. If I had put headlabel="*" that would mean that a Trainer could catch 0 or more pokemon (all 150+!).
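In code, those head and tail labels turn into rules the classes have to enforce. Here’s a loose Python sketch of both ends of the line. The class layout, the catch method and the error messages are my own guesses at an implementation, not the project’s code.

```python
MAX_PARTY = 6  # the "6" in headlabel="1..6"


class Pokemon:
    def __init__(self, name):
        self.name = name
        self.trainer = None  # taillabel="1": one trainer once caught


class Trainer:
    def __init__(self, name, starter):
        self.name = name
        self.party = []
        self.catch(starter)  # the "1" in 1..6: a party is never empty

    def catch(self, pokemon):
        if pokemon.trainer is not None:
            raise ValueError(f"{pokemon.name} already has a trainer")
        if len(self.party) >= MAX_PARTY:
            raise ValueError("party is full")
        pokemon.trainer = self
        self.party.append(pokemon)


me = Trainer("me", Pokemon("Charmander"))
me.catch(Pokemon("Vaporeon"))
print(len(me.party))
```

With headlabel="*" instead, the MAX_PARTY check and the required starter would both go away.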

Charmander -> Pokemon [ arrowhead="empty" ]
Vaporeon -> Pokemon [ arrowhead="empty" ]

These two lines say that Charmander and Vaporeon are subclasses of Pokemon: a parent/child relationship, drawn with a hollow arrowhead pointing at the parent. Pokemon contains the common operations (attacks) and attributes (name, health). In a program, Vaporeon and Charmander would inherit from the Pokemon base class.
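For the curious, here’s roughly what that inheritance could look like in Python. The project’s own code isn’t shown here, so treat the method bodies, return strings and health numbers as placeholders.

```python
class Pokemon:
    """Base class: the attributes and attacks every pokemon shares."""

    def __init__(self, name, health):
        self.name = name
        self.health = health

    def tackle(self):
        return f"{self.name} used tackle"


class Charmander(Pokemon):
    def ember(self):
        return f"{self.name} used ember"


class Vaporeon(Pokemon):
    def surf(self):
        return f"{self.name} used surf"


vaporeon = Vaporeon("Vaporeon", 130)
print(vaporeon.tackle())  # inherited from Pokemon
print(vaporeon.surf())    # defined on Vaporeon only
```

Each hollow arrow in the diagram corresponds to one of the (Pokemon) bits in the class lines.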

Conclusion

Pokemon is awesome. The record shape makes UML possible in dot.


Sidenote: it’s 6am now and I haven’t slept yet, for one, and I should have been writing the test documentation and finishing the UML diagram for the project. Instead, I thought I’d make this tutorial. I’ve also eaten a lot of carrots in this time.

Idea: turn this into a mathNEWS article!!11!

Comments

4 responses to Modelling Small Systems with UML and dot/graphviz

  • Great post, going over the basics is something that is missed in a lot of languages. Simple explanations like this are very welcome when just trying to grasp the subject.

    And pokémon examples ftw!

    Michael Hartog on December 6, 2009 at 22:52 (94 days ago)

  • Upvoted for pokemon.

    The next step would have been to write a parser to read in all the .h files in the current directory and generate the UML automatically from that.

    Tony on December 6, 2009 at 23:06 (94 days ago)

  • Yes, a parser for .h files is definitely a good step after this. For this small project though, just writing it by hand would take less time than a parser. Had we been required to produce more diagrams for other assignments, I would have made one :)

    Rebecca on December 7, 2009 at 11:32 (93 days ago)

  • Nice post, wish I knew that at the time. I used Dia which does the trick, but the graphviz output looks cleaner.

    Paul on December 10, 2009 at 14:02 (90 days ago)
