On relational tables

← Back to Kevin's homepagePublished: 2018 Aug 25

I’ve been exploring Alloy for the past few months and it’s raised the interesting problem of how to best display relational data within tables.

There are plenty of data visualization resources on designing charts (scatter plots, bar charts, line charts, etc.) and tables (summary statistics, pivot tables) of quantitative data, but I haven’t been able to find much about tables of relational, mostly non-quantitative data.

I’m interested in exploring something akin to the grammar of graphics, but for relational data tables.

This should make more sense after an example.

The jug puzzle

How do you measure 4 quarts using just a 5 quart jug and 3 quart jug?

We can find the solution by encoding this problem in Alloy, which will find a solution. The latest version, Alloy 5, will also visualize the solution using tables:

Alloy table output

(I’ve truncated the full output since it doesn’t really matter — it’s easy to download and run Alloy if you want to see the full example or other visualizations of the solution.)

Here we have three tables: One with our two jugs, one with an instance of a Pour operation (from jug0 to jug1), and one with the list of steps required to solve the puzzle (start with the jugs empty, then fill the 5-quart jug, pour as much as you can into the 3-quart jug, etc.).

The tables are shown in a completely normal form: There’s one table for every signature, with rows corresponding to atoms and columns corresponding to fields (if a field has multiple values, the values are stacked; if a value is itself a relation, subcolumns are drawn; the jugAmounts field of State demonstrates both situations).

Together these tables show the full solution, but it’s not particularly easy to read or understand.

(That is, if I were running some kind of puzzle solution factory, I’d want my employees to have clearer instructions.)

I can think of three kinds of transformations to this normalized view that would be helpful for specifying clearer presentational tables.

Specifying joins

The first transformation is joining; combining fields from multiple signatures into a single table.

In the jug example, joins must be performed manually by looking between tables. E.g., the State2 atom in the State table is associated with the pour0 atom, but one must refer to the Pour table to see what jug you’re supposed to pour into what.

Joining would not only allow us to avoid a manual lookup, it would also allow us to hide entirely the pour0 atom. This would definitely be an improvement, since a Pour atom is synthetic. That is, unlike a jug atom (which represents a physical jug), the Pour atoms (and all other atoms derived from op) exist only to encode instructions that the puzzle-solver needs to perform.

Joining these instructions directly onto the State table as new columns would make things clearer:

Joined table values

Then one could look at one table and read out everything they need, row-by-row: Fill jug0, pour from jug0 into jug1, empty jug0, etc.

A grammar would potentially need to handle the following questions around joins:

Render functions

The second kind of transformation to improve presentation is transformation of the instance data itself (the atoms, tuples, and relations).

Rather than an explicit join to introduce new columns, one could specify a render function that maps, e.g., the pour0 atom to the string “Pour from jug0 to jug1”. The resulting “table” would then be a simple list of steps.

This kind of functionality would be particularly useful for rendering out complex presentational markup. E.g., instead of displaying an atom as Person7, the function could examine the appropriate fields (or other relations) to generate markup containing a profile picture, full name, etc.

Open questions:

Untidying

Alloy’s table visualization is “tidy” in Hadley Wickham’s tidy data sense: columns correspond to fields and rows to atoms and their field values.

However, an “untidy” presentation may be an improvement. Our example problem only has two jugs, and it’d be clearer to follow what’s going on if their amounts were displayed as columns:

Untidy table

Here, we’ve essentially “transposed” State‘s jugAmounts field (a relation of type Jug one -> Int) from nested rows (Alloy’s normalized form) into top-level columns. (See section 3.1 in Hadley’s paper for more on this situation.)

Open questions:

Other questions: Essence vs. Accident

Alloy finds instances that satisfy the constraints of the provided model, but an individual instance can’t always reflect the full nature of the model. E.g., this table of the Person signature:

| Person  | Name  | Snack  |
|---------+-------+--------|
| Person0 | Kevin |        |

Doesn’t indicate the arity of the Name relation (Can I have multiple names? Must I have at least one name?). Is the empty Snack an accident of this instance, or does the model forbid me from having any snacks in any instance? (I hope not.)

Such constraints could be visualized using header icons, cell shading, placeholder values, etc.

However, other constraints may be more difficult to show.

E.g., Alloy subtypes are disjoint, so in the joined table example above all, exactly one set of columns from the Op-derived signatures (Pour, Empty, Fill) may be filled for any given row.

How could this be indicated visually?

Of course, not all model constraints are going to be possible to draw visually (if they could be, you probably would’ve already solved your problem on paper rather than turning to a formal modeling tool + SAT-solvers). Would visualizing simple constraints but omitting the impossible-to-visualize complex constraints cause people to misinterpret the underlying model by conveying a false sense of complete understanding?

Other questions: Ordering

(Arguably this question is a special case of essence/accident, but it’s worth calling out on its own.)

Rendering a collection as a list/table requires ordering the elements. But since Alloy relations are sets, their order is arbitrary.

This will probably only become an issue when you actually do want an ordering between some atoms. (Like in the jug puzzle, where we want to see the State atoms in order, otherwise we don’t really have a solution.)

Fields (columns) can be ordered according to source order (i.e., based on where they are defined in the textual Alloy code), though there may be presentational situations where you’d want to specify a different order.

For example, should the grammar be sufficiently expressive to allow one to render a sudoku puzzle as a 9x9 grid? (Of course you can solve sudoku with Alloy! It’s not the most machine-efficient way to do it, but since I don’t have a CS degree or know how to C++ safely, it’s the most efficient way for me to do it.)

Other questions: Grammar embedding

It’s not clear to me whether a presentational grammar should be “embedded” within Alloy, whether it should be implemented as a standalone program that consumes relational data from any source, or some hybrid approach

Embedding in Alloy:

Standalone program / library:

Further notes

Email me if you have any questions, relevant personal experiences, or want to collaborate.