Working with Anorm and hierarchical data

Updated 2 January 2016: Change to reflect recent commits to play-hierarchical-data application. Update stale links.

Play Framework’s Anorm library offers Scala applications a way to access relational data. Anorm’s design is minimalist, and so Anorm-reliant applications end up dealing with some of the nuances of relational databases.

One of those nuances concerns SQL joins: joining tables in a SQL query tends to blur the hierarchical, many-to-one relationship they may have.

For example, when the following tables are joined on country ID to combine city data with country names…

city country
id country_id name population id name
1 1 Berlin 3462000 1 Germany
2 1 Hamburg 1796000 2 France
3 1 Cologne 1006000 3 Spain
4 2 Lyon 1488000
5 2 Marseille 1489000
6 2 Paris 10620000
7 3 Barcelona 5570000
8 3 Madrid 6574000

Population values from urban agglomeration data, United Nations’ 2011 World Urbanization Prospects.

…the hierarchy between countries and cities is not immediately evident in the result set.

But what if that hierarchy is critical to the desired presentation of the data? This blog post offers a solution. It walks through play-hierarchical-data, a simple application written with Play, Anorm, and an in-memory H2 database. The application retrieves the table data from above and displays it, placing cities in descending order of population by country.

play-hierarchical-data application

play-hierarchical-data on GitHubCode from the application is interspersed below. The full application is available on GitHub.

Case classes and parsers

The application follows the convention of the computer database sample application, which uses Anorm. In particular, it establishes Country and City as case classes corresponding to the database tables. Using Anorm’s parser API, it then builds up to a list method that joins the tables with SQL and returns a List[(Country, City)].

Sorting

After invoking the list() method…


val countriesAndCitiesUnsorted: List[(Country, City)] =
  City.list

…the application sorts the resulting data based on the desired presentation. It applies a primary sort on country name; where the country names are the same, it applies a secondary sort on population:


val countriesAndCities = countriesAndCitiesUnsorted.sortWith {

  case ((country0, city0), (country1, city1)) =>

    country0.name < country1.name ||
      (country0.name == country1.name &&
       city0.population > city1.population)
}

Restoring the hierarchy

To restore the data’s hierarchical relationship, the application relies on a foldLeft() operation. The operation loops over the List’s (Country, City) tuples and accumulates a Map[Country, Seq[City]]. In particular, on each loop iteration, a tuple’s City is appended to the existing cities of the tuple’s Country (if any):


val countriesToCities =
  countriesAndCities.foldLeft(ListMap.empty[Country, Seq[City]]) {

  case (theMap, (country, newCity)) => {

    val existingCities =
      theMap.getOrElse(country, Seq.empty[City])

    theMap + ((country, existingCities :+ newCity))
  }
}

Of the various immutable Map types, the application relies on a ListMap, as it’s guaranteed to maintain insertion order, and therefore the List[(Country, City)]’s sorting.

Displaying the result

With a simple template that iterates over the Map’s Country keys and their Seq[City] values, the application produces the desired output:

Screenshot of play-hierarchical-data

Closing thoughts: Why not groupBy()?

The List.foldLeft() operation described above groups data. So, why not work with List.groupBy() instead, which also produces a Map? That method, and the Map it produces, do not attempt to maintain the List’s order, which is critical to achieving the desired presentation.