Updated 2 January 2016: Change to reflect recent commits to play-hierarchical-data
application. Update stale links.
Play Framework’s Anorm library offers Scala applications a way to access relational data. Anorm’s design is minimalist, and so Anorm-reliant applications end up dealing with some of the nuances of relational databases.
One of those nuances concerns SQL joins: joining tables in a SQL query tends to blur the hierarchical, many-to-one relationship they may have.
For example, when the following tables are joined on country ID to combine city data with country names…
city | country | |||||
---|---|---|---|---|---|---|
id | country_id | name | population | id | name | |
1 | 1 | Berlin | 3462000 | 1 | Germany | |
2 | 1 | Hamburg | 1796000 | 2 | France | |
3 | 1 | Cologne | 1006000 | 3 | Spain | |
4 | 2 | Lyon | 1488000 | |||
5 | 2 | Marseille | 1489000 | |||
6 | 2 | Paris | 10620000 | |||
7 | 3 | Barcelona | 5570000 | |||
8 | 3 | Madrid | 6574000 |
Population values from urban agglomeration data, United Nations’ 2011 World Urbanization Prospects.
…the hierarchy between countries and cities is not immediately evident in the result set.
But what if that hierarchy is critical to the desired presentation of the data? This blog post offers a solution. It walks through play-hierarchical-data
, a simple application written with Play, Anorm, and an in-memory H2 database. The application retrieves the table data from above and displays it, placing cities in descending order of population by country.
play-hierarchical-data application
Code from the application is interspersed below. The full application is available on GitHub.
Case classes and parsers
The application follows the convention of the computer database sample application, which uses Anorm. In particular, it establishes Country
and City
as case classes corresponding to the database tables. Using Anorm’s parser API, it then builds up to a list
method that joins the tables with SQL and returns a List[(Country, City)]
.
Sorting
After invoking the list()
method…
val countriesAndCitiesUnsorted: List[(Country, City)] = City.list
…the application sorts the resulting data based on the desired presentation. It applies a primary sort on country name; where the country names are the same, it applies a secondary sort on population:
val countriesAndCities = countriesAndCitiesUnsorted.sortWith { case ((country0, city0), (country1, city1)) => country0.name < country1.name || (country0.name == country1.name && city0.population > city1.population) }
Restoring the hierarchy
To restore the data’s hierarchical relationship, the application relies on a foldLeft()
operation. The operation loops over the List
’s (Country, City)
tuples and accumulates a Map[Country, Seq[City]]
. In particular, on each loop iteration, a tuple’s City
is appended to the existing cities of the tuple’s Country
(if any):
val countriesToCities = countriesAndCities.foldLeft(ListMap.empty[Country, Seq[City]]) { case (theMap, (country, newCity)) => { val existingCities = theMap.getOrElse(country, Seq.empty[City]) theMap + ((country, existingCities :+ newCity)) } }
Of the various immutable Map
types, the application relies on a ListMap
, as it’s guaranteed to maintain insertion order, and therefore the List[(Country, City)]
’s sorting.
Displaying the result
With a simple template that iterates over the Map
’s Country
keys and their Seq[City]
values, the application produces the desired output:
Closing thoughts: Why not groupBy()?
The List.foldLeft()
operation described above groups data. So, why not work with List.groupBy()
instead, which also produces a Map
? That method, and the Map
it produces, do not attempt to maintain the List
’s order, which is critical to achieving the desired presentation.