A Sneak Peek into Metabase

Published on Nov 12, 2020
About a year ago, punctum books started its transition to become a press working fully with open source infrastructure. Thanks to the ongoing effort of and collaboration with, we have been able to run our daily operations already mostly on open source platforms.

One of the areas in which our transition to open was more difficult to accomplish was in the field of book usage statistics.

Parenthetically, the idea that usage statistics of open access books provide any meaningful index of actual usage (for example, for academic libraries as a basis for their funding allocation, or for scholars as an indication of their “excellence”) is grossly misguided. Open access books are by definition free and untrackable. We should never hope or desire to track the many times that a PDF is copied within classrooms and workplaces, sent by Bluetooth, transferred over AirDrop, or smuggled on a USB stick. In this specific sense, usage stats are antithetical to the ethos of open access. Yet, as a press, we duly collect them whenever possible.

We chose not to host the public-facing PDFs of ebook monographs and edited collections on our own server, but via OAPEN. We made this decision not only with a view toward long-term preservation and archiving, but also because OAPEN provides the metadata export formats needed by libraries, such as ONIX and MARCXML. We hope to generate these formats in the near future through the open source metadata management system Thoth. Furthermore, OAPEN provides detailed, COUNTER-compliant usage statistics.

But OAPEN is not the only source of our usage data. In the past, we used multiple WordPress plugins to track book downloads from our website. We have recently started hosting parts of our catalog on JSTOR and Project Muse, and we also keep track of our print book sales through a variety of distribution platforms. All of these sources deliver usage stats in different formats, at different moments, with different categories, and in various data and file types, which often also change over time. It is therefore little surprise that we have until recently struggled to provide a complete answer to the simple question posed by many an author: “How is my book doing?”

After brainstorming with our friends at, in particular Silva Arapi, we decided to explore the use of Metabase for author- and library-facing usage statistics. Metabase is open source database visualization software that has allowed us to condense a decade's worth of usage data, collected in a plethora of formats, into simple summary tables and graphics. It also gives us a better understanding of how our books are used across the various platforms on which they are available. The ingest into the MySQL database from which Metabase presents the data is done through the free software tool phpMyAdmin.
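To give a rough idea of what bringing differently shaped exports into one table involves, here is a minimal Python sketch. It is not our actual ingest, which goes through phpMyAdmin into MySQL. The two source formats, every field name, the `book_usage` table, and the sample ISBN are hypothetical, and SQLite stands in for MySQL so the example runs on its own.

```python
import sqlite3
from collections import namedtuple

# One normalized usage record. All field names here are hypothetical.
UsageRow = namedtuple("UsageRow", "isbn platform month downloads")


def from_platform_a(rec):
    # Hypothetical export that reports a combined "YYYY-MM" period.
    return UsageRow(rec["isbn"], "platform_a", rec["period"], int(rec["requests"]))


def from_platform_b(rec):
    # Hypothetical export that reports year and month in separate columns.
    month = f"{int(rec['year']):04d}-{int(rec['month']):02d}"
    return UsageRow(rec["ISBN"], "platform_b", month, int(rec["count"]))


def load(rows):
    # In-memory SQLite as a stand-in for the MySQL database behind Metabase.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE book_usage "
        "(isbn TEXT, platform TEXT, month TEXT, downloads INTEGER)"
    )
    conn.executemany("INSERT INTO book_usage VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def monthly_totals(conn):
    # Total downloads per month across all platforms.
    cur = conn.execute(
        "SELECT month, SUM(downloads) FROM book_usage "
        "GROUP BY month ORDER BY month"
    )
    return cur.fetchall()


if __name__ == "__main__":
    isbn = "978-0-00-000000-0"  # made-up placeholder, not a real title
    rows = [
        from_platform_a({"isbn": isbn, "period": "2020-01", "requests": "10"}),
        from_platform_b({"ISBN": isbn, "year": "2020", "month": "1", "count": 5}),
        from_platform_b({"ISBN": isbn, "year": "2020", "month": "2", "count": 7}),
    ]
    for month, total in monthly_totals(load(rows)):
        print(month, total)
```

Once every source is mapped onto the same columns, a question such as total downloads per month across platforms becomes a single grouped query, which is the kind of summary a tool like Metabase can then chart.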

By the end of the year, we hope to have ingested all data and checked the integrity of all tables. From that moment onward we will be able to provide our authors and supporting libraries with usage data without manually searching through tabs of unordered data in Google Sheets, but with the important caveat that usage data collected from closed systems can never represent the true usage and value of open access books.

Metabase homescreen, with the different usage data tables on top.

Example visualization of total book downloads per month across different platforms (OAPEN data for 2020 is still missing). We only started collecting download stats in 2015.


