By Andrew Clifford
The design challenges Metrici has faced with summary data are common to lots of systems, and the solution we have adopted could be used elsewhere.
Metrici holds data across multiple nodes and members, each of which is physically implemented as a separate row in the database. This is very versatile but can be slow to gather, process and display large volumes of data. This problem is not unique to Metrici - many systems need to manipulate data in summary form, rather than record-at-a-time.
To overcome this, we standardised on a single way of representing summary data, and rebuilt components around this. Although not optimised for any single purpose, the ability to reuse the same components across the system has made everything much simpler.
The full standard is documented in the Metrici table structure documentation. As a simple example, the table:
Can be represented using the structure:
The same structure is used throughout Metrici for different purposes. It is used to efficiently extract data from the triple store, including incrementally recalculating only those parts of a table that have changed. It is used for set-based operations such as filtering and data aggregation. It is used to display data as HTML tables, blocks of content, or lists. It is used to pass data to charting routines. It is used as an in-memory structure, but it is also written to and read from node members to persist tables in the database. It is particularly effective with Metrici's derivation processing, efficiently providing summarised data that is automatically kept up to date. It is very fast indeed.
The standard we have adopted for representing tables in memory has proved really useful in many different scenarios, and I would recommend it to anyone facing similar design problems. Over the next few weeks I will show in more detail how it is used in different parts of Metrici.
Copyright © 2005-2014 Minimal IT Ltd. All rights reserved.