Minimal IT
Research, training, consultancy and software to reduce IT costs
15 March 2011

Removing data constraints

By Andrew Clifford

Breaking data down into a simpler form can overcome constraints on how data is accessed, stored and structured.

A couple of weeks ago I wrote that the technologies that support the semantic web could be very significant for mainstream IT.

The semantic web is an idea championed by web pioneer Tim Berners-Lee. The basic idea is that the web can contain data that can be read by computers, not just pages of text that can be read by humans. The vision of the semantic web is that computers can perform much more meaningful queries of the web, through better understanding of the meaning of the data.

What interests me most about the semantic web is some of the technologies being used to develop it, and how these could be applied in other ways.

One of the chief technologies is Resource Description Framework (RDF). RDF is conceptually simple. In a conventional database, data is arranged in tables, in which each thing is represented by a row, each type of data by a column, and each data value by a cell. RDF breaks this down further into triples, in which the first item of the triple identifies the thing, the second the data type, and the third the data value (in RDF terms, the subject, the predicate and the object). Each thing is represented by a uniform resource identifier (URI), which can be a web address. Each data type is also represented by a URI. The value can be a textual value, or a URI which identifies another thing.
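To make the idea concrete, here is a minimal sketch in Python of how one table row might be broken down into triples. The example.org addresses, the column names and the row_to_triples helper are all made up for illustration; real RDF tools have their own ways of doing this.

```python
# Hypothetical base address; any URI scheme you control would do.
BASE = "http://example.org/"

def row_to_triples(table, key, row):
    """Break one table row into (thing, data type, value) triples."""
    subject = f"{BASE}{table}/{row[key]}"        # URI identifying the thing
    triples = []
    for column, value in row.items():
        if column == key:
            continue                             # the key is already part of the thing's URI
        predicate = f"{BASE}schema/{column}"     # URI identifying the data type
        triples.append((subject, predicate, value))
    return triples

# One row of an imaginary customer table. The country value is itself a URI,
# so it identifies another thing rather than holding a textual value.
customer = {"id": 42, "name": "Acme Ltd", "country": "http://example.org/country/GB"}

triples = row_to_triples("customer", "id", customer)
for triple in triples:
    print(triple)
```

Each cell of the row becomes one triple, and the row's key disappears into the URI that identifies the thing.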

This way of arranging data has very interesting characteristics.

Because each piece of data has a web address, it can be referenced from anywhere. If you want some data, it does not have to be sent to you; you only need to know its address. These technologies could, for example, be used to publish reference data from enterprise systems to departmental systems in a simple, consistent way, using only web access technologies. This could overcome many of the problems with other methods, such as database sharing and file transfer.

Because the data structure and access methods are consistent, this approach overcomes constraints of how data is stored. For example, it would allow data stored in a file archive to be combined seamlessly with a current view from an operational database, or with publicly-published reference data, without having to merge them into a single database.
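As a rough illustration of this point, the sketch below treats three sources as plain Python sets of triples. The data, the example.org addresses and the value_of helper are assumptions made for the example; in practice each set would be read from an archive file, a database view or a published data set.

```python
# Hypothetical data from three sources that share the same triple structure.
S = "http://example.org/schema/"

archive = {  # e.g. read from a file archive
    ("http://example.org/order/1001", S + "customer", "http://example.org/customer/42"),
}
operational = {  # e.g. a current view from an operational database
    ("http://example.org/customer/42", S + "name", "Acme Ltd"),
    ("http://example.org/customer/42", S + "country", "http://example.org/country/GB"),
}
reference = {  # e.g. publicly published reference data
    ("http://example.org/country/GB", S + "label", "United Kingdom"),
}

# Because every source uses the same structure, combining them is a set union.
combined = archive | operational | reference

def value_of(triples, subject, predicate):
    """Return a value for the given thing and data type, or None if absent."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None

# Follow links across all three sources: order -> customer -> country -> label.
customer = value_of(combined, "http://example.org/order/1001", S + "customer")
country = value_of(combined, customer, S + "country")
country_name = value_of(combined, country, S + "label")
print(country_name)
```

The query does not need to know which source each triple came from, which is the sense in which the storage constraint goes away.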

Perhaps most interesting is that breaking data down into triples lets you build very rich data structures. It can achieve a level of data polymorphism or subtyping that is difficult to achieve in a relational database. It can deal with data and data-about-data in the same structures, creating self-describing data. You can build data structures that are more flexible and less constrained.
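The sketch below shows one way subtyping and data-about-data can sit in the same set of triples. The two URIs for type and subClassOf are the standard RDF and RDF Schema terms; everything under example.org, and the types_of helper, are made up for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace

triples = {
    # Data-about-data: a small type hierarchy, stored as ordinary triples.
    (EX + "Supplier", RDFS_SUBCLASS, EX + "Organisation"),
    (EX + "Organisation", RDFS_SUBCLASS, EX + "Party"),
    # Data: one thing, its declared type and its name.
    (EX + "party/7", RDF_TYPE, EX + "Supplier"),
    (EX + "party/7", EX + "schema/name", "Acme Ltd"),
}

def types_of(triples, thing):
    """Return the declared types of a thing plus all of their supertypes."""
    found = set()
    pending = [o for s, p, o in triples if s == thing and p == RDF_TYPE]
    while pending:
        current = pending.pop()
        if current in found:
            continue                  # already visited; also guards against cycles
        found.add(current)
        # Walk up the hierarchy using the same triples that hold the data.
        pending.extend(o for s, p, o in triples if s == current and p == RDFS_SUBCLASS)
    return found

all_types = types_of(triples, EX + "party/7")
print(sorted(all_types))
```

Adding a new subtype here means adding a triple, not altering a table definition, which is roughly what makes the structure less constrained.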

These approaches remove constraints on how data is accessed, how data is stored, and how data is structured. This could overcome the constraints of current databases, where data is stuck in a fixed structure within the database. Although a huge amount of engineering is required to make this approach as efficient and secure as current databases, it could become a very significant set of technologies for mainstream IT.
