Mathew Lowry

I stumbled upon a short video on the BBC of Tim Berners-Lee trying to explain the importance of the data web, aka semantic web, again. He himself says that he can’t say where it will lead us, as it is paradigm changing. True – but I can think of a few applications that anyone interested in the EU should know about.

I can’t embed the video here so you’ll just have to visit it.

I’ve seen him and others try to explain this many times, live and on video, but I’ve yet to see anyone really capture and convey the semantic web in a few short sentences. This is a problem, because the semantic web could have a much greater impact than today’s Web has had so far.

Why is it so hard to explain? In the video, he says that the web as we know it today would have been difficult to explain in the 1980s, but I disagree. The paper publishing metaphor would have worked as an on-ramp for almost everybody (“it’s like publishing a book, but it’s free, and online, and you can link books together”). Many people would also have been able to grok the significance of online databases (which already existed), search engines (“kinda like being able to search all the card catalogues at your local library from your PC, but searching a lot more”) and so on.

But there are no available metaphors for the semantic web, so it’s harder to ‘get’ without being a techie.

So what is the Semantic Web?

With the semantic web, the computing infrastructure – your PC, the websites you use – ‘understands’ the data it’s processing, and so can take data from across the Web and process it for you, extracting meaning you couldn’t have gotten yourself without a hell of a lot of work.

This data has to be ‘semantically encoded’, following now well-developed standards. If everything were published according to these standards, it would essentially turn the entire Web into one sophisticated database, rather than the collection of pages, stand-alone databases and so on that it is now.

I put ‘understand’ in quotes because we’re not talking artificial intelligence. Instead, information is encoded according to the publisher’s ontologies (like a category list); the ontologies themselves are published on the web; and they can be linked together, including across linguistic frontiers.
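For the techies, here is a minimal sketch of that idea in Python. It is a toy, not the real standards: facts are stored as (subject, predicate, object) triples, two publishers use their own ontologies, and a third set of statements links the ontologies together. Every URI, the SAME_AS term and the project data are invented for illustration; real semantic web standards such as RDF and OWL define their own vocabularies for this.

```python
# Toy triple store: every fact is a (subject, predicate, object) tuple.
# All URIs and data below are invented for illustration.
SAME_AS = "http://example.org/vocab/sameAs"  # hypothetical linking term

FR = "http://example.org/fr/"
UK = "http://example.org/uk/"

triples = [
    # A French publisher, using its own ontology.
    (FR + "projet/42", FR + "domaine", FR + "nanotechnologie"),
    # A British publisher, using a different one.
    (UK + "project/7", UK + "field", UK + "nanotechnology"),
    # Published links between the two ontologies.
    (FR + "domaine", SAME_AS, UK + "field"),
    (FR + "nanotechnologie", SAME_AS, UK + "nanotechnology"),
]


def equivalents(term, triples):
    """Return term plus every term linked to it by SAME_AS, in either direction."""
    seen = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != SAME_AS:
                continue
            # Follow the link both ways, since equivalence is symmetric.
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen


def subjects_with(predicate, obj, triples):
    """Find subjects whose predicate and object match, allowing linked terms."""
    preds = equivalents(predicate, triples)
    objs = equivalents(obj, triples)
    return sorted(s for s, p, o in triples if p in preds and o in objs)


# Ask the question in the British vocabulary only.
print(subjects_with(UK + "field", UK + "nanotechnology", triples))
```

Although the question is asked in the British vocabulary, the lookup finds both the British and the French project, because the links between the two ontologies are themselves just published data. No intelligence is involved, only lookups.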

So what? Can you give me an example?

Well, look at it this way:

Q: When you look for information to answer your question today, what do you do?

A: I type keywords relevant to my question into Google.

Q: What do you get back?

A: A list of websites that might have the answer(s) to my question, because they feature the keywords.

Q: Then what?

A: I visit the sites and try to find the answer. I usually find myself copying numbers into an excel file if I really need to process the information, because it’s all just text.

With the semantic web, you ask your question, and the Google-equivalent ‘reads’ the semantic web, compiles and compares the data from various sources, and gives you an answer.

If this looks like a gross simplification, that’s because it is. The information processing involved is quite complex – but then so is the Google algorithm. Given the data, the applications would follow. But the data won’t come without the applications to process it, giving us a classic chicken-and-egg problem.

The SW and the EU

I saw TBL try to explain the potential of the semantic web to various people in the EU institutions a few years ago, but no one in his audience was an IT engineer, so I don’t think many grasped what this could mean for most EU policies.

Can you imagine what it could mean for policymakers if they could quickly find out who was doing what across the EU in research, environmental protection, social policy, and a hundred other fields, and then process and query this information as easily as they use Google?

Currently, this sort of information is painfully and slowly extracted out of national and regional bodies by armies of consultants, brigades of steering committees and armoured divisions of task forces. Attempts to standardise data formats – e.g., “all EU countries must publish their research data following these categories on this website” – consistently fail, because each country categorises its information in its own way.

But if national data were published semantically, countries could still publish it as they see fit, and the semantic web could still bring it together as quickly as Google collects pages for you to read. More time could be spent on analysis, and less on collection. The structuring effect on everything from European research to the single market could be profound.
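To make that concrete, here is a rough Python sketch, again a toy under stated assumptions rather than a real system. Each country keeps its own category names, a separately published mapping ties each national category to a shared concept, and the totals fall out automatically. The countries’ category labels, the mapping, the concept names and all the figures are invented.

```python
from collections import defaultdict

# Each country publishes research spending in its own categories.
# All labels and figures are invented for illustration.
national_data = {
    "DE": [("Nanotechnologie", 120), ("Umweltschutz", 80)],
    "FR": [("nanotechnologies", 95), ("protection de l'environnement", 60)],
    "UK": [("nanotech", 110), ("blue-sky research", 30)],
}

# Hypothetical published mapping from national categories to shared concepts.
category_to_concept = {
    ("DE", "Nanotechnologie"): "nanotechnology",
    ("DE", "Umweltschutz"): "environmental_protection",
    ("FR", "nanotechnologies"): "nanotechnology",
    ("FR", "protection de l'environnement"): "environmental_protection",
    ("UK", "nanotech"): "nanotechnology",
}


def totals_by_concept(national_data, category_to_concept):
    """Sum amounts per shared concept; report categories with no mapping."""
    totals = defaultdict(int)
    unmapped = []
    for country, rows in national_data.items():
        for category, amount in rows:
            concept = category_to_concept.get((country, category))
            if concept is None:
                # Nobody has linked this category yet, so flag it for a human.
                unmapped.append((country, category))
            else:
                totals[concept] += amount
    return dict(totals), unmapped


totals, unmapped = totals_by_concept(national_data, category_to_concept)
print(totals)
print(unmapped)
```

The point of the sketch is that nobody had to change how they publish. The only shared artefact is the mapping, and anything not yet mapped is listed rather than silently dropped.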

The European Commission, moreover, both has the most to gain and is (theoretically) in the best position to prime the pump and overcome the chicken-and-egg problem.

Further reading: some semantic web bookmarks


  1. matthew,

    I don’t know enough about Europe to comment on how the Semantic Web might help Europe, but I think I can help about the Semantic Web.

    I think it is unhelpful to think of computers “understanding” anything. By referring to the relevant ontology, a computer can look up the meaning of a term in a particular context or domain. A computer searching semantically might need URIs (the S Web equivalent of URLs) for relevant ontologies, and it might find pages with relevant terms using a conventional binary match search. It might then (automatically) mark up the terms on that page using the ontology, and if the page still seemed relevant it might become a search result.
    As a simple example – a North London environmental health department publishes information on restaurant inspections. There are also many web sites that publish restaurant reviews. Southampton University S Web researchers have used an S Web approach to combine environmental health and review datasets and examine whether good restaurants are also hygienic. The results were interesting and not what you’d expect.
    I also know of an aerospace manufacturer who has used Semantic Web technology to repurpose datasets because of business change. Renting engines by the hour to airlines and bearing the cost of maintenance has forced them to design engines differently and look at minimising downtime and maintenance effort. The service business had huge amounts of maintenance data and tacit knowledge captured in maintenance systems all over the world. The design business needed to gain access to this through the complexity of different data models, physical locations and so forth. Using S Web technology they can now answer 40 ‘big’ queries such as “Which component causes the most aborted take-offs?” or “Which component causes the highest servicing expense?”.

    Semantic web technology allows data to be reused in ways not envisaged during design because the meaning is captured in the ontology not in the mind of the business analyst. It also can start to make more sense of text. This means that legacy data (structured or unstructured) can be reused in new ways. Now there must be some of that in Europe! (I assume you mean the European Commission)

    I hope this helps



  2. Thanks, James, for your excellent real-life examples.

    The useful application I immediately see for the SW in terms of EU policymaking is in the area I am most familiar with – the European Research Area, or ERA.

    The idea of ERA is to – among other things – avoid the reinvention of the wheel in laboratories in 27 member states, and focus a critical mass of human, technical and financial resources on important areas of research by knitting European researchers together in efficient cross-border networks to create economies of scale, etc. See for the blah.

    In reality, it’s incredibly difficult to do any of these things because it’s hard to get an overview of Who is Researching What, When and How across the EU’s various member states, particularly with research in some countries being done at least in part at the regional level. Not to mention privately-funded research!

    So what would happen if all research projects and programmes published all of their activities – particularly their research programmes, projects and results – in Semantic Web form?

    This would not just be useful for harried EU policymakers trying to coordinate an EU-wide research effort in, say, nanotechnology. Or for academics and consultants trying to chart research trends within Europe.

    It would also be incredibly useful for the researchers themselves, and the innovation-oriented companies that need their services.

    A single market for science, technology and engineering knowledge and innovation, analogous to the single market for products and services.

    More on that in another post, when I’m back from holiday. In the meantime, any thoughts?

