This day’s invited speaker was Peter Mika, from Yahoo! Research Barcelona. In his talk, The Future of Web Search, he covered the state of web search, the difficulties of deploying the semantic web, the shift from documents to databases (the web of data), and current trends in annotating and structuring data: folksonomies, µformats, Wikipedia infoboxes and RDFa. He then turned to how IR should be reconsidered in this context: mining folksonomies, GRDDL and hGRDDL, and whether we should have “forgiving” parsers for µformats.

In this context the description of the ideal world would be:

  • plenty of precise metadata to harvest
  • user intent capturable directly as a SPARQL query
  • single ontology used both by the query and the knowledge base (KB)
  • a query executed on a single KB gives the single, correct answer
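
To make that ideal concrete, here is a minimal sketch of my own (not something shown in the talk): a toy KB held as a set of triples, and a single triple pattern standing in for a SPARQL query. The `ex:` identifiers, the `KB` contents and the `match` helper are all invented for illustration; a real setup would use an actual triple store and SPARQL engine.

```python
# Toy knowledge base: a set of (subject, predicate, object) triples.
# Every identifier here is hypothetical and exists only for this example.
KB = {
    ("ex:Barcelona", "rdf:type", "ex:City"),
    ("ex:Barcelona", "ex:country", "ex:Spain"),
    ("ex:Madrid", "rdf:type", "ex:City"),
    ("ex:Madrid", "ex:country", "ex:Spain"),
    ("ex:Spain", "ex:capital", "ex:Madrid"),
}


def match(pattern, kb):
    """Return one variable binding per triple in kb that matches pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in kb:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# Roughly: SELECT ?c WHERE { ex:Spain ex:capital ?c }
answers = match(("ex:Spain", "ex:capital", "?c"), KB)
print(answers)
```

With one vocabulary shared by query and data, the pattern yields exactly one binding. The real-world challenges in the next paragraph are what happens when any of those assumptions is dropped.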

In the real world we face technical and social challenges: query interface usability, data quality (from syntactic/semantic errors to spam), ontology mapping, entity resolution, ranking across types, results display (information overload and partial understanding issues), user motivation to annotate, and trust.

Next, Fabio Ciravegna presented the state of the art in using semantic web technologies for knowledge management (KM) in large distributed organisations, from the sheer amount of raw data (e.g. a Rolls-Royce jet engine produces 1 GB of vibration data per hour) to unstructured reports on the lifecycle of such engines (diagnosis, repairs, etc.), distributed over multiple repositories.

The Rolls-Royce case study of cross-media KA was impressive. Apart from data volume, the main issues were that evidence is distributed over different media, from more or less structured text (Word, Excel, PowerPoint and PDF) to 3D images, along with data integration and hybrid search.

Other specific information extraction (IE) issues were event modelling, table data extraction and distance-metric approaches (as opposed to linguistic and statistical ones).

Later, in the practical session, we explored machine learning (ML) from both (human) text annotations and image annotations. This also showed how easily humans disagree on annotations, and how annotations reflect the world model of the annotator (and not of the user).
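
One common way to put a number on annotator disagreement is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is my own illustration, not something from the session: the label lists are made up, and `cohen_kappa` is a hypothetical helper name.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up annotations of four text spans by two annotators.
annotator_1 = ["PER", "ORG", "PER", "LOC"]
annotator_2 = ["PER", "PER", "PER", "LOC"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Here the two annotators agree on three of four items, yet kappa is noticeably lower than 0.75 because much of that agreement could arise by chance.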

The last tutorial was given by John Domingue, on semantic web services (SWS): the problems with web services today, the SWS vision, the IRS3 SWS broker, the web service modelling ontology (WSMO), and orchestration and choreography in SWS. Then the Essex County Council Emergency Planning case study was presented and demoed, and the talk ended with OWL-S and semantic annotations for WSDL (SAWSDL).

In the afternoon’s practical session, Barry Norton showed us how to re-create the European travel demo with IRS3 and WSMO Studio.