Saturday, August 11, 2012

JWS preprint: FaBiO and CiTO: ontologies for describing bibliographic resources and citations

A new ontology paper preprint is available on the Journal of Web Semantics preprint server.

Silvio Peroni and David Shotton, FaBiO and CiTO: ontologies for describing bibliographic resources and citations, Journal of Web Semantics, in press.

Semantic publishing is the use of Web and Semantic Web technologies to enhance the meaning of a published journal article, to facilitate its automated discovery, to enable its linking to semantically related articles, to provide access to data within the article in actionable form, and to facilitate integration of data between articles. Recently, semantic publishing has opened the possibility of a major step forward in the digital publishing world. For this to succeed, new semantic models and visualization tools are required to fully meet the specific needs of authors and publishers. In this article, we introduce the principles and architectures of two new ontologies central to the task of semantic publishing: FaBiO, the FRBR-aligned Bibliographic Ontology, an ontology for recording and publishing bibliographic records of scholarly endeavours on the Semantic Web, and CiTO, the Citation Typing Ontology, an ontology for the characterization of bibliographic citations both factually and rhetorically. We present those two models step by step, in order to emphasise their features and to stress their advantages relative to other pre-existing information models. Finally, we review the uptake of FaBiO and CiTO within the academic and publishing communities.
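As a rough illustration of what the abstract describes, the sketch below records two articles as FaBiO journal articles and types the citation between them with CiTO, once factually (`cito:cites`) and once rhetorically (`cito:extends`). The article URIs and the helper functions `to_ntriples` and `citation_types` are hypothetical and exist only for this example; the paper itself should be consulted for the authors' own modelling patterns.

```python
# Namespaces of the two ontologies and of rdf:type.
FABIO = "http://purl.org/spar/fabio/"
CITO = "http://purl.org/spar/cito/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical article identifiers, used only for illustration.
CITING = "http://example.org/article/a"
CITED = "http://example.org/article/b"

# Each statement is a (subject, predicate, object) triple of URIs.
triples = [
    (CITING, RDF_TYPE, FABIO + "JournalArticle"),
    (CITED, RDF_TYPE, FABIO + "JournalArticle"),
    (CITING, CITO + "cites", CITED),    # factual: a citation exists
    (CITING, CITO + "extends", CITED),  # rhetorical: why it is cited
]


def to_ntriples(statements):
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


def citation_types(statements, citing, cited):
    """Return the local names of the CiTO properties linking citing to cited."""
    return sorted(
        p[len(CITO):]
        for s, p, o in statements
        if s == citing and o == cited and p.startswith(CITO)
    )


if __name__ == "__main__":
    print(to_ntriples(triples))
    print(citation_types(triples, CITING, CITED))
```

Keeping the factual and rhetorical statements as separate triples means a consumer that only understands `cito:cites` can still follow the citation link.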

Thursday, August 9, 2012

New letters on the preprint server

The first letters to the Journal of Web Semantics are in the IN PRINT section of the JWS preprint server. These include comments by Peter Patel-Schneider on the article WebPIE: A Web-scale Parallel Inference Engine using MapReduce by Jacopo Urbani et al. and a response and corrigendum from the authors.

Wednesday, August 8, 2012

JWS preprint: Folksonomized Ontology and the 3E Steps Technique to Support Ontology Evolvement

A new preprint is available on the preprint server as part of a special issue on the Semantic and Social Web.

Hugo Alves and Andre Santanche, Folksonomized Ontology and the 3E Steps Technique to Support Ontology Evolvement, Journal of Web Semantics, to appear, 2012.

Folksonomies are increasingly adopted in web systems. These “social taxonomies”, which emerge from collaborative tagging, contrast with the formalism and the systematic creation process applied to ontologies. However, they can play complementary roles, as the knowledge systematically formalized in ontologies by a restricted group can be enriched by the implicit knowledge collaboratively produced by a much wider group. Existing initiatives that involve folksonomies and ontologies are often unidirectional, i.e., ontologies improve tag operations or tags are used to automatically create ontologies. We propose a new fusion approach in which the semantics travels in both directions – from folksonomies to ontologies and vice versa. The result of this fusion is our Folksonomized Ontology (FO). In this paper, we present our 3E Steps technique (Extraction, Enrichment, and Evolution), which explores the latent semantics of a given folksonomy, expressed in an FO, to support ontology review and enhancement. It was implemented and tested in a visual review/enhancement tool.

JWS adds section for letters to the Journal

The Journal of Web Semantics is introducing a new letters section as a place to publish comments on recent Journal of Web Semantics articles that have appeared either in print or online. Such letters might present corrections or errata, identify problems or errors, provide complementary evidence for claims, and/or raise interesting points for discussion. Where appropriate, the Editors in Chief will invite the authors of the original article to compose a reply to a letter and publish both together.

Letters and associated responses should be submitted using the Journal of Web Semantics' submission site by selecting the 'letter' article type. Both letters and their responses will normally not exceed four JWS-formatted pages. Where a longer length is required by the subject, it may be allowed at the discretion of the Editors in Chief. Letters and responses will be reviewed for content and appropriateness by the letters area editor and/or the Editors in Chief. In some cases, we may also ask for one or more peer reviews.

Due to limited space, the Journal of Web Semantics cannot publish all submitted letters and responses in the printed journal. Some may be selected for online publication only on the Journal's preprint Web site.

The content of both the letter and any response is the responsibility of their authors, and subsequent publication in the Journal of Web Semantics does not imply the Journal's agreement or endorsement.

Monday, August 6, 2012

Three new papers from the 2011 Semantic Web Challenge

Three new papers are available on the Journal of Web Semantics preprint server. These will appear in volume 15 as part of a special section edited by Diana Maynard and Chris Bizer featuring revised and extended papers from the 2011 ISWC Semantic Web Challenge.

Marco Balduini, Irene Celino, Daniele Dell'Aglio, Emanuele Della Valle, Yi Huang, Tony Lee, Seon-Ho Kim, Volker Tresp, BOTTARI: An Augmented Reality Mobile Application to Deliver Personalized and Location-Based Recommendations by Continuous Analysis of Social Media Streams, Journal of Web Semantics, volume 15, to appear.

In 2011, an average of three million tweets per day was posted in Seoul. Hundreds of thousands of tweets carry the live opinion of some tens of thousands of users about restaurants, bars, coffee shops and many other semi-public points of interest (POIs) in the city. Trusting this collective opinion to be a solid base for novel commercial and social services, we conceived BOTTARI: an augmented reality application that offers personalized and localized recommendation of POIs based on the temporally-weighted opinions of the social media community. In this paper, we present the design of BOTTARI, the potentialities of semantic technologies like inductive and deductive stream reasoning and the lessons learnt in experimentally deploying BOTTARI in Insadong – a popular tourist area in Seoul – for which we have been collecting tweets for three years to rate the few hundred restaurants in the district. The results of our study demonstrate the feasibility of BOTTARI and encourage its commercial spreading.

Matthias Konrath, Thomas Gottron, Steffen Staab, Ansgar Scherp, SchemEX – Efficient Construction of a Data Catalogue by Stream-Based Indexing of Linked Data, Journal of Web Semantics, volume 15, to appear.

We present SchemEX, an approach and tool for a stream-based indexing and schema extraction of Linked Open Data (LOD) at web-scale. The schema index provided by SchemEX can be used to locate distributed data sources in the LOD cloud. It serves typical LOD information needs such as finding sources that contain instances of one specific data type, of a given set of data types (so-called type clusters), or of instances in type clusters that are connected by one or more common properties (so-called equivalence classes). The entire process of extracting the schema from triples and constructing an index is designed to have linear runtime complexity. Thus, the schema index can be computed on-the-fly while the triples are crawled and provided as a stream by a linked data spider. To demonstrate the web-scalability of our approach, we have computed a SchemEX index over the Billion Triples Challenge (BTC) dataset 2011 consisting of 2,170 million triples. In addition, we have computed the SchemEX index on a dataset with 11 million triples. We use this smaller dataset for conducting a detailed qualitative analysis. We are able to locate relevant data sources with recall between 71% and 98% and a precision between 74% and 100% at a window size of 100K triples observed in the stream and depending on the complexity of the query, i.e. if one wants to find specific data types, type clusters or equivalence classes.
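The idea of a type-cluster index built in one pass over a stream can be sketched as follows. This is a deliberately simplified toy and not the SchemEX implementation: the function `build_type_cluster_index`, the quad layout `(subject, predicate, object, source)`, the prefixed names and the flush-every-N-triples window are all assumptions made for illustration, and equivalence classes are left out.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # prefixed name used as a plain string in this toy


def build_type_cluster_index(quads, window_size):
    """Map each type cluster (a frozenset of types) to the data sources
    in which an instance with exactly that observed set of types was seen.

    Each quad is visited once, so the running time is linear in the
    length of the stream.
    """
    index = defaultdict(set)
    window = {}  # subject -> (types seen so far, sources seen so far)

    def flush():
        # Move everything known about the buffered instances into the index.
        for types, sources in window.values():
            if types:
                index[frozenset(types)].update(sources)
        window.clear()

    for seen, (s, p, o, source) in enumerate(quads, start=1):
        types, sources = window.setdefault(s, (set(), set()))
        sources.add(source)
        if p == RDF_TYPE:
            types.add(o)
        if seen % window_size == 0:
            flush()
    flush()  # handle the final, partially filled window
    return dict(index)


# Hypothetical sample stream.
quads = [
    ("ex:a", RDF_TYPE, "foaf:Person", "src1"),
    ("ex:a", RDF_TYPE, "ex:Author", "src1"),
    ("ex:b", RDF_TYPE, "foaf:Person", "src2"),
]

large_window = build_type_cluster_index(quads, window_size=100)
small_window = build_type_cluster_index(quads, window_size=1)
```

With the large window, `ex:a` is indexed under the cluster `{foaf:Person, ex:Author}`. With a window of one triple, its two type statements are flushed separately, so `src1` is wrongly listed under the single-type cluster `{foaf:Person}`. That loss of information is one way a bounded window can affect precision and recall.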

Danh Le-Phuoc, Hoan Quoc Nguyen-Mau, Josiane Xavier Parreira, Manfred Hauswirth, A Middleware Framework for Scalable Management of Linked Streams, Journal of Web Semantics, volume 15, to appear.

The Web has long exceeded its original purpose of a distributed hypertext system and has become a global data sharing and processing platform. This development is confirmed by remarkable milestones such as the Semantic Web, Web services, social networks and mashups. In parallel with these developments on the Web, the Internet of Things (IoT), i.e., sensors and actuators, has matured and has become a major scientific and economic driver. Its potential impact cannot be overestimated – for example, in logistics, cities, electricity grids and in our daily life, in the form of sensor-laden mobile phones – and rivals that of the Web itself. While the Web provides ease of use of distributed resources and a sophisticated development and deployment infrastructure, the IoT excels in bringing real-time information from the physical world into the picture. Thus a combination of these players seems to be the natural next step in the development of even more sophisticated systems of systems. While only starting, there is already a significant amount of sensor-generated, or more generally dynamic, information available on the Web. However, this information is not easy to access and process, depends on specialised gateways and requires significant knowledge of the concrete deployments, for example, resource constraints and access protocols. To remedy these problems and draw on the advantages of both sides, we try to make dynamic, online sensor data of any form as easily accessible as resources and data on the Web, by applying well-established Web principles, access and processing methods, thus shielding users and developers from the underlying complexities.

In this paper we describe our Linked Stream Middleware (LSM), which makes it easy to integrate time-dependent data with other Linked Data sources, by enriching both sensor sources and sensor data streams with semantic descriptions, and enabling complex SPARQL-like queries across both dataset types through a novel query processing engine, along with means to mash up the data and process results. Most prominently, LSM provides (1) extensible means for real-time data collection and publishing using a cloud-based infrastructure, (2) a Web interface for data annotation and visualisation, and (3) a SPARQL endpoint for querying unified Linked Stream Data and Linked Data. We describe the system architecture behind LSM, provide details on how Linked Stream Data is generated, and demonstrate the benefits and efficiency of the platform by showcasing some experimental evaluations and the system’s interface.