Friday, September 19, 2014

CFP: Special Issue on Geospatial Semantics

Special Issue of the JWS on Geospatial Semantics

The Journal of Web Semantics seeks submissions for a special issue on geospatial semantics to be edited by Yolanda Gil and Raphaël Troncy. Submissions are due by January 31, 2015.

Geospatial reasoning plays an increasingly large role in the Semantic Web. More and more information is geolocated, more mobile devices produce geocoded records, and more web mashups are created to convey geospatial information. Semantics can enable the automated integration of geospatial information and track the provenance of the data shown to an end user. It can also improve the visualization and querying of geospatial data, and support crowdsourcing of geospatial data, particularly to track identity through name and property changes over time. Several recent workshops on geospatial semantics have highlighted the community's interest in these topics. Of note are workshops organized by the World Wide Web Consortium (W3C) and the Open Geospatial Consortium (OGC), which indicate strong interest in standardization efforts in geospatial semantics. This special issue aims to synthesize recent trends in research and practice in geospatial semantics.

Topics of interest include but are not limited to:
  • Combining semantic information with more traditional representations and standards for geospatial data
  • Exploiting semantics to enhance visualizations of geospatial information
  • Use of semantics to support geospatial data integration and conflation
  • Semantic mashups of geospatial data
  • Semantic provenance of geospatial data (e.g., PROV)
  • Semantics for mobile geospatial applications
  • Geospatial linked open data
  • Managing privacy of personal geospatial data and whereabouts through semantics
  • Combining semantic web standards (W3C) with geospatial (OGC) standards (e.g., GML)
  • Formats for representing geographical data (e.g., GeoJSON)
  • Semantics for crowdsourcing geospatial information
  • Semantics for exploiting geospatial information in social network platforms
  • Scalable reasoning with semantic geospatial data
  • Real world applications of semantic geospatial frameworks
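Several of the topics above touch on lightweight interchange formats such as GeoJSON. As a quick illustration, here is a minimal GeoJSON Feature built and serialized in Python; the coordinates and property values are invented for the example:

```python
import json

# A minimal GeoJSON Feature: a point geometry plus free-form properties.
# Coordinates are [longitude, latitude]; the values here are illustrative.
feature = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [-118.2437, 34.0522],
    },
    "properties": {
        "name": "Los Angeles",
        "population": 3884307,
    },
}

print(json.dumps(feature, indent=2))
```

Because GeoJSON is plain JSON, such features can be annotated with semantic vocabularies or converted to RDF without a special parser.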

Guest Editors

  • Yolanda Gil, Information Sciences Institute, University of Southern California
  • Raphaël Troncy, Multimedia Communications Department, EURECOM

Important Dates

  • Call for papers: September 20, 2014
  • Submission deadline: January 31, 2015
  • Author notification: mid-April 2015
  • Publication: third quarter of 2015

Submission guidelines

The Journal of Web Semantics solicits original scientific contributions of high quality. Following the overall mission of the journal, we emphasize the publication of papers that combine theories, methods and experiments from different subject areas in order to deliver innovative semantic methods and applications. The publication of large-scale experiments and their analysis is also encouraged to clearly illustrate scenarios and methods that introduce semantics into existing Web interfaces, contents and services.

Submission of your manuscript is welcome provided that it, or any translation of it, has not been copyrighted or published and is not being submitted for publication elsewhere. Manuscripts should be prepared for publication in accordance with instructions given in the JWS guide for authors. The submission and review process will be carried out using Elsevier's Web-based EES system. Upon acceptance of an article, the author(s) will be asked to transfer copyright of the article to the publisher. This transfer will ensure the widest possible dissemination of information. Elsevier's liberal preprint policy permits authors and their institutions to host preprints on their web sites. Preprints of the articles will be made freely accessible on the JWS preprint server. Final copies of accepted publications will appear in print and at Elsevier's archival online server.

Wednesday, September 3, 2014

Preprint: SINA: Semantic Interpretation of User Queries for Question Answering on Interlinked Data

Saeedeh Shekarpour, Edgard Marx, Axel-Cyrille Ngonga Ngomo and Sören Auer, SINA: Semantic Interpretation of User Queries for Question Answering on Interlinked Data, Web Semantics: Science, Services and Agents on the World Wide Web, to appear.

Abstract: The architectural choices underlying Linked Data have led to a compendium of data sources which contain both duplicated and fragmented information on a large number of domains. One way to enable non-expert users to access this data compendium is to provide keyword search frameworks that can capitalize on the inherent characteristics of Linked Data. Developing such systems is challenging for three main reasons. First, resources across different datasets or even within the same dataset can be homonyms. Second, different datasets employ heterogeneous schemas and each one may contain only part of the answer to a certain user query. Finally, constructing a federated formal query from keywords across different datasets requires exploiting links between the different datasets at both the schema and instance levels. We present Sina, a scalable keyword search system that can answer user queries by transforming user-supplied keywords or natural-language queries into conjunctive SPARQL queries over a set of interlinked data sources. Sina uses a hidden Markov model to determine the most suitable resources for a user-supplied query from different datasets. Moreover, our framework is able to construct federated queries by using the disambiguated resources and leveraging the link structure underlying the datasets to query. We evaluate Sina over three different datasets. It answers 25 queries from the QALD-1 benchmark correctly. Moreover, it performs as well as the best question answering system from the QALD-3 competition, answering 32 questions correctly while also being able to answer queries on distributed sources. We study the runtime of SINA in its mono-core and parallel implementations and draw preliminary conclusions on the scalability of keyword search on Linked Data.
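To give a flavor of the HMM-based disambiguation step the abstract describes, here is a toy Viterbi decoder that maps query keywords to candidate Linked Data resources. This is an illustrative sketch, not SINA's actual implementation; all resource names and probabilities below are invented:

```python
# Illustrative sketch: a hidden Markov model chooses the most suitable
# resource for each query keyword; the chosen resources could then be
# assembled into a conjunctive SPARQL query. Not SINA's actual code.

def viterbi(keywords, candidates, start_p, trans_p, emit_p):
    """Return the most likely resource sequence for the keywords."""
    # best[s] = (probability, path) of the best path ending in state s
    best = {s: (start_p[s] * emit_p[s][keywords[0]], [s])
            for s in candidates[keywords[0]]}
    for kw in keywords[1:]:
        step = {}
        for s in candidates[kw]:
            prob, path = max(
                ((best[prev][0] * trans_p[prev][s] * emit_p[s][kw],
                  best[prev][1]) for prev in best),
                key=lambda pp: pp[0])
            step[s] = (prob, path + [s])
        best = step
    return max(best.values(), key=lambda pp: pp[0])[1]

# Hypothetical query "films Spielberg" with made-up model parameters:
keywords = ["films", "Spielberg"]
candidates = {"films": ["dbo:Film"],
              "Spielberg": ["dbr:Steven_Spielberg", "dbr:Spielberg_(film)"]}
start_p = {"dbo:Film": 1.0}
trans_p = {"dbo:Film": {"dbr:Steven_Spielberg": 0.8,
                        "dbr:Spielberg_(film)": 0.2}}
emit_p = {"dbo:Film": {"films": 0.9},
          "dbr:Steven_Spielberg": {"Spielberg": 0.6},
          "dbr:Spielberg_(film)": {"Spielberg": 0.4}}

print(viterbi(keywords, candidates, start_p, trans_p, emit_p))
```

Here the transition probabilities stand in for the link structure between datasets, which is what lets the decoder prefer resources that are actually connected.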

Monday, September 1, 2014

Preprint: Global Machine Learning for Spatial Ontology Population, Kordjamshidi and Moens

Parisa Kordjamshidi and Marie-Francine Moens, Global Machine Learning for Spatial Ontology Population, Web Semantics: Science, Services and Agents on the World Wide Web, to appear. 

Abstract: Understanding spatial language is important in many applications such as geographical information systems, human computer interaction or text-to-scene conversion. Due to the challenges of designing spatial ontologies, the extraction of spatial information from natural language still has to be placed in a well-defined framework. In this work, we propose an ontology which bridges between cognitive-linguistic spatial concepts in natural language and multiple qualitative spatial representation and reasoning models. To make a mapping between natural language and the spatial ontology, we propose a novel global machine learning framework for ontology population. In this framework we consider relational features and background knowledge which originates from both ontological relationships between the concepts and the structure of the spatial language. The advantage of the proposed global learning model is the scalability of the inference, and the flexibility for automatically describing text with arbitrary semantic labels that form a structured ontological representation of its content. The machine learning framework is evaluated with SemEval-2012 and SemEval-2013 data from the spatial role labeling task.

Thursday, August 28, 2014

Preprint: Theophrastus: On Demand and Real-Time Automatic Annotation and Exploration of (Web) Documents using Open Linked Data

Pavlos Fafalios and Panagiotis Papadakos, Theophrastus: On Demand and Real-Time Automatic Annotation and Exploration of (Web) Documents using Open Linked Data, Web Semantics: Science, Services and Agents on the World Wide Web, to appear.

Abstract: Theophrastus is a system that supports the automatic annotation of (Web) documents through entity mining and provides exploration services by exploiting Linked Open Data (LOD), in real-time and only when needed. The system aims at assisting biologists in their research on species and biodiversity. It was based on requirements coming from the biodiversity domain and was awarded the first prize in the Blue Hackathon 2013. Theophrastus has been designed to be highly configurable regarding a number of different aspects like entities of interest, information cards and external search systems. As a result it can be exploited in different contexts and other areas of interest. The provided experimental results show that the proposed approach is efficient and can be applied in real-time.


Friday, August 15, 2014

New LOD cloud draft includes 558 semantic web datasets

Chris Bizer announced a new draft version of the LOD cloud with 558 linked datasets connected by 2883 linking sets. Last call for new datasets (submit at DataHub) for this version is 2014-08-20.

Sunday, July 27, 2014

2013 Journal Metrics data computed from Elsevier's Scopus data

Eugene Garfield first published the idea of analyzing citation patterns in scientific publications in his 1955 Science paper, Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas. He subsequently popularized the impact factor metric for journals and many other bibliometric concepts, and founded the Institute for Scientific Information (ISI) to provide products and services around them.

In the last decade, digital libraries, online publishing, text mining and big data analytics have combined to produce new bibliometric datasets and metrics. Google's Scholar Metrics, for example, uses measures derived from the popular h-index concept. Microsoft's Academic Search uses a PageRank-like algorithm to weight citations based on the metric of their source. Thomson Reuters, which acquired Garfield's ISI in 1992, still relies largely on the traditional impact factor in its Citation Index. These new datasets and metrics have also stimulated a lively debate about the value of such analyses and the dangers of relying on them too heavily.

Elsevier's Journal Metrics site publishes journal citation metrics computed with data from their Scopus bibliographic database, which covers nearly 21,000 titles from over 5,000 publishers in the scientific, technical, medical, and social science fields. Last week the site added data from 2013, using three measures of a journal's impact based on an analysis of its papers' citations.
  • Source Normalized Impact per Paper (SNIP), a measure of contextual citation impact that weights citations based on the total number of citations in a subject field.
  • Impact Per Publication (IPP), an estimate of the average number of citations a paper will receive in three years.
  • SCImago Journal Rank (SJR), a PageRank-like measure that takes into account the "prestige" of the citing sources.
We were happy to see that the metrics for the Journal of Web Semantics remain strong, with 2013 values for SNIP, IPP and SJR of 4.51, 3.14 and 2.13, respectively. Our analysis, described below, shows that these metrics put the journal in the top 5-10% of a set of 130 journals in our "space".

To put these numbers in context, we wanted to compare them with other journals that regularly publish similar papers. The Journal Metrics site has a very limited search function, but all of the data can be downloaded as a CSV file. We downloaded the data and used grep to select just the journals in the Computer Science category whose names contained any of the strings web, semantic, knowledge, data, intellig, agent or ontolo. The data for the resulting 130 journals for the last three years is available as a Google spreadsheet.
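The filtering step can be reproduced with a short script instead of grep. The column names below ("Source Title", "Subject Area") are assumptions about the Journal Metrics CSV export, and the sample rows are invented for illustration:

```python
import csv
import io
import re

# Keep Computer Science journals whose titles mention any keyword.
# Column names are assumed; adjust them to match the actual CSV export.
KEYWORDS = re.compile(r"web|semantic|knowledge|data|intellig|agent|ontolo",
                      re.IGNORECASE)

def select_journals(rows):
    return [r for r in rows
            if "Computer Science" in r.get("Subject Area", "")
            and KEYWORDS.search(r.get("Source Title", ""))]

# Stand-in for the downloaded file, so the sketch is self-contained:
sample_csv = io.StringIO(
    "Source Title,Subject Area\n"
    "Web Semantics,Computer Science\n"
    "Data Mining Review,Computer Science\n"
    "Journal of Botany,Biology\n"
)
rows = list(csv.DictReader(sample_csv))
print([r["Source Title"] for r in select_journals(rows)])
```

With the real file, replacing the `io.StringIO` stand-in with `open("journal_metrics.csv", newline="")` yields the same kind of filtered list.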

All of these metrics have shortcomings and should be taken with a grain of salt. Some, like Elsevier's, are based on data from a curated set of publications with several (e.g., three or even five) years of data available, so new journals are not included. Others, like Google's basic citation counts, weigh a citation from a paper in Science the same as one from an undergraduate research paper found on the Web. Journals that publish a handful of very high-quality papers each year fare better on some measures but are dominated on others by publications that publish a large number of articles, from top quality to mediocre. Nonetheless, taken together, the different metrics offer insight into the significance and utility of a journal's published articles based on citations from the research community.

Sunday, July 13, 2014

Preprint: Tailored Semantic Annotation for Semantic Search

Rafael Berlanga, Victoria Nebot and Maria Pérez, Tailored Semantic Annotation for Semantic Search, Web Semantics: Science, Services and Agents on the World Wide Web, to appear, 2014.

Abstract: This paper presents a novel method for semantic annotation and search of a target corpus using several knowledge resources (KRs). This method relies on a formal statistical framework in which KR concepts and corpus documents are homogeneously represented using statistical language models. Under this framework, we can perform all the necessary operations for an efficient and effective semantic annotation of the corpus. Firstly, we propose a coarse tailoring of the KRs w.r.t. the target corpus with the main goal of reducing the ambiguity of the annotations and their computational overhead. Then, we propose the generation of concept profiles, which allow measuring the semantic overlap of the KRs as well as performing a finer tailoring of them. Finally, we propose how to semantically represent documents and queries in terms of the KR concepts and the statistical framework to perform semantic search. Experiments have been carried out with a corpus about web resources which includes several Life Sciences catalogues and Wikipedia pages related to web resources in general (e.g., databases, tools, services, etc.). Results demonstrate that the proposed method is more effective and efficient than state-of-the-art methods relying on either context-free annotation or keyword-based search.

Wednesday, July 2, 2014

Preprint: Konclude: System Description

Andreas Steigmiller, Thorsten Liebig and Birte Glimm, Konclude: System Description, Web Semantics: Science, Services and Agents on the World Wide Web, to appear, 2014.

Abstract: This paper introduces Konclude, a high-performance reasoner for the Description Logic SROIQV. The supported ontology language is a superset of the logic underlying OWL 2 extended by nominal schemas, which allows for expressing arbitrary DL-safe rules. Konclude's reasoning core is primarily based on the well-known tableau calculus for expressive Description Logics. In addition, Konclude also incorporates adaptations of more specialised procedures, such as consequence-based reasoning, in order to support the tableau algorithm. Konclude is designed for performance and uses well-known optimisations such as absorption or caching, but also implements several new optimisation techniques. The system can furthermore take advantage of multiple CPUs at several levels of its processing architecture. This paper describes Konclude's interface options, reasoner architecture, processing workflow, and key optimisations. Furthermore, we provide results of a comparison with other widely used OWL 2 reasoning systems, which show that Konclude performs eminently well on ontologies from any language fragment of OWL 2.