Much of the data to be published as Linked Data is available in tabular form. However, semantically interpreting such tabular data and publishing it as quality Linked Data requires non-trivial effort, both to link the data to existing Linked Data resources and to select proper target vocabularies.
Odalic is a tool for semantic table interpretation and (semi)automatic publishing of such tables as high-quality Linked Data. As input, the user specifies the table to be processed and a knowledge base (e.g., DBpedia), which Odalic uses to:
- Classify columns in the input table with classes from the knowledge base (e.g., that column A contains instances of the class “http://schema.org/Person”)
- Disambiguate cell values within the columns against the knowledge base, so that strings are replaced by Linked Data resources (e.g., the string “Prague” is replaced by “http://dbpedia.org/resource/Prague”)
- Discover relations between columns (e.g., that there is a relation “x:livesIn” between a column with persons and a column with cities)
Users may provide feedback on the suggested classifications, disambiguations, and discovered relations; for example, they may select a different suggested class or propose new relations. This feedback is then treated as a constraint when the data is processed further.
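The three interpretation tasks and the role of feedback can be illustrated with a minimal Python sketch. It does not reflect Odalic's actual API or data model: the `Interpretation` class, the `apply_feedback` function, and the `http://example.org/livesIn` predicate are hypothetical, and the suggestions are hard-coded rather than computed from a knowledge base.

```python
from dataclasses import dataclass, field


@dataclass
class Interpretation:
    """Hypothetical container for the interpretation of one table."""
    # column name -> ranked candidate class IRIs (best first)
    column_classes: dict = field(default_factory=dict)
    # (row index, column name) -> resource IRI chosen for that cell
    cell_entities: dict = field(default_factory=dict)
    # (subject column, object column) -> predicate IRI
    relations: dict = field(default_factory=dict)


def apply_feedback(interp, class_choices=None, new_relations=None):
    """Return a new Interpretation with the user's choices pinned."""
    classes = {col: list(cands) for col, cands in interp.column_classes.items()}
    for col, chosen in (class_choices or {}).items():
        # The chosen class becomes the only candidate: a hard constraint.
        classes[col] = [chosen]
    relations = dict(interp.relations)
    relations.update(new_relations or {})
    return Interpretation(classes, dict(interp.cell_entities), relations)


# Suggestions as a tool might produce them (hard-coded here).
suggested = Interpretation(
    column_classes={
        "name": ["http://schema.org/Person", "http://dbpedia.org/ontology/Agent"],
        "city": ["http://dbpedia.org/ontology/City"],
    },
    cell_entities={(0, "city"): "http://dbpedia.org/resource/Prague"},
)

# The user picks a different class and proposes a new relation.
revised = apply_feedback(
    suggested,
    class_choices={"name": "http://dbpedia.org/ontology/Agent"},
    new_relations={("name", "city"): "http://example.org/livesIn"},
)
print(revised.column_classes["name"])
print(revised.relations)
```

In this sketch, pinned choices replace the candidate list outright, so any later processing step that reads `column_classes` can only see the class the user selected.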
Users can export the semantically interpreted tables as Linked Data or as extended CSV files (the original CSV files enriched with, e.g., the disambiguations of cell values). Odalic also supports exporting statistical data using the RDF Data Cube vocabulary.
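As a rough illustration of the two basic export forms, the following Python sketch writes an interpreted table as an extended CSV and as N-Triples. It is an assumption-laden sketch, not Odalic's export code: the function names, the `_iri` column suffix, the sample row, and the `http://example.org/livesIn` predicate are all hypothetical, and the RDF Data Cube export is not covered.

```python
import csv
import io


def to_extended_csv(header, rows, cell_entities):
    """Original columns plus a hypothetical '<column>_iri' column per linked column."""
    linked = [c for c in header if any(col == c for (_, col) in cell_entities)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(list(header) + [f"{c}_iri" for c in linked])
    for i, row in enumerate(rows):
        # Cells without a disambiguation get an empty IRI field.
        writer.writerow(list(row) + [cell_entities.get((i, c), "") for c in linked])
    return out.getvalue()


def to_ntriples(cell_entities, relations):
    """One triple per row for each relation whose two cells are both linked."""
    row_ids = sorted({i for (i, _) in cell_entities})
    lines = []
    for (subj_col, obj_col), predicate in relations.items():
        for i in row_ids:
            s = cell_entities.get((i, subj_col))
            o = cell_entities.get((i, obj_col))
            if s and o:
                lines.append(f"<{s}> <{predicate}> <{o}> .")
    return "\n".join(lines)


header = ["name", "city"]
rows = [["Franz Kafka", "Prague"]]
cell_entities = {
    (0, "name"): "http://dbpedia.org/resource/Franz_Kafka",
    (0, "city"): "http://dbpedia.org/resource/Prague",
}
relations = {("name", "city"): "http://example.org/livesIn"}

extended = to_extended_csv(header, rows, cell_entities)
triples = to_ntriples(cell_entities, relations)
print(extended)
print(triples)
```

The extended CSV keeps the original strings next to the resources they were linked to, which makes it usable by tools that do not understand RDF.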
Odalic is available under an Apache open-source license and is hosted on GitHub.