The usefulness of tabular data such as web tables critically depends on understanding their semantics. This study focuses on column type prediction for tables without any metadata. Unlike traditional methods based on lexical matching, we propose a deep prediction model that exploits a table's contextual semantics, including table locality features learned by a Hybrid Neural Network (HNN) and inter-column semantic features learned by a knowledge base (KB) lookup and query answering algorithm. The model performs well not only on individual table sets, but also when transferring from one table set to another.

Citation information

Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks and Charles Sutton (2019). Learning Semantic Annotations for Tabular Data. Accepted for publication in the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019).

Turing-affiliated authors