ptype: Probabilistic Type Inference

Abstract

Type inference is the task of inferring the data type of a given column of data. Current approaches often fail when a column contains missing values and anomalies, both of which are common in real-world data sets. In this paper, we propose ptype, a robust probabilistic type inference method that detects such entries and infers the data type of the column. We further show that the proposed method outperforms existing methods.
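
To make the failure mode concrete, the following minimal Python sketch contrasts a strict rule-based inferrer with a tolerant one that sets aside missing-value tokens and flags anomalies. It is an illustration only and is not the ptype model: the names `classify`, `strict_infer`, `tolerant_infer` and the `MISSING_TOKENS` list are hypothetical, and the majority vote stands in for the probabilistic inference described in the paper.

```python
from collections import Counter

# Assumed, illustrative set of missing-value encodings (not taken from the paper).
MISSING_TOKENS = {"", "NA", "N/A", "null", "-"}


def classify(value: str) -> str:
    """Label a single entry as 'missing', 'integer', 'float' or 'string'."""
    v = value.strip()
    if v in MISSING_TOKENS:
        return "missing"
    try:
        int(v)
        return "integer"
    except ValueError:
        pass
    try:
        float(v)
        return "float"
    except ValueError:
        return "string"


def strict_infer(column):
    """Strict rule: every entry must fit the type, otherwise fall back to string."""
    kinds = {classify(v) for v in column}
    if not kinds:
        return "unknown"
    if kinds == {"integer"}:
        return "integer"
    if kinds <= {"integer", "float"}:
        return "float"
    return "string"


def tolerant_infer(column):
    """Majority vote over non-missing entries; report missing and anomalous entries."""
    labels = [classify(v) for v in column]
    counts = Counter(label for label in labels if label != "missing")
    if not counts:
        return "unknown", list(column), []
    col_type = counts.most_common(1)[0][0]
    missing = [v for v, label in zip(column, labels) if label == "missing"]
    anomalies = [
        v for v, label in zip(column, labels) if label not in (col_type, "missing")
    ]
    return col_type, missing, anomalies


column = ["1", "2", "NA", "4", "error", "6"]
print(strict_infer(column))    # one missing token is enough to yield "string"
print(tolerant_infer(column))  # majority type plus the flagged entries
```

On this toy column, the strict rule labels the whole column as a string because of two problematic entries, while the tolerant variant recovers the integer type and reports `NA` as missing and `error` as an anomaly. A probabilistic model such as ptype replaces the hard vote with posterior probabilities over column types and per-entry labels.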

Citation information

Ceritli, T., Williams, C.K.I. & Geddes, J. (2020) ptype: probabilistic type inference. Data Mining and Knowledge Discovery, 34(3), pp. 870–904
