Type inference is the task of inferring the data type of a given column of data. Current approaches often fail when a column contains missing data and anomalies, both of which are common in real-world data sets. In this paper, we propose ptype, a robust probabilistic type inference method that detects such entries while inferring the column's data type. We further show that the proposed method outperforms existing methods.

Ceritli, T., Williams, C.K.I. & Geddes, J. (2020) ptype: probabilistic type inference. Data Mining and Knowledge Discovery, 34(3), pp. 870–904
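
The idea summarised in the abstract, scoring each candidate column type while allowing individual entries to be missing or anomalous, can be illustrated with a toy sketch. This is not the ptype model or its API. The regular expressions, the list of missing-value tokens, the three weights, and the names `TYPE_PATTERNS`, `MISSING_TOKENS`, `label_entry` and `infer_column` are all assumptions made for illustration, and each entry is hard-assigned to one label rather than treated probabilistically.

```python
import math
import re

# Hypothetical candidate types, ordered from most to least specific.
# Ties in score are resolved in favour of the earlier (more specific) type.
TYPE_PATTERNS = {
    "integer": re.compile(r"[+-]?\d+"),
    "float": re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}

# Assumed encodings of missing data; real data sets use many others.
MISSING_TOKENS = {"", "NA", "N/A", "NULL", "null", "nan", "-", "?"}

# Assumed weights: most entries fit the column type, some are missing,
# and a few are anomalies. These values are arbitrary.
WEIGHTS = {"type": 0.90, "missing": 0.09, "anomaly": 0.01}


def label_entry(entry, pattern):
    """Label one entry as 'missing', 'type' or 'anomaly' for a given type."""
    if entry.strip() in MISSING_TOKENS:
        return "missing"
    if pattern.fullmatch(entry.strip()):
        return "type"
    return "anomaly"


def infer_column(values):
    """Return (best_type, labels) for a column of strings."""
    scores = {}
    for type_name, pattern in TYPE_PATTERNS.items():
        # Log-score of the column under this type: every anomaly is
        # heavily penalised, every missing entry mildly penalised.
        scores[type_name] = sum(
            math.log(WEIGHTS[label_entry(v, pattern)]) for v in values
        )
    # max() returns the first maximal key, so dict order breaks ties.
    best = max(TYPE_PATTERNS, key=lambda t: scores[t])
    labels = [label_entry(v, TYPE_PATTERNS[best]) for v in values]
    return best, labels


if __name__ == "__main__":
    column = ["1", "2", "NA", "3", "abc", "4"]
    best_type, labels = infer_column(column)
    print(best_type)
    for value, label in zip(column, labels):
        print(repr(value), label)
```

In this sketch a column is assigned the type under which the fewest entries have to be explained away as anomalies, and the per-entry labels then flag which values are missing or anomalous under that type.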

