salient substrings in values that can be used for classification
• Start with an embedding layer for dimensionality reduction
• Use convolutional layers to capture substring patterns
• End with fully connected layers
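The three stages can be sketched as a single forward pass. This is only an illustration under stated assumptions: the weights are random and untrained, and the layer sizes, the ReLU, the max-pooling step, and the name `classify` are hypothetical choices that the slides do not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 128      # assumed: ASCII code points used as character ids
EMBED_DIM = 8    # assumed embedding width
KERNEL = 3       # assumed substring width seen by each filter
FILTERS = 16     # assumed number of convolutional filters
NUM_TYPES = 4    # assumed number of database types

# Randomly initialised (untrained) parameters, for illustration only.
E = rng.normal(size=(VOCAB, EMBED_DIM))
W_conv = rng.normal(size=(FILTERS, KERNEL, EMBED_DIM))
W_fc = rng.normal(size=(FILTERS, NUM_TYPES))

def classify(value: str) -> np.ndarray:
    """Return a probability vector over types for one data value."""
    ids = [min(ord(c), VOCAB - 1) for c in value]
    ids += [0] * max(0, KERNEL - len(ids))       # pad very short values
    x = E[ids]                                    # embedding: (length, EMBED_DIM)
    # 1-D convolution: each filter scores every window of KERNEL characters.
    windows = np.stack([x[i:i + KERNEL] for i in range(len(ids) - KERNEL + 1)])
    conv = np.einsum("wke,fke->wf", windows, W_conv)
    conv = np.maximum(conv, 0.0)                  # ReLU (assumed)
    pooled = conv.max(axis=0)                     # max over positions (assumed)
    logits = pooled @ W_fc                        # fully connected layer
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                        # softmax over types

probs = classify("Ludwig van Beethoven")
```

Max-pooling over positions makes the output independent of where in the value a substring pattern occurs, which fits the idea of salient substrings.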
to organize all the database types into a similarity graph.
• A logistic function rescales the similarity measure, so the spatial layout of the graph can be controlled with two parameters, a and b.
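A minimal sketch of the rescaling step follows. The slides do not give the exact formula, so the form below, with a as the steepness and b as the midpoint, is an assumption. The similarity values and type names are made up for illustration.

```python
import math

def rescale(similarity, a, b):
    """Assumed form: logistic curve with steepness a and midpoint b."""
    return 1.0 / (1.0 + math.exp(-a * (similarity - b)))

# Hypothetical raw similarities between database types (illustrative only).
raw = {("Artist", "Composer"): 0.8, ("Artist", "City"): 0.2}

# Rescaled edge weights for the similarity graph.
edges = {pair: rescale(s, a=10.0, b=0.5) for pair, s in raw.items()}
```

Under this assumed form, raising a sharpens the separation between strong and weak edges, and b sets the similarity at which an edge weight crosses 0.5.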
analyze graphs.
• We apply spectral clustering to organize the database types into clusters that represent semantic topics in the database.
• Spectral clustering can be applied recursively to obtain a hierarchical organization.
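The recursive idea can be sketched with a simple spectral bipartition. The slides do not say which spectral clustering variant is used. This sketch assumes the unnormalized graph Laplacian and a split by the sign of the Fiedler vector. The function names, the `min_size` stopping rule, and the toy weight matrix are hypothetical.

```python
import numpy as np

def fiedler_split(W):
    """Split node indices in two by the sign of the Fiedler vector."""
    L = np.diag(W.sum(axis=1)) - W     # unnormalized Laplacian (assumed variant)
    _, vecs = np.linalg.eigh(L)        # eigenvalues come back in ascending order
    fiedler = vecs[:, 1]               # eigenvector of the second-smallest eigenvalue
    return np.flatnonzero(fiedler < 0), np.flatnonzero(fiedler >= 0)

def hierarchy(W, nodes=None, min_size=2):
    """Recursively bipartition the graph; returns nested lists of node ids."""
    if nodes is None:
        nodes = np.arange(W.shape[0])
    if len(nodes) <= min_size:
        return nodes.tolist()
    left, right = fiedler_split(W[np.ix_(nodes, nodes)])
    if len(left) == 0 or len(right) == 0:
        return nodes.tolist()          # no meaningful split found
    return [hierarchy(W, nodes[left], min_size),
            hierarchy(W, nodes[right], min_size)]

# Toy similarity graph: two tight groups of three types, weakly linked.
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

tree = hierarchy(W, min_size=3)
```

On the toy graph the first split separates the two tight groups. A smaller `min_size` would keep splitting each group and give a deeper hierarchy.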
data where data values are assigned incorrect types. Example: “Beethoven’s 9th Symphony” being classified as Artist.
• We can use the neural network to detect likely candidates of such dirty data.
• Using substring reasoning, we can also generate visual explanations of the abnormalities in the identified data values.
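One plausible way to turn classifier output into dirty-data candidates is to flag values where the network confidently predicts a type other than the assigned one. The slides do not describe the actual rule, so the function `flag_suspects`, the 0.5 threshold, the type name "Work", and all probabilities below are hypothetical.

```python
def flag_suspects(rows, threshold=0.5):
    """rows: iterable of (value, assigned_type, {type: probability}).

    Returns (value, assigned_type, predicted_type) for every value whose
    most probable type differs from the assigned one with enough confidence.
    """
    suspects = []
    for value, assigned, probs in rows:
        predicted = max(probs, key=probs.get)
        if predicted != assigned and probs[predicted] >= threshold:
            suspects.append((value, assigned, predicted))
    return suspects

# Made-up probabilities standing in for network output (illustrative only).
rows = [
    ("Ludwig van Beethoven", "Artist", {"Artist": 0.9, "Work": 0.1}),
    ("Beethoven's 9th Symphony", "Artist", {"Artist": 0.2, "Work": 0.8}),
]

suspects = flag_suspects(rows)
```

The threshold trades recall for precision: a lower value surfaces more candidates for manual review, at the cost of more false alarms.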