Slide 4
Slide 4 text
One hot encoding/ Explosion
• Convert each level to a column, if the current row belongs to
the particular level, then assign 1, otherwise assign 0
• No information loss
• Automatically handled in OML in-DB algorithms
• Drawback
• Generated high dimensional feature, especially categorical
variables with high number of levels – high cardinality
Popular techniques for encoding categorical variables
Copyright © 2021, Oracle and/or its affiliates.
4