Slide 37
Slide 37 text
OneHotEncoder is a scikit-learn class that converts categorical variables
into a numerical form that can be provided to machine learning algorithms,
which typically require numerical input. It does this by creating a new
binary column for each unique category in the original column.
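A minimal sketch of this behaviour, assuming a single made-up column of soil
types (the values "clay", "sand" and "loam" are illustrative and not taken
from the slides). Each unique category becomes one binary column, ordered
alphabetically:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical column with three unique values.
X = np.array([["clay"], ["sand"], ["loam"], ["sand"]])

encoder = OneHotEncoder()
# fit_transform returns a sparse matrix by default; toarray() makes it dense
# so the result is easy to inspect.
encoded = encoder.fit_transform(X).toarray()

# One output column per category, in sorted order.
categories = encoder.categories_[0].tolist()
print(categories)
print(encoded)
```

Each row of the result contains exactly one 1, in the column of that row's
category.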
During data transformation, categories may appear
that were not present in the data set used to
fit the encoder.
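The slide does not say how this case is handled, so the following is only a
sketch of one option. OneHotEncoder has a handle_unknown parameter: with the
default value "error" an unseen category raises a ValueError at transform
time, while "ignore" encodes it as a row of zeros. The soil values below are
made up for illustration:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: "peat" is not present when the encoder is fitted.
X_train = np.array([["clay"], ["sand"]])
X_new = np.array([["sand"], ["peat"]])

# Default behaviour (handle_unknown="error"): unseen categories raise.
strict = OneHotEncoder()
strict.fit(X_train)
try:
    strict.transform(X_new)
    raised = False
except ValueError:
    raised = True

# handle_unknown="ignore": unseen categories become all-zero rows.
tolerant = OneHotEncoder(handle_unknown="ignore")
tolerant.fit(X_train)
encoded_new = tolerant.transform(X_new).toarray()
print(raised)
print(encoded_new)
```

Whether silently ignoring unknown categories is acceptable depends on the
model and the data, so this choice should be made deliberately.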
By default, OneHotEncoder returns a sparse matrix to save memory, which is
especially useful when there are many categories. However, setting
sparse_output=False (sparse=False in scikit-learn versions before 1.2)
makes the encoder return a dense NumPy array.
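A short sketch comparing the two output types, again with made-up soil
values. The parameter was renamed from sparse to sparse_output in
scikit-learn 1.2 and the old name was later removed, so the try/except below
is an assumption-laden convenience that lets the same code run on either
side of the rename:

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical column.
X = np.array([["clay"], ["sand"], ["loam"]])

# Default: the result is a SciPy sparse matrix.
sparse_result = OneHotEncoder().fit_transform(X)

# Dense output: use the current parameter name, falling back to the
# pre-1.2 name if the installed version does not know it.
try:
    dense_encoder = OneHotEncoder(sparse_output=False)
except TypeError:
    dense_encoder = OneHotEncoder(sparse=False)
dense_result = dense_encoder.fit_transform(X)

print(sparse.issparse(sparse_result))
print(type(dense_result))
```

Both results hold the same values; only the storage format differs.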
First imputation, including the 'soil' column
-> Preprocessing