close

How can I one hot encode in Python?

Hello Guys, How are you all? Hope You all Are Fine. Today We Are Going To learn about How can I one hot encode in Python in Python. So Here I am Explain to you all the possible Methods here.

Without wasting your time, Let’s start This Article.

Table of Contents

How can I one hot encode in Python?

  1. How can I one hot encode in Python?

    The .reshape(-1) is there to make sure you have the right labels format (you might also have [[2], [3], [4], [0]]).

  2. one hot encode in Python

    The .reshape(-1) is there to make sure you have the right labels format (you might also have [[2], [3], [4], [0]]).

Method 1

You can do it with numpy.eye and a using the array element selection mechanism:

import numpy as np
nb_classes = 6
data = [[2, 3, 4, 0]]

def indices_to_one_hot(data, nb_classes):
    """Convert an iterable of indices to one-hot encoded labels."""
    targets = np.array(data).reshape(-1)
    return np.eye(nb_classes)[targets]

The the return value of indices_to_one_hot(nb_classes, data) is now

array([[[ 0.,  0.,  1.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  1.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  1.,  0.],
        [ 1.,  0.,  0.,  0.,  0.,  0.]]])

The .reshape(-1) is there to make sure you have the right labels format (you might also have [[2], [3], [4], [0]]).

Method 2

One hot encoding with pandas is very easy:

def one_hot(df, cols):
    """
    @param df pandas DataFrame
    @param cols a list of columns to encode 
    @return a DataFrame with one-hot encoding
    """
    for each in cols:
        dummies = pd.get_dummies(df[each], prefix=each, drop_first=False)
        df = pd.concat([df, dummies], axis=1)
    return df

EDIT:

Another way to one_hot using sklearn’s LabelBinarizer :

from sklearn.preprocessing import LabelBinarizer 
label_binarizer = LabelBinarizer()
label_binarizer.fit(all_your_labels_list) # need to be global or remembered to use it later

def one_hot_encode(x):
    """
    One hot encode a list of sample labels. Return a one-hot encoded vector for each label.
    : x: List of sample Labels
    : return: Numpy array of one-hot encoded labels
    """
    return label_binarizer.transform(x)

Summery

It’s all About this issue. Hope all Methods helped you a lot. Comment below Your thoughts and your queries. Also, Comment below which Method worked for you? Thank You.

Also, Read