Hello guys, how are you all? Hope you are all fine. Today I got the following error in **Python**: **sklearn train_test_split – ValueError: Found input variables with inconsistent numbers of samples**. So here I explain all the possible solutions.

Without wasting your time, let's start this article and solve this error.


## How Does the sklearn train_test_split – ValueError: Found input variables with inconsistent numbers of samples Error Occur?

I got the following error while calling `train_test_split` from `sklearn` in **Python**: **ValueError: Found input variables with inconsistent numbers of samples**.

## How To Solve the sklearn train_test_split – ValueError: Found input variables with inconsistent numbers of samples Error?

This error means that the lengths of the inputs you're trying to split don't match. For `X` and `y`, `sklearn` takes the same indices from both: a random sample of the indices of your data (75% for training by default, often set to 80%). So the lengths have to match.
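To see the error for yourself, here is a minimal sketch that reproduces it. The arrays `X` and `y_bad` are made-up toy data (not from the original question), chosen only so that the two inputs have different lengths.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (an assumption for illustration): 10 samples but only 8 labels.
X = np.arange(20).reshape(10, 2)
y_bad = np.arange(8)

try:
    train_test_split(X, y_bad)
    error_message = None
except ValueError as exc:
    # sklearn checks the input lengths before it samples any indices.
    error_message = str(exc)

print(error_message)
```

The message should name both lengths, which is usually the fastest hint about which input is the wrong size.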

## Solution 1

As you stated, the original shape of your labels is `(83292, 5)`, and once you applied `MultiLabelBinarizer` it became `(5, 18)`.

The `train_test_split(X, y)` function expects **X and y to have the same number of rows**. E.g., if `83292` data points are available in your `X`, the label for each of those data points should be available in your `y` variable. Hence, in your case the shapes of `X` and `y` should be `(83292, 15)` and `(83292, 18)`.

**Try this:** Your `MultiLabelBinarizer` output has the wrong dimensions here. Iterating over a DataFrame yields its column labels rather than its rows, which would explain the 5 rows in the output. So, if your `labels` is a DataFrame object, you should apply `mlb.fit_transform(labels.values.tolist())`. This produces the same number of rows as the input, here `83292`.
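Here is a small sketch of the difference between the two calls. The DataFrame, its column names (`l1` to `l5`), and the placeholder feature matrix `X` are all hypothetical toy values, only 4 rows instead of `83292`, so treat it as an illustration rather than the original poster's code.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy labels: 4 samples with 5 label columns each.
labels = pd.DataFrame(
    [[1, 1, 25, 0, 0], [1, 1, 25, 0, 0], [3, 5, 50, 0, 0], [3, 5, 50, 0, 0]],
    columns=["l1", "l2", "l3", "l4", "l5"],
)
X = np.zeros((len(labels), 15))  # placeholder features, 15 columns

# Passing the DataFrame itself: iteration walks the 5 column names,
# so the output has one row per column, not one row per sample.
y_wrong = MultiLabelBinarizer().fit_transform(labels)

# Passing one list of labels per row keeps one output row per sample.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels.values.tolist())

print(y_wrong.shape[0], y.shape[0])

# X and y now have the same number of rows, so the split works.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
```

After the fix, `y.shape[0]` equals `len(X)`, which is the only thing `train_test_split` needs in order to pair features with labels.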

**Your labels should look like the format below:**

Your `y` input can be a list of lists, or a DataFrame with a single column whose values are lists. In the DataFrame case, the shape should be `(no_of_rows, 1)`. Make sure X and y have the same number of rows. You can represent a multi-label, multi-class `y` variable in the format below:

`[[1, 1, 25, 0, 0], [1, 1, 25, 0, 0], [1, 1, 25, 0, 0], [1, 1, 25, 0, 0], [1, 1, 25, 0, 0], [3, 5, 50, 0, 0], [3, 5, 50, 0, 0], [3, 5, 50, 0, 0], [3, 5, 50, 0, 0], [3, 5, 50, 0, 0]]`

## Solution 2

This means that the lengths of the various elements you're trying to split don't match. For `X` and `y`, `sklearn` takes the same indices from both: a random sample of the indices of your data (75% for training by default, often set to 80%). So the lengths have to match.

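Imagine it's trying to keep these indices. What would `sklearn` do when there's nothing at some index?

    0 1 0 0 1 0 1 0 0 1 0 1 0 1
    a b b a b a b a a b b b
    ^   ^     ^ ^   ^   ^   ^ ^

Here the first row has 14 entries but the second has only 12, so any sampled index past the twelfth has no label to go with it.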

Run this check to verify that the lengths match. Does it return `True`?

`len(dataset) == len(labels)`
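If you want to run that check right before splitting, a minimal sketch could look like this. The names `dataset` and `labels` follow the check above; the toy values, `test_size=0.2`, and `random_state=0` are assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (an assumption for illustration): 10 samples and 10 labels.
dataset = np.arange(30).reshape(10, 3)
labels = np.array(list("abbababaab"))

# Fail early with a readable message if the lengths differ.
if len(dataset) != len(labels):
    raise ValueError(
        f"dataset has {len(dataset)} rows but labels has {len(labels)}"
    )

# The same sampled indices are applied to both inputs.
X_train, X_test, y_train, y_test = train_test_split(
    dataset, labels, test_size=0.2, random_state=0
)

print(len(X_train), len(X_test))
```

With 10 samples and `test_size=0.2`, this gives 8 training rows and 2 test rows, and every row keeps its own label.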

**Summary**

That's all about this issue. Hope these solutions helped you a lot. Comment below with your thoughts and your queries, and let us know which solution worked for you. Thank you.
