
[Solved] ValueError: No gradients provided for any variable – Tensorflow 2.0/Keras

Hello everyone, I hope you are all doing well. Today I ran into the error ValueError: No gradients provided for any variable while using TensorFlow 2.0/Keras in Python. In this article I explain how to solve it.

Let's get started.

How Does the ValueError: No gradients provided for any variable – Tensorflow 2.0/Keras Error Occur?

I got the following error while training a Keras model with TensorFlow 2.0 in Python:

ValueError: No gradients provided for any variable

How to Solve the ValueError: No gradients provided for any variable – Tensorflow 2.0/Keras Error?

Solution 1

There are two different sets of problems in your code, which could be categorized as syntactical and architectural problems. The error raised (i.e. No gradients provided for any variable) is related to the syntactical problems, which I will mostly address below, but I will also give you some pointers about the architectural problems after that.

The main cause of the syntactical problems is the use of named inputs and outputs for the model. Named inputs and outputs in Keras are mostly useful when the model has multiple input and/or output layers. However, your model has only one input and one output layer. Therefore, it may not be very useful to use named inputs and outputs here, but if that's your decision I will explain how it can be done properly.

First of all, you should keep in mind that when using Keras models, the data generated from any input pipeline (whether it’s a Python generator or tf.data.Dataset) should be provided as a tuple i.e. (input_batch, output_batch) or (input_batch, output_batch, sample_weights). And, as I said, this is the expected format everywhere in Keras when dealing with input pipelines, even when we are using named inputs and outputs as dictionaries.

For example, if I want to use inputs/outputs naming and my model has two input layers named as “words” and “importance”, and also two output layers named as “output1” and “output2”, they should be formatted like this:

({'words': words_data, 'importance': importance_data},
 {'output1': output1_data, 'output2': output2_data})

So as you can see above, it’s a tuple where each element of the tuple is a dictionary; the first element corresponds to inputs of the model and the second element corresponds to outputs of the model. Now, according to this point, let’s see what modifications should be done to your code:

  • In sample_generator we should return a tuple of dicts, not a dict. So:

        example = tuple([
            {'inputs': text_encoder.encode(inputs) + [EOS_ID]},
            {'targets': text_encoder.encode(targets) + [EOS_ID]},
        ])

  • In make_dataset function, the input arguments of tf.data.Dataset should respect this:

        output_types=(
            {'inputs': tf.int64},
            {'targets': tf.int64}
        )

        padded_shapes=(
            {'inputs': (None,)},
            {'targets': (None,)}
        )

  • The signature of prepare_example and its body should be modified as well:

        def prepare_example(ex_inputs: dict, ex_outputs: dict, params: dict):
            # Make sure targets are one-hot encoded
            ex_outputs['targets'] = tf.one_hot(ex_outputs['targets'], depth=params['vocab_size'])
            return ex_inputs, ex_outputs

  • And finally, the call method of subclassed model:

        return {'targets': x}

  • And one more thing: we should also put these names on the corresponding input and output layers using the name argument when constructing the layers (like Dense(..., name='output')); however, since we are using Model sub-classing here to define our model, that is not necessary.
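
The shape of what the generator should yield can be sketched without TensorFlow. This is only an illustration: toy_encode is a hypothetical stand-in for text_encoder.encode, and EOS_ID = 1 is an assumed value, neither taken from the original code. The point is that each yielded example is a tuple of two dicts, inputs first and targets second.

```python
EOS_ID = 1  # assumed end-of-sequence id, for illustration only


def toy_encode(text, vocab):
    """Hypothetical stand-in for text_encoder.encode: one integer id per word."""
    # Ids start at 2 so that 0 (padding) and 1 (EOS) stay reserved.
    return [vocab.setdefault(word, len(vocab) + 2) for word in text.split()]


def sample_generator(pairs):
    """Yield (inputs_dict, targets_dict) tuples, not a single dict."""
    vocab = {}
    for inputs, targets in pairs:
        yield (
            {'inputs': toy_encode(inputs, vocab) + [EOS_ID]},
            {'targets': toy_encode(targets, vocab) + [EOS_ID]},
        )


examples = list(sample_generator([("hello world", "world hello")]))
features, labels = examples[0]
print(features)  # {'inputs': [2, 3, 1]}
print(labels)    # {'targets': [3, 2, 1]}
```

If the generator yielded one dict holding both 'inputs' and 'targets', Keras would treat the whole dict as model input and see no targets, which is consistent with the "no gradients" error described above.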

All right, these would resolve the input/output problems and the error related to gradients would be gone; however, if you run the code after applying the above modifications, you would still get an error regarding incompatible shapes. As I said earlier, there are architectural issues in your model which I would briefly address below.


As you mentioned, this is supposed to be a seq-to-seq model. Therefore, the output is a sequence of one-hot encoded vectors, where the length of each vector is equal to the vocabulary size of the target sequences. As a result, the softmax classifier should have as many units as the vocabulary size, like this (Note: never in any model or problem use a softmax layer with only one unit; that's all wrong! Think about why it's wrong!):

self.out_layer = keras.layers.Dense(params['vocab_size'], activation='softmax')
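
To see why a single-unit softmax cannot work, here is a small pure-Python sketch (the softmax helper is written out for illustration and is not the Keras implementation). Softmax normalizes the outputs so they sum to 1, so with only one unit the output is 1.0 for every possible logit and the layer carries no information.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# With one unit the output is always 1.0, whatever the logit is.
single_unit_outputs = [softmax([logit]) for logit in (-5.0, 0.0, 3.7)]
print(single_unit_outputs)  # [[1.0], [1.0], [1.0]]

# With several units the output depends on the logits and sums to 1.
probs = softmax([1.0, 2.0, 3.0])
print(probs)
```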

The next thing to consider is the fact that we are dealing with 1D sequences (i.e. a sequence of tokens/words). Therefore using 2D-convolution and 2D-pooling layers does not make sense here. You can either use their 1D counterparts or replace them with something else like RNN layers. As a result of this, the Lambda layer should be removed as well. Also, if you want to use convolution and pooling, you should adjust the number of filters in each layer as well as the pool size properly (i.e. a single conv filter, Conv1D(1,...), is probably not optimal, and a pool size of 1 does not make sense).
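
The remark about pool size can be checked with a minimal pure-Python sketch of 1D max pooling. The max_pool_1d helper is hypothetical and only mimics non-overlapping windows with no padding, where the stride equals the pool size. With a pool size of 1 every window holds one element, so the layer returns its input unchanged.

```python
def max_pool_1d(sequence, pool_size, stride=None):
    """Max pooling over a 1D list; the stride defaults to pool_size."""
    stride = stride or pool_size
    return [
        max(sequence[start:start + pool_size])
        for start in range(0, len(sequence) - pool_size + 1, stride)
    ]


signal = [3, 1, 4, 1, 5, 9, 2, 6]
same = max_pool_1d(signal, pool_size=1)    # no-op: identical to the input
halved = max_pool_1d(signal, pool_size=2)  # keeps the max of each pair

print(same)    # [3, 1, 4, 1, 5, 9, 2, 6]
print(halved)  # [3, 4, 9, 6]
```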

Further, that Dense layer before the last layer which has only one unit could severely limit the representational capacity of the model (i.e. it is essentially the bottleneck of your model). Either increase its number of units, or remove it.

The other thing is that there is no reason not to one-hot encode the labels of the dev set. They should be one-hot encoded just like the labels of the training set. Therefore, either the training argument of make_generator should be removed entirely or, if you have some other use case for it, the dev dataset should be created with the training=True argument passed to the make_dataset function.
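
As a rough illustration of what one-hot encoding does to the labels, here is a pure-Python stand-in for tf.one_hot on a flat list of token ids. The one_hot helper and the toy vocab_size of 5 are assumptions made for this sketch. Both splits go through the same function, so every target vector has length vocab_size and matches the width of the softmax output.

```python
def one_hot(token_ids, depth):
    """Pure-Python stand-in for tf.one_hot on a 1D list of ids."""
    return [[1.0 if index == token else 0.0 for index in range(depth)]
            for token in token_ids]


vocab_size = 5  # assumed toy vocabulary size
train_targets = one_hot([2, 4, 1], depth=vocab_size)
dev_targets = one_hot([3, 1], depth=vocab_size)

print(train_targets[0])                       # [0.0, 0.0, 1.0, 0.0, 0.0]
print(len(dev_targets), len(dev_targets[0]))  # 2 5
```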

Finally, after all these changes your model might work and start fitting on data; but after a few batches, you might get an incompatible shapes error again. That's because you are generating input data with an unknown dimension and also using a relaxed padding approach that pads each batch only as much as needed (i.e. by using (None,) for padded_shapes). To resolve this you should decide on a fixed input/output dimension (e.g. by considering a fixed length for input/output sequences), and then adjust the architecture or hyper-parameters of the model (e.g. conv kernel size, conv padding, pooling size, adding more layers, etc.) as well as the padded_shapes argument accordingly. If you would instead like your model to support input/output sequences of variable length, you should account for that in the model's architecture and hyper-parameters as well as in the padded_shapes argument. Since the solution depends on the task and the design you have in mind, and there is no one-size-fits-all solution, I will not comment further on that and leave it to you to figure out. But here is a working solution (which may not be, and probably isn't, optimal at all) just to give you an idea:

self.out_layer = keras.layers.Dense(params['vocab_size'], activation='softmax')

self.model_layers = [
    keras.layers.Embedding(params['vocab_size'], params['vocab_size']),
    keras.layers.Conv1D(32, 4, padding='same'),
    keras.layers.TimeDistributed(self.out_layer)
]


# ...
padded_shapes=(
    {'inputs': (10,)},
    {'targets': (10,)}
)
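
The difference between relaxed and fixed padding can be sketched in plain Python. The pad_batch helper below is hypothetical and only imitates the padding step; it is not the tf.data implementation. With relaxed padding, inputs and targets are each padded to the longest sequence in their own batch, so their lengths can disagree. With a fixed length of 10, as in the padded_shapes above, every batch has the same shape.

```python
def pad_batch(sequences, length=None, pad_id=0):
    """Pad each sequence with pad_id; length=None means 'longest in this batch'."""
    target = max(len(s) for s in sequences) if length is None else length
    if any(len(s) > target for s in sequences):
        raise ValueError("sequence longer than the fixed length")
    return [s + [pad_id] * (target - len(s)) for s in sequences]


batch_inputs = [[5, 6, 7, 8, 1], [9, 1]]
batch_targets = [[3, 4, 1], [2, 1]]

# Relaxed padding: each side is padded to its own longest sequence.
relaxed = (len(pad_batch(batch_inputs)[0]), len(pad_batch(batch_targets)[0]))
# Fixed padding: both sides always end up with 10 time steps.
fixed = (len(pad_batch(batch_inputs, 10)[0]), len(pad_batch(batch_targets, 10)[0]))

print(relaxed)  # (5, 3)
print(fixed)    # (10, 10)
```

Note that a fixed length only works if no sequence exceeds it, so longer examples would have to be truncated or filtered out beforehand.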

Summary

That's all about this issue. I hope the solution helped you. Share your thoughts and questions in the comments below, and let us know whether it worked for you. Thank you.
