The loss becomes negative #1917
Comments
Do you really think that's enough information for anyone to be able to answer your question?
The loss is just a scalar that you are trying to minimize. It's not supposed to be positive! For instance, a cosine proximity loss will usually be negative (trying to make the proximity as high as possible by minimizing a negative scalar).
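For illustration, a minimal numpy sketch of that point (the vectors are made up), mirroring the sign convention of Keras' cosine_proximity loss:

```python
import numpy as np

def cosine_proximity(y_true, y_pred):
    # Negative cosine similarity, as in Keras' cosine_proximity loss:
    # minimizing it pushes the value toward -1 (perfect alignment).
    y_true = y_true / np.linalg.norm(y_true)
    y_pred = y_pred / np.linalg.norm(y_pred)
    return -np.sum(y_true * y_pred)

v = np.array([1.0, 2.0, 3.0])
print(cosine_proximity(v, v))    # -1.0, the best (lowest) possible value
print(cosine_proximity(v, -v))   #  1.0, the worst possible value
```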
Hi, thank you for your answers.
What is your training objective?
Hi, thank you for your help.
@FiReTiTi
If the loss cannot be negative, does that mean it overflows the encoding limits and wraps around into negative values?
I am still working on the same data, and here is another weird thing:
@FiReTiTi please give more information about your model if you want help with that. In your last case, your optimizer is likely stuck in a local minimum. That could explain why the loss stays identical over all the following iterations.
Here is the model. The dataset contains 70,000 images of size 31x31.
@FiReTiTi May I ask how you solved the negative loss problem? I ran into the same issue: my loss is a custom loss built from a bunch of MSE terms, so it shouldn't be negative either. It looks like this:
I haven't; it still happens from time to time :-(
@FiReTiTi Thanks for your reply. I found my problem: I use a custom loss and accidentally passed y_pred and y_true in the wrong order to my loss function, so maybe it's not the same cause in your case.
@sunshineatnoon So you were using a non-symmetric loss function like the cross entropy?
Here is my loss function, it's a bunch of squared values.
Cool for you!
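As an aside on the argument-order point above: Keras calls a loss as loss(y_true, y_pred), and for a non-symmetric loss such as crossentropy the two orders give different numbers. A minimal numpy sketch with made-up values:

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Crossentropy is not symmetric in its two arguments.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([0.0, 1.0, 1.0])
y_pred = np.array([0.1, 0.8, 0.9])
print(binary_crossentropy(y_true, y_pred))  # ~0.14, the intended value
print(binary_crossentropy(y_pred, y_true))  # ~2.15, silently wrong when swapped
```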
@FiReTiTi In that case, I think it's more likely an overflow. Do you use Theano? Maybe you can try NanGuardMode in Theano to see if it gives you any errors or warnings. I googled a lot last night and found that NaNs or Infs might cause this kind of error, such as this one.
That's also my opinion. Thanks for the tips, I will test them when it occurs again.
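For reference, a hedged sketch of what using Theano's NanGuardMode looks like (plain Theano shown here; per Theano's docs it can also be enabled globally via the flag mode=NanGuardMode, which would apply to a Keras model on the Theano backend):

```python
import numpy as np
import theano
import theano.tensor as T
from theano.compile.nanguardmode import NanGuardMode

x = T.matrix('x')
y = T.log(x)  # log(0) would silently produce -inf

# NanGuardMode raises an error as soon as a NaN, Inf, or very large value
# flows through the graph, instead of letting it silently corrupt the loss.
f = theano.function(
    [x], y,
    mode=NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True))

f(np.zeros((1, 1), dtype=theano.config.floatX))  # raises instead of returning -inf
```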
My loss is negative, what does that mean? I am using the TensorFlow backend.
Epoch 1/10
My code is here for reference:
```
import numpy as np

model = Sequential()
model.add(Convolution2D(3, 3, 32, border_mode='valid', dim_ordering='tf', input_shape=(150, 200, 3)))
model.add(Convolution2D(64, 3, 3, border_mode='valid'))
model.add(Flatten())

train_datagen = ImageDataGenerator(
model.compile(loss='binary_crossentropy',
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
validation_generator = test_datagen.flow_from_directory(

model.fit_generator(train_generator, samples_per_epoch=2536, nb_epoch=10, validation_data=validation_generator, nb_val_samples=800)
model.save_weights('thesis.h5')
```
@zach-nervana FYI
Hello everyone. As we all know, the KLD loss cannot be negative, yet I am training a regression model and I get negative values.
Model:
```
base_model = VGG16(input_shape=(360, 480, 3), weights='imagenet', include_top=False)
```
Compile:
```
adam = Adam(lr=1e-5, beta_1=0.9, beta_2=0.999, epsilon=1e-8, decay=0.0)
```
The problem is: if I add a softmax layer at the end of the model, the loss is positive, which is fine, but it is around 32, which is really big. If I remove the softmax layer, the loss becomes negative. For the input and output: the inputs are images that I normalize to 0-1, and the labels are also 0-1.
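On the KLD point: KL divergence is only guaranteed to be non-negative when both y_true and y_pred are proper probability distributions, which is what the softmax enforces. A small numpy sketch with made-up numbers, mirroring the formula Keras uses for kullback_leibler_divergence:

```python
import numpy as np

def kld(y_true, y_pred, eps=1e-7):
    # sum(y_true * log(y_true / y_pred)) over the last axis, with clipping,
    # as in Keras' kullback_leibler_divergence.
    y_true = np.clip(y_true, eps, 1.0)
    y_pred = np.clip(y_pred, eps, 1.0)
    return np.sum(y_true * np.log(y_true / y_pred), axis=-1)

p = np.array([0.2, 0.3, 0.5])              # target distribution, sums to 1
print(kld(p, np.array([0.1, 0.4, 0.5])))   # ~0.05, non-negative as expected
print(kld(p, np.array([0.9, 0.9, 0.9])))   # ~-0.92: no softmax, outputs don't sum to 1
```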
@FiReTiTi Did you solve your problem? I had a similar problem. I used Theano as the backend, and the loss function is binary_crossentropy. During training the acc, val_acc, loss, and val_loss never changed in any epoch, and the loss value is very high, about 8. I used 4000 training samples and 1000 validation samples.
```
inputs_x=Input(shape=(1,65,21))
x=Conv2D(32,(5,5),padding='same',data_format='channels_first',activation='relu',use_bias=True)(x)
x=Dropout(0.55)(x)
inputs_y=Input(shape=(1,32,21))
y=Conv2D(32,(4,4),padding='same',data_format='channels_first',activation='relu',use_bias=True)(y)
y=Dropout(0.60)(y)
merged_input=keras.layers.concatenate([x,y],axis=-1)
z=Dense(16,activation='softmax')(merged_input)
outp=Dense(1,activation='softmax')(z)
model=Model(inputs=[inputs_x,inputs_y],outputs=outp)
history=model.fit(x=[train_inputs_x,train_inputs_y],y=train_label,batch_size=32,
```
Any ideas for this problem?
No. It looks like an overflow problem that stopped happening when I reduced the size of my model. Have you tried switching to TensorFlow as the backend? Things seem to be more stable for me since I switched to TensorFlow.
Ok, I will try switching the backend. Thanks.
@FiReTiTi Did you try to normalize your input? Inappropriate normalization of the input may lead to a gradient explosion problem.
@fregocap Yes, the inputs are normalized.
I had the same problem with a negative binary crossentropy loss.
The problem in my case was that the labels given by the generator were not 0 and 1 but several classes (0, 1, 2, ..., 6). The model unexpectedly did not fail but produced a negative loss. The solution is to use Dense(n_classes, activation='softmax').
When the binary cross entropy loss is negative, it is because the true values are not in [0, 1]. In my case I was using [-1, 1]. The model does not fail, but produces negative values.
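A quick numeric check of the last two comments, using the standard binary crossentropy formula with made-up numbers:

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Labels in {0, 1}: the loss is always >= 0.
print(binary_crossentropy(np.array([1.0]), np.array([0.9])))  # ~0.11

# A label outside [0, 1] (a class index of 2, or a -1/+1 encoding) makes the
# (1 - y_true) factor negative, so the loss can drop below zero.
print(binary_crossentropy(np.array([2.0]), np.array([0.9])))  # ~-2.09
```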
Thanks.
I got a negative loss when training an autoencoder on image data, normalizing the images to zero mean and unit std (so half of the data values are negative) and using the binary_crossentropy loss. Later I figured out that this happens because binary_crossentropy only behaves as a regression loss when the inputs are between 0 and 1, but in my case the inputs are also negative.
The answer is easy in my opinion: your data are not between 0 and 1, they are between 0 and 255. Just add a "/ 255" to your ground-truth data and the results will be positive.
Thanks Hamed. You are right.
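A minimal sketch of that fix (the array name and shape are made up): scale uint8 images to [0, 1] instead of standardizing them, so binary_crossentropy targets stay in its valid range:

```python
import numpy as np

# Made-up stand-in for the image data (uint8 values in 0-255).
x_train = np.random.randint(0, 256, size=(16, 31, 31, 1)).astype('float32')

# Zero-mean / unit-std scaling produces negative values, which breaks
# binary_crossentropy. Divide by 255 instead so inputs and targets are in [0, 1]:
x_train /= 255.0

# For an autoencoder, the same scaled array is both input and target:
# model.fit(x_train, x_train, ...)
```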
Quite a long time ago I also hit this issue and fixed it by changing the optimizer from Adam back to the default RMSprop.
I think it can also be the result of a high learning rate in some cases: the weights might become too large for TensorFlow to work properly. Sometimes when I see the loss growing, I try decreasing the learning rate and it works.
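For illustration, a sketch of where the learning rate is set (the tiny model is a made-up placeholder, and 1e-4 is just an example value below Adam's default of 1e-3):

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

# Placeholder model, only to show where the learning rate goes.
model = Sequential([Dense(1, activation='sigmoid', input_shape=(10,))])

# If the loss starts growing or blowing up, try recompiling with a smaller
# learning rate (or switch back to the default 'rmsprop' optimizer).
model.compile(loss='binary_crossentropy', optimizer=Adam(lr=1e-4))
```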
Hi,
I just ran a CNN built with Keras on a big training set, and I get weird loss values at each epoch (see below):
```
 66496/511502 [==>...........................] - ETA: 63s - loss: 8.2800
 66528/511502 [==>...........................] - ETA: 63s - loss: -204433556137039776.0000
345664/511502 [===================>..........] - ETA: 23s - loss: 8.3174
345696/511502 [===================>..........] - ETA: 23s - loss: -39342531075525840.0000
214080/511502 [===========>..................] - ETA: 41s - loss: 8.3406
214112/511502 [===========>..................] - ETA: 41s - loss: -63520753730220536.0000
```
How is that possible? Does the loss suddenly become so big that the value exceeds what the double-precision encoding can represent?
Is there a way to avoid it?
Regards,