All Questions

0 votes
0 answers
10 views

Numerical precision in Flux.jl

I am trying to study ANN training in terms of a dynamical-systems framework, treating the model as the system and the training as the time-evolution dynamics. As an extension, I tried to make the ...
asked by IBArbitrary
5 votes
2 answers
644 views

Is there any advantage of a lower value of a loss function?

I have two loss functions $\mathcal{L}_1$ and $\mathcal{L}_2$ to train my model. The model is predominantly a classification model. Both $\mathcal{L}_1$ and $\mathcal{L}_2$ are two variants of ...
asked by Aleph
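A minimal sketch (toy quadratic loss, with an assumed scaling constant $c = 0.1$) of why a uniformly lower loss value is not an advantage by itself: scaling a loss by a positive constant leaves the minimizer unchanged and only rescales the gradients, which interacts with the learning rate.

```python
# Toy illustration (assumed example, not from the question): L2 = c * L1
# with c = 0.1 has the same minimizer as L1, just 10x smaller gradients.

def l1(w):                      # hypothetical loss: (w - 3)^2
    return (w - 3.0) ** 2

def l2(w):                      # the same loss scaled down by 10x
    return 0.1 * l1(w)

def grad(f, w, eps=1e-6):       # central-difference numerical derivative
    return (f(w + eps) - f(w - eps)) / (2 * eps)

w = 0.0
g1, g2 = grad(l1, w), grad(l2, w)
print(g1, g2)                   # same sign and minimizer; g2 is 10x smaller
```

Both losses are minimized at the same weights; what differs is the effective step size, so a smaller raw loss value alone says nothing about model quality.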
0 votes
0 answers
14 views

How to handle sequences with CrossEntropyLoss

First of all, I am new to the whole thing, so sorry if this is a dumb question. I'm currently training a Transformer model for a sequence classification task using CrossEntropyLoss. My input tensor has the ...
asked by Tobias
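For context, a common pattern with per-token cross-entropy over sequences is to flatten the time dimension so each token becomes one (logits, target) pair — in PyTorch, `nn.CrossEntropyLoss` expects logits of shape `(N, C)` with targets `(N,)`, so `(batch, seq, classes)` logits are typically reshaped to `(batch*seq, classes)` first. A plain-Python sketch of that idea (shapes and names here are illustrative assumptions):

```python
import math

def log_softmax(logits):
    m = max(logits)                                   # subtract max for stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def sequence_cross_entropy(logits, targets):
    """logits: [batch][seq][classes], targets: [batch][seq] -> mean NLL."""
    flat = [(tok, t) for b_logits, b_tgt in zip(logits, targets)
                     for tok, t in zip(b_logits, b_tgt)]   # flatten time dim
    losses = [-log_softmax(tok)[t] for tok, t in flat]
    return sum(losses) / len(losses)

logits = [[[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]]   # batch=1, seq=2, classes=3
targets = [[0, 1]]                               # correct class index per token
print(round(sequence_cross_entropy(logits, targets), 4))
```

The same averaging over all tokens is what the flattened PyTorch call computes; masking out padding tokens (e.g. via `ignore_index`) is the usual extra step for variable-length sequences.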
1 vote
1 answer
69 views

Does using a different optimizer change the loss landscape?

I plot the landscape using this code, and I notice the landscape shape has changed a lot. My understanding is that the optimizer does not change the loss landscape. But now I'm confused whether it's just ...
asked by user836026
0 votes
1 answer
51 views

My custom neural network is converging but the Keras model is not

In most cases it is probably the other way round, but... I have implemented a basic MLP neural network structure with backpropagation. My data is just a shifted quadratic function with 100 samples. I ...
asked by tymsoncyferki
0 votes
0 answers
312 views

Training loss is much higher than validation loss

I am trying to train a neural network with 2 hidden layers to perform multi-class classification of 3 different classes. There is a huge imbalance among the classes, with the distribution being around ...
asked by joseph wong
2 votes
1 answer
222 views

What is the benefit of the exponential function inside softmax?

I know that softmax is: $$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$$ This is an $\mathbb{R}^n \to \mathbb{R}^n$ function, and the elements of the output add up to 1. I understand that the ...
asked by Victor2748
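A minimal sketch of what the exponential buys: strictly positive outputs (so they can be read as probabilities), preserved ordering of the inputs, and a smooth, differentiable map. The example values are assumptions for illustration.

```python
import math

def softmax(xs):
    m = max(xs)                                  # subtract max for stability;
    exps = [math.exp(x - m) for x in xs]         # softmax is shift-invariant
    z = sum(exps)
    return [e / z for e in exps]

p = softmax([2.0, 1.0, -1.0])
print(p, sum(p))                                 # positive, sums to 1, order kept
```

Note that a negative input still yields a positive probability, which a plain normalisation `x_i / sum(x)` could not guarantee.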
0 votes
0 answers
116 views

Why is backpropagation done in every epoch when the loss is always a scalar?

I understand that the backpropagation algorithm calculates the derivative of the loss with respect to all the parameters in the neural network. My question is: this derivative is constant, right, because the ...
asked by Jeet
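A toy sketch (single assumed weight, analytic gradient) of why the derivative is not constant: the loss value is a scalar, but its gradient is a function of the *current* weights, so it must be recomputed every step.

```python
def loss(w):                     # toy model: one weight, target value 3
    return (w - 3.0) ** 2

def dloss(w):                    # analytic gradient: depends on w itself
    return 2.0 * (w - 3.0)

w, lr = 0.0, 0.1
grads = []
for epoch in range(5):
    g = dloss(w)                 # recomputed each epoch at the new w
    grads.append(g)
    w -= lr * g
print(grads)                     # a different gradient at every epoch
```

Each update moves the weights, and the next epoch's gradient is evaluated at that new point — which is exactly why backpropagation is rerun rather than cached.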
1 vote
2 answers
4k views

Training and validation loss are almost the same (perfect fit?)

I am developing an ANN from scratch which classifies MNIST digits. These are the curves I get using only one hidden layer composed of 100 neurons activated by ...
asked by tail
0 votes
1 answer
25 views

Binary cross-entropy loss

When we have a binary classification problem, we use a sigmoid activation function in the output layer + a binary cross-entropy loss. We also need to one-hot encode the target variable. This is a binary ...
asked by Ahmed Mohamed
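For reference, a common binary setup uses a single sigmoid output with 0/1 targets directly — no one-hot encoding is needed in that formulation (one-hot targets with two softmax outputs are the equivalent two-class variant). A minimal sketch with assumed example values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(y, p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)          # clip for numerical safety
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

p = sigmoid(2.0)                              # model's predicted P(y = 1)
print(bce(1, p), bce(0, p))                   # loss is low if the target is 1
```

In Keras this corresponds to one output unit with `sigmoid` plus `binary_crossentropy`; in PyTorch, `BCEWithLogitsLoss` folds the sigmoid into the loss for stability.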
0 votes
1 answer
87 views

How do I know that my optimizer has found the best weights?

I am new to deep learning and my understanding of how optimizers work might be slightly off. Also, sorry for the third-grader quality of the images. For example, if we have a simple task, our loss-to-weight ...
asked by Neriko
1 vote
3 answers
188 views

How to learn steep functions using a neural network?

I am trying to use a neural network to learn the function below. In total, I have 25 features and 19 outputs. The image above shows the distribution of two features with respect to one of the outputs. ...
asked by newbie
0 votes
1 answer
356 views

Training deep neural networks with ReLU output layer for verification

Most algorithms for verification of deep neural networks require ReLU activation functions in each layer (e.g. Reluplex). I have a binary classification task with classes 0 and 1. The main problem I ...
asked by alext90
1 vote
1 answer
19 views

The proper loss function for regression so that prediction values do not lie on one side of the real values

I'm doing a prediction task using machine learning. First I do a regression task, then I use the predicted values to determine the class. I used MSE as the loss function. However, my prediction values are ...
asked by user900476
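One standard way to control which side of the true values predictions fall on is an asymmetric ("pinball"/quantile) loss, which penalises over- and under-prediction differently. A sketch, where `tau` is an assumed hyperparameter (`tau > 0.5` pushes predictions upward, `tau < 0.5` downward; `tau = 0.5` recovers a symmetric absolute error up to scale):

```python
def pinball(y_true, y_pred, tau=0.5):
    """Quantile (pinball) loss for a single prediction."""
    err = y_true - y_pred
    # err > 0 means under-prediction, weighted by tau;
    # err < 0 means over-prediction, weighted by (1 - tau)
    return tau * err if err >= 0 else (tau - 1.0) * err

# with tau = 0.9, under-predicting by 2 costs far more than over-predicting by 2
print(pinball(10.0, 8.0, tau=0.9), pinball(10.0, 12.0, tau=0.9))
```

Minimising this loss makes the model estimate the `tau`-quantile of the target, which biases predictions to one side of the true values by design.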
1 vote
0 answers
287 views

Val Loss and manually calculated loss produce different values

I have a CNN classification model that uses binary cross-entropy loss: ...
asked by Amit Raz
