All Questions
Tagged with optimization and neural-network
92 questions
9 votes · 4 answers · 2k views
Does training a neural network on a combined dataset outperform sequential training on individual datasets?
I have a neural network with a fixed architecture (let's call it Architecture A). I also have two datasets, Dataset 1 and Dataset 2, both of which are independently and identically distributed (i.i.d.)...
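A minimal sketch of the comparison, assuming a small Keras classifier as a stand-in for Architecture A and synthetic arrays in place of Dataset 1 and Dataset 2 (all names below are illustrative, not the asker's code):

    import numpy as np
    from tensorflow import keras

    rng = np.random.default_rng(0)
    x1, y1 = rng.normal(size=(500, 10)), rng.integers(0, 2, 500)   # stand-in for Dataset 1
    x2, y2 = rng.normal(size=(500, 10)), rng.integers(0, 2, 500)   # stand-in for Dataset 2

    def make_model():                                              # stand-in for "Architecture A"
        m = keras.Sequential([keras.Input(shape=(10,)),
                              keras.layers.Dense(32, activation="relu"),
                              keras.layers.Dense(1, activation="sigmoid")])
        m.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
        return m

    combined = make_model()
    combined.fit(np.concatenate([x1, x2]), np.concatenate([y1, y2]), epochs=5, verbose=0)

    sequential = make_model()
    sequential.fit(x1, y1, epochs=5, verbose=0)   # phase 1: Dataset 1 only
    sequential.fit(x2, y2, epochs=5, verbose=0)   # phase 2: Dataset 2 only

Sequential training of this kind is where forgetting of the first dataset typically shows up, which is usually the crux of such comparisons.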
5 votes · 2 answers · 644 views
Is there any advantage of a lower value of a loss function?
I have two loss functions $\mathcal{L}_1$ and $\mathcal{L}_2$ to train my model. The model is predominantly a classification model. Both $\mathcal{L}_1$ and $\mathcal{L}_2$ are two variants of ...
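A toy illustration of why the raw value alone is not decisive (my own example, not the asker's losses): two losses that differ only by scale and offset have different values but the same minimiser.

    import numpy as np

    w = np.linspace(-3, 3, 601)
    L1 = (w - 1.0) ** 2                # one loss
    L2 = 10.0 * (w - 1.0) ** 2 + 5.0   # rescaled, shifted variant of the same loss

    print(w[np.argmin(L1)], w[np.argmin(L2)])   # both minimised near w = 1.0
    print(L1.min(), L2.min())                   # 0.0 vs 5.0: different values, same optimum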
1 vote · 1 answer · 60 views
How does seeing training batches only once influence the generalization of a neural network?
I am referring to the question/scenario "Train neural network with unlimited training data", but unfortunately I cannot comment.
Since I never see any training batch more than once, I would guess that ...
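For concreteness, a sketch of that single-pass regime, assuming fresh data can be drawn on demand (the generator and toy target below are my own):

    import numpy as np
    from tensorflow import keras

    def fresh_batches(n_batches, batch_size=64):
        rng = np.random.default_rng(0)
        for _ in range(n_batches):
            x = rng.normal(size=(batch_size, 8))
            y = (x.sum(axis=1) > 0).astype("float32")   # toy target
            yield x, y

    model = keras.Sequential([keras.Input(shape=(8,)),
                              keras.layers.Dense(16, activation="relu"),
                              keras.layers.Dense(1, activation="sigmoid")])
    model.compile(optimizer="adam", loss="binary_crossentropy")

    for x, y in fresh_batches(1000):      # every batch is new, so no batch is ever revisited
        model.train_on_batch(x, y)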
0 votes · 1 answer · 47 views
Learning the gradient descent stepsize with RL [closed]
Problem statement:
I've been working on a project to accelerate the convergence of gradient descent using reinforcement learning (RL). I want to learn a policy that can map the current state of ...
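One stripped-down version of that idea (purely my own toy setup, not the poster's project): treat the stepsize as a discrete action chosen by an epsilon-greedy bandit, with the drop in loss as the reward.

    import numpy as np

    def loss(w):
        return float(np.sum(w ** 2))

    def grad(w):
        return 2.0 * w

    stepsizes = np.array([0.001, 0.01, 0.1, 0.5])     # action set (my choice)
    q = np.zeros(len(stepsizes))                      # action-value estimates
    counts = np.zeros(len(stepsizes))
    rng = np.random.default_rng(0)

    w = rng.normal(size=5)
    for t in range(200):
        # epsilon-greedy choice of stepsize
        a = rng.integers(len(stepsizes)) if rng.random() < 0.1 else int(np.argmax(q))
        before = loss(w)
        w = w - stepsizes[a] * grad(w)                # one gradient step with that stepsize
        reward = before - loss(w)                     # reward = decrease in loss
        counts[a] += 1
        q[a] += (reward - q[a]) / counts[a]           # incremental mean update
    print("preferred stepsize:", stepsizes[int(np.argmax(q))], "final loss:", loss(w))

A full RL treatment would condition the choice on a state (for example the current loss and gradient norm) rather than treating it as a context-free bandit.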
1 vote · 2 answers · 403 views
Gradient Descent: Is the magnitude in Gradient Vectors arbitrary?
I am only just getting familiar with gradient descent while learning logistic regression. I understand that the directional component of the gradient vector is correct information derived from the slope ...
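A small numeric illustration (my own example): in the update $w \leftarrow w - \eta \, \nabla L(w)$ the gradient's magnitude directly scales the step, and it shrinks as the slope flattens near the minimum, so it is not arbitrary.

    def loss(w):
        return (w - 3.0) ** 2

    def grad(w):
        return 2.0 * (w - 3.0)

    w, lr = 0.0, 0.1
    for i in range(5):
        g = grad(w)
        w = w - lr * g
        print(f"step {i}: grad={g:+.3f}  step taken={-lr * g:+.3f}  w={w:.3f}")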
0 votes · 0 answers · 21 views
How to decide a State for Deep Q Learning for Production Line scheduling
There is a production floor with W workstations and N jobs, each with M operations (different processing times per operation). A job is completed only when all of its M operations are completed. The objective is to ...
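One possible encoding, offered only as an assumption of what the state could look like here: concatenate per-workstation time-until-free with per-job counts of remaining operations into a fixed-length vector a DQN can consume.

    import numpy as np

    def encode_state(time_until_free, remaining_ops, horizon=100.0, max_ops=10.0):
        """time_until_free: length-W array; remaining_ops: length-N array."""
        ws = np.asarray(time_until_free, dtype=float) / horizon    # normalise to roughly [0, 1]
        jobs = np.asarray(remaining_ops, dtype=float) / max_ops
        return np.concatenate([ws, jobs])                          # fixed-length DQN input

    state = encode_state(time_until_free=[0.0, 12.5, 3.0],         # W = 3 workstations
                         remaining_ops=[4, 0, 2, 5])               # N = 4 jobs
    print(state.shape, state)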
2 votes · 0 answers · 142 views
Can I find the input that maximises the output of a Neural Network?
So I trained a 2-layer neural network for a regression problem that takes $D$ features $(x_1,...,x_D)$ and outputs a real value $y$. With the model already trained (weights optimised, fixed), can I ...
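In principle, yes: run gradient ascent on the input with the weights held fixed. A minimal TensorFlow sketch (the model below is an untrained stand-in, not the asker's network):

    import tensorflow as tf
    from tensorflow import keras

    D = 5
    model = keras.Sequential([keras.Input(shape=(D,)),
                              keras.layers.Dense(16, activation="relu"),
                              keras.layers.Dense(1)])   # pretend these weights are the trained ones

    x = tf.Variable(tf.random.normal((1, D)))           # the input we optimise
    for _ in range(200):
        with tf.GradientTape() as tape:
            y = model(x)
        x.assign_add(0.05 * tape.gradient(y, x))        # ascend the model's output
    print("candidate maximiser:", x.numpy(), "output:", model(x).numpy()[0, 0])

Since the output of an unconstrained ReLU network can be unbounded, this ascent may diverge; in practice one usually restricts $x$ to a bounded region.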
0 votes · 1 answer · 25 views
Binary crossentropy loss
When we have a binary classification problem, we use a sigmoid activation function in the output layer plus a binary cross-entropy loss. We also need to one-hot encode the target variable. This is a binary ...
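For reference, a minimal Keras setup (my own toy data): with a single sigmoid output and binary_crossentropy, the target is a plain 0/1 vector and no one-hot encoding is involved.

    import numpy as np
    from tensorflow import keras

    x = np.random.normal(size=(200, 4)).astype("float32")
    y = (x[:, 0] > 0).astype("float32")                   # shape (200,), values 0.0 / 1.0

    model = keras.Sequential([keras.Input(shape=(4,)),
                              keras.layers.Dense(8, activation="relu"),
                              keras.layers.Dense(1, activation="sigmoid")])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(x, y, epochs=3, verbose=0)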
0 votes · 1 answer · 87 views
How do I know that my weight optimizer has found the best weights?
I am new to deep learning and my understanding of how optimizers work might be slightly off. Also, sorry for the third-grader quality of the images.
For example, if we have a simple task, our loss-to-weight ...
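A toy 1-D picture of the underlying issue (my own example): gradient descent settles into whichever local minimum its starting point flows into, so a converged run has found a minimum of the loss-versus-weight curve, not necessarily the best one.

    def loss(w):
        return w**4 - 3*w**2 + w          # two local minima of different depths

    def grad(w):
        return 4*w**3 - 6*w + 1

    for w0 in (-2.0, 2.0):
        w = w0
        for _ in range(2000):
            w -= 0.01 * grad(w)
        print(f"start {w0:+.1f} -> w = {w:+.3f}, loss = {loss(w):.3f}")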
0 votes · 1 answer · 19 views
Change of parameterization to eliminate weight constraints in neural networks
I am wondering if it makes sense to use a parameterization to eliminate simple weight inequalities; for example, if the weights should be $w\geq 0$, one could train $\exp w$ over the unconstrained set ...
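A tiny numpy sketch of that reparameterisation (my own example): train an unconstrained $v$ and use $w = \exp(v)$, so $w \geq 0$ holds by construction; the chain rule contributes the extra factor $\partial w / \partial v = \exp(v)$ in the gradient.

    import numpy as np

    target, lr = 0.7, 0.1
    v = 0.0                                 # unconstrained parameter
    for _ in range(500):
        w = np.exp(v)                       # w > 0 by construction
        dLdv = 2.0 * (w - target) * w       # d/dv of (w - target)^2 via the chain rule
        v -= lr * dLdv
    print("w =", np.exp(v))                 # close to 0.7, never negative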
0 votes · 0 answers · 26 views
Uncertainties in non-convex optimization problems (neural networks)
How do you treat statistical uncertainties coming from non-convex optimization problems?
More specifically, suppose you have a neural network. It is well known that the loss is not convex; the ...
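One pragmatic treatment, offered as an assumption rather than established practice: retrain from several random seeds and report the spread of the resulting predictions as the optimisation-induced part of the uncertainty.

    import numpy as np
    from tensorflow import keras

    x = np.random.normal(size=(300, 3)).astype("float32")
    y = (x ** 2).sum(axis=1, keepdims=True).astype("float32")
    x_test = np.zeros((1, 3), dtype="float32")

    preds = []
    for seed in range(5):
        keras.utils.set_random_seed(seed)             # different initialisation each run
        m = keras.Sequential([keras.Input(shape=(3,)),
                              keras.layers.Dense(32, activation="relu"),
                              keras.layers.Dense(1)])
        m.compile(optimizer="adam", loss="mse")
        m.fit(x, y, epochs=30, verbose=0)
        preds.append(m.predict(x_test, verbose=0)[0, 0])
    print(f"prediction: {np.mean(preds):.3f} +/- {np.std(preds):.3f} across seeds")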
0 votes · 1 answer · 244 views
Implementing a Randomized Neural Network using TensorFlow?
I want to implement a Randomised Neural Network (alt. Neural Network with Random Weights (NNRW)) in Keras based on the following paper: https://arxiv.org/pdf/2104.13669.pdf
Essentially the idea is the ...
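A minimal Keras sketch of that idea as I read it (not the paper's reference code): keep the hidden layer at its random initial weights with trainable=False and fit only the linear output layer.

    import numpy as np
    from tensorflow import keras

    x = np.random.normal(size=(500, 10)).astype("float32")
    y = np.sin(x).sum(axis=1, keepdims=True).astype("float32")

    model = keras.Sequential([
        keras.Input(shape=(10,)),
        keras.layers.Dense(256, activation="tanh", trainable=False),  # random, frozen features
        keras.layers.Dense(1),                                        # only this layer is trained
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(x, y, epochs=50, verbose=0)

Classic NNRW/ELM-style models usually solve that last linear layer in closed form by least squares rather than by gradient descent, which this sketch does not do.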
0 votes · 0 answers · 39 views
Why does my regression-NN completely fail to predict some points?
I would like to train an NN in order to approximate an unknown function $y = f(x_1,x_2)$. I have a lot of measurements $y = [y_1,\dots,y_K]$ (with $K$ possibly in the range of 10,000 to 100,000) ...
0 votes · 2 answers · 2k views
Is reinforcement learning analogous to stochastic gradient descent?
Not in a strict mathematical-formulation sense, but would there be any key overlapping principles between the two optimisation approaches?
For example, how does
$$\{x_i, y_i, \mathrm{grad}_i \}$$ (...
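For what it is worth, the two update rules can be lined up in their standard textbook forms (this is my own framing, not the poster's notation):
$$\theta_{t+1} = \theta_t - \eta \, \nabla_\theta \, \ell\big(f_\theta(x_i), y_i\big) \qquad \text{(SGD on a labelled example)}$$
$$\theta_{t+1} = \theta_t + \eta \, \hat{G}_t \, \nabla_\theta \log \pi_\theta(a_t \mid s_t) \qquad \text{(REINFORCE on a sampled action)}$$
Both are stochastic first-order updates; the key difference is that supervised learning differentiates a known per-example loss, while the policy gradient must estimate its gradient from sampled returns $\hat{G}_t$.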
0 votes · 1 answer · 226 views
Why does my neural network have constant loss and always predict a single value?
I am trying to build a neural network on a dataset with 257 features and 1 target variable. My code looks like the following:
...