All Questions

9 votes · 4 answers · 2k views

Does training a neural network on a combined dataset outperform sequential training on individual datasets?

I have a neural network with a fixed architecture (let's call it Architecture A). I also have two datasets, Dataset 1 and Dataset 2, both of which are independently and identically distributed (i.i.d.)...

asked by Arvind Kumar Sharma
5 votes · 2 answers · 644 views

Is there any advantage of a lower value of a loss function?

I have two loss functions $\mathcal{L}_1$ and $\mathcal{L}_2$ to train my model. The model is predominantly a classification model. Both $\mathcal{L}_1$ and $\mathcal{L}_2$ are two variants of ...

asked by Aleph
1 vote · 1 answer · 60 views

How does seeing training batches only once influence the generalization of a neural network?

I am referring to the question/scenario "Train neural network with unlimited training data", but unfortunately I cannot comment there. As I am not seeing any training batch multiple times, I would guess that ...

asked by ZenDen
0 votes · 1 answer · 47 views

Learning the gradient descent stepsize with RL [closed]

Problem statement: I've been working on a project to accelerate the convergence of gradient descent using reinforcement learning (RL). I want to learn a policy that can map the current state of ...

asked by CodeGuy
1 vote · 2 answers · 403 views

Gradient Descent: Is the magnitude in Gradient Vectors arbitrary?

I am only just getting familiar with gradient descent through learning logistic regression. I understand the directional component in the gradient vectors is correct information derived from the slope ...

asked by MrHunda
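The magnitude question above has a concrete answer a toy example can show: in vanilla gradient descent the update is $w \leftarrow w - \eta\,\nabla f(w)$, so the gradient's magnitude is not arbitrary — it directly scales the step length. A minimal sketch on the toy objective $f(w) = (w-3)^2$ (illustrative only, not the asker's logistic-regression setup):

```python
# Toy objective f(w) = (w - 3)^2, so grad f(w) = 2 * (w - 3).
# The gradient's direction says which way to step; its magnitude
# sets how far -- steps shrink automatically near the minimum.

def grad(w):
    return 2.0 * (w - 3.0)

w, lr = 0.0, 0.1
step_sizes = []
for _ in range(50):
    g = grad(w)
    step_sizes.append(abs(lr * g))   # step length is proportional to |grad|
    w -= lr * g

# w is now close to the minimizer 3, and the recorded step sizes
# decayed along with the gradient magnitude.
```

Because the magnitude carries curvature information, rescaling it (as normalized-gradient or sign-based methods do) changes the optimizer's behavior, which is exactly why the question is worth asking.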
0 votes · 0 answers · 21 views

How to decide a State for Deep Q Learning for Production Line scheduling

There is a production floor with W workstations and N jobs with M operations (different processing times per operation). A job is completed only if its M operations are completed. The objective is to ...

asked by ArchanaR
2 votes · 0 answers · 142 views

Can I find the input that maximises the output of a Neural Network?

So I trained a 2-layer Neural Network for a regression problem that takes $D$ features $(x_1,...,x_D)$ and outputs a real value $y$. With the model already trained (weights optimised, fixed), can I ...

asked by puradrogasincortar
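The setup above is commonly attacked by "input optimization": with the weights frozen, treat the output as a function of the input alone and run gradient *ascent* on $x$. A hedged NumPy sketch with a stand-in random 2-layer network (the real trained weights would replace `W1`, `b1`, `w2`; in general this only finds a local maximum, since the output is not concave in $x$):

```python
import numpy as np

# Stand-in for a trained 2-layer net: R^2 -> R with fixed weights.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)
w2 = rng.normal(size=4)

def f(x):
    return w2 @ np.tanh(W1 @ x + b1)

def num_grad(x, eps=1e-5):
    # Finite-difference gradient w.r.t. the *input* (weights stay fixed).
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x = np.zeros(2)
start = f(x)
for _ in range(200):
    x += 0.1 * num_grad(x)   # gradient ASCENT on the input
# f(x) has climbed from the starting value toward a local maximum.
```

In an autodiff framework the finite differences would be replaced by the exact input gradient; restarting from several initial points is the usual hedge against local maxima.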
0 votes · 1 answer · 25 views

Binary crossentropy loss

When we have a binary classification problem, we use a sigmoid activation function in the output layer + a binary crossentropy loss. We also need to one-hot encode the target variable. This is a binary ...

asked by Ahmed Mohamed
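One caveat on the premise above: with a single sigmoid output unit, binary cross-entropy takes the target as one scalar in $\{0, 1\}$ per example, so no one-hot encoding is needed; one-hot targets only enter if you instead use a two-unit softmax with categorical cross-entropy. A NumPy sketch of the loss itself:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-12):
    # Clip to avoid log(0); y_true holds plain 0/1 labels, not one-hot rows.
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

logits = np.array([2.0, -1.0, 0.5])
y = np.array([1.0, 0.0, 1.0])        # one scalar label per example
loss = binary_crossentropy(y, sigmoid(logits))
# loss is approximately 0.3048 for these three examples
```

The two formulations (sigmoid + BCE with scalar labels vs. 2-way softmax + categorical cross-entropy with one-hot labels) are mathematically equivalent for binary problems.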
0 votes · 1 answer · 87 views

How do I know that my weights optimizer has found the best weights?

I am new to deep learning and my understanding of how optimizers work might be slightly off. Also, sorry for the third-grader quality of the images. For example, if we have a simple task, our loss-to-weight ...

asked by Neriko
0 votes · 1 answer · 19 views

Change parameterization to eliminate weight constraints in neural networks

I am wondering if it makes sense to use a parameterization to eliminate simple weight inequalities; for example, if the weights should be $w \geq 0$, one could train $\exp w$ over the unconstrained set ...

asked by Philipp123
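The reparameterization asked about above is a standard trick: optimize an unconstrained variable $v$ and set $w = e^{v} > 0$, letting the chain rule ($\partial L/\partial v = \partial L/\partial w \cdot e^{v}$) carry gradients through the exponential, with no projection or clipping step. A toy NumPy sketch on an illustrative least-squares problem (not the asker's network):

```python
import numpy as np

# Toy problem: fit y ~= w * x by mean squared error, with w constrained
# to be positive. We optimize unconstrained v and set w = exp(v).
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x                      # true slope is 2 (positive)

v = 0.0                          # unconstrained; w = exp(v) > 0 always
for _ in range(500):
    w = np.exp(v)
    dL_dw = np.mean(2 * (w * x - y) * x)   # d/dw of the MSE
    v -= 0.05 * dL_dw * np.exp(v)          # chain rule: dL/dv = dL/dw * e^v
# exp(v) has converged to roughly 2 while remaining positive throughout.
```

One known caveat: the exponential changes the optimization geometry (gradients vanish as $w \to 0$), so alternatives like $w = \mathrm{softplus}(v)$ or squaring are sometimes preferred.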
0 votes · 0 answers · 26 views

Uncertainties in non-convex optimization problems (neural networks)

How do you treat statistical uncertainties coming from non-convex optimization problems? More specifically, suppose you have a neural network. It is well known that the loss is not convex; the ...

asked by Dave
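One common (though not the only) way to probe the optimization uncertainty asked about above is multi-start: rerun the optimizer from several random initializations and inspect the spread of the final losses across the local minima reached. A toy one-dimensional non-convex sketch:

```python
import numpy as np

def loss(w):
    return np.sin(3 * w) + 0.1 * w ** 2      # several local minima

def grad(w):
    return 3 * np.cos(3 * w) + 0.2 * w

rng = np.random.default_rng(0)
finals = []
for _ in range(20):
    w = rng.uniform(-4, 4)                   # random restart
    for _ in range(300):
        w -= 0.01 * grad(w)                  # plain gradient descent
    finals.append(loss(w))
# The restarts land in different basins, so `finals` has a visible
# spread -- a crude empirical picture of the optimization uncertainty.
```

For neural networks the analogous practice is training several seeds (initializations and data shuffles) and reporting mean and spread of the final metric, which mixes optimization noise with the statistical noise of the data.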
0 votes · 1 answer · 244 views

Implementing a Randomized Neural Network using Tensorflow?

I want to implement a Randomised Neural Network (alt. Neural Network with Random Weights (NNRW)) in Keras based on the following paper: https://arxiv.org/pdf/2104.13669.pdf. Essentially the idea is the ...

asked by SwagCakes
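A framework-agnostic sketch of the NNRW idea as it is commonly described (I have not verified the linked paper's exact variant): draw the hidden-layer weights at random, freeze them, and train only the linear output layer, here in closed form with least squares. In Keras the frozen hidden layer would correspond to a `Dense` layer with `trainable=False`; all sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(X @ np.array([1.0, -2.0, 0.5]))   # toy target function

# Random hidden layer: weights drawn once and never updated.
W = rng.normal(size=(3, 50))
b = rng.normal(size=50)
H = np.tanh(X @ W + b)                       # random nonlinear features

# Only the linear readout is fitted (closed-form least squares).
A = np.hstack([H, np.ones((200, 1))])        # append a bias column
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ beta
mse = np.mean((pred - y) ** 2)               # small: random features fit well
```

The appeal is that training reduces to a convex (here even closed-form) problem; the cost is that the random features are not adapted to the task.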
0 votes · 0 answers · 39 views

Why does my regression-NN completely fail to predict some points?

I would like to train a NN in order to approximate an unknown function $y = f(x_1,x_2)$. I have a lot of measurements $y = [y_1,\dots,y_K]$ (with $K$ possibly in the range of 10–100 thousand) ...

asked by MttRch
0 votes · 2 answers · 2k views

Is reinforcement learning analogous to stochastic gradient descent?

Not in a strict mathematical formulation sense, but would there be any key overlapping principles for the two optimisation approaches? For example, how does $\{x_i, y_i, \mathrm{grad}_i\}$ (...

asked by hH1sG0n3
0 votes · 1 answer · 226 views

Why is my Neural Network having constant loss and always predicting a single value?

I am trying to make a neural network on a dataset with 257 features and 1 target variable. My code looks like the following: ...

asked by bballboy8
