Newest 'optimization' Questions

3 votes

4 answers

160 views

Rounding Float Values in ML Models

Let's assume I have a column with float values (e.g., 3.12334354454, 5.75434331354, and so on). If I round these values to two decimal places (e.g., 3.12, 5.75), I think the advantages and ...

Guna

390

asked 18 hours ago

3 votes

1 answer

59 views

Why MAE is hard to optimize?

In numerous sources it is said that MAE has the disadvantage of not being differentiable at zero, hence it has problems with gradient-based optimization methods. However, I've never seen an explanation why ...

Nourless

163

asked Apr 11 at 10:33
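
As a small illustration of the non-differentiability mentioned in the question above, the sketch below compares a subgradient of the absolute error with the gradient of the squared error for a single residual. The function names are hypothetical, and returning 0 at the kink is only a convention (any value in [-1, 1] is a valid subgradient there); none of this comes from the question itself.

```python
def mae_subgradient(residual: float) -> float:
    """A subgradient of |r| with respect to r; 0.0 is chosen at the kink r == 0."""
    if residual > 0:
        return 1.0
    if residual < 0:
        return -1.0
    return 0.0


def mse_gradient(residual: float) -> float:
    """Gradient of r**2 with respect to r."""
    return 2.0 * residual


# Print both quantities as the residual shrinks toward zero.
for r in (1.0, 0.1, 0.001):
    print(r, mae_subgradient(r), mse_gradient(r))
```

The absolute-error subgradient keeps magnitude 1 however small the residual is, while the squared-error gradient shrinks with it. That is one commonly cited reason why fixed-step gradient methods tend to oscillate around the minimum with MAE.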

9 votes

4 answers

2k views

Does training a neural network on a combined dataset outperform sequential training on individual datasets?

I have a neural network with a fixed architecture (let's call it Architecture A). I also have two datasets, Dataset 1 and Dataset 2, both of which are independently and identically distributed (i.i.d.)...

Arvind Kumar Sharma

91

asked Mar 24 at 21:01

0 votes

0 answers

16 views

Stale weights and gradients given Adam with an optimal learning rate

I'm fitting a network to predict a delta between eight corresponding 3D points at two timesteps. The model consists of two MLPs with two layers each, with LeakyRELU in between the layers. It takes in ...

zak

31

asked Mar 15 at 22:04

1 vote

0 answers

20 views

Second Moment (Uncentered Variance) Estimate of Gradient

I am reading Kingma and Lei Ba's paper introducing the Adam optimizer. I was looking over some derivations for the second moment estimate: I noticed that they find the sum of a finite geometric ...

Mateo del Rio Lanse

11

asked Mar 2 at 17:35
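
The geometric-series step that the question above refers to can be checked numerically. This is a minimal sketch, assuming the usual Adam recursion v_t = beta2 * v_{t-1} + (1 - beta2) * g_t^2 with v_0 = 0; the function name and the constant-gradient setup are illustrative assumptions, not part of the question.

```python
def second_moment_estimate(squared_grads, beta2=0.999):
    """Run v_t = beta2 * v_{t-1} + (1 - beta2) * g_t**2 starting from v_0 = 0."""
    v = 0.0
    for g2 in squared_grads:
        v = beta2 * v + (1.0 - beta2) * g2
    return v


beta2, steps, g2 = 0.999, 10, 4.0
v_t = second_moment_estimate([g2] * steps, beta2)

# With a constant g_t**2, the weights (1 - beta2) * beta2**(t - i) form a
# finite geometric series that sums to 1 - beta2**t, so dividing by that
# factor recovers g_t**2.
v_hat = v_t / (1.0 - beta2 ** steps)
print(v_t, v_hat)
```

Under these assumptions the uncorrected estimate is biased toward zero early in training, and the division by 1 - beta2**t removes that bias.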

0 votes

0 answers

9 views

Nesterov Accelerated Gradient Descent Stalling with High Regularization in Extreme Learning Machine

I'm implementing Nesterov Accelerated Gradient Descent (NAG) on an Extreme Learning Machine (ELM) with one hidden layer. My loss function is the Mean Squared Error (MSE) with L2 regularization. The ...

Paolo Pedinotti

1

asked Mar 2 at 10:12

0 votes

0 answers

23 views

Question on Optimized Threshold in Predictive Modeling

I'm trying to build a predictive model, but I haven't found a method that consistently delivers high performance. Is it acceptable to use an optimized classification threshold of 0.996?

waleed almutairi

1

asked Feb 23 at 17:42

0 votes

0 answers

15 views

Optimizing LLM-Based Field Extraction Across 3500+ Document Templates

We are using Azure Document Intelligence to extract all content from PDFs containing 1-7 pages. After extraction, we pass the content to an LLM (OpenAI) to extract only the required 35-40 fields. The ...

Johnimmanuel

1

asked Jan 22 at 7:00

0 votes

0 answers

15 views

Error in plotting Gaussian Process for 3 models that use Bayesian Optimization

I'm writing a Python script for Orange Data Mining to plot the Gaussian processes in order to find the best hyperparameters for the 5-fold cross-validation accuracy metric. The three models are SVC, ...

Mattma

1

asked Jan 15 at 17:06

0 votes

0 answers

31 views

Objective function in reward model in Vanilla RLHF is ambiguous for me

I am trying to learn the background of Vanilla RLHF. I am struggling to understand the objective function in the reward model. It is defined as the difference of the log of the sigmoid of the difference ...

Baghban

101

asked Jan 14 at 18:35
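
For orientation, here is a minimal sketch of a pairwise reward-model loss of the "log of the sigmoid of the difference" kind that the question above describes. It assumes the commonly used form -log(sigmoid(r_chosen - r_rejected)) for a single preference pair; the excerpt is truncated, so whether this matches the exact objective the asker means is an assumption, and the function name is hypothetical.

```python
import math


def pairwise_reward_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log(sigmoid(reward_chosen - reward_rejected)), computed stably."""
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) equals log(1 + exp(-d)); branch on the sign of d so
    # that exp() is only ever called on a non-positive argument.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# The loss is small when the preferred response gets the higher reward
# and large when the ordering is reversed.
print(pairwise_reward_loss(2.0, 0.0))
print(pairwise_reward_loss(0.0, 2.0))
```

Minimizing this quantity pushes the reward of the preferred response above that of the rejected one; only the difference between the two rewards matters, not their absolute values.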

5 votes

2 answers

644 views

Is there any advantage of a lower value of a loss function?

I have two loss functions $\mathcal{L}_1$ and $\mathcal{L}_2$ to train my model. The model is predominantly a classification model. Both $\mathcal{L}_1$ and $\mathcal{L}_2$ are two variants of ...

Aleph

185

asked Jan 6 at 3:05

2 votes

0 answers

33 views

Effect of objective function's Hessian's condition number on learning rate in Gradient Descent

I'm following the book Deep Learning by Ian Goodfellow et al., and in Chapter 4 - Numerical Computation, page 87, the authors mention that by utilising a second-order Taylor approximation of the objective ...

Aditya

121

asked Dec 28, 2024 at 13:16
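
The link between curvature and step size raised in the question above can be sketched with the step that minimizes a second-order Taylor model along the negative gradient, g^T g / (g^T H g). The diagonal Hessian with eigenvalues 1 and 100 and the function name are illustrative assumptions, and the formula only applies when g^T H g is positive.

```python
def optimal_step(grad, hessian):
    """Step size g^T g / (g^T H g); requires g^T H g > 0."""
    # Matrix-vector product H g, written out for plain Python lists.
    hg = [sum(h_ij * g_j for h_ij, g_j in zip(row, grad)) for row in hessian]
    gg = sum(g * g for g in grad)
    ghg = sum(g * v for g, v in zip(grad, hg))
    return gg / ghg


hessian = [[1.0, 0.0], [0.0, 100.0]]  # condition number 100

# Gradient along the low-curvature eigenvector: a large step is allowed.
step_flat = optimal_step([1.0, 0.0], hessian)
# Gradient along the high-curvature eigenvector: the step must be small.
step_steep = optimal_step([0.0, 1.0], hessian)
print(step_flat, step_steep)
```

With a single global learning rate, the step has to be small enough for the steepest direction, which makes progress along the flattest direction slow. The ratio of the two step sizes here equals the condition number of the assumed Hessian.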

0 votes

0 answers

17 views

How to improve LSTM model performance for weather prediction?

I predict rainfall using observational data. There are a total of 87,070 data samples, but only 1,885 samples have rainfall. And here is the LSTM model I am using: ...

Vinh Nguyen

1

asked Dec 5, 2024 at 9:14

0 votes

0 answers

17 views

Given the total cost of a graph walk, how to estimate the cost of each edge?

I have a real-world problem in which I have a collection of nodes and their edges. This collection is composed of hundreds of nodes and thousands of connections. Then I have about 10 K datapoints each ...

Althis

123

asked Dec 3, 2024 at 15:16

4 votes

3 answers

247 views

What is best package for convex optimization?

I have a set of problems of the form $\text{min} \|Ax-y\|_1$ with some constraints on the $x_i$. A quick search turns up the cvxpy, ...

Edmund

757

asked Nov 24, 2024 at 20:07
