6 votes

For a square matrix of data, I achieve $R^2=1$ for Linear Regression and $R^2=0$ for Lasso. What's the intuition behind this?

A few things are going on here: your matrix is 100x100, so you have no degrees of freedom left in a linear model, which will cause $R^2=1$ (see this post). You use random numbers, so they should make ...
Peter • 7,866
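A minimal sketch of the setup described above, assuming a 100x100 matrix of random numbers and in-sample scoring (synthetic data, not the asker's):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 100))   # square data matrix: n = p = 100
y = rng.normal(size=100)          # random target, unrelated to X

# With p >= n, OLS can interpolate the training data, so in-sample R^2 = 1
print(LinearRegression().fit(X, y).score(X, y))

# The L1 penalty shrinks every (useless) coefficient to ~0, so the model
# predicts roughly the mean of y, giving in-sample R^2 ~ 0
print(Lasso(alpha=1.0).fit(X, y).score(X, y))
```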
4 votes

Why does Lasso behave "erratically" when the number of features is greater than the number of training instances?

When $p > n$, the LASSO model can only sustain up to $n$ variables (this can be proven using linear algebra, the rank of the data matrix in particular), leaving at least $p - n$ variables out (some that ...
aranglol • 2,236
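A quick check of the "at most $n$ active features" behaviour, on assumed toy data with $n = 50$ and $p = 200$ (the alpha value is illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 50, 200                                   # more features than samples
X = rng.normal(size=(n, p))
y = X[:, :10] @ rng.normal(size=10) + 0.1 * rng.normal(size=n)

model = Lasso(alpha=0.05, max_iter=50_000).fit(X, y)
print("non-zero coefficients:", int(np.sum(model.coef_ != 0)), "<= n =", n)
```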
4 votes

Normalisation results in $R^2$ score of 0 - Lasso regression

Standardizing/normalizing is generally the right thing to do, but it will make little/no difference with just one independent variable if you also adjust the regularization strength. With more than ...
Ben Reiniger • 12.7k
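A sketch of the single-feature equivalence the answer alludes to, on assumed toy data: rescaling the one feature and rescaling the penalty by the same factor leaves the fitted model unchanged:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y = 3 * x[:, 0] + rng.normal(size=200)

c = 10.0                                      # pretend scaling factor
m1 = Lasso(alpha=0.1).fit(x, y)
m2 = Lasso(alpha=0.1 / c).fit(x / c, y)       # rescaled feature, rescaled penalty

print(m1.predict(x[:3]))
print(m2.predict(x[:3] / c))                  # essentially the same predictions
```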
3 votes

Group lasso and feature selection

Presumably, you need a sparse group logistic regression model to perform feature selection while considering the binary response. skglm is a new modular, scikit-...
Badr MOUFAD
3 votes
Accepted

Elegant way to plot the L2 regularization path of logistic regression in python?

sklearn already has such functionality for regression problems, in enet_path and lasso_path. There's an example notebook here....
Ben Reiniger • 12.7k
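For the logistic case specifically, a hand-rolled sketch (toy data assumed) that refits over a grid of C values and plots each coefficient against C:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=8, random_state=0)
Cs = np.logspace(-3, 2, 30)                  # inverse regularization strengths

# One row of coefficients per value of C
coefs = np.array([
    LogisticRegression(penalty="l2", C=C, max_iter=5000).fit(X, y).coef_.ravel()
    for C in Cs
])

plt.semilogx(Cs, coefs)
plt.xlabel("C (inverse regularization strength)")
plt.ylabel("coefficient value")
plt.title("L2 regularization path")
plt.show()
```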
3 votes

How do standardization and normalization impact the coefficients of linear models?

When you have a linear regression (without any scaling, just plain numbers) and you have a model with one explanatory variable $x$ and coefficients $\beta_0=0$ and $\beta_1=1$, then you essentially ...
Peter • 7,866
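A tiny illustration of that setup, assuming $y = x$ exactly (so $\beta_0=0$, $\beta_1=1$): after standardizing, the slope becomes the standard deviation of $x$ rather than 1:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(300, 1))
y = x[:, 0]                                   # beta0 = 0, beta1 = 1

print(LinearRegression().fit(x, y).coef_)     # ~[1.0] on the raw scale

z = StandardScaler().fit_transform(x)
print(LinearRegression().fit(z, y).coef_)     # ~[std(x)] ~ [2.0] after scaling
```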
3 votes
Accepted

Need advice regarding cross-validation to obtain optimal lambda in Lasso

Welcome to DS.SE @h_ihkam! "So how can I decide on the search range? What is the best practice? Please provide me with some guidance." Good questions!! Choosing the Optimal Lambda in LASSO Using ...
Robert Long • 3,238
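One common way to search a lambda/alpha range, sketched with LassoCV on assumed toy data (the grid bounds are illustrative, not a recommendation):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=30, noise=10, random_state=0)

# Standardize, then cross-validate over a log-spaced alpha grid
model = make_pipeline(
    StandardScaler(),
    LassoCV(alphas=np.logspace(-4, 1, 100), cv=5, max_iter=50_000),
)
model.fit(X, y)
print("chosen alpha:", model[-1].alpha_)
```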
2 votes

How does Lasso regression shrink coefficients to zero, and why does ridge regression not shrink coefficients to zero?

These diagrams show the "constrained" version of lasso/ridge, in which you minimize the pure loss function subject to a constraint $\|\beta\|_1\leq t$ or $\|\beta\|_2\leq t$. (Another ...
Ben Reiniger • 12.7k
2 votes
Accepted

What is the meaning of the sparsity parameter

When we implement penalized regression models we are saying that we are going to add a penalty to the sum of the squared errors. Recall that the sum of squared errors is the following and that we are ...
Ethan • 1,657
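For reference, the penalized objective this answer is building towards, in the usual lasso notation (the formula itself is cut off in the excerpt, so this is the standard form, not a quote):

$$\min_{\beta_0,\beta}\ \sum_{i=1}^{n}\Bigl(y_i-\beta_0-\sum_{j=1}^{p}x_{ij}\beta_j\Bigr)^2 \;+\; \lambda\sum_{j=1}^{p}\lvert\beta_j\rvert$$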
2 votes

Lasso regression not getting better without random features

In answer to your first question: The reason that your RMSE proceeded to increase as you increased the strength of your regularization (the value of $\lambda$) can be explained by reviewing the ...
Ethan • 1,657
2 votes
Accepted

Do I have to remove features with pairwise correlation even if I am doing a regularized logistic regression?

Yes, the L1 regularization will shrink the irrelevant feature coefficients to zero, and hence it doesn't require feature selection. In fact, it IS a commonly used feature selection technique. So ...
spectre • 2,223
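A sketch of using the L1-penalized model itself as the feature selector, as the answer suggests (toy data and the C value are assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

# L1-penalized logistic regression doubles as the feature selector
selector = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
)
pipe = make_pipeline(StandardScaler(), selector,
                     LogisticRegression(max_iter=1000))
pipe.fit(X, y)
print("features kept:", pipe[1].get_support().sum(), "of", X.shape[1])
```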
2 votes
Accepted

What's the correct cost function for Linear Regression

Interesting question. I'd say it is correct not to divide, due to the following reasoning... For linear regression there is no difference. The optimum of the cost function stays the same, regardless ...
MB-F • 286
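In symbols (standard notation, not quoted from the answer): dividing by $n$ rescales the objective but not its minimizer,

$$\arg\min_{\beta}\ \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - x_i^\top\beta\bigr)^2 \;=\; \arg\min_{\beta}\ \sum_{i=1}^{n}\bigl(y_i - x_i^\top\beta\bigr)^2 .$$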
2 votes
Accepted

Difference between PCA and regularisation

Lasso does feature selection in the sense that a penalty is added to the OLS loss function (see figure below). So you can say that features with low "impact" will be "shrunken" by ...
Peter • 7,866
1 vote

Lack of standardization in Kaggle jupyter notebooks when using lasso/ridge?

Kaggle is a crowd-sourced platform with no quality control. It is to be expected that there will be deviations from best practices.
Brian Spiering
1 vote

How to compare two multivariate methods for filling NAs

You don't at this stage. Train a few models with each method and compare.
lpounng • 1,177
1 vote
Accepted

Why is gridsearchCV.best_estimator_.score giving me r2_score even if I mentioned MAE as my main scoring metric?

This is the default behavior for any Scikit-learn regressor, and as far as I know, it cannot be modified. So for regressors, the score method will return the $R^2$ ...
Multivac • 3,139
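A sketch of the contrast on assumed toy data: the grid-search object scores with its own scoring argument, while best_estimator_.score falls back to the regressor's default $R^2$:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_regression(n_samples=300, n_features=10, noise=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

grid = GridSearchCV(Lasso(max_iter=50_000),
                    {"alpha": [0.01, 0.1, 1.0]},
                    scoring="neg_mean_absolute_error", cv=5)
grid.fit(X_tr, y_tr)

print(grid.score(X_te, y_te))                   # negative MAE (the grid's scoring)
print(grid.best_estimator_.score(X_te, y_te))   # R^2 (the regressor's default)
print(mean_absolute_error(y_te, grid.predict(X_te)))
```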
1 vote

Is it possible to explain why Lasso models eliminated certain coefficient?

Have a look at "Introduction to Statistical Learning" (Chapter 6.2.2). The Lasso adds an additional penalty term to the original OLS loss. In addition to the residual sum of squares (RSS, ...
Peter • 7,866
1 vote
Accepted

Accessing regression coefficients when using MultiOutputRegressor

Instead of using the estimator attribute you should be using the best_estimator attribute, after which you can access the ...
Oxbowerce • 8,492
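A minimal sketch of pulling per-target coefficients out of a fitted MultiOutputRegressor (toy data assumed; if the regressor sits inside a grid search, go through best_estimator_ first):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Y = X @ rng.normal(size=(5, 3)) + 0.1 * rng.normal(size=(200, 3))  # 3 targets

model = MultiOutputRegressor(Lasso(alpha=0.01)).fit(X, Y)
for i, est in enumerate(model.estimators_):     # one fitted Lasso per target
    print(f"target {i}: coef_ = {est.coef_}")
```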
1 vote

What (linear) model is it common practice to use on a sample size of 500 with 26 features?

The predictive power of a model is highly contingent on the data generating process and it is ex ante hard to tell what will work best (especially with limited information about the data as in this ...
Peter • 7,866
1 vote

How to set coefficient limit in lasso regression in Python?

Scikit-learn (which I'm assuming you're using) does not allow you to constrain the coefficients in such a way (at most you can constrain them to all be positive with ...
mdgrogan
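The one constraint scikit-learn does expose, sketched on assumed toy data: positive=True forces all lasso coefficients to be non-negative:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, -3.0, 0.0, 1.0, 0.0]) + 0.1 * rng.normal(size=200)

print(Lasso(alpha=0.1).fit(X, y).coef_)                  # may contain negatives
print(Lasso(alpha=0.1, positive=True).fit(X, y).coef_)   # all coefficients >= 0
```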
1 vote

How to remove features from a sklearn pipeline after it has already been fitted?

The StandardScaler was trained on 30 features, so it expects exactly 30 features. One simple hack is to create a new ...
Uday • 576
1 vote
Accepted

Lasso Regression for Feature Importance saying almost every feature is unimportant?

Change (search over) the penalty parameter of lasso. FinalRevenue = RevenueSoFar is a good baseline "model," but hopefully your other features can ...
Ben Reiniger • 12.7k
1 vote

Interpreting machine learning coefficients

Neural Networks are notoriously good at performance and bad at interpretability, i.e. it's very difficult (almost impossible) to explain why a particular prediction was made. It's even more difficult ...
Erwan • 26.2k
1 vote

What is the meaning of the sparsity parameter

@Ethan is correct about the formulation of the lasso penalty, and I think it's particularly important to understand it in that form (for one thing, because that same penalty can work with other models ...
Ben Reiniger • 12.7k
1 vote

How does Lasso regression shrink coefficients to zero, and why does ridge regression not shrink coefficients to zero?

This StatQuest video does a fantastic job of explaining in simple terms why this is the case.
Oliver Foster
1 vote

regarding lasso.score in lasso modeling using scikit-learn

$R^2$ is a statistical measure of how close the data are to the fitted regression line. It does this by measuring the percentage of the variance of the dependent variable that's explained by the independent variables. ...
prashant0598 • 1,561
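A sketch (toy data assumed) showing that lasso.score is just $R^2$ computed on the data you pass in:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score

X, y = make_regression(n_samples=150, n_features=8, noise=10, random_state=0)
lasso = Lasso(alpha=0.5).fit(X, y)

print(lasso.score(X, y))                # score method of the fitted Lasso
print(r2_score(y, lasso.predict(X)))    # identical value, computed explicitly
```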
1 vote

How do standardization and normalization impact the coefficients of linear models?

I believe that with scaling the coefficients are scaled by the same factor, i.e. by the standard deviation with standardization and by (max − min) with normalization. If we look at all the features individually, we ...
10xAI • 5,839
1 vote
Accepted

How is learning rate calculated in sklearn Lasso regression?

With sklearn you have two approaches for linear regression: 1) the LinearRegression object uses the Ordinary Least Squares (OLS) solver from scipy, as the learning rate (...
Carlos Mougan
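A sketch of the usual contrast here, under the assumption that the truncated answer goes on to compare solvers: sklearn's Lasso uses coordinate descent, which has no learning rate, whereas SGDRegressor with an L1 penalty is the variant that exposes one:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, SGDRegressor

X, y = make_regression(n_samples=200, n_features=10, noise=5, random_state=0)

lasso = Lasso(alpha=0.1).fit(X, y)           # coordinate descent: no learning rate
sgd = SGDRegressor(penalty="l1", alpha=0.1,
                   learning_rate="invscaling",
                   eta0=0.01).fit(X, y)      # gradient steps: eta0 sets the step size

print(lasso.coef_)
print(sgd.coef_)                             # rougher fit; SGD also likes scaled features
```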
1 vote
Accepted

When should we start using stacking of models?

Stacking is going to help most when individual models capture unique characteristics of the data. It is often the case that different architectures perform similarly, if somewhat differently, on the ...
HEITZ • 911
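A minimal stacking sketch (the base estimators are illustrative choices, not a recommendation): it tends to help most when the base models make different kinds of errors:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LassoCV, Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=400, n_features=20, noise=15, random_state=0)

# A linear base model plus a tree ensemble, blended by a ridge meta-learner
stack = StackingRegressor(
    estimators=[("lasso", LassoCV(cv=5)),
                ("rf", RandomForestRegressor(random_state=0))],
    final_estimator=Ridge(),
)
print(cross_val_score(stack, X, y, cv=5).mean())
```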
1 vote

LASSO remaining features for different penalisation

Lambda is a tuning parameter ("how much regularisation"; I think it is called alpha in sklearn) and you would choose lambda so that you optimise fit (e.g. by MSE). You can do this by running cross ...
Peter • 7,866

Only top scored, non community-wiki answers of a minimum length are eligible