Open
Description
Hello.
I am running into some performance issues when specifying and sampling from a linear model with b-splines in PYMC (chapter 4, R Code reference 4.76). Everything works fast as long as I keep the number of knots and/or the degree low enough (below 10 knots and 2 degrees). As soon as I increase one of the parameters above the mentioned values, the model no longer samples within ~30 seconds, but takes 20 minutes+.
Has anybody else run into similar issues? I am currently puzzled by the rapid performance drop...
Any help or pointers would be appreciated.
Code
import numpy as np
import pymc as pm
from patsy import dmatrix
# data: cherry_blossoms
n_knots = 10
b_spline_degree = 2
years = cherry_blossoms["year"].values
day_of_blossom = cherry_blossoms["doy"].values
knots = np.quantile(years, np.linspace(0, 1, n_knots))
b_spline_basis_dm = dmatrix(
"bs(year, knots=knots, degree=b_spline_degree, include_intercept=True) - 1",
{
"year": years,
"knots": knots[1:-1],
"b_spline_degree": b_spline_degree,
},
)
b_spline_basis = np.asarray(b_spline_basis_dm)
with pm.Model() as b_spline_model:
alpha = pm.Normal("alpha", mu=100, sigma=10)
beta = pm.Normal("beta", mu=0, sigma=10, shape=b_spline_basis.shape[1])
sigma = pm.Exponential("sigma", 1)
mu = pm.Deterministic("mu", alpha + pm.math.dot(b_spline_basis, beta.T))
likelihood = pm.Normal(
"likelihood", mu=mu, sigma=sigma, observed=day_of_blossom
)
trace = pm.sample(draws=1000, tune=1000, chains=4)
Versions:
name : pymc
version : 5.9.1
name : numpy
version : 1.25.2
Best htw
Metadata
Metadata
Assignees
Labels
No labels