Skip to content

Sampling performance b-splines model chapter 04 #233

Open
@hannes-tw

Description

@hannes-tw

Hello.

I am running into some performance issues when specifying and sampling from a linear model with b-splines in PYMC (chapter 4, R Code reference 4.76). Everything works fast as long as I keep the number of knots and/or the degree low enough (below 10 knots and 2 degrees). As soon as I increase one of the parameters above the mentioned values, the model no longer samples within ~30 seconds, but takes 20 minutes+.

Has anybody else run into similar issues? I am currently puzzled by the rapid performance drop...
Any help or pointers would be appreciated.

Code

import numpy as np
import pymc as pm
from patsy import dmatrix

# data: cherry_blossoms

n_knots = 10
b_spline_degree = 2
years = cherry_blossoms["year"].values
day_of_blossom = cherry_blossoms["doy"].values

knots = np.quantile(years, np.linspace(0, 1, n_knots))
b_spline_basis_dm = dmatrix(
    "bs(year, knots=knots, degree=b_spline_degree, include_intercept=True) - 1",
    {
        "year": years,
        "knots": knots[1:-1],
        "b_spline_degree": b_spline_degree,
    },
)

b_spline_basis = np.asarray(b_spline_basis_dm)

with pm.Model() as b_spline_model:
    alpha = pm.Normal("alpha", mu=100, sigma=10)
    beta = pm.Normal("beta", mu=0, sigma=10, shape=b_spline_basis.shape[1])
    sigma = pm.Exponential("sigma", 1)
    mu = pm.Deterministic("mu", alpha + pm.math.dot(b_spline_basis, beta.T))
    likelihood = pm.Normal(
        "likelihood", mu=mu, sigma=sigma, observed=day_of_blossom
    )

    trace = pm.sample(draws=1000, tune=1000, chains=4)

Versions:
name : pymc
version : 5.9.1

name : numpy
version : 1.25.2

Best htw

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions