Releases: pymc-devs/pymc
v4.0.0b1
PyMC 4.0.0 beta 1
⚠ This is the first beta of the next major release, PyMC 4.0.0 (formerly PyMC3). 4.0.0 is a rewrite of large parts of the PyMC code base which makes it faster, adds many new features, and introduces some breaking changes. For the most part, the API remains stable and we expect that most models will work without any changes.
Not-yet working features
We plan to get these working again, but at this point, their inner workings have not been refactored.
- Timeseries distributions (see #4642)
- Mixture distributions (see #4781)
- Cholesky distributions (see WIP PR #4784)
- Variational inference submodule (see WIP PR #4582)
- Elliptical slice sampling (see #5137)
- BaseStochasticGradient (see #5138)
- pm.sample_posterior_predictive_w (see #4807)
- Partially observed Multivariate distributions (see #5260)
Also, check out the milestones for a potentially more complete list.
Unexpected breaking changes (action needed)
- New API is not available in v3.11.5.
- Old API does not work in v4.0.0.
All of the above applies to:
- ⚠ The library is now named, installed, and imported as "pymc". For example: pip install pymc. (Use pip install pymc --pre while we are in the pre-release phase.)
- ⚠ Theano-PyMC has been replaced with Aesara, so all external references to theano, tt, and pymc3.theanof need to be replaced with aesara, at, and pymc.aesaraf (see 4471).
- pm.Distribution(...).logp(x) is now pm.logp(pm.Distribution(...), x)
- pm.Distribution(...).logcdf(x) is now pm.logcdf(pm.Distribution(...), x)
- pm.Distribution(...).random() is now pm.Distribution(...).eval()
- pm.draw_values(...) and pm.generate_samples(...) were removed. The tensors can now be evaluated with .eval().
- pm.fast_sample_posterior_predictive was removed.
- pm.sample_prior_predictive, pm.sample_posterior_predictive and pm.sample_posterior_predictive_w now return an InferenceData object by default, instead of a dictionary (see #5073).
- pm.sample_prior_predictive no longer returns transformed variable values by default. Pass them by name in var_names if you want to obtain these draws (see 4769).
- pm.sample(trace=...) no longer accepts MultiTrace or len(.) > 0 traces (see #5019).
- The GLM submodule was removed; please use Bambi instead.
- pm.Bound interface no longer accepts a callable class as an argument, instead, it requires an instantiated distribution (created via the .dist() API) to be passed as an argument. In addition, Bound no longer returns a class instance but works as a normal PyMC distribution. Finally, it is no longer possible to do predictive random sampling from Bounded variables. Please, consult the new documentation for details on how to use Bounded variables (see 4815).
- pm.logpt(transformed=...) kwarg was removed (816b5f).
- Model(model=...) kwarg was removed
- Model(theano_config=...) kwarg was removed
- Model.size property was removed (use Model.ndim instead).
- dims and coords handling:
- Model.update_start_values(...) was removed. Initial values can be set in the Model.initial_values dictionary directly.
- Test values can no longer be set through pm.Distribution(testval=...) and must be assigned manually.
- Transform.forward and Transform.backward signatures changed.
- pm.DensityDist no longer accepts the logp as its first positional argument. It is now an optional keyword argument. If you pass a callable as the first positional argument, a TypeError will be raised (see 5026).
- pm.DensityDist now accepts distribution parameters as positional arguments. Passing them as a dictionary in the observed keyword argument is no longer supported and will raise an error (see 5026).
- The signature of the logp and random functions that can be passed into a pm.DensityDist has been changed (see 5026).
- Changes to the Gaussian process (gp) submodule:
  - The gp.prior(..., shape=...) kwarg was renamed to size.
  - Multiple methods including gp.prior now require explicit kwargs.
- Changes to the BART implementation:
- Changes to the Gaussian Process (GP) submodule (see 5055):
  - For all implementations, gp.Latent, gp.Marginal etc., cov_func and mean_func are required kwargs.
  - In the Windows test conda environment the mkl version is fixed to version 2020.4, and mkl-service is fixed to 2.3.0. This was required for gp.MarginalKron to function properly.
  - gp.MvStudentT uses rotated samples from StudentT directly now, instead of sampling from pm.Chi2 and then from pm.Normal.
  - The "jitter" parameter, or the diagonal noise term added to Gram matrices such that the Cholesky is numerically stable, is now exposed to the user instead of hard-coded. See the function gp.util.stabilize.
  - The is_observed argument for gp.Marginal* implementations has been deprecated.
  - In the gp.utils file, the kmeans_inducing_points function now passes through kmeans_kwargs to scipy's k-means function.
  - The replace_with_values function has been added to gp.utils.
  - MarginalSparse has been renamed MarginalApprox.
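The jitter item in the GP changes can be illustrated with plain NumPy. This is a minimal sketch of the general idea, not the implementation of gp.util.stabilize; the kernel helper rbf_kernel, the inputs, and the jitter value 1e-6 are assumptions chosen only for illustration.

```python
import numpy as np


def rbf_kernel(x, lengthscale=1.0):
    """Squared-exponential Gram matrix for 1-D inputs (illustrative helper)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


# Two identical inputs give two identical rows, so the Gram matrix is singular.
x = np.array([0.0, 0.0, 1.0])
K = rbf_kernel(x)

try:
    np.linalg.cholesky(K)
    plain_ok = True
except np.linalg.LinAlgError:
    plain_ok = False  # the typical outcome for a singular matrix

# A small diagonal term makes the matrix positive definite again.
jitter = 1e-6
K_stable = K + jitter * np.eye(len(x))
L = np.linalg.cholesky(K_stable)
```

A larger jitter is more robust but perturbs the covariance more, which is a reason to expose the value instead of hard-coding it.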
Expected breaks
- New API was already available in v3.
- Old API had deprecation warnings since at least 3.11.0 (2021-01).
- Old API stops working in v4 (preferably with informative errors).
All of the above apply to:
- pm.sample(return_inferencedata=True) is now the default (see #4744).
- ArviZ plots and stats wrappers were removed. The functions are now just available by their original names (see #4549 and 3.11.2 release notes).
- pm.sample_posterior_predictive(vars=...) kwarg was removed in favor of var_names (see #4343).
- ElemwiseCategorical step method was removed (see #4701)
Ongoing deprecations
- Old API still works in v4 and has a deprecation warning.
- Preferably the new API should be available in v3 already
New features
- The length of dims in the model is now tracked symbolically through Model.dim_lengths (see #4625).
- The CAR distribution has been added to allow for use of conditional autoregressions which often are used in spatial and network models.
- The dimensionality of model variables can now be parametrized through either of shape, dims or size (see #4696):
  - With shape the length of dimensions must be given numerically or as scalar Aesara Variables. Numeric entries in shape restrict the model variable to the exact length and re-sizing is no longer possible.
  - dims keeps model variables re-sizeable (for example through pm.Data) and leads to well-defined coordinates in InferenceData objects.
  - The size kwarg behaves as it does in Aesara/NumPy. For univariate RVs it is the same as shape, but for multivariate RVs it depends on how the RV implements broadcasting to dimensionality greater than RVOp.ndim_supp.
  - An Ellipsis (...) in the last position of shape or dims can be used as shorthand notation for implied dimensions.
- Added a logcdf implementation for the Kumaraswamy distribution (see #4706).
- The OrderedMultinomial distribution has been added for use on ordinal data which are aggregated by trial, like multinomial observations, whereas OrderedLogistic only accepts ordinal data in a disaggregated format, like categorical observations (see #4773).
- The Polya-Gamma distribution has been added (see #4531). To make use of this distribution, the polyagamma>=1.3.1 library must be installed and available in the user's environment.
- A small change to the mass matrix tuning methods jitter+adapt_diag (the default) and adapt_diag improves performance early on during tuning for some models. #5004
- New experimental mass matrix tuning method jitter+adapt_diag_grad. [#5004](https://github.com/pymc-devs/pymc/pu...
PyMC3 3.11.4 (20 August 2021)
Update __init__.py Update RELEASE-NOTES.md Mark 3.11.3 release as broken per discussion
PyMC3 3.11.3 (19 August 2021)
Release PyMC3 v3.11.3 (#4941) * Release PyMC3 v3.11.3 * Update RELEASE-NOTES.md
PyMC3 3.11.2 (14 March 2021)
New Features
- pm.math.cartesian can now handle inputs that are themselves >1D (see #4482).
- Statistics and plotting functions that were removed in 3.11.0 were brought back, albeit with deprecation warnings if an old naming scheme is used (see #4536). In order to future proof your code, rename these function calls:
  - pm.traceplot → pm.plot_trace
  - pm.compareplot → pm.plot_compare (here you might need to rename some columns in the input according to the arviz.plot_compare documentation)
  - pm.autocorrplot → pm.plot_autocorr
  - pm.forestplot → pm.plot_forest
  - pm.kdeplot → pm.plot_kde
  - pm.energyplot → pm.plot_energy
  - pm.densityplot → pm.plot_density
  - pm.pairplot → pm.plot_pair
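The renames listed above are mechanical, so they can be applied with a small script. The following is a rough sketch only: the helper rename_plot_calls is hypothetical, it assumes the library is imported under the alias pm, and it does not handle the column renames that pm.plot_compare may additionally require.

```python
import re

# Old name -> new name, taken from the list of renamed plotting functions.
PLOT_RENAMES = {
    "traceplot": "plot_trace",
    "compareplot": "plot_compare",
    "autocorrplot": "plot_autocorr",
    "forestplot": "plot_forest",
    "kdeplot": "plot_kde",
    "energyplot": "plot_energy",
    "densityplot": "plot_density",
    "pairplot": "plot_pair",
}

# Match "pm.<old name>" as a whole word so longer identifiers are left alone.
_PATTERN = re.compile(r"\bpm\.(" + "|".join(PLOT_RENAMES) + r")\b")


def rename_plot_calls(source: str) -> str:
    """Rewrite pm.<old name> references to pm.<new name> in a source string."""
    return _PATTERN.sub(lambda m: "pm." + PLOT_RENAMES[m.group(1)], source)


migrated = rename_plot_calls("pm.traceplot(trace); pm.forestplot(trace)")
print(migrated)  # pm.plot_trace(trace); pm.plot_forest(trace)
```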
Maintenance
- ⚠ Our memoization mechanism wasn't robust against hash collisions (#4506), sometimes resulting in incorrect values in, for example, posterior predictives. The pymc3.memoize module was removed and replaced with cachetools. The hashable function and WithMemoization class were moved to pymc3.util (see #4525).
- pm.make_shared_replacements now retains broadcasting information which fixes issues with Metropolis samplers (see #4492).
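To see why a memoization cache can return wrong values under hash collisions, consider the toy sketch below. It is not the former pymc3.memoize code; it assumes a cache that stores results under hash(key) alone, and the class CollidingKey and both decorators are hypothetical names made up for the demonstration.

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42  # every instance collides on purpose

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.value == other.value


def square(key):
    return key.value ** 2


def memoize_by_hash(func):
    """Unsafe: the cache is keyed by the hash, so colliding keys share an entry."""
    cache = {}

    def wrapper(key):
        h = hash(key)
        if h not in cache:
            cache[h] = func(key)
        return cache[h]

    return wrapper


def memoize_by_key(func):
    """Safe: the dict is keyed by the object, so equality resolves collisions."""
    cache = {}

    def wrapper(key):
        if key not in cache:
            cache[key] = func(key)
        return cache[key]

    return wrapper


unsafe = memoize_by_hash(square)
safe = memoize_by_key(square)
unsafe_results = [unsafe(CollidingKey(2)), unsafe(CollidingKey(3))]  # [4, 4]
safe_results = [safe(CollidingKey(2)), safe(CollidingKey(3))]        # [4, 9]
```

The second lookup in the unsafe variant silently returns the result computed for a different argument, which is the kind of failure described above.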
Release manager for 3.11.2: Michael Osthege (@michaelosthege)
PyMC3 3.11.1 (12 February 2021)
New Features
- Automatic imputations now also work with ndarray data, not just pd.Series or pd.DataFrame (see #4439).
- pymc3.sampling_jax.sample_numpyro_nuts now returns samples from transformed random variables, rather than from the unconstrained representation (see #4427).
Maintenance
- We upgraded to Theano-PyMC v1.1.2 which includes bugfixes for...
  - ⚠ a problem with tt.switch that affected the behavior of several distributions, including at least the following special cases (see #4448)
    - Bernoulli when all the observed values were the same (e.g., [0, 0, 0, 0, 0]).
    - TruncatedNormal when sigma was constant and mu was being automatically broadcasted to match the shape of observations.
  - Warning floods and compiledir locking (see #4444)
- math.log1mexp_numpy no longer raises RuntimeWarning when given very small inputs. These were commonly observed during NUTS sampling (see #4428).
- ScalarSharedVariable can now be used as an input to other RVs directly (see #4445).
- pm.sample and pm.find_MAP no longer change the start argument (see #4458).
- Fixed Dirichlet.logp method to work with unit batch or event shapes (see #4454).
- Bugfix in logp and logcdf methods of Triangular distribution (see #4470).
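The log1mexp item concerns a numerically delicate function. The sketch below shows a common stable formulation using only the standard library; it is not the PyMC3 source, and it assumes the convention log(1 - exp(-x)) for x > 0.

```python
import math


def log1mexp(x: float) -> float:
    """Compute log(1 - exp(-x)) for x > 0 without catastrophic cancellation."""
    if x <= 0:
        raise ValueError("x must be positive")
    if x < math.log(2):
        # exp(-x) is close to 1 here, so expm1 preserves the small difference.
        return math.log(-math.expm1(-x))
    # exp(-x) is small here, so log1p keeps precision near zero.
    return math.log1p(-math.exp(-x))


tiny = log1mexp(1e-20)  # a naive 1 - exp(-1e-20) would round to 0.0 first
large = log1mexp(50.0)
```

Switching branches at log(2) is the usual choice because each branch is most accurate on its own side of that point.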
Release manager for 3.11.1: Michael Osthege (@michaelosthege)
PyMC3 3.11.0 (21 January 2021)
This release breaks some APIs w.r.t. 3.10.0. It also brings some dreadfully awaited fixes, so be sure to go through the (breaking) changes below.
Breaking Changes
- ⚠ Many plotting and diagnostic functions that were just aliasing ArviZ functions were removed (see 4397). This includes pm.summary, pm.traceplot, pm.ess and many more!
- ⚠ We now depend on Theano-PyMC version 1.1.0 exactly (see #4405). Major refactorings were done in Theano-PyMC 1.1.0. If you implement custom Ops or interact with Theano in any way yourself, make sure to read the Theano-PyMC 1.1.0 release notes.
- ⚠ Python 3.6 support was dropped (by no longer testing) and Python 3.9 was added (see #4332).
- ⚠ Changed shape behavior: No longer collapse length 1 vector shape into scalars. (see #4206 and #4214)
  - Applies to random variables and also the .random(size=...) kwarg!
  - To create scalar variables you must now use shape=None or shape=().
  - shape=(1,) and shape=1 now become vectors. Previously they were collapsed into scalars
  - 0-length dimensions are now ruled illegal for random variables and raise a ValueError.
- In sample_prior_predictive the vars kwarg was removed in favor of var_names (see #4327).
- Removed theanof.set_theano_config because it illegally changed Theano's internal state (see #4329).
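The distinction between a scalar and a length-1 vector in the shape change above mirrors NumPy's own conventions. The snippet below is a NumPy analogy only and does not call PyMC3; it shows which shape arguments produce zero-dimensional and one-dimensional arrays.

```python
import numpy as np

scalar = np.zeros(())        # shape=() gives a 0-dimensional (scalar) array
vector = np.zeros((1,))      # shape=(1,) gives a vector of length 1
also_vector = np.zeros(1)    # shape=1 is likewise a vector of length 1

print(scalar.ndim, vector.ndim, also_vector.ndim)  # 0 1 1
```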
New Features
- Option to set check_bounds=False when instantiating pymc3.Model(). This turns off bounds checks that ensure that input parameters of distributions are valid. For correctly specified models, this is unnecessary as all parameters get automatically transformed so that all values are valid. Turning this off should lead to faster sampling (see #4377).
- OrderedProbit distribution added (see #4232).
- plot_posterior_predictive_glm now works with arviz.InferenceData as well (see #4234)
- Add logcdf method to all univariate discrete distributions (see #4387).
- Add random method to MvGaussianRandomWalk (see #4388)
- AsymmetricLaplace distribution added (see #4392).
- DirichletMultinomial distribution added (see #4373).
- Added a new predict method to BART to compute out of sample predictions (see #4310).
Maintenance
- Fixed bug whereby partial traces returned after keyboard interrupt during parallel sampling had fewer draws than would've been available #4318
- Make sample_shape same across all contexts in draw_values (see #4305).
- The notebook gallery has been moved to https://github.com/pymc-devs/pymc-examples (see #4348).
- math.logsumexp now matches scipy.special.logsumexp when arrays contain infinite values (see #4360).
- Fixed mathematical formulation in MvStudentT random method. (see #4359)
- Fix issue in logp method of HyperGeometric. It now returns -inf for invalid parameters (see 4367)
- Fixed MatrixNormal random method to work with parameters as random variables. (see #4368)
- Update the logcdf method of several continuous distributions to return -inf for invalid parameters and values, and raise an informative error when multiple values cannot be evaluated in a single call. (see 4393 and #4421)
- Improve numerical stability in logp and logcdf methods of ExGaussian (see #4407)
- Issue UserWarning when doing prior or posterior predictive sampling with models containing Potential factors (see #4419)
- Dirichlet distribution's random method is now optimized and gives outputs in correct shape (see #4416)
- Attempting to sample a named model with SMC will now raise a NotImplementedError. (see #4365)
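The logsumexp item is easiest to understand with a small example. The following standard-library sketch is not the pymc3.math implementation; it only illustrates why infinite entries need a guard when the usual max-shift trick is used.

```python
import math


def logsumexp(values):
    """log(sum(exp(v))) via the max-shift trick, guarding infinite maxima."""
    m = max(values)
    if math.isinf(m):
        # Either every entry is -inf (result -inf) or some entry is +inf
        # (result +inf). Shifting by m would compute inf - inf = nan.
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


all_neg_inf = logsumexp([-math.inf, -math.inf])
mixed = logsumexp([0.0, -math.inf])
big = logsumexp([1000.0, 1000.0])  # a naive exp(1000.0) would overflow
```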
Release manager for 3.11.0: Eelke Spaak (@Spaak)
PyMC3 v3.10.0 (7 December 2020)
This is a major release with many exciting new features. The biggest change is that we now rely on our own fork of Theano-PyMC. This is in line with our big announcement about our commitment to PyMC3 and Theano.
When upgrading, make sure that Theano-PyMC, and not Theano, is installed (the imports remain unchanged, however). If Theano is installed, you can uninstall it:
conda remove theano
And to install:
conda install -c conda-forge theano-pymc
Or, if you are using pip (not recommended):
pip uninstall theano
And to install:
pip install theano-pymc
This new version of Theano-PyMC comes with an experimental JAX backend which, when combined with the new and experimental JAX samplers in PyMC3, can greatly speed up sampling in your model. As this is still very new, please do not use it in production yet but do test it out and let us know if anything breaks and what results you are seeing, especially speed-wise.
New features
- New experimental JAX samplers in pymc3.sample_jax (see notebook and #4247). Requires JAX and either TFP or numpyro.
- Add MLDA, a new stepper for multilevel sampling. MLDA can be used when a hierarchy of approximate posteriors of varying accuracy is available, offering improved sampling efficiency especially in high-dimensional problems and/or where gradients are not available (see #3926)
- Add Bayesian Additive Regression Trees (BARTs) (see #4183)
- Added pymc3.gp.cov.Circular kernel for Gaussian Processes on circular domains, e.g. the unit circle (see #4082).
- Added a new MixtureSameFamily distribution to handle mixtures of arbitrary dimensions in vectorized form for improved speed (see #4185).
- sample_posterior_predictive_w can now feed on xarray.Dataset - e.g. from InferenceData.posterior. (see #4042)
- Change SMC metropolis kernel to independent metropolis kernel (see #4115)
- Add alternative parametrization to NegativeBinomial distribution in terms of n and p (see #4126)
- Added semantically meaningful str representations to PyMC3 objects for console, notebook, and GraphViz use (see #4076, #4065, #4159, #4217, #4243, and #4260).
- Add Discrete HyperGeometric Distribution (see #4249)
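For the NegativeBinomial item, the two parametrizations describe the same distribution. The sketch below checks this numerically with the standard library, assuming the usual relationship mu = n * (1 - p) / p and alpha = n; the function names are hypothetical and nothing here calls PyMC3.

```python
import math


def mu_alpha_from_n_p(n: float, p: float):
    """Convert (n, p) to (mu, alpha), assuming mu = n * (1 - p) / p, alpha = n."""
    return n * (1.0 - p) / p, n


def logpmf_n_p(y: int, n: float, p: float) -> float:
    """Log-pmf in terms of the number of successes n and success probability p."""
    return (math.lgamma(y + n) - math.lgamma(n) - math.lgamma(y + 1)
            + n * math.log(p) + y * math.log1p(-p))


def logpmf_mu_alpha(y: int, mu: float, alpha: float) -> float:
    """Log-pmf in terms of the mean mu and the shape parameter alpha."""
    return (math.lgamma(y + alpha) - math.lgamma(alpha) - math.lgamma(y + 1)
            + alpha * math.log(alpha / (mu + alpha))
            + y * math.log(mu / (mu + alpha)))


mu, alpha = mu_alpha_from_n_p(n=5.0, p=0.25)
same = math.isclose(logpmf_n_p(3, 5.0, 0.25), logpmf_mu_alpha(3, mu, alpha))
```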
Maintenance
- Switch the dependency of Theano to our own fork, Theano-PyMC.
- Removed non-NDArray (Text, SQLite, HDF5) backends and associated tests.
- Use dill to serialize user defined logp functions in DensityDist. The previous serialization code fails if it is used in notebooks on Windows and Mac. dill is now a required dependency. (see #3844).
- Fixed numerical instability in ExGaussian's logp by preventing logpow from returning -inf (see #4050).
- Numerically improved stickbreaking transformation - e.g. for the Dirichlet distribution. #4129
- Enabled the Multinomial distribution to handle batch sizes that have more than 2 dimensions. #4169
- Test model logp before starting any MCMC chains (see #4211)
- Fix bug in model.check_test_point that caused the test_point argument to be ignored. (see PR #4211)
- Refactored MvNormal.random method with better handling of sample, batch and event shapes. #4207
- The InverseGamma distribution now implements a logcdf. #3944
- Make starting jitter methods for nuts sampling more robust by resampling values that lead to non-finite probabilities. A new optional argument jitter_max_retries can be passed to pm.sample() and pm.init_nuts() to control the maximum number of retries per chain. 4298
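The retry behaviour for jittered starting points can be pictured as a simple loop. This is a conceptual sketch and not the PyMC3 implementation: the function jittered_start, the uniform jitter in [-1, 1], and the toy log-density are all assumptions made for illustration.

```python
import math
import random


def jittered_start(logp, start, max_retries=10, rng=None):
    """Jitter a start point, resampling while the log-probability is non-finite."""
    rng = rng or random.Random(0)
    for _ in range(max_retries + 1):  # one initial attempt plus the retries
        candidate = [x + rng.uniform(-1.0, 1.0) for x in start]
        if math.isfinite(logp(candidate)):
            return candidate
    raise RuntimeError("no finite starting point found within the retry budget")


def log_density(point):
    """Toy log-density that is only defined for positive values."""
    x = point[0]
    return math.log(x) if x > 0 else -math.inf


point = jittered_start(log_density, start=[0.0])
```

Bounding the number of retries keeps a badly specified model from looping forever and turns the problem into an explicit error.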
Documentation
- Added a new notebook demonstrating how to incorporate sampling from a conjugate Dirichlet-multinomial posterior density in conjunction with other step methods (see #4199).
- Mentioned the way to do any random walk with theano.tensor.cumsum() in GaussianRandomWalk docstrings (see #4048).
Release manager for 3.10.0: Eelke Spaak (@Spaak)
PyMC3 v3.9.3 (August 11, 2020)
This release includes several fixes, including (but not limited to) the following:
- Fix keep_size argument in ArviZ data structures: #4006
- Pin Theano 1.0.5: #4032
- Comprehensively re-wrote radon modeling notebook using latest ArviZ features: #3963
NB: The docs/* folder is still removed from the tarball due to an upload size limit on PyPI.
PyMC3 v3.9.2 (24 June 2020)
Maintenance
- Warning added in GP module when input_dim is lower than the number of columns in X to compute the covariance function (see #3974).
- Pass the tune argument from sample when using advi+adapt_diag_grad (see issue #3965, fixed by #3979).
- Add simple test case for new coords and dims feature in pm.Model (see #3977).
- Require ArviZ >= 0.9.0 (see #3977).
NB: The docs/* folder is still removed from the tarball due to an upload size limit on PyPI.
PyMC3 v3.9.1 (16 June, 2020)
The v3.9.0 upload to PyPI didn't include a tarball, which is fixed in this release, although we had to temporarily remove the docs/* folder from the tarball due to a PyPI size limit.