Lab 10: MCMC Inference#
For the class on Monday, February 12th
Tip
You may find this tutorial of emcee helpful!
Tip
If you don’t have emcee and corner installed, you’ll want to install them first.
Run the following in a terminal after you activate the environment you use for this course.
(If running in a notebook, add !
at the very beginning of the line.)
conda install -y emcee corner
# Run this cell to check if emcee and corner are installed
import emcee, corner
A. Comparing Samples, with emcee#
A1. The posterior function#
With the prior and likelihood functions you obtained from Lab 9, Part A,
populate the functions log_prior_a
, log_prior_b
and log_likelihood
below.
# Include your implementation for Part A1 here
import numpy as np
def log_prior_a(a):
if a <= 0:
return -np.inf
# TODO: implement this function
return
def log_prior_b(b):
# TODO: implement this function
return
def log_likelihood(a, b, nx, x_sum, ny, y_sum):
if a - b <= 0 or a + b <= 0:
return -np.inf
# TODO: impelement this function
return
def log_posterior(parameters, *args):
a, b = parameters
return log_likelihood(a, b, *args) + log_prior_a(a) + log_prior_b(b)
A2. Run MCMC with emcee!#
The following cell runs the MCMC with emcee using 40 walkers and 5,000 steps per walker. The initial guesses are set to a narrow gaussian distribution around the MLE estimator.
Uncomment the last line of the cell below and run it!
# For A2. Uncomment the last line run this cell
import emcee
#########
# do not change the following two lines
data_x = np.random.default_rng(123).exponential(scale=1/0.51, size=100)
data_y = np.random.default_rng(753).exponential(scale=1/0.49, size=100)
#########
nwalkers = 40
nsteps = 5000
ndim = 2 # we have two parameters, a and b
args = [len(data_x), np.sum(data_x), len(data_y), np.sum(data_y)] # nx, x_sum, ny, y_sum
# generate initial guess
a0 = (1/np.mean(data_x) + 1/np.mean(data_y)) * 0.5
b0 = 1/np.mean(data_x) - 1/np.mean(data_y)
p0_center = np.array([a0, b0])
p0 = np.random.default_rng().multivariate_normal(p0_center, np.diag((p0_center * 0.1) ** 2), size=int(nwalkers*2))
# remove any invalid initial guesses
p0 = p0[(p0[:,0] + p0[:,1] > 0) & (p0[:,0] - p0[:,1] > 0)]
p0 = p0[:nwalkers]
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior, args=args)
#sampler.run_mcmc(p0, nsteps, progress=True);
A3. Inspect your MCMC run#
Do the following inspection for your MCMC run.
Check the acceptance rate for each walker (use
sampler.acceptance_fraction
).Check the autocorrelation time for each parameter (use
sampler.get_autocorr_time()
).Plot one chain for each parameter (Use
sampler.get_chain()
)
# Include your implementation for Part A3, Q1 here
# Include your implementation for Part A3, Q2 here
# Include your implementation for Part A3, Q3 here
# Uncomment the following two lines to get started
#walker_index = 21
#chain_a, chain_b = sampler.get_chain()[:,walker_index,:].T
Question for A3#
Briefly discuss what you observed above. Do the results from these checks make sense to you? Do you think the MCMC was done correctly?
// Write your answers to Part A3 here
A4. Make a corner plot for the two parameters (a, b)#
# Include your implementation for Part A4 here
Question for A4#
Briefly discuss what you observed. What conclusion would you draw from the distribution of a, b?
// Write your answers to Part A4 here
A5. Repeat the analysis with uniform priors#
We will do a MCMC run with a uniform prior instead. Repeat the analysis from A3 and A4 and see how the results change.
# For A5 setup: Uncomment the last line run this cell
def log_posterior_uniform_priors(parameters, *args):
a, b = parameters
return log_likelihood(a, b, *args)
sampler_uniform_priors = emcee.EnsembleSampler(nwalkers, ndim, log_posterior_uniform_priors, args=args)
#sampler_uniform_priors.run_mcmc(p0, nsteps, progress=True);
# For A5: Repeat the steps from A3 and A4
Question for A5#
Briefly discuss what you observed. How does the new prior change the result?
// Write your answers to Part A5 here
A6. Repeat the analysis with different initial guesses#
We will do another MCMC run with initial guesses centered around \((a=2, b=-0.3)\) instead. Repeat the analysis from A3 and A4 and see how the results change.
# For A6 setup: Uncomment the last line run this cell
# generate initial guess
a0 = 2
b0 = -0.3
p0 = np.random.default_rng().multivariate_normal([a0, b0], [[(a0*0.1)**2, 0], [0, (b0*0.1)**2]], size=nwalkers*2)
# remove any invalid initial guesses
p0 = p0[(p0[:,0] + p0[:,1] > 0) & (p0[:,0] - p0[:,1] > 0)]
p0 = p0[:nwalkers]
sampler_new_initial_guesses = emcee.EnsembleSampler(nwalkers, ndim, log_posterior, args=args)
#sampler_new_initial_guesses.run_mcmc(p0, nsteps, progress=True);
# For A6: Repeat the steps from A3 and A4
Question for A6#
Briefly discuss what you observed. How does the new initial guesses change the result?
// Write your answers to Part A6 here
Final question#
Roughly how much time did you spend on this lab (not including the time you spent in class)?
// Write your answers here
Tip
How to submit this notebook on Canvas?
Once you complete this lab and have all your code, text answers, and all the desired results after running the notebook properly displayed in this notebook, please convert the notebook into HTML format by running the following:
jupyter nbconvert --to html /path/to/labs/10.ipynb
Then, upload the resulting HTML file to Canvas for the corresponding assignment.