Statistics and Data Analysis

Description

This three-day module will run on 14, 16 and 18 June 2021 and will be delivered online.

Each day will comprise introductory lectures in the morning (10:00-11:00 and 11:30-12:30) and practical exercises in the afternoon (13:30-17:30).

Day 1: Introduction to the methods of maximum likelihood and least squares. Bayes' theorem, priors and a posteriori probabilities. Goodness of fit: chi-squared test, likelihood ratio, Bayesian evidence.
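As a flavour of the Day 1 material, the following is a minimal sketch of a weighted least-squares fit of a straight line, returning the best-fit parameters, their covariance matrix and the chi-squared of the fit. It is not taken from the course material: the function name `fit_line` and the toy data are illustrative assumptions, and only the Python standard library is used.

```python
import random

def fit_line(x, y, sigma):
    """Weighted least-squares fit of y = a + b*x (hypothetical helper).

    Returns (a, b, cov, chi2), where cov is the 2x2 covariance matrix
    of (a, b) and chi2 is the chi-squared at the best fit.
    """
    w = [1.0 / s**2 for s in sigma]                      # inverse-variance weights
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = S * Sxx - Sx**2
    a = (Sxx * Sy - Sx * Sxy) / delta                    # intercept
    b = (S * Sxy - Sx * Sy) / delta                      # slope
    cov = [[Sxx / delta, -Sx / delta],
           [-Sx / delta, S / delta]]
    chi2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return a, b, cov, chi2

# Toy data (assumed values): true line y = 1 + 2x with Gaussian noise of width 0.5.
rng = random.Random(1)
x = [float(i) for i in range(20)]
sigma = [0.5] * len(x)
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]

a, b, cov, chi2 = fit_line(x, y, sigma)
ndof = len(x) - 2                                        # data points minus fitted parameters
print(f"a = {a:.3f} +/- {cov[0][0] ** 0.5:.3f}")
print(f"b = {b:.3f} +/- {cov[1][1] ** 0.5:.3f}")
print(f"chi2 / ndof = {chi2:.1f} / {ndof}")
```

For Gaussian errors of known size, this least-squares solution coincides with the maximum-likelihood estimate, and a chi-squared close to the number of degrees of freedom indicates an acceptable fit.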

Day 2: Monte Carlo methods: simulations in particle physics and astronomy. Markov Chain Monte Carlo (MCMC): exploring a multi-dimensional parameter space using emcee.
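To illustrate the idea of using Monte Carlo simulation to test analysis code, the sketch below generates many toy experiments and checks that the scatter of the estimated mean matches the expected error, sigma/sqrt(n). The function names and all numerical values are illustrative assumptions, not course material; it uses plain random sampling from the Python standard library rather than emcee.

```python
import math
import random
import statistics

def simulate_experiment(rng, mu, sigma, n):
    """Simulate one toy experiment: n Gaussian measurements of a quantity mu."""
    return [rng.gauss(mu, sigma) for _ in range(n)]

def run_toys(n_toys=2000, mu=5.0, sigma=2.0, n=25, seed=42):
    """Repeat the toy experiment n_toys times and analyse each one.

    The 'analysis' here is simply the sample mean. Returns the mean and
    the standard deviation of the estimates over all toy experiments.
    """
    rng = random.Random(seed)
    estimates = [
        statistics.fmean(simulate_experiment(rng, mu, sigma, n))
        for _ in range(n_toys)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)

mean_est, spread = run_toys()
expected_error = 2.0 / math.sqrt(25)   # sigma / sqrt(n) for the default settings

print(f"mean of estimates   = {mean_est:.3f} (true value 5.0)")
print(f"spread of estimates = {spread:.3f} (expected {expected_error:.3f})")
```

If the analysis were biased, or its quoted uncertainty wrong, the mean or the spread of the toy estimates would disagree with the values put into the simulation.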

Day 3: Model fitting: dealing with outliers, errors on the 'independent' variable, and intrinsic scatter in the fitted model. Bayesian hierarchical models: introduction of latent variables to parametrize unknowns in the problem.

Aim

To acquire the skills needed for analysis of experimental data and model fitting.

 

Objectives

At the end of this course, a successful student will be able to:

  1. Fit models to data using maximum likelihood and least squares, incorporating known priors on the model parameters;

  2. Assess the goodness of fit of a model, and obtain the covariance matrix of the fitted parameters;

  3. Simulate (parts of) an experiment or model in order to test analysis code;

  4. Fit Bayesian hierarchical models to data, allowing marginalisation over unknown nuisance parameters.

Prerequisites / Linked Modules

Students should be familiar with basic probability and be reasonably competent in Python scripting.

 

It is recommended that students have the following software installed on their laptops: the Anaconda Python distribution (https://www.anaconda.com/download/) and emcee, an affine-invariant MCMC code (http://dfm.io/emcee/current/).