A Student’s Guide to Bayesian Statistics

  • Ben Lambert - Imperial College London (London, United Kingdom)

April 2018 | 520 pages | SAGE Publications Ltd

Supported by a wealth of learning features, exercises, and visual elements as well as online video tutorials and interactive simulations, this book is the first student-focused introduction to Bayesian statistics.

Without sacrificing technical integrity for the sake of simplicity, the author uses accessible, student-friendly language to provide approachable instruction aimed at newcomers to statistics and Bayesian methods. Through a logical structure that introduces and builds upon key concepts gradually and slowly acclimatizes students to using R and Stan software, the book covers:

  • An introduction to probability and Bayesian inference
  • Understanding Bayes' rule 
  • Nuts and bolts of Bayesian analytic methods
  • Computational Bayes and real-world Bayesian analysis
  • Regression analysis and hierarchical methods

This unique guide will help students develop the statistical confidence and skills to put the Bayesian formula into practice, from the basic concepts of statistical inference to complex applications of analyses.

Chapter 1: How to best use this book
The purpose of this book

Who is this book for?


Book outline

Route planner - suggested journeys through Bayesland


Problem sets


R and Stan

Why don’t more people use Bayesian statistics?

What are the tangible (non-academic) benefits of Bayesian statistics?

Part I: An introduction to Bayesian inference
Chapter 2: The subjective worlds of Frequentist and Bayesian statistics
Bayes’ rule - allowing us to go from the effect back to its cause

The purpose of statistical inference

The world according to Frequentists

The world according to Bayesians

Do parameters actually exist and have a point value?

Frequentist and Bayesian inference

Bayesian inference via Bayes’ rule

Implicit versus Explicit subjectivity

Chapter 3: Probability - the nuts and bolts of Bayesian inference
Probability distributions: helping us explicitly state our ignorance


Central Limit Theorems

A derivation of Bayes’ rule

The Bayesian inference process from the Bayesian formula

Part II: Understanding the Bayesian formula
Chapter 4: Likelihoods
What is a likelihood?

Why use ‘likelihood’ rather than ‘probability’?

What are models and why do we need them?

How to choose an appropriate likelihood?

Exchangeability vs random sampling

Maximum likelihood - a short introduction

Chapter 5: Priors
What are priors, and what do they represent?

The explicit subjectivity of priors

Combining a prior and likelihood to form a posterior

Constructing priors

A strong model is less sensitive to prior choice

Chapter 6: The devil’s in the denominator
An introduction to the denominator

The difficulty with the denominator

How to dispense with the difficulty: Bayesian computation

Chapter 7: The posterior - the goal of Bayesian inference
Expressing parameter uncertainty in posteriors

Bayesian statistics: updating our pre-data uncertainty

The intuition behind Bayes’ rule for inference

Point parameter estimates

Intervals of uncertainty

From posterior to predictions by sampling

Part III: Analytic Bayesian methods
Chapter 8: An introduction to distributions for the mathematically-un-inclined
The interrelation among distributions

Sampling distributions for likelihoods

Prior distributions

How to choose a likelihood

Table of common likelihoods, their uses, and reasonable priors

Distributions of distributions, and mixtures - link to website, and relevance

Chapter 9: Conjugate priors and their place in Bayesian analysis
What is a conjugate prior and why are they useful?

Gamma-Poisson example

Normal example: giraffe height

Table of conjugate priors

The lessons and limits of a conjugate analysis

Chapter 10: Evaluation of model fit and hypothesis testing
Posterior predictive checks

Why do we call it a p value?

Statistics measuring predictive accuracy: AIC, Deviance, WAIC and LOO-CV

Marginal likelihoods and Bayes factors

Choosing one model, or a number?

Sensitivity analysis

Chapter 11: Making Bayesian analysis objective?
The illusion of the ‘uninformative’ uniform prior

Jeffreys’ priors

Reference priors

Empirical Bayes

A move towards weakly informative priors

Part IV: A practical guide to doing real life Bayesian analysis: Computational Bayes
Chapter 12: Leaving conjugates behind: Markov Chain Monte Carlo
The difficulty with real life Bayesian inference

Discrete approximation to continuous posteriors

The posterior through quadrature

Integrating using independent samples: an introduction to Monte Carlo

Why is independent sampling easier said than done?

Ideal sampling from a posterior using only the un-normalised posterior

Moving from independent to dependent sampling

What’s the catch with dependent samplers?

Chapter 13: Random Walk Metropolis
Sustainable fishing

Prospecting for gold

Defining the Metropolis algorithm

When does Metropolis work?

Efficiency of convergence: the importance of choosing the right proposal scale


Judging convergence

Effective sample size revisited

Chapter 14: Gibbs sampling
Back to prospecting for gold

Defining the Gibbs algorithm

Gibbs’ earth: the intuition behind the Gibbs algorithm

The benefits and problems with Gibbs and Random Walk Metropolis

A change of parameters to speed up exploration

Chapter 15: Hamiltonian Monte Carlo
Hamiltonian Monte Carlo as a sledge

NLP space

Solving for the sledge motion over NLP space

How to shove the sledge

The acceptance probability of HMC

The complete Hamiltonian Monte Carlo algorithm

The performance of HMC versus Random Walk Metropolis and Gibbs

Optimal step length of HMC: introducing the “No U-Turn Sampler”

Chapter 16: Stan
Why Stan, and how to get it

Getting set up with Stan using RStan

Our first words in Stan

Essential Stan reading

What to do when things go wrong

How to get further help

Part V: Hierarchical models and regression
Chapter 17: Hierarchical models
The spectrum from fully-pooled to heterogeneous

Non-centered parameterisations in hierarchical models

Case study: Forecasting the EU referendum result

The importance of fake data simulation for complex models

Chapter 18: Linear regression models
Example: high school test scores in England

Pooled model


Heterogeneous coefficient model

Hierarchical model

Incorporating LEA-level data

Chapter 19: Generalised linear models and other animals
Example: electoral participation in European countries

Discrete parameter models in Stan



Very essential to my lectures.

Mrs Catherine Otene
Faculty of Engineering & Science, Greenwich University
June 8, 2018

There aren't many students doing Bayesian statistical analysis in their dissertations this year, so we don't offer such a course unit. This book is really helpful supplementary material for the students.

School of Planning and Landscape, Manchester University
September 9, 2019

A very useful reference with good examples, well-structured and progressive.

Professor Colin McCulloch
International Finance and Management, Pyongyang University of Science And Technology
November 4, 2018

Clear and useful guide

Dr Martin Kunc
Warwick Business School, Warwick University
March 27, 2018