# A Student’s Guide to Bayesian Statistics

- Ben Lambert - Imperial College London (London, United Kingdom)

Supported by a wealth of learning features, exercises, and visual elements as well as online video tutorials and interactive simulations, this book is the first student-focused introduction to Bayesian statistics.

Without sacrificing technical integrity for the sake of simplicity, the author draws upon accessible, student-friendly language to provide approachable instruction perfectly aimed at statistics and Bayesian newcomers. Through a logical structure that introduces and builds upon key concepts in a gradual way and slowly acclimatizes students to using R and Stan software, the book covers:

- An introduction to probability and Bayesian inference
- Understanding Bayes' rule
- Nuts and bolts of Bayesian analytic methods
- Computational Bayes and real-world Bayesian analysis
- Regression analysis and hierarchical methods

This unique guide will help students develop the statistical confidence and skills to put the Bayesian formula into practice, from the basic concepts of statistical inference to complex applications of analyses.

- The purpose of this book
- Who is this book for?
- Pre-requisites
- Book outline
- Route planner - suggested journeys through Bayesland
- Video
- Problem sets
- Code
- R and Stan
- Why don’t more people use Bayesian statistics?
- What are the tangible (non-academic) benefits of Bayesian statistics?
- Bayes’ rule - allowing us to go from the effect back to its cause
- The purpose of statistical inference
- The world according to Frequentists
- The world according to Bayesians
- Do parameters actually exist and have a point value?
- Frequentist and Bayesian inference
- Bayesian inference via Bayes’ rule
- Implicit versus Explicit subjectivity
- Probability distributions: helping us explicitly state our ignorance
- Independence
- Central Limit Theorems
- A derivation of Bayes’ rule
- The Bayesian inference process from the Bayesian formula
- What is a likelihood?
- Why use ‘likelihood’ rather than ‘probability’?
- What are models and why do we need them?
- How to choose an appropriate likelihood?
- Exchangeability vs random sampling
- Maximum likelihood - a short introduction
- What are priors, and what do they represent?
- The explicit subjectivity of priors
- Combining a prior and likelihood to form a posterior
- Constructing priors
- A strong model is less sensitive to prior choice
- An introduction to the denominator
- The difficulty with the denominator
- How to dispense with the difficulty: Bayesian computation
- Expressing parameter uncertainty in posteriors
- Bayesian statistics: updating our pre-data uncertainty
- The intuition behind Bayes’ rule for inference
- Point parameter estimates
- Intervals of uncertainty
- From posterior to predictions by sampling
- The interrelation among distributions
- Sampling distributions for likelihoods
- Prior distributions
- How to choose a likelihood
- Table of common likelihoods, their uses, and reasonable priors
- Distributions of distributions, and mixtures - link to website, and relevance
- What is a conjugate prior and why are they useful?
- Gamma-Poisson example
- Normal example: giraffe height
- Table of conjugate priors
- The lessons and limits of a conjugate analysis
- Posterior predictive checks
- Why do we call it a p value?
- Statistics measuring predictive accuracy: AIC, Deviance, WAIC and LOO-CV
- Marginal likelihoods and Bayes factors
- Choosing one model, or a number?
- Sensitivity analysis
- The illusion of the ‘uninformative’ uniform prior
- Jeffreys’ priors
- Reference priors
- Empirical Bayes
- A move towards weakly informative priors
- The difficulty with real life Bayesian inference
- Discrete approximation to continuous posteriors
- The posterior through quadrature
- Integrating using independent samples: an introduction to Monte Carlo
- Why is independent sampling easier said than done?
- Ideal sampling from a posterior using only the un-normalised posterior
- Moving from independent to dependent sampling
- What’s the catch with dependent samplers?
- Sustainable fishing
- Prospecting for gold
- Defining the Metropolis algorithm
- When does Metropolis work?
- Efficiency of convergence: the importance of choosing the right proposal scale
- Metropolis-Hastings
- Judging convergence
- Effective sample size revisited
- Back to prospecting for gold
- Defining the Gibbs algorithm
- Gibbs’ earth: the intuition behind the Gibbs algorithm
- The benefits and problems with Gibbs and Random Walk Metropolis
- A change of parameters to speed up exploration
- Hamiltonian Monte Carlo as a sledge
- NLP space
- Solving for the sledge motion over NLP space
- How to shove the sledge
- The acceptance probability of HMC
- The complete Hamiltonian Monte Carlo algorithm
- The performance of HMC versus Random Walk Metropolis and Gibbs
- Optimal step length of HMC: introducing the “No U-Turn Sampler”
- Why Stan, and how to get it
- Getting set up with Stan using RStan
- Our first words in Stan
- Essential Stan reading
- What to do when things go wrong
- How to get further help
- The spectrum from fully-pooled to heterogeneous
- Non-centered parameterisations in hierarchical models
- Case study: Forecasting the EU referendum result
- The importance of fake data simulation for complex models
- Example: high school test scores in England
- Pooled model
- Interactions
- Heterogeneous coefficient model
- Hierarchical model
- Incorporating LEA-level data
- Example: electoral participation in European countries
- Discrete parameter models in Stan

### Reviews

An excellent resource on Bayesian analysis accessible to students from a diverse range of statistical backgrounds and interests. Easy to follow with well documented examples to illustrate key concepts.

**College of Business and Economics, Australian National University**

When I was a grad student, Bayesian statistics was restricted to those with the mathematical fortitude to plough through source literature. Thanks to Lambert, we now have something we can give to the modern generation of nascent data scientists as a first course. Love the supporting videos, too!

**Information Technology, Monash University**

Written in highly accessible language, this book is the gateway for students to gain a deep understanding of the logic of Bayesian analysis and to apply that logic with numerous carefully selected hands-on examples. Lambert moves seamlessly from a traditional Bayesian approach (using analytic methods) that serves to solidify fundamental concepts, to a modern Bayesian approach (using computational sampling methods) that endows students with the powerful and practical powers of application. I would recommend this book and its accompanying materials to any students or researchers who wish to learn and actually *do* Bayesian modeling.

**Psychology, Rice University**

A balanced combination of theory, application and implementation of Bayesian statistics in a not very technical language. A tangible introduction to intangible concepts of Bayesian statistics for beginners.

**Senior Lecturer in Statistics, School of Science & Technology, Nottingham Trent University**

The late, famous statistician Jimmie Savage would have taken great pleasure in this book based on his work in the 1960s on Bayesian statistics. He would have marveled at the presentations in the book of many new and strong statistical and computer analyses.

**Professor Emeritus of Statistics, Swarthmore College**