Introduction to Bayesian Regression

Presenter(s): Oliver Perra



This resource introduces the key concepts of linear regression models estimated using a Bayesian approach. The models are illustrated through practical examples and exercises in R.

Bayesian statistics is an approach to data analysis and inference that takes its name from Bayes’ theorem, which provides a formal method for combining prior information (i.e. information gathered before collecting new data) with evidence from observed data. One way to highlight the characteristics of Bayesian statistics is to contrast it with the conventional statistical approach, which is often dubbed frequentist.

 

The key difference between these approaches lies in the way they characterise population parameters. The frequentist approach assumes parameters (e.g. population means) are unknown but fixed quantities. This assumption leads to a counterfactual approach embodied in null hypothesis testing: a sample mean is compared to a counterfactual scenario where the population mean is assumed to have a specific value. The sampling distribution of the mean under the null hypothesis can be derived by imagining the same study being repeated many times, each time drawing a new sample. The standard approach then tests whether the observed sample mean is so extreme that it would be unlikely to occur in the null hypothesis scenario. The result (the p-value) does not give the probability of any value of the parameter of interest; it gives the probability of obtaining a sample statistic as extreme as the observed one, or more extreme, if the null hypothesis were true. Null hypothesis testing has been criticised for leading to publication bias, and emphasis has now shifted to reporting estimates of uncertainty such as confidence intervals. However, the interpretation of confidence intervals relies on the same hypothetical repeated sampling: a 95% confidence interval contains the parameter values that would not be rejected at the 5% significance level given the observed data, and it is the interval-building procedure, not any single interval, that captures the fixed true value in 95% of repetitions.
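The repeated-study logic described above can be made concrete with a small simulation. The sketch below is an independent illustration in Python (the resource's own scripts are in R), and everything in it is an assumption made for the example: the data are made up, the function name `simulated_p_value` is hypothetical, and the sample standard deviation is treated as the known population value for simplicity.

```python
import random
import statistics


def simulated_p_value(sample, null_mean, n_sims=5000, seed=1):
    """Approximate a two-sided p-value by simulating repeated studies under H0.

    Simplifying assumption: data are normal and the sample SD is treated
    as the true population SD.
    """
    rng = random.Random(seed)
    n = len(sample)
    sd = statistics.stdev(sample)
    # How far the observed sample mean lies from the value assumed under H0
    observed_gap = abs(statistics.fmean(sample) - null_mean)

    extreme = 0
    for _ in range(n_sims):
        # One imagined repetition of the study in the null-hypothesis scenario
        replicate = [rng.gauss(null_mean, sd) for _ in range(n)]
        if abs(statistics.fmean(replicate) - null_mean) >= observed_gap:
            extreme += 1
    # Share of imagined studies at least as extreme as the observed one
    return extreme / n_sims


sample = [3.1, 3.6, 2.9, 3.8, 3.4, 3.7, 3.3, 3.9]  # made-up measurements
p = simulated_p_value(sample, null_mean=3.0)
print(f"Simulated two-sided p-value: {p:.4f}")
```

Note that the returned value is a statement about hypothetical data given a fixed parameter value, not a probability statement about the parameter itself.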



Bayesian analysis assumes that parameters are uncertain and can be described by a probability distribution. Thus, this approach can answer questions about the probability of a population parameter taking a specific value or lying within a range of values of interest. The assumption that parameters have a probability distribution has another important consequence: the modelled parameter distribution can be updated as new data are collected. Indeed, the fundamental principle of Bayesian analysis is that a priori assumptions about the distribution of a parameter are updated in the light of new data. Formal processes that derive from Bayes’ theorem allow us to estimate the posterior distributions of parameters: these rank the plausibility of all combinations of parameter values, conditional on the data collected and the model. This approach can therefore be applied to regression analyses, where it provides point estimates of parameters, their plausible values, and the uncertainty around these estimates.
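A minimal sketch of this updating principle, written in Python rather than the R used in the resource's scripts, is given below. It uses a grid approximation for a single proportion; the scenario, the counts, and the function name `grid_posterior` are hypothetical choices made only for illustration. The posterior from a first batch of data serves as the prior for a second batch, and the result matches a single update on all the data combined.

```python
def grid_posterior(prior, grid, successes, failures):
    """Update a discretised prior with binomial data using Bayes' theorem."""
    # Prior times likelihood at each candidate parameter value; the binomial
    # coefficient is omitted because it cancels when normalising.
    unnormalised = [
        p * (theta ** successes) * ((1 - theta) ** failures)
        for p, theta in zip(prior, grid)
    ]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Candidate values for the unknown proportion, from 0 to 1
grid = [i / 200 for i in range(201)]
flat_prior = [1 / len(grid)] * len(grid)  # every value equally plausible a priori

# Hypothetical data arriving in two batches
post_first = grid_posterior(flat_prior, grid, successes=6, failures=4)
post_second = grid_posterior(post_first, grid, successes=9, failures=1)

# Updating once with all the data gives the same posterior
post_all = grid_posterior(flat_prior, grid, successes=15, failures=5)

# A direct probability statement about the parameter
prob_above_half = sum(p for p, theta in zip(post_second, grid) if theta > 0.5)
print(f"Posterior probability that the proportion exceeds 0.5: {prob_above_half:.3f}")
```

The last line is the kind of statement the frequentist framework cannot make: a probability that the parameter lies in a range of interest, conditional on the data and the model.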


Introduction to Bayesian Approach

This presentation will provide an informal description of the Bayesian approach using a practical example in R. I will introduce the key elements of the Bayesian approach and highlight how it differs from a conventional frequentist approach.

> Download script examples (ZIP folder).



   Download transcript    |   Download slides

Specify a Linear Regression Model

In this presentation I will formally specify a linear regression model using a practical example of data analysis in R. I will emphasise the process of defining the linear model and prior assumptions for the distribution of parameters. 

> Download script examples and ‘data_birthweight.csv’ (ZIP folder).



   Download transcript    |   Download slides

Run the Model and Interpret the Results

In this presentation I will show how to run the model specified in Video 2 using a Hamiltonian Monte Carlo algorithm. I will emphasise methods to check the convergence and stability of the algorithm's solutions. I will then show how to describe the results and draw further inferences from them, using tabular and graphical tools, through examples in R.

> Download script examples and ‘data_birthweight.csv’ (ZIP folder).



   Download transcript    |   Download slides

 

> Download exercises (and solution) files (ZIP folder).





About the author

Dr Oliver Perra is a lecturer at the School of Nursing and Midwifery, Queen’s University Belfast. His research revolves around the early experiences that explain differences in children’s adaptation and socio-cognitive abilities. He explores these issues by applying a transactional approach, which makes it possible to investigate how interactions between children's characteristics and modifiable environmental factors can affect children's developmental pathways.



