Workshop - , 08-11-2023

Using Sankey plots to visualise complex pathway data in public health and social research


Instructor Biography: Hello, my name is Mary Abed Al Ahad and I am a fourth-year PhD student in Geography at the University of St Andrews, Scotland, United Kingdom (UK). During my PhD, I have had the opportunity to design, manage and coordinate a research project investigating the effect of air pollution on self-reported health, wellbeing, mortality and hospital admissions by ethnicity in the UK. I have linked, cleaned and analysed large cross-sectional and longitudinal datasets, and I have applied multiple approaches and tests, including event-history survival analysis, multilevel mixed-effects modelling, and longitudinal analysis, using STATA and R/RStudio. Alongside my PhD studies, I am also working as a Research Fellow on the MigrantLife Project at the University of St Andrews. In this project, I am researching the housing tenure and housing type of immigrants and their children in Sweden. I am using a large administrative dataset covering more than 15 million individuals in Sweden, and I am applying complex data cleaning, management and analysis methods, including life course and event-history quantitative analysis. Prior to this, I worked as a research assistant at the School of Geography (University of St Andrews) on the 'HATUA' project: 'Holistic Approach to Unravel Antibacterial resistance in East Africa'. I was involved in data cleaning and management, quantitative data analysis, conducting literature reviews, and writing reports, academic articles, and conference abstracts. One of the abstracts that I led was accepted into the competitive 'American Society of Tropical Medicine and Hygiene (ASTMH)' conference. I have also taught numerous modules and workshops in statistical analysis, geographic information system (GIS) software and sustainable development.
Prior to my PhD, I worked for one year as a research assistant in nursing, public health and epidemiology at the American University of Beirut, where I gained valuable skills including writing proposals, applying for research grants, writing literature reviews, designing surveys, data collection, data entry and cleaning, quantitative data analysis using Excel, STATA, and RStudio, and disseminating results and writing reports. I have a Master's in Environmental Health from the American University of Beirut and a Bachelor of Science in Biology. Alongside my Master's studies, I worked part time as a project assistant at the international NGO SPARK on a project that aims to help Syrian refugees gain access to higher education through merit scholarships and leadership/entrepreneurship development.

Session Information: I will be giving a session on data visualization using Sankey plots. A Sankey plot is a visualization used to depict a flow from one set of values to another. The session will begin with an explanation of Sankey plots, their structure and their applications, using examples from public health and social research. This will be followed by a hands-on exercise in RStudio to construct Sankey plots and answer example research questions. Specifically, by the end of the workshop, participants will be able to:
1. Understand the usage of Sankey plots in different research domains (e.g. public health, social sciences)
2. Prepare the data format needed to construct a Sankey plot using RStudio
3. Construct a Sankey plot using RStudio
4. Interpret the constructed Sankey plot
5. Apply what was learnt in the workshop to their research projects
6. Discuss the advantages and limitations of Sankey plots.
The session will take place online. To save time, participants should install R and RStudio (both free) on their computers before the workshop. Knowing how to use RStudio is not a requirement. The session instructor will provide the needed code and explain everything, assuming that participants have no prior knowledge of R.


A Sankey diagram is a visualization used to depict a flow from one set of values to another. This two-hour workshop will provide an introduction to constructing Sankey plots to visualize and analyze complex patterns and pathways in public health and social data. In the first part of the workshop, the concept of Sankey plots and how they are used in public health and social research will be explained. This will be followed by a demonstration of how to generate Sankey plots in RStudio. In the second part of the workshop, participants will be divided into groups, and each group will be given a research question and synthetic data to use in constructing a Sankey plot that answers the research question. This will be followed by a general discussion of the benefits and drawbacks of the Sankey plot method.
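
To give a flavour of the data preparation step, here is a minimal sketch in R. It is not the workshop's own code: the data are made up for illustration (six hypothetical people with self-rated health at two survey waves), and the choice of the networkD3 package for drawing is an assumption, since the workshop materials may use a different package. The sketch turns one-row-per-person pathway data into the "links" table (source, target, value) and "nodes" table that a Sankey plot is typically built from.

```r
# Made-up synthetic data (illustration only): one row per person, with
# self-rated health recorded at two hypothetical survey waves.
pathways <- data.frame(
  wave1 = c("Good", "Good", "Good", "Fair", "Fair", "Poor"),
  wave2 = c("Good", "Good", "Fair", "Fair", "Poor", "Poor"),
  stringsAsFactors = FALSE
)

# Label each state with its wave so that "Good" at wave 1 and "Good" at
# wave 2 become separate nodes in the diagram.
pathways$source_name <- paste(pathways$wave1, "(wave 1)")
pathways$target_name <- paste(pathways$wave2, "(wave 2)")

# Links table: count how many people follow each source -> target flow.
links <- aggregate(
  list(value = rep(1, nrow(pathways))),
  by = list(source_name = pathways$source_name,
            target_name = pathways$target_name),
  FUN = sum
)

# Nodes table: one row per distinct state.
nodes <- data.frame(
  name = unique(c(links$source_name, links$target_name)),
  stringsAsFactors = FALSE
)

# networkD3 expects zero-based node positions in the links table.
links$source <- match(links$source_name, nodes$name) - 1
links$target <- match(links$target_name, nodes$name) - 1

print(links)

# Build the diagram only if the networkD3 package (an assumption of this
# sketch) is installed; otherwise the prepared tables are still usable.
if (requireNamespace("networkD3", quietly = TRUE)) {
  sankey <- networkD3::sankeyNetwork(
    Links = links, Nodes = nodes,
    Source = "source", Target = "target",
    Value = "value", NodeID = "name"
  )
}
```

In RStudio, typing `sankey` at the console after running the sketch should display the diagram in the Viewer pane, with the width of each band proportional to the number of people following that pathway.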