This July, Médecins Sans Frontières / Doctors Without Borders (MSF) field epidemiologists working in 10 countries gathered virtually to trial software templates that are transforming how epidemiologists analyse data across the globe.
Epidemiologists – or 'epis' – are MSF's first port of call during disease outbreaks, population displacement and field-based operational research. Epis are brought in to make sense of medical data, guide the approach to disease outbreaks and provide digestible evidence that allows MSF to better target its medical humanitarian interventions.
Epis are faced with several challenges in their data analysis in the field. One of the major hurdles they face is the variety of data collection formats. Another is that different epis take different methodological approaches to the analysis of data. Lastly, software for epidemiological data analysis might be too expensive to purchase, or people might not have the training on how to use it.
All these aspects lead to a lack of consistency in data analysis approaches and time lost in fixing data formats so they can be analysed. Combining, standardising and comparing analyses across outbreaks and other public health emergencies helps speed up the process of getting the data to those who need the information and helps us learn for the future.
Larissa Vernier, a field epidemiologist for MSF, experienced these issues first hand. She witnessed the lack of systematic methods of data analysis and how difficult it is to learn new software under pressure. With data coming in from all sides in different formats and poorly cleaned, making decisions takes precious time that epis don't have to spare.
A library of standard, context-specific analysis scripts would relieve some of this pressure for Larissa and other epis in the field. It needed to be open-source, transparent and usable across the epidemics and field settings MSF staff encounter most often. Such a library could help address the complexities of rapid data analysis in crises.
Writing the books for the library
So, R4Epis was born.
In October of 2018, the R for Epis team embarked on a journey to help address these complexities of rapid data analysis. Using R – open-source, transparent software – we took a collaborative, community approach to develop standard, context-specific tools for analysing and visualising epidemiological data.
R4Epis builds on the efforts made by the R for Epidemics Consortium (RECON), with MSF field staff partnering up with members of RECON.
The R4Epis team created templates for the most common outbreaks MSF field staff face: meningitis, Acute Jaundice Syndrome, cholera and measles. The software analyses line list data for outbreaks and generates reports covering the aspects of time, place and person. These aspects are the cornerstones of field epidemiology and are essential for a systematic approach to any epidemiological analysis.
R4Epis templates are accessible globally on GitHub and can read any line list data. The templates include code to clean, recode and rename variables to match the rest of the template. Once the user fixes these variables, they decide what analysis they need, and a simple click produces a report in Microsoft Word. This report includes graphs, tables, maps, text: any information the user requires.
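To give a flavour of what renaming and recoding variables involves, here is a minimal sketch. It is not the R4Epis code (the templates themselves are written in R); it is a short Python illustration, and the column names, value codes and the `standardise` function are all hypothetical.

```python
# Hypothetical mapping from one site's column names to the names a template expects.
RENAME = {"Age (yrs)": "age_years", "Sex": "sex", "Date onset": "date_of_onset"}

# Hypothetical recoding of free-text values to one standard set of codes.
RECODE = {"sex": {"m": "male", "male": "male", "f": "female", "female": "female"}}


def standardise(rows, rename=RENAME, recode=RECODE):
    """Return a copy of the line list with renamed columns and recoded values."""
    cleaned = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            name = rename.get(column, column)  # unknown columns keep their name
            if isinstance(value, str):
                value = value.strip()  # basic cleaning: drop stray whitespace
            mapping = recode.get(name)
            if mapping is not None and isinstance(value, str):
                # values with no mapping are left as they are, so they stay
                # visible for manual review
                value = mapping.get(value.lower(), value)
            new_row[name] = value
        cleaned.append(new_row)
    return cleaned


raw = [{"Age (yrs)": "34", "Sex": " F ", "Date onset": "2019-07-02"}]
print(standardise(raw))
```

Keeping the mappings in plain dictionaries means the site-specific part of the work sits in one place, while the rest of the analysis can rely on a fixed set of variable names.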
Fieldwork and analysis are ongoing, so we designed the templates to be rerun whenever new data is available. This regular reporting produces a systematic summary of the outbreak as it unfolds that anyone can view, learn from and act on.
Before the templates could be sent out into the world, we needed to test them out. We held three virtual hackathons in July, bringing together MSF field epis based across ten different countries, from Bangladesh to Lebanon.
The epis had a chance to put the templates to the test with their real field data with R4Epis coders on call to lend a hand. Our coders could immediately address issues, errors and clarifications, and real issues for first-time R4Epis users were logged for the team to address later as part of training materials.
One of our epis participating in the hackathon was Elburg van Boetzelaer. Based in Cox's Bazar, Bangladesh, Elburg is tracking Acute Jaundice Syndrome (AJS). Cox's Bazar is home to over 700,000 Rohingya people who fled Myanmar to find refuge in Bangladesh. The high population density, combined with poor water and sanitation conditions, has resulted in AJS outbreaks since October 2017.
The frustration for Elburg is the inconsistency in data gathering and measurement across the different NGOs operating in the mega-camp and the Bangladeshi Ministry of Health. "Different line list templates that contain different variables in different orders make analysis incredibly challenging," she said.
"R4Epis means I can combine those different line lists just by renaming and recoding variables," said Elburg. "All of a sudden, I have complete data sets and epi curves, creating a much more complete and accurate picture of what is going on in the mega-camp."
As she moves into her next field assignment this month, R4Epis will simplify Elburg’s handover to her successor. It will also be valuable as she settles into her new role: "R4Epis is going to make my life significantly easier. It's going to save time and standardise the way that we analyse our data in the field," she said.
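The idea Elburg describes, combining line lists from different organisations by renaming their variables and then drawing an epi curve from the result, can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the R4Epis implementation: the two source line lists, their column names and both functions are hypothetical, and the "epi curve" here is simply a count of cases per ISO week of onset.

```python
from collections import Counter
from datetime import date


def combine_line_lists(sources):
    """Merge several line lists into one, renaming columns to shared names.

    `sources` is a list of (rows, rename) pairs, where `rename` maps that
    source's column names to the shared ones.
    """
    combined = []
    for rows, rename in sources:
        for row in rows:
            combined.append({rename.get(k, k): v for k, v in row.items()})
    return combined


def weekly_case_counts(rows, onset_field="date_of_onset"):
    """Count cases per (ISO year, ISO week) of onset: the data behind an epi curve."""
    counts = Counter()
    for row in rows:
        iso = date.fromisoformat(row[onset_field]).isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


# Two hypothetical line lists that record the same things under different names.
ngo_a = [{"onset": "2019-07-01", "sex": "female"},
         {"onset": "2019-07-03", "sex": "male"}]
ngo_b = [{"Date of onset": "2019-07-09", "Gender": "male"}]

combined = combine_line_lists([
    (ngo_a, {"onset": "date_of_onset"}),
    (ngo_b, {"Date of onset": "date_of_onset", "Gender": "sex"}),
])
print(weekly_case_counts(combined))  # {(2019, 27): 2, (2019, 28): 1}
```

Once every source uses the same variable names, the counting step no longer needs to know where a record came from.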
What's next for R4Epis?
Templates for disease outbreaks are just the beginning for R4Epis. Survey analysis templates are in the pipeline: the more we can do to reduce the time epis spend cleaning and managing data, the better informed the decisions they can make from their analysis.
By moving away from STATA and other software and transitioning into R, we can make epidemiology transparent, reproducible and available for anyone to use – across borders and organisations.
The potential impact of R4Epis is indisputable. In the long term, epis will be able to make more rapid and accurate decisions in time-critical situations. As the library continues to be built and refined, we'll see greater consistency in analysis, more time for epis to focus on the details, and a greater understanding of the issues facing the world's most vulnerable people.
R4Epis is available and free for all epidemiologists. If you think that R4Epis would assist and streamline your data analysis, you can find the code here.
R4Epis is always changing and growing, so we'd love to hear your requests for additional coding or training. If there's something we're missing, get in touch. Let's work together to make sure we have all the tools we need.