As a data analysis group embedded in a proteomics facility, we look at many datasets each year, generated with a variety of proteomics techniques. Lately the emphasis has been on SWATH and TMT, both excellent techniques for discovery proteomics, each with its own strengths. In this talk I will discuss what we see as the key elements of a data analysis pipeline for such techniques. The starting point is often deciding which technique is better suited to a given project, and then capturing as much of the experimental design as possible in a uniform fashion. Batch considerations matter for larger experiments, which are becoming more common, as does deciding how batch effects will be handled, typically via a normalisation such as internal reference scaling (IRS) or other statistical methods. Analysis environments such as Perseus are fantastic for individual in-depth analyses, but automated analysis options can save time and help with reproducibility and versioning, which are important in a quality accreditation environment. In our in-house workflows we often use R-based tools such as SwathXtend or TMTPrePro. Finally, a crucial background ingredient is typically a controlled spike-in dataset that can be relied on for assessing the key methods and subsequently fine-tuning them.
Lead Scientist, Data Analysis
Australian Proteome Analysis Facility, Macquarie University
Dana Pascovici is a biostatistician working at the Australian Proteome Analysis Facility (APAF), based at Macquarie University, Sydney. She comes from a mathematical and computational background, having completed a bachelor's degree in Mathematics and Computer Science at Dartmouth College in the US, followed by a PhD in Mathematics at MIT. For the past 15 years, in both industry and research environments, her work has focused on developing reliable methods for interpreting and analysing data from a variety of quantitative proteomics platforms, lately emphasising SWATH and TMT, and wherever possible incorporating them into software workflows. Areas of particular relevance to APAF’s bioinformatics team have been plasma proteomics and plant proteomics of agriculturally important species. As a lead scientist in data analysis at APAF, she has helped researchers generate biological insights from their proteomics data, especially in the context of complex experiments. Such work is always collaborative, and benefits from interactions with researchers, students, and the APAF team of mass spectrometry specialists and analytical chemists.