Themes
The main themes that I have experience in are project management, data science, software development and quantitative research. Below I give examples of various projects that I have worked on under each of these themes.
Project Management
I have experience managing numerous projects as well as working as part of small and large collaborations. I enjoy working in teams to obtain quantifiable results efficiently.
Planet Hunters TESS
I designed, built, and continue to manage a global non-profit project that analyses time-series data with the help of a worldwide community of citizen scientists. The project, known as Planet Hunters TESS, is one of the largest citizen science projects in the world. Since I launched the project in 2018, the user base has grown to 45,000 volunteers from over 90 different countries.
Management of this project includes running A/B tests to ensure a positive user experience; using simulated data to assess the success and sensitivity of the project; managing user interactions to further the scientific goal of the project; maintaining and analysing large amounts of data; and researching new ways to grow and advance the project in the short and long term.
In order to further the scientific goal of the project we use numerous resources (e.g. telescopes, high performance computing). I apply for these resources via grant applications and manage their use in order to obtain results. To date, I have acquired and deployed over 500 hours of telescope time and millions of hours of computing time (∼1 million GBP). Overall, I have experience in identifying a need for additional resources, acquiring and allocating those resources, and managing a team to analyse the resulting data.
Coffee Chat
In collaboration with a team at NASA Ames, I produce and manage resources that further scientific engagement of underrepresented groups in STEM subjects. In particular, I designed and created a series of videos and accompanying resources (e.g. Jupyter Notebooks) that lower the barrier of entry into real scientific research. This work has a strong aspect of both resource and team management.
The coding tutorials have been viewed worldwide (over 15,000 views) and are used for teaching purposes at various universities across the USA.
Product Information Manager
I worked as a product information manager at Vetrag AG, Switzerland, where I was in charge of maintaining accurate and up-to-date information on the company's wares. This included overseeing the production of the company catalog and managing a team of five. Following my full-time employment at the company, I remained an external consultant for a number of years.
Data Science
Data science is a primary component of my research and I have led (and published) numerous projects involving the analysis of large data sets. Below is a non-exhaustive list of some examples:
Planetary Astrophysics
To characterise the masses and radii of planet candidates, I use parametric models that describe the orbital motions of planets around stars and the motion of planets across the surface of a star. I use least-squares optimization techniques to find initial models, followed by Markov Chain Monte Carlo (MCMC) sampling to make parameter inferences and uncertainty estimates.
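A minimal sketch of this two-stage approach (a least-squares fit used as the starting point for MCMC sampling) is given below. Everything in it is an illustrative assumption rather than the models used in the published work: the data are synthetic, a simple sinusoid with a known period stands in for a full orbital model, the priors are flat, and the sampler is a hand-written random-walk Metropolis algorithm rather than an established sampling package.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(42)

# Synthetic data for a circular orbit; all values are assumptions for illustration.
PERIOD = 3.0                       # days, treated as known
SIGMA = 2.0                        # per-point uncertainty (m/s)
TRUE_THETA = (10.0, 0.7, 1.5)      # amplitude (m/s), phase (rad), offset (m/s)
t = np.sort(rng.uniform(0, 30, 80))


def model(theta, times):
    """Sinusoidal stand-in for a parametric orbital model."""
    amp, phase, offset = theta
    return amp * np.sin(2 * np.pi * times / PERIOD + phase) + offset


y = model(TRUE_THETA, t) + rng.normal(0, SIGMA, t.size)


def residuals(theta):
    """Uncertainty-scaled residuals between data and model."""
    return (y - model(theta, t)) / SIGMA


# Stage 1: least-squares optimisation gives an initial model.
fit = least_squares(residuals, x0=[5.0, 0.0, 0.0])


def log_prob(theta):
    """Gaussian log-likelihood with flat priors (up to a constant)."""
    return -0.5 * np.sum(residuals(theta) ** 2)


def metropolis(start, n_steps=6000, step=(0.3, 0.03, 0.2)):
    """Random-walk Metropolis sampler started from the least-squares solution."""
    chain = np.empty((n_steps, len(start)))
    current = np.asarray(start, dtype=float)
    current_lp = log_prob(current)
    for i in range(n_steps):
        proposal = current + rng.normal(0, step)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, exp(proposal_lp - current_lp)).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain


# Stage 2: sample the posterior and discard the first 1000 steps as burn-in.
chain = metropolis(fit.x)[1000:]
amp_median = np.median(chain[:, 0])
amp_lo, amp_hi = np.percentile(chain[:, 0], [16, 84])
print(f"amplitude = {amp_median:.2f} (+{amp_hi - amp_median:.2f} / -{amp_median - amp_lo:.2f})")
```

The median and the 16th to 84th percentile range of the chain give the parameter estimate and its uncertainty. Starting the sampler from the least-squares solution shortens the burn-in phase.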
I use existing probabilistic models to calculate the likelihood that a signal seen in time-series data is planetary in origin (as opposed to a false positive). This allows me to statistically validate planetary signals.
Solar System Objects
I obtained and characterized time-series brightness observations of comets. I corrected and calibrated the raw time-series data for various effects such as galactic extinction, background brightness variations and instrumental effects (e.g. by comparing the brightness variations of the target to the brightness variations of nearby stars).
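The comparison with nearby stars can be sketched as a simple differential photometry step. This is a hedged illustration, not the calibration pipeline used for the observations: the function name, the array layout, and the choice of a median across comparison stars are all assumptions.

```python
import numpy as np


def differential_light_curve(target_flux, comparison_flux):
    """Divide out brightness variations shared by the target and nearby stars.

    target_flux: array of shape (n_times,)
    comparison_flux: array of shape (n_stars, n_times)
    Returns the corrected target light curve, normalised to a median of 1.
    """
    target_flux = np.asarray(target_flux, dtype=float)
    comparison_flux = np.asarray(comparison_flux, dtype=float)
    # Normalise each comparison star by its own median so bright stars do not dominate.
    normalised = comparison_flux / np.median(comparison_flux, axis=1, keepdims=True)
    # The median across stars at each time step estimates the shared systematic trend.
    trend = np.median(normalised, axis=0)
    corrected = target_flux / trend
    return corrected / np.median(corrected)


# Synthetic example: a shared systematic trend multiplies every light curve.
t = np.linspace(0, 10, 200)
systematic = 1 + 0.1 * np.sin(t)
intrinsic = 1 + 0.02 * np.sin(5 * t)              # signal intrinsic to the target
target = 100.0 * intrinsic * systematic
comparisons = np.array([50.0, 200.0, 80.0])[:, None] * systematic
corrected = differential_light_curve(target, comparisons)
```

In this idealised case the shared trend cancels exactly and only the intrinsic variation of the target remains. With real data the comparison stars carry their own noise, so the correction is only approximate.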
The measured periodic variations in brightness allowed me to determine the rotation of the cometary body and the change in that rotation. The periodicity was determined using phase dispersion minimization and Lomb-Scargle algorithms in Python.
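As a rough illustration of the Lomb-Scargle step, the sketch below recovers a period from unevenly sampled synthetic data with `scipy.signal.lombscargle`. The light curve, the period, and the search grid are made-up values, and this is not necessarily the implementation used in the original analysis. Note that this SciPy function expects angular frequencies.

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(0)

# Synthetic, unevenly sampled light curve with an assumed period of 0.37 days.
true_period = 0.37
t = np.sort(rng.uniform(0, 10, 300))
flux = 1.0 + 0.05 * np.sin(2 * np.pi * t / true_period) + rng.normal(0, 0.01, t.size)

# Trial periods to search; lombscargle takes angular frequencies (rad per day).
periods = np.linspace(0.1, 1.0, 5000)
angular_freqs = 2 * np.pi / periods

# Subtract the mean first so the constant offset does not leak into the periodogram.
power = lombscargle(t, flux - flux.mean(), angular_freqs)
best_period = periods[np.argmax(power)]
print(f"strongest periodicity near {best_period:.3f} days")
```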
Social Science
The Zurich Study on the Social Development from Childhood to Adulthood (z-proso) interviews children, parents and teachers in order to further our understanding of behavioural problems in children and young adults and to help policymakers develop more effective strategies to promote psychosocial health and reduce violence. I used the data from each round of interviews to assess and help mitigate the effects (e.g. biases, reduction of the power of the study) of ‘missing data’. In brief, I used multiple probit regression models to assess predictors of parent and youth non-participation.
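A minimal sketch of a probit regression for non-participation is shown below, fitted by maximum likelihood with SciPy. The predictors, coefficients, and data are entirely synthetic and hypothetical; they do not reflect the z-proso variables or results. In practice a statistics package with standard errors and diagnostics would be the more natural choice.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def fit_probit(predictors, outcome):
    """Maximum-likelihood probit fit; returns [intercept, coefficients...]."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    outcome = np.asarray(outcome, dtype=float)

    def neg_log_likelihood(beta):
        z = design @ beta
        # P(outcome = 1) = Phi(z); logcdf stays numerically stable for large |z|.
        return -np.sum(outcome * norm.logcdf(z) + (1 - outcome) * norm.logcdf(-z))

    result = minimize(neg_log_likelihood, np.zeros(design.shape[1]), method="BFGS")
    return result.x


# Synthetic example with two hypothetical predictors of non-participation.
rng = np.random.default_rng(1)
n = 5000
X = rng.normal(size=(n, 2))
true_beta = np.array([-0.5, 0.8, -0.4])           # intercept and two slopes (assumed)
latent = true_beta[0] + X @ true_beta[1:] + rng.normal(size=n)
non_participation = (latent > 0).astype(float)    # 1 = did not take part

beta_hat = fit_probit(X, non_participation)
print("estimated coefficients:", np.round(beta_hat, 2))
```

A positive coefficient means that larger values of the predictor are associated with a higher probability of non-participation.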
Project Performance (of Planet Hunters TESS, see above under ‘Project Management’)
Project performance is assessed using injection (simulated data) and recovery tests. This includes the analysis of large amounts of data in order to quantify the success of the project.
I designed and tested a framework that uses the simulated data to determine the skill of individual volunteers, in turn allowing us to maximize the scientific output of the project. In ongoing work, I am investigating the use of machine learning and latent variable models to further optimize this framework by including more indicative ‘skill’ parameters (e.g. time of classification, duration, device used).
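One simple way to turn injection and recovery results into volunteer skill is sketched below. This is a hypothetical illustration, not the framework used by the project: the function names, the smoothing constants, and the default weight for volunteers who have not seen any simulations are all assumptions.

```python
from collections import defaultdict


def volunteer_weights(sim_classifications, prior_hits=1, prior_trials=2):
    """Estimate each volunteer's skill from simulated (injected) signals.

    sim_classifications: iterable of (user, recovered) pairs, where `recovered`
    says whether the volunteer marked the injected signal.
    Returns {user: smoothed recovery fraction}; the smoothing keeps volunteers
    with very few classifications away from the extremes of 0 and 1.
    """
    hits = defaultdict(int)
    trials = defaultdict(int)
    for user, recovered in sim_classifications:
        trials[user] += 1
        hits[user] += int(recovered)
    return {u: (hits[u] + prior_hits) / (trials[u] + prior_trials) for u in trials}


def weighted_score(votes, weights, default_weight=0.5):
    """Weighted fraction of volunteers who flagged a signal in one light curve.

    votes: iterable of (user, flagged) pairs for a single real light curve.
    Volunteers without a simulation record receive `default_weight`.
    """
    votes = list(votes)
    total = sum(weights.get(user, default_weight) for user, _ in votes)
    flagged = sum(weights.get(user, default_weight) for user, f in votes if f)
    return flagged / total if total else 0.0


# Volunteer "a" recovered 8 of 8 injected signals, volunteer "b" only 1 of 8.
sims = [("a", True)] * 8 + [("b", True)] + [("b", False)] * 7
weights = volunteer_weights(sims)

# "c" has not classified any simulations and therefore gets the default weight.
score = weighted_score([("a", True), ("b", False), ("c", True)], weights)
```

Weighting votes in this way lets the classifications of volunteers who reliably recover simulated signals count for more when ranking real candidates.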
Software Development
I have 9 years of experience in using Python to develop algorithms, write scripts and software, and build analysis pipelines. I have expertise in using Python to build parametric models and make parameter inferences through numerical sampling and forward modelling procedures. Additionally, I have experience in writing analysis software that allows non-experts to use diagnostic tools to model time-series data.
I developed and published a suite of high-level analysis tools, known as the Light Curve Analysis Tool for Transiting Exoplanets (LATTE), that are contained in a Python module with a graphical user interface. This software tool allows for a fast, in-depth analysis of time-series data and is widely used within the exoplanet and stellar astrophysics community.
I developed a pipeline, in Python, that 1) automatically collects data from a NASA database, 2) generates simulations for benchmarking and calibration, 3) collects and aggregates classifications from users based on calibrations from simulations, and 4) identifies features in time series for further modelling. This pipeline makes use of libraries such as numpy, pandas, scipy, and sklearn, and uses parallelization tools such as MPI to speed up calculations. This pipeline is hosted on the University of Oxford Physics Department's computing cluster.
The orbital motion of multiple stars produces observable dynamical effects in the time-series brightness measurements of these stars. I built a generalized probabilistic model that predicts the observable effects of this motion in multiple types of astronomical time-series data, using symbolic coding tools including pyMC. I then used numerical sampling methods, such as Hamiltonian Monte Carlo, to infer the parameters of the physical stellar system by modelling the data.
Quantitative Research
An important research skill is to assess the landscape of existing knowledge and use it to identify novel and interesting research questions and directions. Here are some examples where I have identified a gap in knowledge or tools and applied quantitative methods to address it:
Missing data
I am currently building an algorithm that mitigates the effects of gaps (or ‘missing data’) in astronomical time-series data. I am carrying out this work to help identify low-amplitude variable signals that are hidden by the signature of gaps in Fourier analysis. This, in turn, will have important implications for our ability to precisely characterize the properties of stars and planets. Beyond stellar and planetary physics, missing data poses a ubiquitous threat to the study of periodic signals in time-series data across numerous disciplines. This work has given me extensive expertise in signal processing methods, imputation techniques, and Fourier analysis.
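The signature that gaps leave in Fourier analysis can be illustrated with the spectral window, i.e. the power spectrum of the sampling pattern itself. The sketch below is only a demonstration of the problem, not the mitigation algorithm described above, and the observing pattern (half a day of data followed by half a day without, over 20 days) is an assumed toy example.

```python
import numpy as np


def spectral_window(times, freqs):
    """Power of the sampling pattern: |sum_k exp(-2*pi*i*f*t_k)|^2 / N^2."""
    times = np.asarray(times, dtype=float)
    phases = np.exp(-2j * np.pi * np.outer(freqs, times))
    return np.abs(phases.sum(axis=1)) ** 2 / times.size ** 2


# Continuous sampling versus the same time span with regular half-day gaps.
cadence = 0.01                                    # days between observations
continuous = np.arange(0, 20, cadence)
gapped = continuous[(continuous % 1.0) < 0.5]

# Compare the two windows at one cycle per day, the repetition rate of the gaps.
window_continuous = spectral_window(continuous, [1.0])[0]
window_gapped = spectral_window(gapped, [1.0])[0]
print(f"window power at 1 cycle/day: continuous={window_continuous:.3f}, gapped={window_gapped:.3f}")
```

The gapped sampling produces a strong peak in the window at the repetition rate of the gaps, while the continuous sampling does not. Every real signal in the gapped data is accompanied by sidelobes offset by this frequency, and these sidelobes can hide low-amplitude signals.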
Planet evolution
There is a notable gap in our understanding of how planets and stars interact in the later stages of the star’s life. Beyond general scientific curiosity, knowing how stars and planets interact will directly help our understanding of what will happen to our own Solar System in the future. To address this, I am carrying out a targeted search for old planets in order to quantify planet-star interactions such as tides and in-spiraling (planets spiraling in towards their host stars).