The goal of this work is to develop machine learning tools that can infer information from fluorescence localisation imaging with photobleaching (FLImP) images collected at the Central Laser Facility (CLF). FLImP makes it possible to map the spacing of molecules within complexes at a resolution better than 5 nm. FLImP has been used to characterise the molecular architecture of complexes of the epidermal growth factor receptor (EGFR), the target of a number of drugs in clinical use. Signals from EGFR control cell growth, and EGFR mutations are implicated in many cancers. At present, successful FLImP imaging is a user-intensive process, requiring manual intervention for the selection of regions of interest (ROIs) for image segmentation, autofocusing of images, and track selection. More efficient tools need to be developed to automate FLImP and enable its translation to clinical use.
Efficient autofocusing
At present, the optimal focus is determined by applying a deconvolution technique on top of the Oxford Nanoimager's autofocusing. In this method, the defocused image is treated as a convolution of the in-focus image with the defocusing point spread function (a Gaussian kernel approximating the experimentally determined PSF). We plan to develop a machine learning-based offline autofocusing method that predicts the focusing distance from a small number of defocused images without any prior knowledge of the defocus distance, its direction, or the PSF.
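As a sketch of this forward model, a defocused frame can be simulated by convolving an in-focus image with a Gaussian approximation of the defocus PSF. The sigma-per-micron scaling below (`sigma_per_um`) is an assumed placeholder, not a calibrated instrument value:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def defocus(in_focus: np.ndarray, defocus_um: float,
            sigma_per_um: float = 2.0) -> np.ndarray:
    """Blur an in-focus image with a Gaussian kernel whose width scales
    with the absolute defocus distance (a common PSF approximation)."""
    sigma = abs(defocus_um) * sigma_per_um
    return gaussian_filter(in_focus, sigma=sigma)

# A single point emitter spreads out as the defocus distance grows
img = np.zeros((64, 64))
img[32, 32] = 1.0
blurred = defocus(img, defocus_um=0.5)
```

Because the Gaussian kernel is symmetric in the defocus distance, a single frame cannot reveal the defocus direction, which is why the planned method uses more than one defocused image.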
We have developed three convolutional neural network (CNN) based models, one deep CNN and two pre-trained models based on MobileNetV3 and InceptionV3, respectively, to predict the focusing distance of the microscope images. The images obtained from CLF contain both fiducial markers and cell images. The deep-CNN model outperformed the pre-trained models, achieving a 91% correlation between the predicted and true focusing distances. A Bland-Altman analysis between the predictions and ground truth showed 95% limits of agreement between +0.66 $\mu$m and -0.65 $\mu$m, which is encouraging but not yet accurate enough for our purposes. We are exploring a deep Q-network-based reinforcement learning model to achieve autofocusing with the desired accuracy.
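For reference, the Bland-Altman 95% limits of agreement quoted above are the mean prediction error plus or minus 1.96 standard deviations of the errors. A minimal sketch on synthetic stand-in data (the noise level is illustrative, not our measured error):

```python
import numpy as np

def limits_of_agreement(pred: np.ndarray, truth: np.ndarray):
    """Bland-Altman 95% limits of agreement: bias +/- 1.96 * SD of the
    pairwise differences."""
    diff = pred - truth
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias - 1.96 * sd, bias + 1.96 * sd

# Synthetic stand-in: true defocus values and noisy "predictions"
rng = np.random.default_rng(0)
truth = rng.uniform(-2.0, 2.0, size=200)          # defocus in micrometres
pred = truth + rng.normal(0.0, 0.33, size=200)    # hypothetical model error
lo, hi = limits_of_agreement(pred, truth)
```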
Automatic ROI detection
At present, a classical image segmentation approach is used to determine ROIs. First, ROIs are drawn on the cells of interest (Hoechst channel) and the EGFR receptors (FLImP labelling channel) by applying the Otsu threshold and the triangle threshold, respectively. An ROI is then selected for recording a FLImP video if the foreground fractions in both channels satisfy pre-determined thresholds.
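The two-channel acceptance check can be sketched as follows; the fraction thresholds (`MIN_CELL_FRAC`, `MIN_RECEPTOR_FRAC`) and the synthetic images are assumed placeholders, not the values used in the pipeline:

```python
import numpy as np
from skimage.filters import threshold_otsu, threshold_triangle

# Hypothetical acceptance thresholds on the foreground pixel fraction
MIN_CELL_FRAC = 0.05
MIN_RECEPTOR_FRAC = 0.02

def roi_accepted(hoechst: np.ndarray, flimp: np.ndarray) -> bool:
    """Accept an ROI if both channels contain enough foreground:
    Otsu threshold on the Hoechst (cell) channel, triangle threshold
    on the FLImP labelling (receptor) channel."""
    cell_mask = hoechst > threshold_otsu(hoechst)
    receptor_mask = flimp > threshold_triangle(flimp)
    return bool(cell_mask.mean() >= MIN_CELL_FRAC
                and receptor_mask.mean() >= MIN_RECEPTOR_FRAC)

# Synthetic two-channel frame: a bright cell region on a dim background
rng = np.random.default_rng(1)
hoechst = rng.normal(10.0, 2.0, (100, 100))
hoechst[30:70, 30:70] += 100.0
flimp = rng.normal(5.0, 1.0, (100, 100))
flimp[40:60, 40:60] += 50.0
ok = roi_accepted(hoechst, flimp)
```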
We are exploring various machine learning models to automate the ROI detection process. A two-step model has been developed: it first classifies frames as containing cells or not, and then segments ROIs from the frames with cells. We compared several architectures built from different backbones and additional neural network modules for segmentation. Three segmentation models with 10 backbones have been designed to classify and segment ROIs from the image frames with cells. JPU-FCN (Joint Pyramid Upsampling-Fully Convolutional Network) and DeepLabV3 are the best-performing segmentation models, and MobileNetV3 is the best-performing backbone. These models utilise information from both channels and demonstrate promising performance. We are now validating our model against different sets of manual ROIs.
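A minimal sketch of the two-step classify-then-segment pipeline, assuming two input channels (Hoechst plus FLImP labelling). The tiny networks here are illustrative stand-ins for the MobileNetV3 backbones and JPU-FCN/DeepLabV3 heads used in practice:

```python
import torch
import torch.nn as nn

class FrameClassifier(nn.Module):
    """Step 1: flag whether a frame contains cells at all."""
    def __init__(self, in_ch: int = 2):  # two imaging channels
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
    def forward(self, x):
        return torch.sigmoid(self.net(x))  # P(frame contains cells)

class RoiSegmenter(nn.Module):
    """Step 2: per-pixel ROI probability, run only on frames with cells."""
    def __init__(self, in_ch: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 1, 1))
    def forward(self, x):
        return torch.sigmoid(self.net(x))

def predict_roi(frame, clf, seg, cell_thresh: float = 0.5):
    """Segment only if the classifier believes the frame contains cells."""
    if clf(frame).item() < cell_thresh:
        return None  # no cells detected: skip segmentation
    return (seg(frame) > 0.5).squeeze(0)

frame = torch.rand(1, 2, 64, 64)  # (batch, channels, height, width)
mask = predict_roi(frame, FrameClassifier(), RoiSegmenter())
```

Running the cheap classifier first means the heavier segmentation network is only evaluated on frames likely to contain cells.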
Image below: ROI or mask prediction using our model and its comparison with the manual mask.
Automatic track selection
Each FLImP series typically returns between 1,000 and 10,000 track objects, of which only a small fraction is suitable for FLImP analysis. At present, identifying FLImP-suitable tracks is a laborious process, requiring trained operators to manually work through the track lists from each FLImP series to find tracks that may be suitable for the downstream FLImP fitting processes. The goal of this work is to identify features that distinguish good tracks from bad ones and to classify the tracks automatically.
First, an autoencoder is used for unsupervised feature extraction. The track data are transformed into Fourier space to obtain a more continuous representation. A Kalman filter-based method is then used for denoising and level classification within each individual track, to reconstruct and polish conventionally unused tracks, which can boost the overall processing pipeline.
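The denoising step can be sketched with a one-dimensional Kalman filter that assumes a locally constant intensity level; the process and measurement noise values (`q`, `r`) and the synthetic two-level trace are illustrative, not fitted to our data:

```python
import numpy as np

def kalman_denoise(z: np.ndarray, q: float = 1e-3, r: float = 0.1) -> np.ndarray:
    """1D Kalman filter assuming a locally constant level.
    q: process noise variance, r: measurement noise variance."""
    x, p = float(z[0]), 1.0            # initial state estimate and variance
    out = np.empty(z.shape, dtype=float)
    for i, meas in enumerate(z):
        p = p + q                      # predict: level assumed constant
        k = p / (p + r)                # Kalman gain
        x = x + k * (meas - x)         # update with the new measurement
        p = (1.0 - k) * p
        out[i] = x
    return out

# Synthetic two-level trace mimicking a single photobleaching step
rng = np.random.default_rng(0)
clean = np.concatenate([np.full(100, 2.0), np.full(100, 1.0)])
noisy = clean + rng.normal(0.0, 0.2, clean.size)
smooth = kalman_denoise(noisy)
```

The filtered trace makes the discrete intensity levels easier to separate, which is what the level-classification step then exploits.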
Image below: Classification results from the autoencoder
Image below: Track feature extraction
Image below: Kalman filter-based method for denoising and level classification
Note: This project is in collaboration with scientists from CLF.