Managed and operated by the Scientific Computing and Digital Infrastructure departments of the Science and Technology Facilities Council (STFC), the RAL data centre is also home to the second-largest TFinity tape robot, which has recently been upgraded with two additional frames, bringing it to 13 frames.
Hiten Patel, who leads the team managing the RAL data centre, said, “These libraries have a combined storage space of 440 petabytes, and currently have less than half of that in use, holding over 300 million files but with the capacity to more than double that number. Users can call up the data on their laptops, and within a few seconds the tape robot will have located and scanned the relevant tape and made the data available.”
Just one petabyte is estimated to be equivalent to 20 million tall filing cabinets of files, or 500 billion pages of text, so you can see how efficient and effective the tape libraries are!
Affectionately named Asterix and Obelix, these automated tape libraries are a cost-effective way to store the vast amounts of data generated through scientific research. The RAL data centre is a nationally leading Tier-1 High Performance Computing Facility and is used for the particle physics data coming from the Large Hadron Collider at CERN, managed by GridPP*. Asterix now has a maximum storage capacity of 240 petabytes, and is currently the largest Spectra TFinity tape library in the UK and Europe.
“It's nice to be a record holder, even for a short while!” said Hiten. “It won't be long, though, before even bigger Spectra tape robots are installed in European data centres.”
The slightly smaller Obelix has 13 frames and stores data for use by the large-scale national facilities on the RAL site, such as the Diamond Light Source, which provides funding for the two additional frames. It also stores huge amounts of environmental data from the Intergovernmental Panel on Climate Change and some from the NERC-funded JASMIN supercomputer, also housed at RAL and managed jointly by CEDA (part of RAL Space) and Scientific Computing.
Scientific Computing's Tim Folkes is responsible for managing these huge data libraries and the data being housed in them. He said, "Both of the libraries are critical in addressing STFC's data storage solutions. They are already taking a combined average of 18 petabytes of new data annually but, with the start of Run 3** at the Large Hadron Collider last July, Asterix is expected to be in even higher demand in future to store large amounts of raw experiment data."
Hiten Patel, Tom Ashby and Andrew Knightley, HPC Operations Team, Digital Infrastructure Department, pictured beside Asterix, the largest Spectra TFinity tape library in the UK and Europe.
* GridPP is a community of particle physicists and computer scientists based in the United Kingdom and at CERN. Drawing on expertise from nineteen UK institutions, its vision is to create, manage and oversee the evolution of the computing infrastructure needed to maintain the UK's position as a world leader in particle physics.
** Run 3 is the newest period of data-taking for the experiments run in the Large Hadron Collider (LHC) at CERN. This is the first period of data transfer after more than three years of upgrade and maintenance work, and the experiments are expected to record more collisions during Run 3 than in the two previous LHC runs combined.