Users of the Research Data Store (RDS) may have noticed intermittent performance issues affecting storage recently, which have been causing some disruption. Work to resolve these issues is a necessary precursor to a major program of work in which we will replace, expand and enhance the RDS service.
Phase 1
Between now and summer 2025 we will replace around half of the hardware that underpins the RDS service: specifically, the portion used for health research data. This will increase capacity from 8 PB to around 14 PB.
The new hardware will underpin a much-expanded RDS service for storing health research data (irrespective of which college or department generates or owns the data).
In preparation for this change, we have been asking all BEAR project owners to confirm whether their project is primarily health research. This is being done during the routine project re-registration process, and you may have already been asked this question.
Beyond classifying your data correctly when you register or renew a project, there is nothing else that you need to do.
IMPORTANT: Specifying that your project is health research does not imply any additional cybersecurity certification and, in general, does not imply a level of enhanced security over and above that of the standard RDS service. In practice, new security features (such as on-disk encryption) may be enabled in due course, and this may make it easier to obtain additional cybersecurity certification. However, acquiring additional cybersecurity certification remains a separate activity, outside the scope of this program of work.
Phase 2
In Phase 2, we will enable cross-site replication of data. This applies to all data in the RDS service. The primary goal of this work is to improve our disaster recovery position.
Currently, we retain one primary copy of all data (this is the working copy of your data held on the RDS). In addition, we keep two offline copies of your data in our backup system, with each copy being held in a different location. Therefore, in total we have three copies of your data, but only one primary copy.
What does this mean in practice?
It means that in the event of a disaster resulting in the complete loss of the RDS service (such as a building-wide fire), we would still have a copy of your data. However, it could be many weeks before you were able to access that data.
In real life, disasters of this scale are very rare, and most users will never experience one.
We will drastically improve on our current position by synchronously replicating all RDS data across two data centres, in effect creating two primary copies of your data in two different locations.
We expect this work to be completed by autumn 2025.
Phase 3
In the final phase of this work, we will reconfigure existing services so that users can access data seamlessly from either location. At this point, recovery should be close to instant. We hope to complete this final phase by Christmas 2025.