‘Data and AI for researchers’ at the University’s first Data Conference – Birmingham Environment for Academic Research

The University’s inaugural Data Conference was held on Wednesday 26 February. The conference offered a series of talks and interactive sessions covering advice and tips on handling data, with a focus on building a community of data professionals. We ran a session in collaboration with the Institute for Data and AI entitled ‘Data and AI for researchers’. The aim was to raise awareness of the support available to researchers at the University for handling data, followed by case studies showing how researchers have made use of the various services.

As this was the first Data Conference at the University, we were keen to find out a bit more about the audience. Using Mentimeter, we discovered that 70% were in Professional Services roles, with the remainder being researchers. Existing awareness of the services provided by Advanced Research Computing (ARC) seemed high, with only 4 people unaware of them (see slide below). However, only 60% of attendees were aware of the recently formed Institute for Data and AI.

Mentimeter poll showing highest awareness in the audience of research data storage, followed by the training workshops that we provide.

Luckily, the first speaker was the Director of the Institute for Data and AI (IDAI), Professor Paolo Missier, who introduced the Institute and the interdisciplinary approach it takes to addressing societal challenges related to data and AI. As well as describing the aims of IDAI, Paolo introduced the people supporting it, including the Deputy Directors, Professional Services support staff, and Research Data Scientists. Paolo described the three pillars underlying the Institute: 1) Education – upskilling the research community in data and AI, 2) Engagement – enabling strategic partnerships, and 3) Research – attracting large interdisciplinary grants to support data-intensive research. Extended networks were also discussed, including plans for a ‘network of networks’, with a call for IDAI Fellows and Affiliates to follow soon.

The Deputy Director of ARC, Dr Andrew Edmondson (Ed), then gave an introduction to the services and support provided by ARC for researchers. Advanced Research Computing provide BEAR (Birmingham Environment for Academic Research) services, which include High Performance Computing, secure storage, training, Research Software Engineers and, in collaboration with IDAI, access to Research Data Scientists.

Ed gave some statistics on the current use of BEAR services: nearly 5,500 users and more than 4,000 projects, together producing nearly 6 Petabytes (1 PB = 1000 TB) of research data! Ed also described the various training workshops available to research students and research staff, as well as free and funded support from the Research Software Engineering service and Research Data Science service.

It was then time to hear from the researchers who use our services, with Dr Jason Turner from the Institute of Inflammation and Ageing describing his use of the research data storage provided in the BEAR Research Data Store. Jason works across a range of groups and projects in autoimmune disease research. Across 41 projects, he has access to 135TB of free storage and an additional 167TB of purchased storage, with only 3 projects requiring storage to be purchased above the free quota of 5TB. Jason gave a great analogy, comparing the number of books that would be required to hold 302TB of information: with 13.5 million books in the British Library, it would require a building 22x the size of the current one!

Jason’s slide showing how many books and space would be required to hold 302TB of data – see text for details.
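For anyone who wants to check the arithmetic behind the analogy, here is a rough back-of-envelope sketch. The storage figures and the British Library book count come from Jason's talk as reported above; the size of a single book (about 1MB of text) is our own assumption for illustration and may not be the figure used on the slide.

```python
# Back-of-envelope check of the book analogy.
# BYTES_PER_BOOK is an assumption (roughly 1 MB of plain text per book);
# it is not a figure taken from the slide.
TB = 10**12  # decimal terabyte, consistent with 1 PB = 1000 TB
MB = 10**6

free_tb = 135        # free storage across Jason's projects
purchased_tb = 167   # additional purchased storage
total_tb = free_tb + purchased_tb

BYTES_PER_BOOK = 1 * MB            # assumed average size of one book
BRITISH_LIBRARY_BOOKS = 13_500_000  # book count quoted in the talk

books_needed = total_tb * TB // BYTES_PER_BOOK
libraries_needed = books_needed / BRITISH_LIBRARY_BOOKS

print(f"Total storage: {total_tb} TB")
print(f"Books needed: {books_needed:,}")
print(f"British Libraries needed: {libraries_needed:.1f}")
```

Under that assumption, the total comes to 302TB and the result lands at roughly 22 British Libraries, in line with the figure quoted above. A different assumed book size would scale the answer proportionally.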

Dr Vincenzo Brachetta from the School of Metallurgy and Materials then described the benefits of using our High Performance Computing system, BlueBEAR. Using generative AI, Vincenzo predicted what BlueBEAR would look like (a fluffy blue bear!), but that doesn’t quite match the real supercomputer, with its 40,000 compute cores and >200TB of memory. Vincenzo then went on to describe the use of supercomputing to enable data analysis in various data-driven disciplines, including the humanities. He finished with a case study on how he has used BlueBEAR to develop numerical simulations to predict residual stresses (read more in Vincenzo’s case study).

Vincenzo’s slide describing the advantages of BlueBEAR, including free access and support.

As Vincenzo mentioned, the services and support provided by ARC (and IDAI) are used by a wide variety of disciplines across the University. Dr Hazel Wilkinson, from the School of English Literature and a Deputy Director of IDAI, then described her use of Research Software Engineers (RSEs) and Research Data Scientists. Hazel has compiled a database of over 1 million images of eighteenth-century printers’ ornaments, which were used to illustrate books. RSEs helped to move the database from Cambridge, where Hazel was based, to the BEAR Research Data Store and created a new website to share the data with others – see compositor.bham.ac.uk. A visual image search engine was also created to help find matching images in the database. Hazel has a pump-priming project with IDAI to continue to clean up the database and characterise the images.

Professor Dylan Owen from the Institute of Immunology and Immunotherapy then finished the lightning talks by describing his use of RSEs and Research Data Scientists to build a collaborative public repository for single-molecule microscopy data. In the field of single-molecule imaging, it is very informative to compare images with those from other groups, and generating images of just 1MB in size can take months. Therefore, Dylan wanted to create a public database of single-molecule images that other research groups could access. RSEs set up the database for Dylan and help to run it, with a Research Data Scientist producing user guides to aid researchers. You can find out more about the nano-org project in the bioRxiv paper, and visit the database at nano-org.bham.ac.uk.

We finished up with a Q&A session with the speakers, where the importance of interdisciplinary collaboration and the formation of networks was discussed. The call for IDAI’s network of Fellows and Affiliates will be live soon, but ARC also have a team of over 20 BEAR Champions and coordinate several Special Interest Groups.

Slides from other sessions at the UoB Data Conference

University of Birmingham staff members can access slides from the other sessions held at the Data Conference via the SharePoint site here.