On Wednesday 6th December we held our first Digital Research Conversation or DRC to a sell out crowd! The topic was ‘Managing Data from Creation to Destruction’ and our aim was to follow the example set by the University of Lancaster with their Data Conversations where they encourage researchers to come together to talk about the issues they wish to discuss around data management. Some feedback from Lancaster was that researchers from the arts were put off by having ‘Data’ in the title, hence we decided to use ‘Digital Research’ in the hope that we would get a good mix of attendees from both the arts and sciences (which we achieved – see pie chart below showing attendees per College).
We held the event at the Graduate School’s home – Westmere. Setting up the room was a little interesting – it was going to be a squeeze if everyone turned up!
We started with some delicious pizza and an opportunity to network in the lovely festive surroundings of Westmere complete with two Christmas trees.
It was then time to get down to the talks. First up was Melanie Britton who is a Senior Lecturer and Chair of Chemistry’s Data Management Committee. She has embedded the use of data management plans within her own research group (which uses magnetic resonance imaging of chemistry), in order to make data handling more efficient and create a system enabling open data. Originally the lab was having to back up to DVD and external hard drives. They also had difficulties accessing network drives through different instruments, so her lab now uses BEAR DataShare (our sync & share solution) to upload all their active data to (including raw, processed and analysed) and all material for reports and manuscripts. Melanie is full of praise for BEAR DataShare saying “it has transformed my life”!! Each PhD student/post-doc then has their own Research Data Store where any ‘approved’ data goes, and data associated with any publications or a thesis is archived in the Research Data Archive. Data stored on instruments is reviewed annually and any unwanted data deleted so that only viable data is kept long-term. They categorise their data by technique used to collect it, so for each paper written they will need access to several different folders. Each folder contains a readme file containing all the experimental information. Melanie advised to keep a data management plan specific to your research, ensure that the PI retains access to data after students finish and include a check on data recoverability every 6 months.
Catherine Smith from Theology then introduced us to the very different data that she works with using electronic tools to analyse texts. She is currently writing software and using it to analyse translations of the Gospel of John, comparing all the different versions and looking for differences. As a lot of the translations are done by volunteers, the accuracy of the secondary data needs to be assessed. Catherine uses a database to manage her data and is looking to automate data deposition so she doesn’t have to remember to dump data in it! She makes use of Git (version control) to manage her code. Transcriptions are all made available on a website and small quantities of data are sent to ePapers. Catherine has developed a workflow but needs to refine it as it is currently too rigid.
Aslam Ghumra (our Research Data Manager from within the BEAR team) then described his own perspective on data management describing the importance of having a plan for when things go wrong eg. a laptop being stolen. As well as backing up data, Aslam emphasised the importance of reviewing your data to assess its value and determining which data you actually need to retain and archive. Open access to data allows us to compare and validate results but it also requires researchers to keep data in a format that is readable by both other humans and machines eg. is the data stored in a common & persistent file format? He discussed one of the advantages of making your own data open as increasing your digital footprint.
Danielle Fuller from the Department of English Literature then joined our other speakers for the panel session. As she works with sensitive data, Danielle was able to describe how she deals with it securely by anonymising sound files and transcripts, keeping the codes separate in a locked drawer and storing the anonymised data on the Research Data Store. There was a discussion on open theses – should students make their thesis/data openly available if only parts of it are published? There was also discussion around preservation vs sustainability of data. It seems preservation of code is becoming easier via use of Git but sustainability of data such as software is something researchers still struggle with and requires the use of virtual machines (see BEAR Cloud).
To conclude, it was a great turnout for our first Digital Research Conversation and we had some good feedback. We have had requests for the next Digital Research Conversation to be around the hot topic of ‘Data security’. We will be approaching speakers over the next few weeks and putting together a programme. Follow us on twitter to keep up to date and for booking details. We hope you can join us!
Further Information
University of Birmingham researchers can sign up to our Digital Research Conversations Canvas course to find slides and summaries of our series of DRC events: https://canvas.bham.ac.uk/enroll/Y7C7LY