The saving grace (well on Git)

In this case study, we hear from Dr Catherine Smith, a Research Fellow and Technical Officer in the Institute for Textual Scholarship and Electronic Editing (ITSEE), who has been making use of BEAR’s code management platform, BEAR GitLab, to enable her to support researchers using online editing tools to transcribe texts.

I started working in ITSEE about a week before the first commit was made in GitLab (https://about.gitlab.com/company/history/). GitHub already existed, but I’m not sure whether private repositories did. It was a time when lots of people were moving from SVN (which my team had used in my previous job) to Git. The existing technical staff in ITSEE were already using Git for all of their code development when I arrived. However, we ran our own remote Git installation on our server because it was the best way to have private repositories.

At that time, the repository was used exclusively by the two developers, and only for code. It worked fine for us but, obviously, we had the extra overhead of maintaining our own installation. Several years later, I hooked up some of our software to the system so that we could also get our XML data into version control. At that point, non-developer researchers in the team also needed to start working with Git. I taught them the basics on the command line and, as long as there were no conflicts to resolve, they got along pretty well. We were intentionally quite disciplined in the way we used the repositories to try to avoid conflicts.
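As a rough illustration of the kind of command-line basics described above, here is a minimal Python sketch of one save cycle for a transcription file. It is not ITSEE’s actual software or workflow: the command sequence, the helper names `build_git_steps` and `save_transcription`, and the example file path are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def build_git_steps(xml_file: str, message: str) -> list[list[str]]:
    """Return the git commands for one pull-add-commit-push cycle.

    Hypothetical helper: this sequence is an assumption, not the
    workflow the ITSEE team actually taught or automated.
    """
    if not message.strip():
        raise ValueError("commit message must not be empty")
    return [
        ["git", "pull", "--ff-only"],      # pick up colleagues' changes; fail rather than merge
        ["git", "add", xml_file],          # stage only the edited transcription
        ["git", "commit", "-m", message],  # record the change with a readable message
        ["git", "push"],                   # share the commit via the remote repository
    ]


def save_transcription(repo_dir: str, xml_file: str, message: str) -> None:
    """Run each step inside repo_dir, stopping at the first failure."""
    for step in build_git_steps(xml_file, message):
        subprocess.run(step, cwd=Path(repo_dir), check=True)
```

A call such as `save_transcription("/path/to/repo", "transcriptions/example.xml", "Correct line 4")` would run the four steps in order. Staging a single named file and refusing anything other than a fast-forward pull is one way to keep to the conflict-avoiding discipline mentioned above, on the assumption that each researcher edits their own files.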

So, we didn’t really have a research problem to solve, because we were using Git anyway, but we did have the overhead of maintaining our own remote repository and our own authentication for it. We also had no GUI for our remote repository, so the researchers just followed the instructions I had given them and didn’t really understand what was happening behind the scenes. We could, of course, have put all of the code and the data on GitHub, but some of the code wasn’t really ready to be made public at that time. And while the data was mostly already published elsewhere, I didn’t really want our software hooked up to an external host for the Git interaction.

An example of the transcription of a text, from a manuscript image through to XML code.

As soon as I heard about the trial of GitLab, I got an account and became a pilot user. Once it was confirmed as part of the BEAR suite of tools, we moved everything from our own remote repository into the BEAR GitLab instance. The only concern I had was whether our software would be able to authenticate for the connection it needed to the data repositories. That was quickly confirmed via an IT Services ticket, so the choice to move didn’t require any thought at all. It was the obvious thing to do.

We weren’t really using BEAR GitLab to solve a problem but to make our existing solution better. It gives us a better remote repository with a GUI, and we don’t have to maintain our own system, so it reduces the admin work for me. I think it also makes things more understandable and less intimidating for the researchers using GitLab for XML data. I was also able to show them in the GUI what happens to their commit messages, and the ones we get now are much better than they were when everything just disappeared into the remote repository, never to be seen again.

It is difficult to evaluate the impact on our research because, again, it is a better solution rather than a completely new one. I think the biggest difference is having the GUI. It makes Git more accessible to the non-developer researchers, so they are more willing to use it, and that makes sharing our data and working on it in parallel much more efficient and much safer.

Conclusion

Eventually, we will put all of our code onto GitHub or public GitLab, but it has been so useful to have a more private platform on which to develop code before it is ready for that, and to know that we have it in version control and backed up.

We were so pleased to hear how Cat is able to make use of what is on offer from Advanced Research Computing. If you have any examples of how a BEAR service has helped your research, then do get in contact with us at bearinfo@contacts.bham.ac.uk. We are always looking for good examples of the use of High Performance Computing to nominate for HPC Wire Awards – see our recent winners for more details.