In this blog post we hear from Salih, who is part of our Architecture, Infrastructure and Systems group, which looks after all the hardware in our data centres that provides BEAR services. Salih completed his MSc here at Birmingham.
Starting my career as a graduate system engineer in the Advanced Research Computing (ARC) group has been an enriching experience. The Architecture, Infrastructure and Systems (AIS) team, dedicated to supporting the infrastructure for BEAR services, offered me the opportunity to work with BlueBEAR, Baskerville, BEARCloud, and large-scale storage solutions. This role provided a unique platform to apply my academic knowledge to real-world scenarios and gain invaluable hands-on experience.
The first three months in this role were essentially a crash course in hardware management within a data centre environment. I was introduced to the intricacies of lifecycle management for devices such as nodes, switches, and PDUs, using the xCAT cluster management tool. Alongside hardware management, I delved into the complexities of networking, namely Ethernet Virtual Private Networks (EVPN) and InfiniBand networks, and how they are implemented to ensure resilience and efficiency. I also gained proficiency in Slurm, a powerful workload manager that is vital for managing and scheduling jobs on high-performance computing systems.
One of my significant contributions thus far is my work on streamlining, automating, and documenting the processes for hardware lifecycle management. This task involved creating comprehensive guides and Python scripts to simplify and unify these processes for current and future team members, reducing the potential for inconsistencies and errors.
My role also offered opportunities to engage with the broader computing community. The HPC Special Interest Group (HPC-SIG) conference in Cardiff and the Computing Insight UK (CIUK) event in Manchester were particularly enlightening experiences. These events allowed me to explore cutting-edge advancements in HPC and network with professionals from other universities and organisations.
The supportive environment in the AIS team has been instrumental in my successful adaptation to the role. The team is hard-working, yet always willing to offer assistance and answer any questions I have. This culture of collaboration has not only facilitated my learning but has also made my transition into the professional world smoother and more enjoyable.
I look forward to continuing my journey as a graduate system engineer here at ARC.
Salih MSA