What is the #RC21 project all about?

Organising and interpreting concordance lines via a Key Word in Context (KWIC) display format is probably the most central technique in corpus linguistics. This kind of vertical ‘reading’ helps analysts spot patterns in large samples of language, adding a qualitative dimension to studies of frequency, collocation, and keyness.

Sample concordance for ‘in’ on pink background.

Project background

The use of computers to find evidence of how language is used has been popular practice in lexicography since the 1980s. John Sinclair and his team revolutionised dictionary writing by using electronic corpora to select examples of everyday language use for inclusion in the first COBUILD English language dictionary. Now in the 21st century, concordance reading is characterised by its versatility. The technique is applied in all sorts of disciplines to investigate a huge variety of problems—social, linguistic, and stylistic. As a result, the number of available corpus query tools with functions that support concordance reading has grown substantially. Over 100 concordancing tools exist, including some that support the analysis of concordances for specific purposes such as language learning and teaching and the study of literary texts.

With so many applications and tools, approaches to concordance analysis must vary widely, posing challenges to reproduction and replication. At the same time, such extensive use of concordance analysis must lead to patterns of practice that can provide insights into underlying strategies – even if these strategies are not always explicitly spelled out by the users. Such shared underlying strategies are what we, the Reading Concordances in the 21st Century (#RC21) project members, are interested in.

RC21 team members: Michaela Mahlberg (UoB), Stephanie Evert (FAU), Natalie Finlayson (UoB), and Alexander Piperski (FAU).

Our collaboration between the University of Birmingham and Friedrich-Alexander-Universität Erlangen-Nürnberg combines expertise in corpus linguistics theory, computational algorithms, and applications of corpora in lexicography, language education, discourse analysis, and literary stylistics.

Our aims

To guide us in exploring commonalities in concordance reading practices across our disciplines of interest, we ask: What do users do to organise concordance lines for interpretation? And what drives the choices users make, beyond their intuition and the features of concordancing tools they are familiar with? This should increase our awareness of how computational algorithms support analysts in different parts of the concordance reading process, independently of the features of any specific tool. On this basis, new algorithms and display formats can be developed that are driven by the needs of corpus analysts rather than technological considerations.

The tasks we have set ourselves include:

1. Developing a consistent methodology of concordance reading strategies and suggesting a transparent terminology that describes tool-independent principles of reading concordances. We will do this by reviewing examples of current practice described in corpus linguistics textbooks, research studies and the user manuals for concordancing tools.

2. Developing a concordance management tool and new algorithms that support a consistent concordance reading methodology. We will achieve this by identifying popular algorithms implemented in existing tools, and integrating them and developing them further in the FlexiConc tool.

3. Conducting case studies to assess the effectiveness of the outputs of (1) and (2). To ensure that our findings apply to a broad range of purposes, we will use our approach to analyse two very different types of text (historical novels and tweets) for very different purposes (studying linguistic aspects of literary style and political argumentation mining) in two languages (English and German).

4. Organising outreach activities such as workshops and this project blog.

We believe strongly in the power of describing and explaining patterns in language for addressing questions in linguistics and beyond. We are basing our work in four disciplines with a lot of momentum in concordancing work: lexicography, language learning, discourse analysis, and literary studies. These disciplines align with our personal interests, and with the interests of members of our Advisory Board of international experts in corpus and digital research. Stay tuned to find out more about them in future posts!

Please cite this post as follows: Finlayson, N., Evert, S., Mahlberg, M., & Piperski, A. (2023). What is the #RC21 project all about? [Blog post]. Reading Concordances in the 21st Century Blog, University of Birmingham. Retrieved from [https://blog.bham.ac.uk/rc21/2023/10/30/what-is-the-rc21-project-all-about/]

Project background

Our aims

Author: RC21 Team

1 thought on “What is the #RC21 project all about?”

Leave a Reply Cancel reply