Dr Rosalind White takes you through a quick-start guide exploring some of CLiC’s features. If you would prefer video instructions, these are available in a Twitter thread. You can also find further guidance on the help tab of the CLiC Web App.
The CLiC Web App (Mahlberg et al. 2020) was designed specifically for the analysis of literary texts and developed as part of the CLiC Dickens project. You can view a full list of the texts available here. For more detailed instructions please refer to the CLiC user guide.
A concordance is a list of examples of a word as it occurs in a given corpus or corpora. Each example is presented on its own line so that you can view the word in the context in which it occurs in the text.
- Select a corpus or multiple corpora (in the digital humanities a corpus or corpora refers to a text or a group of texts).
- Select your subset (whether you want to search through ‘all text’ – the whole book(s) – or just one of the subsets: ‘short suspensions’, ‘long suspensions’, ‘quotes’ and ‘non-quotes’).
- Type in your search term and hit enter. * can be used as a wildcard (so candle* would also find candles or candlestick) as well as a placeholder between words (so with * hands would find with her hands, with his hands, with clean hands, etc.).
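The search steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not CLiC's own implementation: the function names `tokenize` and `concordance`, the sample sentence and the simple letters-only tokenizer are all assumptions made for this example.

```python
import re

def tokenize(text):
    # Assumption: lowercase word tokens, apostrophes kept inside words.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def concordance(text, query, width=4):
    """Return (left, node, right) tuples for every match of `query`.

    Inside a word, * matches any run of letters (candle* finds candles);
    on its own, * stands for exactly one whole word (with * hands).
    """
    tokens = tokenize(text)
    patterns = [
        re.compile(re.escape(word).replace(r"\*", "[a-z']*"))
        for word in query.lower().split()
    ]
    n = len(patterns)
    lines = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(p.fullmatch(t) for p, t in zip(patterns, window)):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + n:i + n + width])
            lines.append((left, " ".join(window), right))
    return lines

# Hypothetical sample text, not taken from any CLiC corpus.
SAMPLE = ("She lit the candle and set the candlestick down with her hands. "
          "He came back with clean hands.")

for left, node, right in concordance(SAMPLE, "with * hands"):
    print(f"{left:>30} | {node} | {right}")
```

Each returned tuple corresponds to one concordance line: the context to the left, the matched node, and the context to the right.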
Lines can be sorted alphabetically by any of the columns in the concordance by clicking on the header, which will then be marked with dark arrows. For example, by clicking on ‘Left’ the lines will be sorted by the first word to the left of the node and by clicking on ‘Right’ by the first word on the right.
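To make the sorting behaviour concrete, here is a minimal Python sketch. The concordance lines are invented for illustration, and breaking ties by the next word further out is an assumption of this sketch rather than documented CLiC behaviour.

```python
# Hypothetical concordance lines as (left context, node, right context).
SORT_LINES = [
    ("he rubbed his", "hands", "together and smiled"),
    ("she wrung her", "hands", "in despair"),
    ("with clean", "hands", "and a clear conscience"),
]

def sort_by_left(lines):
    # Compare the word nearest the node first (L1), then L2, and so on.
    return sorted(lines, key=lambda line: line[0].lower().split()[::-1])

def sort_by_right(lines):
    # Compare the first word after the node (R1), then R2, and so on.
    return sorted(lines, key=lambda line: line[2].lower().split())

for left, node, right in sort_by_left(SORT_LINES):
    print(f"{left:>15} | {node} | {right}")
```

Sorting on the left reads the context outwards from the node, which is why the left-hand words are reversed before comparison.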
The filter option lets you filter the concordance output further. For example, searching for hands in Oliver Twist yields 124 results; but when we use the option ‘filter rows’ and search for pockets this is filtered down to 8 results.
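The effect of ‘filter rows’ can be approximated as a second pass over the lines already retrieved. The sketch below assumes a case-insensitive substring match over the whole line, and the sample lines are invented; how CLiC matches the filter term internally is not described here.

```python
def filter_rows(lines, term):
    """Keep only the concordance lines whose text contains `term`."""
    needle = term.lower()
    return [line for line in lines if needle in " ".join(line).lower()]

# Hypothetical lines from a search for "hands".
FILTER_LINES = [
    ("he thrust his", "hands", "into his pockets"),
    ("she clasped her", "hands", "together tightly"),
    ("with his", "hands", "in his breeches pockets"),
]

print(len(FILTER_LINES), "lines before filtering")
print(len(filter_rows(FILTER_LINES, "pockets")), "lines after filtering")
```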
After you’ve run a basic concordance you can also view your results as a ‘Distribution plot’ if you select ‘view as distribution plot’. See, for example, the distribution of “marriage” across three of Jane Austen’s works.
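A distribution plot marks where in a text each hit falls. As a rough text-only sketch of that idea (the function names, the strip width and the ten-token sample are assumptions for illustration, not CLiC's rendering):

```python
def match_positions(tokens, word):
    # Index of every token equal to `word`.
    return [i for i, token in enumerate(tokens) if token == word]

def distribution_strip(tokens, word, width=20):
    """Draw the text as `width` cells and mark each cell containing a hit."""
    strip = ["."] * width
    for i in match_positions(tokens, word):
        # Integer arithmetic maps a token index onto a cell of the strip.
        strip[i * width // len(tokens)] = "|"
    return "".join(strip)

# Hypothetical ten-token "book".
DIST_TOKENS = "a b marriage c d e f marriage g h".split()
print(distribution_strip(DIST_TOKENS, "marriage", width=10))
```

Running the same function over several books, one strip per book, gives a crude counterpart to comparing a word's distribution across texts.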
KWICGrouper and Tagging
The KWICGrouper is a tool that allows you to quickly group the concordance lines according to patterns that you find as you go through the concordance. Any matching lines will be highlighted and moved to the top of the screen. By dragging the slider, you can adjust the number of words that will be searched to the left and right of the search term (R1 refers to a word to the immediate right, whereas L1 refers to a word to the immediate left).
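The grouping logic can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `kwic_group`, the sample lines and the choice to rank lines by the number of distinct matching words are all made up for this example and may differ from what CLiC does.

```python
def kwic_group(lines, terms, left_span=3, right_span=3):
    """Move lines that contain any of `terms` near the node to the top.

    left_span and right_span play the role of the slider: how many words
    to the left (L1..Ln) and right (R1..Rn) of the node are searched.
    """
    wanted = {t.lower() for t in terms}

    def hits(line):
        left, _node, right = line
        left_words = left.lower().split()
        right_words = right.lower().split()
        # Guard against [-0:], which would select the whole list.
        window = (left_words[-left_span:] if left_span else []) + right_words[:right_span]
        return len(wanted.intersection(window))

    # sorted() is stable, so lines with equal scores keep their order.
    return sorted(lines, key=lambda line: -hits(line))

# Hypothetical concordance lines for the node "hands".
KWIC_LINES = [
    ("he put his", "hands", "in his pockets"),
    ("she clasped her", "hands", "together tightly"),
    ("rubbing his", "hands", "and warming his face"),
]

for line in kwic_group(KWIC_LINES, ["pockets", "together"], right_span=1):
    print(" ".join(line))
```

Narrowing `right_span` to 1 means only R1 is searched, so the line with "together" directly after the node rises to the top while "pockets" at R3 no longer counts.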
Once you have identified lines with patterns of interest, you might want to place these into one or more categories. CLiC provides a flexible tagging system for this. The tags are user-defined so you can create tags that are relevant to your project.
The Keywords tool finds words (and phrases) that are used significantly more often in one corpus compared to another. Apart from comparing single words, CLiC also allows you to compare clusters (multiple words).
- ‘Target corpora’: Choose the corpus/corpora that you are interested in.
- ‘Within subset’: Specify which subset of the target corpus you want to compare (or simply choose ‘all text’).
- ‘Reference corpora’: Choose the reference corpus to compare your target corpus to (we have built a nineteenth-century reference corpus specifically to compare against Dickens’ works).
- ‘Within subset’: Specify the subset for the reference corpus.
- ‘n-gram’: Do you want to compare single words (1-grams) or phrases (2-grams up to 7-grams)?
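As an illustration of what a target-versus-reference comparison involves, the sketch below scores words with the log-likelihood statistic, a measure commonly used for keyness in corpus linguistics. Whether CLiC uses exactly this formula is not stated in this guide, so treat the statistic, the function names and the toy corpora as assumptions.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Keyness of a word seen a times in a target of c tokens and
    b times in a reference of d tokens."""
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    total = 0.0
    if a:
        total += a * math.log(a / expected_target)
    if b:
        total += b * math.log(b / expected_reference)
    return 2 * total

def keywords(target_tokens, reference_tokens, top=10):
    """Words relatively more frequent in the target, highest score first.

    Both token lists are assumed to be non-empty.
    """
    target, reference = Counter(target_tokens), Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    scored = [
        (log_likelihood(a, reference[word], c, d), word)
        for word, a in target.items()
        if a / c > reference[word] / d  # keep only words overused in the target
    ]
    scored.sort(reverse=True)
    return [(word, round(score, 2)) for score, word in scored[:top]]

# Hypothetical toy corpora.
TARGET = "fog fog fog the the river".split()
REFERENCE = "the the the sun sun river".split()
print(keywords(TARGET, REFERENCE))
```

To compare clusters instead of single words, the same scoring could be applied to lists of n-grams rather than lists of tokens.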
The Counts tab lists information about individual books and all books in a corpus.
Clusters are also called ‘n-grams’, where ‘n’ stands for the length of the phrase. If we choose a ‘1-gram’ (single word), we retrieve a simple word list. (In Oliver Twist, for example, the most frequent words retrieved from this tool include the, and, to, of, a, he, in, his, that – all function words, as we would generally expect.) From version 2.0 onwards, CLiC supports clusters of length 1 (single words) up to 7 (i am very much obliged to you).
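Counting clusters amounts to sliding a window of n words over the text. A minimal Python sketch, assuming a pre-tokenized list of lowercase words (the function name `clusters` and the sample tokens are illustrative only):

```python
from collections import Counter

def clusters(tokens, n):
    """Count every run of n consecutive tokens (an n-gram)."""
    if not 1 <= n <= 7:
        raise ValueError("this sketch mirrors the 1 to 7 range described above")
    # zip over n staggered copies of the token list yields each window.
    windows = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)

# Hypothetical token sequence.
CLUSTER_TOKENS = "i am very much obliged to you i am very glad".split()

print(clusters(CLUSTER_TOKENS, 1).most_common(3))
print(clusters(CLUSTER_TOKENS, 3).most_common(1))
```

With n set to 1 the result is a plain word-frequency list; larger values of n return repeated phrases.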
The Texts tab shows the full text of individual books and allows you to navigate to particular chapters. Selections of the text can be easily copied and pasted into other applications (e.g. Microsoft Word documents). You can select the levels of annotation you want to display, such as “Sentences” and the “Quote” and “Non-quote” subsets, etc.