14th February 2018

ChiLit: the GLARE 19th Century Children’s Literature corpus in CLiC

Working in close cooperation with the CLiC team (our thanks go especially to Viola Wiegand, Anthony Hennessey and Jamie Lentin for their help with compiling, annotating and otherwise preparing the corpus), we made the new GLARE 19th Century Children’s Literature corpus (ChiLit) available in November 2017, and it is now fully accessible through the CLiC interface.

CLiC currently represents the state of the art in online interfaces for digital literary research. Not only can you search for individual words, word frequencies, clusters (repeated sequences of words) and keywords, but with the help of the ‘KWICGrouper’ you can also conduct sophisticated searches for various repeated patterns that are of interest for stylistic analysis.
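To make the terms ‘cluster’ and ‘concordance’ a little more concrete, here is a minimal sketch in plain Python. It is not how CLiC itself is implemented and it does not use the CLiC interface or API; the function names (`tokenize`, `clusters`, `kwic`) and the sample sentence are made up purely for illustration, and the tokenisation is deliberately naive.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (naive)."""
    return re.findall(r"[a-z']+", text.lower())


def clusters(tokens, n=2, min_count=2):
    """Return n-word sequences that occur at least min_count times."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {" ".join(g): c for g, c in grams.items() if c >= min_count}


def kwic(tokens, node, window=3):
    """Return (left context, node, right context) for each hit of node."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Invented sample text, not taken from the corpus.
sample = "The rabbit ran down the hole and the rabbit ran away."
tokens = tokenize(sample)

repeated = clusters(tokens, n=3)         # three-word clusters seen at least twice
hits = kwic(tokens, "rabbit", window=2)  # concordance lines for "rabbit"

print(repeated)
for left, node, right in hits:
    print(f"{left:>10} [{node}] {right}")
```

On a real text you would of course want proper tokenisation and a much larger context window; the point here is only that a cluster is a repeated word sequence and a concordance line is a search word shown with its left and right context.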

One of the major features that distinguishes fiction from other text types is the presence of speech (or, rather, its presentation in written form). ChiLit, like all the corpora in CLiC, is therefore annotated to distinguish between speech and non-speech. This means that you can, for instance, run concordances for linguistic features and patterns that occur in speech only (for the new features in CLiC, see Viola Wiegand’s blog post from November 8, 2017).

As Martin Wynne points out, CLiC is a big step forward in this respect “because it allows anyone who can get online to explore for themselves the text, word frequency lists, clusters of words, collocations, ‘suspensions’, reported speech, etc. And this can be done not only from a desktop computer, but from a mobile device as well” (Wynne 2018).

ChiLit and other CLiC corpora

In CLiC, ChiLit nicely complements the other 19th-century texts (‘DNov’ – Charles Dickens’s novels and ‘19C’ – a 19th Century Reference Corpus), which can conveniently be used as reference corpora for ChiLit. ChiLit contains 71 texts amounting to 4.5 million words (for more detail see my recent CLiC blog post). So what is in the corpus?

We aimed at a representative sample of the Golden Age of English children’s literature, i.e. fiction written for (or read by) children in the 19th century. The words ‘representative sample’ are a bit tricky here (even more so for a corpus linguist or a statistician!). In this context, ‘representative sample’ is meant to reflect that the selection of books for ChiLit was primarily guided by two works: Children’s Literature. An Anthology 1801–1902, compiled by Peter Hunt (2001), and Children’s Literature. An Illustrated History, edited by Peter Hunt (1995). Decisions as to what to include and what to leave out were not always straightforward, and selections are always subjective. Some of the texts we initially wanted to include were left out for technical reasons, and ultimately the final selection depended on the availability and quality of the texts on Project Gutenberg (for more detail on the selection see my CLiC blog post).

Author gender representation in GLARE

The GLARE project aims to explore the representation of gender in children’s literature. Hence, we aimed for a balanced representation of female and male authors while keeping other principles of selection in mind. We were able to achieve this to a degree. The number of books is similar, with 35 by female and 36 by male writers, but the total number of words is less balanced: texts by women writers account for 1.9 million words and texts by men writers for 2.5 million words. The representation is, however, unequal in terms of the overall selection of authors. While there are 71 texts, some authors wrote more than one of them. Overall, 38 different writers are included in ChiLit: 14 female and 24 male authors. The table below shows the complete contents of ChiLit.
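The balance described above can be expressed as simple percentages. The short sketch below uses only the figures given in this post; the variable and function names are made up for illustration. Note that the rounded word counts (1.9 and 2.5 million) add up to 4.4 rather than 4.5 million, so the word share is approximate.

```python
# Figures as reported in this post; word counts are rounded to 0.1 million.
counts = {
    "texts": {"female": 35, "male": 36},
    "words (millions)": {"female": 1.9, "male": 2.5},
    "authors": {"female": 14, "male": 24},
}


def female_share(pair):
    """Percentage of the total contributed by female authors."""
    total = pair["female"] + pair["male"]
    return 100 * pair["female"] / total


shares = {measure: round(female_share(pair), 1) for measure, pair in counts.items()}

for measure, share in shares.items():
    print(f"{measure}: {share}% by female authors")
```

By this rough calculation, female authors account for about 49% of the texts, about 43% of the words and about 37% of the authors.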

If you are interested in learning more about our research with ChiLit, or in opportunities for your own work with children’s literature, do not hesitate to get in touch.

You can also follow us on Twitter @GlareProject


Hunt, P. (2001). Children’s Literature. An Anthology 1801–1902. Oxford: Blackwell.

Hunt, P. (Ed.) (1995). Children’s Literature. An Illustrated History. Oxford: Oxford University Press.

Wynne, M. (2018). Dickens and the History of Literary and Linguistic Computing – a (very) short retrospective [Blog post 6.2.2018].

| Author | Title | Published |
|---|---|---|
| Anstey, F. | The Brass Bottle | 1900 |
| Anstey, F. | Vice Versa or A Lesson to Fathers | 1882 |
| Ballantyne, R. M. | The Coral Island. A Tale of the Pacific Ocean | 1858 |
| Barrie, J. M. | Peter and Wendy (Peter Pan) | 1911 |
| Burnett, F. H. | The Secret Garden | 1911 |
| Carroll, L. | Alice’s Adventures in Wonderland | 1865 |
| Carroll, L. | Through the Looking-Glass | 1871 |
| Crockett, S. R. | The Surprising Adventures of Sir Toady Lion With Those of General Napoleon Smith | 1897 |
| De La Mare, W. | The Three Mulla-mulgars | 1910 |
| Ewing, J. H. | Jackanapes | 1883 |
| Ewing, J. H. | Mrs. Overtheway’s Remembrances | 1869 |
| Falkner, J. M. | Moonfleet | 1898 |
| Farrar, F. W. | Eric, Or, Little by Little, A Tale of Roslyn School | 1858 |
| Farrow, G. E. | Adventures in Wallypug-Land | 1898 |
| Grahame, K. | Dream Days | 1898 |
| Grahame, K. | The Golden Age | 1895 |
| Grahame, K. | The Wind in the Willows | 1908 |
| Haggard, H. R. | Allan Quatermain | 1887 |
| Haggard, H. R. | King Solomon’s Mines | 1885 |
| Henty, G. A. | Winning His Spurs. A Tale of the Crusades | 1882 |
| Henty, G. A. | With Clive in India. Or, The Beginnings of an Empire | 1884 |
| Hughes, T. | Tom Brown’s Schooldays (By An Old Boy) | 1857 |
| Ingelow, J. | Mopsa the Fairy | 1869 |
| Jefferies, R. | Wood Magic. A Fable | 1881 |
| Kingsley, C. | Madam How and Lady Why. Or, First Lessons in Earth Lore for Children | 1870 |
| Kingsley, C. | The Water-Babies | 1863 |
| Kipling, R. | Stalky & Co. | 1899 |
| Kipling, R. | The Jungle Book | 1894 |
| Lang, A. | Prince Prigio. From “His Own Fairy Book” | 1889 |
| MacDonald, G. | At the Back of the North Wind | 1871 |
| MacDonald, G. | The Princess and the Goblin | 1872 |
| Marryat, F. | Masterman Ready. The Wreck of the “Pacific” | 1841 |
| Marryat, F. | The Children of the New Forest | 1847 |
| Marryat, F. | The Settlers in Canada | 1844 |
| Martineau, H. | Feats on the Fiord | 1841 |
| Martineau, H. | The Crofton Boys | 1841 |
| Martineau, H. | The Peasant and the Prince | 1841 |
| Martineau, H. | The Settlers at Home | 1841 |
| Meade, L. T. | A World of Girls: The Story of a School | 1886 |
| Mrs. Molesworth | The Carved Lions | 1895 |
| Mrs. Molesworth | The Cuckoo Clock | 1877 |
| Mrs. Molesworth | The Tapestry Room: A Child’s Romance | 1879 |
| Nesbit, E. | Five Children and It | 1906 |
| Nesbit, E. | Nine Unlikely Tales | 1901 |
| Nesbit, E. | The Book of Dragons | 1899 |
| Nesbit, E. | The Railway Children | 1905 |
| Nesbit, E. | The Story of the Amulet | 1906 |
| Nesbit, E. | The Story of the Treasure Seekers | 1899 |
| Potter, B. | The Tale Of Benjamin Bunny | 1904 |
| Potter, B. | The Tale of Jemima Puddle-Duck | 1908 |
| Potter, B. | The Tale of Peter Rabbit | 1902 |
| Potter, B. | The Tale of Squirrel Nutkin | 1903 |
| Potter, B. | The Tale of the Flopsy Bunnies | 1909 |
| Potter, B. | The Tale of Two Bad Mice | 1904 |
| Reed, T. B. | The Fifth Form at Saint Dominic’s: A School Story | 1887 |
| Ruskin, J. | The King of the Golden River; or the Black Brothers: A Legend of Stiria | 1841 |
| Sewell, A. | Black Beauty. The Autobiography of a Horse | 1877 |
| Sinclair, C. | Holiday House: A Series of Tales | 1839 |
| Stevenson, R. L. | Kidnapped | 1886 |
| Stevenson, R. L. | Treasure Island | 1883 |
| Stretton, H. | Alone In London | 1869 |
| Stretton, H. | Jessica’s First Prayer — Jessica’s Mother | 1867 |
| Stretton, H. | Little Meg’s Children | 1868 |
| Strickland, A. | The Rival Crusoes; Or, The Ship Wreck | 1826 |
| Thackeray, W. M. | The Rose and the Ring | 1854 |
| Tytler, A. F. | Leila at Home. A continuation of Leila in England | 1870 |
| Wilde, O. | The Happy Prince, and Other Tales | 1888 |
| Yonge, C. M. | The Daisy Chain, or Aspirations | 1856 |
| Yonge, C. M. | The Dove in the Eagle’s Nest | 1866 |
| Yonge, C. M. | The Heir of Redclyffe | 1853 |
| Yonge, C. M. | The Little Duke: Richard the Fearless | 1854 |


Please cite this blog as follows: Čermáková, A. (2018, 14 February). ChiLit: the GLARE 19th Century Children’s Literature Corpus in CLiC [Blog post]. Retrieved from: https://blog.bham.ac.uk/glareproject/2018/02/14/chilit-the-glare-19th-century-childrens-literature-corpus-in-clic/.
