Corpus tools and methods, today and tomorrow: Incorporating linguists' manual annotations

Smith, N.; Hoffmann, S.; Rayson, P.

Authors

N. Smith

S. Hoffmann

P. Rayson



Abstract

Today’s corpus tools offer the user a wide range of features that greatly facilitate
the linguistic analysis of large amounts of authentic language data (e.g. frequency
distributions, collocations, and keywords). However, these tools typically fail to
address the fundamental need of the linguist to add interpretive information to a
concordance or query result, by coding individual concordance lines for
structural, functional, discoursal, and other features in a flexible way. The ability
to add such qualitative data is indispensable to a fuller understanding of the
phenomenon under investigation, as it allows the linguist to produce more
rigorous descriptions of, and theories about, language in use.
Our article has two aims: first, to assess the merits and drawbacks of existing
solutions, by surveying what can be achieved using state-of-the-art corpus tools
and generic database software; second, to draw up a set of desiderata and
recommendations for the incorporation of flexible encoding features into future
corpus tools. We describe an initial step in this direction, with a recent enhancement
to the BNCweb corpus analysis software. More generally, we hope our
suggestions will lead to linguists and software developers working together more
closely to ensure that the needs of the former are provided for by the available
technology.

Citation

Smith, N., Hoffmann, S., & Rayson, P. (2007). Corpus tools and methods, today and tomorrow: Incorporating linguists' manual annotations. Literary and Linguistic Computing, 23(2), 163-180. https://doi.org/10.1093/llc/fqn004

Journal Article Type: Article
Publication Date: Jan 1, 2007
Deposit Date: Jan 17, 2012
Journal: Literary and Linguistic Computing
Print ISSN: 0268-1145
Publisher: Oxford University Press
Peer Reviewed: Peer Reviewed
Volume: 23
Issue: 2
Pages: 163-180
DOI: https://doi.org/10.1093/llc/fqn004
Publisher URL: http://dx.doi.org/10.1093/llc/fqn004

