Abstract
This paper presents work based on the new Birmingham Blog Corpus: a 600-million-word collection of blog posts and reader comments, available through the WebCorp Linguist's Search Engine interface. We begin by describing the steps involved in building the corpus, including a discussion of the sources chosen for blog data, the 'seeding' techniques used, and the design decisions taken. We then go on to focus on textual 'aboutness' (Phillips 1985). Whereas in previous work we examined social tagging sites as an aboutness indicator (Kehoe & Gee 2011), in this paper we analyse the reader comments found at the bottom of posts in our blog corpus. Our aim is to determine whether free-text comments offer different insights into the reader perspective on aboutness than those offered by social tags, and whether comments present further linguistic challenges. Online comments are often associated with blogs but are found increasingly on web documents of all kinds, and we also examine the growing importance of reader comments on online news articles.
| Original language | English |
|---|---|
| Journal | Studies in Variation, Contacts and Change in English: Aspects of Corpus Linguistics: Compilation, Annotation, Analysis |
| Volume | 12 |
| Publication status | Published (VoR) - 2012 |
| Title | Reader comments as an aboutness indicator in online texts: introducing the Birmingham Blog Corpus |