As more researchers turn to the analysis of spontaneous language samples, methodological questions about corpus work come to the fore. In this paper we give an overview of the problems encountered in the collection, transcription and analysis of spontaneous language samples, and we illustrate how the CHILDES system addresses them.
The topics that we will address are:
In our discussion of these topics we will take CHILDES as a reference point, but we will also draw on the practical experience that research groups around the world have gained in collecting, transcribing and annotating (spoken) language corpora. The main theme will be: how can we find a balance between methodological soundness and practical feasibility?