Corpus Pragmatics https://doi.org/10.1007/s41701-018-0043-1 BOOK REVIEW
Review of Discourse Markers and (Dis)fluency: Forms and Functions Across Language and Registers
Meaghan Blanchard¹
Received: 18 June 2018 / Accepted: 20 June 2018
© Springer International Publishing AG, part of Springer Nature 2018
As research in the fields of both discourse markers and fluency continues to expand, incorporating novel uses of corpus linguistics, cross-linguistic studies and discourse phenomena, Ludivine Crible integrates these domains in a study that demonstrates the scalar relationship between fluency, discourse markers (DMs), and their nuances. Discourse Markers and (Dis)fluency: Forms and Functions across Languages and Registers contributes greatly to this discussion by providing an in-depth, multilingual analysis of DMs in varying contexts.
In Chapter 1, Crible first establishes her definition of spoken fluency with a space–time metaphor. Spoken fluency is “resolutely multi-dimensional, combining spatial-like moves with temporal constraints” (1). This highlights the importance of temporality, which accounts of written fluency often disregard, and the fact that spoken dialogue happens in real time, often in spontaneous and unplanned settings. Equating fluency with minimal flaws and maximum fluidity is not applicable to spoken fluency, as the cognitive processes of language production and perception happen in real time and real situations. One sign of these cognitive processes in speech production is the presence of (dis)fluencies (e.g. DMs, truncations, repetitions). (Dis)fluencies are not necessarily a sign of disfluency. Rather, they are often an essential part of spoken dialogue, and can be used as tools to create coherent and efficient discourse in real time. Crible contributes to growing research that denies that (dis)fluencies are exclusively a sign of error. Instead, they are often considered communicative in their own right. One of Crible’s over-arching goals in this monograph is to create a tentative scale on which to rate DM functions, ranging from fluent to disfluent. Or, more realistically, “to use functional and positional features of discourse markers to interpret the relative frequency of the clusters they occur in” (3).
In creating a scale, she wishes to “uncover the strategic uses of (dis)fluencies in relation to discourse structure, with a focus on discourse markers as they give a more quantifiable insight into the intentions and purpose of certain (dis)fluencies” (4).
* Meaghan Blanchard, [email protected]
¹ KU Leuven, Warmoesberg 26, 1000 Brussels, Belgium
M. Blanchard
Chapter 2 contains a literature review that evaluates key definitions from fluency and disfluency studies. For fluency, Crible integrates Levelt’s covert/overt repair distinction from his model of repair, holistic notions of flow and efficiency (Levelt 1989; Lennon 2000), and quantitative measures of evaluation from componential approaches (Shriberg 1994; Besser and Alexandersson 2007; Segalowitz 2010). Crible opts to use the term fluenceme in lieu of the more pejorative term (dis)fluency (as this review will now do). Fluencemes, a term borrowed from Götz (2013), are “discrete devices which function as signals of cognitive processes of speech production and perception”, potentially demonstrating fluency or disfluency based on certain contextual clues (22). They include phenomena such as unfilled pauses, filled pauses, DMs and truncations. The function of fluencemes in spoken discourse is therefore ambivalent: they sometimes act as a symptom of processing error and sometimes as a signal of production. It is important to note that, in a departure from Götz (2013), Crible argues that DMs are a subset of fluencemes rather than a separate entity. Fluency, to Crible, is not marked by an apparent lack of fluencemes, but by efficient and effective uses of those fluencemes to aid discourse. Disfluency, on the other hand, is marked by fluencemes that signal production errors and create disruptions in syntax or prosody. Crible also integrates notions of “usage-based linguistics”, emphasising the importance of frequency in cognitive entrenchment; of accounting for variation and change; and of context in language processing (23). A portion of this perhaps takes inspiration from Aijmer (2013), who emphasises the influence of register on DM usage.
Crible also defines an important hypothesis, the fluency-as-frequency hypothesis, which holds that the more frequently a certain sequence manifests, the more likely it is a signal of fluency, as the cognitive effort needed to process it is low due to cognitive entrenchment. In other words, the more often a certain sequence is used in a certain setting, the more likely it is to be considered fluent, and vice versa.
Chapter 3 narrows the study’s focus by creating a comprehensive definition of DMs and establishing why DMs are relevant to fluency studies. Crible defines a DM as a “grammatically heterogeneous, syntactically optional, polyfunctional type of pragmatic marker. Their specificity is to function on a metadiscursive level as procedural cues to constrain the interpretation of the host unit in a co-built representation of ongoing discourse. They do so by either signaling a discourse relation between the host unit and its context; making the structural sequencing of discourse segments explicit; expressing the speaker’s meta-comment on their phrasing; or contributing to the speaker–hearer relationship” (35). Her definition takes inspiration from Schiffrin’s (1987) and Hansen’s (2006, 2008) functional approach to DMs, along with many others. Crible notes that she takes a bottom-up approach, labeling anything that fits this description as a DM, rather than having a determined list before analysing data.
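As a rough illustration of the fluency-as-frequency hypothesis introduced in Chapter 2, the relative frequency of each fluenceme sequence can be computed and the sequences ranked from most to least frequent. This is a minimal sketch under stated assumptions, not Crible's procedure: the function names, the label strings and the toy data are all hypothetical and invented for this example.

```python
from collections import Counter


def sequence_frequencies(sequences):
    """Return each sequence type's share of all observed sequences."""
    counts = Counter(tuple(seq) for seq in sequences)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def rank_by_frequency(sequences):
    """Order sequence types from most to least frequent (ties broken alphabetically)."""
    freqs = sequence_frequencies(sequences)
    return sorted(freqs, key=lambda seq: (-freqs[seq], seq))


# Hypothetical toy data: each tuple is one annotated fluenceme sequence.
toy_sequences = [
    ("DM",),
    ("DM",),
    ("DM",),
    ("DM", "unfilled_pause"),
    ("DM", "unfilled_pause"),
    ("DM", "truncation", "repetition"),
]

if __name__ == "__main__":
    for seq in rank_by_frequency(toy_sequences):
        share = sequence_frequencies(toy_sequences)[seq]
        print(" + ".join(seq), round(share, 2))
```

Under the hypothesis, sequences near the top of such a ranking would be candidates for fluent use and those near the bottom for disfluent use. Chapter 6 of the monograph tests this prediction against corpus data.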
Review of Discourse Markers and (Dis)fluency: Forms…
One of Crible’s greatest achievements in this work is her comprehensive taxonomy used to describe DM function, which yielded impressively consistent results among various blind reviewers. The fact that it could be readily applied to two languages attests to its significance. For a fully comprehensive look at her taxonomy, Crible (2017) can certainly be recommended. DMs, in this taxonomy, are first categorized by domain (sequential, rhetorical, interpersonal, ideational) and then further subdivided by function. In the field of (dis)fluency studies, Crible argues that DMs are relevant in their reflection of certain cognitive processes of production and comprehension. DMs “segment spoken discourse,” and allow “both speakers and hearers to backtrack or project their attention along a string of words, in a relative freedom without which communication cannot seem to be performed efficiently” (48). Crible follows Götz’s (2013) example by forgoing past scholarly tendencies to exclude DMs from fluency research (due to polyfunctionality, lack of methodological validity, etc.). Crible takes a quantitative approach to her corpus, using statistical models such as linear regression and multifactorial models to see tendencies in how DMs present themselves both in clusters (with other fluencemes) and in various positions, registers, and syntactic categories. She argues that understanding how specific DMs manifest in specific situations will yield insight into fluent and disfluent uses of DMs. Based on past studies, Crible introduces a predefined subset of Potentially Disfluent Functions (PDFs) of DMs. She hypothesizes that DMs used in monitoring, punctuation and reformulation will most likely be associated with disfluent usage, due to their interruptive nature. She later uses corpus data to test the validity of this prediction.
Chapter 4 deals with Crible’s corpus and methodology.
An impressive feat of Crible’s work is DisFrEn, an extensive multilingual corpus annotated for DMs. It contains conversations, interviews, radio interviews, phone conversations, etc. in both English and French. For English she uses the ICE-GB corpus and the Backbone project. For French she uses the VALIBEL database, CLAPI, C-PhonoGenre, LOCAS-F, the French Corpus of Humorist Speech, and Rhapsodie. English and French are equally represented in terms of register, for which various factors are taken into account such as elicitation (natural vs. semi-structured speech), number of speakers taking part in the interaction, degree of preparation, and interactivity (symmetrical, semi-interactive, non-interactive). Crible annotates DMs based on function, position, part of speech (POS) tags and contextual features. The positioning system is threefold, involving micro-syntactic units, macro-syntactic units, and turns-of-speech. Annotation is extended by marking any other fluenceme that occurs in the same sequence as a DM. All in all, Crible provides a rich and extensive annotation scheme for fluencemes, which produced impressively (though not completely) consistent results among various blind reviewers.
Chapter 5 gives a qualitative view of DM patterns. No major distinctions emerge between English and French (apart from a higher raw frequency of DMs in French); the most prevalent DMs in both languages seem to be
semantic-pragmatic equivalents, following the exact same ranking in usage and frequency. Regarding position and part of speech, coordinating conjunctions tend to occur in pre-field position; subordinating conjunctions in both left and right integrated position; adverbs in middle field; and interjections as independent units. Against her hypothesis, sequence variation occurs more frequently in the post-field position than in initial position. Crible finds that DM usage variance is highly dependent on register. The sequential domain is the most common in all registers, whereas interpersonal domains are the least, manifesting most commonly in spontaneous speech and very rarely in planned speech. For domain-specific part-of-speech tags statistical tests suggest that adverbs are the most polyfunctional type of DM, showing up in substantial numbers across all domains. This contradicts past studies which have commonly considered coordinating conjunctions as the most polyfunctional. In terms of function, the most common for both English and French was addition. After integrating all of these elements, Crible creates a predictive chart that correlates certain DM functions with certain syntactic positions and POS tags. She establishes a number of form-function patterns through her mapping, enabling functions of language to be associated with DMs in specific slots of speech where they are most typical. Thus, she concludes that the polyfunctionality of DMs is not random, but is formally grounded, and therefore, relatively predictable. Crible also validates her hypothesis that DMs tend to occur in clusters more than as individual units. She argues that DMs that co-occur often do so in a recurrent manner, meaning that they are cognitively meaningful and aid in communication. These tend to occur more in turn-initial positions, characteristic of their discourse-structuring potential. 
DM clustering is much less common in independent and medial micro-positions, which are usually more disruptive instances of DM usage; this further suggests that DM clustering serves cognitive purposes. In sum, Crible creates statistical models that show the relative frequency of DMs based on their register, POS tag, syntactic position, and clustering habits. In turn, her models reveal that certain combinations of these factors vary in an often statistically significant manner.
Chapter 6 narrows focus by analysing disfluency in radio and face-to-face interviews. Rather than annotating only fluencemes that cluster with DMs, Crible annotates all fluencemes. This may explain the choice to analyse a sub-corpus rather than the entirety of DisFrEn. The major conclusions are that fluencemes (e.g. DMs, unfilled pauses, filled pauses) are prevalent in conversations, but more so in face-to-face interviews than in radio interviews. This could be due to the degree of preparation these registers require, topic familiarity, or the degree of professionalism of certain speakers. The most common fluencemes in these clusters were unfilled pauses, demonstrating their marked functional ambivalence. In addition, registers tend to use certain DMs (e.g. you know in conversations) more than others, further supporting Crible’s theory that DM usage is purposeful, often aiding communication rather than inhibiting it. Crible goes on to test her fluency-as-frequency hypothesis. Surprisingly, rare fluenceme sequences do not correlate with disfluency as predicted. Rather, the most disfluent sequences occur in medium-sized, mixed sequences with isolated fluenceme usage. Complex sequences do not correspond
with what may intuitively be considered major disruptions in an utterance. In sum, the rarest fluenceme sequences are often not the most disfluent. Fluencemes that correlate with more interruptive or disfluent sequences were usually simple fluencemes in medium-sized sequences. Compound fluencemes also tend to be neither uncommon nor disruptive. Contrary to DMs’ tendency to cluster with other forms of fluencemes, most fluencemes occur as isolated units rather than in clusters. Crible shows that there is a complex interplay of sequence length, fluenceme type and frequency as indicators of fluency. However, further study is needed to verify how, and to what degree, they affect it.
In Chapter 7, Crible attempts to “rank different functional uses of DMs on a register-sensitive scale of (dis)fluency,” drawing on the statistical analyses of the previous chapters (149). For register, she found that more formal (pre-planned) registers yielded a lower rate of DM sequence variation than informal registers where conversation is more spontaneous, showing the benefit of DMs as a communicative device in real-time communication. In terms of register and sequence, DMs as standalone occurrences and DMs paired with unfilled pauses are the most prevalent sequences across all registers. For fluency in the functional domain, Crible creates a table that associates certain domains with different degrees of fluency. She evaluates these degrees using corpus frequency, formal features, and qualitative interpretation of data as parameters. The sequential domain is considered the most fluent, followed by ideational, rhetorical and interpersonal respectively. This is based on the attraction of sequential DMs to pauses, the absence of ideational DMs from interruptions and mixed sequences, the attraction of rhetorical DMs to mixed sequences, and that of interpersonal DMs to interruptions.
The more disruptive the DM is considered to be, the more likely it is to be considered less fluent, or less helpful in aiding cognition. When analysing register and functional domains, Crible creates an incredibly interesting two-dimensional, predictive scale of fluency. Unfortunately, when syntactic position is incorporated, the scale no longer holds, as there were no tendencies for DMs in certain domains to cluster to certain syntactic positions. For Potentially Disfluent Functions (monitoring, punctuation and reformulation), Crible uses association plots to analyse which registers and sequence types they are most attracted to. For register, her PDF proposal seems to hold, as these functions occur more frequently in spontaneous settings, where dialogues are more prone to error. In terms of sequence type, PDFs tend to cluster with other fluencemes. Sequences of interruption, mixed fluencemes and substitution tend to increase the chance of a PDF compared to an isolated DM, and the longer the sequence, the higher the probability of a PDF. PDFs particularly manifest in mid-positions, which correlates with more disfluent, interruptive interpretations. Chapter 8 strays away from the initial seven chapters in its more qualitative approach. As Crible describes it, the first seven chapters follow a corpus annotation model, whereas her final chapter follows a model of discourse analysis. In this chapter, she relates DM usage to different types of repair in both English and French. A large portion of her analysis is inspired by Levelt’s (1989) typology of repair, especially his distinction between overt/covert repair. In turn, she conducts another small literature review on DM distribution across repair types
with a focus on position and function. She also analyses whether DMs and modified repetitions are redundant and whether there are nuanced differences between English and French. Crible analyses face-to-face interviews, labelling repairs in accordance with Levelt’s (1989) eight categories of repair. She finds that there are no major differences between English and French. The most frequent types of repair in both languages are linked to issues of structuring. DMs are relatively absent in overt repairs, but prevalent in covert repairs, meaning that they are not necessarily a sign of a production error, but rather a signal that the speaker is searching for a solution to a linguistic problem. In sum, DMs often appear “to be used strategically to maintain the illusion of fluency” (202). The analysis confirms that PDFs tend to occur alongside disfluencies. In addition, DMs located in various positions of repairs (i.e. the editing phase, the repair itself, the periphery) correlate with varying degrees of fluency. DMs located in the editing phase tend to have a more fluent function than DMs located in the repair or in the periphery, meaning that in repairs “DMs are more often part of the solution (signalling the interruption or beginning of the new utterance) than of the problem (being repaired themselves)” (194).
Chapter 9 contains a summary of all important conclusions and results of previous chapters.
The value of this work is undeniable, as it provides one of the most expansive multilingual corpora to date and introduces well-tested annotation systems that could potentially be used by many other scholars. These annotation systems also make certain aspects of DMs and their relation to fluency (syntactic position, register, etc.) quantifiable, giving her research replicable, verifiable parameters.
By analysing both functional and positional elements, Crible has demonstrated the significance of the ambivalent nature of both DMs and fluencemes, and the importance of their effects on perceptions of fluency. Crible does an excellent job of balancing statistical analysis with theoretical approach, creating a highly interesting and comprehensive study on a subject that is by no means easy to analyse. Neither approach was valued over the other, and Crible manages to demonstrate how these two approaches are symbiotic rather than antithetical.
However, a work of this nature is not without its complications. The first complication stems from the corpus itself. Because English data was more expansive and readily available, Crible was forced to pull from six different French databases (in comparison to the two English databases) to have enough comparable data. This could result in contaminated data, in which unaccounted-for factors influence DM usage. These corpora also come from different time periods, some dating back to the 1990s, which could limit the comparability of the data. Moreover, Crible evaluated the nature of certain recordings herself (e.g. the professionalism of certain speakers), which is always subject to certain personal biases. A future study in which data were collected in a more controlled environment would be beneficial in verifying Crible’s work. Her annotation system, though it received favorable results in statistical tests, still needs to be tested on other languages and registers to determine how well it generalises beyond English and French. Attempting to apply this system to learner corpora would also provide interesting insights into the functionality of this system.
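To make concrete what testing an annotation system on new data can involve, the sketch below computes chance-corrected agreement between two annotators who label the same DM tokens. The review does not say which agreement statistic Crible used. Cohen's kappa is assumed here purely for illustration, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy annotations using the four domains named in the taxonomy.
annotator_1 = ["sequential", "sequential", "rhetorical",
               "interpersonal", "ideational", "sequential"]
annotator_2 = ["sequential", "rhetorical", "rhetorical",
               "interpersonal", "ideational", "sequential"]

if __name__ == "__main__":
    print(round(cohen_kappa(annotator_1, annotator_2), 3))
```

Running the same calculation on annotations of a new language or a learner corpus would be one way to check whether agreement holds up outside English and French.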
Perhaps the main issue of this study is the subjective nature of the scale Crible implements. Throughout her analysis, qualitative measures were always an integral part of evaluation. What may seem fluent to one evaluator may be seen as less fluent by another; hence, personal perceptions of the evaluator will always be a pervasive factor. Different blind annotators often diverged in their interpretations and labelling of DMs, demonstrating how vulnerable DM interpretation is to situational context and personal bias. In addition, on a two-dimensional scale (i.e. the relationship between sequence type and function), her predictive model yields promising results. The sequential domain tends to correlate with fluency, the interpersonal domain with disfluency. However, when syntactic position is incorporated into the predictive relationship of register and sequence, the model no longer holds. When analysed together, there is no clear correlative relationship between fluency and domain, position, and sequence type. All factors individually seem to have some sort of significance in relation to DMs and fluency, but tend to cancel one another out when analysed as a whole. This calls into question the validity of either the factors analysed or the manner in which DMs and sequences are annotated. One solution Crible suggests is swapping the factor domain for that of function in future studies. This may indeed provide deeper insight into the elaborate relationships at play, but it is a question that can only be answered with further investigation.
These critiques, however, should not take away from the significance of Crible’s contribution. She has reaffirmed the importance of register for both usage and perceptions of fluency, identified key functional and positional elements of DMs, and illustrated the very complicated relationship all of these elements have in ideas of fluency.
All in all, Crible has managed to accomplish several impressive tasks in one project. This work will surely provide insight and nuance to the fields of both pragmatics and corpus linguistics.
References
Aijmer, K. (2013). Understanding pragmatic markers: A variational pragmatic approach. Edinburgh: Edinburgh University Press.
Besser, J., & Alexandersson, J. (2007). A comprehensive disfluency model for multi-party interaction. In S. Keizer, H. Bunt & T. Paek (Eds.), Proceedings of the 8th SIGdial workshop on discourse and dialogue (pp. 182–189). Antwerp.
Crible, L. (2017). Towards an operational category of discourse markers: A definition and its model. In B. C. Fedriani & A. Sanso (Eds.), Discourse markers, pragmatic markers and modal particles: New perspectives (pp. 101–126). Amsterdam: John Benjamins.
Götz, S. (2013). Fluency in native and nonnative English speech. Amsterdam: John Benjamins.
Hansen, M. M. (2006). A dynamic polysemy approach to the lexical semantics of discourse markers (with an exemplary analysis of French toujours). In K. Fischer (Ed.), Approaches to discourse particles (pp. 21–41). Amsterdam: Elsevier.
Hansen, M. M. (2008). Particles at the semantics/pragmatics interface: Synchronic and diachronic issues. A study with special reference to the French phrasal adverbs. Oxford: Elsevier.
Lennon, P. (2000). The lexical element in spoken second language fluency. In H. Riggenbach (Ed.), Perspectives on fluency (pp. 25–42). Ann Arbor: The University of Michigan Press.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge: MIT Press.
Schiffrin, D. (1987). Discourse markers. Cambridge: Cambridge University Press.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York: Routledge.
Shriberg, E. (1994). Preliminaries to a theory of speech disfluencies. Ph.D. thesis. Berkeley: University of California at Berkeley.