Intentional Disfluency Communication

Submitted By jrelse
ABSTRACT
Disfluency is the interruption of an otherwise continuous flow of speech.
Current views explain speech disfluency either as an epiphenomenon of cognitive overload or as an intentional device that eases social interaction by conveying non-explicit thought processes. This study examined both hypotheses, with the main focus on disfluency as a form of social communication. The disfluencies examined were ‘uh’, ‘um’, ‘hmm’, ‘oh’, laughter and silences.
Autism Spectrum Disorder (ASD) is partially defined by a lack of social awareness. The Autism Quotient (AQ) test is used to determine where an individual lies on the continuum from typical development (TD) to ASD. This study used the AQ as a measure of meta-cognitive awareness.
TD students at the University of Edinburgh (N=50) undertook both a written
AQ test and a verbal general knowledge test. Disfluency use during the general knowledge test was analyzed and compared to utterance length, question answer confidence ratings, gender and AQ scores. All modeled disfluencies were found to increase with utterance length, which has been related to cognitive load (Oviatt, 1995; Shriberg, 1996). The use of ‘um’, laughter, and silence increased during moments of uncertainty, as shown by the individual confidence ratings. However, this does not distinguish whether participants were communicating uncertainty intentionally or accidentally. Conversely, the use of ‘uh’ increased with confidence, suggesting a distinction between the uses of ‘uh’ and ‘um’ consistent with the findings of Clark and Fox Tree (2002). Laughter was significantly predicted by uncertainty and gender (more common in females), consistent with Provine (1996), who theorized laughter as a social buffer rather than a communication tool. The most noteworthy finding was that an increased AQ score predicted a decreased use of the fillers ‘uh’, ‘um’, ‘oh’ and ‘hmm’. This suggests that filler use is significantly related to meta-cognitive interaction and thus may serve an intentional communicative function.
These results indicate that different disfluencies serve different functions. Furthermore, fillers (‘uh’, ‘um’, ‘oh’ and ‘hmm’) can be considered words rather than speech errors.

Table of Contents
1. Introduction
1.1 Disfluency as an epiphenomenon of cognitive overload
1.2 Disfluency as a form of communication
1.3 Autism spectrum disorder
1.4 Speech disfluency and autism quotient
2. Methodology and materials
2.1 Participants
2.2 Materials and procedure
2.3 Data analysis
2.4 Statistical analysis
3. Results
3.1 Disfluency models
4. Discussion
4.1 Repetitions and restarts
4.2 Silent pauses
4.2.1 AQ, confidence and silent pauses
4.3 Laughter
4.3.1 Gender and laughter
4.3.2 Confidence and laughter
4.4 Filled pauses (uh, um, oh and hmm)
4.4.1 Gender and filled pauses
4.4.2 Utterance length and filled pauses
4.4.3 Confidence and filled pauses
4.4.4 AQ and filled pauses
4.5 Further research
4.6 Conclusion
5. Appendix
6. References

1. Introduction

Six to ten percent of normal speech is made up of disfluencies (Bortfeld, Leon, Bloom, Schober & Brennan, 2001), categorized as interjections, hesitations, slips, lengthened syllables, revisions, repetitions and silences (Kidd, White & Aslin, 2011). Despite the occurrence of one disfluency for every seven to ten words (Shriberg, 1996), the human mind does not falter in decoding speech; rather, it adjusts to interpret such alterations successfully (Bortfeld et al., 2001; Lickley & Bard, 1996). Questioning disfluency interpretation is important for understanding human cognitive processes, including how these processes adapt to deal with challenges and limitations. However, questioning disfluency formation is imperative for extending our understanding of these cognitive processes by deducing the intentions and patterns behind them.
Mahl et al. (1956) and Shriberg (1996) suggested that disfluencies arise during moments of hesitation and that two types of speaker exist during these moments: those who use interjections and repetitions, and those who change and alter sentence formation. Mahl et al. (1950) further suggested that 40% of disfluencies were interjections, 45% were revisions and repetitions, and less than 1% were attributable to slips. The reasons for variation between disfluency types and in individual disfluency usage are not fully understood. Establishing whether disfluencies are a symptom of cognitive struggle during executive functioning and speech production, or whether they are used purposefully to aid fluency during moments of hesitation, is critical for deciding whether they should be considered conventional words and part of human language.

If disfluencies are indeed an intentional form of communication, a certain capacity for ‘Theory of Mind’ (TOM) and social imitation should be implicated in their use. Effective and fluid communication requires a degree of meta-cognition and appraisal of others’ desires. Baron-Cohen (2001) developed the AQ test to measure autistic traits, including TOM and imitation ability. Since individuals with a high autism quotient (AQ) are associated with a low capacity for TOM and imitation (Baron-Cohen, 2001), higher AQ scores should theoretically be associated with lower disfluency use. Examining disfluency use in relation to AQ score could help to determine the intentions behind the formation and function of different forms of disfluency. Further, current concern about escalating autism prevalence, reported to have risen by 600% in the past two decades (Mulvihill et al., 2006), demonstrates the need for detailed behavioral analysis of autism spectrum disorder, since specific biological markers are still lacking.

1.1 Disfluency as an epiphenomenon of cognitive overload

Chomsky (1965) highlighted the underlying mental processes between what a speaker knows and what a speaker does. He encouraged the opinion that disfluencies are unintentional errors and “advocated that they should be excluded from linguistic theory” (as cited in Clark & Fox Tree, 2002, p. 74). This hypothesis suggested that disfluencies are a product of cognitive struggle.
Multiple studies have examined the possible effect of cognitive demand on disfluency formation. Bortfeld et al. (2001) manipulated participants’ planning load by altering discourse length, introducing unfamiliar discourse topics and assigning the speaker the role of either director or matcher. The rate of restarts and repetitions increased with cognitive load. The same effect, however, was not found for fillers. Bortfeld et al. (2001) also found no difference in the rate of any form of disfluency between strangers and couples. This suggests that disfluencies do not vary with familiarity, which weakens the case that they function to aid communication.
Schachter, Christenfeld, Ravina and Bilous (1991) looked at the effect of choice on cognitive demand. They discovered that humanities lecturers used more fillers than social science lecturers, and that both of these subject groups used more fillers than hard science lecturers. However, this difference was not found to be significant when lecturers addressed the same topic. Therefore the variation in filler rate was explained through a necessity for increased processing, in order to deal with more complex sentence formations in humanities-based subject matter.
Both Oviatt (1995) and Shriberg (1996) found more disfluency before longer utterances and attributed this to a heightened need for executive planning. The cognitive overload hypothesis therefore predicts that the number of disfluencies produced should increase with speech length.
Barr (2001) found that participants used 34% more fillers when describing new as opposed to previously seen abstract shapes. Barr explained that unfamiliar material demanded increased processing, providing evidence for an effect of cognitive demand on filler rates. Merlo and Mansur (2004) similarly found that unfamiliar subject matter predicted greater disfluency production. Oomen and Postma (2001) found that more repetitions were produced under increased cognitive load imposed by time-constrained speech conditions. However, they did not find any change in filler rates.

These findings support the hypothesis that an overload during executive functioning, due to task difficulty or information quantity, can cause disfluency production to increase. The studies also suggest that the effect of cognitive overload differs between disfluency types and varies with experimental design.

1.2 Disfluency as a form of communication

An alternative hypothesis proposes that disfluency functions to aid social communication. It has been suggested that speakers use disfluencies as a resource to coordinate interaction, to communicate uncertainty, and to indicate that an answer is in process so that the listener can be cued effectively (Smith & Clark, 1993).
Bortfeld et al. (2001), Oomen and Postma (2001) and Shriberg (1996) found that cognitive load predicted repetitions and restarts; this was not the case for fillers. If fillers are not merely epiphenomena, perhaps their formation has some intentional purpose.
Clark and Fox Tree (2002) proposed that fillers (specifically ‘uh’ and ‘um’) signify that the speaker is searching for the correct word, is planning what to say, or is indicating a wish to hold the floor. They suggested that individuals vary fillers in length, type, and whether they are attached to another word or not. They also found ‘uh’ to be a significant predictor of minor delays before speech, and ‘um’ to be a significant predictor of major delays.
Additionally, Clark and Fox Tree (2002) distinguished between fillers as ‘words’ and fillers as ‘non-linguistic signals’. Fillers were proposed as ‘words’ by James (1972), who considered them to be a form of interjection. Alternatively, fillers were considered ‘non-linguistic signals’ by Wilkes and Gibbs (1986), who proposed that they are used at the end of a speech turn to invite help; Maclay and Osgood (1959) proposed that fillers at the beginning of speech turns may be a means of holding the dialogue floor. Whether ‘word’ or ‘non-linguistic signal’, these hypotheses agree that fillers serve an intentional role.
Shriberg (1996) found participants to produce higher filler rates and sentence revisions when interacting with other participants as opposed to interaction with speech recognition apparatus. This finding implicates the social nature of disfluency production. Oviatt (1995) also found that speech disfluency substantially increased during human interactions as opposed to machine-controlled interactions. The disfluencies included: fillers, corrections, repeats and false starts. However, the task imposed by a speech recognizer could have been more direct and structured than human-to-human speech tasks, potentially confounding results.
Several studies have tested the effects of removing face-to-face interaction, and therefore visual cues such as eye contact. Kasl and Mahl (1965) found that filler rates increased by 41% in audio-only conversation between individuals in different rooms. Oviatt (1995) found higher disfluency rates during telephone than face-to-face discourse (77% accounted for by utterance length). Schachter et al. (1991) found filler rates to be lower in speech produced while gesturing than in speech produced without gesturing. These studies suggest that verbal compensation is necessary in the absence of in vivo interaction to allow successful turn-taking coordination. This is further evidence for disfluencies as an aid to communication.
Broen and Siegal (1972) and Oviatt (1995) found more disfluency in dialogue than in monologue. Oviatt (1995) found 5.5-8.53 disfluencies per 100 words during dialogue as opposed to 3.6 disfluencies per 100 words during monologue. If disfluency decreases in the absence of a listener, this implies that it has a partially intentional use by the speaker. Nonetheless, the presence of some remaining disfluency in monologues indicates that communication cannot be assumed to be the sole purpose of disfluency.

1.3 Autism spectrum disorder

Autism spectrum disorder (ASD) is marked by altered social and communication development, and by an increased incidence of limited imagination and repetitive behavior (American Psychiatric Association [APA], 1994). These behaviors make up the ‘triad of impairments’ (Wing & Gould, 1979), and must be present to some degree for a diagnosis to be made (Frith, 1989). The disorder is measured on a spectrum, indicating that traits vary in extent. The present study is concerned with social reciprocity and communication skills, traits commonly underdeveloped in ASD individuals, resulting in poor environmental adaptation.
The TOM deficit hypothesis (Baron-Cohen, Leslie & Frith, 1985), more recently termed ‘mind-blindness’ (Baron-Cohen, 2009), explains the triad of impairments in terms of an egocentric character trait and a basic difficulty in accounting for others’ mental states. Baron-Cohen (2011) described this trait as a tendency to over-systemize and under-empathize. He also suggested that this trait is an extreme form of the male brain, differing from the highly empathetic female brain (Wheelwright, Baron-Cohen, Goldenfeld, Delaney, Fine, Smith, Weil & Wakabayashi, 2006). This is consistent with ASD incidence rates, which show a 4:1 male to female ratio (Baron-Cohen, 2003).

The ‘transfer of false beliefs test’ assesses TOM (Wimmer & Perner, 1983). The test requires individuals to infer the mental state of a doll concerning the location where it believes an object to be, which differs from the object’s actual location. Baron-Cohen et al. (1989) found an 80% fail rate in ASD individuals on this test. Happé (1994) highlighted that the 20% who passed leave uncertainties about the TOM deficit and the credibility of the test. Frith, Happé and Siddons (1994) suggested that individuals may answer the test correctly using logical deduction rather than TOM.
Baron-Cohen (1989) developed the ‘second-order false belief task’ to test this and found a 100% fail rate in ASD children, a 40% fail rate in children with Down syndrome and a 10% fail rate in typical development (TD) children. However, mental age varied between test groups, possibly confounding the results (Happé, 1995).
Further tests using the ‘second-order task’ by Ozonoff, Rogers and Pennington (1991) and Bowler (1992) found that ASD individuals performed significantly worse than controls. However, Bowler (1992) found that individuals with Asperger’s did not perform significantly worse on the test, although they could not justify their answers using psychological state terms. These mixed findings indicate gaps in the explanation of the TOM deficit.
The ‘strange stories’ test (Happé, 1994) also assesses TOM, requiring individuals to analyze visual stories and justify the characters’ underlying motives. Patients who show a distinction between first-order and second-order false belief tasks have been shown to vary in their ability on the strange stories test. Jolliffe and Baron-Cohen (1999) found that autistic individuals with no delay in language ability had difficulty on the strange stories test.

The ‘eye task’ (Baron-Cohen, 1997) showed that individuals with Asperger disorder and high-functioning ASD performed significantly less well than controls and individuals with Tourette syndrome when analyzing emotional states from images of eyes. This distinction was not significant when the entire face was analyzed (van der Geest, 2002), suggesting a particular relation between TOM and the eyes. However, a distinction has been demonstrated between ASD individuals’ capacity to analyze moving and static eye images (Speer, 2007). The TOM deficit in ASD patients is therefore unlikely to be based solely on eye analysis; additional social cues are gained through movement. Rutherford, Baron-Cohen and Wheelwright (2002) found evidence that ASD individuals also have a reduced capacity to recognize emotional states from vocal intonation.
Klin, Jones, Schultz and Volkmar (2003) proposed the ‘enactive mind’ hypothesis, which argues that TD individuals find social explanations for all aspects of life, including inert objects, but that ASD individuals do not. Klin (2000) tested this distinction with the Social Attribution Task (SAT), which examines individuals’ perceptions of geometric shapes. ASD individuals had difficulty making social inferences from cartoons involving geometric shapes.

Even TD individuals can struggle to comprehend the difference between their own needs and the desires of others, so where is the line drawn? The deficit may not be a concrete lack of empathy but an inability to process social cues involving imitation.
Rogers et al. (1996) found deficits in high-functioning ASD individuals in imitating facial expressions and hand gestures. Hamilton (2008) additionally found limited imitation ability in ASD children compared to control groups. In contrast, Hobson and Lee (1999) suggested that direct mimicry is impaired in ASD individuals, but not when it involves goal-directed behavior.
Allen, Haywood, Rajendran and Branigan (2010) found that ASD children were able to produce linguistic imitation to the same extent as control groups; ASD individuals were able to align syntactically during interaction.

It is undeniable that ASD individuals are markedly reduced in their capacity to understand social behavior and recognize others’ emotional states. ASD individuals commonly have difficulty interpreting deception and humor correctly, instigating pretend play, and talking about mental states involving dreaming, wanting and secrets. Brain damage and fMRI studies (Baron-Cohen et al., 1999; Happé et al., 1996; Stone, Baron-Cohen & Knight, 1999; Stone, Baron-Cohen, Young & Calder, 1998) implicate the amygdala, orbito-frontal cortex, and medial frontal cortex in the comprehension of others’ emotions and thoughts (implicating TOM). Inconsistent results and incomplete explanations of TOM call for a further understanding of this deficit.

1.4 Speech disfluency and autism quotient

Disfluencies are distinct from words in that they lack concrete definitions. If speakers use them actively for the benefit of the listener as a form of communication, rather than as symptoms of cognitive overload, their use requires a certain amount of interpersonal perception and social reciprocity. This is necessary despite listeners’ abilities to accurately interpret and code for disfluencies. ASD individuals typically have social communication and imitation deficits. An understanding of the relationship between disfluencies and ASD could therefore help to deepen our understanding of speech disfluency processes, and aid the diagnosis and comprehension of ASD.
To date, two published studies have examined the relationship between ASD and speech disfluency.
Heeman, Lunsford, Selfridge, Black and Santen (2010) looked at dialogue use in both ASD and TD children, with particular focus on the use of pauses, fillers, acknowledgments and discourse markers. Both TD (N=22) and ASD (N=26) children between the ages of 4 and 8 participated. ASD children paused for significantly longer between turns than TD children. Concerning fillers, ASD children used ‘um’ significantly less than TD children, but there was no significant difference in their use of ‘uh’. These results are consistent with the finding of Clark and Fox Tree (2002) that ‘uh’ and ‘um’ stem from distinct cognitive processes. They also suggest that ‘um’ may be linked more closely than ‘uh’ to a social function rather than being a symptom of executive stalling. ASD children used significantly fewer acknowledgments and used the word ‘and’ significantly less, but there was no significant difference for any other discourse marker.
Lake, Humphreys and Cardy (2011) also studied spontaneous language use in ASD (N=13) and TD (N=13) participants between the ages of 19 and 35. Analysis showed that ASD individuals used significantly fewer filled-pause words (specifically ‘uh’ and ‘um’) and revisions than controls. Conversely, ASD individuals used more silent pauses and disfluent repetitions.
Understanding the relationship between ASD and disfluency serves several important functions. It could help to make the diagnostic process for ASD more objective; to inform therapeutic methods used with ASD patients; to aid a better understanding of the ASD phenotype; and to help demonstrate how much of language is social and intentional, improving our understanding of human cognitive processes. The current study evaluates disfluency use in relation to AQ scores in TD individuals. The AQ test has been deemed a valid and reliable method for measuring autistic traits on a continuum between TD and ASD individuals (Woodbury-Smith, Robinson, Wheelwright & Baron-Cohen, 2005; Allison, Auyeung & Baron-Cohen, 2012; Hoekstra, Bartels, Cath & Boomsma, 2008; Wakabayashi, Baron-Cohen, Wheelwright & Tojo, 2006).

2. Methodology and Materials

2.1 Participants

Fifty-one undergraduate and postgraduate students (30 male, 21 female) from the University of Edinburgh, aged between 18 and 25 (mean age = 22.08, SD = 1.412), took part. The recording for participant number 46 (female) failed and was therefore excluded, leaving N = 50.
Participants were recruited from both the College of Science and Engineering and the College of Humanities and Social Sciences. Baron-Cohen, Wheelwright, Skinner, Martin and Clubley (2001) found science students to have higher AQ scores than humanities and social science students, with mathematicians scoring the highest of the science students. For this reason participants were recruited across a range of subject groups (Appendix A) to ensure a wide distribution of comparable AQ scores. Students at the University of Edinburgh were chosen as participants to reduce variation in IQ and VIQ as much as possible.
The age range was kept between 18 and 25 since language ability has been closely related to age, and because of evidence of language decline after age 25 (Timothy, 1992). Disfluency rates have also been shown to increase with age (Bortfeld et al., 2001).
Forty-eight of the participants were native English speakers, with English as their first language. The three foreign participants were Polish, Estonian and German, but demonstrated sufficient VCI scores.
All participants were of typical development (TD), in good physical and mental health, with no speech or auditory impediments. Two participants obtained scores equal to or above 32 on the AQ test, which is deemed sufficiently high to implicate ASD (Baron-Cohen et al., 2001); however, the AQ test alone is not adequate to assume this diagnosis.
The experimenter attempted to recruit a sufficiently representative number of both male and female participants. This was deemed important since ASD has a 4:1 male to female incidence rate (Anello et al., 2009). Filler rates have also been shown to be higher in males than females (Bortfeld et al., 2001). Results from twenty females and thirty males were used in this study. The ratio was not equal; however, a relatively high number of participants of each sex was included.

2.2 Materials and procedure

An independent measures design was used. Participants were randomly selected and invited to join the experiment from outside their corresponding department building. The same experimenter carried out all of the testing to help standardize procedures. Data were collected on campus in two study rooms, one belonging to the science and engineering department and the other to the humanities and social sciences department. Testing materials included a consent form, a debrief form, a manual AQ test, a confidence rating score sheet, a general knowledge questionnaire, relevant sections of the WAIS-III and a Dictaphone.
On arrival, participants first completed a written form of the Autism Quotient (AQ) test; they were unaware of the test’s function.
This was followed by a general knowledge questionnaire, read aloud by the experimenter. The participants answered the questions verbally whilst the dialogue was recorded on a Dictaphone. Participants simultaneously recorded their confidence level for each question on a score sheet rather than verbally. This was done so that participants felt their ratings were hidden from the experimenter, since a lack of anonymity could have altered their rating choices. Participants gave confidence ratings immediately following each question to ensure accuracy.
The general knowledge questionnaire was constructed through pilot testing on a range of students varying in subject group, gender and age. Questions that were found to be too easy or too hard were removed. This was done to ensure a suitable range of unbiased questions. Forty questions in total were retained from the pilot test. Before the general knowledge questionnaire commenced, participants were informed that they could not ask for help, to ensure that questioning practice was standardized across participants.
Participants then completed the Verbal Comprehension Index (VCI) section of the Wechsler Adult Intelligence Scale III (WAIS-3®, 1997), the most recent version accessible through the Edinburgh University Psychology department. The VCI is one of the best predictors of overall intelligence and was used to ensure that all participants were of an acceptable verbal ability standard.
The total experiment did not exceed 20 minutes to control for boredom effects, and all of the tests were carried out between 10 AM and 3 PM.
Participants were then debriefed and asked for their personal opinion as to why they believed they used certain fillers, and under which circumstances they felt they produced more.

2.3 Data analysis

Transcription and Coding
The general knowledge questionnaire was administered verbally and recorded using a Dictaphone. The recordings were listened to multiple times, transcribed thoroughly, and then analyzed in detail by the experimenter. The number of disfluencies used per participant and per question was then counted and recorded on a spreadsheet. Fillers following one after another were counted separately. A second experimenter listened to a cross section of ten participants’ recordings whilst viewing their transcriptions, and counted their individual disfluencies (any irregularity during speech). This was done to increase the reliability of the data. Only a negligible number of changes to the transcriptions were recommended, and the remaining transcriptions were therefore deemed reliable.
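
The counting rule described above can be illustrated with a minimal sketch. It assumes transcripts are plain text strings, which the study does not specify; the function name `count_fillers` and the example utterance are hypothetical. Consecutive fillers are counted separately, as in the coding procedure.

```python
import re
from collections import Counter

# Filler forms taken from the study; the transcript format is an assumption.
FILLERS = ("uh", "um", "oh", "hmm")


def count_fillers(transcript: str) -> Counter:
    """Count each filler token in a transcript.

    Every occurrence is counted on its own, so fillers that follow
    one after another ("um, uh") contribute one count each.
    """
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return Counter(tok for tok in tokens if tok in FILLERS)


# Hypothetical answer to one general knowledge question.
example = "Um, uh, I think it was... hmm, um, Paris?"
counts = count_fillers(example)
print(dict(counts))
```

Silences and laughter would need timing or annotation marks in the transcript and are not covered by this sketch.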

The AQ Test
The AQ test (Baron-Cohen, 2001) consists of fifty questions assessing behavioral areas that include attention switching, attention to detail, social skills, communication, and imagination. Individuals must choose one of four answers for each question: “strongly agree”, “slightly agree”, “slightly disagree” or “strongly disagree”. Each question is awarded either 1 or 0 points, and these are summed to obtain an overall score ranging from 0 to 50. A score of 32 has been judged as the cut-off point to screen for ASD (Baron-Cohen et al., 2001); however, others have suggested that an even lower score of 26 is sufficient (Woodbury-Smith, Robinson, Wheelwright & Baron-Cohen, 2005). The AQ test has been found to be a valid and reliable measure for screening both Asperger syndrome and Autism Spectrum Disorder (Woodbury-Smith et al., 2005; Allison et al., 2012; Hoekstra et al., 2008; Wakabayashi et al., 2006).
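
The 0/1 scoring scheme can be sketched as follows. This is an illustration under assumptions: it supposes that each item has a keyed direction (agree or disagree) and that either strength of the keyed response earns the point. The key used below is a made-up placeholder, not the published scoring key, and `score_aq` is a hypothetical helper name.

```python
AGREE = {"strongly agree", "slightly agree"}
DISAGREE = {"slightly disagree", "strongly disagree"}


def score_aq(responses, agree_keyed_items):
    """Sum one point per item answered in its keyed direction.

    responses: dict mapping item number -> one of the four answer labels.
    agree_keyed_items: set of item numbers that score on agreement;
        all other items are assumed to score on disagreement.
    """
    total = 0
    for item, answer in responses.items():
        answer = answer.lower()
        if answer not in AGREE | DISAGREE:
            raise ValueError(f"unexpected response for item {item}: {answer!r}")
        keyed = AGREE if item in agree_keyed_items else DISAGREE
        total += 1 if answer in keyed else 0
    return total


# Hypothetical three-item example; a real answer sheet has fifty items.
hypothetical_key = {1, 3}  # placeholder: items 1 and 3 score on agreement
responses = {
    1: "strongly agree",      # keyed agree, answered agree -> 1 point
    2: "slightly agree",      # keyed disagree, answered agree -> 0 points
    3: "slightly disagree",   # keyed agree, answered disagree -> 0 points
}
demo_score = score_aq(responses, hypothetical_key)
print(demo_score)
```

A total computed this way would then be compared against the screening cut-off of 32 (or 26) mentioned above.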

VCI
Participants were tested on four sections of the WAIS-III (WAIS-3®, 1997): similarity, vocabulary, information and comprehension. The similarity, vocabulary and information scores were used to compute the Verbal Comprehension Index (VCI). Each participant’s VCI score was calculated using the age-adjustment scale provided at the back of the WAIS-III.

2.4 Statistical analysis

Data analyses were carried out in R (R Development Core Team, 2011) using the lme4 package (Bates, Maechler & Bolker, 2011). The dependent variables were coded binomially (whether or not a disfluency had been made), so a mixed-effects logistic regression model was appropriate for comparing models. This allowed both categorical and continuous predictors to be included in the same analysis (Breslow & Clayton, 1993; DebRoy & Bates, 2004).
The mixed-effects logistic regression model accounts for variation that cannot be attributed to the independent variables by including participants and items simultaneously to create a control model. This means that any differences in the data produced by participant and question item variation (which is not of experimental interest) are treated as error variance rather than as significant effects. For each disfluency model, a base control model was therefore fitted, including an intercept and accounting for participant and question item variation.
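
As a sketch in notation that does not appear in the original, the control model can be written with a random intercept for each participant and each question item. Here $y_{ij}$ indicates whether participant $i$ produced the disfluency on question $j$; the symbols $\beta_0$, $u_i$, $w_j$ and the normality assumption on the random effects are the conventional formulation, assumed rather than taken from the study.

```latex
\operatorname{logit} P(y_{ij} = 1) = \beta_0 + u_i + w_j,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad w_j \sim \mathcal{N}(0, \sigma_w^2)
```

Adding a predictor $x_{ij}$ (for example, number of words or confidence rating) extends the right-hand side to $\beta_0 + \beta_1 x_{ij} + u_i + w_j$.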

Once this model had been created, predictor variables were added to the model one at a time. The probability of a disfluency model was derived from the model residuals, controlling for participant and questionnaire number. The model residuals demonstrate whether additional factors significantly alter the shape of the model.
A log-likelihood ratio test statistic was calculated as -2 (l1-l0), where l0 and l1 are the maximized log-likelihoods of the models with and without the added predictor, respectively, and the degrees of freedom are the difference in the number of parameters between the two models. Since this statistic approximately follows a χ2 distribution under the null hypothesis that the added predictor has no effect, a χ2 test was appropriate for evaluating whether the additional predictor produced a significant effect on the model.
If the effect of a predictor was significant, it was retained in the model before the next predictor was added. If the effect of a predictor was not significant, it was removed from the model before another predictor was added.
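
The stepwise procedure can be sketched in Python as follows. This is an illustration, not the analysis code used in the study (which was run in R). It assumes that each step adds exactly one parameter, so that the χ2 survival function for one degree of freedom, erfc(sqrt(x/2)), applies. The log-likelihood values in the lookup table are invented for the demonstration, apart from -1133, which is the value reported for the silence model; in practice each value would come from refitting the mixed model.

```python
import math


def lr_test_df1(loglik_without: float, loglik_with: float):
    """Likelihood ratio test for one added parameter (df = 1).

    Returns the test statistic 2 * (loglik_with - loglik_without)
    and its p-value under a chi-square(1) reference distribution.
    """
    stat = max(2.0 * (loglik_with - loglik_without), 0.0)
    # Chi-square(1) survival function: P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def stepwise(fit_loglik, candidates, alpha=0.05):
    """Add predictors one at a time, keeping only significant ones.

    fit_loglik: callable taking a tuple of predictor names and returning
        the maximized log-likelihood of the model with those predictors.
    """
    kept = []
    current = fit_loglik(tuple(kept))
    for name in candidates:
        trial = fit_loglik(tuple(kept + [name]))
        _, p_value = lr_test_df1(current, trial)
        if p_value < alpha:
            kept.append(name)   # retain the predictor for later steps
            current = trial
        # otherwise the predictor is dropped before the next one is tried
    return kept


# Invented log-likelihoods standing in for refitted models.
hypothetical_logliks = {
    (): -1200.0,
    ("words",): -1150.0,
    ("words", "confidence"): -1133.0,
    ("words", "confidence", "aq"): -1132.7,
    ("words", "confidence", "gender"): -1132.9,
}
selected = stepwise(hypothetical_logliks.__getitem__,
                    ["words", "confidence", "aq", "gender"])
print(selected)
```

Passing the model-fitting step in as a callable keeps the selection logic separate from whatever software fits the models.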

3. Results
Each participant’s general knowledge questionnaire was recorded and then transcribed. The disfluencies extracted from the recorded dialogues were coded binomially for each individual general knowledge question. The most frequent disfluencies were analyzed: ‘uh’, ‘um’, ‘oh’, ‘hmm’, silences and laughter (Table 1). Other disfluencies occurred too rarely to show a significant result and were therefore not included in the analysis (Appendix E).

Table 1: Total number of disfluencies used

Disfluency    Total
Uh            347
Um            445
Silences      908
Laughter      224
Hmm           59
Oh            47

Disfluency models (Table 2) were then evaluated in conjunction with the following predictors: number of words used, confidence rating (Table 3), gender
(Appendix G) and AQ score (Figure 1).

Table 2: Average number of disfluencies used per participant

Model                        Participant average/question    Std. Dev
Fillers (uh, um, oh, hmm)    0.451                           12.058
Laughter                     0.141                           4.519
Silences                     0.46                            8.762

Table 3: Average number of predictors used per participant

Predictor            Participant average/question    Std. Dev
No. Words            5.246                           152.452
Confidence rating    4.287                           33.427

Figure 1: Histogram to Demonstrate the Distribution of Autism Quotient Scores Collected
[Histogram not reproduced. x-axis: AQ Score (1-50), with observed scores ranging from 3 to 37; y-axis: Frequency. Mean = 17.12, Std. Dev = 7.2044, N = 50.]

Significant disfluency models were extracted, each of which included an intercept and slope coefficients showing the effect of the relevant experimental predictors. The statistical significance of each coefficient was calculated using the Wald statistic (Z = coefficient / std. error) (Table 4).
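As a worked illustration, the Wald statistic and its two-sided p-value can be computed from an estimate and its standard error. The sketch below assumes a standard normal reference distribution and uses the intercept of the silence model as listed in Table 4 (estimate -0.0315, std. error 0.189); the function name is hypothetical.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald Z statistic and its two-sided p-value.

    The p-value assumes Z follows a standard normal distribution
    when the true coefficient is zero.
    """
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Intercept of the silence model, values as listed in Table 4.
z, p_value = wald_test(-0.0315, 0.189)
print(f"Z = {z:.3f}, p = {p_value:.3f}")  # p rounds to 0.868, as in Table 4
```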

3.1 Disfluency Models
Model 1 - silence
The null model (controlling for participant and question number) for the disfluency silence was significantly altered by the inclusion of the predictor ‘number of words’ (χ2 (1) = 2.2e-16, p < .001), and further by the predictor ‘confidence rating’ (χ2 (1) = 2.2e-16, p < .001) once ‘number of words’ had been included in the model, with log likelihood -1133. The updated model showed that silences became more likely as the number of words increased (OR = 1.0772, p < .001) and as confidence rating decreased (OR = .7849, p < .001).
The coefficients of the model (and the probabilities that they differ from zero) are shown in Table 4.
Neither AQ (χ2 (1) = 0.6309, p > .05) nor gender (χ2 (1) = 0.1325, p > .05) significantly improved the model for the disfluency silence.
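The odds ratios reported for Model 1 can be recovered by exponentiating the coefficient estimates listed for that model in Table 4 (0.0744 for number of words, -0.2422 for confidence). The short sketch below shows the conversion; the function name is hypothetical.

```python
import math


def odds_ratio(coefficient):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(coefficient)


# Coefficient estimates for the silence model, as listed in Table 4.
or_words = odds_ratio(0.0744)
or_confidence = odds_ratio(-0.2422)
print(f"No. Words:  OR = {or_words:.4f}")       # 1.0772
print(f"Confidence: OR = {or_confidence:.4f}")  # 0.7849
```

An odds ratio above 1 means the odds of a silence rise with each additional unit of the predictor, and an odds ratio below 1 means they fall.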

Model 2 - laughter
The null model (controlling for participant and question number) for the disfluency laughter was significantly altered by the predictor ‘number of words’ (χ2 (1) = 2.715e-08, p < .001), and further by the predictor ‘confidence rating’ (χ2 (1) = 9.677e-16, p < .001) once ‘number of words’ had been included in the model.
Adding gender to the updated model produced a further significant improvement (χ2 (1) = 0.0368, p < .05), with log likelihood -578.9. The new best-fit model showed significant effects of all three predictors: number of words used (OR = 1.0456, p < .001), confidence rating (OR = .7465, p < .001) and gender (OR = .4881, p < .05).

The coefficients of the model (and the probabilities that they differ from zero) are given in Table 4.
AQ, by contrast, did not significantly improve model 2 once number of words and confidence had been accounted for (χ2 (1) = 0.2814, p > .05); no significant relationship was found between AQ score and laughter.

Model 5 – uh, um
Model 5 tested for the combined use of ‘uh’ and ‘um’. The null model (controlling for participant and question number) was significantly altered by the inclusion of the predictor ‘number of words’ (χ2 (1) = 2.2e-16, p < .001), and further by the predictor ‘confidence rating’ (χ2 (1) = .04989, p < .05) once ‘number of words’ had been included in the model, with log likelihood -1071. Participants were more likely to produce the disfluencies ‘uh’ and ‘um’ during longer utterances (OR = 1.0856, p < .001). Considering the two fillers ‘uh’ (model 3) and ‘um’ (model 4) separately showed that confidence rating significantly predicted both ‘uh’ (χ2 (1) = 0.02556, p < .05), with log likelihood -736.4, and ‘um’ (χ2 (1) = 0.0001322, p < .001), with log likelihood -837.9. However, the two effects ran in opposite directions: the use of ‘uh’ increased significantly with confidence rating (OR = 1.0781, p < .05), whereas the use of ‘um’ decreased significantly as confidence rating increased (OR = .8886, p < .001).
The coefficients of the models (and the probabilities that they differ from zero) are shown in Table 4.
Adding AQ (χ2 (1) = 0.6309, p > .05) or gender (χ2 (1) = 0.1325, p > .05) to these models, each tested separately, did not significantly improve the prediction of filler use.

Model 6 – uh, um, oh, hmm
The null model (controlling for participant and question number) for the combined disfluencies ‘uh’, ‘um’, ‘hmm’ and ‘oh’ was significantly altered by the inclusion of the predictor ‘number of words’ (χ2 (1) = 2.2e-16, p < .001), and further by the predictor ‘confidence rating’ (χ2 (1) = .02207, p < .05) once ‘number of words’ had been included in the model. The introduction of AQ scores to this updated model improved it significantly again (χ2 (1) = .04255, p < .05), with log likelihood -1096.
The new model found that disfluency increased with number of words (OR = 1.0837, p < .001) and decreased as confidence rating increased (OR = .9393, p < .05). It also showed that participants with higher AQ scores were less likely to produce the disfluencies ‘uh’, ‘um’, ‘oh’ and ‘hmm’ (OR = .9544, p < .05).
The coefficients of the model (and the probabilities that they differ from zero) are given in Table 4. The correlation between AQ scores and the residual probability of disfluency model 6, controlling for participant, question number, number of words used and confidence, is shown in Figure 2.
The predictor ‘gender’ had no significant effect on the control model for the disfluencies ‘uh’, ‘um’, ‘oh’ and ‘hmm’ (χ2 (1) = .7178, p > .05).

Figure 2: Scatter Plot to Demonstrate the Correlation Between AQ Score and the Residual Probability of Disfluency Model 6 (uh, um, oh, hmm)
[Scatter plot not reproduced.]

Note: Further analysis of the data controlling for participants 29, 31 and 43 (who had particularly high AQ scores) confirmed that the model remained significant. Their extreme scores did not confound the results.

Table 4: Coefficients and probabilities for each best-fitting disfluency model

Model          Predictor     Estimate    Std. Error    p(coefficient
1 (silence)    Intercept     -0.0315     0.189         0.868
               No. Words     0.0744      0.0093        1.01E-15         0.001
               Confidence    -0.2422     0.0252
