Papers by figures – Mark Dingemanse

A dive into my work from the perspective of figures — diagrams, photographs, plots, and other visually supportive material. For other views, see publications by theme or the full list.

Actionable metrics for openness in generative AI

Fine-grained openness judgements (1) can be turned into actionable metrics in several ways: they can be combined to form a cumulative openness score (2), discretized into categories like energy labels (3), or dichotomised into a binary measure (4). Yet another approach, increasingly popular, is to let a single feature be the arbiter of openness, for instance ‘open weights’ or an ‘open license’ (5). Because this obscures the composite and gradient nature of openness, it is one of the most effective methods for open-washing.

Figure from a paper in which we discuss open source generative AI in light of the EU AI Act and the growing trend of open-washing: claiming brownie points for openness without actually delivering the goods. As we write:

We take time to discuss these ways of turning rich openness assessments into reductive metrics because it is important to be aware of the distorting effects of metrics (…) This figure offers both a manual for lobbyists and the means to counter them.
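
To make the difference between these reductions concrete, here is a minimal Python sketch. It is not the scoring used in the paper: the feature names, the 0 / 0.5 / 1 scale, the label cut-offs and the 0.5 threshold are all assumptions chosen for illustration.

```python
from typing import Dict

# (1) Hypothetical fine-grained judgements for one system:
# 0.0 = closed, 0.5 = partial, 1.0 = open. Feature names are illustrative.
judgements: Dict[str, float] = {
    "source_code": 0.5,
    "training_data": 0.0,
    "model_weights": 1.0,
    "license": 0.0,
    "documentation": 0.0,
}


def cumulative_score(j: Dict[str, float]) -> float:
    """(2) Combine all judgements into one cumulative openness score."""
    return sum(j.values())


def energy_label(j: Dict[str, float]) -> str:
    """(3) Discretize the share of the maximum score into labels A to E."""
    share = cumulative_score(j) / len(j)
    # Cut-offs are arbitrary assumptions, not taken from the paper.
    for label, threshold in (("A", 0.8), ("B", 0.6), ("C", 0.4), ("D", 0.2)):
        if share >= threshold:
            return label
    return "E"


def is_open(j: Dict[str, float], cutoff: float = 0.5) -> bool:
    """(4) Dichotomise: 'open' if the share of the maximum reaches the cutoff."""
    return cumulative_score(j) / len(j) >= cutoff


def single_feature(j: Dict[str, float], feature: str = "model_weights") -> bool:
    """(5) Let one feature alone be the arbiter of openness."""
    return j[feature] == 1.0


print(cumulative_score(judgements))  # composite score
print(energy_label(judgements))      # graded label
print(is_open(judgements))           # binary verdict
print(single_feature(judgements))    # verdict based on weights only
```

For this invented system the composite measures all point to a largely closed model, while the single-feature measure declares it open: the mechanism behind open-washing in miniature.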

Liesenfeld, A., & Dingemanse, M. (2024). Rethinking open source generative AI: open-washing and the EU AI Act. The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’24). PDF

Tags: diagram
The toron of language

The Grand Mosque of Djenné, Mali, during the 2023 annual replastering. Perhaps the most visually striking aspect of this earthen architecture is the wooden stakes called toron, which are an integral part of the design but also serve a functional role. Perched on the toron, people work to resurface the mud plaster of the walls. Interjections are like the toron: dotting the surface of language and helping to ensure its continued structural integrity. Photo reproduced with permission from Ousmane Makaveli.

Earthen architecture is a fitting metaphor for language because very little in language is truly permanent, and yet its structures appear stable and robust to us. As I write in my review of interjections for Annual Review of Linguistics:

Exposed to the elements, earthen structures erode under rain showers, sandstorms, and scorching sun. If they exist today it is because each time anew they are restored by human hands. The people who climb the toron to make and remake these buildings are the same who sit in their shade and pray in their halls. And so it is with our languages. If they exist today it is because they have passed through countless hands and heads and because each part of them has been propped up, picked apart, and put together again with the crucial support of interjections.

Dingemanse, M. (2024). Interjections at the heart of language. Annual Review of Linguistics, 10, 257–277. doi: 10.1146/annurev-linguistics-031422-124743 PDF

Tags: interaction, photo
Interjections and conversational structure

Excerpt of an English conversation. Transcript adapted from the Newport Beach corpus transcribed by Gail Jefferson, reference NB:1:1:19, timecode 4m57s. Clearly visible is a mix between simpler and more complex turns that relate to each other in socially normative ways, forming orderly sequences, often recursive. Conversation analysts and interactional linguists have documented many of the structural positions and social actions of conversation, including the preproposal of line 1 (Houtkoop-Steenstra 1990); the assessment and counterassessment in lines 2 and 5 (Heritage & Raymond 2005); the repair initiation with Huh? in line 6, which starts a side sequence (Jefferson 1972); displays of alignment like Mmhm in line 3 (Schegloff 1982); the sequence-closing third in line 8, hearable as either a no or an oh [Schegloff (1997, p. 507) transcribes it as “Oh”]; and the proposal in line 10, for which we now can see that the way was paved by the preproposal at line 1. A technical rendition of conversation like this shows how turns weave in and out of one another like traffic at a busy intersection, and reveals the pivotal role played by interjections, the traffic signals of conversation.

Dingemanse, M. (2024). Interjections at the heart of language. Annual Review of Linguistics, 10, 257–277. doi: 10.1146/annurev-linguistics-031422-124743 PDF

Tags: conversation, transcript
How frequent are interjections?

The occurrence of interjections in 10-min excerpts of informal dyadic conversations in six spoken languages. Every panel shows the turns of a dyadic exchange; colored dots indicate turns that belong to the top 10 most common one-word standalone turn formats in the language. These excerpts cannot support strong comparative or typological inferences; they are only meant to give an impression of the prevalence of interjections across unrelated languages.

Dingemanse, M. (2024). Interjections at the heart of language. Annual Review of Linguistics, 10, 257–277. doi: 10.1146/annurev-linguistics-031422-124743 PDF

Tags: interaction, panel, time series, typology
Interactive repair and computational complexity

A simple form of repair can alleviate computational demands of pragmatic reasoning. Agent-based simulations and complexity analysis comparing repairers (interactive repair-capable agents, without reasoning) and reasoners (pragmatic reasoning-capable agents without repair) across three lexicon sizes (shown here in three shades) with moderate ambiguity. (A) Reasoners suffer decreasing communicative success with growing lexicon size. (B) Computational complexity rises quadratically with lexicon size for reasoners and only linearly for repairers. (C) When needed, repair-capable agents take additional turns to reduce entropy (uncertainty about meaning) in cost-effective ways.

In short, this modelling work suggests that a simple form of interactive repair can be a significantly less computationally costly strategy than pragmatic reasoning for securing communicative success. Simulation data and complexity calculations are from van Arkel et al. 2020; this image is from Dingemanse & Enfield 2024.

Dingemanse, M., & Enfield, N. J. (2024). Interactive repair and the foundations of language. Trends in Cognitive Sciences, 28(1), 30–42. doi: 10.1016/j.tics.2023.09.003 PDF

Arkel, J. van, Woensdregt, M., Dingemanse, M., & Blokpoel, M. (2020). A simple repair mechanism can alleviate computational demands of pragmatic reasoning: simulations and complexity analysis. Proceedings of the 24th Conference on Computational Natural Language Learning. doi: 10.18653/v1/2020.conll-1.14 PDF

Tags: barplot, complexity, interaction, modelling
Turn-taking in the broad sense and in the narrow sense

Recent work on animal interaction uses the term ‘turn-taking’ to refer to alternation of signalling among conspecifics where interactants avoid overlap. This can be termed ‘turn-taking in the broad sense’ (TTB). This broad notion of turn-taking seldom considers the explicit defining rules of human turn-taking discovered and described in the field of conversation analysis. By contrast, turn-taking in the narrow sense (TTN) has more specific properties, the most important of which relate to flexibility in recipient selection, unfixed and recursively complex internal structure of turns, and the possibility of carrying out interactive repair across turns.

This distinction can serve as a conceptual anchoring point in comparative work. Without it, we risk making unwarranted inferences about one form of turn-taking (e.g., TTB) by carrying over assumptions based on another (e.g., TTN). For instance, while some work on animal communication loosely speaks of ‘turn-taking rules’ (and so invites comparison to the socially normative rules of human interaction), this tends to refer merely to the avoidance of overlap that characterizes TTB, not the intricate, socially normative rules of overlap resolution and turn allocation seen in TTN.

Dingemanse, M., & Enfield, N. J. (2024). Interactive repair and the foundations of language. Trends in Cognitive Sciences, 28(1), 30–42. doi: 10.1016/j.tics.2023.09.003 PDF

Tags: interaction, table
Transcript of a Siwu conversation featuring continuers

Transcript from a conversation in Siwu (a Kwa language of Ghana) in which Foster and Beatrice talk about house building and discuss why there might be several unfinished compound houses in their hometown in eastern Ghana. The sum of Beatrice’s contributions in this excerpt is a series of mm-like tokens, which brings home one important function of this kind of item: acknowledging the other’s turn while passing the opportunity to take the floor. But the forms are not all the same: they come in multiple variants and appear to be finely adjusted to their sequential environment.

Dingemanse, M. (2023). Interjections. In E. van Lier (Ed.), The Oxford Handbook of Word Classes (pp. 477–491). Oxford University Press. https://doi.org/10.31234/osf.io/ngcrs PDF

Tags: continuers, conversation, interaction, transcript
Anatomy and frequency of interactive repair

A With interactive repair, another participant initiates repair, inviting a repair solution by the first; the repair initiation is a pivot, pointing both back and forward. B While a fitted response is preferred, initiating repair is always a possible next move; likewise, within repair, while a restricted format is preferred, an open format is always an option. C Across diverse languages, formats for interactive repair fall into three types, depending on how they target the trouble in prior turn and the kind of response they typically invite; these can be ranked from less to more specific in terms of the grasp of the trouble source they display. D Empirical cumulative distribution of independent repair sequences (black curve) as they occur over time in informal conversation in a global sample of 12 languages (grey curves). Across languages, the steepest part of the slope is around 17 s, the average 84 s, and nearly all sequences occur within a 4-min window from the last.

Dingemanse, M., & Enfield, N. J. (2024). Interactive repair and the foundations of language. Trends in Cognitive Sciences, 28(1), 30–42. doi: 10.1016/j.tics.2023.09.003 PDF

Tags: diagram, interaction, panel
How conversational data challenges speech recognition (ASR)

A Word error rates (WER) for five speech-to-text systems in six languages. B One minute of English conversation as annotated by human transcribers (top) and by five speech-to-text systems, showing that while most do some diarization, all underestimate the number of transitions and none represent overlapping turns (Whisper offers no diarization). C Speaker transitions and distribution of floor transfer offset times (all languages), showing that even ASR systems that support diarization do not represent overlapping annotations in their output.
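
For readers unfamiliar with the measure in panel A: word error rate is conventionally the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sketch below is a generic textbook implementation, not the evaluation code used in the paper; the example strings are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: the system drops two short interjections.
human = "oh yeah mm that sounds good"
system = "yeah that sounds good"
print(round(word_error_rate(human, system), 3))
```

Note that WER only counts words; it says nothing about who spoke when, which is why panels B and C look at diarization and overlap separately.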

Liesenfeld, A., Lopez, A., & Dingemanse, M. (2023). The timing bottleneck: Why timing and overlap are mission-critical for conversational user interfaces, speech recognition and dialogue systems. Proceedings of the 24th Annual SIGdial Meeting on Discourse and Dialogue, 482–495. doi: 10.18653/v1/2023.sigdial-1.45 PDF

Tags: density, diagram, interaction, panel, time series, timing
How speech recognition warps dialog act classification

How different speech recognition engines warp dialog act classification in the same dataset of conversational English. For 8 frequent dialog acts, coloured lines show how dialog acts based on ASR output deviate from those based on human transcripts of the same data (baseline). Dot size scales to number of times a tag is assigned. Only the most frequently assigned dialog acts (with at least 25 tokens in at least one dataset) are shown here. Mean absolute percentage deviations by ASR system: nemo 27.8%, amazon 31.4%, whisper 33.8%, rev 47.4%.
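
One plausible reading of the summary statistic in this caption is sketched below in Python. It assumes that deviations are computed per dialog act from tag counts and then averaged; the caption does not spell out the exact procedure, and the tag names and counts here are invented.

```python
from typing import Dict


def mean_absolute_percentage_deviation(
    baseline: Dict[str, int], asr: Dict[str, int]
) -> float:
    """Average, over dialog acts, of |ASR count - baseline count| / baseline count."""
    deviations = [
        abs(asr.get(tag, 0) - count) / count * 100
        for tag, count in baseline.items()
        if count > 0  # a zero baseline has no defined percentage deviation
    ]
    if not deviations:
        raise ValueError("no dialog acts with a non-zero baseline count")
    return sum(deviations) / len(deviations)


# Invented counts: tags assigned on human transcripts versus ASR output.
human_counts = {"statement": 100, "backchannel": 50}
asr_counts = {"statement": 120, "backchannel": 25}
print(mean_absolute_percentage_deviation(human_counts, asr_counts))
```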

Liesenfeld, A., Lopez, A., & Dingemanse, M. (2023). The timing bottleneck: Why timing and overlap are mission-critical for conversational user interfaces, speech recognition and dialogue systems. Proceedings of the 24th Annual SIGdial Meeting on Discourse and Dialogue, 482–495. doi: 10.18653/v1/2023.sigdial-1.45 PDF

Tags: diagram, interaction
Opening up ChatGPT

ChatGPT is sufficiently well known to warrant critical scrutiny, and for this project we wrote a paper, developed a website where we track open-source instruction-tuned large language models, designed a poster for presentation at the ACM conference on Conversational User Interfaces (CUI’23) and, yes, even designed a logo that combines a key image of the open source movement with a variation on ChatGPT’s corporate logo.

Liesenfeld, A., Lopez, A., & Dingemanse, M. (2023). Opening up ChatGPT: tracking openness, transparency, and accountability in instruction-tuned text generators. ACM Conference on Conversational User Interfaces (CUI ’23), July 19-21, Eindhoven. doi: 10.1145/3571884.3604316 PDF

Tags: illustration, logo, table
Acts of kindness around the world

A global comparative study of 8 languages on 5 continents finds that people overwhelmingly like to help one another, independent of differences in language, culture or environment. This is a surprising finding from the perspective of anthropological and economic research, which has tended to foreground differences in how people work together and share resources.

Rossi, G., Dingemanse, M., Floyd, S., Baranova, J., Blythe, J., Kendrick, K. H., Zinken, J., & Enfield, N. J. (2023). Shared cross-cultural principles underlie human prosocial behavior at the smallest scale. Scientific Reports, 13(1), 6057. doi: 10.1038/s41598-023-30580-5 PDF

Tags: map
Iconicity ratings

Iconicity ratings are a key tool in psycholinguistic studies of vocabulary. This figure shows the distribution of ratings for 14,000 English words in two ways: (a) A kernel density plot of the distribution of average ratings; the dashed line indicates a normal distribution with the same mean and standard deviation; (b) standard deviations across raters (y-axis) as a function of average rating (x-axis). Extreme values are rarer, but people agree more strongly on them. (Figure by first author Bodo Winter, open data here.)

Winter, B., Lupyan, G., Perry, L. K., Dingemanse, M., & Perlman, M. (2023). Iconicity ratings for 14,000+ English words. Behavior Research Methods. doi: 10.3758/s13428-023-02112-6 PDF

Tags: density, graph, scatterplot
Iconicity measures across tasks

Discriminability of iconicity measures from different tasks. Iconicity ratings have been transformed so that they vary between 0 and 1 (to compare with guessing accuracies). Guesses —where people try to guess the meaning of an iconic word, or the word form belonging to a given meaning— appear to be somewhat more evenly spread than ratings. Iconicity ratings by native speakers (rightmost, showing data from Thompson et al. 2020) are on average higher than iconicity ratings by people who don’t speak the language whose words they rate, confirming the notion that native speakers will generally feel that words of their own language are more iconic. (Figure by Bonnie McLean, open data here.)
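
The rescaling step mentioned in the caption can be illustrated with a simple linear transformation. This is a sketch under an assumption: the caption only says that ratings were transformed to vary between 0 and 1, so rescaling from the bounds of the rating scale (here a hypothetical 1 to 7 scale) is one common way to do it, not necessarily the one used in the paper.

```python
from typing import List, Sequence


def rescale_to_unit_interval(
    ratings: Sequence[float], scale_min: float, scale_max: float
) -> List[float]:
    """Map ratings linearly from [scale_min, scale_max] onto [0, 1]."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be larger than scale_min")
    span = scale_max - scale_min
    return [(rating - scale_min) / span for rating in ratings]


# Hypothetical average ratings on a 1 to 7 scale.
print(rescale_to_unit_interval([1, 4, 7], scale_min=1, scale_max=7))
```

After this step, a rating and a guessing accuracy (a proportion correct) live on the same 0 to 1 range and can be plotted side by side.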

McLean, B., Dunn, M., & Dingemanse, M. (2023). Two measures are better than one: combining iconicity ratings and guessing experiments for a more nuanced picture of iconicity in the lexicon. Language and Cognition, 15(4), 719–739. doi: 10.1017/langcog.2023.9 PDF

Tags: density, panel, scatterplot
Beyond Single-Mindedness

Seen from Earth, the movements of celestial bodies display near-intractable complexity. When taking not a single vantage point but multiple (here, Sun and Earth), suddenly the picture changes, and new forms of order become visible (Sousanis, 2015). Likewise, key concerns of cognitive science may be illuminated by a change of perspective that locates cognition not in isolated but in interacting minds.

Image from Dingemanse et al. (2023). Sources: Left: Encyclopaedia Britannica (1771), after a similar engraving by Cassini (via); Right: Copernicus (1543) De revolutionibus orbium cœlestium.

Dingemanse, M., Liesenfeld, A., Rasenberg, M., Albert, S., Ameka, F. K., Birhane, A., Bolis, D., Cassell, J., Clift, R., Cuffari, E., De Jaegher, H., Dutilh Novaes, C., Enfield, N. J., Fusaroli, R., Gregoromichelaki, E., Hutchins, E., Konvalinka, I., Milton, D., Rączaszek-Leonardi, J., … Wiltschko, M. (2023). Beyond Single-Mindedness: A Figure-Ground Reversal for the Cognitive Sciences. Cognitive Science, 47. doi: 10.1111/cogs.13230 PDF

Tags: illustration
Multimodal effort in repair sequences

Boxplots showing the joint amount of multimodal effort invested by both participants to resolve the interactional trouble. The boxes represent the interquartile range; the middle line the median; the whiskers the minimum and maximum scores (outliers excluded). Every dot represents a repair sequence, i.e., repair initiation and repair solution together. As the specificity of repair formats goes up, joint multimodal effort invested goes down.

Rasenberg, M., Pouw, W., Özyürek, A., & Dingemanse, M. (2022). The multimodal nature of communicative efficiency in social interaction. Scientific Reports, 12(1), 19111. doi: 10.1038/s41598-022-22883-w PDF

Tags: boxplot, graph
Vowel space and the colour circle

Illustration of how colour space is mapped onto vowel space based on the findings for >1100 participants in Cuskley, Dingemanse et al. 2019. Red usually goes with back vowels like /a/, while light hues like yellow and green go with front vowels like /i/ and darker hues go with /u/ and /o/. None of this is deterministic: associations vary across people and this just represents one of the most common solutions on average. Made by MD for the classroom materials in Van Leeuwen & Dingemanse 2022.

van Leeuwen, T., & Dingemanse, M. (2022). Samenwerkende zintuigen. In S. Dekker & H. Kause (Eds.), Wetenschappelijke doorbraken de klas in! (pp. 85–116). Wetenschapsknooppunt Radboud Universiteit. PDF

Cuskley, C., Dingemanse, M., Kirby, S., & van Leeuwen, T. M. (2019). Cross-modal associations and synesthesia: Categorical perception and structure in vowel–color mappings in a large online sample. Behavior Research Methods, 51(4), 1651–1675. doi: 10.3758/s13428-019-01203-7 PDF

Tags: diagram, illustration, phonetics, synaesthesia
Simulating phonetic evolution

Plots of where in a phonetic possibility space different words end up after 10,000 rounds of interaction, across 20 independent simulation runs (each cloud of 100 exemplar dots/triangles represents a single word at round 10,000 of a single simulation run). Blue, yellow, green and orange are regular words; purple is the continuer word. On each independent simulation run, all words are initialised at randomly selected positions in the space. A shows a selection of 6 separate simulation runs chosen for illustrative purposes (showing how regular words end up in different positions); B shows the end-state of all 20 simulation runs overlaid. Parameter settings: (i) minimal effort bias 3 times as strong for the continuer word (G=1250) as for regular vocabulary words (G=5000), and (ii) the bias for reuse of features (i.e. segment-similarity bias) is not applied to the continuer category.

Dingemanse, M., Liesenfeld, A., & Woensdregt, M. (2022). Convergent cultural evolution of continuers (mmhm). The Evolution of Language: Proceedings of the Joint Conference on Language Evolution (JCoLE), 61–67. PDF

Tags: panel, phonetics, simulation
Sequential context of continuers

A Candidate continuer forms in 10 unrelated languages, B shown in their natural sequential ecology (annotations as in the original data), C with spectrograms and pitch traces of representative tokens made using the Parselmouth interface to Praat (Jadoul et al., 2018; Boersma & Weenink, 2013).

Dingemanse, M., Liesenfeld, A., & Woensdregt, M. (2022). Convergent cultural evolution of continuers (mmhm). The Evolution of Language: Proceedings of the Joint Conference on Language Evolution (JCoLE), 61–67. PDF

Tags: continuers, panel, phonetics, sequence, spectrogram, speech, time series
Sampling response tokens

A. Overview of included languages with dataset size in hours and top 3 sequentially identified response tokens as transcribed in the corpus. B. Location of largest speech community. C. Assessing the impact of sparse data on UMAP projections using three samples of Dutch response tokens. A look at the full dataset (a) and random-sampled subsets of decreasing size (b, c) suggests isomorphism across scales and interpretability of clustering solutions as small as 150 tokens.

Liesenfeld, A., & Dingemanse, M. (2022). Bottom-up discovery of structure and variation in response tokens (‘backchannels’) across diverse languages. Proceedings of Interspeech 2022, 1126–1130. doi: 10.21437/Interspeech.2022-11288 PDF

Tags: clustering, continuers, map, panel, table, typology, UMAP
Cultural evolution of continuers

Continuers (frequent standalone utterances like mm-hm that people often use in succession) differ in interesting ways from other elements that are common, like top tokens (the most common words in a corpus) and discontinuers (frequent standalone utterances that people do not produce in successive streaks). A. Length of tokens for continuers, discontinuers and top tokens in 32 languages. B. Frequencies of major sound classes across types. Vowel nuclei occur across types, but continuers stand out for their preferences for nasals. C. Random forest analysis of 118 continuer forms in 32 spoken languages showing the top 10 most predictive phonemes (out of 29 attested).

Dingemanse, M., Liesenfeld, A., & Woensdregt, M. (2022). Convergent cultural evolution of continuers (mmhm). The Evolution of Language: Proceedings of the Joint Conference on Language Evolution (JCoLE), 61–67. PDF

Tags: continuers, graph, panel, phonetics, phonology, typology
Clustering response tokens

Response tokens like English mhmm, uhuhh, yeah or Catalan mm, sí, vale are tricky to study in the wild: their phonetic realizations can be quite different from how they are transcribed. Here we use UMAP, a method for dimensionality reduction used in bioacoustics and other fields, to explore the shape of inventories of response tokens in 16 languages. Every point represents a single response token; the closer two points are the more similar they are acoustically. Spectrograms drawn around the rim of the plots provide a direct view of the acoustic structure of tokens and enable quick sanity checks.

Liesenfeld, A., & Dingemanse, M. (2022). Bottom-up discovery of structure and variation in response tokens (‘backchannels’) across diverse languages. Proceedings of Interspeech 2022, 1126–1130. doi: 10.21437/Interspeech.2022-11288 PDF

Tags: clustering, continuers, phonetics, scatterplot, spectrogram, speech, UMAP
Continuers and repair initiators

Two-panel figure showing (A) Typical sequential structures for continuers versus repair initiators. Continuers are recurring items found in alternation with unique turns (a, c). Repair initiators are recurring items found between a unique turn a and its near-copy a’. (B) Prevalence of sequentially identified candidate continuers and repair initiators, demonstrating the potential of using sequential patterns to identify them in language-agnostic ways. Most frequent formats exemplified in 10 languages (9 phyla), from left to right: Akhoe Hai||om, Hausa, Tehuelche, Gutob, Kerinci, Siwu, Mandarin, German, Korean, Dutch.

Another useful feature of this diagram is that it makes it possible to infer a minimum corpus size for spotting interactional resources of interest. For instance, the smallest corpora among the 10 languages for which tokens are exemplified in the figure are Akhoe Hai||om and Hausa, both corpora that make up less than one hour in total. This appears to be a lower bound for identifying phenomena like repair, though continuers are about an order of magnitude more frequent and so can be reliably found even in smaller corpora.

Liesenfeld, A., & Dingemanse, M. (2022). Building and curating conversational corpora for diversity-aware language science and technology. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), 1178–1192. https://aclanthology.org/2022.lrec-1.126 PDF

Tags: continuers, interaction, panel, repair, scatterplot, time series
Quality control for conversational corpora

Conversational data can be transcribed in many ways. This panel provides a quick way to gauge the quality of transcriptions, here illustrated with data from Ambel (Arnold, 2017). A. Distribution of the timing of dyadic turn-transitions with positive values representing gaps between turns and negative values representing overlaps. This kind of normal distribution centered around 0 ms is typical; when corpora starkly diverge from this it usually indicates noninteractive data, or segmentation methods that do not represent the actual timing of utterances. B. Distribution of transition time by duration, allowing the spotting of outliers and artefacts of automation (e.g. many turns of similar durations). C. A frequency/rank plot allows a quick sanity check of expected power law distributions and a look at the most frequent tokens in the corpus. D. Three randomly selected 10 second stretches of dyadic conversation give an impression of the timing and content of annotations in the corpus.
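
The timing measure behind panel A can be sketched in a few lines of Python. This is a simplified illustration rather than the pipeline used for the figure: it assumes time-aligned annotations with a speaker label, start and end time in milliseconds, and it only considers transitions between consecutive annotations by different speakers. The `Turn` type and the example values are invented.

```python
from typing import List, NamedTuple


class Turn(NamedTuple):
    speaker: str
    begin_ms: int
    end_ms: int


def transition_offsets(turns: List[Turn]) -> List[int]:
    """Offset at each speaker change: positive = gap, negative = overlap."""
    ordered = sorted(turns, key=lambda turn: turn.begin_ms)
    offsets = []
    for previous, following in zip(ordered, ordered[1:]):
        if previous.speaker != following.speaker:
            offsets.append(following.begin_ms - previous.end_ms)
    return offsets


# Invented dyadic excerpt.
excerpt = [
    Turn("A", 0, 1000),
    Turn("B", 1200, 1500),  # starts 200 ms after A ends: a gap
    Turn("A", 1400, 3000),  # starts 100 ms before B ends: an overlap
    Turn("A", 3100, 3500),  # same speaker continues: not a transition
    Turn("B", 3500, 4000),  # starts exactly when A ends
]
print(transition_offsets(excerpt))
```

A corpus in which such offsets are never negative, or in which they cluster far from 0 ms, is a candidate for the segmentation problems mentioned above.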

Liesenfeld, A., & Dingemanse, M. (2022). Building and curating conversational corpora for diversity-aware language science and technology. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), 1178–1192. https://aclanthology.org/2022.lrec-1.126 PDF

Tags: density, duration, graph, panel, timing, turn-taking
How ASR training data differs from real conversation

L: Distributions of durations of utterances and sentences (in ms) in corpora of informal conversation (blue) and CommonVoice ASR training sets (red) in Hungarian, Dutch, and Catalan. Modal duration and annotation content differ dramatically by data type: 496 ms (6 words, 27 characters) for conversational turns and 4642 ms (10 words, 58 characters) for ASR training items. R: Visualization of tokens that feature more prominently in conversational data (blue) and ASR training data (red) in Dutch. Source data: 80k randomly sampled items from the Corpus of Spoken Dutch (Taalunie, 2014) and the Common Voice corpus for automatic speech recognition in Dutch (Ardila et al., 2020), based on the Scaled F score metric, plotted using scattertext (Kessler, 2017).

Liesenfeld, A., & Dingemanse, M. (2022). Building and curating conversational corpora for diversity-aware language science and technology. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), 1178–1192. https://aclanthology.org/2022.lrec-1.126 PDF

Tags: density, duration, graph, panel, scatterplot
A survey of conversational corpora

Under the auspices of various language documentation projects, language resources have been collected in more and more communities across the world, and these often include at least some conversational data. Such corpora harbour important insights for language science and technology. This map plots >60 corpora found to offer at least some conversational data.

Liesenfeld, A., & Dingemanse, M. (2022). Building and curating conversational corpora for diversity-aware language science and technology. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), 1178–1192. https://aclanthology.org/2022.lrec-1.126 PDF

Tags: map
Repair across species

Types of redoings of communicative behaviour and their interactional contingency. This diagram sums up the species-agnostic framework for studying communicative repair we introduce in a wide-ranging review of animal communication systems.

Heesen, R., Fröhlich, M., Sievers, C., Woensdregt, M., & Dingemanse, M. (2022). Coordinating social action: A primer for the cross-species investigation of communicative repair. Philosophical Transactions of the Royal Society B: Biological Sciences, 377(1859), 20210110. doi: 10.1098/rstb.2021.0110 PDF

Tags: diagram, sequence
From text to talk

Most NLP methods and models focus on text rather than talk. What are they missing? Scattertext plot of words and phrases characteristic of spoken interaction (green) versus written text (purple) in English, with words most characteristic of conversational interaction in the upper left (and shown in a separate inset on the right). High-frequency metacommunicative interjections like uhhuh, hm, wow, um are most typical of talk, and most often underrepresented in text.

Dingemanse, M., & Liesenfeld, A. (2022). From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 5614–5633. doi: 10.18653/v1/2022.acl-long.385 PDF

Tags: interaction, scatterplot
Mhmm over time

Even apparently universal patterns (like the use of ‘mhm’ during tellings) can show important cross-cultural differences. A. Continuers (marked ○) are among the most frequent recipient behaviours in both English and Korean, shown here in four 80 second stretches of tellings. B. However, the relative frequency of continuers is about twice as high in Korean based on 100 random samples of 80 second segments in both languages: on average, 21% of turns are continuers in Korean, against 9% of turns in English (measures expressed this way to control for speech rate differences).

Dingemanse, M., & Liesenfeld, A. (2022). From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 5614–5633. doi: 10.18653/v1/2022.acl-long.385 PDF

Tags: continuers, conversation, frequency, sequence, time series, typology
Timing of yes/no sequences

Assessing the timing of turn-taking requires careful operationalisation. The largest comparative study so far (Stivers et al., 2009) looked at polar questions and their answers in order to have a directly comparable sequential context.

In our paper on conversational corpora, we use this same sequential context, and compare it to the larger set of dyadic speaker transitions in interaction. Given the broad-scale comparability of the overall timing distributions (in grey) and the more controlled subset of at least 250 question-answer sequences per language (in black), we conclude that QA sequences can act as a useful proxy for timing in general (supporting Stivers et al. 2009), but also that QA sequences are not necessary for a relatively robust impression of overall timing.

Dingemanse, M., & Liesenfeld, A. (2022). From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 5614–5633. doi: 10.18653/v1/2022.acl-long.385 PDF

Tags: density, interaction, timing, turn-taking
Vowel-colour associations

L: The vowel space with colour associations by a synaesthete. R: The same vowels displayed according to tongue position when produced. Visualization: Christine Cuskley & Mark Dingemanse. For an interactive version of this visual, see here.

van Leeuwen, T., & Dingemanse, M. (2022). Samenwerkende zintuigen. In S. Dekker & H. Kause (Eds.), Wetenschappelijke doorbraken de klas in! (pp. 85–116). Wetenschapsknooppunt Radboud Universiteit. PDF

Cuskley, C., Dingemanse, M., Kirby, S., & van Leeuwen, T. M. (2019). Cross-modal associations and synesthesia: Categorical perception and structure in vowel–color mappings in a large online sample. Behavior Research Methods, 51(4), 1651–1675. doi: 10.3758/s13428-019-01203-7 PDF

Tags: diagram, illustration, phonetics, popsci, speech, synaesthesia
Zipf in conversation

Frequency/rank distributions of tokenized items (‘words’) and recurring turn formats in conversational corpora with at least 20 such turn formats, representing 22 languages (8 phyla). Tokenized items (blue) show a linear frequency/rank relation in log/log space. Recurring turn formats (whether one-word ○ or multi-word ＋) appear to obey a similar frequency/rank distribution for the 20% of turns that occur >20 times (purple), tapering off towards lower frequencies and unique turns (grey). Fit fluctuates with corpus size and the parallelism of distributions is most apparent in larger corpora.
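
A minimal sketch of how such a frequency/rank relation can be checked: rank items by frequency and estimate the slope of log frequency against log rank. The function names and the toy data are assumptions for illustration; the paper's own fitting procedure may differ.

```python
import math
from collections import Counter

def rank_frequency(items):
    """Return (rank, frequency) pairs, most frequent item first."""
    counts = sorted(Counter(items).values(), reverse=True)
    return list(enumerate(counts, start=1))

def loglog_slope(pairs):
    """Least-squares slope of log(frequency) on log(rank)."""
    xs = [math.log(r) for r, _ in pairs]
    ys = [math.log(f) for _, f in pairs]
    n = len(pairs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy data: item k occurs 120 // k times (an idealised Zipfian sample).
tokens = [f"w{k}" for k in range(1, 7) for _ in range(120 // k)]
pairs = rank_frequency(tokens)
slope = loglog_slope(pairs)
print(pairs, slope)
```

A slope close to -1 corresponds to the straight line in log/log space; the same two functions can be applied to whole turns instead of tokens to compare the two distributions.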

Dingemanse, M., & Liesenfeld, A. (2022). From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 5614–5633. doi: 10.18653/v1/2022.acl-long.385 PDF

Tags: frequency, graph
/r/ for rough in Indo-European

A Across the Indo-European language family, the proportion of rough words with /r/ is much higher than the proportion of smooth words with /r/; B Each dot represents a language (size of the circle = number of words); whiskers show 95% Bayesian credible intervals corresponding to the mixed-effects Bayesian logistic regression analysis indicating that rough words have a much higher proportion of /r/ (posterior mean = 63%) than smooth words (posterior mean = 35%).

Winter, B., Sóskuthy, M., Perlman, M., & Dingemanse, M. (2022). Trilled /r/ is associated with roughness, linking sound and touch across spoken languages. Scientific Reports, 12(1), 1035. doi: 10.1038/s41598-021-04311-7 PDF

Tags: graph, iconicity, map, panel, scatterplot
/r/ for rough

How phonemes relate to roughness ratings. Left: The ten most predictive phonemes in a random forest analysis for A English and B Hungarian; the vertical black line corresponds to the absolute value of the least predictive phoneme, which is a heuristic cut-off rule for predictors that do not contribute. Right: Boxplots (whiskers = smallest, largest value within 1.5× IQR) for words with and without /r/ in English and Hungarian.

Winter, B., Sóskuthy, M., Perlman, M., & Dingemanse, M. (2022). Trilled /r/ is associated with roughness, linking sound and touch across spoken languages. Scientific Reports, 12(1), 1035. doi: 10.1038/s41598-021-04311-7 PDF

Tags: boxplot, iconicity, random forest
Rolling /r/ around the world

Map accompanying news coverage of our study of the link between /r/ and rough textures. The red data points represent languages that often feature /r/ in words for rough textures but not in words for smooth textures. Blue data points, much rarer, are cases where the pattern is the reverse. The map shows that overwhelmingly, languages prefer to express rough meanings with /r/ sounds (if they have them).

Winter, B., Sóskuthy, M., Perlman, M., & Dingemanse, M. (2022). Trilled /r/ is associated with roughness, linking sound and touch across spoken languages. Scientific Reports, 12(1), 1035. doi: 10.1038/s41598-021-04311-7 PDF

Tags: map, popsci
Gesture kinematics

Setup of a study using motion tracking to investigate continuous properties of evolving manual signals. Panel a: Seed gestures for a fixed set of meanings are learned by next generations in an iterative learning experiment. Panel b: Using motion tracking, we derive automatic kinematic measures of entropy, temporal variability and intermittency over time and over generations.

Pouw, W., Dingemanse, M., Motamedi, Y., & Özyürek, A. (2021). A Systematic Investigation of Gesture Kinematics in Evolving Manual Languages in the Lab. Cognitive Science, 45(7), e13014. doi: 10.1111/cogs.13014 PDF

Tags: gesture, graph
Bootstraps, bridges and scaffolds

Graphic I made for a talk about our paper on the roles of iconicity in word learning. As part of this paper we briefly review the role of metaphors in theories about language & development.

Nielsen, A. K. S., & Dingemanse, M. (2021). Iconicity in Word Learning and Beyond: A Critical Review. Language and Speech, 64(1), 52–72. doi: 10.1177/0023830920914339 PDF

Tags: illustration
The iconicity boom

Proportional number of publications cataloged in Web of Science (1900–2017), showing concurrent upsurges in six topics related to iconicity (corrected for overall publication volume).

Nielsen, A. K. S., & Dingemanse, M. (2021). Iconicity in Word Learning and Beyond: A Critical Review. Language and Speech, 64(1), 52–72. doi: 10.1177/0023830920914339 PDF

Tags: graph, scatterplot, time series
Five dimensions of alignment

The relationship between the two parts of a behavior pair can vary on five dimensions, as outlined in this table. For each dimension, we visualize two different relationships between instances of behavior—one with a solid arrow and one with a dashed arrow. For meaning, we use tangram figures to visualize the referent of speech and/or gestures.

Rasenberg, M., Özyürek, A., & Dingemanse, M. (2020). Alignment in Multimodal Interaction: An Integrative Framework. Cognitive Science, 44(11). doi: 10.1111/cogs.12911 PDF

Tags: diagram, sequence, table
Computational complexity of repair and pragmatic reasoning

A comparison of computational complexity (in basic computation steps) by agent type and lexicon size. The main take-away from this figure is that complexity increases exponentially with lexicon size for pragmatic agents, but only linearly for interactional agents. Three types of agent are compared, each with three lexicon sizes. The Interactional agent is a model equipped with a simple form of repair and no pragmatic reasoning. The other two agents cannot initiate repair, but instead feature pragmatic reasoning. The Frugally Pragmatic agent is a model that only uses complex pragmatic reasoning above a certain uncertainty threshold; the Fully Pragmatic agent always uses it. For interactional agents with a 6 × 4 lexicon no data is visible as the computation cost is very small (48) relative to the range of the y-axis.

Arkel, J. van, Woensdregt, M., Dingemanse, M., & Blokpoel, M. (2020). A simple repair mechanism can alleviate computational demands of pragmatic reasoning: simulations and complexity analysis. Proceedings of the 24th Conference on Computational Natural Language Learning. doi: 10.18653/v1/2020.conll-1.14 PDF

Tags: barplot, graph, simulation
Shooing words

Shooing words —words that people use to chase away chickens— turn out to be highly similar across unrelated languages. These illustrations by Josje van Koppen accompanied a write-up about my serendipitous finding in popular science magazine Onze Taal.

The actual table from my paper looks a lot less exciting, but it does contain additional information about language families and about words for ‘chicken’ in the same set of languages. The basic conclusion is that words for ‘shoo’, but not ‘chicken’, show strong convergence towards sibilant sounds in 17 languages from 11 unrelated language families.

Illustrations from: Renckens, Erica. “‘Ksst!’ Het Lokken En Wegjagen van Dieren.” Onze Taal, 2020.

Dingemanse, M. (2020). Recruiting assistance and collaboration: a West-African corpus study. In S. Floyd, G. Rossi, & N. J. Enfield (Eds.), Getting others to do things: A pragmatic typology of recruitments (pp. 369–421). Language Science Press. PDF

Tags: illustration, popsci, table
Three barreled request

After a check of starting conditions (‘take uh:. is there water there?’, line 1), a sequence of pointing gestures accompanies a three-barreled request: ‘take this gallon’, ‘pour some water’, ‘put it on the fire’.

Dingemanse, M. (2020). Recruiting assistance and collaboration: a West-African corpus study. In S. Floyd, G. Rossi, & N. J. Enfield (Eds.), Getting others to do things: A pragmatic typology of recruitments (pp. 369–421). Language Science Press. PDF

Tags: photo, sequence, transcript
Getting others to do things

Interactional challenges to be negotiated in recruitment sequences, along with some of the interactional practices mobilized to address them.

Not a figure, I know, but sometimes tables are the only way to bring multidimensional problem space into view. In this case, the table is also a map to the resources discussed in the paper.

Dingemanse, M. (2020). Recruiting assistance and collaboration: a West-African corpus study. In S. Floyd, G. Rossi, & N. J. Enfield (Eds.), Getting others to do things: A pragmatic typology of recruitments (pp. 369–421). Language Science Press. https://doi.org/10.5281/zenodo.4018387 PDF

Tags: interaction, table
Iconicity and funniness ratings

The intersection of iconicity and funniness ratings for 1419 words. A: Scatterplot of iconicity and funniness ratings in which each dot corresponds to a word. A loess function generates the smoothed conditional mean with 0.95 confidence interval. Panels B and C show the distribution of iconicity and funniness ratings in this dataset.

Dingemanse, M., & Thompson, B. (2020). Playful iconicity: structural markedness underlies the relation between funniness and iconicity. Language and Cognition, 12(1), 203–224. doi: 10.1017/langcog.2019.49 PDF

Tags: density, iconicity, panel, scatterplot
“Hand me that phone”

Bella is holding Aku’s phone and taking a call Aku asked her to pick up. Speaking into the phone, she notes she is ‘not sister Aku’. When it becomes clear the caller wants Aku, Aku asks Bella to give the phone back, adding a gesture of reaching out to receive the phone. These kinds of events (‘recruitments’) are frequent in everyday interaction and show how people weave together talk and action to get others to do things.

Dingemanse, M. (2020). Recruiting assistance and collaboration: a West-African corpus study. In S. Floyd, G. Rossi, & N. J. Enfield (Eds.), Getting others to do things: A pragmatic typology of recruitments (pp. 369–421). Language Science Press. https://doi.org/10.5281/zenodo.4018387 PDF

Tags: interaction, photo
The space between our heads

Not strictly a scientific visualization, and not by me. Still included here because it is a compelling illustration of the central point of this essay on brain-to-brain interfaces, which deals with naïve ideas about a cyberpunk future in which we’d be connected by wires instead of words. (Source of the image is Technology Review, who got it from Shutterstock.)

Dingemanse, M. (2020). Der Raum zwischen unseren Köpfen. Technology Review, 2020(13), 10–15. PDF

Dingemanse, M. (2017). Brain-to-brain interfaces and the role of language in distributing agency. In N. J. Enfield & P. Kockelman (Eds.), Distributed Agency (pp. 59–66). Oxford University Press. PDF

Tags: illustration, popsci
Playful iconicity

Illustration accompanying news coverage in NRC of our paper on playful iconicity: when words sound like what they mean. By Jet Peters.

Dingemanse, M., & Thompson, B. (2020). Playful iconicity: structural markedness underlies the relation between funniness and iconicity. Language and Cognition, 12(1), 203–224. doi: 10.1017/langcog.2019.49 PDF

Tags: illustration, popsci
Structural markedness

The relation between structural markedness and funniness ratings (A), iconicity ratings (B), and funniness and iconicity together (C), in a set of 1419 English words. Each dot represents 14 or 15 words. Solid line with smoothed mean shows cumulative markedness. Other lines show relative prevalence of complex onsets (flap), codas (clunk), and verbal diminutives (drizzle). Higher structural markedness goes together with higher iconicity and funniness ratings. This supports the theory of structural markedness as a metacommunicative cue.

Dingemanse, M., & Thompson, B. (2020). Playful iconicity: structural markedness underlies the relation between funniness and iconicity. Language and Cognition, 12(1), 203–224. doi: 10.1017/langcog.2019.49 PDF

Tags: iconicity, scatterplot
Definitions of ideophones

Given the long history of interest in ideophones, it is remarkable that there are relatively few definitions intended for comparative use. This table compares a number of accounts of ideophones used or intended for cross-linguistic comparisons. It shows how every definition has its own points of emphasis, but also that across the board, there is strong convergence in the properties proposed as fundamental to understanding ideophones.

Dingemanse, M. (2019). “Ideophone” as a comparative concept. In K. Akita & P. Pardeshi (Eds.), Ideophones, Mimetics, Expressives (pp. 13–33). John Benjamins. PDF

Tags: table
Vowel-colour associations

Vowel-colour associations for 1164 participants (central panel), showing, clockwise from bottom left, (a) a participant with very low structure yet high consistency across trials, probably a false positive synaesthete; (b) a typical nonsynaesthete with mappings that are both inconsistent and unstructured; (c) a middling participant with significant structure but inconsistent choices across trials; (d) a highly structured but inconsistent participant; and (e) a typical vowel-colour synaesthete, with highly structured, consistent and categorical mappings.

Cuskley, C., Dingemanse, M., Kirby, S., & van Leeuwen, T. M. (2019). Cross-modal associations and synesthesia: Categorical perception and structure in vowel–color mappings in a large online sample. Behavior Research Methods, 51(4), 1651–1675. doi: 10.3758/s13428-019-01203-7 PDF

Tags: graph, panel, scatterplot, synaesthesia
Codability of sensory domains

The hierarchy of the senses across languages according to the mean codability of each domain, with the presumed universal Aristotelian hierarchy on top. There is no universal hierarchy of the senses across diverse languages worldwide. (Figure by coauthor Sean G. Roberts, open data here.)

Majid, A., Roberts, S. G., Cilissen, L., Emmorey, K., Nicodemus, B., O’Grady, L., Woll, B., LeLan, B., de Sousa, H., Cansler, B. L., Shayan, S., de Vos, C., Senft, G., Enfield, N. J., Razak, R. A., Fedden, S., Tufvesson, S., Dingemanse, M., Ozturk, O., … Levinson, S. C. (2018). Differential coding of perception in the world’s languages. Proceedings of the National Academy of Sciences, 115(45), 11369–11376. doi: 10.1073/pnas.1720419115 PDF

Tags: graph
Language of perception

Languages (and researchers) contributing to a large comparative study of the differential coding of perception across cultures. Locations indicate field sites where data were collected.

Majid, A., Roberts, S. G., Cilissen, L., Emmorey, K., Nicodemus, B., O’Grady, L., Woll, B., LeLan, B., de Sousa, H., Cansler, B. L., Shayan, S., de Vos, C., Senft, G., Enfield, N. J., Razak, R. A., Fedden, S., Tufvesson, S., Dingemanse, M., Ozturk, O., … Levinson, S. C. (2018). Differential coding of perception in the world’s languages. Proceedings of the National Academy of Sciences, 115(45), 11369–11376. doi: 10.1073/pnas.1720419115 PDF

Tags: map
Locations and settings

There are a myriad ways to refer to places, but one useful way to think about their affordances in interaction is in terms of a distinction between locations and settings. Locations tell you where something is; settings invoke activities and actors. Many place references usefully combine the two: setting a story in the graveyard area not only localizes it for the audience in the know, but also provides a setting for ominous encounters.

Dingemanse, M., Rossi, G., & Floyd, S. (2017). Place reference in story beginnings: a cross-linguistic study of narrative and interactional affordances. Language in Society, 46(2), 129–158. doi: 10.1017/S0047404516001019 PDF

Tags: diagram, interaction
Ideophone constructions in Siwu

The canonical syntactic home of ideophones in Siwu is toward the end of the clause. A finer analysis of patterns of occurrence in the corpus reveals a number of constructions in which ideophones can occur. The five most common constructions, together accounting for 95% of ideophone tokens, are shown here.

This type of visualization —a table with horizontal bar plot— has no name as of yet. It uses the same logic as E.J. Tufte’s sparklines, which also display numerical information inline.

Dingemanse, M. (2017). Expressiveness and system integration. On the typology of ideophones, with special reference to Siwu. STUF – Language Typology and Universals, 70(2), 363–384. doi: 10.1515/stuf-2017-0018 PDF

Tags: diagram, linguistics, sparkline, table
How ideophones stand out

Pitch trace of a Japanese utterance starting with two tokens of the ideophone zabɯ:n ‘splash’, showing how they are produced in the upper part of the speaker’s pitch range, and how their articulation is drawn out relative to other non-ideophonic elements in the utterance. This illustrates the special treatment that ideophones often get in everyday speech, which makes them stand out from the surrounding material.

Dingemanse, M., & Akita, K. (2017). An inverse relation between expressiveness and grammatical integration: on the morphosyntactic typology of ideophones, with special reference to Japanese. Journal of Linguistics, 53(3), 501–532. doi: 10.1017/S002222671600030X PDF

Tags: iconicity, phonetics, speech
Gesture, expressiveness and grammatical integration in Japanese

A scattering of 625 datapoints showing first how ideophones (circles) in a Japanese corpus can be realized at four levels of cumulative expressiveness, from 0 to 3; second, how these tokens are spread across three grammatical constructions that are ranked from least integrated (Quotative) to most integrated (Predicative); third, that maximally expressive ideophone tokens mostly cluster in Quotative constructions; and fourth, that gestures (filled circles ⬤, n=242) co-occur especially with ideophone tokens that are most expressive and least integrated.

Dingemanse, M., & Akita, K. (2017). An inverse relation between expressiveness and grammatical integration: on the morphosyntactic typology of ideophones, with special reference to Japanese. Journal of Linguistics, 53(3), 501–532. doi: 10.1017/S002222671600030X PDF

Tags: iconicity, scatterplot
Two dimensions of interactive repair

Two dimensions of formats for repair initiation. The distinction between open and restricted type formats is retrospective: it is about the nature and location of the trouble in prior turn. The distinction between request and offer type formats is prospective: it is about the nature of the response that is relevant in next turn. The two dimensions together define three basic types of formats for repair initiation: (1) open request, (2) restricted request, and (3) restricted offer.
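
The cross-classification can be written down as a small lookup, sketched here. The enum and function names are illustrative assumptions, and returning `None` for the open/offer combination simply reflects that the caption does not list it as a basic type.

```python
from enum import Enum

class Retrospective(Enum):
    """Nature and location of the trouble in the prior turn."""
    OPEN = "open"
    RESTRICTED = "restricted"

class Prospective(Enum):
    """Kind of response made relevant in the next turn."""
    REQUEST = "request"
    OFFER = "offer"

# The three basic types named in the caption.
BASIC_TYPES = {
    (Retrospective.OPEN, Prospective.REQUEST): "open request",
    (Retrospective.RESTRICTED, Prospective.REQUEST): "restricted request",
    (Retrospective.RESTRICTED, Prospective.OFFER): "restricted offer",
}

def basic_type(retro, prosp):
    """Return the basic type label, or None for a combination not listed."""
    return BASIC_TYPES.get((retro, prosp))

print(basic_type(Retrospective.RESTRICTED, Prospective.OFFER))
```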

Dingemanse, M., & Enfield, N. J. (2015). Other-initiated repair across languages: towards a typology of conversational structures. Open Linguistics, 1, 98–118. doi: 10.2478/opli-2014-0007 PDF

Tags: diagram, repair
Properties and formats of repair

Using elementary properties of interactional resources, we can capture commonalities and differences between repair formats in principled and precise ways. For instance, to capture the distinctions between four repair initiation formats in English (as presented in Sidnell 2010), we can use the following three properties: Question (is there a content question word?), Repetition (does the repair initiator repeat some material from the prior turn?) and Confirmation (does the repair initiator make confirmation relevant in next turn?).

Dingemanse, M., & Enfield, N. J. (2015). Other-initiated repair across languages: towards a typology of conversational structures. Open Linguistics, 1, 98–118. doi: 10.2478/opli-2014-0007 PDF

Tags: diagram, linguistics, repair, table, typology
Probability of encountering repair

Interactive repair —when people work together to fix trouble in conversation— is quite common. In these 12 languages from around the world, it takes only 84 seconds on average between one repair sequence and the next. The sheer frequency shows how important repair is as a system that keeps conversation on track and helps us negotiate common understanding in a world full of noise. We are united in asking questions.
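
As a rough sketch of the kind of computation behind such a figure, the intervals between consecutive repair sequences can be derived from their onset times and summarised with a mean and an empirical cumulative distribution. The onset times below are invented for illustration, and the function names are assumptions rather than the paper's code.

```python
def inter_repair_intervals(onsets):
    """Seconds between consecutive repair sequences, given onset times."""
    ordered = sorted(onsets)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def ecdf(values, t):
    """Empirical probability that an interval is at most t seconds."""
    if not values:
        raise ValueError("need at least one interval")
    return sum(1 for v in values if v <= t) / len(values)

# Invented onset times (in seconds) of repair sequences in one recording.
onsets = [10.0, 70.0, 100.0, 250.0, 290.0]
intervals = inter_repair_intervals(onsets)
mean_interval = sum(intervals) / len(intervals)
print(intervals, mean_interval, ecdf(intervals, 60.0))
```

Read this way, `ecdf(intervals, t)` is the estimated probability of encountering the next repair sequence within `t` seconds of the previous one.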

Dingemanse, M., Roberts, S. G., Baranova, J., Blythe, J., Drew, P., Floyd, S., Gisladottir, R. S., Kendrick, K. H., Levinson, S. C., Manrique, E., Rossi, G., & Enfield, N. J. (2015). Universal Principles in the Repair of Communication Problems. PLOS ONE, 10(9), e0136100. doi: 10.1371/journal.pone.0136100

Tags: ecdf, frequency, graph, interaction
‘Seeing as’

My dear friend Ruben Owiafe was one of the most colourful Siwu teachers I had. His explanation of what it means to provide folk definitions is insightful in terms of both its content and form. As he explained, they enable you to see one thing in terms of another: “If you see this here” (points to his right), “you see how it is here” (points to his left). Figures and diagrams in academic publications serve the same kind of purpose. They provide us with different ways of seeing, and help us understand by analogy.

Dingemanse, M. (2015). Folk definitions in linguistic fieldwork. In J. Essegbey, B. Henderson, & F. McLaughlin (Eds.), Language Documentation and Endangerment in Africa (pp. 215–238). John Benjamins. PDF

Tags: depiction, gesture, photo
Elements of other-initiated repair

A repair sequence consists of a repair initiation that points back to a prior turn (identifying it as a trouble source) and points forward to a next turn (the repair solution). The visual style of this schematic was adapted in a broader account of repair in conversation by Albert & De Ruiter.

Dingemanse, M., Roberts, S. G., Baranova, J., Blythe, J., Drew, P., Floyd, S., Gisladottir, R. S., Kendrick, K. H., Levinson, S. C., Manrique, E., Rossi, G., & Enfield, N. J. (2015). Universal Principles in the Repair of Communication Problems. PLOS ONE, 10(9), e0136100. doi: 10.1371/journal.pone.0136100

Tags: diagram, repair, sequence
Arbitrariness, iconicity and systematicity

(A, B) Words show arbitrariness when there are conventional associations between forms and meanings. Words show iconicity when there are perceptuomotor analogies between forms and meanings, here indicated by shape, size and proximity (inset). (B, C) Words show systematicity when statistical regularities in phonological form, here indicated by color, serve as cues to abstract categories such as word classes. (D) The cues involved in systematicity differ across languages and may be arbitrary. (E) The perceptual analogies involved in iconicity transcend languages and may be universal.

Dingemanse, M., Blasi, D. E., Lupyan, G., Christiansen, M. H., & Monaghan, P. (2015). Arbitrariness, iconicity and systematicity in language. Trends in Cognitive Sciences, 19(10), 603–615. doi: 10.1016/j.tics.2015.07.013 PDF

Tags: diagram, iconicity, panel
Folk definitions of ideophones

Ideophones have rich imagistic meanings that can be hard to describe. In explaining the ideophone minimini, four speakers of Siwu independently use gestures that are both similar (in depicting a spherical shape) and different (in size, handshape, and method of representation). Collectively, the gestures illustrate an elusive aspect of the ideophone’s meaning while also showing that its linguistic form as spoken word is more conventionalized than the gestures it comes with.

Dingemanse, M. (2015). Folk definitions in linguistic fieldwork. In J. Essegbey, B. Henderson, & F. McLaughlin (Eds.), Language Documentation and Endangerment in Africa (pp. 215–238). John Benjamins. PDF

Tags: iconicity, illustration, photo
Magritte on depiction

Dingemanse, M. (2015). Ideophones and reduplication: Depiction, description, and the interpretation of repeated talk in discourse. Studies in Language, 39(4), 946–970. doi: 10.1075/sl.39.4.05din PDF

Tags: iconicity, illustration
Known cross-modal associations to vowels

Diagram of attested cross-modal mappings to linguistic sound represented on typical vowel space. (Figure by first author Gwilym Lockwood.)

Lockwood, G., & Dingemanse, M. (2015). Iconicity in the lab: a review of behavioural, developmental, and neuroimaging research into sound-symbolism. Frontiers in Psychology, 6(1246), 1–14. doi: 10.3389/fpsyg.2015.01246 PDF

Tags: diagram, phonetics, synaesthesia
Evolving language

Visual by Sean Roberts, Shawn Tice, and Marisa Casillas

This was one of the most fun science communication events we did, part of a series over the course of 2012-2014. Participants did a communication game where they could only signal using slide whistles. After completing a round, a new pair could start by learning the signals invented in the prior round (an iterated communication game). While people played, onlookers could get an abstract view into the evolving system.

Dingemanse, M., Verhoef, T., & Roberts, S. G. (2014). The role of iconicity in the cultural evolution of communicative signals. In B. de Boer & T. Verhoef (Eds.), Proceedings of Evolang X Workshop on Signals, Speech and Signs (pp. 11–15). PDF

Verhoef, T., Roberts, S. G., & Dingemanse, M. (2015). Emergence of systematic iconicity: transmission, interaction and analogy. In D. Noelle, R. Dale, A. S. Warlaumont, J. Yoshimi, T. Matlock, C. D. Jennings, & P. P. Maglio (Eds.), Proceedings of the 37th Annual Meeting of the Cognitive Science Society (pp. 2481–2486). Cognitive Science Society. PDF

Science communication expert Dr. Hannah Little singled out this event as an example of effective science communication.

Little 2023:25 ‘Principles of good research communication’

Tags: language evolution, popsci, time series
Which words are the same across languages?

Illustration made by Frank Landsbergen for a piece on universal words I wrote for a popular science book. It covers three types of words that, each for their own reason, come out similarly across languages. The three types are: (i) interactional tools (huh? for repair, oh! for a news receipt); (ii) expressive interjections (au for ‘ouch’); and (iii) onomatopoeia (bam ‘BAM’).

Simplifying somewhat, interactional tools are similar across languages because the ecology they live in (the rapid-fire turn-taking of conversation) provides the same selective pressures across languages; a case of convergent cultural evolution. Expressive interjections may go back to ancestral vocalizations also found in our close evolutionary relatives. And onomatopoeia come out similarly to the extent that they imitate the same kinds of sounds.

Dingemanse, M. (2023). Interjections. In E. van Lier (Ed.), The Oxford Handbook of Word Classes (pp. 477–491). Oxford University Press. https://doi.org/10.31234/osf.io/ngcrs PDF

Dingemanse, M. (2014). Welk woord is in elke taal hetzelfde? In S. Deurloo (Ed.), Waarom drinken we zoveel koffie? 101 slimme vragen (pp. 159–161). Kennislink. PDF

Dingemanse, M. (2017). On the margins of language: Ideophones, interjections and dependencies in linguistic theory. In N. J. Enfield (Ed.), Dependencies in language (pp. 195–202). Language Science Press. PDF

Tags: illustration, popsci
The Austin/Clark action ladder

Herb Clark, building on Austin’s (1962) distinctions of levels of speech acts, notes that successful communication is grounded in joint actions by speaker and addressee on at least four distinct levels. In the Austin/Clark action ladder, higher levels depend on lower levels in terms of causality (higher levels are implemented by means of lower ones) and entailment (completion of a higher level entails completion of the ones below it). As a corollary, the action ladder exhibits the property of “downward evidence”: evidence that B recognized A’s intended action (level 4) is also evidence that B succeeded in interpreting A’s words (level 3), that B correctly identified the words (level 2), and that B attended to A’s vocalisation (level 1). All four levels are involved in building mutual understanding, and each of them can be a locus of trouble.
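
The property of downward evidence can be sketched in a few lines: evidence of completion at one level carries over to every level below it. The level descriptions paraphrase the caption; the dictionary and function name are illustrative assumptions.

```python
# Level descriptions paraphrase the caption; the data structure is illustrative.
LADDER = {
    1: "B attended to A's vocalisation",
    2: "B correctly identified the words",
    3: "B succeeded in interpreting A's words",
    4: "B recognized A's intended action",
}

def downward_evidence(level):
    """Levels for which evidence of completion at `level` also counts as evidence."""
    if level not in LADDER:
        raise ValueError(f"level must be one of {sorted(LADDER)}")
    return [k for k in sorted(LADDER, reverse=True) if k <= level]

for k in downward_evidence(3):
    print(k, LADDER[k])
```

The inference only runs downward: evidence at level 2 says nothing about levels 3 and 4, which is why trouble can arise at any rung.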

Dingemanse, M., Blythe, J., & Dirksmeyer, T. (2014). Formats for other-initiation of repair across languages: An exercise in pragmatic typology. Studies in Language, 38(1), 5–43. doi: 10.1075/sl.38.1.01din PDF

Tags: interaction, repair, table
Cultural evolution of continuous signals

The cultural evolution of continuous signals over 4 generations in a single experimental chain of iterated communication. Colour represents communicative success. Through trial and error, participants in consecutive trials narrow down to a set of signals that is both iconic (in mirroring aspects of form) and systematic (in using slope direction to signal the way animals are facing). This represents in miniature form how iconicity can provide the building blocks for systematicity in linguistic systems.

Dingemanse, M., Verhoef, T., & Roberts, S. G. (2014). The role of iconicity in the cultural evolution of communicative signals. In B. de Boer & T. Verhoef (Eds.), Proceedings of Evolang X Workshop on Signals, Speech and Signs (pp. 11–15). PDF

Tags: graph, iconicity, interaction
Universal and specific aspects of social interaction

“So she was weird today,” Kofi says. In response to Aku’s “What?”, Kofi closes his eyes and moves jerkily from side to side. All present turn to him to watch. Aku checks: “The spirit- the spirit’s been coming again?” Kofi confirms and tells a story of spirit possession. The segment, only a few seconds long, illustrates both universal and culture-specific aspects of social interaction.

Dingemanse, M., & Floyd, S. (2014). Conversation across cultures. In N. J. Enfield, P. Kockelman, & J. Sidnell (Eds.), Cambridge Handbook of Linguistic Anthropology (pp. 434–464). Cambridge University Press. PDF

Tags: photo
Repair interjections in vowel space

Panel showing average positions of repair interjections in vowel space. Left: The vowel inventories of the world’s languages tend to make maximal use of vowel space. In contrast to this, the vowels of the repair interjections all cluster in the same low-front region. Abbreviations: Cha’palaa (Cha), Dutch (Dut), Icelandic (Ice), Italian (Ita), Lao (Lao), Mandarin (Man), Murrinh-Patha (Mur), Russian (Rus), Siwu (Siw), Spanish (Spa). Right: An instrumental analysis of interjection tokens from Spanish and Cha’palaa shows that the interjections have distinct, language-specific vowel targets, with Spanish closer to /e?/ and Cha’palaa closer to /a?/.

This is an annotated version of Figure 2 and Figure 3 from our original paper, showing more clearly that the right panel is zooming in on the small part of the vowel space populated by all of the languages.

The overall point is two-fold: First, there is strong similarity in the overall region of vowel space languages end up in for their repair interjections. Second, there is nonetheless also room for a small degree of language-specificity in precisely how the interjection is realised in a language. This demonstrates the two parts of our argument in the paper: that repair interjections have universal properties, and that they are (language-specific) words.

Dingemanse, M., Torreira, F., & Enfield, N. J. (2013). Is “Huh?” a universal word? Conversational infrastructure and the convergent evolution of linguistic items. PLOS ONE, 8(11), e78273. doi: 10.1371/journal.pone.0078273

Tags: panel, phonetics, scatterplot
‘Huh?’ around the world

A word like huh? (used to initiate repair when, for example, one has not clearly heard what someone just said) is found in roughly the same form and function in conversational corpora from 31 spoken languages from across the globe. The ten in bold are examined in phonetic detail and found to be about as similar to each other as variants of the word dog across English varieties. Languages 11–20 are from [14], 21–31 from sources cited. Locations are approximate. 1. Cha’palaa ʔa:↘ 2. Icelandic ha 3. Spanish e↗ 4. Siwu ã:↗ 5. Dutch h↗ 6. Italian ε:↗ 7. Russian a:↗ 8. Lao hã:↗ 9. Mandarin Chinese ã:↗ 10. Murrinh-Patha a:↗ 11. ǂĀkhoe Haiǁom hε↗ 12. Chintang hã↗ 13. Duna ɛ̃:↗ 14. English hã↗ 15. French ɛ̃:↗ 16. Hungarian hm↗/ha↗ 17. Kri ha:↗ 18. Tzeltal hai↗ 19. Yélî Dnye ɛ̃:↗ 20. Yurakaré æ↗ 21. Lahu hãi [38] 22. Tai/Lue há↗ [92] 23. Japanese e↗ [93] 24. Korean e↗ [94] 25. German hɛ̃ [95] 26. Norwegian hæ↗ [96] 27. Herero e↗ [97] 28. Kikongo e↗ [98] 29. Tzotzil e↗ [99] 30. Bequia Creole ha:↗ [100] 31. Zapotec aj↗ [101].

Dingemanse, M., Torreira, F., & Enfield, N. J. (2013). Is “Huh?” a universal word? Conversational infrastructure and the convergent evolution of linguistic items. PLOS ONE, 8(11), e78273. doi: 10.1371/journal.pone.0078273

Tags: map, repair, typology
Depiction in speech and gesture

On the ground is a plate of metal on which two small amounts of gunpowder have been laid to dry in the sun; beside it stands the speaker, explaining why one needs to be careful when igniting the gunpowder to test its quality: it may flare up “SHÛ, SHÛ”, a vocal depiction that is produced in precise synchrony with the two hands moving symmetrically in a quick upward motion. (The right hand holds an object.)

Dingemanse, M. (2013). Ideophones and gesture in everyday speech. Gesture, 13(2), 143–165. doi: 10.1075/gest.13.2.02din PDF

Tags: gesture, iconicity, illustration, photo
Deideophonisation and ideophonisation

Deideophonization turns depictive signs into descriptive ones by decreasing expressiveness and increasing morphosyntactic integration; ideophonization turns descriptive signs into depictive ones by increasing expressiveness and decreasing morphosyntactic integration.

This simple diagram was created in 2012, in a style that evokes typical Langackerian cognitive linguistics diagrams. Due to editorial delays, it was published only in 2017.

The paper has a further variation on the theme, displaying the two types of ideophone constructions in Siwu as “Bound” versus “Free” and placing them on opposite ends of this continuum.

Dingemanse, M. (2017). Expressiveness and system integration. On the typology of ideophones, with special reference to Siwu. STUF – Language Typology and Universals, 70(2), 363–384. doi: 10.1515/stuf-2017-0018 PDF

Tags: diagram, grammar, iconicity
Clustering ideophones

MDS plot of similarity ratings for ideophones derived from a pile-sorting field task. Interpretable clusters are circled and indicated in the plot. One group, with saaa ‘cool sensation’, nyagbalaa ‘pungent’, buàà ‘tasteless’, nyɛ̃kɛ̃nyɛ̃kɛ̃ ‘intensely sweet’ and mɛ̃rɛ̃mɛ̃rɛ̃ ‘sweet’, can be characterised as TASTE. Another cluster includes dɔbɔrɔɔ ‘soft’, safaraa ‘coarse-grained’, wòsòròò ‘rough’, fũɛ̃ fũɛ̃ ‘malleable’, wùrùfùù ‘fluffy’, pɔlɔpɔlɔ ‘smooth’, fiɛfiɛ ‘silky’, kpɔlɔkpɔlɔ ‘slippery’ and pɔtɔpɔtɔ ‘soggy’. These ideophones seem to form a domain of HAPTIC TOUCH. Another group comprises gelegele ‘shiny’, fututu ‘pure white’, kpinakpina ‘black’ and wɔ̃̀rã̀wɔ̃̀rã̀ ‘spotted’. This domain we may summarise as SURFACE APPEARANCE. A further cluster is formed by minimini ‘spherical’ and gìlìgìlì ‘circular’ (these two tightly together) and sɔ̀dzɔ̀lɔ̀ɔ̀ ‘oblong’, miɔmiɔ ‘pointed’ and tagbaraa ‘long’, suggesting a broader domain of SHAPE.
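
For readers unfamiliar with pile sorting, the sketch below shows one common way such data can be turned into the pairwise dissimilarities that an MDS routine takes as input: two items count as more similar the more often participants put them in the same pile. This is an assumption about the general technique, not a description of the procedure used in the dissertation. The function name is hypothetical, and the two sorts are invented toy data that merely reuse four ideophones from the caption.

```python
from itertools import combinations


def pile_sort_dissimilarity(sorts):
    """Turn pile sorts into pairwise dissimilarities between 0 and 1.

    sorts: one entry per participant, each a list of piles (sets of items).
    Assumes every participant sorted the same set of items.
    Dissimilarity = 1 - proportion of participants who piled a pair together.
    """
    items = sorted({item for piles in sorts for pile in piles for item in pile})
    together = {pair: 0 for pair in combinations(items, 2)}
    for piles in sorts:
        for pile in piles:
            # Sorting each pile keeps pair order consistent with the dict keys.
            for pair in combinations(sorted(pile), 2):
                together[pair] += 1
    return {pair: 1 - count / len(sorts) for pair, count in together.items()}


# Invented toy data: two hypothetical participants sorting four ideophones.
toy_sorts = [
    [{"saaa", "buàà"}, {"minimini", "gìlìgìlì"}],
    [{"saaa", "buàà", "minimini"}, {"gìlìgìlì"}],
]

for pair, distance in pile_sort_dissimilarity(toy_sorts).items():
    print(pair, distance)
```

The resulting dissimilarities could then be arranged as a square matrix and passed to any MDS implementation to obtain a two-dimensional layout in which frequently co-sorted items end up close together.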

Dingemanse, M. (2011). The Meaning and Use of Ideophones in Siwu [PhD dissertation, Radboud University]. http://thesis.ideophone.org/

Tags: clustering, MDS
Noun classification in Siwu

Nouns in Siwu come with noun class prefixes that also mark number (singular, plural, or mass). Most grammars present such classes as simple SG/PL class pairings, making it hard to see underlying regularities. In this diagram, line thickness shows relative frequency. This kind of visualization is helpful for learners but also for linguists, who may be able to use it in work on grammaticalization and change.

Dingemanse, M. (2011). The Meaning and Use of Ideophones in Siwu [PhD dissertation, Radboud University]. http://thesis.ideophone.org/

Tags: diagram, linguistics