Can detection of extraneous visual signals reveal the syntactic structure of sign language?



The ability to combine individual lexical items into phrases and sentences is at the core of the human capacity for language (Friederici et al., 2017). Linguistic research indicates that the world's sign languages exhibit complex hierarchical organisation of utterances just like spoken languages (e.g., Cecchetto, 2017), but the role of hierarchical organisation during online sign language processing is poorly understood. The present study constitutes the first adaptation of the classical psycholinguistic "click" paradigm (e.g., Holmes & Forster, 1970) from the auditory-oral to the visuo-spatial modality. Using short flashes inserted into videos of signed sentences as analogues to auditory clicks, we seek to determine whether deaf signers, like hearing speakers, automatically impose constituent structure on sequences of signs during language comprehension.


The paradigm is implemented as an automated reaction-time experiment that deaf participants can comfortably run from home in their web browser. Instructions are given in German Sign Language (DGS) in the form of pre-recorded videos. In the experiment, participants watch different types of complex DGS sentences such as (1).


During sentence presentation, a white flash (duration: 80 ms) may appear as an overlay on the stimulus clip at different positions in the sentence, and participants must respond to this cue as quickly as possible via button press. After every trial, participants answer a binary comprehension question (Figure 1). The flash can occur in the first or second half of the sentence. Importantly, the exact point in time at which the flash occurs varies with the syntactic structure of the sentence: a flash may occur after a major break in the constituent structure separating two clauses, as indicated by "/" in (1), after a minor break, or at no break. This yields a 2x3 within-subject design with the factors Position (first vs. second half) and Structure (major vs. minor vs. no break). In addition to the six experimental conditions, filler trials (22%) in which no flash occurs were included. The stimuli were designed and recorded in collaboration with a deaf native signer. All clips were annotated using ELAN (Lausberg & Sloetjes, 2009) and flashes were inserted using an automated video-editing procedure. In addition, we performed automated motion tracking on the stimuli using OpenPose (Cao et al., 2019) and extracted motion information using OpenPoseR (Trettenbrein & Zaccarella, 2021) to control for a possible correlation between articulatory pauses and the probed constituent structure.
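The 2x3 factorial design with flash-free fillers can be sketched in a few lines of Python. This is a minimal illustration, not the actual experiment code: the item count per condition and all names (build_trial_list, POSITIONS, STRUCTURES) are hypothetical, and the filler count is simply derived so that fillers make up roughly 22% of the full trial list.

```python
from itertools import product
import random

# Hypothetical factor levels mirroring the 2x3 design described above.
POSITIONS = ["first_half", "second_half"]
STRUCTURES = ["major_break", "minor_break", "no_break"]

def build_trial_list(items_per_condition, filler_rate=0.22, seed=0):
    """Cross the two within-subject factors, then append enough
    flash-free filler trials that fillers make up roughly
    `filler_rate` of the full list, and shuffle."""
    trials = [
        {"position": pos, "structure": struct, "flash": True}
        for pos, struct in product(POSITIONS, STRUCTURES)
        for _ in range(items_per_condition)
    ]
    n_exp = len(trials)
    # Solve fillers / (fillers + n_exp) ≈ filler_rate for the filler count.
    n_filler = round(filler_rate / (1 - filler_rate) * n_exp)
    trials += [
        {"position": None, "structure": None, "flash": False}
        for _ in range(n_filler)
    ]
    random.Random(seed).shuffle(trials)
    return trials

trials = build_trial_list(items_per_condition=6)
print(len(trials))  # 36 experimental + 10 filler = 46 trials
```

With six (hypothetical) items per condition this yields 36 experimental trials plus 10 fillers, i.e. fillers account for about 22% of the list, matching the proportion reported above.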


At the time of writing, data collection is still ongoing, so we limit our discussion here to the effects we expect to observe. Assuming that the placement of flashes at different positions in the constituent structure of a sentence affects the time participants take to respond, we expect a main effect of Structure. Specifically, faster RTs are expected for detecting a flash at major (no constituent interrupted) and minor (a small number of constituents interrupted) boundaries than in the no-break condition (a large number of constituents interrupted). This would provide the first psycholinguistic evidence for the relevance of constituent structure during sign language comprehension, extending previous findings for spoken language. We do not expect a main effect of Position: the filler trials without flashes should counteract the increased probability of a response being required towards the second half of the sentence, a confound inherent to the design of earlier auditory studies (Holmes & Forster, 1970). In sum, the expected effect of Structure would provide evidence for the modality-independence of the cognitive mechanisms underlying syntactic processing.
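The predicted ordering of condition means (major break < minor break < no break) can be made concrete with a minimal sketch. All RT values and the helper condition_means below are hypothetical illustrations of the prediction, not collected data or the planned analysis.

```python
from statistics import mean

def condition_means(trials):
    """Average flash-detection RTs (in ms) per Structure condition."""
    by_structure = {}
    for t in trials:
        by_structure.setdefault(t["structure"], []).append(t["rt"])
    return {cond: mean(rts) for cond, rts in by_structure.items()}

# Hypothetical RTs exhibiting the predicted pattern of results:
example = [
    {"structure": "major_break", "rt": 410},
    {"structure": "major_break", "rt": 430},
    {"structure": "minor_break", "rt": 450},
    {"structure": "minor_break", "rt": 470},
    {"structure": "no_break",    "rt": 510},
    {"structure": "no_break",    "rt": 530},
]
means = condition_means(example)
print(means["major_break"] < means["minor_break"] < means["no_break"])  # True
```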

Figure 1: Example of an experimental trial in which the sentence given in (1) is presented and a flash occurs after the sign VISIT at the major break separating the two clauses. Every trial is followed by a comprehension question.


Cao, Z., Hidalgo, G., Simon, T., Wei, S.-E., & Sheikh, Y. (2019). OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. arXiv:1812.08008 [cs].

Cecchetto, C. (2017). The syntax of sign language and Universal Grammar. In I. Roberts (Ed.), The Oxford handbook of Universal Grammar. Oxford: Oxford University Press.

Friederici, A. D., Chomsky, N., Berwick, R. C., Moro, A., & Bolhuis, J. J. (2017). Language, mind and brain. Nature Human Behaviour, 1, 713–722.

Holmes, V. M., & Forster, K. I. (1970). Detection of extraneous signals during sentence recognition. Perception & Psychophysics, 7(5), 297–301.

Lausberg, H., & Sloetjes, H. (2009). Coding gestural behavior with the NEUROGES-ELAN system. Behavior Research Methods, 41(3), 841–849.

Trettenbrein, P. C., & Zaccarella, E. (2021). Controlling video stimuli in sign language and gesture research: The OpenPoseR package for analyzing OpenPose motion-tracking data in R. Frontiers in Psychology, 12, 628728.

This talk will be bilingual! We will be co-presenting in spoken English and in International Sign (IS). English ⇔ IS interpreting will be available.

Tübingen, Germany