
Lin, Y., Ding, H., & Zhang, Y. (Accepted). Prosody dominates over semantics in emotion word processing: Evidence from cross-channel and cross-modal Stroop effects. Journal of Speech, Language, and Hearing Research.

Purpose: Emotional speech communication involves multisensory integration of linguistic (e.g., semantic content) and paralinguistic (e.g., prosody and facial expressions) messages. Previous studies on linguistic vs. paralinguistic salience effects in emotional speech processing have produced inconsistent findings. In the present study, we investigated the relative perceptual saliency of emotion cues in a cross-channel auditory-alone task (i.e., a semantics-prosody Stroop task) and a cross-modal audiovisual task (i.e., a semantics-prosody-face Stroop task).

Method: Thirty normal Chinese adults participated in two Stroop experiments with spoken emotion adjectives in Mandarin Chinese. Experiment 1 manipulated the auditory pairing of emotional prosody (happy or sad) and lexical semantic content in congruent and incongruent conditions. Experiment 2 extended the protocol to cross-modal integration by introducing visual facial expressions during auditory stimulus presentation. Participants were asked to judge the emotional information in each test trial according to selective-attention instructions.

Results: Accuracy and reaction time data indicated that despite an increase in cognitive demand and task complexity in Experiment 2, prosody was consistently more salient than semantic content for emotion word processing but did not take precedence over facial expression. While congruent stimuli enhanced performance in both experiments, the facilitatory effect was smaller in Experiment 2.

Conclusion: Together, the results demonstrate the salient role of paralinguistic prosodic cues in emotion word processing and congruence facilitation effect in multisensory integration. Our study contributes tonal language data on how linguistic and paralinguistic messages converge in multisensory speech processing, and lays a foundation for further exploring the brain mechanisms of cross-channel/modal emotion integration with potential clinical applications.

Keywords: Multimodality, Stroop, Emotion word processing, Paralinguistic cues, Prosody, Facial expression

Miller, S. & Zhang, Y. (Accepted). Neural coding of syllable-final fricatives with and without hearing aid amplification. Journal of the American Academy of Audiology.

The results suggest that hearing aid amplification alters neural representations of syllable-final fricatives in a complex manner. Consistent with results for syllable-initial fricative sounds (Miller & Zhang, 2014), normal-hearing listeners were able to discriminate the contrast with ease, and their aided and unaided ACC components differed significantly for /s/ versus /sh/, suggesting that the underlying cortical processing of the speech contrast is sensitive to the use of hearing aids. Together, the ERP results revealed that hearing aids altered the cortical processing of fricative contrasts across the scalp in both onset and coda positions.

Acclimatization to hearing aid use, as measured by longitudinal speech recognition scores, has a long time course (Gatehouse, 1992). Our results indicate that hearing aid signal processing altered the spectral and temporal properties of the fricatives, and these acoustic changes corresponded with changes in the neural responses to the contrast. Therefore, even though behavioral responses to the fricatives were unaffected by amplification, the brain would need to accommodate these acoustic changes from hearing aid signal processing to recognize the respective sound categories.

Fei Chen from the Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, is a visiting scholar at the Zhang Lab for the Fall 2019 semester.

Hong, T., Wang, J., Zhang, L., Zhang, Y., Shu, H., & Li, P. (2019). Age-sensitive associations of segmental and suprasegmental perception with sentence-level language skills in Mandarin-speaking children with cochlear implants. Research in Developmental Disabilities.


Background and aim: It remains unclear how recognition of segmental and suprasegmental phonemes contributes to sentence-level language processing skills in Mandarin-speaking children with cochlear implants (CIs). Our study examined the influence of implantation age on the recognition of consonants, lexical tones and sentences respectively, and more importantly, the contribution of phonological skills to sentence repetition accuracy in Mandarin-speaking children with CIs.

Methods: The participants were three groups of prelingually deaf children who received cochlear implants at various ages and their age-matched controls with normal hearing. Three tasks were administered to assess their consonant perception, lexical tone recognition and language skills in open-set sentence repetition.

Results: Children with CIs lagged behind their NH peers in all three tests, and performances on segmental, suprasegmental and sentence-level processing were differentially modulated by implantation age. Furthermore, performances on recognition of consonants and lexical tones were significant predictors of sentence repetition accuracy in the children with CIs.

Conclusion: Overall, segmental and suprasegmental perception, as well as sentence-level processing, is impaired in Mandarin-speaking children with CIs compared with age-matched children with NH. In children with CIs, recognition of segmental and suprasegmental phonemes at the lower level predicts sentence repetition accuracy at the higher level. More importantly, implantation age plays an important role in the development of phonological skills and higher-order language skills, suggesting that age-appropriate aural rehabilitation and speech intervention programs need to be developed to help CI users who receive CIs at different ages.

What this paper adds

The findings of this study contribute to a better understanding of speech perception and language processing in Mandarin-speaking children with cochlear implants (CIs). Specifically, our results demonstrate that performances of Mandarin-speaking children with CIs on segmental, suprasegmental, and sentence-level processing were differentially modulated by implantation age. Furthermore, recognition of both consonants and lexical tones contributes to sentence repetition accuracy in children with CIs. These findings have prognostic implications for developing post-implant rehabilitation and intervention programs.

Kao, C., & Zhang, Y. (2019). Magnetic source imaging and infant MEG: Current trends and technical advances. Brain Sciences, 9, 181.

Abstract: Magnetoencephalography (MEG) is known for its temporal precision and good spatial resolution in cognitive brain research. Nonetheless, it is still rarely used in developmental research, and its role in developmental cognitive neuroscience is not adequately addressed. The current review focuses on the source analysis of MEG measurement and its potential to answer critical questions on neural activation origins and patterns underlying infants’ early cognitive experience. The advantages of MEG source localization are discussed in comparison with functional Magnetic Resonance Imaging (fMRI) and functional near-infrared spectroscopy (fNIRS), two leading imaging tools for studying cognition across age. Challenges of the current MEG experimental protocols are highlighted, including measurement and data processing, which could potentially be resolved by developing and improving both software and hardware. A selection of infant MEG research in auditory, speech, vision, motor, sleep, cross-modality, and clinical application is then summarized and discussed with a focus on the source localization analyses. Based on the literature review and the advancements of the infant MEG systems and source analysis software, typical practices of infant MEG data collection and analysis are summarized as the basis for future developmental cognitive research.

Keywords: Magnetoencephalography (MEG); infant; cognitive development; source localization; equivalent current dipole (ECD); minimum norm estimation (MNE)

Praat tools
Zhang Lab, 16 Jul 2019 08:17
in discussion Researchers / General issues » Praat tools

Here is a list of plugins and tools for Praat users to do speech (and nonspeech) analysis and synthesis.

Easy to learn and use

More advanced

More plugins with CPrAN manager


Yu, K., Li, L., Chen, Y., Zhou, Y., Wang, R., Zhang, Y., & Li, P. (2019). Effects of native language experience on Mandarin lexical tone processing in proficient second language learners. Psychophysiology.

Abstract: Learning the acoustic and phonological information in lexical tones is significant for learners of tonal languages. Although there is a wealth of knowledge from studies of second language (L2) tone learning, it remains unclear how L2 learners process acoustic versus phonological information differently depending on whether their first language (L1) is a tonal language. In the present study, we first examined proficient L2 learners of Mandarin with tonal and non-tonal L1 in a behavioral experiment (identifying a Mandarin tonal continuum) to construct tonal contrasts that could differentiate the phonological from the acoustic information in Mandarin lexical tones for the L2 learners. We then conducted an event-related potential (ERP) experiment to investigate these learners’ automatic processing of acoustic and phonological information in Mandarin lexical tones by mismatch negativity (MMN). Although both groups of L2 learners showed behavioral identification patterns for the Mandarin tonal continuum similar to those of native speakers, L2 learners with a non-tonal L1, as compared with both native speakers and L2 learners with a tonal L1, showed longer reaction times to the tokens of the Mandarin tonal continuum. More importantly, the MMN data further revealed distinct roles of acoustic and phonological information in the automatic processing of L2 lexical tones between the two groups of L2 learners. Taken together, the results indicate that the processing of acoustic and phonological information in L2 lexical tones may be modulated by L1 experience with a tonal language. The theoretical implications of the current study are discussed in light of L2 speech learning.

Key words: Mandarin Chinese, L2 lexical tones, acoustic information, phonological information, mismatch negativity (MMN), L1 tonal experience

Cheng, B., Zhang, X., Fan, S., & Zhang, Y. (2019). The role of temporal acoustic exaggeration in high variability phonetic training: A behavioral and ERP study. Frontiers in Psychology (Auditory Cognitive Neuroscience). doi: 10.3389/fpsyg.2019.01178

Abstract: High variability phonetic training (HVPT) has been found to be effective in helping adult learners acquire nonnative phonetic contrasts. The present study investigated the role of temporal acoustic exaggeration by comparing the canonical HVPT paradigm, which involves no acoustic exaggeration, with a modified adaptive HVPT paradigm that integrated key temporal exaggerations of infant-directed speech (IDS). Sixty native Chinese adults participated in training on the English /i/ and /ɪ/ vowel contrast and were randomly assigned to three subject groups. Twenty were trained with the typical HVPT (the HVPT group), twenty were trained under the modified adaptive approach with acoustic exaggeration (the HVPT-E group), and twenty were in the control group. Behavioral tasks for the pre- and posttests used natural word identification, synthetic stimuli identification, and synthetic stimuli discrimination. Mismatch negativity (MMN) responses from the HVPT-E group were also obtained to assess the training effects in within- and across-category discrimination without requiring focused attention. Consistent with previous studies, significant generalization effects to new talkers were found in both the HVPT group and the HVPT-E group. The HVPT-E group, moreover, showed greater improvement, as reflected in larger gains in natural word identification performance. Furthermore, the HVPT-E group exhibited more native-like categorical perception based on spectral cues after training, together with corresponding training-induced changes in the MMN responses to within- and across-category differences. These data provide initial evidence for the important role of temporal acoustic exaggeration with adaptive training in facilitating phonetic learning and promoting brain plasticity at the perceptual and pre-attentive neural levels.

Keywords: High variability phonetic training, categorical perception, mismatch negativity, second language learning, acoustic exaggeration

Funding: The research was supported in part by grants from the National Social Science Foundation of China (15BYY005). Yang Zhang additionally received support from the University of Minnesota’s Brain Imaging Grant to work on the manuscript.

Zhang, L., Jiang, W., Shu, H., & Zhang, Y. (In press). Congenital blindness enhances perception of musical rhythm more than melody in Mandarin speakers. Journal of the Acoustical Society of America.

Abstract: This study adopted the Musical Ear Test (Wallentin et al., 2010) to compare the musical competence of sighted and congenitally blind Mandarin speakers. On the rhythm subtest, the blind participants outperformed the sighted. On the melody subtest, however, the two groups performed equally well. Furthermore, compared with sighted speakers of non-tonal languages (i.e., Dutch and French) reported in previous studies (Wallentin et al., 2010; Bhatara et al., 2015), the sighted Mandarin speakers performed better only on the melody subtest. These results indicate that tonal language experience and congenital blindness exert differential influences on musical aptitudes, with rhythm perception reflecting a cross-modal compensation effect and melody perception dominated by a cross-domain language-to-music transfer effect.

Keywords: congenital blindness; Mandarin speakers; musical aptitudes; rhythm; melody 

Khosravani, S., Mahnan, A., Yeh, I., Watson, P. J., Zhang, Y., Goding, G., & Konczak, J. (Accepted). Atypical somatosensory-motor cortical response during vowel vocalization in spasmodic dysphonia. Clinical Neurophysiology.


Objective: Spasmodic dysphonia (SD) is a debilitating voice/speech disorder without an effective cure. To obtain a better understanding of the underlying cortical neural mechanism of the disease, we analyzed electroencephalographic (EEG) signals of people with SD during voice production.

Method: Ten individuals with SD and 10 healthy volunteers produced 50 vowel vocalization epochs of 2500 ms duration. Two EEG features were derived: 1) event-related change in spectral power during vocalization relative to rest, and 2) inter-regional spectral coherence.
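For illustration, the first of the two EEG features (event-related change in spectral power relative to rest) can be sketched in a few lines of Python. This is a minimal, generic sketch of band-power change in dB, not the authors' analysis pipeline; the function name and default alpha band are assumptions.

```python
import numpy as np
from scipy.signal import welch

def band_power_change_db(task, rest, sfreq, band=(8.0, 12.0)):
    """Event-related change in band power (dB) during a task epoch
    relative to a resting baseline; default band is alpha (8-12 Hz)."""
    def band_power(x):
        # Welch power spectral density, then sum the bins inside the band.
        freqs, psd = welch(x, fs=sfreq, nperseg=min(len(x), 256))
        sel = (freqs >= band[0]) & (freqs <= band[1])
        return psd[sel].sum()
    return 10.0 * np.log10(band_power(task) / band_power(rest))
```

A negative value corresponds to movement-related desynchronization; the reduced desynchronization reported below would appear as values closer to zero in the SD group.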

Results: During early vocalization (500–1000 ms), the SD group showed significantly larger alpha band spectral power over the left motor cortex. During late vocalization (1000–2500 ms), SD patients showed significantly larger gamma band coherence between left somatosensory and premotor cortical areas.

Conclusions: Two atypical patterns of cortical activity characterize the pathophysiology of spasmodic dysphonia during voice production: 1) a reduced movement-related desynchronization of motor cortical networks, 2) an excessively large synchronization between left somatosensory and premotor cortical areas.

Significance: The pathophysiology of SD is characterized by an abnormally high synchronous activity within and across cortical neural networks involved in voice production that is mainly lateralized in the left hemisphere.

Funding: NIH 1 R01 DC016315-01A1 (PI: JK; co-Investigators: PW, YZ, GG)

Chieh Kao has been selected by the Graduate Fellowship Office to be the recipient of the Interdisciplinary Doctoral Fellowship for the 2019-20 academic year. Congratulations! This prestigious award is a tribute to Chieh's excellent academic record and professional promise. The Fellowship is a non-service award that carries an academic year stipend of $25,000, plus tuition for up to 14 credits per semester at the regular Graduate School rate (the IDF fellowship does not cover collegiate fees or student services fees). Subsidized health insurance will also be included during the academic year and through summer 2020.

Koerner, T.K., & Zhang, Y. (2018). Differential effects of hearing impairment and age on electrophysiological and behavioral measures of speech in noise. Hearing Research, 370, 130-142.

Abstract: Understanding speech in background noise is difficult for many listeners with and without hearing impairment (HI). This study investigated the effects of HI on speech discrimination and recognition measures as well as speech-evoked cortical N1-P2 and MMN auditory event-related potentials (AERPs) in background noise. We aimed to determine which AERP components can predict the effects of HI on speech perception in noise across adult listeners with and without HI. The data were collected from 18 participants with hearing thresholds ranging from within normal limits to bilateral moderate-to-severe sensorineural hearing loss. Linear mixed effects models were employed to examine how hearing impairment, age, stimulus type, and SNR listening condition affected neural and behavioral responses and what AERP components were correlated with effects of HI on speech-in-noise perception across participants. Significant effects of age were found on the N1-P2 but not on MMN, and significant effects of HI were observed on the MMN and behavioral measures. The results suggest that neural responses reflecting later cognitive processing of stimulus discrimination may be more susceptible to the effects of HI on the processing of speech in noise than earlier components that signal the sensory encoding of acoustic stimulus features. Objective AERP responses were also potential neural predictors of speech perception in noise across participants with and without HI, which has implications for the use of AERPs as a potential clinical tool for assessing speech perception in noise.

Keywords: hearing impairment; speech perception; electrophysiology; event-related potentials

Acknowledgments: This research was supported by funding from the University of Minnesota, including the Graduate Research Partnership Program (GRPP) Fellowship and Bryng Bryngelson Research Fund to Koerner, and the Grand Challenges Exploratory Research Grant and Brain Imaging Research Project Award to Zhang. Special thanks are due to Dr. Peggy Nelson for co-advising and consulting on the behavioral measures and Dr. Edward Carney for invaluable assistance with implementing the behavioral speech perception tests used in this project. The authors would also like to thank PhD committee members, Drs. Robert Schlauch and Andrew Oxenham, for comments and suggestions.

Lin, Y., Ding, H., & Zhang, Y. (2018). Emotional prosody processing in schizophrenic patients: A selective review and meta-analysis. Journal of Clinical Medicine. (Impact factor: 5.583).

Yi Lin 1, Hongwei Ding 1,* and Yang Zhang 2,*
1 Institute of Cross-Linguistic Processing and Cognition, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China; carol.y.lin@sjtu.edu.cn; hwding@sjtu.edu.cn
2 Department of Speech-Language-Hearing Sciences & Center for Neurobehavioral Development, University of Minnesota, Twin Cities, MN, USA; zhanglab@umn.edu

  • Correspondence: hwding@sjtu.edu.cn; zhanglab@umn.edu; Tel.: +86 21 3420 5377; +1 612 624 7818

Abstract: Emotional prosody (EP) has been increasingly recognized as an important area of schizophrenic patients’ dysfunctions in their language use and social communication. The present review aims to provide an updated synopsis on emotional prosody processing (EPP) in schizophrenic disorders, with a focus on performance characteristics, influential factors and underlying neural mechanisms. A literature search up to 2018 was conducted with online databases, and final selections were limited to empirical studies that investigated the prosodic processing of at least one of the six basic emotions in patients with a clear diagnosis of schizophrenia without co-morbid diseases. A narrative synthesis was performed, covering the range of research topics, task paradigms, stimulus presentation, study populations and statistical power, with a quantitative meta-analytic approach in Comprehensive Meta-Analysis Version 2.0. Study outcomes indicated that schizophrenic patients’ EPP deficits were consistently observed across studies (d = -0.92, 95% CI = -1.06 < δ < -0.78), with identification tasks (d = -0.95, 95% CI = -1.11 < δ < -0.80) being more difficult to process than discrimination tasks (d = -0.74, 95% CI = -1.03 < δ < -0.44) and emotional stimuli being more difficult than neutral stimuli. Patients’ performance was influenced by both participant- and experiment-related factors. Their social cognitive deficits in EP could be further explained by right-lateralized impairments and abnormalities in primary auditory cortex, medial prefrontal cortex and auditory-insula connectivity. The data pointed to impaired pre-attentive and attentive processes, both of which played important roles in the abnormal EPP in the schizophrenic population. The current selective review and meta-analysis support the clinical advocacy of including EP in early diagnosis and rehabilitation in the general framework of social cognition and neurocognition deficits in schizophrenic disorders. Future cross-sectional and longitudinal studies are further suggested to investigate schizophrenic patients’ perception and production of EP in different languages and cultures, modality forms and neuro-cognitive domains.
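The pooled effect sizes above were computed in Comprehensive Meta-Analysis Version 2.0. As a generic illustration of the underlying computation, here is a minimal random-effects (DerSimonian-Laird) pooling sketch in Python; the function is hypothetical and not the authors' exact procedure.

```python
import numpy as np

def random_effects_pool(d, var):
    """DerSimonian-Laird random-effects pooling of per-study effect
    sizes `d` with sampling variances `var`; returns the pooled
    effect and an approximate 95% confidence interval."""
    w = 1.0 / var
    d_fixed = np.sum(w * d) / np.sum(w)          # fixed-effect estimate
    q = np.sum(w * (d - d_fixed) ** 2)           # heterogeneity statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(d) - 1)) / c)      # between-study variance
    w_star = 1.0 / (var + tau2)                  # random-effects weights
    d_pooled = np.sum(w_star * d) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return d_pooled, (d_pooled - 1.96 * se, d_pooled + 1.96 * se)
```

When the studies are homogeneous, the between-study variance estimate is zero and the result reduces to the inverse-variance fixed-effect mean.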

Keywords: Emotional prosody processing; Schizophrenia; Meta-analysis

Funding: This research was funded by the interdisciplinary program of Shanghai Jiao Tong University grant number (14JCZ03). The APC was funded by the interdisciplinary program of Shanghai Jiao Tong University grant number (14JCZ03).

Acknowledgments: Zhang was supported by a summer visiting professorship from Shanghai Jiao Tong University to work on the project. We also thank the anonymous reviewers and the editors for providing us with insightful suggestions to revise our review.

Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

Congratulations to Luodi Yu, who will start her assistant professor position at the School of Psychology, South China Normal University, in Fall 2018!

Dissertation title: An Electrophysiological Investigation of Linguistic Pitch Processing in Tonal-language-speaking Children with Autism

Defense date: 9/7/2018
Revision date: 9/12/2018

Chieh Kao received the Graduate Research Partnership Program Fellowship (mentor: Yang Zhang) to do infant speech perception research. The award carries a $4,000 stipend and an additional $1,000 for subject fees and travel.

Zhang, L., Wang, J., Hong, T., Li, Y., Zhang, Y., & Shu, H. (In press). Mandarin-speaking kindergarten-aged children with cochlear implants benefit from natural F0 patterns in the use of semantic context during speech recognition. Journal of Speech, Language, and Hearing Research.

Purpose: The purpose of the current study was to investigate the extent to which semantic context and F0 contours affect speech recognition by Mandarin-speaking kindergarten-aged children with CIs.

Method: The experimental design manipulated two factors: semantic context, by comparing the intelligibility of normal sentences vs. word lists, and F0 contours, by comparing the intelligibility of utterances with natural vs. flat F0 patterns. Twenty-two children with cochlear implants completed the speech recognition test.

Results: Children with cochlear implants could use both semantic context and F0 contours to assist speech recognition. Furthermore, natural F0 patterns provided greater benefit when semantic context was present than when it was absent.

Conclusion: Dynamic F0 contours play an important role in speech recognition by Mandarin-speaking children with cochlear implants despite the well-known limitation of cochlear implant devices in extracting F0 information.

Yu, L., Wang, S., Huang, D., Wu, X., & Zhang, Y. (2018). Role of inter-trial phase coherence in atypical auditory evoked potentials to speech and nonspeech stimuli in children with autism. Clinical Neurophysiology.


Objective: This autism study investigated how inter-trial phase coherence (ITPC) drives abnormalities in auditory evoked potential (AEP) responses for speech and nonspeech stimuli.

Methods: Auditory P1-N2 responses and ITPCs in the theta band (4–7 Hz) for pure tones and words were assessed with EEG data from 15 school-age children with autism and 16 age-matched typically developing (TD) controls.

Results: The autism group showed enhanced P1 and reduced N2 for both speech and nonspeech stimuli in comparison with the TD group. Group differences were also found with enhanced theta ITPC for P1 followed by ITPC reduction for N2 in the autism group. The ITPC values were significant predictors of P1 and N2 amplitudes in both groups.

Conclusions: Abnormal trial-to-trial phase synchrony plays an important role in AEP atypicalities in children with autism. ITPC-driven enhancement as well as attenuation in different AEP components may coexist, depending on the stage of information processing.

Significance: It is necessary to examine the time course of auditory evoked potentials and the corresponding inter-trial coherence of neural oscillatory activities to better understand hyper- and hypo- sensitive responses in autism, which has important implications for sensory-based treatment.
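As background for the ITPC measure used in this study, the following Python sketch shows one standard way to compute theta-band inter-trial phase coherence from single-channel epochs. It is a minimal illustration (FFT-mask band-pass, Hilbert phase), not the authors' processing pipeline; the function name and defaults are assumptions.

```python
import numpy as np
from scipy.signal import hilbert

def theta_itpc(trials, sfreq, f_lo=4.0, f_hi=7.0):
    """Inter-trial phase coherence (ITPC) per time point, in [0, 1].

    trials: (n_trials, n_samples) array of single-channel epochs.
    """
    n_trials, n_samples = trials.shape
    # Crude band-pass via FFT masking (a stand-in for a proper filter).
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sfreq)
    spectra = np.fft.rfft(trials, axis=1)
    spectra[:, (freqs < f_lo) | (freqs > f_hi)] = 0.0
    band = np.fft.irfft(spectra, n=n_samples, axis=1)
    # Instantaneous phase from the analytic signal.
    phase = np.angle(hilbert(band, axis=1))
    # ITPC: length of the mean unit phase vector across trials.
    return np.abs(np.mean(np.exp(1j * phase), axis=0))
```

ITPC approaches 1 when the oscillatory phase is aligned across trials and falls toward 0 when phase varies randomly from trial to trial, which is why reduced ITPC indexes poor trial-to-trial phase synchrony.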

Chieh Kao received a conference travel grant award from the Graduate School (Council of Graduate Students Grant Opportunities Program) for presenting at the 2018 Annual Conference of the Cognitive Neuroscience Society in Boston. Congratulations to Chieh!

Yu, L., & Zhang, Y. (In press). Testing native language neural commitment at the brainstem level: A cross-linguistic investigation of the association between frequency-following response and speech perception. Neuropsychologia.

Abstract: A current topic in auditory neurophysiology is how brainstem sensory coding contributes to higher-level perceptual, linguistic and cognitive skills. This cross-language study was designed to compare frequency following responses (FFRs) for lexical tones in tonal (Mandarin Chinese) and non-tonal (English) language users and test the correlational strength between FFRs and behavior as a function of language experience. The behavioral measures were obtained in the Garner paradigm to assess how lexical tones might interfere with vowel category and duration judgement. The FFR results replicated previous findings about between-group differences, showing enhanced pitch tracking responses in the Chinese subjects. The behavioral data from the two subject groups showed that lexical tone variation in the vowel stimuli significantly interfered with vowel identification with a greater effect in the Chinese group. Moreover, the FFRs for lexical tone contours were significantly correlated with the behavioral interference only in the Chinese group. This pattern of language-specific association between speech perception and brainstem-level neural phase-locking of linguistic pitch information provides evidence for a possible native language neural commitment at the subcortical level, highlighting the role of experience-dependent brainstem tuning in influencing subsequent linguistic processing in the adult brain.

Keywords: Native Language Neural Commitment Theory; Frequency Following Response; speech perception; lexical tones; Garner paradigm

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License