Browsing by Author "Bergelson, E"
Item Open Access
Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech (Speech Communication, 2019-10-01)
Räsänen, O; Seshadri, S; Karadayi, J; Riebling, E; Bunce, J; Cristia, A; Metze, F; Casillas, M; Rosemberg, C; Bergelson, E; Soderstrom, M
© 2019 The Authors
Automatic word count estimation (WCE) from audio recordings can be used to quantify the amount of verbal communication in a recording environment. One key application of WCE is to measure language input heard by infants and toddlers in their natural environments, as captured by daylong recordings from microphones worn by the infants. Although WCE is nearly trivial for high-quality signals in high-resource languages, daylong recordings are substantially more challenging due to the unconstrained acoustic environments and the presence of near- and far-field speech. Moreover, many use cases of interest involve languages for which reliable ASR systems or even well-defined lexicons are not available. A good WCE system should also perform similarly for low- and high-resource languages in order to enable unbiased comparisons across different cultures and environments. Unfortunately, the current state-of-the-art solution, the LENA system, is based on proprietary software and has only been optimized for American English, limiting its applicability. In this paper, we build on existing work on WCE and present the steps we have taken towards a freely available system for WCE that can be adapted to different languages or dialects with a limited amount of orthographically transcribed speech data. Our system is based on language-independent syllabification of speech, followed by a language-dependent mapping from syllable counts (and a number of other acoustic features) to the corresponding word count estimates.
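The two-stage design described in this abstract (syllable counts first, then a language-dependent mapping to word counts) can be illustrated with a minimal Python sketch. It assumes the mapping is a one-feature least-squares line, which the abstract does not state. The function names and the toy calibration numbers are hypothetical, and the additional acoustic features mentioned in the abstract are omitted.

```python
from typing import Sequence, Tuple


def fit_syllable_to_word_mapping(
    syllables: Sequence[float], words: Sequence[float]
) -> Tuple[float, float]:
    """Fit words ~ slope * syllables + intercept by ordinary least squares.

    Assumption: a single linear feature stands in for the language-dependent
    mapping; the published system may use a different model and more features.
    """
    n = len(syllables)
    if n != len(words) or n < 2:
        raise ValueError("need at least two paired (syllable, word) counts")
    mean_s = sum(syllables) / n
    mean_w = sum(words) / n
    var_s = sum((s - mean_s) ** 2 for s in syllables)
    if var_s == 0:
        raise ValueError("syllable counts must not all be identical")
    cov = sum((s - mean_s) * (w - mean_w) for s, w in zip(syllables, words))
    slope = cov / var_s
    intercept = mean_w - slope * mean_s
    return slope, intercept


def estimate_word_count(syllable_count: float, slope: float, intercept: float) -> float:
    """Map an automatic syllable count to a word count estimate (never negative)."""
    return max(0.0, slope * syllable_count + intercept)


# Hypothetical calibration data: automatic syllable counts paired with word
# counts taken from orthographic transcripts of the same audio segments.
calibration_syllables = [10, 20, 30]
calibration_words = [7, 13, 19]

slope, intercept = fit_syllable_to_word_mapping(calibration_syllables, calibration_words)
estimate = estimate_word_count(40, slope, intercept)
```

Fitting a separate slope and intercept for each language or dialect is what would make the mapping language-dependent in this sketch, while the syllable counter feeding it stays the same across languages.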
We evaluate our system on samples from daylong infant recordings from six different corpora consisting of several languages and socioeconomic environments, all manually annotated with the same protocol to allow direct comparison. We compare a number of alternative techniques for the two key components in our system: speech activity detection and automatic syllabification of speech. As a result, we show that our system can reach relatively consistent WCE accuracy across multiple corpora and languages (with some limitations). In addition, the system outperforms LENA on three of the four corpora consisting of different varieties of English. We also demonstrate how an automatic neural network-based syllabifier, when trained on multiple languages, generalizes well to novel languages beyond the training data, outperforming two previously proposed unsupervised syllabifiers as a feature extractor for WCE.

Item Open Access
Developing a Cross-Cultural Annotation System and MetaCorpus for Studying Infants’ Real World Language Experience (Collabra: Psychology, 2021-05-25)
Soderstrom, M; Casillas, M; Bergelson, E; Rosemberg, C; Alam, F; Warlaumont, AS; Bunce, J
Recent issues around reproducibility, best practices, and cultural bias impact naturalistic observational approaches as much as experimental approaches, but there has been less focus on this area. Here, we present a new approach that leverages cross-laboratory collaborative, interdisciplinary efforts to examine important psychological questions. We illustrate this approach with a particular project that examines similarities and differences in children’s early experiences with language. This project develops a comprehensive start-to-finish analysis pipeline by developing a flexible and systematic annotation system, and implementing this system across a sampling from a “metacorpus” of audiorecordings of diverse language communities.
This resource is publicly available for use, sensitive to cultural differences, and flexible to address a variety of research questions. It is also uniquely suited for use in the development of tools for automated analysis.

Item Open Access
Quantifying Sources of Variability in Infancy Research Using the Infant-Directed-Speech Preference (Advances in Methods and Practices in Psychological Science, 2020-03)
Frank, MC; Alcock, KJ; Arias-Trejo, N; Aschersleben, G; Baldwin, D; Barbu, S; Bergelson, E; Bergmann, C; Black, AK; Blything, R; Böhland, MP; Bolitho, P; Borovsky, A; Brady, SM; Braun, B; Brown, A; Byers-Heinlein, K; Campbell, LE; Cashon, C; Choi, M; Christodoulou, J; Cirelli, LK; Conte, S; Cordes, S; Cox, C; Cristia, A; Cusack, R; Davies, C; de Klerk, M; Delle Luche, C; de Ruiter, L; Dinakar, D; Dixon, KC; Durier, V; Durrant, S; Fennell, C; Ferguson, B; Ferry, A; Fikkert, P; Flanagan, T; Floccia, C; Foley, M; Fritzsche, T; Frost, RLA; Gampe, A; Gervain, J; Gonzalez-Gomez, N; Gupta, A; Hahn, LE; Hamlin, JK; Hannon, EE; Havron, N; Hay, J; Hernik, M; Höhle, B; Houston, DM; Howard, LH; Ishikawa, M; Itakura, S; Jackson, I; Jakobsen, KV; Jarto, M; Johnson, SP; Junge, C; Karadag, D; Kartushina, N; Kellier, DJ; Keren-Portnoy, T; Klassen, K; Kline, M; Ko, ES; Kominsky, JF; Kosie, JE; Kragness, HE; Krieger, AAR; Krieger, F; Lany, J; Lazo, RJ; Lee, M; Leservoisier, C; Levelt, C; Lew-Williams, C; Lippold, M; Liszkowski, U; Liu, L; Luke, SG; Lundwall, RA; Cassia, VM; Mani, N; Marino, C; Martin, A; Mastroberardino, M; Mateu, V; Mayor, J; Menn, K; Michel, C; Moriguchi, Y; Morris, B; Nave, KM; Nazzi, T
Psychological scientists have become increasingly concerned with issues related to methodology and replicability, and infancy researchers in particular face specific challenges related to replicability: For example, high-powered studies are difficult to conduct, testing conditions vary across labs, and different labs have access to different infant populations.
Addressing these concerns, we report on a large-scale, multisite study aimed at (a) assessing the overall replicability of a single theoretically important phenomenon and (b) examining methodological, cultural, and developmental moderators. We focus on infants’ preference for infant-directed speech (IDS) over adult-directed speech (ADS). Stimuli of mothers speaking to their infants and to an adult in North American English were created using seminaturalistic laboratory-based audio recordings. Infants’ relative preference for IDS and ADS was assessed across 67 laboratories in North America, Europe, Australia, and Asia using the three common methods for measuring infants’ discrimination (head-turn preference, central fixation, and eye tracking). The overall meta-analytic effect size (Cohen’s d) was 0.35, 95% confidence interval = [0.29, 0.42], which was reliably above zero but smaller than the meta-analytic mean computed from previous literature (0.67). The IDS preference was significantly stronger in older children, in those children for whom the stimuli matched their native language and dialect, and in data from labs using the head-turn preference procedure. Together, these findings replicate the IDS preference but suggest that its magnitude is modulated by development, native-language experience, and testing procedure.

Item Open Access
Social and Environmental Contributors to Infant Word Learning. (CogSci, 2013)
Bergelson, E; Swingley, D

Item Open Access
Virtual machines and containers as a platform for experimentation (Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2016-01-01)
Metze, F; Riebling, E; Warlaumont, AS; Bergelson, E
Copyright © 2016 ISCA.
Research on computational speech processing has traditionally relied on the availability of a relatively large and complex infrastructure, which encompasses data (text and audio), tools (feature extraction, model training, scoring, possibly on-line and off-line, etc.), glue code, and computing. Traditionally, it has been very hard to move experiments from one site to another, and to replicate experiments. With the increasing availability of shared platforms such as commercial cloud computing platforms or publicly funded super-computing centers, there is a need and an opportunity to abstract the experimental environment from the hardware, and distribute complete setups as a virtual machine, a container, or some other shareable resource, that can be deployed and worked with anywhere. In this paper, we discuss our experience with this concept and present some tools that the community might find useful. We outline, as a case study, how such tools can be applied to a naturalistic language acquisition audio corpus.