Corpus of Spanish in Southern Arizona

Welcome to the Corpus del Español en el Sur de Arizona!

CESA (Corpus del Español en el Sur de Arizona) aims to document and disseminate the Spanish varieties spoken in Arizona. Graduate and undergraduate students, under the PI’s supervision, collect, transcribe, and analyze interviews with local Spanish speakers. This digital oral corpus provides material for multiple linguistic analyses of local bilingual Spanish and for subsequent comparisons with other varieties.

The United States ranks fifth in the world in number of Spanish speakers. In Arizona, at least one in five residents speaks Spanish at home. While the majority of local Spanish speakers descend from Mexican nationals and speak dialects very close to those found across the border, Spanish in Arizona has been in sustained contact with English, leading to the insertion of English words, phrases, and grammatical patterns that deviate from monolingual Spanish forms. This project seeks to provide the first comprehensive documentation, description, and analysis of Spanish in the Tucson area. The CESA-based analyses aim to test scientifically to what extent Spanish grammar actually converges toward English in the speech of local bilinguals, and to what extent it presents sociolinguistic continuities with the varieties found across the border, in the Sonoran region of Mexico.

To document and study Arizona Spanish as it is actually spoken, sociolinguistic interviews are conducted with native speakers of this variety. Volunteers from the Spanish-speaking communities are asked to sit for an interview of approximately 60 minutes to discuss themselves and their community. The interviews are conducted wherever the participant prefers, either at home or in a public space. Interviews are conducted in Spanish, although participants are free to switch to English whenever they want. Participants are interviewed and digitally recorded in individual sessions designed to elicit spontaneous speech. The speech data are transcribed, anonymized, and stored in a password-protected online file. The project is IRB-protected, and all measures are taken to ensure the participants’ anonymity.

In addition to the interview sound file and accompanying transcription, the corpus stores the speaker’s language background and social characteristics, along with information about the interviewer and the interview, and makes them available to analysts. The website was created by the College of the Humanities Technology Support staff. Anyone interested in accessing the files should contact the PI and fill out a form explaining their background and intentions. Once users have a password, they will have access to the webpage containing the sound files (mp3), the transcriptions (PDF), and general information about the speaker (PDF), such as sex, place and year of birth, and years of schooling.

CESA has been made possible in part by a Humanities Collections and Reference Resources grant from the National Endowment for the Humanities: Exploring the Human Endeavor. However, any views, findings, conclusions, or recommendations expressed in this website and larger project do not necessarily represent those of the National Endowment for the Humanities.

The CoBiVa UTRGV (Corpus Bilingüe del Valle), which documents the varieties spoken in the Rio Grande Valley in Texas, is a sister corpus to CESA.
