Jerid Francom – Graduate Program in Interpreting and Translation Studies

Jerid Francom

Associate Professor

Areas of Expertise
Knowledge Management, Computational Language Applications

Dr. Francom is a graduate of the University of Arizona Linguistics Program. His research focuses on the quantitative study of language using large-scale language archives, or corpora, drawn from sources such as news, social media, and other internet sources, to better understand linguistic variation. He has published on topics including the development, annotation, and evaluation of corpora, and has explored machine learning algorithms for text classification and clustering.
http://www.wfu.edu/~francojc

francojc@wfu.edu

Recent Publications

Francom, J. (Forthcoming). Corpus Studies of Syntax. In Cambridge Handbook of Experimental Syntax. Cambridge University Press.

Hulden, M. and Francom, J. (2016). Spanish diacritic error detection and restoration—a survey. In Z. Vetulani and H. Uszkoreit (eds.), Lecture Notes in Artificial Intelligence (LNAI). Springer Verlag.

Hulden, M., Silfverberg, M., and Francom, J. (2015). Kernel density estimation for text-based geolocation. In Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI-2015).

Current Projects

A textbook introducing the fundamental concepts and practical code needed to apply computational methods to the analysis of language from text sources.

ACTIV-ES is a comparable Spanish corpus composed of TV/film dialogue from Argentine, Mexican, and Spanish productions. Titles for each of these three countries were seeded from the Internet Movie Database. Subtitle data for the hearing impaired was provided by Opensubtitles.org, post-processed to correct or remove subtitle, OCR, and diacritic artifacts, and annotated for part of speech.