Jerid Francom
Associate Professor
Areas of Expertise
Knowledge Management
Computational Language Applications
Dr. Francom is a graduate of the University of Arizona Linguistics Program. His research focuses on the quantitative study of language, using large-scale language archives (corpora) drawn from sources such as news, social media, and other internet text to better understand linguistic variation. He has published on the development, annotation, and evaluation of corpora, and has explored machine learning algorithms for text classification and clustering.
http://www.wfu.edu/~francojc
Recent Publications
Francom, J. (Forthcoming). Corpus Studies of Syntax. In Cambridge Handbook of Experimental Syntax. Cambridge University Press.
Hulden, M. and Francom, J. (2016). Spanish diacritic error detection and restoration—a survey. In Z. Vetulani and H. Uszkoreit (eds.), Lecture Notes in Artificial Intelligence (LNAI). Springer Verlag. Chapter
Hulden, M., Silfverberg, M., and Francom, J. (2015). Kernel density estimation for text-based geolocation. In Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI-2015). Paper
Current Projects
A textbook introducing the fundamental concepts and practical code needed to apply computational methods to the analysis of language from text sources.
ACTIV-ES is a comparable Spanish corpus composed of TV/film dialogue from Argentine, Mexican, and Spanish productions. Titles for each of the three countries were seeded from the Internet Movie Database. Subtitle data for the hearing impaired was provided by Opensubtitles.org, post-processed to correct or remove subtitle, OCR, and diacritic artifacts, and annotated for part of speech.