Text-/Web-Scraping for Under-resourced Languages
I'm currently building a corpus for San Lucas Quiaviní Zapotec. I've been scraping Zapotec data from linguistics publications, websites, and Twitter using Python and R.
I'm also the TA for Gaja Jarosz's graduate-level Cognitive Modeling class. I teach weekly sessions on R and Python, and I grade labs on computational techniques for psycholinguistics, such as parsing, clustering, and neural networks.
As an RA for Gaja Jarosz, I worked on computational learners for phonological hidden structure.
I worked on making the implementation of the expectation-driven and error-driven learners described in Jarosz (2015) more efficient.
I also wrote a JavaFX GUI for the learners.
I spent a summer as an intern for Dr. Nate Foster in the Computer Science department at Cornell University.
I presented results at POPL 2014.