Manuscripts
StudentEval: a Benchmark of Student-Written Prompts for Large Language Models of Code
Hannah Babe, Sydney Nguyen, Yangtian Zi, Arjun Guha, Molly Q Feldman, and Carolyn Jane Anderson.
arXiv draft
HuggingFace dataset
Untangling classes of context-sensitivity: a closer look at the semantics of American English tomorrow.
Carolyn Jane Anderson. Submitted.
2019 draft on LingBuzz
The andative and venitive construction in San Lucas Quiaviní Zapotec.
Carolyn Jane Anderson. 2017. Ms.
Draft on LingBuzz
2023
StarCoder: May the Source Be With You!
Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Randy, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Suriya Gunasekar, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries. Accepted to Transactions on Machine Learning Research
Draft
Protagonist-mediated perspective
Carolyn Jane Anderson and Arjun Guha. Poster accepted to Sinn und Bedeutung 28
Solving and Generating NPR Sunday Puzzles with Large Language Models
Jingmiao Zhao and Carolyn Anderson. Accepted to the International Conference on Computational Creativity (ICCC) 2023
MultiPL-E: A Scalable and Extensible Approach to Benchmarking NL2Code for 18 Programming Languages
Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q Feldman, Arjun Guha, Michael Greenberg, and Abhinav Jangda. Accepted to IEEE Transactions on Software Engineering
Draft on arXiv
Do All Minority Languages Look the Same to Chat-GPT? Linguistic (Mis)information in a Large Language Model.
Sydney Nguyen and Carolyn Jane Anderson. Poster to be presented at the Society for Computation in Linguistics (SCiL) 2023.
Cross-linguistic differences in processing parentheticals between English and Korean.
Yoolim Kim and Carolyn Jane Anderson. Accepted for presentation at Comparative Punctuation Worldwide.
SantaCoder: Don’t Reach For the Stars!
Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Terry Yue Zhuo, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Michael Lappert, Ian Yu, Paulo Villegas, Jia Li, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Arjun Guha, Harm de Vries, Leandro von Werra.
Best Paper Award at Deep Learning 4 Code (DL4C) workshop.
Draft
Grammatical perspective-taking in comprehension and production.
Carolyn Jane Anderson and Brian Dillon. Open Mind.
Exploring Social Biases of Large Language Models in a College Artificial Intelligence Course
Skylar Kolisko and Carolyn Jane Anderson. Proceedings of the Thirteenth Symposium on Educational Advances in Artificial Intelligence (EAAI-23).
Preprint
2022
Eliciting Associated Motion Constructions in Two Zapotec Languages
Fe Silva-Robles, Felipe H. Lopez, John Duff, and Carolyn Jane Anderson. Semantic Fieldwork Methods
Protagonist-Mediated Perspective
Carolyn Jane Anderson. Talk to be given at the Narration in Context workshop at the Deutsche Gesellschaft für Sprachwissenschaft (DGfS), 2022.
(Some) parentheses are focus-sensitive operators
Carina Bolaños Lewen and Carolyn Jane Anderson. Proceedings of Sinn und Bedeutung (SuB) 26.
Abstract
2021
ProSPer: Probing Human and Neural Network Language Model Understanding of Spatial Perspective.
Tessa Masis and Carolyn Jane Anderson. Accepted to the BlackboxNLP workshop at the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021.
Preprint
Solver-based Gradual Type Migration.
Luna Phipps-Costin, Carolyn Jane Anderson, Michael Greenberg, and Arjun Guha. Accepted to the ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) 2021.
Tell Me Everything You Know: A Conversation Update System for the Rational Speech Acts Framework
Carolyn Jane Anderson. Proceedings of the Society for Computation in Linguistics (SCiL) 2021.
Paper
Coming in, or going out? Measuring the effect of discourse factors on perspective prominence
Diagnosing the semantics of perspectival expressions
Carolyn Jane Anderson. Poster presented at the annual meeting of the Linguistic Society of America (LSA) 2021.
Abstract
2020
Shifting the Perspectival Landscape: Methods for Encoding, Identifying, and Selecting Perspectives.
Carolyn Jane Anderson. Dissertation, University of Massachusetts, Amherst.
LingBuzz
Can neural network language models understand spatial perspective?
Carolyn Jane Anderson and Tessa Masis. Paper presented at Bridging AI and Cognitive Science (BAICS), at the International Conference on Learning Representations (ICLR) 2020.
Non-archival paper
2019
Guess Who's Coming (And Who's Going): Bringing Perspective to the Rational Speech Acts Framework.
Carolyn Jane Anderson and Brian Dillon. Proceedings of the Society for Computation in Linguistics (SCiL) 2019.
Paper Poster
"Tomorrow" Isn't Always A Day Away.
Taking other perspectives into account: an RSA model of perspectival reasoning.
Carolyn Jane Anderson and Brian Dillon. Talk given at Rational Approaches in Language Science (RAiLS) 2019.
Explaining the progressive motion verb puzzle in Zapotec.
Carolyn Jane Anderson. Talk given at the Texas Linguistics Society 2019.
Slides
2018
"Tomorrow" Isn't Always A Day Away.
Carolyn Jane Anderson. Poster presented at the 31st annual CUNY Human Sentence Processing Conference (CUNY) 2018.
Abstract
The San Lucas Quiaviní Zapotec Andative and Venitive.
2017
The Andative and Venitive Construction in San Lucas Quiaviní Zapotec.
2016
Negation in Colonial Valley Zapotec.
Carolyn Jane Anderson and Brook Danielle Lillehaugen. Transactions of the Philological Society 114(3).
2015
The Morphosyntax of Negation in Colonial Valley Zapotec.
2014
NetKAT: Semantic Foundations for Networks.
Carolyn J. Anderson, Nate Foster, Arjun Guha, Jean-Baptiste Jeannin, Dexter Kozen, Cole Schlesinger, and David Walker. ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL) 2014.
PDF Slides
La morfosintaxis de la negación en el zapoteco del Valle colonial.
Carolyn Jane Anderson and Brook Danielle Lillehaugen. Talk presented at Coloquio sobre Lenguas Otomangues y Vecinas IV: Mario Molina Cruz (COLOV) 2014.
Abstract
"I talk it and I feel it": Language attitudes of Moroccan university students
Carolyn Jane Anderson. Honors thesis, Swarthmore College.
2013
Language Ideology and Human Rights Doctrine in Morocco.
Carolyn Anderson. Talk presented at New Ways of Analyzing Variation (NWAV) 42.