CV
Education
- Ph.D. 2006. Carnegie Mellon University, School of Computer Science, Language Technologies Institute.
- Advisors: Alon Lavie (Language Technologies) and Brian MacWhinney (Psychology)
- Additional committee members: Jaime Carbonell, Lori Levin, John Carroll (Univ. of Sussex)
- Dissertation title: A multi-strategy approach to parsing of grammatical relations in child language transcripts
- M.S. 2003. Carnegie Mellon University, School of Computer Science, Language Technologies Institute
- B.S. 1999. University of California, Los Angeles. Computer Science with Linguistics minor
Academic Positions
- Chair, Department of Linguistics. 2023 - present. University of California, Davis.
- Professor. 2023 - present. University of California, Davis, Department of Linguistics.
- Director, Linguistics Graduate Program. 2018 - 2022. University of California, Davis.
- Associate Professor. 2020 - 2023. University of California, Davis, Department of Linguistics.
- Assistant Professor. 2016 - 2020. University of California, Davis, Department of Linguistics.
- Research Assistant Professor. 2010 - 2015. University of Southern California, Computer Science Department.
- Research Scientist. 2008 - 2015. University of Southern California, Institute for Creative Technologies.
- Visiting Scholar. Summer 2011. Johns Hopkins University, Center for Language and Speech Processing (CLSP Summer Workshop).
- Research Associate. 2006 - 2008. University of Tokyo, Computer Science (TsujiiLab).
Industry Positions
- Co-founder. 2015 - 2016. KITT.AI, a tech startup (acquired by Baidu).
- Software Engineer. 1998 - 1999. InQuizit Technologies (later Cognition, acquired by Nuance).
Teaching
- Computational Linguistics. UC Davis LIN 177. Once or twice per year (90 to 120 students), 2016 - present.
- Text Processing and Corpus Linguistics. UC Davis LIN 127. Once per year (60 students), 2016 - present.
- Graduate seminar on Computational Linguistics. UC Davis LIN 205x. Once per year (15 students), 2016 - present.
- Applied Natural Language Processing. USC CSCI 544. Spring 2014 (59 students), Spring 2015 (205 students).
- Foundations of Artificial Intelligence. USC CSCI 561 (56 students). 2012.
Service (external)
- General Chair, 16th International Conference on Parsing Technologies (IWPT 2020).
- General Chair, 15th International Conference on Parsing Technologies (IWPT 2017).
- Secretary, ACL Special Interest Group on Natural Language Parsing (SIGPARSE). 2020 - present.
- Information Officer, ACL Special Interest Group on Natural Language Parsing (SIGPARSE). 2005 - 2020.
- Editorial board, Computational Linguistics. 2015 - 2017.
- Associate Editor, ACM TALLIP. 2015 - present.
- Senior Area Chair or Area Chair for various NLP and computational linguistics conferences (e.g., ACL, NAACL, COLING, EMNLP, EACL).
- Standing reviewer, Transactions of the Association for Computational Linguistics (TACL)
- Ad hoc reviewer. Computational Linguistics, Language, Linguistics Vanguard, JAIR, Computational Intelligence, Journal of Child Language, Journal of Natural Language Engineering, Computer Speech and Language, ACM Transactions of Asian Language and Information Processing.
- Conference program committee member (reviewer). ACL, NAACL, EACL, IJCNLP, EMNLP, COLING, CoNLL, IJCAI, IVA, various ACL workshops.
- NSF panel member, 2018, 2010.
- Exhibits chair, NAACL 2010.
Students advised and committees
- Dian Yu, PhD 2022. PhD advisor, Computer Science, UC Davis.
- Sam Davidson, PhD candidate. PhD advisor, Linguistics, UC Davis.
- Aly Butler, MA 2022. MA advisor, Linguistics, UC Davis.
- Zoey Liu, PhD 2020. PhD advisor, Linguistics, UC Davis.
- Justin Garten, PhD 2018. PhD advisor (with Morteza Dehghani).
- Vincent Hellendoorn, PhD 2020. Qualifying committee and dissertation committee.
- Qing Dou, PhD 2015. Proposal committee and thesis committee.
- Christopher Weinberg, PhD. Proposal committee.
- Himanshu Joshi, PhD 2018. Proposal committee.
- Shannon Lubetich, Pomona College undergraduate. Internship supervisor. (Third place in the 2015 ACM student research competition)
- Arne Kohn, PhD, University of Hamburg. Internship supervisor.
- Jay Whang, USC undergraduate. Directed research supervisor.
- More than 10 master’s students supervised in directed research at USC.
Funding Received
- NSF CICI SSC: TrOnto - A community-based ontology for a trustworthy scientific cyberspace (co-PI). 2018 - 2020. $600k
- Navy: Soliloquy, STTR subcontract from Soartech (PI of UC Davis subcontract). 2017 - 2018. $60k
- UC Davis faculty senate grant: COWS-L2H (co-PI). $20k
- ARO: Natural Language Graphs (PI). 2014 - 2015. $200k
- ARO: Always Available Support Agents (co-PI). 2014 - 2017. $800k
- ARO: Automatic Analysis of Discourse Structure (PI). 2013 - 2014. $300k
- DARPA Narrative Networks: Culture-specific neurobiological models of the influence of narrative framing (co-PI). 2012 - 2015. $1.5M
- NSF IIS RI: Incremental speech processing for rapid dialogue (co-PI). 2012 - 2015. $490k.
- NSF IIS HCC: Modeling human communication dynamics (co-PI). 2011 - 2014. $490k.
- TATRC: Advancing speech recognition technology to support training with virtual humans. 2011 - 2013. $1.5M.
- Google faculty award: Linear-time dynamic programming for parsing and translation (co-PI). 2011 - 2012. $75k.
- ARO: Enabling rapid definition of sophisticated dialogue policies without programming (PI). 2010 - 2011. $100k.
Selected Publications
See my page on Google Scholar for a more complete list of publications (100+ peer reviewed papers).
Taiqi He, Megan A Boudewyn, John E Kiat, Kenji Sagae, Steven J Luck. (2022). Neural correlates of word representation vectors in natural language processing models: Evidence from representational similarity analysis of event-related brain potentials. Psychophysiology, 59(3).
Kenji Sagae. (2021). Tracking Child Language Development with Neural Network Language Models. Frontiers in Psychology, 12.
Dian Yu, Taiqi He and Kenji Sagae. (2021). Language embeddings for typology and cross-lingual transfer learning. Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021).
Dian Yu and Kenji Sagae. (2021). Automatically Exposing Problems with Neural Dialog Models. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021).
Justin Garten, Brendan Kennedy, Kenji Sagae and Morteza Dehghani. (2019). Measuring the importance of context when modeling language comprehension. Behavior Research Methods.
Casey Casalnuovo, Kenji Sagae, Prem Devanbu. (2018). Studying the difference between natural and programming language corpora. Empirical Software Engineering.
Garten, J., Kennedy, B., Hoover, J., Sagae, K., Dehghani, M. (2018). Incorporating Demographic Embeddings into Language Understanding. Cognitive Science.
Ashish Vaswani* and Kenji Sagae*. (2016). Efficient Structured Inference for Transition-Based Parsing with Neural Networks and Error States. Transactions of the Association for Computational Linguistics.
Ashish Vaswani, Yonatan Bisk, Kenji Sagae, Ryan Musa. (2016). Supertagging with LSTMs. Proceedings of NAACL.
Morteza Dehghani, Kenji Sagae, Sonya Sachdeva, Jonathan Gratch. (2014). Analyzing political rhetoric in conservative and liberal weblogs related to the construction of the Ground Zero Mosque. Journal of Information Technology and Politics, 11(1), 1-14.
S. Lubetich and K. Sagae. Data-driven measurement of child language development. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 2151–2160, 2014.
A. Kohn, U C. Lao, A. B. Zadeh, K. Sagae. Parsing Morphologically Rich Languages with (Mostly) Off-The-Shelf Software and Word Vectors. In Proceedings of the 2014 Shared Task of the COLING Workshop on Statistical Parsing of Morphologically Rich Languages. 2014.
K. Sagae, A. S. Gordon, M. Dehghani, M. Metke, J.S. Kim, S.I. Gimbel, C. Tipper, J. Kaplan and M.H. Immordino-Yang. A Data-Driven Approach for Classification of Subjectivity in Personal Narratives. In Proceedings of the Fourth Workshop on Computational Models of Narrative, OASICS OpenAccess Series in Informatics, Volume 32. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. 2013.
L. Huang and K. Sagae. Dynamic programming for linear-time incremental parsing. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), pages 1077–1086. 2010.
K. Sagae, E. Davis, A. Lavie, B. MacWhinney, and S. Wintner. Morphosyntactic annotation of CHILDES transcripts. Journal of Child Language, 37(03):705–729, 2010. DOI 10.1017/S0305000909990407. Copyright Cambridge University Press.
K. Sagae. Analysis of discourse structure with syntactic dependencies and data-driven shift-reduce parsing. In Proceedings of the 11th International Conference on Parsing Technologies (IWPT), pages 81–84. 2009.
D. DeVault, K. Sagae, and D. Traum. Incremental interpretation and prediction of utterance meaning for interactive dialogue. Dialogue & Discourse, 2(1):143–170, 2011.
Y. Miyao, K. Sagae, R. Saetre, T. Matsuzaki, and J. Tsujii. Evaluating contributions of natural language parsers to protein–protein interaction extraction. Bioinformatics, 25(3):394, 2009.
K. Sagae and J. Tsujii. Shift-reduce dependency DAG parsing. In Proceedings of the 22nd International Conference on Computational Linguistics (COLING) - Volume 1, pages 753–760. 2008.
K. Sagae, Y. Miyao, T. Matsuzaki, and J. Tsujii. Challenges in mapping of syntactic representations for framework-independent parser evaluation. Proceedings of the Workshop on Automated Syntactic Annotations for Interoperable Language Resources at the First International Conference on Global Interoperability for Language Resources (ICGL’08), 2008.
Y. Miyao, R. Saetre, K. Sagae, T. Matsuzaki, and J. Tsujii. Task-oriented evaluation of syntactic parsers and their representations. Proceedings of ACL-08: HLT, pages 46–54, 2008.
K. Sagae and J. Tsujii. Dependency parsing and domain adaptation with LR models and parser ensembles. Proceedings of the CoNLL shared task session of EMNLP-CoNLL, 7:1044–1050, 2007.
K. Sagae, Y. Miyao, and J. Tsujii. HPSG parsing with shallow dependency constraints. Proceedings of the 45th Meeting of the Association for Computational Linguistics (ACL’07). Prague, Czech Republic. 2007.
K. Sagae and A. Lavie. Parser combination by reparsing. In Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, pages 129–132. 2006.
K. Sagae and A. Lavie. A best-first probabilistic shift-reduce parser. In Proceedings of the COLING/ACL on Main conference poster sessions, pages 691–698. 2006.
K. Sagae, A. Lavie, and B. MacWhinney. Automatic measurement of syntactic development in child language. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 197–204. 2005.
K. Sagae and A. Lavie. A classifier-based parser with linear run-time complexity. In Proceedings of the Ninth International Workshop on Parsing Technology (IWPT), pages 125–132. 2005.
A. Lavie, K. Sagae, and S. Jayaraman. The significance of recall in automatic metrics for MT evaluation. Proceedings of the Sixth Conference of the Association for Machine Translation in the Americas (AMTA’04), pages 134–143, 2004.