Publications
2025
- PiCSAR: Probabilistic Confidence Selection And Ranking for Reasoning Chains. arXiv preprint, 2025
- CoMAT: Chain of mathematically annotated thought improves mathematical reasoning. In Proceedings of the Empirical Methods in Natural Language Processing (EMNLP), 2025
- Theorem Prover as a Judge for Synthetic Data Generation. In Proceedings of the Association for Computational Linguistics (ACL), 2025
- Are We Done with MMLU? In Proceedings of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025