Contribute

Interested in contributing or formalizing your own paper?

To get started in formalizing your own paper, clone the repository and open an LLM agent tool (I use Codex with GPT 5.5 in xhigh thinking mode). Give the agent the paper link, and ask it to formalize the paper using the skill and workflow in the repository. (And please let me know what your experience is like).

New paper formalizations should start in a private workflow and be proposed to enter the library through a pull request when ready. For questions or contributions, contact ngarg@cornell.edu. See CONTRIBUTING.md for more detail.

Library: reusable components

The shared library contains paper-independent Lean infrastructure that current paper formalizations reuse.

Component Examples Lines of code
Foundations: finite math and graph tools Finite-sum rewrites, order/rank lemmas, threshold and interval characterizations, asymptotic/exponential estimates, and cycle extraction in finite directed graphs. 8,223
Foundations: probability and stochastic processes Finite distributions, conditional expectations, kernels, atom and variance lemmas, Gaussian and exponential calculations, stochastic dominance, Markov chains, MDPs, CTMCs, and renewal-reward identities. 50,319
Foundations: optimization and certificates Argmax and endpoint principles, finite-search certificates, approximation guarantees, linear-program certificates, binary-choice optimality, move-graph descent, and choice-equilibrium existence. 2,753
Rating-system applications Binary and ordinal signal models, posterior-mean ratings, bias/variance decompositions, prior-weighted updates, monotonicity/correction lemmas, and minimal bandit-regret interfaces. 502
Matching markets Preference profiles, blocking pairs and stable matchings, deferred-acceptance invariants, proposer incentives, and many-to-one admissions models. 3,418
Auctions and mechanisms Allocation and payment rules, dominant-strategy truthfulness, benchmark-competitive digital-goods auctions, VCG-style welfare maximization, and single-minded set-packing mechanisms. 12,347
Online algorithms and regret Online matching/allocation state machines, primal-dual accounting, competitive-ratio certificates, regret interfaces, and platform-learning abstractions. 4,372
Complexity abstractions Decision/search problem interfaces, reductions, NP/ZPP-style class consequences, Yao-style minimax wrappers, and explicit external-solver assumptions. 463
Applications: recommender systems Exposure and allocation policies, classwise fairness constraints, top-k recommendation surfaces, policy averaging, and accuracy/diversity trade-off statements. 2,714
Social choice, rankings, and fair division Ranking profiles, Kendall distance and Mallows-style sequential ranking models, score/payoff interfaces, envy graphs, bounded-envy allocations, and indivisible-goods fairness statements. 9,540

Current status

The public repository currently contains formalized papers and selected public partial formalizations whose remaining assumptions are explicit. Human review and LLM-as-judge statement translation refer to validation that the Lean statement corresponds to the corresponding mathematical statement in the paper.

Paper Status/Artifacts Full proof LOC Note Human review LLM-as-judge statement translation
College Admissions and the Stability of Marriage by D. Gale and L. S. Shapley; American Mathematical Monthly, 1962. Formalized / DAG 388 This only uses a few lines of code as its infrastructure has largely been elevated to the shared matching library. 0/7 reviewed 7/7 match
The Economics of Matching: Stability and Incentives by Alvin E. Roth; Mathematics of Operations Research, 1982. Formalized / DAG 8,890 0/27 reviewed 27/27 match
Competitive Auctions and Digital Goods by Andrew V. Goldberg, Jason D. Hartline, and Andrew Wright; SODA, 2001. Formalized / DAG 14,469 Formalizes the SODA paper; Theorem 8.2 uses the refined monotone-auction wording from the journal version Goldberg-Hartline-Karlin-Saks-Wright 2006. 0/18 reviewed 18/18 match
AdWords and Generalized Online Matching by Aranyak Mehta, Amin Saberi, Umesh Vazirani, and Vijay V. Vazirani; Journal of the ACM, 2007. Formalized / DAG 13,480 0/26 reviewed 26/26 match
Who is in Your Top Three? Optimizing Learning in Elections with Many Candidates by Nikhil Garg, Lodewijk Gelauff, Sukolsak Sakshuwong, Ashish Goel; HCOMP, 2019. Formalized / DAG 34,984 0/10 reviewed 10/10 match
Designing Informative Rating Systems: Evidence from an Online Labor Market by Nikhil Garg, Ramesh Johari; Manufacturing & Service Operations Management 23(3), 2020. Formalized / DAG 6,921 0/8 reviewed 7/8 match; 1 uncertain
Test-optional Policies: Overcoming Strategic Behavior and Informational Gaps by Zhi Liu and Nikhil Garg; EAAMO, 2021. Formalized / DAG 125,476 0/16 reviewed 16/16 match
Driver Surge Pricing by Nikhil Garg and Hamid Nazerzadeh; Management Science, 2022. Formalized / DAG 142,131 0/24 reviewed 24/24 match
User-item fairness tradeoffs in recommendations by Sophie Greenwood, Sudalakshmee Chiniah, and Nikhil Garg; NeurIPS, 2024. Formalized / DAG 45,906 0/18 reviewed 18/18 match
Addressing Discretization-Induced Bias in Demographic Prediction by Evan Dong, Aaron Schein, Yixin Wang, and Nikhil Garg; PNAS Nexus, 2025. Formalized / DAG 26,044 0/32 reviewed 30/32 match; 2 mismatch
Balancing Producer Fairness and Efficiency via Prior-Weighted Rating System Design by Thomas Ma, Michael S. Bernstein, Ramesh Johari, and Nikhil Garg; ICWSM, 2025. Formalized / DAG 601 Formalization required an interior-quality assumption (0 < q_v < 1) for the strict variance-decrease statement. 10/17 reviewed; 2 uncertain 17/17 match
Truth Revelation in Approximately Efficient Combinatorial Auctions by Daniel Lehmann, Liadan Ita O'Callaghan, and Yoav Shoham; Journal of the ACM, 2002. Partially formalized / DAG 7,436 Greedy approximation, truthfulness, and Theorem 6.1 reductions are formalized. Full formalization requires computational complexity results that are out of scope. 0/30 reviewed 30/30 match
On Approximately Fair Allocations of Indivisible Goods by Richard J. Lipton, Evangelos Markakis, Elchanan Mossel, and Amin Saberi; ACM EC, 2004. Partially formalized / DAG 80,264 Sections 2 and 4 are fully formalized. Section 3 has query/descent/rounded-search support. The PTAS/FPTAS runtime layer needs reusable fixed-dimension IP complexity infrastructure. 0/33 reviewed 33/33 match
Reconciling the Accuracy-Diversity Trade-off in Recommendations by Kenny Peng, Manish Raghavan, Emma Pierson, Jon Kleinberg, and Nikhil Garg; The ACM Web Conference, 2024. Partially formalized / DAG 48,376 Proposition 2's printed finite bound appears to miss a factor of 2; Lean proves the corrected finite bound, which is sufficient for the asymptotic 1/2-homogeneity result. Fully formalizing the remaining result, Proposition 4, requires a general Laplace-principle-related analysis library. 0/27 reviewed 27/27 match