An open-source model just claimed the top spot on SWE-Bench Pro, the benchmark that has become the de facto measuring stick for agentic software engineering.
PageIndex hit 98.7% accuracy on FinanceBench.