Software developers working on complex, multi-file projects now have a new tool to evaluate after Microsoft released MAI-Code ...
Yann LeCun world model research crossed a milestone in May 2026: two preprints from his group formally proved when the JEPA ...
Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now A team of Abacus.AI, New York University, ...
OpenAI today detailed o3, its new flagship large language model for reasoning tasks. The model’s introduction caps off a 12-day product announcement series that started with the launch of a new ...
Google confirmed that Gemini 3.5 Pro, the most powerful model in its Gemini lineup, is already running inside the company and ...
Microsoft's new vulnerability-scanning system, codenamed MDASH, scored 88.45% on the CyberGym benchmark, surpassing ...
Datacurve's new DeepSWE benchmark puts GPT-5.5 ahead of Claude and challenges older AI coding rankings by arguing verifier design can distort results.