Chinese startup Z.ai has launched GLM-5.2, a powerful AI model for complex coding projects. This new large language model ...
It allows engineering teams to host frontier-level AI on their own sovereign infrastructure, entirely eliminating vendor lock ...
2d · on MSN · Opinion
Multilingual benchmark evaluates how well AI interprets clinical text and health records in nine languages
Researchers at Mass General Brigham recently developed BRIDGE, a multilingual benchmark that evaluates how well large ...
A team of nine researchers at Sina Weibo has introduced VibeThinker-3B, a compact language model that reportedly matches or ...
What if the tools we trust to measure progress are actually holding us back? In the rapidly evolving world of large language models (LLMs), AI benchmarks and leaderboards have become the gold standard ...
There is a temptation, when AI systems begin to outperform human baselines on established tests, to interpret this as a sign ...
Have you ever wondered why off-the-shelf large language models (LLMs) sometimes fall short of delivering the precision or context you need for your specific application? Whether you’re working in a ...
B, a 3-billion-parameter AI model, is challenging OpenAI, Google and DeepSeek on math and coding benchmarks while reigniting ...
A multilingual benchmark of 1,886 vaccine-related questions found that large language models answered most items accurately ...