NVIDIA Blackwell Leads on First Agentic AI Infrastructure Benchmark
AgentPerf from Artificial Analysis, the industry’s first agentic AI benchmark, gives developers, enterprises and infr...
Intelligence Terminal
Source-backed intelligence across model releases, research, policy, tools, funding, and companies.
24 signals found for "Benchmark"
AI agents have fundamentally changed the complexity of inference workloads. Until now, the industry has struggled to ...
Recent image generators have demonstrated impressive photorealism and instruction-following capabilities in single-im...
Railway, a San Francisco-based cloud platform that has quietly amassed two million developers without spending a doll...
For a quarter century, the Google search box has been one of the most recognizable interfaces in computing: a thin wh...
Production GraphRAG portfolio — knowledge graph platform (Neo4j, hybrid retrieval, GNN reranking, RAGAS-evaluated), R...
Privacy-first CLI for running local LLMs — discover, configure, run, benchmark
Research Proposal: MentalBench - unified benchmark evaluating LLMs on clinical knowledge, empathy, and safety simulta...
Large language model (LLM) agents have achieved strong performance on a wide range of benchmarks, yet most evaluation...
AI inference benchmarks on Intel Meteor Lake (Core Ultra 7 155H) iGPU — OpenVINO embeddings, OpenVINO GenAI LLM, and ...
Benchmark of LLMs as code agents (Correios CEP ETL challenge), inspired by Akita's methodology
Spatial reasoning, the ability to determine where objects are, how they relate, and how they move in 3D, remains a fu...
Tracking how fast glaciers are shrinking is crucial for measuring the pace of climate change and projecting future se...
The artificial intelligence coding revolution comes with a catch: it's expensive. Claude Code, Anthropic's terminal-b...
Elon Musk's net worth has passed the trillion-dollar mark after SpaceX's IPO. His net worth, which was hovering aroun...
Retrieval-augmented generation (RAG) has become a standard mechanism for grounding language models in external knowle...
This work presents RepWAM, a representation-centric world action model (WAM) built on representation visual-action to...
Goal-conditioned visual navigation requires a robot to act under partial observability by anticipating how its motion...
Amazon says its data centers use water more efficiently than the industry average and is urging others to improve as ...
Modern conversational agents condition on an ever-growing dialogue history at each turn, incurring redundant attentio...
General-purpose large language models (LLMs) are routinely used as baselines when evaluating specialized pathology mo...
Vision-Language-Action (VLA) models inherit semantic grounding from large-scale pretraining and perform competently a...
Long input sequences are central to document understanding and multi-step reasoning in Large Language Models, yet the...
Recently, large language models (LLMs) have achieved promising progress in the fields of classical Chinese translatio...