research Hugging Face (Krispin Wandel) 11d ago
ViT-Up: Faithful Feature Upsampling for Vision Transformers
Vision Transformers (ViTs) have become a dominant architecture for visual representation learning, providing exceptio...
latest Sihombing; Noel Matthew Imaniku; Raz; Hashfi Fauzan; Siahaan; Sandra Rosa Uli; Shinaya Yemia Retra; Rifa'i; Ahmad Dwi; Nasution; Muhammad Hikmal Duha 4d ago
Frontiers | Machine Learning Models for Predicting Postoperative Acute Kidney Injury in Pediatric Cardiac Surgery: A Systematic Review and Meta-Analysis
models arXiv (Miso Choi) 8d ago
The Truth Stays in the Family: Enhancing Contextual Grounding via Inherited Truthful Heads in Model Lineages
Recent advances in large language models (LLMs) have produced many specialized multimodal LLMs (MLLMs) that share com...
research IEEE Spectrum AI (Angelique Parashis) 3d ago
IEEE Rolls Out Large Language Models Virtual Training Course
Large language models have moved out of the research lab and into engineers’ daily workflow. LLMs serve as reasoning ...
research Hugging Face (Haoran You) 12d ago
HiLo-Token: Input-Adaptive High-Low Frequency Token Compression for Efficient Image Editing
Creative image editing tools, such as Photoshop's Remove or Generative Fill buttons, are central to everyday customer...
research Hugging Face (Jian Yang) 7d ago
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling
Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases l...
research Hugging Face (Jiwen Liu) 12d ago
OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data
Cloning camera motion from reference videos is an important task in video generation, as videos provide intuitive and...
companies Hugging Face (Chensheng Dai) 19d ago
RhymeFlow: Training-Free Acceleration for Video Generation with Asynchronous Denoising Flow Scheduling
Video generation models based on Diffusion Transformers (DiTs) have achieved remarkable performance in video synthesi...
models Hugging Face (Yalun Dai) 5d ago
S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence
Real-world spatial intelligence requires reasoning over a continuous and evolving 3D world, yet existing VLMs and too...
research Hugging Face (Paul Kassianik) 6d ago
FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines
Multi-step LLM pipelines fail through interactions among retrieval, reasoning, and formatting steps, so prompt-only o...
models Hugging Face (Filip Sondej) 8d ago
RepSelect: Robust LLM Unlearning via Representation Selectivity
Making large language models (LLMs) deeply forget specific knowledge and values without sacrificing general capabilit...
research Hugging Face (Haoqin Tu) 8d ago
VisualClaw: A Real-Time, Personalized Agent for the Physical World
Vision language models are serving as general-purpose interfaces for complex multimodal tasks. However, deployment st...
models Hugging Face (Mateusz Winiarek) 11d ago
LoSoNA: A Benchmark for Local Social Norm Adaptation in Group Conversations
Online group chats are social spaces with local conversational norms that are rarely stated explicitly. The ability a...
models Hugging Face (Dian Zheng) 12d ago
InterleaveThinker: Reinforcing Agentic Interleaved Generation
Recent image generators have demonstrated impressive photorealism and instruction-following capabilities in single-im...
models Hugging Face (Sanket Badhe) 21d ago
Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning
Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency an...
research Hugging Face (Chao Chen) 7d ago
From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning
Reinforcement learning pipelines for Large Language Model (LLM) training often rely on manually redesigned environmen...
research Hugging Face (Joshua Ong Jun Leang) 13d ago
Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation
Modern Lean theorem provers achieve strong performance only with substantial training and inference compute, driven i...
research Hugging Face (Abhishek Divekar) 20d ago
Statistically Reliable LLM-Based Ranking Evaluation via Prediction-Powered Inference
With PRECISE, we extended Prediction-Powered Inference to produce bias-corrected estimates of ranking evaluation metr...
research arXiv (Yaniv Livertovsky) 5d ago
Complementary Attention Head Pruning for Efficient Transformers
The remarkable success of Transformer-based models in natural language processing stems from architectural scaling, w...
research arXiv (Dantong Niu) 7d ago
T-Rex: Tactile-Reactive Dexterous Manipulation
The ability to react dynamically to tactile signals has long been considered crucial to agile human-level dexterity. ...
research arXiv (Timing Yang) 10d ago
RATS! Patches Talk Through Registers: Emergent Parts in Register Attention Transformers
When humans see a bird, they recognize far more than just "bird" -- they see a head, wings, and talons, a structured ...
policy Hugging Face (Sen Xu) 8d ago
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models
This technical report introduces VibeThinker-3B, a compact dense model with 3B parameters developed to investigate ho...
chips Hugging Face (Xunhao Lai) 12d ago
MiniMax Sparse Attention
Ultra-long-context capability is becoming indispensable for frontier LLMs: agentic workflows, repository-scale code r...
companies Hugging Face (Jaward Sesay) 8d ago
LectūraAgents: A Multi-Agent Framework for Adaptive Personalized AI-Assisted Learning and Embodied Teaching
Effective personalized AI-assisted learning demands systems that can not only generate accurate learner-specific educ...
research NVIDIA Developer Blog 6d ago
How to Optimize Transformer-Based Models for Low-Precision Training
Transformer architectures are the backbone of many modern large language and generative AI models. As these models gr...
open-source Hugging Face Blog 15d ago
The Open Source Community is backing OpenEnv for Agentic RL
safety Hugging Face Blog 18d ago
Nemotron 3.5 Content Safety: Customizable Multimodal Safety for Global Enterprise AI
safety Teoh; Jayden; Tomar; Manan; Ahn; Kwangjun; Hu; Edward S; Pearce; Tim; Sharma; Pratyusha; Krishnamurthy; Akshay; Islam; Riashat; Lamb; Alex; Langford; John 5d ago
Next-Latent Prediction Transformers Learn Compact World Models
funding Hugging Face Blog 10d ago
olmo-eval: An evaluation workbench for the model development loop
latest Hugging Face Blog 13d ago
Introducing North Mini Code: Cohere’s First Model For Developers