Why this is here: 90% signal strength, source-backed, and published within the past week.
VQV Signal
NOISE
90% signal strength
CARE: Competence-Aware Reward Shaping for Adaptive Reasoning Length in Video-MLLMs
In multimodal video reasoning, reinforcement learning-based methods typically rely on simplistic and inflexible reasoning-length control strategies that fail to adapt to the model's evolving competence. This mismatch may suppress necessary exploration at earl...
Score 60
Source Type arxiv
Reposts 0
Topic Quality 43