A new study exposes a fundamental weakness in today's AI systems: even the most capable multimodal language models can't handle basic visual tasks that toddlers master before they learn to speak.