In your opinion, where might the boundary lie? If multi-modal LLMs can build and prune a 'dictionary' of 'concepts,' does that sidestep the Thai Library conjecture? I'm not arguing that a model expanding a collection of feature vectors == human learning, but I'm curious where you see that differing from grounding.