Thinking in Concepts: A New Frontier for AI
Can Large Concept Models make machines think more like humans?
When you’re about to give a talk or write an article, you probably start with an abstract idea of what you want to convey. You have scattered ideas, experiences, and perhaps favorite phrases floating around in your mind. Your thoughts might also shift between different languages and levels of abstraction. Your brain plans what you’re about to articulate.
Our brains work very differently from today’s large language models (LLMs), like ChatGPT. For an LLM, context matters, but only as statistics: simplified, these models predict the next word based on the context of all preceding words. While today’s language models show impressive results, they are not capable of reasoning and planning at a human level.
Enter: Large Concept Models
Researchers at Meta AI have been working on what they call large concept models, or LCMs [1]. I think this is a really cool concept to explore.
Their idea is to create models that move from working on the word level to a higher level of abstraction - concepts - independent of language and modality. These concepts correspond roughly to sentence-level meanings, expressed in an embedding space designed to carry semantic meaning without being tied to any specific language. The goal is to capture human-like reasoning at the level of ideas, not words.
An LCM operates not on sequences of tokens, but on sequences of sentence embeddings. Meta uses the SONAR embedding space, which covers text in roughly 200 languages and also supports speech [2]. This means an LCM can reason over content in one language and generate output in another - or even in another modality, like speech - without retraining. The architecture is explicitly hierarchical, mimicking how humans plan before articulating, which helps keep outputs consistent and interpretable.
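To make that data flow concrete, here is a minimal sketch in Python/PyTorch. It is not Meta’s code: `encode_sentences` and `decode_embedding` are hypothetical stand-ins for a SONAR-style encoder and decoder (here they only return placeholders), and the concept model is a toy Transformer that regresses the next sentence embedding.

```python
import torch
import torch.nn as nn

EMB_DIM = 1024  # SONAR sentence embeddings are 1024-dimensional

def encode_sentences(sentences: list[str]) -> torch.Tensor:
    """Hypothetical stand-in for a SONAR-style text encoder: maps each
    sentence to one fixed-size, language-agnostic vector (placeholder here)."""
    return torch.randn(len(sentences), EMB_DIM)

def decode_embedding(embedding: torch.Tensor, target_lang: str) -> str:
    """Hypothetical stand-in for a SONAR-style decoder: turns one concept
    vector back into a sentence in the requested language (placeholder here)."""
    return f"<generated sentence in {target_lang}>"

class ConceptModel(nn.Module):
    """Toy autoregressive model over *sentence* embeddings, not tokens."""
    def __init__(self, dim: int = EMB_DIM, layers: int = 2, heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=layers)
        self.head = nn.Linear(dim, dim)  # regress the next concept vector

    def forward(self, concepts: torch.Tensor) -> torch.Tensor:
        # concepts: (batch, n_sentences, dim) -> prediction of the next concept
        causal_mask = nn.Transformer.generate_square_subsequent_mask(concepts.size(1))
        hidden = self.backbone(concepts, mask=causal_mask)
        return self.head(hidden[:, -1])

# Pipeline: text -> concepts -> next concept -> text (in any supported language)
document = ["LCMs operate on sentences.", "Each sentence becomes one concept vector."]
concepts = encode_sentences(document).unsqueeze(0)   # (1, 2, 1024)
next_concept = ConceptModel()(concepts)              # (1, 1024)
print(decode_embedding(next_concept[0], target_lang="fra_Latn"))
```

The point of the sketch is the interface: everything between encoding and decoding happens on fixed-size concept vectors, so the output language (or modality) is chosen only at the final decoding step.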

Meta explored multiple implementations: from simple transformer-based models trained to minimize distance in embedding space, to more advanced diffusion-based models that treat generation as an iterative refinement process, closer to how humans might revise and flesh out abstract ideas. They also experimented with quantized versions of the model, bridging continuous and discrete representation worlds.
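As a rough illustration of how those two recipes differ, the sketch below contrasts a regression-style objective (predict the next embedding directly and minimize its distance to the target) with a simplified diffusion-style generation loop that starts from noise and iteratively refines a candidate embedding. The stand-in networks and the noise schedule are my own placeholder assumptions, not the paper’s.

```python
import torch
import torch.nn.functional as F

def regression_step(model, context, target):
    """Regression-style objective (schematic): directly predict the next
    sentence embedding and minimize its distance to the ground truth."""
    prediction = model(context)                # (batch, dim)
    return F.mse_loss(prediction, target)      # distance in embedding space

@torch.no_grad()
def diffusion_generate(denoiser, context, dim=1024, steps=10):
    """Diffusion-style generation (schematic): start from pure noise and
    iteratively refine it into a clean concept vector, given the context."""
    x = torch.randn(context.size(0), dim)      # noisy initial concept
    for t in reversed(range(steps)):
        x_clean = denoiser(x, context, t)      # denoiser's current best guess
        alpha = (t + 1) / steps                # toy schedule, not the paper's
        x = alpha * x + (1 - alpha) * x_clean  # move a bit toward the guess
    return x                                   # final refined concept vector

# Toy usage with stand-in networks (purely illustrative):
dummy_model = lambda ctx: ctx.mean(dim=1)
dummy_denoiser = lambda x, ctx, t: 0.5 * x + 0.5 * ctx.mean(dim=1)
context = torch.randn(2, 4, 1024)              # batch of 2, 4 concepts each
print(regression_step(dummy_model, context, torch.randn(2, 1024)))
print(diffusion_generate(dummy_denoiser, context).shape)  # torch.Size([2, 1024])
```

In the actual models the denoiser is a learned network conditioned on the timestep and the preceding concepts; the loop above only shows the shape of the computation.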
In experimental evaluations, these models showed promising zero-shot generalization across languages on tasks such as summarization and story generation. While LCMs still trail top LLMs, their modular, language-agnostic, and concept-driven nature makes them a compelling alternative paradigm.
What Could We Use LCMs For?
There are still very few practical implementations of LCMs. However, a somewhat speculative study by Ahmad and Goel explores potential applications across industries [3]. Some of the use cases they identify include:
Multilingual NLP: Summarize an English article and generate a French version without retraining (a minimal sketch of this idea follows the list). Translate low-resource languages using shared concepts instead of brittle token-level mappings.
Cybersecurity: Detect complex attack patterns by reasoning over logs and threat reports. Automate incident response summaries that preserve context.
Healthcare: Summarize long patient histories, translate discharge notes, and compare clinical trial data across studies and languages.
Education: Provide personalized feedback to students, summarize lectures, and support learning in multiple languages.
Scientific Research: Assist with literature reviews, connect ideas across disciplines, and help generate hypotheses.
Legal & Policy Analysis: Digest long contracts into briefs, compare laws across jurisdictions, and automate compliance monitoring.
Human-AI Collaboration: Co-write documents, maintain coherence across long texts, or generate idea-level suggestions during planning and editing.
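To illustrate the first item on the list (multilingual summarization without retraining), here is a short continuation of the earlier sketch. It reuses the hypothetical `encode_sentences`, `decode_embedding`, and `ConceptModel` stand-ins from above; the summarization loop is my own schematic, not a published recipe.

```python
import torch

# Continues the earlier sketch: encode_sentences, decode_embedding, and
# ConceptModel are the same hypothetical stand-ins defined above.

def summarize_concepts(concepts, lcm, n_summary_sentences=3):
    """Schematic summarization in concept space: the model emits a few
    summary concepts; no tokens and no target language are involved yet."""
    context, summary = concepts, []
    for _ in range(n_summary_sentences):
        nxt = lcm(context)                                     # (1, dim)
        summary.append(nxt)
        context = torch.cat([context, nxt.unsqueeze(1)], dim=1)
    return summary

article = ["LCMs reason over sentence embeddings.",
           "The embedding space is shared across languages.",
           "Decoding picks the output language at the very end."]
summary = summarize_concepts(encode_sentences(article).unsqueeze(0), ConceptModel())

# The same summary concepts, decoded into two languages, with no retraining:
print([decode_embedding(c[0], "eng_Latn") for c in summary])
print([decode_embedding(c[0], "fra_Latn") for c in summary])
```

Only the very last step mentions a language code, which is what makes the “generate a French version without retraining” scenario at least architecturally plausible.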
The study identifies over a dozen domains - from transportation to financial services to public safety - where LCMs could enhance understanding, decision-making, and communication.
Not Quite There (Yet)
Of course, there are fundamental limitations. Sentence-level concepts can be very coarse, especially in nuanced or multi-idea sentences. The SONAR embedding space may not always align with messy, real-world input. And while diffusion models shine in continuous domains like image and speech, they still struggle with generating clean, discrete language. Basically, top-tier LLMs perform a lot better than LCMs.
But the field is moving quickly, and some of these challenges feel more like growing pains than dead ends.
Final Thought
LCMs remind us that language isn't just about words—it’s about meaning. If we want AI that reasons, plans, and communicates more like we do, we must rethink how it represents and processes information.
LCMs aren’t just a new architecture — they’re a new way of thinking about intelligence. If this research continues to mature, it could begin a shift from language models to idea models.
References
[1] Meta LCM Team. Large Concept Models: Language Modeling in a Sentence Representation Space. Meta AI, 2024.
[2] Paul-Ambroise Duquenne, Holger Schwenk, Benoît Sagot. SONAR: Sentence-Level Multimodal and Language-Agnostic Representations. Meta AI, 2023.
[3] Hussain Ahmad, Diksha Goel. The Future of AI: Exploring the Potential of Large Concept Models. CSIRO & University of Adelaide, 2025.