Ontology Engineering

Why LLMs Are Terrible Ontology Engineers (And Why That's the Wrong Question)

Understanding the five structural anti-patterns that emerge when large language models build ontologies without guardrails – and how to fix them.

Dougal Watt
CEO & Founder,
Graph Research Labs
March 2026 · 5 min read

The Problem

Organisations investing in knowledge graphs typically face a significant bottleneck at the ontology engineering stage. Skilled ontology engineers are scarce – fewer than 700 people worldwide hold formal qualifications in ontology development – and the work is painstaking. It is therefore natural that teams are turning to Large Language Models to accelerate the process. However, the results in practice are consistently poor in ways that matter.

The issue is not that LLMs produce unusable output. The first few classes typically look reasonable, and the initial properties seem plausible. The problem is that without constraints, LLMs introduce structural defects that compound as the ontology grows – defects that are expensive to find and expensive to fix once they have propagated through downstream systems.

Understanding these failure modes is the first step toward making LLMs genuinely useful in ontology engineering workflows.

The Five Anti-Patterns

Anti-Pattern 01: Hierarchy Explosion

The first and most common anti-pattern is hierarchy explosion. LLMs are trained on text that is rich in classification, and they reproduce that instinct aggressively. For example, given a simple Patient class, an LLM will typically generate a deep tree of subclasses – PaediatricPatient, GeriatricPatient, OutPatient, InPatient – modelling distinctions that should live in properties, not in the class hierarchy itself. This results in rigid, over-specified structures that are difficult to maintain and nearly impossible to refactor without breaking dependent queries and applications.
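To make the fix concrete, here is a minimal sketch (the `Patient` class and its field names are illustrative, not drawn from any specific ontology): rather than minting `PaediatricPatient`, `GeriatricPatient`, `InPatient`, and `OutPatient` subclasses, keep a single `Patient` class and move the distinctions into property values.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    """One class; the former 'subclasses' become orthogonal properties."""
    patient_id: str
    age_band: str      # e.g. "paediatric", "adult", "geriatric"
    care_setting: str  # e.g. "inpatient", "outpatient"

# The four would-be subclasses fall out of two orthogonal properties,
# and adding a new distinction is a data change, not a schema change.
p = Patient("p-001", age_band="paediatric", care_setting="inpatient")
```

The same idea carries over to OWL: the distinctions live in datatype or object properties, so queries can still select paediatric inpatients without the hierarchy having to enumerate every combination.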

Anti-Pattern 02: Inconsistent Taxonomy

The second anti-pattern is inconsistent taxonomy. Without a style guide, an LLM will name one class PatientRecord, another patient_diagnosis, and a third DiagnosticResult – all within the same ontology. The inconsistency reflects the model drawing on different training sources with different conventions, and the result is an ontology that looks as though it were built by a committee that never met. In production systems, inconsistent naming increases the cognitive load on every developer and analyst who touches the graph.
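A style guide only helps if it is enforced mechanically. The sketch below (the conventions are assumptions, not a standard – PascalCase for classes, camelCase for properties) shows how a simple linter catches exactly the mixed naming described above.

```python
import re

# Assumed conventions: classes in PascalCase, properties in camelCase.
CLASS_RE = re.compile(r"^[A-Z][a-zA-Z0-9]*$")
PROPERTY_RE = re.compile(r"^[a-z][a-zA-Z0-9]*$")

def lint_names(classes, properties):
    """Return every name that violates the naming conventions."""
    bad = [c for c in classes if not CLASS_RE.match(c)]
    bad += [p for p in properties if not PROPERTY_RE.match(p)]
    return bad

violations = lint_names(
    classes=["PatientRecord", "patient_diagnosis", "DiagnosticResult"],
    properties=["hasDiagnosis", "Admission_Date"],
)
# patient_diagnosis and Admission_Date fail the checks
```

Run over every LLM proposal, a check like this turns "looks as though it were built by a committee" into an automated rejection rather than a review finding.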

Anti-Pattern 03: Property Sprawl

Third, LLMs exhibit uncontrolled property growth. Every time an LLM is asked to elaborate on a class, it adds properties. It rarely consolidates, reuses, or questions whether a property already exists elsewhere in the model. The result is property sprawl – dozens of overlapping attributes with subtly different names and no clear governance. For example, a model might contain both dateOfAdmission and admissionDate on related classes, each with slightly different range constraints, creating ambiguity for any downstream consumer.
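Near-duplicate properties like `dateOfAdmission` and `admissionDate` can be caught automatically. This sketch (the tokenisation rules and stopword list are assumptions) normalises each property name to a bag of words and groups names that normalise identically.

```python
import re
from collections import defaultdict

STOPWORDS = {"of", "the", "a"}  # connective words ignored when comparing

def tokens(name: str) -> frozenset:
    """Split camelCase/snake_case into a lowercase bag of word tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    words = re.findall(r"[A-Za-z][a-z0-9]*", spaced)
    return frozenset(w.lower() for w in words) - STOPWORDS

def find_overlaps(properties):
    """Group property names whose token bags collide."""
    groups = defaultdict(list)
    for p in properties:
        groups[tokens(p)].append(p)
    return [g for g in groups.values() if len(g) > 1]

overlaps = find_overlaps(["dateOfAdmission", "admissionDate", "dischargeDate"])
# dateOfAdmission and admissionDate normalise to the same bag {date, admission}
```

It will not catch true synonyms (`startDate` vs `commencementDate`), but it removes the cheapest class of sprawl before a human reviewer ever sees the proposal.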

Anti-Pattern 04: Concept–Instance Confusion

Fourth, and perhaps most damaging, is concept–instance confusion. LLMs frequently blur the line between a class (a type of thing) and an instance (a specific thing). They will model "Paracetamol" as a class rather than an instance of Medication, or create a class called January2025Report when what they mean is an instance of MonthlyReport. This kind of error propagates through the entire ontology and is expensive to correct later, because dependent SHACL shapes, SPARQL queries, and application logic all need to be revised.
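Some of these confusions are detectable by name alone. The heuristic below (the patterns and the idea of flagging by name are assumptions, not a complete solution) flags proposed class names that embed years or month names – a strong hint that the LLM is naming an individual, not a type.

```python
import re

# A class name containing a year or a month name usually denotes a
# specific thing (an instance), not a type: January2025Report is an
# instance of MonthlyReport, not a class in its own right.
INSTANCE_HINT = re.compile(
    r"\d{4}"
    r"|(January|February|March|April|May|June|July|August"
    r"|September|October|November|December)"
)

def looks_like_instance(class_name: str) -> bool:
    """True if the proposed class name smells like an individual."""
    return bool(INSTANCE_HINT.search(class_name))

flagged = [c for c in ["MonthlyReport", "January2025Report", "Medication"]
           if looks_like_instance(c)]
```

A heuristic like this cannot decide the harder cases ("Paracetamol" carries no such marker), but it catches the mechanical ones before they reach SHACL shapes and queries.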

Anti-Pattern 05: Modelling by Association

The fifth anti-pattern is modelling by association. LLMs generate ontology structures based on statistical associations in their training data, not on principled domain analysis. This means they create relationships that feel plausible but are semantically incorrect – connecting concepts because they frequently co-occur in text, not because there is a genuine ontological relationship between them. For example, an LLM might create a direct relationship between Patient and Insurance because these terms co-occur frequently, when the actual domain model requires an intermediary Claim or Coverage class.
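One defence is to make relationships opt-in rather than opt-out. In this sketch (the permitted edges are invented for illustration), the LLM may only propose relationships a human designer has declared, so the plausible-but-wrong direct `Patient`→`Insurance` edge is rejected in favour of the mediated path through `Claim`.

```python
# Human-authored whitelist of permitted relationships.
# The direct Patient -> InsurancePolicy edge is deliberately absent:
# the domain model routes it through an intermediary Claim.
ALLOWED_EDGES = {
    ("Patient", "files", "Claim"),
    ("Claim", "coveredBy", "InsurancePolicy"),
}

def validate_edge(subject, predicate, obj):
    """Accept only relationships the designer has declared."""
    return (subject, predicate, obj) in ALLOWED_EDGES

ok = validate_edge("Patient", "files", "Claim")
bad = validate_edge("Patient", "insuredBy", "InsurancePolicy")
```

The whitelist flips the burden of proof: co-occurrence in training text is no longer enough for an edge to enter the model.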

So Are LLMs Useless for Ontology Work?

These five anti-patterns are not edge cases. In our experience, unconstrained LLM generation produces all five within any ontology of moderate complexity. However, they are not inherent limitations of LLMs – they are consequences of using LLMs without the right engineering around them.

The problem is not the model. It is the absence of guardrails. An LLM with no constraints produces ontology structures the same way it produces everything else: fluently, confidently, and without any internal quality control. It does not know your naming conventions. It has not read your style guide. It has no awareness of your existing ontology's patterns and invariants.

However, when every LLM interaction in your ontology workflow is governed by modelling patterns, validated against style rules, and constrained by the structures you have already built, the results change dramatically. In our own workflows, we have reduced ontology review cycles by over 60% while maintaining conformance with enterprise modelling standards – because the LLM is no longer inventing structure; it is following structure that a human designer has defined.
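The governing idea can be sketched in a few lines (check names and the proposal shape are illustrative): every LLM-proposed change passes through a pipeline of guardrail checks, and nothing enters the ontology until all of them pass.

```python
import re

def guarded_accept(proposal, validators):
    """Accept a proposal only if every guardrail check passes."""
    errors = [msg for check in validators for msg in check(proposal)]
    return (len(errors) == 0, errors)

# One example guardrail: an assumed PascalCase convention for class names.
def pascal_case_check(proposal):
    return [f"bad class name: {c}"
            for c in proposal.get("classes", [])
            if not re.match(r"^[A-Z][a-zA-Z0-9]*$", c)]

ok, errors = guarded_accept(
    {"classes": ["PatientRecord", "patient_diagnosis"]},
    [pascal_case_check],
)
```

In practice the validator list would also carry the sprawl, instance-name, and relationship-whitelist checks, so each anti-pattern has a dedicated gate.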

That is the methodology I will be teaching at the Knowledge Graph Conference in New York this May. It is a practical, hands-on approach to AI-assisted ontology engineering that transforms LLMs from unreliable generators into controlled, effective modelling assistants within professional ontology development workflows.

The anti-patterns are real. However, they are solvable – once you stop expecting the LLM to be the engineer, and start treating it as a tool that needs engineering around it.

The Five Anti-Patterns

  • 01 · Hierarchy Explosion – Over-classifying into rigid subclass trees
  • 02 · Inconsistent Taxonomy – Mixed naming conventions across classes
  • 03 · Property Sprawl – Duplicate, overlapping attributes
  • 04 · Concept–Instance Confusion – Blurring types and specific things
  • 05 · Modelling by Association – Statistical co-occurrence, not semantics