Ontology Engineering

Why LLMs Are Terrible Ontology Engineers (And Why That's the Wrong Question)

Understanding the five structural anti-patterns that emerge when large language models build ontologies without guardrails – and how to fix them.

Dougal Watt
CEO & Founder,
Graph Research Labs
March 2026 · 5 min read

The Problem

Organisations investing in knowledge graphs typically face a significant bottleneck at the ontology engineering stage. Skilled ontology engineers are scarce – fewer than 700 people worldwide hold formal qualifications in ontology development – and the work is painstaking. It is therefore natural that teams are turning to Large Language Models to accelerate the process. However, the results in practice are consistently poor in ways that matter.

The issue is not that LLMs produce unusable output. The first few classes typically look reasonable, and the initial properties seem plausible. The problem is that without constraints, LLMs introduce structural defects that compound as the ontology grows – defects that are expensive to find and expensive to fix once they have propagated through downstream systems.

Understanding these failure modes is the first step toward making LLMs genuinely useful in ontology engineering workflows.

The Five Anti-Patterns

Anti-Pattern 01: Hierarchy Explosion

The first and most common anti-pattern is hierarchy explosion. LLMs are trained on text that is rich in classification, and they reproduce that instinct aggressively. For example, given a simple Patient class, an LLM will typically generate a deep tree of subclasses – PaediatricPatient, GeriatricPatient, OutPatient, InPatient – modelling distinctions that should live in properties, not in the class hierarchy itself. This results in rigid, over-specified structures that are difficult to maintain and nearly impossible to refactor without breaking dependent queries and applications.
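To make the fix concrete, here is a minimal sketch (the `Patient` class and its field names are illustrative, not drawn from any specific ontology): rather than minting `PaediatricPatient`, `GeriatricPatient`, `InPatient`, and `OutPatient` subclasses, keep a single `Patient` class and move the distinctions into property values.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    """One class; the former 'subclasses' become orthogonal properties."""
    patient_id: str
    age_band: str      # e.g. "paediatric", "adult", "geriatric"
    care_setting: str  # e.g. "inpatient", "outpatient"

# The four would-be subclasses fall out of two orthogonal properties,
# and adding a new distinction is a data change, not a schema change.
p = Patient("p-001", age_band="paediatric", care_setting="inpatient")
```

The same idea carries over to OWL: the distinctions live in datatype or object properties, so queries can still select paediatric inpatients without the hierarchy having to enumerate every combination.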

Anti-Pattern 02: Inconsistent Taxonomy

The second anti-pattern is inconsistent taxonomy. Without a style guide, an LLM will name one class PatientRecord, another patient_diagnosis, and a third DiagnosticResult – all within the same ontology. The inconsistency reflects the model drawing on different training sources with different conventions, and the result is an ontology that looks as though it were built by a committee that never met. In production systems, inconsistent naming increases the cognitive load on every developer and analyst who touches the graph.
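A style guide only helps if it is enforced mechanically. The sketch below (the conventions are assumptions, not a standard – PascalCase for classes, camelCase for properties) shows how a simple linter catches exactly the mixed naming described above.

```python
import re

# Assumed conventions: classes in PascalCase, properties in camelCase.
CLASS_RE = re.compile(r"^[A-Z][a-zA-Z0-9]*$")
PROPERTY_RE = re.compile(r"^[a-z][a-zA-Z0-9]*$")

def lint_names(classes, properties):
    """Return every name that violates the naming conventions."""
    bad = [c for c in classes if not CLASS_RE.match(c)]
    bad += [p for p in properties if not PROPERTY_RE.match(p)]
    return bad

violations = lint_names(
    classes=["PatientRecord", "patient_diagnosis", "DiagnosticResult"],
    properties=["hasDiagnosis", "Admission_Date"],
)
# patient_diagnosis and Admission_Date fail the checks
```

Run over every LLM proposal, a check like this turns "looks as though it were built by a committee" into an automated rejection rather than a review finding.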

Anti-Pattern 03: Property Sprawl

Third, LLMs exhibit uncontrolled property growth. Every time an LLM is asked to elaborate on a class, it adds properties. It rarely consolidates, reuses, or questions whether a property already exists elsewhere in the model. The result is property sprawl – dozens of overlapping attributes with subtly different names and no clear governance. For example, a model might contain both dateOfAdmission and admissionDate on related classes, each with slightly different range constraints, creating ambiguity for any downstream consumer.
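Near-duplicate properties like `dateOfAdmission` and `admissionDate` can be caught automatically. This sketch (the tokenisation rules and stopword list are assumptions) normalises each property name to a bag of words and groups names that normalise identically.

```python
import re
from collections import defaultdict

STOPWORDS = {"of", "the", "a"}  # connective words ignored when comparing

def tokens(name: str) -> frozenset:
    """Split camelCase/snake_case into a lowercase bag of word tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    words = re.findall(r"[A-Za-z][a-z0-9]*", spaced)
    return frozenset(w.lower() for w in words) - STOPWORDS

def find_overlaps(properties):
    """Group property names whose token bags collide."""
    groups = defaultdict(list)
    for p in properties:
        groups[tokens(p)].append(p)
    return [g for g in groups.values() if len(g) > 1]

overlaps = find_overlaps(["dateOfAdmission", "admissionDate", "dischargeDate"])
# dateOfAdmission and admissionDate normalise to the same bag {date, admission}
```

It will not catch true synonyms (`startDate` vs `commencementDate`), but it removes the cheapest class of sprawl before a human reviewer ever sees the proposal.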

Anti-Pattern 04: Concept–Instance Confusion

Fourth, and perhaps most damaging, is concept–instance confusion. LLMs frequently blur the line between a class (a type of thing) and an instance (a specific thing). They will model "Paracetamol" as a class rather than an instance of Medication, or create a class called January2025Report when what they mean is an instance of MonthlyReport. This kind of error propagates through the entire ontology and is expensive to correct later, because dependent SHACL shapes, SPARQL queries, and application logic all need to be revised.
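Some of these confusions are detectable by name alone. The heuristic below (the patterns and the idea of flagging by name are assumptions, not a complete solution) flags proposed class names that embed years or month names – a strong hint that the LLM is naming an individual, not a type.

```python
import re

# A class name containing a year or a month name usually denotes a
# specific thing (an instance), not a type: January2025Report is an
# instance of MonthlyReport, not a class in its own right.
INSTANCE_HINT = re.compile(
    r"\d{4}"
    r"|(January|February|March|April|May|June|July|August"
    r"|September|October|November|December)"
)

def looks_like_instance(class_name: str) -> bool:
    """True if the proposed class name smells like an individual."""
    return bool(INSTANCE_HINT.search(class_name))

flagged = [c for c in ["MonthlyReport", "January2025Report", "Medication"]
           if looks_like_instance(c)]
```

A heuristic like this cannot decide the harder cases ("Paracetamol" carries no such marker), but it catches the mechanical ones before they reach SHACL shapes and queries.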

Anti-Pattern 05: Modelling by Association

The fifth anti-pattern is modelling by association. LLMs generate ontology structures based on statistical associations in their training data, not on principled domain analysis. This means they create relationships that feel plausible but are semantically incorrect – connecting concepts because they frequently co-occur in text, not because there is a genuine ontological relationship between them. For example, an LLM might create a direct relationship between Patient and Insurance because these terms co-occur frequently, when the actual domain model requires an intermediary Claim or Coverage class.
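One defence is to make relationships opt-in rather than opt-out. In this sketch (the permitted edges are invented for illustration), the LLM may only propose relationships a human designer has declared, so the plausible-but-wrong direct `Patient`→`Insurance` edge is rejected in favour of the mediated path through `Claim`.

```python
# Human-authored whitelist of permitted relationships.
# The direct Patient -> InsurancePolicy edge is deliberately absent:
# the domain model routes it through an intermediary Claim.
ALLOWED_EDGES = {
    ("Patient", "files", "Claim"),
    ("Claim", "coveredBy", "InsurancePolicy"),
}

def validate_edge(subject, predicate, obj):
    """Accept only relationships the designer has declared."""
    return (subject, predicate, obj) in ALLOWED_EDGES

ok = validate_edge("Patient", "files", "Claim")
bad = validate_edge("Patient", "insuredBy", "InsurancePolicy")
```

The whitelist flips the burden of proof: co-occurrence in training text is no longer enough for an edge to enter the model.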

So Are LLMs Useless for Ontology Work?

These five anti-patterns are not edge cases. In our experience, unconstrained LLM generation produces all five within any ontology of moderate complexity. However, they are not inherent limitations of LLMs – they are consequences of using LLMs without the right engineering around them.

The problem is not the model. It is the absence of guardrails. An LLM with no constraints produces ontology structures the same way it produces everything else: fluently, confidently, and without any internal quality control. It does not know your naming conventions. It has not read your style guide. It has no awareness of your existing ontology's patterns and invariants.

However, when every LLM interaction in your ontology workflow is governed by modelling patterns, validated against style rules, and constrained by the structures you have already built, the results change dramatically. In our own workflows, we have reduced ontology review cycles by over 60% while maintaining conformance with enterprise modelling standards – because the LLM is no longer inventing structure; it is following structure that a human designer has defined.
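The governing idea can be sketched in a few lines (check names and the proposal shape are illustrative): every LLM-proposed change passes through a pipeline of guardrail checks, and nothing enters the ontology until all of them pass.

```python
import re

def guarded_accept(proposal, validators):
    """Accept a proposal only if every guardrail check passes."""
    errors = [msg for check in validators for msg in check(proposal)]
    return (len(errors) == 0, errors)

# One example guardrail: an assumed PascalCase convention for class names.
def pascal_case_check(proposal):
    return [f"bad class name: {c}"
            for c in proposal.get("classes", [])
            if not re.match(r"^[A-Z][a-zA-Z0-9]*$", c)]

ok, errors = guarded_accept(
    {"classes": ["PatientRecord", "patient_diagnosis"]},
    [pascal_case_check],
)
```

In practice the validator list would also carry the sprawl, instance-name, and relationship-whitelist checks, so each anti-pattern has a dedicated gate.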

That is the methodology I will be teaching at the Knowledge Graph Conference in New York this May. It is a practical, hands-on approach to AI-assisted ontology engineering that transforms LLMs from unreliable generators into controlled, effective modelling assistants within professional ontology development workflows.

The anti-patterns are real. However, they are solvable – once you stop expecting the LLM to be the engineer, and start treating it as a tool that needs engineering around it.

The Five Anti-Patterns

  • 01 · Hierarchy Explosion – Over-classifying into rigid subclass trees
  • 02 · Inconsistent Taxonomy – Mixed naming conventions across classes
  • 03 · Property Sprawl – Duplicate, overlapping attributes
  • 04 · Concept–Instance Confusion – Blurring types and specific things
  • 05 · Modelling by Association – Statistical co-occurrence, not semantics