There is a habit, when you have read enough Jour-fixe-Protokolle, of reaching for a pencil. Not to write, but to underline. The same words keep returning: Prüfgegenstand, Frist, Aufgabe, Gewerk, Projektsteuerer. They look like ordinary German nouns, but they are not. In a Pre-Construction office they are roles in a small, dense grammar, closer to the protocol of an old guild than to anything in a dictionary. The job of an ontology is to take that grammar seriously: to give the words formal types, fix the relations between them, and refuse the polite fiction that a paragraph of prose can stand in for a model.[1]
This note is the fourth iteration of that work. v0.1 was an afternoon's whiteboard sketch in late 2024. v0.3 — released internally last December — survived eight months of contact with real reports before three particular edge cases told us, plainly, that we had drawn the wrong boundary. v0.4 redraws it.
What we are modelling
The corpus is, on the surface, deceptively boring: the day-to-day work product of a Projektsteuerer's office — Jour-fixe-Protokolle, Vergabevermerke, Terminpläne, Pflichtenhefte, Bemusterungsprotokolle, and the Bauprotokolle that come back from site. About forty thousand documents, each between two and twenty pages, mostly written in a flattened bureaucratic German that has more in common with legal pleadings than with architectural prose.
What makes them interesting is not their language but their structure of obligation. Almost every paragraph either creates, transfers, or discharges a duty. An item under review (Prüfgegenstand) appears, gets assigned to a trade (Gewerk), gains a deadline (Frist), and is later either ticked off or escalated. Tracking the flow of those obligations across documents — from a Pre-Con Jour-fixe through to the Bauprotokoll and back — is the entire game.
[Figure: schema graph. Caption, truncated in the source: "… Mangel → Document) directed. Subtypes of Aufgabe (Behebung, Prüfung, Abnahme) collapse into a single node here for readability."]
What broke in v0.3
v0.3 was a fifteen-class taxonomy with a clean inheritance tree. It was our pride for two months and our enemy for the next six. Three things broke it.
1. The fiction of a single deadline. A Frist in the wild is rarely a date. It is a small object with three plausible interpretations: the calendar deadline, the contractually binding deadline, and the deadline the Projektsteuerer actually means when she writes "bis nächste Woche" ("by next week") in the Jour-fixe minutes. v0.3 collapsed all three into a datetime field. v0.4 promotes Frist to a first-class entity with its own provenance.[2]
2. Composite trades. A Gewerk is not a leaf. Roughly one defect in four involves two trades blaming each other across an interface; the classic example is TGA-Planer and Rohbau arguing about a duct penetration. v0.3 forced one assignment per defect. v0.4 lets a Mangel attach to an arbitrary set of Gewerke, with a typed responsibility edge that distinguishes verursacht (caused), beteiligt (involved), and betroffen (affected).
3. Documents are not containers. v0.3 modelled a Jour-fixe-Protokoll as a bag of paragraphs with a date stamp. Useful, until you discover that ten percent of all open items in the corpus are first mentioned in an email or a Vergabevermerk and only later cross-referenced into a protocol. The document graph is a network, not a tree, and the entities float above it.
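The second and third changes above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class and field names (`Responsibility`, `Mention`, `Mangel.gewerke`, `Mangel.mentions`) and all identifiers and dates in the example are hypothetical, and the production schema is OWL, not Python. The point is the shape: a Mangel carries a set of typed edges to Gewerke, and it keeps a list of mentions across documents instead of living inside one protocol.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Dict, List, Optional, Set


class Responsibility(Enum):
    VERURSACHT = "verursacht"  # caused
    BETEILIGT = "beteiligt"    # involved
    BETROFFEN = "betroffen"    # affected


@dataclass(frozen=True)
class Mention:
    """One appearance of an item in one document (protocol, email, Vergabevermerk)."""
    document_id: str
    document_kind: str
    mentioned_on: date


@dataclass
class Mangel:
    label: str
    # Gewerk name -> responsibility types on the edge to that Gewerk
    gewerke: Dict[str, Set[Responsibility]] = field(default_factory=dict)
    # The entity sits above the documents: it records every mention,
    # rather than belonging to a single protocol.
    mentions: List[Mention] = field(default_factory=list)

    def attach(self, gewerk: str, role: Responsibility) -> None:
        """Add a typed responsibility edge from this Mangel to a Gewerk."""
        self.gewerke.setdefault(gewerk, set()).add(role)

    def with_role(self, role: Responsibility) -> Set[str]:
        """All Gewerke connected to this Mangel by the given edge type."""
        return {g for g, roles in self.gewerke.items() if role in roles}

    def first_mention(self) -> Optional[Mention]:
        """Earliest mention across all documents, or None if there is none."""
        return min(self.mentions, key=lambda m: m.mentioned_on, default=None)


# Hypothetical data: an item first raised by email, later cross-referenced in a protocol.
mangel = Mangel("Duct penetration at trade interface")
mangel.attach("TGA-Planer", Responsibility.VERURSACHT)
mangel.attach("Rohbau", Responsibility.BETEILIGT)
mangel.mentions.append(Mention("JF-14", "Jour-fixe-Protokoll", date(2026, 4, 14)))
mangel.mentions.append(Mention("mail-0312", "E-Mail", date(2026, 4, 2)))
```

Keeping `gewerke` as a mapping to a set of roles allows one trade to hold more than one role on the same defect, which a single assignee field cannot express.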
"Most ontology errors are not errors of taxonomy. They are errors of cardinality — of pretending one of something exists where in fact there are several, in flux, only loosely agreed." — Working note, v0.3 retrospective, Feb 2026
The v0.4 contract
The schema is intentionally small. Seven core types, fourteen edge labels, and a handful of literal-typed properties. The decision rule we landed on, after several false starts, is this: if a thing is ever the subject or object of a sentence in a Projektsteuerer's protocol, it gets a node; if it is only ever a modifier, it stays a property. That rule sounds obvious in writing. It is not — it took six months and three ontologists arguing in a Munich basement to find it.
What follows is the canonical rendering of the schema. We use a stripped-down Turtle-like notation; the production schema is in OWL with a SHACL validation layer.
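# The namespace bound to ":" is a placeholder (assumption); the production IRI is not given in this note.
@prefix :     <http://example.org/precon#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Core classes
:Project a owl:Class ;
    rdfs:label "Bauprojekt" .
:Document a owl:Class ;
    rdfs:subClassOf :Artefact .
:Mangel a owl:Class ;
    rdfs:label "Defect / Beanstandung" .
:Aufgabe a owl:Class ;
    rdfs:label "Task / Pflicht" ;
    owl:disjointWith :Mangel .
:Frist a owl:Class ;
    rdfs:label "Deadline (typed)" .
:Gewerk a owl:Class ;
    rdfs:label "Trade / Discipline" .
:Agent a owl:Class ;
    rdfs:subClassOf foaf:Agent .

# Object properties (edges), each with its domain and range
:contains a owl:ObjectProperty ;
    rdfs:domain :Project ; rdfs:range :Document .
:recorded_in a owl:ObjectProperty ;
    rdfs:domain :Mangel ; rdfs:range :Document .
:assigned_to a owl:ObjectProperty ;
    rdfs:domain :Aufgabe ; rdfs:range :Gewerk .
:due a owl:ObjectProperty ;
    rdfs:domain :Aufgabe ; rdfs:range :Frist .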
Two things to note. First, :Mangel and :Aufgabe are disjoint. A defect is not a task; a task is the duty to remedy one. Conflating the two, as most off-the-shelf "construction" ontologies do, is the single most expensive modelling error we have made.[3] Second, :Frist is a class, not a literal. It carries provenance: who wrote the date, when, and against which calendar.
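What the disjointness axiom rules out can be shown with a small plain-Python check. This is a stand-in for illustration only: the production validation is described above as OWL with a SHACL layer, and the function name, the `DISJOINT` list, and the sample individuals (`m1`, `a1`, `x9`) are assumptions made up for this sketch.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Pairs of classes declared disjoint (here only the one pair named in the schema).
DISJOINT: List[FrozenSet[str]] = [frozenset({"Mangel", "Aufgabe"})]


def disjointness_violations(
    types: Dict[str, Set[str]]
) -> List[Tuple[str, FrozenSet[str]]]:
    """Return every individual that is typed with both members of a disjoint pair."""
    violations = []
    for individual, classes in sorted(types.items()):
        for pair in DISJOINT:
            if pair <= classes:  # both classes of the pair are asserted
                violations.append((individual, pair))
    return violations


# Hypothetical type assertions: x9 conflates a defect with the duty to remedy it.
types = {
    "m1": {"Mangel"},
    "a1": {"Aufgabe"},
    "x9": {"Mangel", "Aufgabe"},
}
print(disjointness_violations(types))
```

Only `x9` is reported, because it is the only individual asserted to be both a defect and a task.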
A small, real example
Consider one paragraph from a real Jour-fixe-Protokoll of a Projektsteuerer, lightly edited:
"BV Riemerschmidt, JF-14, 14.04.2026 — TOP 4.2: Im 3. OG ist die Brandschottung im Bereich der Lüftungstrasse Achse C/4 in der Ausführungsplanung nicht eindeutig festgelegt. Klärung durch TGA-Planer in Abstimmung mit Rohbau, Vorlage bis KW 18."
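In English: "BV Riemerschmidt, JF-14, 14 April 2026, agenda item 4.2: On the third upper floor, the fire-stopping around the ventilation route at axis C/4 is not unambiguously specified in the detailed design. To be clarified by the TGA-Planer (building-services planner) in coordination with Rohbau (structural shell); submission due by calendar week 18."
v0.3 would render this as a single Mangel with a string-typed deadline = "bis KW 18" and an assignee = "TGA-Planer". v0.4 renders it as a small subgraph: one Prüfgegenstand node ("Brandschottung Achse C/4"), recorded in the Jour-fixe-Protokoll JF-14 of 14 April 2026, attached to two Gewerke with different responsibility edges (verursacht: TGA-Planer, beteiligt: Rohbau), spawning a single Aufgabe ("Klärung Brandschottung, Vorlage Detail"), with a Frist entity carrying both the calendar interpretation (KW 18 of 2026, ending 03.05.2026) and the textual original ("bis KW 18").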
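The Frist in this example can be reproduced with the Python standard library. The sketch below is an assumption-laden illustration, not the production parser: the `Frist` fields, the `frist_from_text` helper, and the regular expression are hypothetical, and it assumes that "KW" follows ISO 8601 week numbering and that the year is taken from the protocol date.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Frist:
    source_text: str              # textual original, kept verbatim
    calendar_end: Optional[date]  # calendar interpretation, if one could be derived


def frist_from_text(text: str, year: int) -> Frist:
    """Build a Frist from protocol wording; resolve 'KW nn' to the Sunday of that ISO week."""
    match = re.search(r"\bKW\s*(\d{1,2})\b", text)
    if match is None:
        # No calendar week found: keep the wording, leave the date open.
        return Frist(source_text=text, calendar_end=None)
    week = int(match.group(1))
    # ISO weekday 7 is Sunday; fromisocalendar raises ValueError
    # for a week number that the given year does not have.
    return Frist(source_text=text, calendar_end=date.fromisocalendar(year, week, 7))


frist = frist_from_text("bis KW 18", 2026)
print(frist.calendar_end.isoformat())  # 2026-05-03
```

Keeping `source_text` next to the derived date means a later reader can always check the interpretation against what was actually written.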
This is more nodes than v0.3. It is also, by every retrieval metric we care about, dramatically more useful.[4]
What we did not solve
Three known unknowns, in order of how much they keep us up at night:
- Negation across documents. "Der Mangel besteht nicht mehr" ("the defect no longer exists") in protocol N+1 is not a deletion of the Mangel node from protocol N. It is a state transition. v0.4 has a temporal layer on top, but the layer is brittle.
- Implicit assignees. Roughly fifteen percent of defects in our corpus have no explicit responsible trade. A human reads the surrounding context and knows. The model, at the moment, guesses.
- Cross-project transfer. Two projects with the same Projektsteuerer use slightly different vocabularies for the same class of open item. We have no formal mechanism for aligning them; we currently do it by hand.
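The first item above, negation as a state transition and not a deletion, can be sketched as an append-only event log. This is not the v0.4 temporal layer, whose design is not described here; the names `MangelState`, `StateEvent`, and `state_at`, and the protocol numbers, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class MangelState(Enum):
    OPEN = "open"
    RESOLVED = "resolved"


@dataclass(frozen=True)
class StateEvent:
    mangel_id: str
    protocol_no: int      # sequence number of the protocol that reports the state
    state: MangelState


def state_at(
    events: List[StateEvent], mangel_id: str, protocol_no: int
) -> Optional[MangelState]:
    """State of a Mangel as of a given protocol; None if it has not been mentioned yet."""
    relevant = [
        e for e in events
        if e.mangel_id == mangel_id and e.protocol_no <= protocol_no
    ]
    if not relevant:
        return None
    return max(relevant, key=lambda e: e.protocol_no).state


# Nothing is deleted: protocol 15 adds a new event on top of protocol 14.
events = [
    StateEvent("m-1", 14, MangelState.OPEN),
    StateEvent("m-1", 15, MangelState.RESOLVED),  # "Der Mangel besteht nicht mehr"
]
print(state_at(events, "m-1", 14), state_at(events, "m-1", 15))
```

Because earlier events are never removed, a query as of protocol 14 still reports the defect as open, which a delete-on-negation model could not answer.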
v0.5 is already on the whiteboard. It will be smaller, again, in the parts that matter, and larger in the parts we keep hoping not to need.
— V. T., Munich, 22 April 2026.