SpecLynx ApiDOM - The Data Model

A parser that hands back generic objects and arrays only gets you halfway. The moment your tool reaches for an operation, a schema, or a server URL, it's back to matching strings against keys — the work the parser refused to do.

ApiDOM hands back something different: a tree that already knows what every piece of it represents. An operation is an OperationElement. A schema is a SchemaElement. A server URL isn't just a string — it's a string tagged as a URL, in a namespace, with source positions attached.

This chapter is the shape of that tree.

Three principles

The whole data model is built around three ideas. They show up everywhere in the rest of this chapter, so it's worth naming them up front.

Uniform shape

Every node is an element, and every element has the same four parts. A traversal that handles one element handles them all — no special cases hiding in the corners.

Semantic typing

Every element answers what it represents, not just what kind of value it carries. Two strings that look identical in JSON can mean very different things — a server URL and a description are both quoted text, but they're not the same element. Tools can ask "give me every OperationElement" instead of "find every node whose parent key matches an HTTP method" — the parser already did that classification.

Clean separation

The value of the document and ApiDOM's bookkeeping never mix. content holds what came from the source; meta and attributes hold everything ApiDOM computed about it. A round-trip serializer can write the document back without leaking any of ApiDOM's own state.

The parse result

parse() doesn't hand you the root document directly. It hands you a parse result: a small wrapper that carries the parsed data alongside everything ApiDOM noticed on the way in.

import { parse } from '@speclynx/apidom-reference';

const result = await parse('/path/to/openapi.json');

result.api;          // the root data model — your OpenAPI document
result.annotations;  // diagnostics collected while parsing

When you parse a clean document, result.api is what you came for and the annotations list is empty. When you parse a malformed one in non-strict mode, the broken parts surface as annotations rather than crashing the parse. Either way, the shape is the same — one place to look for the data, one place to look for what went wrong.

What you have at this point is strictly a tree: every element has exactly one parent, and there are no cycles. $ref pointers stay as reference elements rather than getting inlined. Resolving them, which a later chapter covers, can relax both constraints: shared references break the single-parent rule, turning the structure into a directed acyclic graph, while circular references break the no-cycle rule, turning it into a directed cyclic graph. parse() itself never does either.
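The three structures can be told apart mechanically. The following is a minimal sketch, assuming hypothetical plain nodes with a children array rather than ApiDOM's element classes; the classify helper and the node names are invented for illustration. It reports a tree when every node is reached exactly once, a DAG when some node is reached through more than one parent, and a cyclic graph when a node can reach itself.

```javascript
// Hypothetical stand-in nodes: { name, children }. Not ApiDOM's element classes.
const node = (name, children = []) => ({ name, children });

// Classify a rooted structure as 'tree', 'dag' or 'cyclic'.
function classify(root) {
  const onPath = new Set();   // nodes on the current root-to-node path
  const finished = new Set(); // nodes whose descendants are fully explored
  let sawSharedNode = false;

  function visit(n) {
    if (onPath.has(n)) return true; // reached an ancestor again: a cycle
    if (finished.has(n)) {          // reached again through another parent
      sawSharedNode = true;
      return false;
    }
    onPath.add(n);
    for (const child of n.children) {
      if (visit(child)) return true;
    }
    onPath.delete(n);
    finished.add(n);
    return false;
  }

  if (visit(root)) return 'cyclic';
  return sawSharedNode ? 'dag' : 'tree';
}

// Unresolved: each reference is its own node, so the structure stays a tree.
const pet = node('Pet schema');
const unresolved = node('root', [pet, node('$ref to Pet'), node('$ref to Pet')]);

// Shared reference: two parents now point at the same node.
const sharedRef = node('root', [node('responseA', [pet]), node('responseB', [pet])]);

// Circular reference: a schema whose property points back at the schema.
const selfRef = node('TreeNode schema');
selfRef.children.push(node('children property', [selfRef]));
const circular = node('root', [selfRef]);

console.log(classify(unresolved), classify(sharedRef), classify(circular));
// tree dag cyclic
```

Code that walks a resolved structure would need a visited set like the one above; code that walks the output of parse() does not, because each node is reached exactly once.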

Everything is an element

Inside result.api you won't find plain JavaScript objects. You'll find elements. Every node in the tree is one — the root document, every operation, every schema, every string, every number.

An element is a small object with four parts:

element.element     the kind, e.g. "string" or "operation"
element.content     the value; varies by kind
element.meta        metadata: classes, identifiers, titles
element.attributes  semantic attributes specific to the element kind

element.element is the part that makes the model semantic. A server URL is not a description, even if both are quoted text, and ApiDOM keeps that distinction by tagging every element with what it is.
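As a rough illustration of the four-part shape, here is a sketch built from plain objects. The makeElement factory and the 'example' class value are hypothetical; ApiDOM's real elements are class instances with more behaviour than this.

```javascript
// Hypothetical factory mirroring the four parts described above.
// Plain objects for illustration; not ApiDOM's own element classes.
function makeElement(element, content, meta = {}, attributes = {}) {
  return { element, content, meta, attributes };
}

const title = makeElement('string', 'Pet Store');
const operation = makeElement('operation', [], { classes: ['example'] });

// The same four keys are present on every node, whatever its kind.
const parts = (e) => Object.keys(e).join(',');
console.log(parts(title));     // element,content,meta,attributes
console.log(parts(operation)); // element,content,meta,attributes

// The kind is read from the tag alone; the check never inspects content.
const isKind = (e, kind) => e.element === kind;
console.log(isKind(title, 'string'), isKind(operation, 'string')); // true false
```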

Three kinds of element

Every element in the tree is one of three kinds, and each kind builds on the last. The base sets the shape, primitives specialise for JSON's data types, and semantic elements layer in spec-specific meaning on top.

Element (base)

Every node in the tree. Same four parts on every one: element, content, meta, attributes.

Primitive elements (extend Element)

One per JSON data type: null, boolean, number, string, array, object, member. The leaves of every tree.

Semantic elements (extend a primitive)

One per spec concept: operations, schemas, servers, paths, info objects. Grouped into namespaces by specification.

Everything that follows — reading values, walking the tree, validating, round-tripping — works against this hierarchy. The next two sections fill in the primitive and semantic layers.
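The layering can be pictured as ordinary class inheritance. The sketch below is a simplified model, not ApiDOM's source: the class names echo the ones used in this chapter, but the constructors and their parameters are assumptions made for the example.

```javascript
// Simplified model of the three layers; constructor signatures are assumed.
class Element {
  constructor(content, meta = {}, attributes = {}) {
    this.element = 'element';
    this.content = content;
    this.meta = meta;
    this.attributes = attributes;
  }
}

// Primitive layer: one class per JSON data type (two shown here).
class StringElement extends Element {
  constructor(content, meta, attributes) {
    super(content, meta, attributes);
    this.element = 'string';
  }
}

class ObjectElement extends Element {
  constructor(content = [], meta, attributes) {
    super(content, meta, attributes);
    this.element = 'object';
  }
}

// Semantic layer: a spec concept built on top of a primitive.
class OperationElement extends ObjectElement {
  constructor(content, meta, attributes) {
    super(content, meta, attributes);
    this.element = 'operation';
  }
}

const op = new OperationElement();
console.log(op.element);                  // operation
console.log(op instanceof ObjectElement); // true
console.log(op instanceof Element);       // true
console.log(new StringElement('x') instanceof ObjectElement); // false
```

In this model, code written against ObjectElement also accepts an OperationElement, which is the practical payoff of each layer extending the one below it.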

Primitive elements

At the leaves of the tree sit the primitives. They map one-to-one to the JSON data types, with one addition: MemberElement, which holds a key/value pair inside an object.

NullElement     null
BooleanElement  true / false
NumberElement   a number
StringElement   a string
ArrayElement    ordered list of elements
ObjectElement   set of members
MemberElement   key/value pair (both elements)

These are the bricks. Everything else — an entire OpenAPI document — is built from them.
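To make the mapping concrete, here is a sketch that turns a plain JSON value into nodes shaped like the primitives above. The fromJson helper is hypothetical and produces plain objects, and the way it stores a member's key and value under content is an assumption for the example; this is not the conversion ApiDOM itself performs.

```javascript
// Hypothetical converter from a plain JSON value to primitive-shaped nodes.
const el = (element, content) => ({ element, content, meta: {}, attributes: {} });

function fromJson(value) {
  if (value === null) return el('null', null);
  if (Array.isArray(value)) return el('array', value.map(fromJson));
  switch (typeof value) {
    case 'boolean': return el('boolean', value);
    case 'number': return el('number', value);
    case 'string': return el('string', value);
    case 'object':
      // An object becomes a list of members; each member pairs a key
      // element with a value element.
      return el('object', Object.entries(value).map(([k, v]) =>
        el('member', { key: el('string', k), value: fromJson(v) })));
    default:
      throw new TypeError(`Not a JSON value: ${typeof value}`);
  }
}

const tree = fromJson({ title: 'Pet Store', tags: ['pets', 1, true, null] });
const first = tree.content[0];
console.log(tree.element, first.element, first.content.key.content, first.content.value.element);
// object member title string
```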

Semantic elements

The semantic elements are where ApiDOM earns its name. They extend the primitives, but each one carries meaning specific to a specification. An OperationElement is still an object underneath — but it answers to element === 'operation', exposes the fields you'd expect from an operation, and can be located by traversal without inspecting keys.

import { parse } from '@speclynx/apidom-reference';

const result = await parse('/path/to/openapi.json');

result.api.element;       // "openApi3_1"
result.api.info.element;  // "info"
result.api.paths.element; // "paths"

This is what makes the model semantic rather than syntactic. Your code can ask "give me every operation" without walking strings and matching keys, because every operation in the tree has already been tagged as one.
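A sketch of what "give me every operation" can look like once the tagging is done. The nodes below are hypothetical plain-object stand-ins and collectByKind is an invented helper, not an ApiDOM API. The point is that the search compares the kind tag and never looks at a key such as "get" or "post".

```javascript
// Hypothetical plain-object stand-ins for the elements named above.
const el = (element, content) => ({ element, content, meta: {}, attributes: {} });
const member = (key, value) => el('member', { key: el('string', key), value });

const paths = el('paths', [
  member('/pets', el('pathItem', [
    member('get', el('operation', [member('summary', el('string', 'List pets'))])),
    member('post', el('operation', [member('summary', el('string', 'Add a pet'))])),
  ])),
]);

// Children of a node: array items, or a member's key and value.
function childrenOf(node) {
  const c = node.content;
  if (Array.isArray(c)) return c;
  if (c !== null && typeof c === 'object') return [c.key, c.value];
  return [];
}

// Depth-first collection of every node tagged with the given kind.
function collectByKind(node, kind, found = []) {
  if (node.element === kind) found.push(node);
  for (const child of childrenOf(node)) collectByKind(child, kind, found);
  return found;
}

const operations = collectByKind(paths, 'operation');
console.log(operations.length); // 2
```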

Namespaces

Each supported specification lives in its own namespace: OpenAPI 3.1, OpenAPI 3.0, OpenAPI 2.0, Overlay 1.x, AsyncAPI 2.x, Arazzo 1.x, JSON Schema. A namespace is a bag of semantic elements specific to that spec.

When ApiDOM parses your document, it picks the matching namespace and produces semantic elements from it. The primitives are shared across all namespaces; the semantic elements are not. An InfoElement from the OpenAPI 3.1 namespace is not the same type as the one from AsyncAPI 2.6, even though both extend the same primitive object element.

This is how a single library covers every shape of API document without conflating them. Tooling that wants to handle all of them can work against the shared primitives; tooling that targets a single spec can lean on the semantic types of that namespace.
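The idea of per-spec namespaces can be sketched as a lookup table. Everything in this example is hypothetical: the registry layout, the namespace keys, the type labels and the detection by version field are assumptions made for illustration and do not describe ApiDOM's actual detection or registration mechanism.

```javascript
// Hypothetical registry: namespace name -> (top-level key -> semantic type label).
const namespaces = {
  'openapi-3.1': { info: 'OpenAPI 3.1 InfoElement', paths: 'OpenAPI 3.1 PathsElement' },
  'asyncapi-2': { info: 'AsyncAPI 2.x InfoElement', channels: 'AsyncAPI 2.x ChannelsElement' },
};

// Pick a namespace from the document's version field (an assumed strategy).
function detectNamespace(doc) {
  if (typeof doc.openapi === 'string' && doc.openapi.startsWith('3.1.')) return 'openapi-3.1';
  if (typeof doc.asyncapi === 'string' && doc.asyncapi.startsWith('2.')) return 'asyncapi-2';
  return null;
}

// Semantic type for a top-level key, falling back to the shared primitive.
function typeFor(doc, key) {
  const ns = namespaces[detectNamespace(doc)];
  return (ns && ns[key]) || 'ObjectElement (shared primitive)';
}

console.log(typeFor({ openapi: '3.1.2' }, 'info'));  // OpenAPI 3.1 InfoElement
console.log(typeFor({ asyncapi: '2.6.0' }, 'info')); // AsyncAPI 2.x InfoElement
console.log(typeFor({ openapi: '3.1.2' }, 'x-custom')); // ObjectElement (shared primitive)
```

The same key, info, resolves to a different type in each namespace, while anything a namespace does not recognise falls back to the primitives that all namespaces share.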

A document in tree form

Put it all together. A short OpenAPI snippet comes first, followed by the tree ApiDOM produces from it: semantic elements at the branches, primitives at the leaves, every node sharing the same four-part shape.

Source

openapi: "3.1.2"
info:
  title: Pet Store
  version: "1.0.0"
paths:
  /pets:
    get:
      summary: List pets

Data model

OpenApi3_1Element
├─ openapi:  StringElement   "3.1.2"
├─ info:     InfoElement
│  ├─ title:    StringElement "Pet Store"
│  └─ version:  StringElement "1.0.0"
└─ paths:    PathsElement
   └─ /pets:  PathItemElement
      └─ get:  OperationElement
         └─ summary: StringElement "List pets"

Reading the model

Even though everything is an element, you don't need to remember that to read a document. Semantic elements expose the fields you'd reach for naturally:

import { parse } from '@speclynx/apidom-reference';
import { toValue } from '@speclynx/apidom-core';

const result = await parse('/path/to/openapi.yaml');

result.api.info.title;            // StringElement
toValue(result.api.info.title);   // "Pet Store" — raw JavaScript string
toValue(result.api.info.version); // "1.0.0"

result.api.paths.forEach((pathItem) => {
  pathItem.element; // "pathItem"
});

Field accessors return child elements, not raw values — a title lookup yields a StringElement, not a primitive string.

Opting out of the data model is a single function call. toValue() from @speclynx/apidom-core returns the plain JavaScript representation of any element — a string for primitives, an object or array for compound elements. It works at any depth, including the root: toValue(result.api) returns the entire document as a plain object.
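To show what that unwrapping involves, here is a simplified re-implementation over hypothetical plain-object nodes. The toPlain function is an illustration only and is not the library's toValue(); in particular, the rule it uses to tell object-like nodes from array-like ones is an assumption of this sketch.

```javascript
// Hypothetical plain-object stand-ins; use the library's toValue() in practice.
const el = (element, content, meta = {}) => ({ element, content, meta, attributes: {} });
const member = (key, value) => el('member', { key: el('string', key), value });

function toPlain(node) {
  const c = node.content;
  if (!Array.isArray(c)) return c; // null, boolean, number, string

  // Sketch-only rule: a node whose items are members is object-like;
  // an empty node is an array only when its kind says so.
  const objectLike = c.length > 0 ? c[0].element === 'member' : node.element !== 'array';
  if (!objectLike) return c.map(toPlain);
  return Object.fromEntries(c.map((m) => [toPlain(m.content.key), toPlain(m.content.value)]));
}

const info = el('info', [
  member('title', el('string', 'Pet Store', { classes: ['example'] })),
  member('version', el('string', '1.0.0')),
]);

console.log(JSON.stringify(toPlain(info)));
// {"title":"Pet Store","version":"1.0.0"}
```

Only content is read, so the classes entry stored in meta does not appear in the output.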

Where the extras live

meta and attributes are where ApiDOM stores everything that isn't the value itself. Classifications, identifiers, computed properties, resolved references: they all live here. Source map positions are exposed directly on each element for convenience, but the data backing them comes from this same metadata layer.

The separation is deliberate. content stays a faithful representation of what the document said. Everything ApiDOM added during parsing lives alongside it, never inside it — so a round-trip serialization can write the document back without leaking ApiDOM's own bookkeeping into the output.

When to drop down

There's no separate API layer to opt into. Three ways of working with the tree share the same nodes — you move between them as you need to.

Most navigation works through the semantic accessors. They return child elements, not raw values:

result.api.info.title;                            // StringElement
result.api.paths.get('/pets').get('get').summary; // StringElement

Call toValue() from @speclynx/apidom-core when you just want the plain JavaScript value:

import { toValue } from '@speclynx/apidom-core';

toValue(result.api.info.title);   // "Pet Store"
toValue(result.api.info.version); // "1.0.0"
toValue(result.api);              // the whole document, as a plain object

Drop to the element level when you need information the value alone can't carry:

Source positions — where in the original document a value came from
Classifications and identifiers — metadata the parser attached during recognition
The element kind — whether this string is a server URL or a description
Custom traversal — finding nodes by predicate or visiting every element of a kind
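The last item, custom traversal, might look like the sketch below. It uses hypothetical plain-object nodes; walk and findAll are invented helpers, and the 'server-url' class name is an assumption for the example rather than a documented value. The predicate reads meta, which is exactly the information a plain value no longer carries.

```javascript
// Hypothetical stand-ins; the 'server-url' class name is assumed for the sketch.
const el = (element, content, meta = {}) => ({ element, content, meta, attributes: {} });

const servers = el('array', [
  el('string', 'https://api.example.com', { classes: ['server-url'] }),
  el('string', 'Production server'),
]);

// Lazily yield every node, depth first.
function* walk(node) {
  yield node;
  const c = node.content;
  let children = [];
  if (Array.isArray(c)) children = c;
  else if (c !== null && typeof c === 'object') children = [c.key, c.value];
  for (const child of children) yield* walk(child);
}

// Keep the nodes that satisfy an arbitrary predicate over the whole element.
const findAll = (root, predicate) => [...walk(root)].filter(predicate);

const urls = findAll(servers, (e) => (e.meta.classes || []).includes('server-url'));
console.log(urls.map((e) => e.content)); // [ 'https://api.example.com' ]
```

Both strings would come back from a plain-value conversion as indistinguishable text; at the element level the predicate can still tell them apart.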

Same tree either way. The chapters that follow build on it.