---
# **F5 Design Philosophy: Foundational Axioms, Theorems, and Principles**
---

# 0. Purpose and Scope

This document articulates the foundational design axioms of the F5 data model. It is
not a specification — it contains no normative rules. Its purpose is to record the
reasoning from which the specifications are derived, so that future extensions,
corrections, and implementations can be evaluated against first principles rather than
against accumulated convention.

**On the relationship between axioms and practice:**
These axioms are ideals. Practical limitations — technical constraints of existing
tools, implementation effort, HDF5 API restrictions, the need for backward
compatibility — may require deviations in any given implementation or specification
version. This is acceptable. The axioms define the direction of travel: future
improvements SHOULD converge *towards* these axioms, not diverge *from* them.

If a particular axiom turns out to be unachievable in practice, this does not
invalidate the others. Each axiom stands independently (a severability clause). The
discovery that one axiom cannot be satisfied is itself a result — it motivates
either a refinement of the axiom or the development of a new mechanism.

**Axioms vs. theorems:**
This document distinguishes foundational axioms (statements taken as starting points,
not derived from others) from theorems (statements derivable from the axioms). Many
properties of the F5 model that appear in the specifications are theorems — they are
consequences of the axioms rather than independent design choices. Labelling them
correctly clarifies which parts of the model are fundamental and which can be
improved without changing the foundations.

The document is ordered deductively: the most fundamental axioms come first, and
theorems follow from them.

---

# 1. The Fiber Bundle Axiom

## Axiom F: {#axiom-fiber-bundle} Scientific data has the structure of a fiber bundle

The foundational choice of the F5 model is to represent scientific data as
**sections of fiber bundles**, a concept drawn from differential geometry and topology.

A fiber bundle E → B consists of:
- **Base space B**: the domain where data lives (the mesh, the grid, the parameter
  space — discrete or continuous)
- **Fiber F**: the mathematical value attached at each point of B (scalar, vector,
  tensor, spinor, multivector, or any other algebraic structure)
- **Projection π: E → B**: the map sending each point of the total space E to the
  base point it lies over; the fiber over a point b ∈ B is the preimage π⁻¹(b)

This is not a metaphor. The base space of a simulation mesh is a topological space.
The velocity field on that mesh is literally a section of the tangent bundle. The
metric tensor in general relativity is literally a section of a symmetric covariant
rank-2 tensor bundle. F5 models these structures directly, not as an approximation.

**The fiber bundle concept is reused at every level of the model:**
- A Grid is a fiber over the parameter space (each Grid is the same physical object
  observed/simulated at different parameter values)
- Each Skeleton defines its own base space, with Fields as fibers over that base
- The TableOfContents is a fiber bundle structure: the ToC dataset is a section of a
  trivial bundle over the parameter space, providing slice path information at each
  parameter value

The fiber bundle axiom rules out any data model that conflates base space and fiber
— for example, a format that binds a "temperature scalar" to a specific mesh type
without separating what the scalar is (fiber) from where it lives (base).
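
The separation of base space and fiber can be illustrated with a minimal sketch. The class names `FiberType`, `BaseSpace`, and `Section` are hypothetical and chosen only for this example; they are not F5 API names. The point is that the same fiber type can be attached to two different base spaces without either definition knowing about the other.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FiberType:
    """What a value is: an algebraic kind and its number of components."""
    name: str
    components: int


@dataclass(frozen=True)
class BaseSpace:
    """Where values live: here simply a finite set of point labels."""
    name: str
    points: Tuple[str, ...]


@dataclass(frozen=True)
class Section:
    """A discrete section: exactly one fiber value per base point."""
    base: BaseSpace
    fiber: FiberType
    values: Tuple[Tuple[float, ...], ...]

    def __post_init__(self):
        if len(self.values) != len(self.base.points):
            raise ValueError("a section needs exactly one value per base point")
        if any(len(v) != self.fiber.components for v in self.values):
            raise ValueError("a value does not match the fiber type")


# One fiber type, defined once, independent of any mesh.
SCALAR = FiberType("scalar", 1)

# Two unrelated base spaces (illustrative labels only).
tri_mesh = BaseSpace("triangle-mesh-vertices", ("v0", "v1", "v2"))
regular_grid = BaseSpace("regular-grid-vertices", ("g00", "g01", "g10", "g11"))

temp_on_mesh = Section(tri_mesh, SCALAR, ((280.0,), (281.5,), (279.0,)))
temp_on_grid = Section(regular_grid, SCALAR, ((280.0,), (280.5,), (281.0,), (281.5,)))
```

Both sections share the identical `SCALAR` object, so "what the value is" is stated once and reused, while "where it lives" varies freely.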

**Theorem F.1 (Topology/Geometry separation):** Because the base space (topology)
and the fiber placement (geometry = chart representation) are independent structures
in a fiber bundle, they must be stored independently. This is the origin of the
Skeleton/Representation split.

**Theorem F.2 (Multiple representations):** A single topological structure (Skeleton)
can be placed geometrically in multiple coordinate systems simultaneously. This follows
from the atlas structure of differential geometry.

Different charts over the same manifold are not necessarily equivalent in practice:
some exhibit coordinate singularities (spherical coordinates at the poles, Schwarzschild
coordinates at the event horizon), and a single chart may not cover the entire
manifold. The atlas of multiple charts is the mathematical solution to this: together,
multiple charts cover what no single chart can.

F5 expresses this directly. A Field may have multiple coordinate Representations
simultaneously — one chart per patch, one chart per coordinate system the application
supports. An implementation can switch between chart Representations as needed for a
given point or region, following the atlas concept. At the F5 level a Field can thus
be treated as an abstract mathematical object, independent of any particular coordinate
realization. The choice of coordinates is a concern of the implementation layer, not
of the stored data.
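
As a rough illustration of chart switching at the implementation layer, the sketch below picks a chart per point. The table `CHART_IS_REGULAR`, the function `pick_chart`, and the chart labels are assumptions made for this example and do not come from the F5 specification.

```python
import math

# Hypothetical regularity tests, one per chart. Spherical coordinates
# (r, theta, phi) are singular on the polar axis, where phi is undefined.
CHART_IS_REGULAR = {
    "spherical": lambda p: math.hypot(p[0], p[1]) > 1e-12,
    "cartesian": lambda p: True,
}


def pick_chart(point, stored_representations, preference):
    """Return the first preferred chart that is stored and regular at `point`.

    `point` is given as Cartesian (x, y, z) purely to keep the test simple.
    """
    for name in preference:
        if name in stored_representations and CHART_IS_REGULAR[name](point):
            return name
    raise LookupError("no stored chart is regular at this point")


stored = {"spherical", "cartesian"}
preference = ("spherical", "cartesian")

on_equator = pick_chart((1.0, 0.0, 0.0), stored, preference)
at_pole = pick_chart((0.0, 0.0, 1.0), stored, preference)
```

Here `on_equator` resolves to the spherical chart and `at_pole` falls back to the Cartesian one, which mirrors the atlas idea: the stored data offers several charts, and the reader decides which to use where.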

**Theorem F.3 (Transformation rules):** The algebraic type of a fiber (scalar, vector,
co-vector, tensor) determines how its components transform under chart changes.
Storage of algebraic type information (the covariance array, grade, rank) is therefore
not optional for data that will be subjected to coordinate transformations.
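
A small numerical sketch shows why the algebraic type cannot be dropped. It uses the standard transformation rules for the chart change from Cartesian (x, y) to polar (r, φ) in two dimensions: vector components transform with the Jacobian, co-vector components with the transpose of its inverse. The function names are chosen for this example only.

```python
import math


def jacobian_polar_from_cartesian(x, y):
    """J[i][j] = d(new coordinate i) / d(old coordinate j) for (x, y) -> (r, phi)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r], [-y / r2, x / r2]]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def transform_vector(J, v):
    # Contravariant rule: v'^i = J^i_j v^j
    return [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]


def transform_covector(J, w):
    # Covariant rule: w'_i = (J^-1)^j_i w_j
    Jinv = inverse_2x2(J)
    return [sum(Jinv[j][i] * w[j] for j in range(2)) for i in range(2)]


def contract(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))


J = jacobian_polar_from_cartesian(3.0, 4.0)
v = [1.0, 2.0]    # components of a tangent vector in the Cartesian chart
w = [0.5, -1.0]   # components of a co-vector in the Cartesian chart

v_polar = transform_vector(J, v)
w_polar = transform_covector(J, w)
w_mistyped = transform_vector(J, w)  # wrong rule: co-vector treated as a vector
```

The contraction of a co-vector with a vector is a scalar and must not depend on the chart. `contract(w_polar, v_polar)` agrees with `contract(w, v)`, whereas `contract(w_mistyped, v_polar)` does not, so two arrays of identical shape need different handling depending on their stored algebraic type.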

---

# 2. The Five (+Two) Level Hypothesis

## Hypothesis L: {#hypothesis-levels} Five (+Two) levels of hierarchy are necessary and sufficient

The F5 hierarchy — Timeslice, Grid, Skeleton, Representation, Field, and optionally
Fragment and SeparatedCompound — is hypothesized to be both necessary and sufficient
for representing the full range of scientific visualization data derived from numerical
physical simulations.

This is stated as a **hypothesis rather than an axiom** because it may be falsified
by the discovery of a class of data not representable in this hierarchy. To date, no
such class has been found. The hypothesis has so far held across a wide range of
data types: structured and unstructured meshes, AMR hierarchies, tensor fields in
general relativity, point clouds, parametric surfaces, time-evolving data, and
multi-source observational data.

The five core levels are conceptual. The two optional levels (Fragment,
SeparatedCompound) are **implementation-internal performance mechanisms**: they are
semantically invisible to the end user. An implementation may freely adjust, combine,
or omit these two levels without changing the semantic content of the file. The five
conceptual levels are normative; the two performance levels are advisory.

The scope of the hypothesis is scientific visualization of numerical physical
simulations. It does not claim universality for all possible data.

---

# 3. Foundational Axioms

## Axiom 1: {#axiom-how-not-what} Describe *how*, not *what*

F5 does not classify data into predefined categories. It describes the mathematical
and structural properties of data — from which any classification follows as a
derivable consequence.

A reader that correctly identifies the properties of a dataset can determine what
kind of data it is without any predefined vocabulary. This is the foundational
distinction between the F5 approach and the engineering approach of enumerated types.

**Theorem 1.1:** No enumeration of cell types, field types, or data categories is
required or defined in the F5 core model.

A question of the form "does this file contain a triangular surface (type 23)?" is
itself contradictory to Axiom 1 — it asks *what* the data is, not *how* it is
structured. Even setting that aside, the answer cannot be exclusive: in F5, a
triangular surface unavoidably also *is* a point cloud. It is not possible to store a
triangular surface in F5 that is not simultaneously a point cloud, because the vertex
Skeleton is always present. A format that distinguishes these as different types
conflicts with the mathematical structure of the model.

An application that insists on asking "what type is this?" must translate F5's
structural properties into its own vocabulary. That translation can be implemented,
but it is unnecessary for any application that asks the right question: "does this
data have the properties I need?".
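
The difference between the two questions can be sketched in a few lines. The dictionary layout below (keys `skeletons`, `dimension`, `index_depth`, `fields`, and the field name `VertexIndices`) is a hypothetical in-memory stand-in, not the F5 file layout; only the name `Positions` is taken from this document.

```python
def vertex_skeletons(grid):
    """Skeletons that carry vertex positions (IndexDepth 0 with a Positions field)."""
    return [
        s for s in grid["skeletons"]
        if s["index_depth"] == 0 and "Positions" in s["fields"]
    ]


def usable_as_point_cloud(grid):
    return bool(vertex_skeletons(grid))


def usable_as_surface(grid):
    has_faces = any(
        s["index_depth"] == 1 and s["dimension"] == 2 for s in grid["skeletons"]
    )
    return usable_as_point_cloud(grid) and has_faces


point_cloud = {
    "skeletons": [
        {"dimension": 0, "index_depth": 0, "fields": {"Positions"}},
    ]
}

triangle_surface = {
    "skeletons": [
        {"dimension": 0, "index_depth": 0, "fields": {"Positions"}},
        {"dimension": 2, "index_depth": 1, "fields": {"VertexIndices"}},
    ]
}
```

Neither dataset carries a type tag. A point renderer asks `usable_as_point_cloud` and accepts both; a surface renderer asks `usable_as_surface` and accepts only the second. The triangular surface answers "yes" to both questions, which is the non-exclusivity described above.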

**Theorem 1.2:** Forward compatibility is a consequence of Axiom 1. A data type that
did not exist when the specification was written can be expressed in F5 by describing
its properties, without modifying the specification.

## Axiom 2: {#axiom-structure-not-names} Information belongs in structure, not in names or enumerations

The semantic content of an F5 file is encoded in hierarchical position, type
structure, and object identity — not in names, naming conventions, or reserved
vocabularies.

In practice, F5 cannot yet fully achieve this axiom. Some normative name conventions
remain: the field name `Positions` is normative; the `D`/`d` prefix convention for
tangent vectors and co-vectors encodes covariance in names; the `^` separator in
exterior product names encodes the product type. These are acknowledged as
imperfections relative to the axiom — minimal naming conventions chosen because no
structural encoding is yet available.

**Theorem 2.1 (Improvement direction):** If HDF5 or any future mechanism provides a
way to encode the information currently in naming conventions into structure (e.g.,
HDF5 shared dataspaces would encode Skeleton index spaces structurally), F5 SHOULD
adopt it and retire the corresponding naming convention. This is a route to
specification version bumps that improve alignment with Axiom 2.

**Theorem 2.2:** The `Positions` name is the minimal necessary exception to Axiom 2 —
the one name F5 assigns normative structural meaning to, because some starting point
for geometric placement must be identifiable.

## Axiom 3: {#axiom-simplicity} Simplicity for simple cases

As a maxim commonly attributed to Einstein puts it: *"Keep things as simple as possible, but not simpler."*

Generality must not impose cost on users who do not need it. Every extension to the
F5 model is strictly additive. Each layer can be used independently. A reader
implementing only the core spec correctly handles files that use extensions by
ignoring what it does not understand.

**Theorem 3.1:** The two optional levels (Fragment, SeparatedCompound) are invisible
to the user model — an implementation adjusts them for performance without user
awareness.

**Theorem 3.2 (Absence is informative):** The covariance attribute is absent from
scalar types; the grade attribute is absent when grade equals rank; the Positions
field may be absent from a partial Representation. These absences carry semantic
meaning — the simplest interpretation applies when the attribute is absent.
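
A reader can implement "absence is informative" as default rules. In the sketch below the attribute keys `covariance` and `grade`, and the encoding of covariance as one boolean per index, are assumptions made for illustration; the actual attribute layout is defined by the specifications, not here.

```python
def covariance_of(attrs):
    """Absent covariance means no indices at all, i.e. a scalar."""
    return tuple(attrs.get("covariance", ()))


def rank_of(attrs):
    return len(covariance_of(attrs))


def grade_of(attrs):
    """Absent grade means the grade equals the rank."""
    return attrs.get("grade", rank_of(attrs))


scalar = {}                                    # nothing stored at all
two_index = {"covariance": (True, True)}       # grade omitted
explicit = {"covariance": (True, True), "grade": 0}  # illustrative value only
```

With these rules `scalar` yields rank 0 and grade 0, `two_index` yields rank 2 and grade 2, and only `explicit` needs to store a grade because it differs from the default. The simple cases therefore carry no attributes they do not need.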

## Axiom 4: {#axiom-file-identity} File identity is semantically irrelevant

An F5 dataset has no preferred file boundary. Merging multiple F5 files into one,
or splitting one into many, must be possible at every level of the hierarchy, down to
the field fragment. A merge or split that changes no data values must be achievable
as a zero-data-copy operation. The file name carries no meaning.

**Theorem 4.1:** Fragment-level merging and splitting (the minimum granularity) implies
that the "+2" internal levels must be restructurable without semantic consequence.

**Theorem 4.2:** Named HDF5 types do not propagate across external file links. Any
file referenced by an external link must carry its own copies of all named types it
uses. A merge tool SHOULD promote named types to the merged file's global TableOfContents.

**Theorem 4.3:** Performance enhancements that are orthogonal to the F5 core semantics
may be added anywhere — fragments, TableOfContents, external link structures. The
criterion for "orthogonal" is: removing the enhancement yields a semantically
equivalent file. The TableOfContents satisfies this; so does the Fragment level.
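
The orthogonality criterion can be phrased as an executable check on a toy model. Everything in this sketch is hypothetical: the nested-dictionary "file", the keys `fields`, `fragments`, `offset`, `data`, and the example path are stand-ins chosen for the example, not the F5 layout.

```python
def field_values(field):
    """Reassemble a field's values whether stored whole or as fragments."""
    if "fragments" not in field:
        return list(field["data"])
    size = sum(len(frag["data"]) for frag in field["fragments"])
    values = [None] * size
    for frag in field["fragments"]:
        start = frag["offset"]
        values[start:start + len(frag["data"])] = frag["data"]
    return values


def semantic_content(f5file):
    """In this toy model the semantic content is the field values by path."""
    return {path: field_values(f) for path, f in f5file["fields"].items()}


def semantically_equivalent(a, b):
    return semantic_content(a) == semantic_content(b)


PATH = "/T0/Grid/Points/Cartesian/Temperature"  # illustrative path only

plain = {"fields": {PATH: {"data": [1.0, 2.0, 3.0, 4.0]}}}

enhanced = {
    "TableOfContents": {"T0": "/T0"},
    "fields": {PATH: {"fragments": [
        {"offset": 2, "data": [3.0, 4.0]},
        {"offset": 0, "data": [1.0, 2.0]},
    ]}},
}

altered = {"fields": {PATH: {"data": [1.0, 2.0, 3.0, 5.0]}}}
```

`plain` and `enhanced` compare as equivalent even though one is fragmented and carries a table of contents, while `altered` does not, because a data value changed. An enhancement passes the criterion exactly when stripping it leaves `semantic_content` unchanged.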

## Axiom 5: {#axiom-compatibility} Forward and backward compatibility

The F5 model SHOULD evolve without invalidating existing files (backward
compatibility) and existing readers SHOULD handle future files gracefully by ignoring
unknown constructs (forward compatibility).

A specification advancement that is compatible with an existing version requires no
version number change. Only contradictory advancements — where the new interpretation
conflicts with the old — require a version bump. Version bumps are therefore expected
to be rare, because the model is designed to be extensible without contradiction.

**Theorem 5.1 (Per-field versioning):** Because different fields in one file may have
been written under different specification versions, versioning must be per-field
rather than per-file. The TypeInfo named type is the mechanism.

**Theorem 5.2:** Version differences are not incompatibilities. A reader implementing
version V can read fields written under version V-n by applying the interpretation
rules for that older version, identifiable from the field's TypeInfo reference.
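
Per-field versioning amounts to a dispatch on the version recorded with each field. The sketch below assumes a `type_info` entry holding a `version` number, and invents two storage rules (`values` versus `data`) purely to have something to dispatch on; neither reflects real F5 version differences.

```python
def interpret_v1(field):
    # Hypothetical older rule: values were stored under "values".
    return list(field["values"])


def interpret_v2(field):
    # Hypothetical newer rule: values are stored under "data".
    return list(field["data"])


INTERPRETERS = {1: interpret_v1, 2: interpret_v2}


def read_field(field):
    """Apply the interpretation rules of the version the field was written under."""
    interpret = INTERPRETERS.get(field["type_info"]["version"])
    if interpret is None:
        return None  # unknown (for example future) version: skip gracefully
    return interpret(field)


def read_file(fields):
    return {name: read_field(field) for name, field in fields.items()}


mixed_file = {
    "Pressure": {"type_info": {"version": 1}, "values": [1.0, 2.0]},
    "Velocity": {"type_info": {"version": 2}, "data": [3.0, 4.0]},
    "Exotic": {"type_info": {"version": 9}, "payload": b"..."},
}

result = read_file(mixed_file)
```

One file mixes fields from three versions. The reader handles the two it knows and ignores the third without failing, which combines the backward and forward compatibility described in Axiom 5.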

---

# 4. Index Space Axiom

## Axiom 6: {#axiom-index-spaces} Data is organized by index spaces; index depth encodes their role

Every field in F5 is a function from an **index space** to a mathematical value space
(the fiber). An index space is a discrete set with a defined role in the topological
hierarchy. A Skeleton is the F5 representation of an index space.

The **IndexDepth** of an index space encodes its position in the topological hierarchy
relative to vertices:
- Negative IndexDepth: entities from which vertices are derived (generators, coefficients)
- IndexDepth 0: vertices — the atomic base of spatial discretization
- Positive IndexDepth: entities composed from vertices (cells, sets of cells, etc.)

IndexDepth is not limited in magnitude in either direction. The current specification
covers the most common cases, but the axiom is general.

**Application rule for unknown data types:** When encountering a new, previously
unseen type of data, the first step is to identify its index spaces: What are the
discrete sets over which this data is defined? What is the relationship of each index
space to vertices? This analysis determines the IndexDepth, and from there the
appropriate Skeleton structure follows.
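
The sign convention above reduces to a three-way case distinction. The function name `role_of` and the example labels are illustrative only; the depth values follow the list given earlier in this section.

```python
def role_of(index_depth):
    """Classify an index space by the sign of its IndexDepth."""
    if index_depth < 0:
        return "vertices are derived from it"
    if index_depth == 0:
        return "vertices"
    return "composed from vertices"


# Illustrative index spaces for one dataset, keyed by a descriptive label.
index_spaces = {
    "harmonic coefficients": -1,
    "vertices": 0,
    "cells": 1,
    "sets of cells": 2,
}

roles = {label: role_of(depth) for label, depth in index_spaces.items()}
```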

**Theorem 6.1 (Vertices as reference):** The special status of vertices (IndexDepth 0)
is not axiomatic — it is a consequence of the definition of IndexDepth. Vertices are
the index space at depth zero by definition; all other depths are relative to this.

**Theorem 6.2 (Negative IndexDepth = procedural coordinates):** If vertex coordinates
are themselves derivable from data stored at IndexDepth -1 (coefficients such as
spherical harmonics), then the resolution of the vertex coordinates is a reader
decision rather than a writer decision. The file stores the recipe; the reader
computes the result at the resolution required by the application. Negative IndexDepth
can be iterated: IndexDepth -2 covers coefficients that are themselves derived from
another source.
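
The "recipe versus result" idea can be sketched with a simpler analogue of spherical harmonics: a closed planar curve whose radius is a Fourier series in the angle. This substitution is made only to keep the example short; the dictionary keys `a0` and `harmonics` and the function name are hypothetical.

```python
import math


def vertices_from_coefficients(coeffs, n_vertices):
    """Evaluate r(theta) = a0 + sum_k (a_k cos(k theta) + b_k sin(k theta))
    at n_vertices equally spaced angles and return Cartesian (x, y) vertices."""
    vertices = []
    for i in range(n_vertices):
        theta = 2.0 * math.pi * i / n_vertices
        r = coeffs["a0"] + sum(
            a * math.cos(k * theta) + b * math.sin(k * theta)
            for k, (a, b) in enumerate(coeffs["harmonics"], start=1)
        )
        vertices.append((r * math.cos(theta), r * math.sin(theta)))
    return vertices


# What the file would store at IndexDepth -1: a handful of coefficients.
stored = {"a0": 1.0, "harmonics": [(0.0, 0.0), (0.25, 0.0)]}

# What two different readers compute from the same stored recipe.
coarse = vertices_from_coefficients(stored, 8)
fine = vertices_from_coefficients(stored, 64)
```

The stored data is identical for both readers; only the resolution argument differs. A preview tool can ask for 8 vertices and a publication renderer for 64 without the writer having anticipated either.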

**On the Maximal-Depth Principle:** The recommendation to assign the maximum
IndexDepth a Skeleton may ever need is a practical heuristic, not an axiom. It is
likely correct in most cases — adding an index space at a deeper level later without
changing the existing structure is easier if the depth was pre-allocated. However,
this principle is tentative and may be refined by implementation experience.

---

# 5. Chart and Type Axioms

## Axiom 7: {#axiom-chart-names} Chart component names are the axioms of the chart; all other names derive

The member names of the named HDF5 compound type for a chart are the coordinate
function names xᵘ : M → ℝ. They are the starting axioms of the chart. All
algebraic type names within the chart (tangent vector components, co-vector
components, exterior product names, tensor component names) are derived from these
by explicit rules.

This axiom is partially compromised in the current implementation (the `D`/`d` prefix
and `^` separator are naming conventions) but it defines the ideal: if structural
encoding of covariance and grade were available without naming conventions, the
component names would be sufficient without any prefixes.
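
Name derivation from chart component names can be sketched as follows. The exact spelling is an assumption: the sketch reads the `D`/`d` convention as `D` for tangent vectors and `d` for co-vectors, and writes two-form names as co-vector names joined by `^`. The normative rules belong to the specifications and may differ in detail.

```python
from itertools import combinations


def tangent_vector_names(coords):
    # Assumed convention: "D" prefix marks tangent vector components.
    return ["D" + c for c in coords]


def covector_names(coords):
    # Assumed convention: "d" prefix marks co-vector components.
    return ["d" + c for c in coords]


def two_form_names(coords):
    # Assumed convention: exterior products join co-vector names with "^".
    return [f"d{a}^d{b}" for a, b in combinations(coords, 2)]


cartesian = ("x", "y", "z")

vectors = tangent_vector_names(cartesian)
covectors = covector_names(cartesian)
two_forms = two_form_names(cartesian)
```

Only the tuple `cartesian` is chosen freely; every other name is computed from it. That is the sense in which the chart component names act as the axioms of the chart.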

**Theorem 7.1:** The separation of chart types (defined by component names) from
chart objects (specific instances with transformation rules) follows from Axiom 7
combined with Axiom F (fiber bundles). A chart type is the abstract structure; a
chart object is a specific embedding in the file.

**On chart types as measurement rules:** A coordinate system is not merely a naming
convention — it defines a *measurement rule*. Cartesian coordinates {x, y, z} encode
the instruction "measure three orthogonal distances." Spherical coordinates {r, θ, φ}
encode "measure one distance and two angles, in a specific order with specific
orientations." These measurement instructions are usually implicit knowledge among
practitioners of a field. F5 makes this explicit: a chart type description,
storable as an attribute on the ChartDomain group, can record the measurement
semantics of each coordinate in human- and machine-readable form.

A file that carries complete chart type descriptions is **axiomatically self-descriptive**:
a reader with no prior knowledge of the coordinate system can determine how to
interpret every stored value from the file itself. Self-descriptiveness is also a
design goal of HDF5 (it achieves it at the syntactic level — how data is laid out).
F5 extends this to the semantic level — what the data means. A fully self-descriptive
F5 file requires no external documentation to interpret correctly.

---

# 6. Summary: Axioms and Theorems

## Axioms (foundational, not derived)

| Label | Statement |
|---|---|
| F | Scientific data has the structure of a fiber bundle |
| 1 | Describe how, not what |
| 2 | Information belongs in structure, not names or enumerations |
| 3 | Simplicity for simple cases |
| 4 | File identity is semantically irrelevant |
| 5 | Forward and backward compatibility |
| 6 | Data is organized by index spaces; IndexDepth encodes their role |
| 7 | Chart component names are the axioms of the chart |

## Hypothesis (confirmed but potentially falsifiable)

| Label | Statement |
|---|---|
| L | Five (+Two) levels of hierarchy are necessary and sufficient for scientific simulation data |

## Selected Theorems (derived from axioms)

| Label | Derived from | Statement |
|---|---|---|
| F.1 | F | Topology and geometry must be stored independently (Skeleton/Representation split) |
| F.2 | F | A Skeleton can have multiple Representations simultaneously |
| F.3 | F | Algebraic type (covariance, grade) must be stored for transformable data |
| 1.1 | 1 | No enumeration of cell or field types is defined in the core model |
| 1.2 | 1 | Forward compatibility follows from describing properties, not categories |
| 2.1 | 2 | Structural HDF5 features that encode what naming conventions encode SHOULD be adopted |
| 2.2 | 2 | `Positions` is the minimal necessary exception to the no-reserved-names principle |
| 3.1 | 3 | The +2 optional levels are user-invisible performance mechanisms |
| 3.2 | 3 | Absence of an attribute is informative; the simplest interpretation applies |
| 4.1 | 4 | The +2 levels must be restructurable without semantic consequence |
| 4.2 | 4 | Named types must be locally available in any file that uses them |
| 4.3 | 4 | Performance enhancements orthogonal to core semantics may be added freely |
| 5.1 | 5 | Versioning must be per-field, not per-file |
| 5.2 | 5 | Version differences are not incompatibilities |
| 6.1 | 6 | Vertices are the reference index space (depth zero) by definition |
| 6.2 | 6 | Negative IndexDepth encodes procedural/derived coordinates |
| 7.1 | 7, F | Chart types vs. chart objects distinction follows from axioms |

---

# 7. Literature

The fiber bundle foundation:
- Butler, D.M., Pendley, M.H.: "A visualization model based on the mathematics of
  fiber bundles." Computers in Physics 3(5), pp. 45–51 (1989).

The F5 data model and the five-level hypothesis:
- Benger, W.: "Visualization of General Relativistic Tensor Fields via a Fiber Bundle
  Data Model." PhD thesis, FU Berlin / ZIB, August 2004.
  https://www.fiberbundle.net/papers/TensorFieldViz.pdf

The algebraic type system:
- Benger, W., et al.: "Using Geometric Algebra for Navigation in Riemannian and Hard
  Disc Space." GraVisMa 2009, Plzen, pp. 80–89.
  http://gravisma.zcu.cz/GraVisMa-2009/Papers_2009/!_2009_GraVisMa_proceedings-FINAL.pdf

The "how not what" philosophy:
- Benger, W.: "On Safari in the File Format Jungle — Why Can't You Visualize My
  Data?" IEEE Computing in Science & Engineering, Nov/Dec 2009, pp. 98–103.

---

## Usage Restriction

This specification and all documents in the F5 specification series are published for
academic and personal use. Use of this specification, or any implementation derived
from it, by employees of military or defense-related organizations, or within
facilities producing weapons or conducting research on weapon design, or for any other
military purpose, is explicitly prohibited and contrary to the intent of the authors.

This restriction follows the spirit and terms of the light++ license under which the
original F5 reference implementation was published. The rationale is stated there
directly: software is technology, technology conveys power, and the inventor bears
responsibility for deciding to whom that power is granted. This specification was
developed to advance scientific understanding — not to enable harm.

See: https://www.fiberbundle.net/doc/copyright.html

---

# **End of Document**

---