0. Introduction and Scope

This document defines the canonical F5 layout model for reading, validating, traversing, and interpreting F5 HDF5 files. It is the authoritative specification and supersedes all heuristics, naming conventions, or legacy assumptions. All semantics are explicit and identity‑based.

The F5 model is:

deterministic
backend‑agnostic
semantically explicit
time‑aware
fragment‑aware
chart‑aware
index‑space‑aware
geometry‑driven rather than storage‑driven

This specification is intended for:

code generators
validators
readers and writers
tooling authors
researchers
anyone implementing F5 semantics

The canonical hierarchy is:

Timeslice
→ Grid
→ Skeleton
→ Representation
→ Field
→ Fragment datasets

Each level has strict semantics and cannot be reordered or extended with additional hierarchy layers.

1. Canonical Hierarchy

The F5 hierarchy is structurally fixed:

Timeslice
└── Grid
    └── Skeleton
        └── Representation
            └── Field
                └── Fragment datasets

Each level has a specific role:

Timeslice: temporal grouping, identified by a scalar Time attribute
Grid: collection of skeletons representing one physical domain
Skeleton: topological structure (vertices, edges, faces, cells, etc.)
Representation: coordinate or relative representation of a skeleton
Field: data defined over the skeleton’s index space
Fragment: contiguous or structured subset of a field

Additional hierarchy levels MUST NOT be inserted.

Group names do not define semantics unless explicitly stated.

2. Timeslices

2.1 Definition

A root‑level group is a timeslice if and only if it has a scalar attribute named:

Time

The attribute MUST be convertible to a double‑precision real.
Group names are irrelevant.

2.2 Acceptable Time values

The Time attribute MAY be stored as:

integer
floating‑point
text that parses unambiguously to a floating‑point value

It MUST be scalar.
Non‑scalar values (arrays, compound types) are malformed.
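
Non-normative sketch: the conversion rule above could be implemented along the following lines. The function name parse_time is hypothetical, Python's float() grammar is taken as a stand-in for "parses unambiguously", and rejecting booleans is an assumption that this specification does not state.

```python
def parse_time(value):
    """Return a Time attribute value as a float, or raise ValueError if malformed."""
    # bool is a subclass of int in Python; rejecting it is an assumption.
    if isinstance(value, bool):
        raise ValueError("boolean is not an acceptable Time value")
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        try:
            return float(value.strip())
        except ValueError:
            raise ValueError(f"text {value!r} does not parse to a floating-point value") from None
    # Arrays, compound values, and any other non-scalar input are malformed.
    raise ValueError(f"non-scalar or unsupported Time value: {value!r}")
```

A reader would call this on the raw attribute value and treat a ValueError as a malformed candidate timeslice rather than as a reason to stop processing the file.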

2.3 Behavior on malformed or missing Time

Missing Time → group is not a timeslice
Presence of a Time attribute makes the group a candidate timeslice; candidate timeslices are validated per 12.1.1.
File processing MUST NOT abort solely because of malformed timeslice attributes; see 12.1.2 for propagation rules.

2.4 Ordering

Valid timeslices are ordered by ascending numeric Time.
Tie‑breaking (if needed): lexicographic group name.

2.5 Future generalization

The model MAY later support multidimensional parameter slices.
This is not implemented and MUST NOT be assumed.

3. Timeslice Merging

If multiple root groups convert to the same numeric Time, they represent the same physical timeslice and MUST be merged.

3.1 Merge semantics

Identification

Groups with identical converted Time values form a merge set.

Logical identity

The merged timeslice uses:

the numeric Time value
a canonical name = lexicographically smallest group name in the merge set

Children union

Child groups are merged recursively:

unique names → included directly
duplicate names → recursively merged

Attribute resolution

If an attribute appears in multiple merged groups:

if semantically equal → keep
if different → warning; choose value from lexicographically smallest group
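
Non-normative sketch of the attribute resolution rule. The function name and the dictionary input are hypothetical, and "semantically equal" is approximated here by Python's == operator, which is an assumption.

```python
def resolve_attribute(attr_name, values_by_group):
    """Resolve one attribute that appears in several groups of a merge set.

    values_by_group maps group name -> attribute value.
    Returns (chosen_value, warnings).
    """
    warnings = []
    # The lexicographically smallest group name supplies the value on conflict.
    winner = min(values_by_group)
    chosen = values_by_group[winner]
    if any(value != chosen for value in values_by_group.values()):
        warnings.append(
            f"attribute {attr_name!r} differs across merged groups; "
            f"using value from {winner!r}"
        )
    return chosen, warnings
```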

Dataset merging

If datasets share the same name:

if dataspaces or semantic types differ → fatal error
otherwise → treat as the same logical dataset

Diagnostics

Record:

merged group names
attribute conflicts
fatal inconsistencies

3.2 Ordering after merge

Merged logical timeslices are ordered by ascending numeric Time.
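
Non-normative sketch of merge-set identification, canonical naming, and ordering. It assumes the Time values have already been converted to floats and that Python's string comparison is an acceptable lexicographic order; the function name is hypothetical.

```python
from collections import defaultdict

def merge_timeslices(groups):
    """Group valid timeslices by numeric Time.

    groups is an iterable of (group_name, numeric_time) pairs.
    Returns a list of (time, canonical_name, member_names), ordered by ascending time.
    """
    merge_sets = defaultdict(list)
    for name, time in groups:
        merge_sets[time].append(name)
    merged = []
    for time, names in merge_sets.items():
        members = sorted(names)
        # Canonical name is the lexicographically smallest name in the merge set.
        merged.append((time, members[0], members))
    merged.sort(key=lambda entry: entry[0])
    return merged
```

The member list doubles as the "merged group names" diagnostic record; recursive merging of children is not shown.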

4. Grids

A Grid groups skeletons that belong to the same physical domain within a timeslice.

4.1 Identification

A Grid MAY define the attribute:

F5::GridID

If present, this is the canonical identifier for the grid.
If absent, the grid’s group name is used as a fallback identifier, and a warning is recommended.

Grids with the same identifier across timeslices represent the same physical grid evolving in time.
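
Non-normative sketch of the identifier fallback. The attributes of the Grid group are modeled as a plain dictionary, and the function name is hypothetical.

```python
def grid_identifier(group_name, attributes):
    """Return (identifier, warnings) for a Grid group."""
    warnings = []
    if "F5::GridID" in attributes:
        return attributes["F5::GridID"], warnings
    # Fallback: the group name identifies the grid, with a recommended warning.
    warnings.append(f"grid {group_name!r} has no F5::GridID; using group name as identifier")
    return group_name, warnings
```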

4.2 Structure

A Grid MUST contain at least one Skeleton subgroup.
A Grid MAY also contain a Charts subgroup defining local charts.

4.3 Merging grids across files

When loading multiple files:

Grids with the same identifier at the same timeslice MUST be merged.
Child skeletons, representations, fields, and datasets are merged by name.
Identical dataset paths refer to the same logical dataset; if the datasets differ in dataspace or type, this is a fatal error.
Fragment datasets without offsets are concatenated in file load order.
Overlapping fragments produce a warning; last file wins.

4.4 Deterministic traversal

Traversal order of grids is:

by F5::GridID if present
otherwise lexicographic by group name

This ordering is for traversal only; it has no semantic meaning.

5. Skeletons

A Skeleton describes a topological structure such as vertices, edges, faces, or cells.

5.1 Required attributes

A Skeleton MUST define:

F5::SkeletonDimensionality (int), mandatory
F5::rank (int), recommended
IndexDepth (int), mandatory

Missing mandatory attributes are a fatal error. Name‑based inference is permitted only when importing non‑F5 or legacy formats that rely on conventional naming (e.g., “Points”, “Vertices”, “Triangles”). This mechanism MUST NOT be applied to native F5 files.

Writers of F5 files MUST NOT rely on naming conventions to convey semantics. Although names carry no semantics in the F5 model, writers of native F5 files are strongly encouraged to choose human‑interpretable names for Skeletons and Representations. This improves readability and usability without introducing semantic meaning.

If F5::rank is absent, readers MUST infer rank from F5::SkeletonDimensionality. Missing rank is a warning, not an error.

5.2 Optional attributes

FiberLib::FragmentLayout (int vector)
FiberLib::NumberOfFragments (int)
Refinement (int vector)

Refinement (int vector) describes per‑dimension refinement factors. Its length MUST equal F5::SkeletonDimensionality. Each element specifies the refinement factor along that dimension. The Skeleton’s refinement level is defined as max(Refinement). If the Refinement attribute is absent, the Skeleton’s refinement level is 0.
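
Non-normative sketch of the refinement-level rule. Skeleton attributes are modeled as a dictionary; the function name is hypothetical, and returning 0 for an empty Refinement vector is an assumption for the degenerate zero-dimensional case.

```python
def refinement_level(attributes):
    """Return the Skeleton's refinement level from its attributes."""
    refinement = attributes.get("Refinement")
    if refinement is None:
        return 0  # absent Refinement means refinement level 0
    dimensionality = attributes["F5::SkeletonDimensionality"]
    if len(refinement) != dimensionality:
        raise ValueError(
            f"Refinement has {len(refinement)} entries, expected {dimensionality}"
        )
    # default=0 covers an empty vector (assumption, not stated in the text).
    return max(refinement, default=0)
```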

5.3 Semantics

IndexDepth: number of indexing traversals required to reach vertices.
SkeletonDimensionality: geometric or embedding dimension.
rank: a structural descriptor of the Skeleton’s dimensionality and refinement behavior. It is redundant with SkeletonDimensionality but useful for consistency checking and for readers ingesting non‑HDF5 sources. rank is an advisory attribute for writers and readers; it carries no semantic meaning for interpretation. Readers MAY use rank for traversal or presentation, but MUST NOT rely on it for semantic interpretation.

5.4 Structural rules

Skeletons are not nested.
A Grid MAY contain multiple Skeletons.
Missing optional attributes produce warnings, not errors.
Skeletons define the index space for all fields under their representations.

5.5 Deterministic ordering

Skeleton ordering has no semantic meaning and is not required for any F5 operation. Readers MAY traverse Skeletons in any order. For reproducibility, readers that choose a deterministic traversal SHOULD use the following keys in order: IndexDepth, SkeletonDimensionality, Refinement, lexicographic group name.
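
Non-normative sketch of the recommended sort key. Skeletons are modeled as a dictionary from group name to attribute dictionary; comparing Refinement as a tuple and treating an absent Refinement as an empty tuple are assumptions, and both function names are hypothetical.

```python
def skeleton_sort_key(name, attributes):
    """Key: IndexDepth, SkeletonDimensionality, Refinement, then group name."""
    return (
        attributes["IndexDepth"],
        attributes["F5::SkeletonDimensionality"],
        tuple(attributes.get("Refinement", ())),  # absent -> empty tuple (assumption)
        name,
    )

def ordered_skeletons(skeletons):
    """Return Skeleton group names in the recommended reproducible order."""
    return sorted(skeletons, key=lambda name: skeleton_sort_key(name, skeletons[name]))
```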

5.6 Examples

Vertices: IndexDepth = 0
Edges: IndexDepth = 1, SkeletonDimensionality = 1
Faces: IndexDepth = 1, SkeletonDimensionality = 2
Cells: IndexDepth = 1, SkeletonDimensionality = 3
Higher nesting: IndexDepth = 2 or more

6. Representations and Charts

A Representation is a subgroup of a Skeleton that defines how the skeleton’s elements are interpreted geometrically or relationally.

A subgroup of a Skeleton is a Representation if its name matches either:

a Chart name (coordinate representation), or
another Skeleton name (relative representation).

If a relative representation refers to a Skeleton in another Grid or Timeslice, the rules in 6.5 apply: the F5::Reference attribute overrides the group name.

If a subgroup name matches both a Chart name and a Skeleton name, the Chart match takes precedence and the subgroup is interpreted as a coordinate representation. A warning SHOULD be issued.

Naming a Chart identically to a Skeleton (or vice versa) is a modeling error and MUST be prevented by writers. If such a conflict appears in an HDF5 file, the file is malformed. Readers MUST treat the subgroup as a coordinate representation (Chart precedence) and issue a warning.

Representations MAY reference Skeletons that appear later in the file or in other merged files. Resolution of relative representations occurs after all Skeletons in the Grid have been loaded. If a referenced Skeleton cannot be resolved at that stage, this is a fatal error.

Subgroups of a Skeleton that match neither a Chart name nor a Skeleton name are ignored with a warning, unless they contain an F5::Reference attribute (see above), in which case they are treated as explicit references.
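
Non-normative sketch of the classification rules above. Chart and Skeleton names are passed in as sets and the subgroup's attributes as a dictionary; the function name and the returned labels are hypothetical.

```python
def classify_representation(name, attributes, chart_names, skeleton_names):
    """Return (kind, warnings), where kind is 'coordinate', 'relative', or 'ignored'."""
    warnings = []
    # An explicit reference takes priority over any name matching.
    if "F5::Reference" in attributes:
        return "relative", warnings
    if name in chart_names:
        if name in skeleton_names:
            warnings.append(
                f"{name!r} matches both a Chart and a Skeleton; Chart takes precedence"
            )
        return "coordinate", warnings
    if name in skeleton_names:
        return "relative", warnings
    warnings.append(f"subgroup {name!r} matches no Chart or Skeleton; ignored")
    return "ignored", warnings
```

The skeleton_names set is expected to be complete for the Grid before this function is called, since resolution happens only after all Skeletons in the Grid have been loaded.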

HDF5 is a random-access hierarchical store. Readers MUST NOT assume any ordering of groups or datasets. Skeletons MUST be resolved independently of traversal order (see 6.6).

Recovery mode is a non-normative, reader-controlled error-tolerance mode in which readers MAY attempt best-effort recovery from modeling errors (for example, chart/skeleton name conflicts). Recovery mode is not part of the F5 normative model; readers MUST document when they operate in recovery mode and record diagnostics.

6.1 Coordinate Representations

A coordinate representation maps skeleton elements to coordinates in a Chart.

6.1.1 Charts

Charts live under:

/Charts/

Charts define:

coordinate systems
named datatypes
precision variants
optional default datatype

6.1.2 Local Charts

Each Grid MAY define local charts under:

/Charts/

Local charts MUST contain a GlobalChart attribute pointing to the corresponding chart under /Charts/.

6.1.3 Transformations

Transformations between charts are represented as subgroups.
Both directions MUST be present for a complete transformation pair.

6.2 Relative Representations

A relative representation maps skeleton elements to indices of another skeleton.

Example:

Faces/Points

Meaning: faces are defined by indices into points.

Relative representations define connectivity but do not define geometry.

6.3 Units and axes

Units and axis order come from named datatypes, not from representations.
Representations do not define units.

6.4 Default Chart

If a Grid does not define any Chart:

A default “StandardCartesianChart3D” chart MUST be assumed.
This behavior is mandatory for backward compatibility.
A warning is recommended but not required.
Explicit charts are strongly preferred.

6.5 Cross-Grid / Cross-Timeslice Skeleton references

F5::Reference is an optional attribute on a Representation group. It holds an HDF5 object reference or a path that points to the target Skeleton group, which MAY be in another Grid or Timeslice. When F5::Reference is present, the Representation is a relative representation to the referenced Skeleton regardless of the Representation’s name, and name-matching is not used to determine the Representation type. Cross-timeslice or cross-grid references MUST use F5::Reference. A writer SHOULD use a name that hints at the destination Skeleton, e.g. “Points_T=0” for a reference to /T=0//Points. This is purely for human readability.

6.5.1 Representation References

F5::Reference is either an HDF5 object reference or a string attribute. A string attribute containing a full path is recommended because it allows easier name resolution for readers. An HDF5 object reference (H5Rcreate_object(), H5R_OBJECT) MAY be used to better utilize the underlying HDF5 capabilities, but requires readers to identify Skeletons via a lookup table from HDF5 IDs rather than by HDF5 path. A path in the string attribute MUST always be absolute. HDF5 has no parent-group notation such as the “..” of a filesystem, so relative links cannot be modeled here (Skeleton groups are always parents of Representation groups).

6.6 Skeleton discovery and resolution strategy

6.6.1 Local discovery requirement

Readers MUST collect and index all Skeletons that are children of the same Grid before resolving Representations that reference Skeletons within that Grid.

6.6.2 Cross-Grid and cross-Timeslice references

Representations MAY reference Skeletons in other Grids or Timeslices via explicit references (see F5::Reference). When a Representation references a Skeleton outside its local Grid or Timeslice, readers MUST resolve that reference before treating the Representation as valid.

6.6.3 On-demand global discovery

Readers MAY discover Skeletons across Grids and Timeslices lazily (on demand) rather than scanning the entire file(s) up front. On-demand discovery is the recommended default for large datasets because it avoids unnecessary I/O and memory use.

6.6.4 Optional eager discovery mode

Readers MAY offer an optional eager (pre-scan) mode that enumerates all Skeletons across all Grids and Timeslices before resolving Representations. This mode is permitted but not required; it is intended for tools that prioritize global analysis or human-readable diagnostics over minimal I/O.

6.6.5 Hybrid strategy

Readers MAY implement a hybrid strategy: perform a light metadata pass to collect skeleton identifiers and cheap attributes, then perform full discovery only for Skeletons that are actually referenced or requested by the application.

6.6.6 Implementation guidance

6.6.6.1 Dependency graph

Readers that resolve cross-references SHOULD build a dependency graph of Skeletons and Representations. Use the graph to:

determine resolution order
detect missing targets
identify cycles in derivation or reference chains

6.6.6.2 Cycle detection

Readers MUST detect cycles in cross-Skeleton or cross-Representation references. On cycle detection, readers MUST abort the resolution chain for the cycle, mark the involved entities as unresolved/partial, and emit a diagnostic describing the cycle.
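
Non-normative sketch of cycle detection using a depth-first search with three node states. The dependency graph is modeled as a dictionary from an entity to the entities it references; the function name and the string identifiers are hypothetical. The recursive form is chosen for brevity, so very deep reference chains would need an iterative variant.

```python
def find_cycle(dependencies):
    """Return a list of entities forming a cycle (first == last), or None.

    dependencies maps each entity to an iterable of entities it references.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    stack = []

    def visit(node):
        state[node] = IN_PROGRESS
        stack.append(node)
        for target in dependencies.get(node, ()):
            target_state = state.get(target, UNVISITED)
            if target_state == IN_PROGRESS:
                # Back edge: the cycle is the stack suffix starting at target.
                return stack[stack.index(target):] + [target]
            if target_state == UNVISITED:
                cycle = visit(target)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = DONE
        return None

    for node in list(dependencies):
        if state.get(node, UNVISITED) == UNVISITED:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

The returned list can be used directly in the diagnostic, and its members are the entities to mark as unresolved or partial.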

6.6.6.3 Caching and memory management

Readers that perform on-demand discovery SHOULD cache resolved Skeleton metadata and index structures to avoid repeated I/O. Caches SHOULD be bounded and evictable to limit memory usage for very large files.
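
Non-normative sketch of a bounded, evictable cache with least-recently-used eviction. The class name, the loader callback, and the choice of LRU as the eviction policy are assumptions; this specification only asks that caches be bounded and evictable.

```python
from collections import OrderedDict

class SkeletonMetadataCache:
    """Bounded cache of resolved Skeleton metadata, keyed by Skeleton path."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, path, loader):
        """Return cached metadata for path, calling loader(path) on a miss."""
        if path in self._entries:
            self._entries.move_to_end(path)  # mark as most recently used
            return self._entries[path]
        value = loader(path)
        self._entries[path] = value
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return value
```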

6.6.6.4 Diagnostics and fallbacks

If a referenced Skeleton cannot be resolved (missing, invalid, or in an inaccessible file), readers MUST:

treat the referencing Representation as invalid or partial according to the fatal-error propagation rules in 12.1.2
emit a clear diagnostic indicating the unresolved reference and its origin

Readers MAY provide configurable fallback behavior (for example, best-effort local derivation) but such fallbacks are implementation choices and MUST be documented by the reader.

6.6.6.5 Non-normative performance considerations

Prefer on-demand discovery for large datasets and interactive use.
Use eager discovery for batch analysis or when the application explicitly requests a global view.
When resolving references across files, prefer metadata-only operations first (existence checks, attributes) before loading large datasets.

6.6.6.6 Normative summary

Readers MUST collect local Grid Skeletons before resolving local Representations. Readers MUST resolve explicit cross-Grid or cross-Timeslice references before accepting a Representation as valid. Readers MAY perform global Skeleton discovery eagerly, but SHOULD default to on-demand discovery for efficiency. Readers MUST detect and handle cycles, cache metadata prudently, and emit diagnostics for unresolved references.

7. Fields

A Field is a child of a Representation and attaches data to the skeleton’s index space.
Fields describe what is stored, not how it is geometrically interpreted — geometry comes from the Positions field.

7.1 Identification

Fields are semantically identified by their datatype. If multiple fields in the same Representation share the same datatype, their field names MAY be used to disambiguate. However, when the datatype alone is insufficient or unavailable to determine algebraic or geometric interpretation, TypeInfo is required (see 8.5).

Field names have no semantic meaning within F5 (Positions is the sole exception, as F5 assigns it normative structural meaning). They exist only to distinguish multiple fields of identical datatype. F5 is agnostic to any additional semantics an application may attach to field names; applications MAY use field names to encode application-specific meaning such as ‘Velocity’, ‘Temperature’, or ‘Color’, and F5 tools MUST neither require nor reject such names.

7.1.1 Positions

Positions is the only field with special semantics.

In coordinate representations, Positions MUST use the coordinate datatype defined by the Chart and encode geometric coordinates.

In relative representations, Positions MUST be an integer array whose elements index into the target Skeleton’s index space. The arity of each entry (e.g., 2 for edges, 3 for triangles) is determined by the topological structure of the target Skeleton.

The two uses of Positions are distinguished solely by the type of Representation (coordinate vs. relative).

7.1.2 Requirement rule for omitted Positions and Fields

A Representation MUST contain exactly one Positions field unless the Positions values are intentionally omitted. Omission of Positions is permitted and is not by itself a fatal error.

7.1.2.1 Reader obligations

If a Representation contains a Positions field, readers use it as the authoritative geometric embedding.

If Positions is omitted, readers MUST treat the Representation as potentially partial and MUST NOT assume geometry is available.

Readers MAY attempt to derive omitted Positions using any heuristics or methods they implement. Derivation is an implementation choice. Readers that cannot or will not derive Positions MUST treat the Representation as partial and MUST emit a diagnostic indicating the missing geometry.

Readers that derive Positions MAY do so at any granularity including per-Representation, per-fragment, or per-element. Readers that compute and cache derived values SHOULD record provenance (human-readable string attribute describing the performed data derivation action) and a timestamp (7.1.2.6) on the cached values.

7.1.2.2 Cycle detection and safety

Readers that attempt derivation MUST detect and break cycles in derivation dependencies. If a derivation chain forms a cycle, the reader MUST stop derivation, treat the involved Representations as partial, and emit a diagnostic describing the cycle.

7.1.2.3 Preference guidance for readers

When multiple derivation paths are available, readers SHOULD prefer methods that maximize numerical stability, then minimize computational cost. This preference is guidance only and not normative.

7.1.2.4 Interoperability principle

Interoperability is achieved by conservative defaults: if Positions are absent and no reader-supported derivation exists, treat the Representation as partial rather than attempting speculative, cross-file operations.

7.1.2.5 Non-normative guidance

Examples of reader strategies include chart transformations, indirection mappings, refinement inheritance, temporal interpolation, and procedural generation. Implementations will vary by application and numerical requirements.

Caching: readers that compute derived values are encouraged to cache them locally and record a timestamp (7.1.2.6) to aid downstream tools.

Diagnostics: readers SHOULD provide clear diagnostics indicating whether a Representation is usable for a requested operation and why (missing Positions, unsupported derivation method, cycle detected).

7.1.2.6 Timestamps

Use the HDF5 timestamp property on groups and datasets (e.g. H5O_info2_t via H5Oget_info3() with H5O_INFO_TIME). When supporting another file format, use attributes or another storage method specific to that format.

7.2 Field contents

A Field MAY contain:

a single dataset (contiguous field)
multiple datasets (fragmented field)
subgroups representing separated compound components
attributes defining procedural fields

Fields MAY mix these forms as long as semantics remain consistent.

7.3 Field–Skeleton relationship

A Field’s index space MUST match the Skeleton’s index space.
This is enforced through:

fragment offsets
fragment sizes
procedural definitions
or implicit coverage rules

A Field MAY be partial (see Section 9).

7.4 Time‑dependence

A Field MAY be:

time‑independent (same dataset identity across timeslices)
time‑dependent (new dataset identity at a timeslice)
partially time‑dependent (some fragments change, others remain linked)

Symbolic links MUST be used to express time‑independence.

8. Field Types

F5 supports several field types, each with distinct semantics.

8.1 Contiguous Field

A contiguous field consists of a single dataset containing all values.

The dataset’s size defines the field’s size.

8.2 Fragmented Contiguous Field

A fragmented field consists of multiple datasets, each representing a fragment.

Each fragment MAY define:

offset (index‑space offset)
Range
Fiber::NumericalShift
CellSize

Fragment names have no semantic meaning.

Fragment placement is determined solely by:

the fragment’s offset attribute (if present), and
the coordinates in the Positions field.

Traversal order of fragments has no semantic effect.

8.3 Separated Compound Field

A compound datatype MAY be stored as separate datasets:

x
y
z

Each component MAY be:

contiguous
fragmented
partially fragmented

All components MUST share consistent fragment attributes.

8.4 Procedural Fields

Procedural fields define values algorithmically.

UniformSampling

Attributes:

base
offset

Value at index i:

base + offset * i
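
Non-normative sketch of evaluating a UniformSampling field. The count parameter is an assumption, since the attribute that supplies the field size is not named in this section; the function name is hypothetical.

```python
def uniform_sampling(base, offset, count):
    """Return the values base + offset * i for i in 0 .. count - 1."""
    return [base + offset * i for i in range(count)]
```

For example, uniform_sampling(1.0, 0.5, 4) evaluates to [1.0, 1.5, 2.0, 2.5].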

DirectProduct

Defined by N one‑dimensional arrays.
The field value is the Cartesian product of these arrays.

This is used to construct rectilinear grids (non‑uniform regular grids).
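
Non-normative sketch of expanding a DirectProduct field into explicit coordinate tuples. The iteration order (last axis varying fastest) is an assumption made here for illustration and is not defined by this section; the function name is hypothetical.

```python
from itertools import product

def direct_product(*axes):
    """Return all coordinate tuples of the Cartesian product of the given 1-D arrays."""
    # itertools.product varies the last axis fastest (assumed ordering).
    return list(product(*axes))
```

Two axes with 2 and 3 samples yield the 6 points of a rectilinear grid, for example direct_product([0.0, 1.0], [10.0, 20.0, 40.0]).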

FragmentedUniformSampling

UniformSampling applied per fragment, with fragment‑level attributes.

8.5 TypeInfo

TypeInfo is required when the datatype alone is insufficient or unavailable to determine the field’s algebraic or geometric interpretation; this includes cases where multiple fields share the same low-level datatype and names alone are insufficient for unambiguous interpretation.

Examples include:

multiple fields sharing the same datatype
tensor fields whose rank or variance cannot be inferred from the datatype
fields participating in chart transformations
compound datatypes lacking explicit component semantics
fields stored as a group rather than a dataset, e.g. the separated compound layout

In all other cases, TypeInfo is recommended for clarity.

A Field MAY exist without any datasets or fragments only if TypeInfo is present. In this case, TypeInfo defines the field’s datatype and algebraic meaning. Such a field is considered empty but well-typed, and fragments MAY be added later. A field with no datasets and no TypeInfo is malformed.

9. Fragment Semantics

Fragments are the fundamental unit of partial storage.

9.1 Fragment identity

Fragment identity is determined by dataset object identity, not by:

name
content
order
path

Two fragments are identical only if they reference the same HDF5 dataset object.

9.2 Fragment ordering

Fragment names are irrelevant.
Traversal order has no semantic meaning.

Readers MUST treat fragments as random‑access containers.

Any geometric or data‑processing algorithm MUST derive ordering from coordinates, not from fragment order.

9.3 Fragment attributes

Fragment attributes define placement and interpretation.

If multiple fields define fragment attributes:

attributes are merged
inconsistent attributes produce a warning
file load order determines precedence
inconsistent attributes across files are an error condition

9.4 Partial fields

A field is implicitly partial if some regions of the index space are not covered by any fragment.

Querying a location with no fragment coverage yields a default value:

zero, or
the HDF5 fill value (if defined)

No explicit “partial” marker is required.
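
Non-normative sketch of querying a partial field in a one-dimensional index space. Fragments are modeled as (offset, values) pairs; preferring the fill value over zero whenever one is defined is an interpretation of the list above, and the function name is hypothetical. Overlapping fragments are not handled here; the first matching fragment is returned.

```python
def query(fragments, index, fill_value=None):
    """Return the value at index, or the default if no fragment covers it.

    fragments is an iterable of (offset, values) pairs.
    """
    for offset, values in fragments:
        if offset <= index < offset + len(values):
            return values[index - offset]
    # No coverage: use the fill value if defined, otherwise zero.
    return 0 if fill_value is None else fill_value
```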

10. Time‑Dependence and Identity Rules

Time‑dependence in F5 is defined strictly in terms of dataset object identity, not content equality.
This ensures deterministic interpretation across files, writers, and timeslices.

10.1 First occurrence

The first occurrence of a field is the earliest timeslice containing a real dataset object for that field.

Symbolic links do not count as first occurrences; they refer back to an earlier dataset.

10.2 Time‑independent fields

A field is time‑independent over an interval if all timeslices reference the same dataset object identity.

This MUST be expressed using symbolic links.

Writers MUST use symbolic links to express identity. Copying a dataset destroys identity information and MUST NOT be used to express time‑independence. If a writer copies a dataset intentionally, the reader will treat it as a distinct dataset with distinct semantics.

10.3 Time‑dependent fields

A field is time‑dependent at timeslice T if it contains a new dataset object at that timeslice.

Time‑dependence MAY be:

full (entire field changes)
partial (only some fragments change)

10.4 Fragment‑level time‑dependence

Fragments MAY be time‑dependent independently of each other.

Examples:

Fragment 0 is linked across timeslices → time‑independent
Fragment 1 is replaced at timeslice T → time‑dependent

This allows efficient incremental updates.

10.5 Forbidden duplication

Duplicating a dataset instead of linking is forbidden because:

it destroys identity information
it breaks time‑independence detection
it introduces ambiguity

Readers MUST treat every dataset object as semantically distinct unless it is literally the same HDF5 object (via link). Readers cannot detect whether two datasets were copied intentionally or accidentally. Identity is determined solely by HDF5 object identity.

Writers MUST use symbolic links to express identity.
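
Non-normative sketch of detecting time-independence intervals from object identity. Each occurrence pairs a numeric Time with an opaque object identifier; the integers used as identifiers are hypothetical stand-ins for whatever stable object identity the HDF5 binding in use exposes. The function name is also hypothetical.

```python
def identity_intervals(occurrences):
    """Collapse consecutive timeslices that share one dataset object.

    occurrences is a list of (time, object_id) pairs sorted by time.
    Returns a list of (first_time, last_time, object_id) intervals.
    """
    intervals = []
    for time, object_id in occurrences:
        if intervals and intervals[-1][2] == object_id:
            # Same object as the previous timeslice: extend the interval.
            start, _, _ = intervals[-1]
            intervals[-1] = (start, time, object_id)
        else:
            # New object identity: the field is time-dependent at this timeslice.
            intervals.append((time, time, object_id))
    return intervals
```

Content is never compared: two byte-identical copies with different identifiers produce two intervals, which matches the rule that duplication is a semantic change.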

11. Skeleton Index‑Space Rules

Skeletons define the index space for all fields under their representations.
Skeletons themselves do not have an intrinsic size; their size is derived from fields.

11.1 Field size

A field’s size is determined by its internal structure:

contiguous field → dataset size
fragmented field → sum of fragment sizes
procedural field → implied size from attributes

These sizes contribute to the skeleton’s index‑space coverage.

11.2 Skeleton size

Skeleton size is defined as the union of index coverage across all fields attached to the Skeleton. Coverage MAY come from:

contiguous datasets
fragmented datasets
procedural definitions
any combination thereof

11.2.1 Rules

An unfragmented field defines the full Skeleton index space. If multiple unfragmented fields exist, they MUST all have identical size; otherwise the Skeleton is invalid.

Fragmented fields MAY cover any subset of this index space.

If no unfragmented field exists, the Skeleton index space is defined as the union of coverage across all fragmented or procedural fields.

If an unfragmented field exists on a Skeleton, all fragmented fields on that Skeleton MUST cover only indices within the unfragmented field’s index space. Any fragment that defines coverage outside the unfragmented field’s index space is a fatal error.

Empty fields contribute no coverage.

This rule ensures:

consistent index space
compatibility across fields
predictable behavior for partial fields
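
Non-normative sketch of the rules above for a one-dimensional index space. Fields are modeled as ("unfragmented", size) or ("fragmented", [(offset, size), ...]) tuples, and coverage as a set of integer indices; this representation and the function name are hypothetical and chosen for readability, not efficiency. Procedural fields are omitted.

```python
def skeleton_index_space(fields):
    """Return the set of indices forming the Skeleton index space."""
    full_sizes = {payload for kind, payload in fields if kind == "unfragmented"}
    if len(full_sizes) > 1:
        raise ValueError("unfragmented fields differ in size; Skeleton is invalid")
    covered = set()
    for kind, payload in fields:
        if kind == "fragmented":
            for offset, size in payload:
                covered.update(range(offset, offset + size))
    if full_sizes:
        # An unfragmented field defines the full index space.
        space = set(range(full_sizes.pop()))
        if not covered <= space:
            raise ValueError("fragment covers indices outside the unfragmented index space")
        return space
    # Otherwise the index space is the union of fragment coverage.
    return covered
```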

11.3 Consistency requirements

All fields attached to a skeleton MUST:

share the same index space
provide values for all indices unless partial
use consistent fragment attributes
use consistent fragment offsets
use consistent procedural definitions

If a field is partial, missing regions MUST return default values (see Section 9).

Partiality is a structural property: a field is partial if its fragments or datasets do not cover the full Skeleton index space. The F5 model does not distinguish intentional from accidental partiality; readers MUST treat all partial fields uniformly. Writers are responsible for ensuring that partial fields are semantically correct.

11.4 Geometry determines ordering

Ordering of elements in the index space is determined by:

coordinates in the Positions field
fragment offsets
procedural definitions

Fragment names and storage order have no effect on index‑space ordering.

12. Validation Rules

Validation ensures that F5 files are semantically consistent and safe to interpret.

12.1 Fatal errors

The following conditions are fatal:

missing required skeleton attributes
incompatible datasets during merge
conflicting dataspaces
inconsistent fragment attributes across files

12.1.1 Timeslice validation

A group that contains a Time attribute is a candidate timeslice. A candidate timeslice becomes a valid timeslice only after its Time attribute is successfully parsed as a scalar real.

Candidate timeslice with a non-scalar or non-convertible Time attribute -> fatal error for that candidate timeslice (see 12.1.2 for propagation).

A group with no Time attribute -> not a timeslice; ignored without error.

12.1.2 Fatal error propagation

Fatal errors are local to the entity in which they occur, but propagate along all structural and referential dependencies. A fatal error invalidates the affected entity and all entities that depend on it, regardless of where they appear in the hierarchy or across timeslices.

The fatal-error propagation rule applies to references resolved via F5::Reference. If a referenced Skeleton in another Timeslice or Grid is invalid, any Representation that references it via F5::Reference is invalid.

Examples:

Invalid Skeleton → all Representations, Fields, and Fragments referencing that Skeleton are invalid, even across Grids or Timeslices
Invalid Representation → all Fields and Fragments under that Representation are invalid
Invalid Grid → only that Grid within its Timeslice is invalid, and all references to its Skeletons from other Grids or Timeslices are invalid
Invalid Timeslice → only that Timeslice is invalid, and all references to its Grids or Skeletons from other Timeslices are invalid
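
Non-normative sketch of fatal-error propagation as a breadth-first walk over dependency edges. The dependents mapping (entity to the entities that depend on it, structurally or via F5::Reference) and the function name are hypothetical; how a reader builds that mapping is an implementation choice.

```python
from collections import deque

def propagate_invalid(initially_invalid, dependents):
    """Return the set of all entities invalidated by the given fatal errors.

    dependents maps an entity to the entities that depend on it.
    """
    invalid = set(initially_invalid)
    queue = deque(initially_invalid)
    while queue:
        entity = queue.popleft()
        for dependent in dependents.get(entity, ()):
            if dependent not in invalid:
                invalid.add(dependent)
                queue.append(dependent)
    return invalid
```

Entities outside the returned set are unaffected and remain eligible for processing.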

12.1.3 Continuation rule

Readers MUST continue processing all unaffected entities. A fatal error MUST NOT abort processing of the entire file unless the root structure itself is malformed.

12.2 Warnings

Warnings SHOULD be issued for:

missing optional attributes
fallback to group names for grid identification
ignored malformed timeslices
default chart assumption
fragment attribute conflicts resolved by file load order

Warnings do not abort processing.

12.3 Diagnostics

Readers SHOULD record:

merge operations
attribute conflicts
overlapping fragments
time‑dependence intervals
fallback behaviors
missing or partial fields

Diagnostics are not part of the file format but are essential for tooling.

13. Deterministic Traversal Rules

Traversal order is defined for reproducibility but has no semantic meaning. Geometry and coordinates determine semantics, not traversal order.

Traversal order:

Timeslices by numeric Time
Grids by F5::GridID or group name
Skeletons by IndexDepth, SkeletonDimensionality, Refinement, then name (recommended for reproducible traversal only; traversal order has no semantic meaning).
Representations by name
Fields by datatype, name
Fragments by arbitrary order (names irrelevant)

13.1 Fragment traversal clarification

Because fragment names are irrelevant and geometry determines ordering:

traversal order of fragments MUST NOT affect results
algorithms MUST treat fragments as random‑access containers
coordinate‑based queries MUST be used for geometric operations

This is a core F5 principle:
geometry determines meaning; storage layout does not.

14. Reader Expectations

Readers (and tools implementing this specification) MUST adhere to the following behavioral requirements.

14.1 Timeslice handling

Readers MUST:

identify candidate timeslices by the presence of a Time attribute and validate them according to Section 12.1.1
merge timeslices with identical numeric Time values
order timeslices by numeric Time
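
A minimal sketch of these three steps follows. It assumes root-level groups have already been read into (name, attributes) pairs; `parse_time` and `merge_timeslices` are hypothetical helper names, and non-scalar values are modeled as Python lists, tuples, or dictionaries.

```python
def parse_time(value):
    """Return Time as a float, or None if the value is not a usable scalar."""
    if isinstance(value, (bool, list, tuple, dict)):
        return None
    try:
        return float(value)  # accepts integers, floats, and numeric text
    except (TypeError, ValueError):
        return None

def merge_timeslices(groups):
    """Group root-level groups by numeric Time and order them by Time."""
    merged = {}
    for name, attrs in groups:
        if "Time" not in attrs:
            continue  # missing Time: the group is not a timeslice
        t = parse_time(attrs["Time"])
        if t is None:
            continue  # malformed Time: ignored (a reader would warn here)
        merged.setdefault(t, []).append(name)
    return sorted(merged.items())

groups = [
    ("T=1", {"Time": "1.0"}),
    ("step0", {"Time": 0}),
    ("other", {"Time": 1}),
    ("notes", {}),
    ("bad", {"Time": [1.0, 2.0]}),
]
timeslices = merge_timeslices(groups)
```

The groups "T=1" and "other" end up in the same timeslice because their Time values are numerically identical, even though one is text and the other an integer. Group names play no role.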

14.2 Grid handling

Readers MUST:

identify grids by F5::GridID when present
fall back to group name when absent
merge grids across files deterministically
treat grid ordering as traversal‑only, not semantic

14.3 Skeleton handling

Readers MUST:

validate required skeleton attributes
treat missing optional attributes as warnings
derive skeleton size from fields
enforce consistent index‑space semantics

14.4 Representation handling

Readers MUST:

identify coordinate representations via chart names
identify relative representations via skeleton names
resolve local charts via GlobalChart attributes
assume default chart if none is defined (StandardCartesianChart3D)
treat chart transformations as optional but recommended

14.5 Field handling

Readers MUST:

identify fields by datatype
support contiguous, fragmented, separated compound, and procedural fields
treat Representations with absent Positions as partial per Section 7.1.2
treat fragment names as irrelevant
treat fragments as random‑access containers
use coordinates to determine geometric ordering
support partial fields and default values

14.6 Time‑dependence handling

Readers MUST:

detect time‑independence via symbolic links
treat dataset duplication as semantic change
support fragment‑level time‑dependence
track identity across timeslices
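
The distinction between identity and content can be illustrated without any HDF5 library. In the sketch below, Python object identity stands in for HDF5 object identity; the `Dataset` class and `is_time_independent` function are hypothetical, and a real reader would compare the identity of the underlying HDF5 objects instead.

```python
class Dataset:
    """Stand-in for an HDF5 dataset; Python identity models HDF5 identity."""
    def __init__(self, values):
        self.values = list(values)

def is_time_independent(timeslice_a, timeslice_b, field_name):
    # Identity, not content equality, decides time-independence.
    return timeslice_a[field_name] is timeslice_b[field_name]

positions = Dataset([0.0, 1.0, 2.0])

t0 = {"Positions": positions, "Density": Dataset([1.0, 1.0, 1.0])}
t1 = {
    "Positions": positions,                # linked: the same object
    "Density": Dataset([1.0, 1.0, 1.0]),   # duplicated: equal content, new object
}

positions_static = is_time_independent(t0, t1, "Positions")
density_static = is_time_independent(t0, t1, "Density")
```

Positions is time-independent because both timeslices resolve to the same object. Density counts as changed even though the stored values are equal, because duplication expresses semantic change.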

14.7 Fragment handling

Readers MUST:

ignore fragment names
use fragment offsets and coordinates for placement
merge fragment attributes using file load order
warn on inconsistent attributes
treat overlapping fragments as warnings

14.8 Geometry‑driven semantics

Readers MUST:

derive ordering from coordinates, not storage order
treat geometry as the authoritative source of meaning
ensure that algorithms produce identical results regardless of fragment order

This is a core F5 principle.

15. Summary of the Canonical F5 Model

This section summarizes the entire specification in a concise, normative list.

15.1 Structural model

Timeslices identified by scalar Time
Timeslices with same numeric Time merged
Grids identified by F5::GridID or group name
Skeletons define topology and index space
Representations define coordinate or relative meaning
Fields attach data to skeleton index space
Fragments store partial data

15.2 Geometry‑driven semantics

Geometry determines ordering
Fragment names are irrelevant
Storage layout has no semantic meaning
Readers MUST use coordinates for all geometric operations

15.3 Field types

contiguous
fragmented
separated compound
procedural (UniformSampling, DirectProduct, FragmentedUniformSampling)

15.4 Fragment semantics

identity is dataset identity
ordering is irrelevant
offsets and coordinates determine placement
partial fields allowed
default values for uncovered regions

15.5 Time‑dependence

symbolic links express time‑independence
dataset duplication expresses change
fragment‑level time‑dependence supported

15.6 Validation

required attributes enforced
optional attributes warn
fragment attribute conflicts: warning within a single file, fatal error across multiple files
deterministic merging required

15.7 Charts

charts define coordinate systems
local charts reference global charts
default “StandardCartesianChart3D” chart assumed if none defined

16. Integrated Clarifications (OQ‑1 to OQ‑5)

All open questions have been resolved and integrated into the specification.
For completeness, they are restated here.

OQ‑1: Fragment names and ordering

Fragment names have no semantic meaning.
Ordering of fragments is irrelevant.
Placement is determined solely by fragment offsets and coordinates.
Geometry determines ordering, not storage layout.
Deterministic traversal is unnecessary for semantics.

OQ‑2: Partial fields

Partiality is implicit from missing fragments.
Querying uncovered regions yields default values (zero or HDF5 fill value).
No explicit partial marker is required.
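
A sketch of a lookup that honors implicit partiality follows. The one-dimensional fragment layout and the `query` helper are assumptions for illustration; the default of zero stands in for either zero or the HDF5 fill value.

```python
def query(fragments, index, default=0.0):
    """Return the field value at `index`, or `default` if no fragment covers it."""
    for frag in fragments:
        start = frag["offset"]
        if start <= index < start + len(frag["data"]):
            return frag["data"][index - start]
    return default

# Two fragments covering indices 0-1 and 5; indices 2-4 are uncovered.
fragments = [
    {"offset": 0, "data": [10.0, 11.0]},
    {"offset": 5, "data": [15.0]},
]

covered = query(fragments, 1)
uncovered = query(fragments, 3)
```

No marker distinguishes the partial field from a complete one. The uncovered index simply yields the default value.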

OQ‑3: DirectProduct semantics

DirectProduct uses the Cartesian product of component arrays.
Used to construct rectilinear (non‑uniform regular) grids.
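
As an illustration, the Cartesian product of two non-uniformly spaced axes yields the vertex coordinates of a rectilinear grid. The `direct_product` helper is hypothetical, and the iteration order shown (last axis varying fastest) is a property of `itertools.product`, not a statement about F5 storage order.

```python
import itertools

def direct_product(*axes):
    """Return all coordinate tuples of the Cartesian product of the axes."""
    return list(itertools.product(*axes))

x = [0.0, 0.1, 0.4]   # non-uniform spacing along x
y = [0.0, 2.0]

points = direct_product(x, y)
```

Only 3 + 2 coordinate values are stored, yet they describe 3 × 2 = 6 grid points. This is the storage saving that makes the procedural form attractive for rectilinear grids.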

OQ‑4: Fragment attribute merging

File load order determines precedence.
Inconsistent fragment attributes within a single file produce warnings.
Inconsistent attributes across files are an error.

OQ‑5: Default chart

If no chart is defined, assume a default “StandardCartesianChart3D” chart.
This is mandatory for backward compatibility.
A warning is recommended but not required.
Explicit charts are strongly preferred.

17. Conceptual Note: “How is it?” vs. “What is it?”

The F5 model does not classify datasets by predefined geometric or topological types. Instead, it describes how a dataset is structured, not what it is.

Applications MUST NOT ask:

“Is this a triangular surface?”

but instead:

“Does this dataset have the properties of a triangular surface?”

This distinction is essential:

A triangular surface is also a point cloud.
It may be part of a refinement hierarchy.
It may coexist with line or cell structures on the same vertices.
It may represent only a subset of a larger domain.

F5 avoids implicit assumptions and predefined categories. The model exposes structure; applications infer meaning from that structure.

This design philosophy is central to the expressive power of F5.

18. Structural Principles for Hierarchical Refinement

This appendix describes the structural principles that allow hierarchical refinement to be expressed within the F5 model. It introduces no new semantics beyond those already defined in Sections 1–17. Instead, it clarifies how the existing concepts (Skeletons, Representations, Fields, IndexDepth, and Fragments) combine to express refinement structures in a fully general and topology‑preserving manner.

The purpose of this appendix is to provide a design guide. It does not prescribe any specific refinement scheme, naming convention, or data layout. All refinement structures MUST be derivable from the core F5 rules.

18.2 Fragments as Topological Entities

Fragments are subsets of a field’s index space.
Topologically, a fragment corresponds to a set of cells, and a cell corresponds to a set of vertices. Fragments are not topological entities themselves; they are represented as topological entities only when modeled via Skeletons with IndexDepth ≥ 2.

Thus, refinement tiles MAY be represented by Skeletons whose IndexDepth reflects their topological structure:

IndexDepth = 0 → vertices (points)
IndexDepth = 1 → cells (sets of vertices)
IndexDepth = 2 → sets of cells (fragments)
IndexDepth = 3 → sets of fragments, and so on

This allows refinement tiles to be treated as first‑class topological entities.
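
The nesting can be made concrete with index lists. The sketch below is an illustration under assumed data: two triangles sharing an edge, grouped into one tile. The `resolve` helper and the variable names are hypothetical.

```python
# IndexDepth 0: vertices (points)
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# IndexDepth 1: cells, each a set of vertex indices
cells = [(0, 1, 2), (0, 2, 3)]
# IndexDepth 2: tiles, each a set of cell indices
tiles = [(0, 1)]

levels = {1: cells, 2: tiles}

def resolve(index, depth, levels):
    """Expand an element at `depth` into the set of vertex indices it covers."""
    if depth == 0:
        return {index}
    covered = set()
    for child in levels[depth][index]:
        covered |= resolve(child, depth - 1, levels)
    return covered

tile_vertices = resolve(0, 2, levels)
cell_vertices = resolve(1, 1, levels)
```

Each additional IndexDepth level adds one more indirection step between an element and the vertices it ultimately refers to.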

18.4 Refinement Relations as Relative Representations

A refinement relation between two Skeletons is expressed as a relative representation.

A subgroup of a Skeleton is a refinement Representation if its name matches the name of another Skeleton within the same Grid, or of a Skeleton reached through the cross-grid or cross-timeslice references defined in Section 6.5.

Thus, refinement relations are expressed structurally as:

Skeleton_L / Skeleton_{L+1}

This representation has the same index space as Skeleton_L.
Its Fields describe how each element of Skeleton_L relates to elements of Skeleton_{L+1}.

No new keywords or attributes are required.
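
The structural rule can be sketched with nested dictionaries standing in for HDF5 groups. The Skeleton names "Level0" and "Level1", the chart name "Cartesian", and the field name "Children" are hypothetical; only the name-matching rule is taken from the text above, and cross-grid references are not modeled.

```python
grid = {
    "Level0": {                                  # Skeleton_L with 2 coarse cells
        "Level1": {                              # named after another Skeleton
            "Children": [[0, 1], [2, 3]],        # one entry per coarse cell
        },
        "Cartesian": {"Positions": [0.0, 1.0]},  # named after a chart
    },
    "Level1": {                                  # Skeleton_{L+1} with 4 fine cells
        "Cartesian": {"Positions": [0.0, 0.5, 1.0, 1.5]},
    },
}

def relative_representations(grid, skeleton_name):
    """Subgroups whose name matches a Skeleton name in the same Grid."""
    return [name for name in grid[skeleton_name] if name in grid]

relations = relative_representations(grid, "Level0")
children = grid["Level0"]["Level1"]["Children"]
```

The "Children" field has one entry per element of the coarse Skeleton, which reflects that the Representation shares the index space of Skeleton_L.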

A physically important example is AMR (Adaptive Mesh Refinement) governed by the Courant-Friedrichs-Lewy (CFL) condition. The CFL condition constrains the time step at each refinement level to be proportional to the cell size at that level; finer levels therefore advance with smaller time steps than coarser ones. As a result, the refinement Representation on the coarse level is partially time-dependent: it references a fine-level Skeleton that exists at a different (earlier) timeslice than the coarse level’s current timeslice. Such inter-level refinement relations MUST use F5::Reference (see 6.5) pointing to the fine-level Skeleton at the appropriate timeslice. The refinement Representation itself thus changes only as often as the coarse level advances, not at every fine sub-step, which is a natural expression of partial time-dependence within the F5 model.
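
The arithmetic behind this can be sketched as follows, assuming the common form dt = C · dx / v of the CFL bound and a fixed refinement ratio between levels. All parameter names and values are hypothetical.

```python
def level_time_steps(dx0, ratio, levels, courant, speed):
    """Time step per refinement level, with dt proportional to the cell size."""
    return [courant * (dx0 / ratio**level) / speed for level in range(levels)]

# Coarsest cell size 1.0, refinement ratio 2, three levels.
dts = level_time_steps(dx0=1.0, ratio=2, levels=3, courant=0.5, speed=1.0)

# Number of sub-steps each level takes per coarse step.
substeps = [round(dts[0] / dt) for dt in dts]
```

With a refinement ratio of 2 the second level takes two sub-steps and the third takes four for every coarse step, so the fine levels produce timeslices that have no coarse-level counterpart.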

18.8 Optional Coordinate Embeddings

Skeletons participating in refinement hierarchies MAY optionally define coordinate representations. Coordinate representations MUST refer to a chart; relative representations MUST refer to another Skeleton.

These coordinate representations:

use the same chart mechanism as all other coordinate representations,
MAY define Positions fields giving geometric embeddings of refinement elements.

Positions fields MAY be omitted when their information is derivable from other Representations, indirection mappings, refinement relationships, or chart transformations, consistent with the general Positions rule in Section 7.1.

The F5 model does not require explicit coordinate embeddings for refinement structures.

18.9 No Special Keywords or Reserved Names

The F5 model does not introduce any special keywords, attribute names, or reserved identifiers for refinement.

All refinement structures MUST be expressible using:

Skeletons
Representations
Fields
IndexDepth
Fragments

No additional constructs are permitted.

18.10 Derivability from Core Principles

All refinement structures MUST be derivable from the following core principles:

Skeletons define topological entities.
Representations define relations between Skeletons.
Fields attach data to Skeleton index spaces.
Fragments partition field index spaces.
IndexDepth expresses nested topological structure.
No semantics are encoded in names.
All semantics arise from structure.

These principles are sufficient to express:

hierarchical point clouds,
hierarchical meshes,
AMR refinement,
multi‑resolution grids,
and mixed‑dimensional refinement structures.

No additional rules are required. These principles are sufficient for an AI or human reader to derive refinement structures without explicit examples.

18.11 IndexDepth Design Guidelines for Scalable Topological Modeling

This section provides structural guidelines for choosing IndexDepth values in a way that ensures long‑term stability, extensibility, and composability of Skeletons. These guidelines do not introduce new semantics; they follow directly from the core principles defined in Sections 1–17.

The purpose of these guidelines is to ensure that Skeletons remain structurally stable even when intermediate topological levels are added or removed, and that tools can reliably interpret Skeletons based solely on their dimensionality and index‑depth signatures.

These guidelines are consistent with the examples in Section 5.6.

18.11.1 Maximal‑Depth Principle

If a topological entity can participate in multiple nested levels of structure, its Skeleton SHOULD be assigned an IndexDepth equal to the maximum depth of nesting it may ever require, even if some intermediate levels are not currently present.

This ensures that:

the Skeleton’s index space remains stable over time,
intermediate Skeletons (e.g., Edges between Points and Lines) can be added or removed without restructuring,
higher‑order or refined variants of the entity can be introduced without altering existing Skeletons,
tools can reliably identify Skeletons by their (SkeletonDimensionality, IndexDepth) pair.

This principle follows from the definition of IndexDepth as the number of nested index spaces between a Skeleton and its vertices.

18.11.2 Optional Intermediate Skeletons

Intermediate topological Skeletons (e.g., Edges between Points and Lines) are optional.
Their presence or absence MUST NOT require restructuring of Skeletons at higher index depths.

A Skeleton defined with a maximal IndexDepth MAY coexist with or without intermediate Skeletons. If intermediate Skeletons are introduced later, they simply occupy the appropriate index‑depth level without affecting existing structures.

18.11.3 Stability of Index Spaces

Skeletons SHOULD be designed so that their index spaces do not change when:

intermediate Skeletons are added or removed,
refinement structures are introduced,
higher‑order elements are added,
additional Representations are defined.

Assigning a maximal IndexDepth ensures that the index space of a Skeleton remains stable and predictable.

18.11.4 Uniformity Across Datasets

For a given topological entity type (e.g., lines, surfaces, volumes), all datasets SHOULD use the same (SkeletonDimensionality, IndexDepth) pair, regardless of whether intermediate levels are present.

This uniformity enables:

consistent interpretation across datasets,
predictable traversal logic,
deterministic code generation,
compatibility with refinement structures.

18.11.6 Derivability From Core Principles

The Maximal‑Depth Principle is not an additional rule; it follows directly from:

Skeletons represent topological entities.
IndexDepth expresses nested topological structure.
Representations express relations between Skeletons.
Skeleton identity MUST be stable across Representations.
No semantics are encoded in names.
All semantics arise from structure.

These principles imply that Skeletons SHOULD be assigned index depths that remain valid under all future structural extensions.

End of Section 18.11

End of Section 18

19. Mathematical Domains Required for Fitting a Dataset into the F5 Layout

This section provides the mathematical context that underlies the F5 model. These concepts are not additional requirements of the format; they explain the structures defined in Sections 1–18.

The domains are listed in order of conceptual priority.

19.1 Differential Geometry

Differential geometry provides the foundational language for:

coordinate charts
tangent and cotangent spaces
vector and covector fields
tensor fields
coordinate transformations
geometric embeddings
metric‑dependent quantities
geometric invariants

In the F5 model:

coordinate representations correspond to charts on a manifold,
Positions fields provide embeddings of skeleton elements into a chart,
tensor‑valued fields MUST be stored in coordinate representations because they transform under chart changes,
named datatypes encode tensorial transformation rules.

Differential geometry is therefore essential for:

interpreting coordinate‑dependent fields,
understanding how fields transform between charts,
ensuring that geometric quantities are stored in the correct representation.

19.2 Topology

Topology provides the structural foundation for:

vertices, edges, faces, and cells
nested index spaces (IndexDepth)
connectivity relations
refinement relations
cell complexes
adjacency and incidence
partial coverings and fragment partitions

In the F5 model:

Skeletons represent topological entities,
IndexDepth expresses nested topological structure,
relative representations express incidence relations between skeletons,
refinement structures are topological relations between skeletons at different levels.

Topology determines:

the structure of Skeletons,
the meaning of Representations,
the interpretation of Fields as functions on index spaces.

19.3 Fiber‑Bundle Theory

Fiber‑bundle theory provides the unifying mathematical framework for the F5 model.

A fiber bundle consists of:

a base space (the index space of a Skeleton),
a fiber (the datatype of a Field),
a projection (the attachment of field values to indices),
and optional connections (e.g., chart transformations).

In the F5 model:

every Field is a section of a fiber bundle,
the fiber is defined by the Field’s named datatype,
the base is the Skeleton’s index space,
coordinate representations correspond to local trivializations of the bundle,
chart transformations correspond to transition functions.
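
In standard textbook notation, which is not part of the F5 specification itself, this correspondence can be written as follows. The symbols B (base), F (fiber), E (total space), U_α (chart domains), and g_αβ (transition functions) are introduced here for illustration only.

```latex
% Bundle projection and a Field as a section of the bundle
\pi : E \to B, \qquad s : B \to E, \qquad \pi \circ s = \mathrm{id}_B

% Local trivialization over a chart domain U_\alpha (coordinate representation)
\varphi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \times F

% Components in two overlapping charts are related by a transition function
s_\alpha(x) = g_{\alpha\beta}(x)\, s_\beta(x), \qquad x \in U_\alpha \cap U_\beta
```

Read this way, s is the Field, B is the Skeleton's index space, F is the named datatype, and g_αβ is the chart transformation between two coordinate representations.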

This perspective explains:

why Fields attach to Skeletons,
why tensor fields MUST live in coordinate representations,
why named datatypes MUST encode transformation rules,
why geometry and topology are strictly separated.

Fiber‑bundle theory is the conceptual backbone of the entire F5 design.

19.4 Geometric Algebra and Tensor Algebra

Geometric algebra (or classical tensor algebra) provides the mathematical language for typing fields.

Required concepts include:

tangent vectors
cotangent vectors
multivectors
differential forms
metric‑dependent and metric‑independent quantities
transformation rules under chart changes
basis representations
tensor rank and variance

In the F5 model:

named datatypes encode the algebraic type of a field,
coordinate representations provide the basis in which components are stored,
chart transformations define how components transform.
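
The difference between vector and covector components under a chart change can be shown with a small worked example. The sketch assumes a linear chart change with Jacobian J, under which vector components transform with J and covector components with the inverse of J. The helper functions and numeric values are hypothetical.

```python
def matvec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def vecmat(w, m):
    """Row vector times matrix."""
    return [sum(w[i] * m[i][j] for i in range(len(w))) for j in range(len(m[0]))]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def contract(w, v):
    """Pairing of a covector with a vector."""
    return sum(wi * vi for wi, vi in zip(w, v))

J = [[2.0, 0.0], [0.0, 4.0]]   # Jacobian of the chart change x' = 2x, y' = 4y
v = [1.0, 1.0]                 # tangent vector components in the old chart
w = [3.0, 5.0]                 # covector components in the old chart

v_new = matvec(J, v)                # vector components scale with J
w_new = vecmat(w, inverse_2x2(J))   # covector components scale with J inverse

before = contract(w, v)
after = contract(w_new, v_new)
```

The pairing of covector and vector is the same in both charts, which is only the case if each is transformed by its own rule. This is why the algebraic type of a field has to be recorded alongside its components.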

This domain is essential for:

distinguishing vectors from covectors,
distinguishing tensors of different rank,
ensuring correct transformation behavior,
interpreting physical quantities correctly.

19.5 Geometry

Geometry provides:

embeddings into ℝⁿ
metric interpretation
geometric queries
geometric ordering
spatial reasoning

In the F5 model:

geometry is introduced only through coordinate representations,
Positions fields define embeddings,
geometry determines ordering and spatial queries,
storage layout has no geometric meaning.

19.6 Index‑Space Theory

Index‑space theory covers:

discrete index sets
nested index spaces (IndexDepth)
mappings between index spaces
partial coverage
fragment offsets
procedural fields

This domain is essential for:

interpreting Skeletons,
understanding relative representations,
handling fragmented fields,
interpreting refinement structures.

19.7 Set‑Theoretic Semantics

The F5 model is fundamentally set‑theoretic:

Skeletons define sets,
Representations define functions between sets,
Fields define functions from index sets to values,
Fragments define partitions of sets.

This domain is required for:

understanding identity,
interpreting refinement mappings,
handling partial fields,
merging datasets.

19.8 Identity Theory

Identity in F5 is defined by HDF5 object identity, not content.

This domain covers:

object identity vs equality
symbolic links
identity propagation
time‑dependence semantics

Identity theory is essential for:

time‑dependent fields,
fragment‑level updates,
merging across files.

19.9 Optional Domains

Depending on the dataset, additional mathematical domains may be relevant:

measure theory (densities, integrals)
graph theory (connectivity queries)
algebraic topology (homology, cohomology)
numerical analysis (interpolation, discretization)

These domains are not required by the F5 model but MAY be used by applications.

19.10 Derivability

All mathematical structures required to fit a dataset into the F5 layout are derivable from:

differential geometry
topology
fiber‑bundle theory
geometric/tensor algebra
index‑space theory
identity theory

No additional mathematical assumptions are required.