0. Introduction and Scope
This document defines the canonical F5 layout model for reading, validating, traversing, and interpreting F5 HDF5 files. It is the authoritative specification and supersedes all heuristics, naming conventions, or legacy assumptions. All semantics are explicit and identity‑based.
The F5 model is:
- deterministic
- backend‑agnostic
- semantically explicit
- time‑aware
- fragment‑aware
- chart‑aware
- index‑space‑aware
- geometry‑driven rather than storage‑driven
This specification is intended for:
- code generators
- validators
- readers and writers
- tooling authors
- researchers
- anyone implementing F5 semantics
The canonical hierarchy is:
Timeslice
→ Grid
→ Skeleton
→ Representation
→ Field
→ Fragment datasets
Each level has strict semantics and cannot be reordered or extended with additional hierarchy layers.
1. Canonical Hierarchy
The F5 hierarchy is structurally fixed:
Timeslice
└── Grid
    └── Skeleton
        └── Representation
            └── Field
                └── Fragment datasets
Each level has a specific role:
- Timeslice: temporal grouping, identified by a scalar Time attribute
- Grid: collection of skeletons representing one physical domain
- Skeleton: topological structure (vertices, edges, faces, cells, etc.)
- Representation: coordinate or relative representation of a skeleton
- Field: data defined over the skeleton’s index space
- Fragment: contiguous or structured subset of a field
Additional hierarchy levels MUST NOT be inserted.
Group names do not define semantics unless explicitly stated.
2. Timeslices
2.1 Definition
A root‑level group is a timeslice if and only if it has a scalar attribute named:
Time
The attribute MUST be convertible to a double‑precision real.
Group names are irrelevant.
2.2 Acceptable Time values
The Time attribute MAY be stored as:
- integer
- floating‑point
- text that parses unambiguously to a floating‑point value
It MUST be scalar.
Non‑scalar values (arrays, compound types) are malformed.
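The conversion rule above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name parse_time is hypothetical, and the handling of booleans and NaN is an assumption that this specification does not state. Python's float() is also more permissive than "parses unambiguously" may intend (it accepts "inf" and digit-group underscores), so a stricter reader may want its own grammar.

```python
import math
from numbers import Real

def parse_time(value):
    """Return the Time value as a float, or None if it is malformed."""
    # Assumption: booleans are rejected even though Python treats them as numbers.
    if isinstance(value, bool):
        return None
    if isinstance(value, bytes):
        value = value.decode("utf-8", "replace")
    if isinstance(value, str):
        value = value.strip()
    elif not isinstance(value, Real):
        # Lists, tuples, arrays, and compound values are non-scalar.
        return None
    try:
        result = float(value)
    except (ValueError, OverflowError):
        return None
    # Assumption: NaN cannot be ordered, so it is treated as malformed.
    return None if math.isnan(result) else result
```

A reader would call this on the raw attribute value and treat a None result as a malformed candidate timeslice.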
2.3 Behavior on malformed or missing Time
- Missing Time → group is not a timeslice
- Presence of a Time attribute makes the group a candidate timeslice; candidate timeslices are validated per 12.1.1.
- File processing MUST NOT abort solely due to malformed timeslice attributes; see 12.1.2 for propagation rules.
2.4 Ordering
Valid timeslices are ordered by ascending numeric Time.
Tie‑breaking (if needed): lexicographic group name.
2.5 Future generalization
The model MAY later support multidimensional parameter slices.
This is not implemented and MUST NOT be assumed.
3. Timeslice Merging
If multiple root groups convert to the same numeric Time, they represent the same physical timeslice and MUST be merged.
3.1 Merge semantics
Identification
Groups with identical converted Time values form a merge set.
Logical identity
The merged timeslice uses:
- the numeric Time value
- a canonical name = lexicographically smallest group name in the merge set
Children union
Child groups are merged recursively:
- unique names → included directly
- duplicate names → recursively merged
Attribute resolution
If an attribute appears in multiple merged groups:
- if semantically equal → keep
- if different → warning; choose value from lexicographically smallest group
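The attribute resolution rule can be sketched as below. The function name resolve_attribute and its input shape are hypothetical, and "semantically equal" is approximated with Python's == operator, which is an assumption.

```python
def resolve_attribute(values_by_group):
    """Resolve one attribute that appears in several merged groups.

    values_by_group maps group name -> attribute value.
    Returns (chosen_value, warning_or_None).
    """
    names = sorted(values_by_group)
    # The lexicographically smallest group name supplies the value.
    chosen = values_by_group[names[0]]
    # Assumption: semantic equality is approximated by ==.
    conflicting = [n for n in names[1:] if values_by_group[n] != chosen]
    if conflicting:
        warning = (f"attribute differs in {conflicting}; "
                   f"using value from '{names[0]}'")
        return chosen, warning
    return chosen, None
```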
Dataset merging
If datasets share the same name:
- if dataspaces or semantic types differ → fatal error
- otherwise → treat as the same logical dataset
Diagnostics
Record:
- merged group names
- attribute conflicts
- fatal inconsistencies
3.2 Ordering after merge
Merged logical timeslices are ordered by ascending numeric Time.
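Forming merge sets and ordering the result can be sketched as follows. The function name build_merge_sets and the (name, time) input pairs are hypothetical; the input is assumed to contain only timeslices whose Time value has already been validated.

```python
from collections import defaultdict

def build_merge_sets(groups):
    """Group timeslices by numeric Time.

    groups: iterable of (group_name, numeric_time) pairs.
    Returns a list of (time, canonical_name, member_names), ordered by time.
    """
    by_time = defaultdict(list)
    for name, time in groups:
        by_time[time].append(name)
    merged = []
    for time in sorted(by_time):
        members = sorted(by_time[time])
        # Canonical name: lexicographically smallest name in the merge set.
        merged.append((time, members[0], members))
    return merged
```

Note that 1 and 1.0 fall into the same merge set here because they compare equal after conversion.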
4. Grids
A Grid groups skeletons that belong to the same physical domain within a timeslice.
4.1 Identification
A Grid MAY define the attribute:
F5::GridID
If present, this is the canonical identifier for the grid.
If absent, the grid’s group name is used as a fallback identifier, and a warning is recommended.
Grids with the same identifier across timeslices represent the same physical grid evolving in time.
4.2 Structure
A Grid MUST contain at least one Skeleton subgroup.
A Grid MAY also contain a Charts subgroup defining local charts.
4.3 Merging grids across files
When loading multiple files:
- Grids with the same identifier at the same timeslice MUST be merged.
- Child skeletons, representations, fields, and datasets are merged by name.
- Identical dataset paths refer to the same logical dataset; if the datasets differ in dataspace or type, this is a fatal error.
- Fragment datasets without offsets are concatenated in file load order.
- Overlapping fragments produce a warning; last file wins.
4.4 Deterministic traversal
Traversal order of grids is:
- by F5::GridID if present
- otherwise lexicographic by group name
This ordering is for traversal only; it has no semantic meaning.
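Grid identification with fallback, and the resulting traversal order, might look like the sketch below. The function names are hypothetical. Two points are assumptions: the identifier is compared as a string, and grids with and without F5::GridID are sorted together by their effective identifier.

```python
def grid_identifier(group_name, attrs):
    """Return (identifier, warning_or_None) for one Grid group."""
    grid_id = attrs.get("F5::GridID")
    if grid_id is not None:
        # Assumption: identifiers are compared as strings.
        return str(grid_id), None
    warning = f"grid '{group_name}' has no F5::GridID; using group name"
    return group_name, warning

def grid_traversal_order(grids):
    """grids: dict of group name -> attribute dict. Returns ordered group names."""
    return sorted(grids, key=lambda name: grid_identifier(name, grids[name])[0])
```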
5. Skeletons
A Skeleton describes a topological structure such as vertices, edges, faces, or cells.
5.1 Required attributes
A Skeleton MUST define:
F5::SkeletonDimensionality (int) mandatory
F5::rank (int) recommended
IndexDepth (int) mandatory
Missing mandatory attributes are a fatal error. Name‑based inference is permitted only when importing non‑F5 or legacy formats that rely on conventional naming (e.g., “Points”, “Vertices”, “Triangles”). This mechanism MUST NOT be applied to native F5 files.
Writers of F5 files MUST NOT rely on naming conventions to convey semantics. Although names carry no semantics in the F5 model, writers of native F5 files are strongly encouraged to choose human‑interpretable names for Skeletons and Representations. This improves readability and usability without introducing semantic meaning.
If F5::rank is absent, readers MUST infer rank from F5::SkeletonDimensionality. Missing rank is a warning, not an error.
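A validation pass over these attributes could be sketched as follows. The function name check_skeleton_attributes is hypothetical. How rank is inferred is not spelled out here; the sketch assumes the inferred rank simply equals F5::SkeletonDimensionality.

```python
MANDATORY = ("F5::SkeletonDimensionality", "IndexDepth")

def check_skeleton_attributes(attrs):
    """Return (errors, warnings, rank) for one Skeleton's attribute dict."""
    errors = [f"missing mandatory attribute {name}"
              for name in MANDATORY if name not in attrs]
    warnings = []
    rank = attrs.get("F5::rank")
    if rank is None:
        warnings.append("F5::rank is absent; inferring it from "
                        "F5::SkeletonDimensionality")
        # Assumption: the inferred rank equals the dimensionality.
        rank = attrs.get("F5::SkeletonDimensionality")
    return errors, warnings, rank
```

A non-empty errors list corresponds to a fatal error for that Skeleton; warnings do not affect validity.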
5.2 Optional attributes
FiberLib::FragmentLayout (int vector)
FiberLib::NumberOfFragments (int)
Refinement (int vector)
Refinement (int vector) describes per‑dimension refinement factors. Its length MUST equal F5::SkeletonDimensionality. Each element specifies the refinement factor along that dimension. The Skeleton’s refinement level is defined as max(Refinement). If the Refinement attribute is absent, the Skeleton’s refinement level is 0.
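The refinement level rule can be sketched as below. The function name refinement_level is hypothetical, and raising ValueError on a length mismatch is one possible way for a reader to surface the violation.

```python
def refinement_level(attrs):
    """Return the Skeleton's refinement level from its attribute dict."""
    refinement = attrs.get("Refinement")
    if refinement is None:
        # Absent Refinement means refinement level 0.
        return 0
    dimensionality = attrs["F5::SkeletonDimensionality"]
    if len(refinement) != dimensionality:
        raise ValueError(
            f"Refinement has {len(refinement)} entries, "
            f"expected {dimensionality}")
    # default=0 covers the degenerate zero-dimensional case.
    return max(refinement, default=0)
```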
5.3 Semantics
- IndexDepth: number of indexing traversals required to reach vertices.
- SkeletonDimensionality: geometric or embedding dimension.
- rank: a structural descriptor of the Skeleton’s dimensionality and refinement behavior. It is redundant with SkeletonDimensionality but useful for consistency checking and for readers ingesting non‑HDF5 sources. rank is an advisory attribute for writers and readers; it carries no semantic meaning for interpretation. Readers MAY use rank for traversal or presentation, but MUST NOT rely on it for semantic interpretation.
5.4 Structural rules
- Skeletons are not nested.
- A Grid MAY contain multiple Skeletons.
- Missing optional attributes produce warnings, not errors.
- Skeletons define the index space for all fields under their representations.
5.5 Deterministic ordering
Skeleton ordering has no semantic meaning and is not required for any F5 operation. Readers MAY traverse Skeletons in any order. For reproducibility, readers that choose a deterministic traversal SHOULD use the following keys in order: IndexDepth, SkeletonDimensionality, Refinement, lexicographic group name.
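The recommended keys translate directly into a sort key. In this sketch the function name skeleton_sort_key is hypothetical; treating an absent Refinement as an empty vector and comparing Refinement vectors element by element are both assumptions.

```python
def skeleton_sort_key(name, attrs):
    """Sort key: IndexDepth, SkeletonDimensionality, Refinement, group name."""
    return (
        attrs["IndexDepth"],
        attrs["F5::SkeletonDimensionality"],
        # Assumption: absent Refinement sorts as an empty vector.
        tuple(attrs.get("Refinement", ())),
        name,
    )

def order_skeletons(skeletons):
    """skeletons: dict of group name -> attribute dict. Returns ordered names."""
    return sorted(skeletons, key=lambda n: skeleton_sort_key(n, skeletons[n]))
```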
5.6 Examples
Vertices: IndexDepth = 0
Edges: IndexDepth = 1, SkeletonDimensionality = 1
Faces: IndexDepth = 1, SkeletonDimensionality = 2
Cells: IndexDepth = 1, SkeletonDimensionality = 3
Higher nesting: IndexDepth = 2 or more
6. Representations and Charts
A Representation is a subgroup of a Skeleton that defines how the skeleton’s elements are interpreted geometrically or relationally.
A subgroup of a Skeleton is a Representation if its name matches either:
- a Chart name (coordinate representation), or
- another Skeleton name (relative representation).
If the relative representation refers to a Skeleton in another Grid or Timeslice, the rules in 6.5 apply, i.e. its group name is overridden by the F5::Reference attribute.
If a subgroup name matches both a Chart name and a Skeleton name, the Chart match takes precedence: readers MUST treat the subgroup as a coordinate representation and issue a warning.
Naming a Chart identically to a Skeleton (or vice versa) is a modeling error and MUST be prevented by writers. If such a conflict appears in an HDF5 file, the file is malformed.
Representations MAY reference Skeletons that appear later in the file or in other merged files. Resolution of relative representations occurs after all Skeletons in the Grid have been loaded. If a referenced Skeleton cannot be resolved at that stage, this is a fatal error.
Subgroups of a Skeleton that match neither a Chart name nor a Skeleton name are ignored with a warning, unless they contain an F5::Reference attribute (see above), in which case they are treated as explicit references.
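The classification rules of this section can be summarized in a small decision function. This is a sketch under stated assumptions: the function name classify_representation and the string labels it returns are hypothetical, and the sets of known Chart and Skeleton names are assumed to have been collected beforehand.

```python
def classify_representation(name, attrs, chart_names, skeleton_names):
    """Return (kind, warning_or_None) for a subgroup of a Skeleton.

    kind is "coordinate", "relative", or None when the subgroup is ignored.
    """
    # An explicit reference overrides name matching.
    if "F5::Reference" in attrs:
        return "relative", None
    if name in chart_names:
        warning = None
        if name in skeleton_names:
            warning = (f"'{name}' matches both a Chart and a Skeleton; "
                       "Chart takes precedence")
        return "coordinate", warning
    if name in skeleton_names:
        return "relative", None
    return None, f"subgroup '{name}' matches no Chart or Skeleton; ignored"
```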
HDF5 is a random-access hierarchical store. Readers MUST NOT assume any ordering of groups or datasets. Skeletons MUST be resolved independently of traversal order (see 6.6).
Recovery mode is a non-normative, reader-controlled error-tolerance mode in which readers MAY attempt best-effort recovery from modeling errors (for example, chart/skeleton name conflicts). Recovery mode is not part of the F5 normative model; readers MUST document when they operate in recovery mode and record diagnostics.
6.1 Coordinate Representations
A coordinate representation maps skeleton elements to coordinates in a Chart.
6.1.1 Charts
Charts live under:
/Charts/
Charts define:
- coordinate systems
- named datatypes
- precision variants
- optional default datatype
6.1.2 Local Charts
Each Grid MAY define local charts in its Charts subgroup (see 4.2).
Local charts MUST contain a GlobalChart attribute pointing to the corresponding chart under /Charts/.
6.1.3 Transformations
Transformations between charts are represented as subgroups.
Both directions MUST be present for a complete transformation pair.
6.2 Relative Representations
A relative representation maps skeleton elements to indices of another skeleton.
Example:
Faces/Points
Meaning: faces are defined by indices into points.
Relative representations define connectivity but do not define geometry.
6.3 Units and axes
Units and axis order come from named datatypes, not from representations.
Representations do not define units.
6.4 Default Chart (clarified)
If a Grid does not define any Chart:
- A default “StandardCartesianChart3D” chart MUST be assumed.
- This behavior is mandatory for backward compatibility.
- A warning is recommended but not required.
- Explicit charts are strongly preferred.
6.5 Cross-Grid / Cross-Timeslice Skeleton references
F5::Reference (optional attribute on a Representation group): an HDF5 object reference or path that points to the target Skeleton group (which MAY be in another Grid or Timeslice). When present, the Representation is a relative representation to the referenced Skeleton regardless of the Representation’s name.
Cross-timeslice or cross-grid references MUST use F5::Reference. When F5::Reference is present, name-matching is not used for Representation type determination. A writer SHOULD use a name that hints at the destination Skeleton, e.g. “Points_T=0” for a reference to /T=0/
6.5.1 Representation References
F5::Reference is either an HDF5 object reference or a string attribute. A string attribute containing a full path is recommended because it allows easier name resolution for readers. An HDF5 object reference (H5Rcreate_object(), H5R_OBJECT) MAY be used to better utilize the underlying HDF5 capabilities, but requires readers to identify Skeletons via a lookup table from HDF5 IDs rather than by HDF5 path. A path in the string attribute MUST always be absolute. HDF5 does not support the notion of a parent group such as “..” in a filesystem, so relative links cannot be modeled here (Skeleton groups are always parents of Representation groups).
6.6 Skeleton discovery and resolution strategy
6.6.1 Local discovery requirement
Readers MUST collect and index all Skeletons that are children of the same Grid before resolving Representations that reference Skeletons within that Grid.
6.6.2 Cross-Grid and cross-Timeslice references
Representations MAY reference Skeletons in other Grids or Timeslices via explicit references (see F5::Reference). When a Representation references a Skeleton outside its local Grid or Timeslice, readers MUST resolve that reference before treating the Representation as valid.
6.6.3 On-demand global discovery
Readers MAY discover Skeletons across Grids and Timeslices lazily (on demand) rather than scanning the entire file(s) up front. On-demand discovery is the recommended default for large datasets because it avoids unnecessary I/O and memory use.
6.6.4 Optional eager discovery mode
Readers MAY offer an optional eager (pre-scan) mode that enumerates all Skeletons across all Grids and Timeslices before resolving Representations. This mode is permitted but not required; it is intended for tools that prioritize global analysis or human-readable diagnostics over minimal I/O.
6.6.5 Hybrid strategy
Readers MAY implement a hybrid strategy: perform a light metadata pass to collect skeleton identifiers and cheap attributes, then perform full discovery only for Skeletons that are actually referenced or requested by the application.
6.6.6 Implementation guidance
6.6.6.1 Dependency graph
Readers that resolve cross-references SHOULD build a dependency graph of Skeletons and Representations. Use the graph to:
- determine resolution order,
- detect missing targets, and
- identify cycles in derivation or reference chains.
6.6.6.2 Cycle detection
Readers MUST detect cycles in cross-Skeleton or cross-Representation references. On cycle detection, readers MUST abort the resolution chain for the cycle, mark the involved entities as unresolved/partial, and emit a diagnostic describing the cycle.
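One common way to detect such cycles is a depth-first search with three node states. The sketch below is illustrative only: the function name find_cycle and the dict-based graph are hypothetical, and a recursive search like this may need an iterative rewrite for very deep reference chains.

```python
def find_cycle(graph):
    """Find one cycle in a dependency graph.

    graph: dict of node -> iterable of nodes it depends on.
    Returns a list of nodes forming a cycle (first node repeated at the end),
    or None if the graph is acyclic.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is on the current path: the cycle starts there.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

The returned node list can be used directly in the diagnostic and to mark the involved entities as unresolved or partial.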
6.6.6.3 Caching and memory management
Readers that perform on-demand discovery SHOULD cache resolved Skeleton metadata and index structures to avoid repeated I/O. Caches SHOULD be bounded and evictable to limit memory usage for very large files.
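A bounded, evictable cache can be as simple as a least-recently-used map. This is a minimal sketch: the class name SkeletonCache and the loader callback are hypothetical, and least-recently-used eviction is one possible policy among several.

```python
from collections import OrderedDict

class SkeletonCache:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, max_entries, loader):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._loader = loader  # called with a key on a cache miss
        self._entries = OrderedDict()

    def get(self, key):
        if key in self._entries:
            # Mark as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        value = self._loader(key)
        self._entries[key] = value
        if len(self._entries) > self._max_entries:
            # Drop the least recently used entry.
            self._entries.popitem(last=False)
        return value
```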
6.6.6.4 Diagnostics and fallbacks
If a referenced Skeleton cannot be resolved (missing, invalid, or in an inaccessible file), readers MUST:
- treat the referencing Representation as invalid or partial according to the fatal-error propagation rules in 12.1.2, and
- emit a clear diagnostic indicating the unresolved reference and its origin.
Readers MAY provide configurable fallback behavior (for example, best-effort local derivation) but such fallbacks are implementation choices and MUST be documented by the reader.
6.6.6.5 Non-Normative Performance considerations
- Prefer on-demand discovery for large datasets and interactive use.
- Use eager discovery for batch analysis or when the application explicitly requests a global view.
- When resolving references across files, prefer metadata-only operations first (existence checks, attributes) before loading large datasets.
6.6.6.6 Normative summary
Readers MUST collect local Grid Skeletons before resolving local Representations. Readers MUST resolve explicit cross-Grid or cross-Timeslice references before accepting a Representation as valid. Readers MAY perform global Skeleton discovery eagerly, but SHOULD default to on-demand discovery for efficiency. Readers MUST detect and handle cycles, cache metadata prudently, and emit diagnostics for unresolved references.
7. Fields
A Field is a child of a Representation and attaches data to the skeleton’s index space.
Fields describe what is stored, not how it is geometrically interpreted; geometry comes from the Positions field.
7.1 Identification
Fields are semantically identified by their datatype. If multiple fields in the same Representation share the same datatype, their field names MAY be used to disambiguate. However, when the datatype alone is insufficient or unavailable to determine algebraic or geometric interpretation, TypeInfo is required (see 8.5).
Field names have no semantic meaning within F5 (Positions is the sole exception, as F5 assigns it normative structural meaning). They exist only to distinguish multiple fields of identical datatype. F5 is agnostic to any additional semantics an application may attach to field names; applications MAY use field names to encode application-specific meaning such as ‘Velocity’, ‘Temperature’, or ‘Color’, and F5 tools MUST neither require nor reject such names.
7.1.1 Positions
Positions is the only field with special semantics.
In coordinate representations, Positions MUST use the coordinate datatype defined by the Chart and encode geometric coordinates.
In relative representations, Positions MUST be an integer array whose elements index into the target Skeleton’s index space. The arity of each entry (e.g., 2 for edges, 3 for triangles) is determined by the topological structure of the target Skeleton.
The two uses of Positions are distinguished solely by the type of Representation (coordinate vs. relative).
7.1.2 Requirement rule for omitted Positions and Fields
A Representation MUST contain exactly one Positions field unless the Positions values are intentionally omitted. Omission of Positions is permitted and is not by itself a fatal error.
7.1.2.1 Reader obligations
If a Representation contains a Positions field, readers use it as the authoritative geometric embedding.
If Positions is omitted, readers MUST treat the Representation as potentially partial and MUST NOT assume geometry is available.
Readers MAY attempt to derive omitted Positions using any heuristics or methods they implement. Derivation is an implementation choice. Readers that cannot or will not derive Positions MUST treat the Representation as partial and MUST emit a diagnostic indicating the missing geometry.
Readers that derive Positions MAY do so at any granularity including per-Representation, per-fragment, or per-element. Readers that compute and cache derived values SHOULD record provenance (human-readable string attribute describing the performed data derivation action) and a timestamp (7.1.2.6) on the cached values.
7.1.2.2 Cycle detection and safety
Readers that attempt derivation MUST detect and break cycles in derivation dependencies. If a derivation chain forms a cycle, the reader MUST stop derivation, treat the involved Representations as partial, and emit a diagnostic describing the cycle.
7.1.2.3 Preference guidance for readers
When multiple derivation paths are available, readers SHOULD prefer methods that maximize numerical stability, then minimize computational cost. This preference is guidance only and not normative.
7.1.2.4 Interoperability principle
Interoperability is achieved by conservative defaults: if Positions are absent and no reader-supported derivation exists, treat the Representation as partial rather than attempting speculative, cross-file operations.
7.1.2.5 Non-normative guidance
Examples of reader strategies include chart transformations, indirection mappings, refinement inheritance, temporal interpolation, and procedural generation. Implementations will vary by application and numerical requirements.
Caching: readers that compute derived values are encouraged to cache them locally and record a timestamp (7.1.2.6) to aid downstream tools.
Diagnostics: readers SHOULD provide clear diagnostics indicating whether a Representation is usable for a requested operation and why (missing Positions, unsupported derivation method, cycle detected).
7.1.2.6 Timestamps
Use the HDF5 timestamp property on groups and datasets (e.g. H5O_info2_t via H5Oget_info3() and H5O_INFO_TIME). When supporting another file format, use attributes or another file-format-specific storage method.
7.2 Field contents
A Field MAY contain:
- a single dataset (contiguous field)
- multiple datasets (fragmented field)
- subgroups representing separated compound components
- attributes defining procedural fields
Fields MAY mix these forms as long as semantics remain consistent.
7.3 Field–Skeleton relationship
A Field’s index space MUST match the Skeleton’s index space.
This is enforced through:
- fragment offsets
- fragment sizes
- procedural definitions
- or implicit coverage rules
A Field MAY be partial (see Section 9).
7.4 Time‑dependence
A Field MAY be:
- time‑independent (same dataset identity across timeslices)
- time‑dependent (new dataset identity at a timeslice)
- partially time‑dependent (some fragments change, others remain linked)
Symbolic links MUST be used to express time‑independence.
8. Field Types
F5 supports several field types, each with distinct semantics.
8.1 Contiguous Field
A contiguous field consists of a single dataset containing all values.
The dataset’s size defines the field’s size.
8.2 Fragmented Contiguous Field
A fragmented field consists of multiple datasets, each representing a fragment.
Each fragment MAY define:
- offset (index‑space offset)
- Range
- Fiber::NumericalShift
- CellSize
Fragment names have no semantic meaning.
Fragment placement is determined solely by:
- the fragment’s offset attribute (if present), and
- the coordinates in the Positions field.
Traversal order of fragments has no semantic effect.
8.3 Separated Compound Field
A compound datatype MAY be stored as separate datasets:
x
y
z
Each component MAY be:
- contiguous
- fragmented
- partially fragmented
All components MUST share consistent fragment attributes.
8.4 Procedural Fields
Procedural fields define values algorithmically.
UniformSampling
Attributes:
- base
- offset
Value at index i:
base + offset * i
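As a worked illustration of the formula, the sketch below evaluates a UniformSampling field at one index. The function name uniform_sampling is hypothetical, and it assumes base and offset are coordinate vectors of equal length evaluated component by component.

```python
def uniform_sampling(base, offset, i):
    """Value of a UniformSampling field at index i: base + offset * i."""
    # Assumption: base and offset are equal-length coordinate vectors.
    if len(base) != len(offset):
        raise ValueError("base and offset must have the same length")
    return tuple(b + o * i for b, o in zip(base, offset))
```

For example, with base (0.0, 1.0) and offset (0.5, 2.0), index 3 yields (1.5, 7.0).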
DirectProduct
Defined by N one‑dimensional arrays.
The field value is the Cartesian product of these arrays.
This is used to construct rectilinear grids (non‑uniform regular grids).
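The Cartesian product can be sketched with the standard library. The function name direct_product is hypothetical. The enumeration order (last axis varying fastest) is simply what itertools.product yields and is an assumption, not something this section specifies.

```python
from itertools import product

def direct_product(*axes):
    """Return all coordinate tuples of a rectilinear grid.

    axes: N one-dimensional sequences of coordinates.
    Assumption: the last axis varies fastest in the returned list.
    """
    return list(product(*axes))
```

With axes [0, 1] and [10, 20], the result is (0, 10), (0, 20), (1, 10), (1, 20).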
FragmentedUniformSampling
UniformSampling applied per fragment, with fragment‑level attributes.
8.5 TypeInfo
TypeInfo is required when the datatype alone is insufficient or unavailable to determine the field’s algebraic or geometric interpretation; this includes cases where multiple fields share the same low-level datatype and names alone are insufficient for unambiguous interpretation.
Examples include:
- multiple fields sharing the same datatype
- tensor fields whose rank or variance cannot be inferred from the datatype
- fields participating in chart transformations
- compound datatypes lacking explicit component semantics
- fields stored as a group rather than a dataset, e.g. a separated compound layout
In all other cases, TypeInfo is recommended for clarity.
A Field MAY exist without any datasets or fragments only if TypeInfo is present. In this case, TypeInfo defines the field’s datatype and algebraic meaning. Such a field is considered empty but well-typed, and fragments MAY be added later. A field with no datasets and no TypeInfo is malformed.
9. Fragment Semantics
Fragments are the fundamental unit of partial storage.
9.1 Fragment identity
Fragment identity is determined by dataset object identity, not by:
- name
- content
- order
- path
Two fragments are identical only if they reference the same HDF5 dataset object.
9.2 Fragment ordering
Fragment names are irrelevant.
Traversal order has no semantic meaning.
Readers MUST treat fragments as random‑access containers.
Any geometric or data‑processing algorithm MUST derive ordering from coordinates, not from fragment order.
9.3 Fragment attributes
Fragment attributes define placement and interpretation.
If multiple fields define fragment attributes:
- attributes are merged
- inconsistent attributes produce a warning
- file load order determines precedence
- inconsistent attributes across files are an error condition
9.4 Partial fields
A field is implicitly partial if some regions of the index space are not covered by any fragment.
Querying a location with no fragment coverage yields a default value:
- zero, or
- the HDF5 fill value (if defined)
No explicit “partial” marker is required.
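A lookup over a partial field might be sketched as follows. The function name lookup is hypothetical, and the sketch assumes a one-dimensional index space in which each fragment is an (offset, values) pair.

```python
def lookup(fragments, index, fill_value=None):
    """Return the field value at index, or a default if no fragment covers it.

    fragments: iterable of (offset, values) pairs in a 1D index space
    (an assumption made for brevity).
    fill_value: the HDF5 fill value if one is defined, else None.
    """
    for offset, values in fragments:
        if offset <= index < offset + len(values):
            return values[index - offset]
    # No coverage: fall back to the fill value, or zero if none is defined.
    return 0 if fill_value is None else fill_value
```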
10. Time‑Dependence and Identity Rules
Time‑dependence in F5 is defined strictly in terms of dataset object identity, not content equality.
This ensures deterministic interpretation across files, writers, and timeslices.
10.1 First occurrence
The first occurrence of a field is the earliest timeslice containing a real dataset object for that field.
Symbolic links do not count as first occurrences; they refer back to an earlier dataset.
10.2 Time‑independent fields
A field is time‑independent over an interval if all timeslices reference the same dataset object identity.
This MUST be expressed using symbolic links.
Writers MUST use symbolic links to express identity. Copying a dataset destroys identity information and MUST NOT be used to express time‑independence. If a writer copies a dataset intentionally, the reader will treat it as a distinct dataset with distinct semantics.
10.3 Time‑dependent fields
A field is time‑dependent at timeslice T if it contains a new dataset object at that timeslice.
Time‑dependence MAY be:
- full (entire field changes)
- partial (only some fragments change)
10.4 Fragment‑level time‑dependence
Fragments MAY be time‑dependent independently of each other.
Examples:
- Fragment 0 is linked across timeslices → time‑independent
- Fragment 1 is replaced at timeslice T → time‑dependent
This allows efficient incremental updates.
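Identity-based classification of a single fragment (or field) across timeslices can be sketched as below. The function name classify_timeslices is hypothetical, and the object_id values stand in for whatever token a reader uses to represent HDF5 object identity; how that token is obtained is outside this sketch.

```python
def classify_timeslices(identities):
    """Classify each timeslice as introducing a new dataset object or a link.

    identities: list of (time, object_id) pairs, ordered by ascending time.
    Returns a list of (time, "new" | "linked").
    """
    seen = set()
    result = []
    for time, object_id in identities:
        # A previously seen identity means the timeslice links back to it.
        result.append((time, "linked" if object_id in seen else "new"))
        seen.add(object_id)
    return result
```

The first "new" entry corresponds to the first occurrence; every later "new" entry marks a timeslice at which the fragment is time-dependent.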
10.5 Forbidden duplication
Duplicating a dataset instead of linking is forbidden because:
- it destroys identity information
- it breaks time‑independence detection
- it introduces ambiguity
Readers MUST treat every dataset object as semantically distinct unless it is literally the same HDF5 object (via link). Readers cannot detect whether two datasets were copied intentionally or accidentally. Identity is determined solely by HDF5 object identity.
Writers MUST use symbolic links to express identity.
11. Skeleton Index‑Space Rules
Skeletons define the index space for all fields under their representations.
Skeletons themselves do not have an intrinsic size; their size is derived from fields.
11.1 Field size
A field’s size is determined by its internal structure:
- contiguous field → dataset size
- fragmented field → sum of fragment sizes
- procedural field → implied size from attributes
These sizes contribute to the skeleton’s index‑space coverage.
11.2 Skeleton size
Skeleton size is defined as the union of index coverage across all fields attached to the Skeleton. Coverage MAY come from:
- contiguous datasets
- fragmented datasets
- procedural definitions
- or any combination thereof
11.2.1 Rules
An unfragmented field defines the full Skeleton index space. If multiple unfragmented fields exist, they MUST all have identical size; otherwise the Skeleton is invalid.
Fragmented fields MAY cover any subset of this index space.
If no unfragmented field exists, the Skeleton index space is defined as the union of coverage across all fragmented or procedural fields.
If an unfragmented field exists on a Skeleton, all fragmented fields on that Skeleton MUST cover only indices within the unfragmented field’s index space. Any fragment that defines coverage outside the unfragmented field’s index space is a fatal error.
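These rules can be sketched for a one-dimensional index space. The function name skeleton_index_space and its inputs are hypothetical: unfragmented fields are given by their sizes, fragments by (offset, size) pairs, and ValueError stands in for the fatal conditions.

```python
def skeleton_index_space(unfragmented_sizes, fragments):
    """Return the set of indices that make up the Skeleton index space.

    unfragmented_sizes: sizes of all unfragmented fields on the Skeleton.
    fragments: (offset, size) pairs from fragmented fields, in a 1D index
    space (an assumption made for brevity).
    """
    if unfragmented_sizes:
        if len(set(unfragmented_sizes)) != 1:
            raise ValueError("unfragmented fields differ in size")
        size = unfragmented_sizes[0]
        for offset, length in fragments:
            if offset < 0 or offset + length > size:
                raise ValueError(
                    f"fragment at offset {offset} exceeds index space of size {size}")
        return set(range(size))
    # No unfragmented field: the index space is the union of fragment coverage.
    covered = set()
    for offset, length in fragments:
        covered.update(range(offset, offset + length))
    return covered
```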
Empty fields contribute no coverage.
This rule ensures:
- consistent index space
- compatibility across fields
- predictable behavior for partial fields
11.3 Consistency requirements
All fields attached to a skeleton MUST:
- share the same index space
- provide values for all indices unless partial
- use consistent fragment attributes
- use consistent fragment offsets
- use consistent procedural definitions
If a field is partial, missing regions MUST return default values (see Section 9).
Partiality is a structural property: a field is partial if its fragments or datasets do not cover the full Skeleton index space. The F5 model does not distinguish intentional from accidental partiality; readers MUST treat all partial fields uniformly. Writers are responsible for ensuring that partial fields are semantically correct.
11.4 Geometry determines ordering
Ordering of elements in the index space is determined by:
- coordinates in the Positions field
- fragment offsets
- procedural definitions
Fragment names and storage order have no effect on index‑space ordering.
12. Validation Rules
Validation ensures that F5 files are semantically consistent and safe to interpret.
12.1 Fatal errors
The following conditions are fatal:
- missing required skeleton attributes
- incompatible datasets during merge
- conflicting dataspaces
- inconsistent fragment attributes across files
12.1.1 Timeslice validation
A group that contains a Time attribute is a candidate timeslice. A candidate timeslice becomes a valid timeslice only after its Time attribute is successfully parsed as a scalar real.
Candidate timeslice with a non-scalar or non-convertible Time attribute -> fatal error for that candidate timeslice (see 12.1.2 for propagation).
A group with no Time attribute -> not a timeslice; ignored without error.
12.1.2 Fatal error propagation
Fatal errors are local to the entity in which they occur, but propagate along all structural and referential dependencies. A fatal error invalidates the affected entity and all entities that depend on it, regardless of where they appear in the hierarchy or across timeslices.
The fatal-error propagation rule applies to references resolved via F5::Reference. If a referenced Skeleton in another Timeslice or Grid is invalid, any Representation that references it via F5::Reference is invalid.
Examples:
- Invalid Skeleton → all Representations, Fields, and Fragments referencing that Skeleton are invalid, even across Grids or Timeslices
- Invalid Representation → all Fields and Fragments under that Representation are invalid
- Invalid Grid → only that Grid within its Timeslice is invalid, and all references to its Skeletons from other Grids or Timeslices are invalid
- Invalid Timeslice → only that Timeslice is invalid, and all references to its Grids or Skeletons from other Timeslices are invalid
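Propagation along dependencies amounts to a reachability search. In this sketch the function name propagate_invalid and the dependents mapping are hypothetical; the mapping is assumed to contain both structural (parent to child) and referential (target to referrer) edges.

```python
from collections import deque

def propagate_invalid(dependents, initially_invalid):
    """Return every entity invalidated by the given fatal errors.

    dependents: dict of entity -> entities that depend on it.
    initially_invalid: entities in which a fatal error occurred.
    """
    invalid = set(initially_invalid)
    queue = deque(initially_invalid)
    while queue:
        entity = queue.popleft()
        for child in dependents.get(entity, ()):
            if child not in invalid:
                invalid.add(child)
                queue.append(child)
    return invalid
```

Entities that are not in the returned set are unaffected and remain eligible for processing.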
12.1.3 Continuation rule
Readers MUST continue processing all unaffected entities. A fatal error MUST NOT abort processing of the entire file unless the root structure itself is malformed.
12.2 Warnings
Warnings SHOULD be issued for:
- missing optional attributes
- fallback to group names for grid identification
- ignored malformed timeslices
- default chart assumption
- fragment attribute conflicts resolved by file load order
Warnings do not abort processing.
12.3 Diagnostics
Readers SHOULD record:
- merge operations
- attribute conflicts
- overlapping fragments
- time‑dependence intervals
- fallback behaviors
- missing or partial fields
Diagnostics are not part of the file format but are essential for tooling.
13. Deterministic Traversal Rules
Traversal order is defined for reproducibility but has no semantic meaning. Geometry and coordinates determine semantics, not traversal order.
Traversal order:
- Timeslices by numeric Time
- Grids by F5::GridID or group name
- Skeletons by IndexDepth, SkeletonDimensionality, Refinement, then name (recommended for reproducible traversal only; traversal order has no semantic meaning).
- Representations by name
- Fields by datatype, name
- Fragments by arbitrary order (names irrelevant)
13.1 Fragment traversal clarification
Because fragment names are irrelevant and geometry determines ordering:
- traversal order of fragments MUST NOT affect results
- algorithms MUST treat fragments as random‑access containers
- coordinate‑based queries MUST be used for geometric operations
This is a core F5 principle:
geometry determines meaning; storage layout does not.
14. Reader Expectations
Readers (and tools implementing this specification) MUST adhere to the following behavioral requirements.
14.1 Timeslice handling
Readers MUST:
- identify candidate timeslices by the presence of a Time attribute and validate them using 12.1.1.
- merge timeslices with identical numeric Time values
- order timeslices by numeric Time
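As an illustration of these three steps, the sketch below groups root-level groups by their numeric Time and returns them in ascending order. The input format (pairs of attribute value and group name) is an assumption made for the example, and values that do not convert to a floating-point number are set aside rather than aborting, in line with the warning for ignored malformed timeslices in 12.2.

```python
from collections import defaultdict

def merge_timeslices(groups):
    """groups: iterable of (Time attribute value, group name) pairs."""
    merged = defaultdict(list)
    skipped = []
    for raw_time, name in groups:
        try:
            # Integer, floating-point, and parseable text all map to float.
            merged[float(raw_time)].append(name)
        except (TypeError, ValueError):
            skipped.append(name)  # a reader would issue a warning here
    return sorted(merged.items()), skipped

groups = [("2.0", "run_b"), (1, "run_a"), (2.0, "run_c"), ("n/a", "bad")]
timeslices, skipped = merge_timeslices(groups)
```

The groups `run_b` and `run_c` end up in the same timeslice because their Time values are numerically identical, regardless of how they were stored.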
14.2 Grid handling
Readers MUST:
- identify grids by F5::GridID when present
- fall back to group name when absent
- merge grids across files deterministically
- treat grid ordering as traversal‑only, not semantic
14.3 Skeleton handling
Readers MUST:
- validate required skeleton attributes
- treat missing optional attributes as warnings
- derive skeleton size from fields
- enforce consistent index‑space semantics
14.4 Representation handling
Readers MUST:
- identify coordinate representations via chart names
- identify relative representations via skeleton names
- resolve local charts via GlobalChart attributes
- assume default chart if none is defined (StandardCartesianChart3D)
- treat chart transformations as optional but recommended
14.5 Field handling
Readers MUST:
- identify fields by datatype
- support contiguous, fragmented, separated compound, and procedural fields
- treat Representations with absent Positions as partial per 7.1.2.
- treat fragment names as irrelevant
- treat fragments as random‑access containers
- use coordinates to determine geometric ordering
- support partial fields and default values
14.6 Time‑dependence handling
Readers MUST:
- detect time‑independence via symbolic links
- treat dataset duplication as semantic change
- support fragment‑level time‑dependence
- track identity across timeslices
14.7 Fragment handling
Readers MUST:
- ignore fragment names
- use fragment offsets and coordinates for placement
- merge fragment attributes using file load order
- warn on inconsistent attributes
- treat overlapping fragments as warnings
14.8 Geometry‑driven semantics
Readers MUST:
- derive ordering from coordinates, not storage order
- treat geometry as the authoritative source of meaning
- ensure that algorithms produce identical results regardless of fragment order
This is a core F5 principle.
15. Summary of the Canonical F5 Model
This section summarizes the entire specification in a concise, normative list.
15.1 Structural model
- Timeslices identified by scalar Time
- Timeslices with same numeric Time merged
- Grids identified by F5::GridID or group name
- Skeletons define topology and index space
- Representations define coordinate or relative meaning
- Fields attach data to skeleton index space
- Fragments store partial data
15.2 Geometry‑driven semantics
- Geometry determines ordering
- Fragment names are irrelevant
- Storage layout has no semantic meaning
- Readers MUST use coordinates for all geometric operations
15.3 Field types
- contiguous
- fragmented
- separated compound
- procedural (UniformSampling, DirectProduct, FragmentedUniformSampling)
15.4 Fragment semantics
- identity is dataset identity
- ordering is irrelevant
- offsets and coordinates determine placement
- partial fields allowed
- default values for uncovered regions
15.5 Time‑dependence
- symbolic links express time‑independence
- dataset duplication expresses change
- fragment‑level time‑dependence supported
15.6 Validation
- required attributes enforced
- optional attributes warn
- Fragment attribute conflicts: within a single file → warning; across multiple files → fatal error
- deterministic merging required
15.7 Charts
- charts define coordinate systems
- local charts reference global charts
- default “StandardCartesianChart3D” chart assumed if none defined
16. Integrated Clarifications (OQ‑1 to OQ‑5)
All open questions have been resolved and integrated into the specification. For completeness, they are restated here.
OQ‑1: Fragment names and ordering
- Fragment names have no semantic meaning.
- Ordering of fragments is irrelevant.
- Placement is determined solely by fragment offsets and coordinates.
- Geometry determines ordering, not storage layout.
- Deterministic traversal is unnecessary for semantics.
OQ‑2: Partial fields
- Partiality is implicit from missing fragments.
- Querying uncovered regions yields default values (zero or HDF5 fill value).
- No explicit partial marker is required.
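A partial field can be queried as in the following sketch. It assumes one-dimensional fragments with a hypothetical `offset` and `data` pair, and a `fill` parameter standing in for zero or the HDF5 fill value.

```python
def query(fragments, index, fill=0.0):
    """Return the value at `index`, or `fill` if no fragment covers it."""
    for frag in fragments:
        start = frag["offset"]
        if start <= index < start + len(frag["data"]):
            return frag["data"][index - start]
    return fill

fragments = [{"offset": 2, "data": [7.0, 8.0]}]

covered = query(fragments, 3)
uncovered = query(fragments, 0)
custom_fill = query(fragments, 0, fill=-1.0)
```

No marker is consulted: an index is partial simply because no fragment contains it.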
OQ‑3: DirectProduct semantics
- DirectProduct uses the Cartesian product of component arrays.
- Used to construct rectilinear (non‑uniform regular) grids.
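For illustration, a two-dimensional rectilinear grid can be generated from two component arrays as sketched below. The array values are made up, and the choice of which index varies fastest is an assumption of this example, not something stated here.

```python
from itertools import product

x = [0.0, 1.0, 4.0]  # non-uniform spacing along the first axis
y = [0.0, 2.0]       # spacing along the second axis

def position(i, j):
    """Procedural evaluation: no full coordinate array is stored."""
    return (x[i], y[j])

# Materialized form: the Cartesian product of the component arrays,
# with the last component varying fastest.
points = list(product(x, y))
```

Only `len(x) + len(y)` values are stored, yet `len(x) * len(y)` positions are defined.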
OQ‑4: Fragment attribute merging
- File load order determines precedence.
- Inconsistent fragment attributes within a single file produce warnings.
- Inconsistent fragment attributes across multiple files are a fatal error.
OQ‑5: Default chart
- If no chart is defined, assume a default “StandardCartesianChart3D” chart.
- This is mandatory for backward compatibility.
- A warning is recommended but not required.
- Explicit charts are strongly preferred.
17. Conceptual Note: “How is it?” vs. “What is it?”
The F5 model does not classify datasets by predefined geometric or topological types. Instead, it describes how a dataset is structured, not what it is.
Applications MUST NOT ask:
“Is this a triangular surface?”
but instead:
“Does this dataset have the properties of a triangular surface?”
This distinction is essential:
A triangular surface is also a point cloud. It MAY be part of a refinement hierarchy. It MAY coexist with line or cell structures on the same vertices. It MAY represent only a subset of a larger domain.
F5 avoids implicit assumptions and predefined categories. The model exposes structure; applications infer meaning from that structure.
This design philosophy is central to the expressive power of F5.
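The difference between the two questions can be sketched as property predicates instead of a type tag. The dictionary layout and the `vertices_per_cell` key are hypothetical, and the (dimensionality 2, IndexDepth 1) signature for faces is an assumption of this example.

```python
def has_point_cloud(grid):
    """True if the grid exposes a vertex-level Skeleton."""
    return any(sk["index_depth"] == 0 for sk in grid["skeletons"])

def has_triangular_surface(grid):
    """True if some Skeleton has the properties of a triangle mesh."""
    return any(
        sk["dimensionality"] == 2
        and sk["index_depth"] == 1
        and sk.get("vertices_per_cell") == 3
        for sk in grid["skeletons"]
    )

grid = {
    "skeletons": [
        {"name": "Points", "dimensionality": 0, "index_depth": 0},
        {"name": "Faces", "dimensionality": 2, "index_depth": 1,
         "vertices_per_cell": 3},
    ]
}

is_surface = has_triangular_surface(grid)
is_cloud = has_point_cloud(grid)
```

Both predicates hold for the same grid, which a single "what is it?" classification could not express.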
18. Structural Principles for Hierarchical Refinement
This appendix describes the structural principles that allow hierarchical refinement to be expressed within the F5 model. It introduces no new semantics beyond those already defined in Sections 1–17. Instead, it clarifies how the existing concepts—Skeletons, Representations, Fields, IndexDepth, and Fragments—combine to express refinement structures in a fully general and topology‑preserving manner.
The purpose of this appendix is to provide a design guide. It does not prescribe any specific refinement scheme, naming convention, or data layout. All refinement structures MUST be derivable from the core F5 rules.
18.1 Refinement as a Topological Relation
Refinement is a relation between topological entities.
In F5, all topological entities are represented by Skeletons, and all relations between Skeletons are represented by Representations.
Therefore:
- A refinement hierarchy MUST be expressed as a sequence of Skeletons.
- Refinement relations MUST be expressed as Representations between these Skeletons.
- No additional hierarchy levels or special metadata are required.
This follows directly from the canonical hierarchy:
Timeslice → Grid → Skeleton → Representation → Field → Fragment datasets
18.2 Fragments as Topological Entities
Fragments are subsets of a field’s index space.
Topologically, a fragment corresponds to a set of cells, and a cell corresponds to a set of vertices. Fragments are not topological entities themselves; they are represented as topological entities only when modeled via Skeletons with IndexDepth ≥ 2.
Thus, refinement tiles MAY be represented by Skeletons whose IndexDepth reflects their topological structure:
- IndexDepth = 0 → vertices (points)
- IndexDepth = 1 → cells (sets of vertices)
- IndexDepth = 2 → sets of cells (fragments)
- IndexDepth = 3 → sets of fragments, and so on
This allows refinement tiles to be treated as first‑class topological entities.
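The nesting can be illustrated with plain index lists. In this sketch the names `cells`, `tiles`, and `levels` are hypothetical, and each level stores indices into the level directly below it.

```python
# IndexDepth 1: each cell is a set of vertex indices.
cells = [(0, 1, 2), (0, 2, 3)]
# IndexDepth 2: each tile is a set of cell indices.
tiles = [(0,), (0, 1)]

levels = {1: cells, 2: tiles}

def vertices_of(depth, index):
    """Expand an element at `depth` down to the vertex indices it contains."""
    if depth == 0:
        return {index}
    found = set()
    for child in levels[depth][index]:
        found |= vertices_of(depth - 1, child)
    return found

cell_vertices = vertices_of(1, 1)
tile_vertices = vertices_of(2, 1)
```

The IndexDepth of an element is the number of expansion steps needed to reach vertices.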
18.3 Refinement Levels as Skeletons
Each refinement level is represented by a distinct Skeleton.
These Skeletons:
- share the same embedding dimension,
- differ in their index spaces,
- MAY differ in IndexDepth depending on the refinement scheme,
- MAY define optional Refinement attributes to indicate level.
The F5 model does not constrain the number of refinement levels or their structure.
18.4 Refinement Relations as Relative Representations
A refinement relation between two Skeletons is expressed as a relative representation.
A subgroup of a Skeleton is a refinement Representation if its name matches the name of another Skeleton, either within the same Grid or reached through the cross-grid or cross-timeslice references defined in 6.5.
Thus, refinement relations are expressed structurally as:
Skeleton_L / Skeleton_{L+1}
This representation has the same index space as Skeleton_L.
Its Fields describe how each element of Skeleton_L relates to elements of Skeleton_{L+1}.
No new keywords or attributes are required.
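A minimal sketch of how such relations can be found by structure alone. The nested dictionary is a hypothetical in-memory view of one Grid (Skeleton name → Representation name → Field name → values); the names `Level0`, `Level1`, `Cartesian`, and `Children` are invented for the example, and cross-grid references are not modeled.

```python
grid = {
    "Level0": {
        "Cartesian": {"Positions": [(0.0, 0.0), (2.0, 0.0)]},
        # Relative representation: named after another Skeleton in the Grid.
        "Level1": {"Children": [(0, 1), (2, 3)]},
    },
    "Level1": {
        "Cartesian": {
            "Positions": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
        },
    },
}

def refinement_pairs(grid):
    """List (parent, child) Skeleton pairs linked by a relative representation."""
    return [
        (parent, rep)
        for parent, reps in grid.items()
        for rep in reps
        if rep in grid
    ]

pairs = refinement_pairs(grid)
children = grid["Level0"]["Level1"]["Children"]
```

The `Children` field has one entry per element of `Level0`, matching the rule that the representation shares the index space of Skeleton_L.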
A physically important example is AMR (Adaptive Mesh Refinement) governed by the Courant-Friedrichs-Lewy (CFL) condition. The CFL condition constrains the time step at each refinement level to be proportional to the cell size at that level; finer levels therefore advance at smaller time steps than coarser ones. As a result, the coarse-level refinement Representation is partially time-dependent: it references a fine-level Skeleton that exists at a different (earlier) timeslice than the coarse level’s current timeslice. Such inter-level refinement relations MUST use F5::Reference (see 6.5) pointing to the fine-level Skeleton at the appropriate timeslice. The refinement Representation itself thus changes only as often as the coarse level advances, not at every fine sub-step, which is a natural expression of partial time-dependence within the F5 model.
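The time-stepping pattern can be made concrete with a small worked example. It assumes a refinement ratio of 2 between levels and a time step exactly proportional to cell size; the function name and the numbers are illustrative only.

```python
def level_times(t_end, dt_coarse, level, ratio=2):
    """Times at which a refinement level is advanced, from 0 to t_end."""
    dt = dt_coarse / ratio**level
    steps = round(t_end / dt)
    return [i * dt for i in range(steps + 1)]

coarse = level_times(1.0, 0.5, level=0)
fine = level_times(1.0, 0.5, level=1)

# Fine sub-steps at which the coarse level has not advanced, so the
# coarse-level refinement Representation is unchanged.
fine_only = [t for t in fine if t not in coarse]
```

With these numbers the coarse level has timeslices at 0.0, 0.5, and 1.0, while the fine level also has timeslices at 0.25 and 0.75.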
18.5 Fields on Refinement Representations
Every Representation MAY contain Fields.
Fields on a refinement representation:
- have the same index space as the parent Skeleton,
- MAY be contiguous, fragmented, or procedural,
- MAY contain identifiers, indices, or other values referencing the child Skeleton.
The F5 model does not prescribe the datatype or structure of these fields.
Their semantics follow from the Representation they belong to.
18.6 Fragment‑Level Refinement
When refinement is defined at the fragment (tile) level:
- fragments are represented as elements of a Skeleton with appropriate IndexDepth,
- refinement relations are expressed as relative representations between fragment Skeletons,
- Fields on these representations encode the refinement mapping.
Fragment names have no semantic meaning.
Fragment identity is determined by dataset identity, as defined in Section 9.
18.7 Refinement Across Different Topological Dimensions
Refinement MAY occur:
- between fragment Skeletons,
- between fragment Skeletons and point Skeletons,
- between fragment Skeletons and higher‑dimensional Skeletons (e.g., triangles, tetrahedra),
- or between Skeletons of different IndexDepth.
The F5 model imposes no restrictions on the dimensionality of Skeletons participating in refinement relations.
All such relations MUST be expressed structurally through relative representations.
18.8 Optional Coordinate Embeddings
Skeletons participating in refinement hierarchies MAY optionally define coordinate representations. Coordinate representations MUST refer to a chart; relative representations MUST refer to another Skeleton.
These coordinate representations:
- use the same chart mechanism as all other coordinate representations,
- MAY define Positions fields giving geometric embeddings of refinement elements.
Positions fields MAY be omitted when their information is derivable from other Representations, indirection mappings, refinement relationships, or chart transformations, consistent with the general Positions rule in Section 7.1.
The F5 model does not require explicit coordinate embeddings for refinement structures.
18.9 No Special Keywords or Reserved Names
The F5 model does not introduce any special keywords, attribute names, or reserved identifiers for refinement.
All refinement structures MUST be expressible using:
- Skeletons
- Representations
- Fields
- IndexDepth
- Fragments
No additional constructs are permitted.
18.10 Derivability from Core Principles
All refinement structures MUST be derivable from the following core principles:
- Skeletons define topological entities.
- Representations define relations between Skeletons.
- Fields attach data to Skeleton index spaces.
- Fragments partition field index spaces.
- IndexDepth expresses nested topological structure.
- No semantics are encoded in names.
- All semantics arise from structure.
These principles are sufficient to express:
- hierarchical point clouds,
- hierarchical meshes,
- AMR refinement,
- multi‑resolution grids,
- and mixed‑dimensional refinement structures.
No additional rules are required. These principles are sufficient for an AI or human reader to derive refinement structures without explicit examples.
18.11 IndexDepth Design Guidelines for Scalable Topological Modeling
This section provides structural guidelines for choosing IndexDepth values in a way that ensures long‑term stability, extensibility, and composability of Skeletons. These guidelines do not introduce new semantics; they follow directly from the core principles defined in Sections 1–17.
The purpose of these guidelines is to ensure that Skeletons remain structurally stable even when intermediate topological levels are added or removed, and that tools can reliably interpret Skeletons based solely on their dimensionality and index‑depth signatures.
These guidelines are consistent with the examples in Section 5.6.
18.11.1 Maximal‑Depth Principle
If a topological entity can participate in multiple nested levels of structure, its Skeleton SHOULD be assigned an IndexDepth equal to the maximum depth of nesting it could ever require, even if some intermediate levels are not currently present.
This ensures that:
- the Skeleton’s index space remains stable over time,
- intermediate Skeletons (e.g., Edges between Points and Lines) MAY be added or removed without restructuring,
- higher‑order or refined variants of the entity can be introduced without altering existing Skeletons,
- tools can reliably identify Skeletons by their (SkeletonDimensionality, IndexDepth) pair.
This principle follows from the definition of IndexDepth as the number of nested index spaces between a Skeleton and its vertices.
18.11.2 Optional Intermediate Skeletons
Intermediate topological Skeletons (e.g., Edges between Points and Lines) are optional.
Their presence or absence MUST NOT require restructuring of Skeletons at higher index depths.
A Skeleton defined with a maximal IndexDepth MAY coexist with or without intermediate Skeletons. If intermediate Skeletons are introduced later, they simply occupy the appropriate index‑depth level without affecting existing structures.
18.11.3 Stability of Index Spaces
Skeletons SHOULD be designed so that their index spaces do not change when:
- intermediate Skeletons are added or removed,
- refinement structures are introduced,
- higher‑order elements are added,
- additional Representations are defined.
Assigning a maximal IndexDepth ensures that the index space of a Skeleton remains stable and predictable.
18.11.4 Uniformity Across Datasets
For a given topological entity type (e.g., lines, surfaces, volumes), all datasets SHOULD use the same (SkeletonDimensionality, IndexDepth) pair, regardless of whether intermediate levels are present.
This uniformity enables:
- consistent interpretation across datasets,
- predictable traversal logic,
- deterministic code generation,
- compatibility with refinement structures.
18.11.5 Compatibility With Refinement Structures
Refinement structures (Sections 18.1–18.10) rely on stable index‑depth assignments.
Skeletons representing refined entities MUST be able to participate in relative representations without requiring redefinition of their index spaces.
Using maximal index depth ensures that refinement relations can be expressed structurally, even when intermediate levels are absent.
18.11.6 Derivability From Core Principles
The Maximal‑Depth Principle is not an additional rule; it follows directly from:
- Skeletons represent topological entities.
- IndexDepth expresses nested topological structure.
- Representations express relations between Skeletons.
- Skeleton identity MUST be stable across Representations.
- No semantics are encoded in names.
- All semantics arise from structure.
These principles imply that Skeletons SHOULD be assigned index depths that remain valid under all future structural extensions.
End of Section 18.11
End of Section 18
19. Mathematical Domains Required for Fitting a Dataset into the F5 Layout
This section provides the mathematical context that underlies the F5 model. These concepts are not additional requirements of the format; they explain the structures defined in Sections 1–18.
The domains are listed in order of conceptual priority.
19.1 Differential Geometry
Differential geometry provides the foundational language for:
- coordinate charts
- tangent and cotangent spaces
- vector and covector fields
- tensor fields
- coordinate transformations
- geometric embeddings
- metric‑dependent quantities
- geometric invariants
In the F5 model:
- coordinate representations correspond to charts on a manifold,
- Positions fields provide embeddings of skeleton elements into a chart,
- tensor‑valued fields MUST be stored in coordinate representations because they transform under chart changes,
- named datatypes encode tensorial transformation rules.
Differential geometry is therefore essential for:
- interpreting coordinate‑dependent fields,
- understanding how fields transform between charts,
- ensuring that geometric quantities are stored in the correct representation.
19.2 Topology
Topology provides the structural foundation for:
- vertices, edges, faces, and cells
- nested index spaces (IndexDepth)
- connectivity relations
- refinement relations
- cell complexes
- adjacency and incidence
- partial coverings and fragment partitions
In the F5 model:
- Skeletons represent topological entities,
- IndexDepth expresses nested topological structure,
- relative representations express incidence relations between skeletons,
- refinement structures are topological relations between skeletons at different levels.
Topology determines:
- the structure of Skeletons,
- the meaning of Representations,
- the interpretation of Fields as functions on index spaces.
19.3 Fiber‑Bundle Theory
Fiber‑bundle theory provides the unifying mathematical framework for the F5 model.
A fiber bundle consists of:
- a base space (the index space of a Skeleton),
- a fiber (the datatype of a Field),
- a projection (the attachment of field values to indices),
- and optional connections (e.g., chart transformations).
In the F5 model:
- every Field is a section of a fiber bundle,
- the fiber is defined by the Field’s named datatype,
- the base is the Skeleton’s index space,
- coordinate representations correspond to local trivializations of the bundle,
- chart transformations correspond to transition functions.
This perspective explains:
- why Fields attach to Skeletons,
- why tensor fields MUST live in coordinate representations,
- why named datatypes MUST encode transformation rules,
- why geometry and topology are strictly separated.
Fiber‑bundle theory is the conceptual backbone of the entire F5 design.
19.4 Geometric Algebra and Tensor Algebra
Geometric algebra (or classical tensor algebra) provides the mathematical language for typing fields.
Required concepts include:
- tangent vectors
- cotangent vectors
- multivectors
- differential forms
- metric‑dependent and metric‑independent quantities
- transformation rules under chart changes
- basis representations
- tensor rank and variance
In the F5 model:
- named datatypes encode the algebraic type of a field,
- coordinate representations provide the basis in which components are stored,
- chart transformations define how components transform.
This domain is essential for:
- distinguishing vectors from covectors,
- distinguishing tensors of different rank,
- ensuring correct transformation behavior,
- interpreting physical quantities correctly.
19.5 Geometry
Geometry provides:
- embeddings into ℝⁿ
- metric interpretation
- geometric queries
- geometric ordering
- spatial reasoning
In the F5 model:
- geometry is introduced only through coordinate representations,
- Positions fields define embeddings,
- geometry determines ordering and spatial queries,
- storage layout has no geometric meaning.
19.6 Index‑Space Theory
Index‑space theory covers:
- discrete index sets
- nested index spaces (IndexDepth)
- mappings between index spaces
- partial coverage
- fragment offsets
- procedural fields
This domain is essential for:
- interpreting Skeletons,
- understanding relative representations,
- handling fragmented fields,
- interpreting refinement structures.
19.7 Set‑Theoretic Semantics
The F5 model is fundamentally set‑theoretic:
- Skeletons define sets,
- Representations define functions between sets,
- Fields define functions from index sets to values,
- Fragments define partitions of sets.
This domain is required for:
- understanding identity,
- interpreting refinement mappings,
- handling partial fields,
- merging datasets.
19.8 Identity Theory
Identity in F5 is defined by HDF5 object identity, not content.
This domain covers:
- object identity vs equality
- symbolic links
- identity propagation
- time‑dependence semantics
Identity theory is essential for:
- time‑dependent fields,
- fragment‑level updates,
- merging across files.
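The distinction between identity and equality can be illustrated by analogy, using Python object identity as a stand-in for HDF5 object identity. The dictionaries below are hypothetical in-memory timeslices, not an F5 API.

```python
shared = [1.0, 2.0, 3.0]

t0 = {"Positions": shared}
t1 = {"Positions": shared}        # like a symbolic link: the same object
t2 = {"Positions": list(shared)}  # a duplicate: equal content, new identity

def time_independent(a, b, field):
    """A field is unchanged between two timeslices only if it is the same object."""
    return a[field] is b[field]

linked = time_independent(t0, t1, "Positions")
duplicated = time_independent(t1, t2, "Positions")
same_content = t1["Positions"] == t2["Positions"]
```

The duplicate compares equal by content yet counts as a change, which mirrors the rule that dataset duplication expresses change.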
19.9 Optional Domains
Depending on the dataset, additional mathematical domains MAY be relevant:
- measure theory (densities, integrals)
- graph theory (connectivity queries)
- algebraic topology (homology, cohomology)
- numerical analysis (interpolation, discretization)
These domains are not required by the F5 model but MAY be used by applications.
19.10 Derivability
All mathematical structures required to fit a dataset into the F5 layout are derivable from:
- differential geometry
- topology
- fiber‑bundle theory
- geometric/tensor algebra
- index‑space theory
- identity theory
No additional mathematical assumptions are required.