hoops_ai.storage.datasetstorage.schema_builder

class hoops_ai.storage.datasetstorage.schema_builder.SchemaBuilder(domain='generic_data', version='1.0', description=None)

Bases: object

Standard API builder for creating data storage schemas.

This class provides a clear, explicit interface for building schemas using a traditional object-oriented approach without method chaining.

Example

>>> builder = SchemaBuilder(domain='CAD_analysis', version='1.0')
>>> # Create groups
>>> faces_group = builder.create_group('faces', 'face', 'Face geometric data')
>>> edges_group = builder.create_group('edges', 'edge', 'Edge geometric data')
>>> # Create arrays in groups
>>> faces_group.create_array('face_areas', ['face'], 'float32', 'Surface area of each face')
>>> faces_group.create_array('face_normals', ['face', 'coordinate'], 'float32', 'Normal vectors')
>>> edges_group.create_array('edge_lengths', ['edge'], 'float32', 'Length of each edge')
>>> # Build the final schema
>>> schema = builder.build()

Parameters:
  • domain (str)

  • version (str)

  • description (str | None)

build()

Build the final schema dictionary.

Returns:

Complete schema dictionary

Return type:

Dict[str, Any]

Raises:

ValueError – If schema validation fails

copy()

Create a copy of this schema builder.

Returns:

A copy of this builder

Return type:

SchemaBuilder

create_group(name, primary_dimension, description=None, special_processing=None)

Create a new group in the schema.

Parameters:
  • name (str) – Name of the group

  • primary_dimension (str) – Primary dimension for this group

  • description (str | None) – Optional description of the group

  • special_processing (str | None) – Optional special processing type

Returns:

The created group object

Return type:

Group

Example

faces_group = builder.create_group('faces', 'face', 'Face-related data')

define_categorical_metadata(name, dtype='str', description=None, values=None, labels=None, required=False, **validation_rules)

Define categorical metadata that goes to .attribset files.

Parameters:
  • name (str) – Name of the metadata field

  • dtype (str) – Data type (str, int32, int64, float32, float64, bool)

  • description (str | None) – Optional description of the metadata

  • values (List[Any] | None) – List of allowed values for this category

  • labels (List[str] | None) – List of human-readable labels corresponding to values

  • required (bool) – Whether this metadata is required

  • **validation_rules – Additional validation rules

Return type:

None

Example

builder.define_categorical_metadata(
    'machining_category', 'int32', 'Machining complexity',
    values=[1, 2, 3, 4, 5],
    labels=['Simple', 'Easy', 'Medium', 'Hard', 'Complex'])
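
The labels list is described as corresponding to values, which suggests a positional pairing. A minimal sketch of that pairing in plain Python, independent of the library: the helper name label_lookup is hypothetical, and the equal-length check is an assumption about what a validator might enforce, not documented behavior.

```python
from typing import Any, Dict, List


def label_lookup(values: List[Any], labels: List[str]) -> Dict[Any, str]:
    """Pair each allowed value with its label by position (hypothetical helper)."""
    # Assumption: a value without a label (or the reverse) is treated as an error.
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    return dict(zip(values, labels))


lookup = label_lookup([1, 2, 3, 4, 5],
                      ['Simple', 'Easy', 'Medium', 'Hard', 'Complex'])
```

With this pairing, lookup[3] gives the label for machining category 3.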

define_file_metadata(name, dtype='str', description=None, required=False, **validation_rules)

Define file-level metadata that goes to .infoset files.

Parameters:
  • name (str) – Name of the metadata field

  • dtype (str) – Data type (str, int32, int64, float32, float64, bool)

  • description (str | None) – Optional description of the metadata

  • required (bool) – Whether this metadata is required

  • **validation_rules – Additional validation rules

Return type:

None

Example

builder.define_file_metadata('size_cadfile', 'int64', 'File size in bytes')
builder.define_file_metadata('processing_time', 'float32', 'Processing time in seconds')

extend_template(template_name)

Extend the current schema with data from a template.

Parameters:

template_name (str) – Name of the template to extend with

Returns:

Self for continued configuration

Return type:

SchemaBuilder

classmethod from_dict(schema_dict)

Create a schema builder from a dictionary.

Parameters:

schema_dict (Dict[str, Any]) – Dictionary containing schema definition

Returns:

New builder with loaded schema

Return type:

SchemaBuilder

from_template(template_name)

Initialize the schema from a predefined template.

Parameters:

template_name (str) – Name of the template to use

Returns:

Self for continued configuration

Return type:

SchemaBuilder

get_group(name)

Get an existing group by name.

Parameters:

name (str) – Name of the group to retrieve

Returns:

The group object, or None if not found

Return type:

Group | None

get_metadata(key)

Get metadata value by key.

Parameters:

key (str)

Return type:

Any

get_metadata_routing(metadata_name)

Get the routing destination for a metadata field using patterns and explicit definitions.

Parameters:

metadata_name (str) – Name of the metadata field

Returns:

'file_level' or 'categorical', or None if not found

Return type:

str | None

list_categorical_metadata_fields()

Get list of defined categorical metadata fields.

Return type:

List[str]

list_file_metadata_fields()

Get list of defined file-level metadata fields.

Return type:

List[str]

list_groups()

Get a list of all group names in the schema.

Return type:

List[str]

list_metadata_fields()

Get all defined metadata fields grouped by destination.

Returns:

Dict containing 'file_level' and 'categorical' lists

Return type:

Dict[str, List[str]]

classmethod load_from_file(file_path)

Load a schema from a JSON file.

Parameters:

file_path (str | Path) – Path to the schema file

Returns:

Loaded schema builder

Return type:

SchemaBuilder

remove_group(name)

Remove a group from the schema.

Parameters:

name (str) – Name of the group to remove

Returns:

True if group was removed, False if not found

Return type:

bool

route_metadata_field(field_name, value)

Route a metadata field to its destination using schema rules.

Parameters:
  • field_name (str) – Name of the metadata field

  • value (Any) – Value of the metadata field

Returns:

'file_level' or 'categorical'

Return type:

str

save_to_file(file_path)

Save the schema to a JSON file.

Parameters:

file_path (str | Path) – Path where the schema file is saved

Return type:

None
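
Because save_to_file and load_from_file both work with JSON, a schema should survive a write and read unchanged. The sketch below shows that round trip using only the standard library. The dictionary layout is hypothetical; the structure that build() actually produces is not described here.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical schema layout, for illustration only.
schema = {
    'domain': 'CAD_analysis',
    'version': '1.0',
    'groups': {
        'faces': {'primary_dimension': 'face', 'arrays': ['face_areas']},
    },
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'schema.json'
    # Write with the same default indentation that to_json documents (indent=2).
    path.write_text(json.dumps(schema, indent=2), encoding='utf-8')
    restored = json.loads(path.read_text(encoding='utf-8'))
```

The round trip is lossless here because every key is a string and every value is a JSON-native type.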

set_metadata(key, value)

Set metadata for the schema.

Parameters:
  • key (str) – Metadata key

  • value (Any) – Metadata value

Return type:

None

set_metadata_routing_rules(file_level_patterns=None, categorical_patterns=None, default_numeric='file_level', default_categorical='categorical', default_string='categorical')

Set routing rules for metadata fields with pattern-based and default routing.

Parameters:
  • file_level_patterns (List[str] | None) – Patterns for fields that should go to file-level (.infoset)

  • categorical_patterns (List[str] | None) – Patterns for fields that should go to categorical (.attribset)

  • default_numeric (str) – Where to route numeric metadata by default

  • default_categorical (str) – Where to route categorical metadata by default

  • default_string (str) – Where to route string metadata by default

Return type:

None

Pattern examples:

file_level_patterns=['description', 'flow_name', 'stream ', 'Item']
categorical_patterns=['category', 'type', 'material_', 'complexity']
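
How patterns are matched is not specified above. The following sketch shows one plausible reading in plain Python, under these assumptions: a pattern matches when it is a case-insensitive substring of the field name, file-level patterns are checked before categorical ones, and unmatched fields fall back to a default chosen by value type. The function route_field is hypothetical and does not call the library; the default_categorical parameter is left out because its trigger condition is not documented.

```python
from typing import Any, List, Optional


def route_field(name: str, value: Any,
                file_level_patterns: Optional[List[str]] = None,
                categorical_patterns: Optional[List[str]] = None,
                default_numeric: str = 'file_level',
                default_string: str = 'categorical') -> str:
    """Route a metadata field by name pattern, then by value type (sketch)."""
    lowered = name.lower()
    # Assumption: file-level patterns take precedence over categorical ones.
    for pattern in file_level_patterns or []:
        if pattern.lower() in lowered:
            return 'file_level'
    for pattern in categorical_patterns or []:
        if pattern.lower() in lowered:
            return 'categorical'
    # No pattern matched: fall back to a type-based default.
    # Note that bool is a subclass of int, so booleans count as numeric here.
    if isinstance(value, (int, float)):
        return default_numeric
    return default_string


r1 = route_field('flow_name', 'run_a', ['description', 'flow_name'], ['category', 'type'])
r2 = route_field('material_type', 'steel', ['description'], ['category', 'type', 'material_'])
r3 = route_field('face_count', 42)
r4 = route_field('author', 'someone')
```

Under these assumptions, r1 routes to file level by pattern, r2 routes to categorical by pattern, and r3 and r4 fall through to the numeric and string defaults.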

to_json(indent=2)

Convert the schema to JSON string.

Parameters:

indent (int) – JSON indentation level

Returns:

JSON representation of the schema

Return type:

str

validate()

Validate the current schema configuration.

Returns:

List of validation errors (empty if valid)

Return type:

List[str]
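
validate() reports problems as a list of strings, while build() raises ValueError when validation fails. The sketch below illustrates how those two conventions can fit together, using a stand-in class. TinyBuilder, its groups attribute, and the single rule it checks are all hypothetical; whether build() calls validate() internally in SchemaBuilder is an assumption.

```python
from typing import Any, Dict, List


class TinyBuilder:
    """Stand-in for illustration only; not the hoops_ai SchemaBuilder."""

    def __init__(self) -> None:
        self.groups: Dict[str, str] = {}  # group name -> primary dimension

    def validate(self) -> List[str]:
        """Return a list of error messages; an empty list means valid."""
        errors = []
        for name, dimension in self.groups.items():
            if not dimension:
                errors.append(f"group '{name}' has no primary dimension")
        return errors

    def build(self) -> Dict[str, Any]:
        """Return the schema dictionary, or raise if validation fails."""
        errors = self.validate()
        if errors:
            raise ValueError('; '.join(errors))
        return {'groups': dict(self.groups)}


builder = TinyBuilder()
builder.groups['faces'] = ''          # invalid: empty primary dimension
problems = builder.validate()         # inspect errors without raising
builder.groups['faces'] = 'face'      # fix the problem
schema = builder.build()              # now succeeds
```

Calling validate() first lets the caller collect every problem at once, whereas build() stops at the first failure by raising.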

validate_metadata_field(field_name, value)

Validate a metadata field value against its defined type.

Parameters:
  • field_name (str) – Name of the metadata field

  • value (Any) – Value to validate

Returns:

True if valid, False otherwise

Return type:

bool
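
The dtype names accepted by the metadata definitions (str, int32, int64, float32, float64, bool) suggest a type check along the following lines. This is a sketch in plain Python, not the library's implementation: matches_dtype is a hypothetical helper, and the integer range check and the rejection of booleans as integers are assumptions.

```python
from typing import Any

# Bit widths for the integer dtype names listed in the metadata definitions.
_INT_BITS = {'int32': 32, 'int64': 64}


def matches_dtype(value: Any, dtype: str) -> bool:
    """Check whether a Python value fits a named dtype (hypothetical helper)."""
    if dtype == 'bool':
        return isinstance(value, bool)
    if dtype == 'str':
        return isinstance(value, str)
    if dtype in _INT_BITS:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            return False
        bits = _INT_BITS[dtype]
        return -(2 ** (bits - 1)) <= value < 2 ** (bits - 1)
    if dtype in ('float32', 'float64'):
        # Assumption: integers are acceptable where a float is expected.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    raise ValueError(f"unknown dtype: {dtype}")
```

For example, a file size larger than 2**31 - 1 bytes fails the int32 check but passes as int64, which is one reason to declare a field such as size_cadfile as int64.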

validate_metadata_schema()

Validate the metadata schema configuration.

Returns:

List of validation errors (empty if valid)

Return type:

List[str]