hoops_ai.storage

Quick Overview

Classes

- LabelStorage(path_for_storing[, ...]): Class for encoding and decoding labels.
- MetricStorage(store): Abstract class defining the interface for storing machine learning metrics based on their type of data and visualization.
- CADFileRetriever(storage_provider[, ...])
- LocalStorageProvider(directory_path)

Functions

- convert_storage(source_handler, dest_handler): Generic converter that works with ANY DataStorage implementation.
Data Storage Module
The Storage module provides persistent storage solutions for CAD data, ML models, and analysis results. It offers a unified interface for various storage backends, optimized for the unique requirements of CAD data processing and machine learning workflows.
This module handles the efficient storage and retrieval of large-scale CAD datasets, encoded geometric data, trained ML models, and experimental results. It provides both high-performance options for production use and convenient formats for development and prototyping.
For storage architecture details and usage patterns, see the Data Storage Programming Guide.
- class hoops_ai.storage.CADFileRetriever(storage_provider, formats=None, filter_pattern=None, use_regex=False)
Bases: object
- class hoops_ai.storage.DataStorage
Bases: ABC
- abstract close()
Handles any necessary cleanup or resource deallocation.
- Return type:
None
- abstract get_file_path(data_key)
Retrieves the file path for a given data key.
- get_group_for_array(array_name)
Determines which group an array belongs to based on the schema.
- abstract get_keys()
Retrieves a list of all keys in the storage.
- Returns:
A list of all keys in the storage.
- Return type:
list
- get_schema()
Retrieves the schema definition for this storage instance.
- Returns:
The schema definition, or empty dict if no schema is set
- Return type:
dict
- abstract load_data(data_key)
Loads data associated with a specific key.
- Parameters:
data_key (str) – The key of the data to load.
- Returns:
The loaded data.
- Return type:
Any
- abstract load_metadata(key)
Loads metadata associated with a specific key.
- Parameters:
key (str) – The metadata key.
- Returns:
The loaded metadata value.
- Return type:
Any
- abstract save_data(data_key, data)
Saves data associated with a specific key.
- Parameters:
data_key (str) – The key under which to store the data.
data (Any) – The data to store.
- Return type:
None
- abstract save_metadata(key, value)
Saves metadata as a key-value pair into the metadata JSON file. If the file doesn’t exist, it will be created.
- Parameters:
key (str) – The metadata key.
value (Any) – The metadata value (bool, int, float, string, list, or array).
- Return type:
None
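The abstract methods above define the full contract a storage backend must satisfy. The sketch below mirrors that contract with a local ABC and a minimal dict-backed implementation; it is an illustration under the assumption that the real interface matches the methods documented here — in practice you would subclass `hoops_ai.storage.DataStorage` itself.

```python
from abc import ABC, abstractmethod
from typing import Any

# Illustrative stand-in for hoops_ai.storage.DataStorage; the real library
# ABC defines the same abstract methods documented above.
class DataStorageLike(ABC):
    @abstractmethod
    def save_data(self, data_key: str, data: Any) -> None: ...
    @abstractmethod
    def load_data(self, data_key: str) -> Any: ...
    @abstractmethod
    def save_metadata(self, key: str, value: Any) -> None: ...
    @abstractmethod
    def load_metadata(self, key: str) -> Any: ...
    @abstractmethod
    def get_keys(self) -> list: ...
    @abstractmethod
    def get_file_path(self, data_key: str) -> str: ...
    @abstractmethod
    def close(self) -> None: ...

class DictStorage(DataStorageLike):
    """Minimal dict-backed implementation of the contract above."""

    def __init__(self) -> None:
        self._data: dict = {}
        self._meta: dict = {}

    def save_data(self, data_key, data):
        self._data[data_key] = data

    def load_data(self, data_key):
        return self._data[data_key]

    def save_metadata(self, key, value):
        self._meta[key] = value

    def load_metadata(self, key):
        return self._meta[key]

    def get_keys(self):
        return list(self._data)

    def get_file_path(self, data_key):
        # No real files behind this backend; return a descriptive marker.
        return f"<in-memory:{data_key}>"

    def close(self):
        self._data.clear()
        self._meta.clear()
```

Because the converter and other utilities in this module use only these base methods, any class satisfying this contract plugs into the rest of the pipeline.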
- class hoops_ai.storage.LabelStorage(path_for_storing, total_faces=0, total_edges=0)
Bases: object
Class for encoding and decoding labels.
- EDGE_CADENTITY = 'edge_labels'
- EDGE_ENTITY = 'GRAPH_EDGE'
- GRAPH_CADENTITY = 'file_label'
- GRAPH_ENTITY = 'GRAPH_ENTITY'
- NODE_CADENTITY = 'face_labels'
- NODE_ENTITY = 'GRAPH_NODE'
- load_graph_edges_labels(mlTask)
Loads the label codes and descriptions for each CAD edge.
- load_graph_label(mlTask)
Loads the label code and description for the entire graph.
- load_graph_nodes_labels(mlTask)
Loads the label codes and descriptions for each CAD face.
- load_sparse_graph_edge_label(mlTask)
Loads sparse edge label data.
- save_graph_edge_label(mlTask, edgeLabels, edgeLabelDescriptions)
Saves the label codes and descriptions for each CAD edge.
- save_graph_label(mlTask, graphLabel, graphLabelDescription)
Saves the label code and description for the entire graph.
- save_graph_node_label(mlTask, faceLabels, faceLabelDescriptions)
Saves the label codes and descriptions for each CAD face.
- save_sparse_graph_edge_label(mlTask, edgeIndices, edgeLabels, defaultLabel=0, edgeLabelDescriptions=None)
Saves sparse edge label codes while assigning a default label to all other edges.
- save_sparse_graph_node_label(mlTask, faceIndices, faceLabels, defaultLabel=0, faceLabelDescriptions=None)
Saves sparse face label codes and descriptions while assigning a default label to all other faces.
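The sparse save methods above assign an explicit label to the listed faces (or edges) and a default label to every other one. The helper below sketches that densification step with NumPy; the function name is hypothetical and only illustrates the semantics described for save_sparse_graph_node_label.

```python
import numpy as np

# Illustrative densification: labels are supplied for a few face indices,
# and every other face receives `default_label` (mirroring the documented
# behavior of save_sparse_graph_node_label; not the library's own code).
def densify_face_labels(total_faces, face_indices, face_labels, default_label=0):
    dense = np.full(total_faces, default_label, dtype=np.int64)
    dense[np.asarray(face_indices)] = np.asarray(face_labels)
    return dense
```

For a part with 6 faces where only faces 1 and 4 carry labels 3 and 7, the dense array becomes `[0, 3, 0, 0, 7, 0]`.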
- class hoops_ai.storage.LocalStorageProvider(directory_path)
Bases: StorageProvider
- Parameters:
directory_path (str)
- class hoops_ai.storage.MemoryStorage
Bases: DataStorage
- close()
Clears the stored data and metadata from memory.
- Return type:
None
- compress_store()
This method is a placeholder, as in-memory storage does not require compression.
- Returns:
Always returns 0 as no compression is performed.
- Return type:
int
- create_store_in_group(store_path='')
Creates a new MemoryStorage instance and adds it to the store group.
- Parameters:
store_path (str)
- Return type:
MemoryStorage
- format()
Returns the format of this storage.
- Returns:
A string specifying that this is in-memory storage.
- Return type:
str
- get_file_path(data_key)
Since this is in-memory storage, file paths do not exist. For compatibility with code that expects a directory path for the root (""), a temporary directory path is returned. For specific keys, a descriptive message is returned.
- get_group_for_array(array_name)
Determines which group an array belongs to based on the schema.
- get_keys()
Retrieves a list of all stored data keys.
- Returns:
A list of keys in the storage.
- Return type:
list
- get_schema()
Retrieves the schema definition for this storage instance.
- Returns:
The schema definition, or empty dict if no schema is set
- Return type:
dict
- get_store_group()
- Return type:
StoreGroup
- load_data(data_key)
Loads data from memory by key.
- load_metadata(key)
Loads metadata by key from memory storage. Supports nested keys using ‘/’ as a separator.
- save_data(data_key, data)
Stores the data in memory and tracks file size.
- Parameters:
data_key (str) – The key under which to store the data.
data (Any) – The data to store.
- Return type:
None
- save_metadata(key, value)
Stores metadata as a key-value pair in memory. Supports nested keys using ‘/’ as a separator.
- Parameters:
key (str) – The metadata key, which can be a nested key using ‘/’ as a separator.
value (Any) – The metadata value.
- Return type:
None
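The metadata methods above support nested keys using '/' as a separator, so a key like "model/training/epochs" addresses a value inside a nested mapping. The helpers below sketch one plausible mapping of such keys onto nested dicts; they illustrate the documented behavior, not MemoryStorage's actual implementation.

```python
# Illustrative handling of '/'-separated metadata keys, mirroring the
# behavior documented for MemoryStorage.save_metadata / load_metadata.
def set_nested(meta: dict, key: str, value) -> None:
    parts = key.split("/")
    node = meta
    for part in parts[:-1]:
        # Create intermediate mappings on demand.
        node = node.setdefault(part, {})
    node[parts[-1]] = value

def get_nested(meta: dict, key: str):
    node = meta
    for part in key.split("/"):
        node = node[part]
    return node
```

With this scheme, `set_nested(meta, "model/training/epochs", 50)` produces `{"model": {"training": {"epochs": 50}}}`.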
- class hoops_ai.storage.MetricStorage(store)
Bases: object
Abstract class defining the interface for storing machine learning metrics based on their type of data and visualization.
- Parameters:
store (DataStorage)
- get_storage()
Returns the storage handler for this metric storage.
- Returns:
The storage handler instance.
- list_data_ids(name)
Returns all data_ids pushed under ‘name’ by examining the Zarr store keys.
- pull_category_metric(name, epoch)
Pulls category-based metric data for a specific epoch from storage.
- Parameters:
name (str)
epoch (int)
- Returns:
A tuple containing a list of categories and the corresponding metric values.
- Raises:
ValueError – If the specified epoch is not found.
- Return type:
tuple
- pull_data(name, data_id)
Loads prediction data from storage.
- pull_matrix_metric(name, epoch)
Pulls a matrix-based metric for a specific epoch from storage.
- Parameters:
name (str)
epoch (int)
- Returns:
A 2D NumPy array representing the matrix metric.
- Raises:
ValueError – If the specified epoch is not found.
- Return type:
numpy.ndarray
- pull_trend_metric(name)
Pulls trend metric data from storage.
- push_category_metric(name, epoch, categories, values)
Pushes category-based metrics incrementally in memory before writing them to storage.
Reason: Compares performance across different categories (e.g., classes, features) for each epoch.
Metrics Included:
Per-Class Accuracy (one bar per class)
Per-Class IoU (one bar per class)
Feature Importance (e.g., in random forests, SHAP values)
- push_matrix_metric(name, epoch, matrix)
Pushes matrix-based metrics incrementally in memory before writing them to storage.
Reason: Stores structured relationships between multiple variables in matrix form, per epoch.
Metrics Included:
Confusion Matrix (classification tasks)
Correlation Matrix (e.g., feature correlations)
- push_predictions(name, data_id, result)
Saves prediction data to storage.
- push_trend_metric(name, epoch, value)
Pushes trend metrics incrementally in memory before writing to a file.
Reason: Tracks values over time (epochs) to analyze learning progress.
Metrics Included:
Loss (training, validation, test)
Accuracy over epochs
Precision/Recall over epochs
F1-score over epochs
IoU (mean IoU, per-class IoU over time)
RMSE/MSE/MAE for regression tasks over epochs
Learning Rate schedules
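The push methods above accumulate metrics incrementally in memory before writing them out. The class below sketches that buffering pattern for trend metrics (one scalar value per epoch, retrieved in epoch order); the class and method names are illustrative, not the MetricStorage API.

```python
# Illustrative in-memory accumulation behind a push/pull trend-metric API:
# values are buffered per metric name, keyed by epoch, and read back in
# epoch order. Not the library's implementation.
class TrendBuffer:
    def __init__(self) -> None:
        self._trends: dict = {}

    def push(self, name: str, epoch: int, value: float) -> None:
        # Later pushes for the same (name, epoch) overwrite earlier ones.
        self._trends.setdefault(name, {})[epoch] = value

    def pull(self, name: str):
        series = sorted(self._trends[name].items())
        epochs = [e for e, _ in series]
        values = [v for _, v in series]
        return epochs, values
```

Buffering this way lets training loops push a value per batch or epoch cheaply and defer the actual store write to a single flush.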
- hoops_ai.storage.convert_storage(source_handler, dest_handler, verbose=False)
Generic converter that works with ANY DataStorage implementation.
This universal converter supports all storage types, including:
- OptStorage / ZarrStorage: compressed binary format
- JsonStorageHandler: human-readable JSON format
- MemoryStorage: in-memory storage (no files)
- Custom implementations: any class inheriting from DataStorage
The converter uses only the base DataStorage interface methods:
- get_keys() or load_metadata() to discover data keys
- load_data() to read data arrays
- save_data() to write data arrays
- save_metadata() to copy metadata
This makes it completely storage-agnostic and extensible.
Common Use Cases:
- OptStorage → JSON: decompress for inspection/debugging
- JSON → OptStorage: compress for performance/storage
- MemoryStorage → OptStorage: persist in-memory data to disk
- OptStorage → MemoryStorage: load encoded data for ML training
- Parameters:
source_handler (DataStorage) – Source storage to read from (any type).
dest_handler (DataStorage) – Destination storage to write to (any type).
verbose (bool) – Print progress messages. Default False.
- Returns:
None
- Raises:
RuntimeError – If no data keys found or conversion fails.
- Return type:
None
Examples
>>> # OptStorage → JSON (decompression)
>>> opt = OptStorage("flows/data_mining/abc123")
>>> json_storage = JsonStorageHandler("flows/json/abc123")
>>> convert_storage(opt, json_storage)

>>> # MemoryStorage → OptStorage (persist to disk)
>>> mem = MemoryStorage()
>>> mem.save_data("faces/positions", np.array([[0, 0, 0], [1, 1, 1]]))
>>> opt = OptStorage("output/saved_data")
>>> convert_storage(mem, opt)

>>> # JSON → MemoryStorage (load for processing)
>>> json_storage = JsonStorageHandler("data/abc123")
>>> mem = MemoryStorage()
>>> convert_storage(json_storage, mem)
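Because convert_storage relies only on the base interface, its core is a key-discovery loop followed by a copy of each array. The sketch below reproduces that loop over a minimal dict-backed stand-in; both the `DictStore` class and the `convert` function are illustrative, not the library's code.

```python
# Illustrative stand-in exposing only the interface methods the converter
# needs (get_keys / load_data / save_data).
class DictStore:
    def __init__(self) -> None:
        self._data = {}

    def get_keys(self):
        return list(self._data)

    def load_data(self, key):
        return self._data[key]

    def save_data(self, key, data):
        self._data[key] = data

# Sketch of the storage-agnostic copy loop described above: discover keys,
# then copy each array from source to destination.
def convert(source, dest, verbose=False):
    keys = source.get_keys()
    if not keys:
        # Mirrors the documented RuntimeError when no data keys are found.
        raise RuntimeError("No data keys found in source storage")
    for key in keys:
        dest.save_data(key, source.load_data(key))
        if verbose:
            print(f"copied {key}")
```

Any pair of DataStorage implementations can stand in for `DictStore` here, which is what makes the converter work across compressed, JSON, and in-memory backends alike.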