hoops_ai.dataset
Quick Overview
Classes
DatasetExplorer([flow_output_file, ...]) - Provides methods to explore queries/filters on a merged Zarr dataset and retrieve metadata from a Parquet file.
DatasetLoader([merged_store_path, ...]) - A framework-agnostic dataset class that:
CADDataset(parent_dataset, indices) - A framework-agnostic dataset object that contains training, validation or testing data.
TorchCADDataset(*args, **kwargs) - PyTorch Dataset wrapper for CADDataset.
GraphDataset(*args, **kwargs) - PyTorch Dataset class that uses a DatasetLoader object to dynamically create DGL graphs instead of loading pre-computed graphs from disk.
Dataset Management Module
The Dataset module provides tools for managing large-scale CAD datasets. It covers the full path for a collection of CAD models: exploration, loading, preprocessing, and ML integration.
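The module is designed to work with various dataset formats, such as the merged Zarr dataset and Parquet metadata file listed above, and provides data loading patterns for machine learning workflows. It supports both traditional ML frameworks, through framework-agnostic dataset classes and PyTorch wrappers, and specialized graph-based approaches for CAD data analysis, such as DGL graphs created on demand.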
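The CADDataset(parent_dataset, indices) signature suggests an index-based view over a parent dataset, a common way to hold training, validation, and testing subsets without copying data. The following is a minimal, self-contained sketch of that general pattern in plain Python. It is not the hoops_ai implementation: ParentStore, SubsetView, split_indices, and every parameter name are hypothetical and exist only for illustration.

```python
from typing import Any, List, Sequence, Tuple


class ParentStore:
    """Hypothetical stand-in for a loaded dataset (NOT a hoops_ai class)."""

    def __init__(self, items: Sequence[Any]) -> None:
        self._items = list(items)

    def __len__(self) -> int:
        return len(self._items)

    def get_item(self, index: int) -> Any:
        return self._items[index]


class SubsetView:
    """Hypothetical index-based view: stores only indices, reads from the parent."""

    def __init__(self, parent: ParentStore, indices: Sequence[int]) -> None:
        self.parent = parent
        self.indices = list(indices)

    def __len__(self) -> int:
        return len(self.indices)

    def __getitem__(self, i: int) -> Any:
        # Translate the subset-local position into a parent index.
        return self.parent.get_item(self.indices[i])


def split_indices(
    n: int, train_pct: int = 70, val_pct: int = 15
) -> Tuple[List[int], List[int], List[int]]:
    """Split range(n) into contiguous train/val/test index lists.

    Integer percentages avoid floating-point rounding surprises.
    """
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    idx = list(range(n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


# Usage: ten placeholder items split into three views over one parent.
parent = ParentStore([f"part_{i}" for i in range(10)])
train_idx, val_idx, test_idx = split_indices(len(parent))
train = SubsetView(parent, train_idx)
val = SubsetView(parent, val_idx)
test = SubsetView(parent, test_idx)
```

Because each view implements only `__len__` and `__getitem__`, it stays independent of any ML framework, and a thin framework-specific wrapper can be layered on top. That is presumably the role TorchCADDataset plays for CADDataset, though the actual interfaces should be checked in the class reference.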
For comprehensive examples and dataset management patterns, see the Dataset Management Programming Guide.