######################
Machine Learning Model
######################

.. warning::

   The ML architecture described in this section is currently **EXPERIMENTAL** and may change in future releases.

**************
Introduction
**************

This section of the programming guide covers the concepts and best practices for training and deploying machine learning models using HOOPS AI. It includes data mining and exploration concepts, guides to train models using either pre-built architectures or custom machine learning frameworks, as well as model inference and evaluation techniques.

HOOPS AI provides a complete end-to-end pipeline for CAD-based machine learning, from dataset exploration through model training to deployment. The system is designed around a unified architecture that ensures encoding consistency between training and inference while maintaining flexibility for custom implementations.

Problem Statement
=================

**The Challenge:** CAD → ML workflows require identical encoding logic in:

1. **Training:** Batch processing thousands of CAD files
2. **Inference:** Real-time processing of new CAD files

**The Solution:** HOOPS AI encapsulates all encoding logic in reusable model interfaces:

.. code-block:: python

   # Training: Used by FlowTrainer
   flow_model.encode_cad_data(cad_file, cad_loader, storage)

   # Inference: Used by FlowInference (same method!)
   flow_model.encode_cad_data(cad_file, cad_loader, storage)

This architecture guarantees that the exact same encoding pipeline processes data during both training and deployment, eliminating the common bug where training and inference handle data differently.

**********************
Architecture Overview
**********************

The HOOPS AI ML system provides a layered architecture supporting the complete ML lifecycle:

.. code-block:: text

   FlowModel (Abstract Interface)
   ├── GraphClassification (Graph-level classifier)
   │   ├── Whole-part classification
   │   └── Example: "bolt", "bearing", "bracket"
   └── GraphNodeClassification (Node-level classifier)
       ├── Per-face classification
       └── Example: "hole", "pocket", "slot"

   Both consumed by:
   ├── FlowTrainer (batch training with dataset splitting)
   └── FlowInference (single-file deployment)

   Dataset tools:
   ├── DatasetExplorer (query and analyze merged datasets)
   └── DatasetLoader (prepare data for ML training)

Dataset Exploration and Mining
===============================

:doc:`explore-dataset` - Comprehensive guide to dataset analysis and querying.

**What you'll learn:**

- Understanding merged dataset structure (``.dataset``, ``.infoset``, ``.attribset`` files)
- Using DatasetExplorer for discovery and filtering
- Querying data at different granularities (arrays, groups, files)
- Creating distributions and analyzing statistics
- Using DatasetLoader for ML preparation
- Performing stratified train/validation/test splits
- Creating PyTorch DataLoaders for training

**When to use this guide:**

- You have Flow-generated datasets and want to explore them
- You need to filter CAD files by geometric/topological criteria
- You want to understand dataset distributions before training
- You're preparing data for custom ML workflows

Part Classification with Graph Learning
========================================

:doc:`part-class` - Graph-level classification for whole-part categorization.

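
As a rough illustration of what "graph-level" means next to "node-level", the sketch below produces one label for a whole part and one label per face from the same per-face feature vectors. It is plain Python, not HOOPS AI code: the feature values, the weight tables, and the helpers ``score``, ``classify_graph``, and ``classify_nodes`` are hypothetical, and mean pooling is only one common way to summarize a graph.

```python
from typing import Dict, List

Vector = List[float]


def score(features: Vector, weights: Dict[str, Vector]) -> str:
    """Return the label whose weight vector has the largest dot product."""
    return max(
        weights,
        key=lambda label: sum(f * w for f, w in zip(features, weights[label])),
    )


def classify_nodes(face_features: List[Vector], weights: Dict[str, Vector]) -> List[str]:
    """Node-level: one prediction per face."""
    return [score(features, weights) for features in face_features]


def classify_graph(face_features: List[Vector], weights: Dict[str, Vector]) -> str:
    """Graph-level: mean-pool all face features, then predict once."""
    count = len(face_features)
    pooled = [sum(column) / count for column in zip(*face_features)]
    return score(pooled, weights)


# Hypothetical 2-dimensional features for a part with three faces.
face_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

# Hypothetical linear "models" (one weight vector per label).
part_weights = {"bolt": [1.0, 0.0], "bracket": [0.0, 1.0]}
feature_weights = {"hole": [1.0, 0.0], "slot": [0.0, 1.0]}

part_label = classify_graph(face_features, part_weights)
face_labels = classify_nodes(face_features, feature_weights)
```

The graph-level call returns a single string for the part, while the node-level call returns a list with one entry per face.
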

**What you'll learn:**

- Understanding the GraphClassification model architecture
- Initializing models with custom class counts
- CAD encoding strategy (face discretization, edge features, graph structure)
- Integration with Flow tasks using ``@flowtask`` decorators
- Complete FABWAVE dataset example (45 part classes)
- Training workflows with FlowTrainer
- Inference and deployment with FlowInference

**When to use this guide:**

- You need to classify entire CAD parts into categories
- You want to recognize part types or shape families
- You're building design retrieval or recommendation systems
- You need manufacturing process selection based on part geometry

CAD Feature Recognition Model
==============================

:doc:`feature-rec` - Node-level classification for per-face feature recognition.

**What you'll learn:**

- Understanding the GraphNodeClassification model architecture
- Initializing models with Transformer hyperparameters
- Rich encoding strategy (extended graph features, angle/distance histograms)
- Integration with Flow tasks using ``@flowtask`` decorators
- Complete CADSynth-AAG dataset example (162k models)
- Handling node-level labels and per-face predictions
- Advanced hyperparameter tuning for production models

**When to use this guide:**

- You need to identify machining features on individual faces
- You're performing face semantic segmentation
- You're building manufacturing process planning tools
- You need detailed geometric feature analysis

Develop Your Own ML Model
==========================

:doc:`train` - Custom model development using the FlowModel interface.

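
The pattern behind a shared model interface can be sketched in a few lines of standard-library Python. The example is an assumption-laden illustration, not the HOOPS AI API: ``SketchFlowModel``, ``TokenCountModel``, ``SketchTrainer``, and ``SketchInference`` are hypothetical stand-ins, the loader is a dictionary lookup, and only the method name ``encode_cad_data`` and its three arguments mirror the call shown in the Problem Statement.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

Storage = Dict[str, List[float]]
Loader = Callable[[str], str]


class SketchFlowModel(ABC):
    """Hypothetical interface: the model owns its encoding logic."""

    @abstractmethod
    def encode_cad_data(self, cad_file: str, cad_loader: Loader, storage: Storage) -> None:
        """Encode one file and write the result into storage."""


class TokenCountModel(SketchFlowModel):
    """Toy encoder: text length and number of FACE tokens."""

    def encode_cad_data(self, cad_file: str, cad_loader: Loader, storage: Storage) -> None:
        text = cad_loader(cad_file)
        storage[cad_file] = [float(len(text)), float(text.count("FACE"))]


class SketchTrainer:
    """Batch side: encodes every file through the model's own method."""

    def __init__(self, model: SketchFlowModel) -> None:
        self.model = model

    def encode_all(self, cad_files: List[str], cad_loader: Loader) -> Storage:
        storage: Storage = {}
        for cad_file in cad_files:
            self.model.encode_cad_data(cad_file, cad_loader, storage)
        return storage


class SketchInference:
    """Single-file side: calls the very same method."""

    def __init__(self, model: SketchFlowModel) -> None:
        self.model = model

    def encode_one(self, cad_file: str, cad_loader: Loader) -> List[float]:
        storage: Storage = {}
        self.model.encode_cad_data(cad_file, cad_loader, storage)
        return storage[cad_file]


# Fake "CAD files" held in memory so the sketch runs without any CAD library.
fake_files = {"a.step": "FACE FACE EDGE", "b.step": "FACE"}
loader = fake_files.__getitem__

model = TokenCountModel()
train_storage = SketchTrainer(model).encode_all(list(fake_files), loader)
inference_encoding = SketchInference(model).encode_one("a.step", loader)
```

Because both wrappers delegate to one method on the model, the encoding for ``a.step`` is identical on the batch path and the single-file path by construction.
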

**What you'll learn:**

- Understanding the FlowModel abstract interface
- Implementing custom encoding strategies
- Defining model architectures with PyTorch Lightning
- Integration with FlowTrainer for batch training
- Integration with FlowInference for deployment
- Ensuring encoding consistency between training and inference
- Best practices for custom model development

**When to use this guide:**

- Pre-built models don't match your specific use case
- You need custom feature extraction or graph structures
- You want to integrate third-party ML architectures
- You're researching novel CAD ML approaches

***********
Quick Start
***********

New to HOOPS AI ML?
===================

Start with :doc:`explore-dataset` to understand how to query and analyze CAD datasets. This foundation is essential before training any models.

Ready to Implement?
===================

**Part classification:** → :doc:`part-class` - Learn GraphClassification for whole-part categorization

**Face segmentation:** → :doc:`feature-rec` - Learn GraphNodeClassification for per-face feature recognition

**Custom models:** → :doc:`train` - Learn the FlowModel interface for custom architectures

Need Flow Integration Examples?
================================

See complete integration examples in:

- :doc:`part-class` - FABWAVE dataset with 45 part classes
- :doc:`feature-rec` - CADSynth-AAG dataset with 162k models

**********************
Additional Resources
**********************

**Flow Framework**
   See :doc:`/programming_guide/data-flow-management` for the complete Flow system guide covering task orchestration, parallel execution, and dataset merging.

**CAD Data Encoding**
   See :doc:`/programming_guide/cad-data-encoding` for detailed information about feature extraction from CAD files.

**Visualization**
   See :doc:`/programming_guide/cad-data-visualization` for tools to visualize predictions on 3D geometry.

.. toctree::
   :titlesonly:
   :hidden:

   explore-dataset
   part-class
   feature-rec
   train