Machine Learning Model
Warning
The ML architecture described in this section is currently EXPERIMENTAL and may change in future releases.
Introduction
This section of the programming guide covers the concepts and best practices for training and deploying machine learning models using HOOPS AI. It covers data mining and exploration, guides for training models with either pre-built architectures or custom machine learning frameworks, and model inference and evaluation techniques.
HOOPS AI provides a complete end-to-end pipeline for CAD-based machine learning, from dataset exploration through model training to deployment. The system is designed around a unified architecture that ensures encoding consistency between training and inference while maintaining flexibility for custom implementations.
Problem Statement
The Challenge: CAD → ML workflows require identical encoding logic in:
Training: Batch processing thousands of CAD files
Inference: Real-time processing of new CAD files
The Solution: HOOPS AI encapsulates all encoding logic in reusable model interfaces:
# Training: Used by FlowTrainer
flow_model.encode_cad_data(cad_file, cad_loader, storage)
# Inference: Used by FlowInference (same method!)
flow_model.encode_cad_data(cad_file, cad_loader, storage)
This architecture guarantees that the exact same encoding pipeline processes data during both training and deployment, eliminating the common bug where training and inference handle data differently.
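The shared-encoding idea can be illustrated with a minimal sketch. The class names below (`EncoderModel`, `ToyModel`, `ToyTrainer`, `ToyInference`) are hypothetical stand-ins and not the HOOPS AI API, and the encoding is a placeholder computed from the file name only. The point is that both consumers delegate to the same method on the same model object, so there is a single place where encoding logic can change.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class EncoderModel(ABC):
    """Hypothetical interface: the model owns the encoding logic."""

    @abstractmethod
    def encode(self, cad_file: str) -> Dict[str, float]:
        """Turn one CAD file into a feature dictionary."""


class ToyModel(EncoderModel):
    def encode(self, cad_file: str) -> Dict[str, float]:
        # Placeholder features derived from the file name (assumption:
        # a real encoder would read geometry and topology instead).
        return {
            "name_length": float(len(cad_file)),
            "is_step": float(cad_file.lower().endswith((".step", ".stp"))),
        }


class ToyTrainer:
    """Batch consumer: encodes many files through the model."""

    def __init__(self, model: EncoderModel) -> None:
        self.model = model

    def encode_dataset(self, cad_files: List[str]) -> List[Dict[str, float]]:
        return [self.model.encode(f) for f in cad_files]


class ToyInference:
    """Single-file consumer: encodes one file through the same model."""

    def __init__(self, model: EncoderModel) -> None:
        self.model = model

    def encode_one(self, cad_file: str) -> Dict[str, float]:
        return self.model.encode(cad_file)


model = ToyModel()
batch = ToyTrainer(model).encode_dataset(["bracket.step", "bolt.stp"])
single = ToyInference(model).encode_one("bracket.step")
```

Because neither consumer contains encoding code of its own, the training-time and inference-time features for the same file are identical by construction.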
Architecture Overview
The HOOPS AI ML system provides a layered architecture supporting the complete ML lifecycle:
FlowModel (Abstract Interface)
├── GraphClassification (Graph-level classifier)
│   ├── Whole-part classification
│   └── Example: "bolt", "bearing", "bracket"
└── GraphNodeClassification (Node-level classifier)
    ├── Per-face classification
    └── Example: "hole", "pocket", "slot"
Both consumed by:
├── FlowTrainer (batch training with dataset splitting)
└── FlowInference (single-file deployment)
Dataset tools:
├── DatasetExplorer (query and analyze merged datasets)
└── DatasetLoader (prepare data for ML training)
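The difference between the two classifier levels shows up in the shape of their output. The sketch below is illustrative only: the function names and score values are hypothetical, and the label lists simply reuse the examples from the diagram above. A graph-level classifier yields one label per part, while a node-level classifier yields one label per face.

```python
from typing import List, Sequence

PART_LABELS = ["bolt", "bearing", "bracket"]   # graph-level classes
FACE_LABELS = ["hole", "pocket", "slot"]       # node-level classes


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def classify_part(part_scores: Sequence[float]) -> str:
    """Graph-level: one score vector for the whole part, one label out."""
    return PART_LABELS[argmax(part_scores)]


def classify_faces(face_scores: Sequence[Sequence[float]]) -> List[str]:
    """Node-level: one score vector per face, one label per face out."""
    return [FACE_LABELS[argmax(scores)] for scores in face_scores]


# Hypothetical scores for a part with two faces.
part_label = classify_part([0.1, 0.7, 0.2])
face_labels = classify_faces([[0.9, 0.05, 0.05], [0.1, 0.2, 0.7]])
```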
Dataset Exploration and Mining
Dataset Exploration and Mining - Comprehensive guide to dataset analysis and querying.
What you’ll learn:
Understanding merged dataset structure (.dataset, .infoset, .attribset files)
Using DatasetExplorer for discovery and filtering
Querying data at different granularities (arrays, groups, files)
Creating distributions and analyzing statistics
Using DatasetLoader for ML preparation
Performing stratified train/validation/test splits
Creating PyTorch DataLoaders for training
When to use this guide:
You have Flow-generated datasets and want to explore them
You need to filter CAD files by geometric/topological criteria
You want to understand dataset distributions before training
You’re preparing data for custom ML workflows
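For readers unfamiliar with stratified splitting, the following is a minimal standard-library sketch of the technique. It does not use or reproduce the DatasetLoader API; the function name `stratified_split`, its parameters, and the 80/10/10 default are assumptions chosen for illustration. Indices are grouped by label, shuffled within each group, and each group is divided in the same proportions so that every split keeps roughly the original class balance.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable],
    fractions: Tuple[float, float, float] = (0.8, 0.1, 0.1),
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Return (train, val, test) index lists with per-class proportions kept."""
    by_label: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[int] = []
    val: List[int] = []
    test: List[int] = []
    for label in sorted(by_label, key=repr):  # deterministic class order
        indices = by_label[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to test
    return train, val, test


# Hypothetical labels: 10 bolts followed by 10 brackets.
labels = ["bolt"] * 10 + ["bracket"] * 10
train_idx, val_idx, test_idx = stratified_split(labels)
```

With very small classes the rounding can leave a split empty for that class, so it is worth checking per-class counts before training.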
Part Classification with Graph Learning
Parts Classification Model - Graph-level classification for whole-part categorization.
What you’ll learn:
Understanding the GraphClassification model architecture
Initializing models with custom class counts
CAD encoding strategy (face discretization, edge features, graph structure)
Integration with Flow tasks using @flowtask decorators
Complete FABWAVE dataset example (45 part classes)
Training workflows with FlowTrainer
Inference and deployment with FlowInference
When to use this guide:
You need to classify entire CAD parts into categories
You want to recognize part types or shape families
You’re building design retrieval or recommendation systems
You need manufacturing process selection based on part geometry
CAD Feature Recognition Model
CAD Feature Recognition Model - Node-level classification for per-face feature recognition.
What you’ll learn:
Understanding the GraphNodeClassification model architecture
Initializing models with Transformer hyperparameters
Rich encoding strategy (extended graph features, angle/distance histograms)
Integration with Flow tasks using @flowtask decorators
Complete CADSynth-AAG dataset example (162k models)
Handling node-level labels and per-face predictions
Advanced hyperparameter tuning for production models
When to use this guide:
You need to identify machining features on individual faces
You’re performing face semantic segmentation
You’re building manufacturing process planning tools
You need detailed geometric feature analysis
Develop Your Own ML Model
Develop Your Own ML Model - Custom model development using the FlowModel interface.
What you’ll learn:
Understanding the FlowModel abstract interface
Implementing custom encoding strategies
Defining model architectures with PyTorch Lightning
Integration with FlowTrainer for batch training
Integration with FlowInference for deployment
Ensuring encoding consistency between training and inference
Best practices for custom model development
When to use this guide:
Pre-built models don’t match your specific use case
You need custom feature extraction or graph structures
You want to integrate third-party ML architectures
You’re researching novel CAD ML approaches
Quick Start
New to HOOPS AI ML?
Start with Dataset Exploration and Mining to understand how to query and analyze CAD datasets. This foundation is essential before training any models.
Ready to Implement?
- Part classification:
→ Parts Classification Model - Learn GraphClassification for whole-part categorization
- Face segmentation:
→ CAD Feature Recognition Model - Learn GraphNodeClassification for per-face feature recognition
- Custom models:
→ Develop Your Own ML Model - Learn the FlowModel interface for custom architectures
Need Flow Integration Examples?
See complete integration examples in:
Parts Classification Model - FABWAVE dataset with 45 part classes
CAD Feature Recognition Model - CADSynth-AAG dataset with 162k models
Additional Resources
- Flow Framework
See Data Flow Management for the complete Flow system guide covering task orchestration, parallel execution, and dataset merging.
- CAD Data Encoding
See CAD Data Encoding for detailed information about feature extraction from CAD files.
- Visualization
See Data Visualization Experience for tools to visualize predictions on 3D geometry.