CAD Feature Recognition Model
Introduction
CAD feature recognition is a fundamental task in manufacturing and design automation, enabling automatic identification of machining features on individual faces of mechanical components. This capability supports various downstream applications including manufacturing process planning, design rule checking, feature-based similarity search, and automated design analysis.
HOOPS AI provides the GraphNodeClassification model, a node-level (per-face) classifier specifically designed for feature recognition tasks. This model operates directly on Boundary Representation (B-rep) data from 3D CAD models, leveraging both geometric features and topological relationships to produce accurate face-level classifications.
Use Cases
The feature recognition capability addresses several practical scenarios:
- Machining Feature Recognition
Automatically identify machining features such as holes, pockets, slots, and chamfers on individual faces. This supports automated manufacturing process planning and CNC programming.
- Face Semantic Segmentation
Classify each face based on its functional or manufacturing characteristics, enabling automated design analysis and quality checking.
- Manufacturing Process Planning
Determine appropriate machining operations for each feature based on face geometry and topology, supporting automated process selection.
- Design Rule Checking
Validate that design features conform to manufacturing constraints and design standards.
- Feature-Based Similarity Search
Enable intelligent search and retrieval based on specific feature types and configurations.
Model Overview
The GraphNodeClassification model implements a node-level classification architecture that processes CAD models through the following pipeline:
1. Graph Representation: Convert B-rep models into graph representations where nodes are faces and edges are adjacency relationships
2. Rich Feature Encoding: Extract geometric and topological features including face discretization, neighbor relationships, angle histograms, and distance histograms
3. Feature Learning: Apply Transformer-based encoders and Graph Attention Networks to capture both local and global context
4. Per-Face Classification: Produce a classification label for each individual face in the CAD model
This approach captures both local geometric details and global topological context, providing robust feature recognition even for complex mechanical parts.
Note
Attribution Notice
This implementation is based on a third-party architecture. For complete attribution and citation information, see Acknowledgments.
Model Architecture
Overview
The GraphNodeClassification model takes B-rep (Boundary Representation) models from CAD files and converts them into graph-based representations suitable for machine learning. This conversion enables per-face classification using a combined Transformer and Graph Neural Network (GNN) approach.
Graph Representation
The model uses a graph-based representation of CAD data:
Nodes: Represent the individual faces of the CAD model
Edges: Capture adjacency relationships between faces where they share common edges
Node Features: Encode rich geometric and topological information for each face
Edge Features: Represent curve geometry and relationships between connected faces
Neural Network Components
The model employs several neural network components:
Transformer-based Encoder: Implements a multi-layer attention mechanism for feature processing
Graph Attention Networks: Aggregate and propagate information from neighboring nodes
MLP Classifier: Produces per-node classification predictions for each face
Key Features
The architecture includes several important capabilities:
Local Geometric Encoding: Captures surface shape through face-level discretization samples
Global Topological Encoding: Leverages the graph structure to capture part-level contextual information
Transfer Learning: Supports two-step training approach from synthetic to real CAD models
Attention Mechanisms: Enables the model to focus on relevant geometric relationships
┌─────────────────────────────────────────────────────────────────┐
│ CAD B-rep Model │
│ (Faces + Adjacency Edges) │
└────────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Graph Representation │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Nodes: Individual faces │ │
│ │ Edges: Face adjacency (shared edges) │ │
│ └──────────────────────────────────────────────────────────┘ │
└───────────────────────────────┬─────────────────────────────────┘
│
┌────────────┴────────────┐
▼ ▼
┌───────────────────────────┐ ┌───────────────────────────┐
│ Node Features (Rich) │ │ Edge Features │
│ │ │ │
│ • Face discretization │ │ • U-grids (10, 6) │
│ (pointsamples, 7) │ │ • Edge attributes │
│ • Face attributes │ │ • Path information │
│ • Neighbor count │ │ │
│ • Angle histograms │ │ │
│ • Distance histograms │ │ │
└─────────────┬─────────────┘ └─────────────┬─────────────┘
│ │
└──────────────┬──────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ GraphNodeClassification │
└─────────────────────────────────────────────────────────────────┘
Output
The model produces a classification label for each face in the CAD model.
Technology Details
This implementation is based on a state-of-the-art architecture for machining feature recognition in B-rep models. The architecture employs:
Rich geometric and topological feature encoding
Transformer-based attention mechanisms for capturing long-range dependencies
Graph neural networks for local neighborhood aggregation
Domain adaptation capabilities for transferring from synthetic to real data
Model Initialization
Before using the GraphNodeClassification model, you must initialize it with configuration parameters.
Basic Usage
The simplest initialization requires only the number of feature classes:
from hoops_ai.ml.EXPERIMENTAL import GraphNodeClassification
# Create model with default parameters
flow_model = GraphNodeClassification(
    num_classes=25,
    result_dir="./results"
)
This creates a GraphNodeClassification instance configured for a 25-class feature recognition task, with results saved to the ./results directory.
Parameters
The GraphNodeClassification initialization accepts several parameter categories:
Model Architecture Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| num_classes | int | 25 | Number of feature classes (e.g., hole, pocket, slot) |
| n_layers_encode | int | 8 | Number of Transformer encoder layers |
| dim_node | int | 256 | Node embedding dimension |
| d_model | int | 512 | Transformer model dimension |
| n_heads | int | 32 | Number of attention heads |
| dropout | float | 0.3 | Classifier dropout rate |
|  | float | 0.3 | Attention mechanism dropout |
|  | float | 0.3 | Activation layer dropout |
Training Hyperparameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| learning_rate | float | 0.002 | Initial learning rate |
|  | Tuple[float, float] | (0.99, 0.999) | AdamW optimizer betas |
|  | float | 0.5 | LR reduction factor |
|  | int | 5 | Patience for LR scheduler |
|  | float | 1e-4 | Threshold for scheduler |
|  | float | 1e-6 | Minimum learning rate |
|  | int | 2 | Cooldown period after LR reduction |
|  | int | 5000 | Warmup steps for learning rate |
Logging and Output
| Parameter | Type | Default | Description |
|---|---|---|---|
| log_file | str | 'training_errors.log' | Error logging file path |
| result_dir | str | None | Results output directory |
| generate_stream_cache_for_visu | bool | False | Generate visualization cache |
Advanced Configuration
For production workflows, provide explicit configuration:
flow_model = GraphNodeClassification(
    # Architecture
    num_classes=25,
    n_layers_encode=8,
    dim_node=256,
    d_model=512,
    n_heads=32,
    dropout=0.3,

    # Training
    learning_rate=0.002,

    # Logging
    result_dir="./experiments/feature_recognition",
    log_file="training_errors.log",
    generate_stream_cache_for_visu=True  # Enable for debugging
)
CAD Encoding Strategy
The GraphNodeClassification model uses a richer encoding strategy compared to GraphClassification. While GraphClassification uses standard face discretization and edge features, GraphNodeClassification adds extended topological features including neighbor count, angle histograms between adjacent faces, and distance histograms to neighbor centroids. These additional features provide richer contextual information for per-face classification tasks.
Encoding Pipeline
The encode_cad_data() method follows this pipeline:
def encode_cad_data(self, cad_file: str, cad_loader: CADLoader, storage: DataStorage):
    # 1. Configure CAD loading
    general_options = cad_loader.get_general_options()
    general_options["read_feature"] = True
    general_options["read_solid"] = True

    # 2. Load model
    model = cad_loader.create_from_file(cad_file)

    # 3. Configure BREP with UV computation
    hoopstools = HOOPSTools()
    brep_options = hoopstools.brep_options()
    brep_options["force_compute_uv"] = True
    brep_options["force_compute_3d"] = True
    hoopstools.adapt_brep(model, brep_options)

    # 4. Encode features (RICHER than GraphClassification)
    brep_encoder = BrepEncoder(model.get_brep(body_index=0), storage)

    # Standard features
    brep_encoder.push_face_adjacency_graph()
    brep_encoder.push_face_attributes()
    brep_encoder.push_face_discretization(pointsamples=25)
    brep_encoder.push_edge_attributes()
    brep_encoder.push_curvegrid(10)
    brep_encoder.push_face_pair_edges_path(16)

    # Extended features (specific to GraphNodeClassification)
    brep_encoder.push_extended_adjacency()
    brep_encoder.push_face_neighbors_count()
    brep_encoder.push_average_face_pair_angle_histograms(5, 64)
    brep_encoder.push_average_face_pair_distance_histograms(5, 64)
Feature Specifications
The encoding process extracts different types of features from the CAD model’s B-rep structure. These features are organized into three categories: node features (attached to each face), edge features (attached to connections between faces), and the overall graph structure that represents how faces connect to each other.
Node Features (Per Face):
- Face discretization: (pointsamples, 7) surface geometry sampled points
- Face attributes: Surface type, area, loop count
- Neighbor count: Number of adjacent faces
- Angle histograms: Distribution of dihedral angles with neighbors
- Distance histograms: Distribution of distances to neighbor centroids
Edge Features (Per Face-Face Connection):
- U-grids: (10, 6) shared edge curve geometry
- Edge attributes: Curve type, length, dihedral angle
- Path information: Shortest path metrics
Graph Structure:
- Nodes: All faces in the CAD model
- Edges: Face adjacency (two faces share an edge)
- Extended Adjacency: Multi-hop neighbor relationships
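As a concrete illustration of this structure, the sketch below assembles a small face-adjacency graph with per-face features in DGL. The arrays and feature key names are made up for the example; in practice the real arrays are produced by BrepEncoder and written through DataStorage.

import dgl
import torch

# Hypothetical 4-face part with three adjacencies, stored as symmetric directed edges
src = torch.tensor([0, 1, 1, 2, 2, 3])
dst = torch.tensor([1, 0, 2, 1, 3, 2])

g = dgl.graph((src, dst), num_nodes=4)

# Node features: one row per face, e.g. flattened discretization samples
g.ndata["x"] = torch.randn(4, 25 * 7)           # (num_faces, pointsamples * 7)
g.ndata["neighbor_count"] = torch.tensor([1, 2, 2, 1])

# Edge features: one row per adjacency relation, e.g. flattened U-grid samples
g.edata["u_grid"] = torch.randn(6, 10 * 6)      # (num_edges, 10 * 6)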
Mathematical Representation
For each face \(f_i\) with neighbors \(\mathcal{N}(f_i)\):
Node Embedding: an initial embedding for face \(f_i\) is built from:
- \(\mathbf{S}_i\), the face discretization sample points
- \(\mathbf{a}_i\), the face attributes (type, area, etc.)
- \(\theta_{ij}\), the dihedral angle between faces \(i\) and \(j\)
- \(d_{ij}\), the distance between the centroids of faces \(i\) and \(j\)
Transformer-based Message Passing: the embeddings are refined by exchanging information with adjacent faces through attention over the adjacency graph.
Per-Face Classification: each refined embedding is mapped to a class label by the MLP classifier.
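A minimal sketch of these three stages, assuming a generic attention-based formulation (the embedding function \(\phi\), attention weights \(\alpha_{ij}^{(l)}\), and layer count \(L\) are model-specific):

\[
\mathbf{h}_i^{(0)} = \phi\bigl(\mathbf{S}_i,\ \mathbf{a}_i,\ |\mathcal{N}(f_i)|,\ \mathrm{hist}(\theta_{ij}),\ \mathrm{hist}(d_{ij})\bigr)
\]

\[
\mathbf{h}_i^{(l+1)} = \mathbf{h}_i^{(l)} + \sum_{j \in \mathcal{N}(f_i)} \alpha_{ij}^{(l)}\,\mathbf{W}^{(l)}\mathbf{h}_j^{(l)}
\]

\[
\hat{y}_i = \arg\max_{c}\ \mathrm{softmax}\bigl(\mathrm{MLP}(\mathbf{h}_i^{(L)})\bigr)_c
\]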
Integration with Flow Tasks
Overview
Like GraphClassification, the GraphNodeClassification model integrates with HOOPS AI’s Flow framework. However, there are key differences due to node-level labels.
Pattern: Wrapping FlowModel Methods
The key insight is to instantiate the FlowModel once at the module level, then call its methods inside decorated Flow tasks:
from hoops_ai.flowmanager import flowtask
# 1. Create FlowModel instance
flow_model = GraphNodeClassification(num_classes=25, result_dir="./results")
# 2. Wrap encode_cad_data() in a Flow task
@flowtask.transform(
    name="advanced_cad_encoder",
    inputs=["cad_file", "cad_loader", "storage"],
    outputs=["face_count", "edge_count"]
)
def my_encoder(cad_file: str, cad_loader, storage):
    # Call the FlowModel's encoding method
    face_count, edge_count = flow_model.encode_cad_data(cad_file, cad_loader, storage)

    # Optional: Add custom label processing
    # ... your label code here ...

    # Optional: Convert to graph
    flow_model.convert_encoded_data_to_graph(storage, graph_handler, filename)

    return face_count, edge_count
Key Difference: Node-Level Labels
Unlike graph classification (one label per file), node classification requires one label per face:
# Graph classification (GraphClassification)
storage.save_data("file_label", np.array([3])) # Single label
# Node classification (GraphNodeClassification)
storage.save_data("face_labels", np.array([0, 1, 1, 2, 0, ...])) # Label per face
Complete Example: CADSynth-AAG Dataset
This example demonstrates processing the CADSynth-AAG segmentation dataset (162k models) using GraphNodeClassification integrated with HOOPS AI Flows.
Complete Implementation
"""
CADSynth-AAG Segmentation Preprocessing using GraphNodeClassification FlowModel
"""
import pathlib
from typing import Tuple, List
# Flow framework imports
from hoops_ai.flowmanager import flowtask
import hoops_ai
from hoops_ai.cadaccess import HOOPSLoader, CADLoader
from hoops_ai.storage import (
    DataStorage,
    MLStorage,
    CADFileRetriever,
    LocalStorageProvider,
    DGLGraphStoreHandler
)
from hoops_ai.dataset import DatasetExplorer
# FlowModel import
from hoops_ai.ml.EXPERIMENTAL import GraphNodeClassification
from hoops_ai.storage.label_storage import LabelStorage
# ==================================================================================
# CONFIGURATION
# ==================================================================================
flows_inputdir = r"C:\Temp\Cadsynth_aag\step"
flows_outputdir = str(pathlib.Path(flows_inputdir))
datasources_dir = str(flows_inputdir)
# ==================================================================================
# FLOWMODEL INSTANTIATION
# ==================================================================================
flowName = "cadsynth_aag_162k_flowtask"
flow_model = GraphNodeClassification(
    num_classes=25,  # Will be updated when labels are available
    result_dir=str(pathlib.Path(flows_inputdir).joinpath("flows").joinpath(flowName))
)
# ==================================================================================
# FLOW TASK DEFINITIONS
# ==================================================================================
@flowtask.extract(
    name="gather_cad_files_to_be_treated",
    inputs=["cad_datasources"],
    outputs=["cad_dataset"]
)
def my_demo_gatherer(source: str) -> List[str]:
    """
    Gather all CAD files from the dataset directory.
    """
    cad_formats = [".stp", ".step"]
    local_provider = LocalStorageProvider(directory_path=source)
    retriever = CADFileRetriever(
        storage_provider=local_provider,
        formats=cad_formats
    )
    return retriever.get_file_list()
@flowtask.transform(
    name="advanced_cad_encoder",
    inputs=["cad_file", "cad_loader", "storage"],
    outputs=["face_count", "edge_count"]
)
def my_demo_encoder(cad_file: str, cad_loader: HOOPSLoader, storage: DataStorage) -> Tuple[int, int]:
    """
    Encode CAD data using GraphNodeClassification FlowModel.
    """
    # ===== CALL FLOWMODEL METHOD =====
    face_count, edge_count = flow_model.encode_cad_data(cad_file, cad_loader, storage)
    # =================================

    # Convert encoded data to DGL graph file
    location = pathlib.Path(storage.get_file_path("."))
    dgl_output_path = pathlib.Path(location.parent.parent / "dgl" / f"{location.stem}.ml")
    dgl_output_path.parent.mkdir(parents=True, exist_ok=True)

    # ===== CALL FLOWMODEL METHOD =====
    flow_model.convert_encoded_data_to_graph(storage, DGLGraphStoreHandler(), str(dgl_output_path))
    # =================================

    return face_count, edge_count
# ==================================================================================
# FLOW ORCHESTRATION
# ==================================================================================
def main():
    """
    Execute the CADSynth-AAG preprocessing flow.
    """
    cad_flow = hoops_ai.create_flow(
        name=flowName,
        tasks=[
            my_demo_gatherer,
            my_demo_encoder
        ],
        max_workers=40,
        flows_outputdir=str(flows_outputdir),
        ml_task="Feature Recognition with GraphNodeClassification",
    )

    output, dict_data, flow_file = cad_flow.process(
        inputs={'cad_datasources': [datasources_dir]}
    )
    print(output.summary())

    explorer = DatasetExplorer(flow_output_file=str(flow_file))
    explorer.print_table_of_contents()

if __name__ == "__main__":
    main()
Key Integration Points
FlowModel Instantiation
Create the GraphNodeClassification model instance once at the module level to ensure consistency across all Flow tasks:
# Instantiate ONCE at module level
flow_model = GraphNodeClassification(num_classes=25, result_dir="./results")
Encoding Task Wrapper
Wrap the FlowModel’s encode_cad_data() method inside a Flow task to enable batch processing while maintaining access to custom logic:
@flowtask.transform(...)
def my_demo_encoder(cad_file, cad_loader, storage):
    # Call FlowModel method directly
    face_count, edge_count = flow_model.encode_cad_data(cad_file, cad_loader, storage)

    # Add custom logic (labels, schema, etc.)
    # ...

    # Call another FlowModel method
    flow_model.convert_encoded_data_to_graph(storage, graph_handler, filename)

    return face_count, edge_count
Rich Feature Encoding
This demonstrates the extended features that GraphNodeClassification encodes beyond the standard GraphClassification approach:
# Inside flow_model.encode_cad_data():
brep_encoder.push_face_adjacency_graph() # Standard graph structure
brep_encoder.push_face_discretization(pointsamples=25) # Standard face sampling
brep_encoder.push_extended_adjacency() # Extended topological features
brep_encoder.push_face_neighbors_count() # Neighbor counting
brep_encoder.push_average_face_pair_angle_histograms(5, 64) # Angle relationships
brep_encoder.push_average_face_pair_distance_histograms(5, 64) # Distance relationships
Output Structure
The Flow generates a structured output directory containing encoded CAD data, graph files ready for machine learning, and execution metadata:
Cadsynth_aag/
└── flows/
└── cadsynth_aag_162k_flowtask/
├── encoded/ # Individual .data files
├── dgl/ # DGL graph files for ML training
└── flow_output.json # Flow execution summary
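Before moving on to training, a quick sketch for confirming that the encoded and graph files were actually produced (the paths below follow the layout above and are otherwise hypothetical):

import pathlib

flow_dir = pathlib.Path("Cadsynth_aag/flows/cadsynth_aag_162k_flowtask")

# Count the per-file outputs produced by the Flow
encoded_files = list((flow_dir / "encoded").glob("*"))
graph_files = list((flow_dir / "dgl").glob("*.ml"))

print(f"Encoded files: {len(encoded_files)}")
print(f"Graph files:   {len(graph_files)}")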
Training Workflow
Step 1: Preprocess Dataset
Use the Flow example above to create ML-ready graph files.
Step 2: Prepare Face-Level Labels
Important: Node classification requires labels for each face in each model.
# Example: Load face labels from annotation file
import json
import numpy as np

def load_face_labels(cad_file, annotation_file):
    """
    Load per-face labels from annotation.
    Returns array of shape (num_faces,) with class labels.
    """
    with open(annotation_file) as f:
        annotations = json.load(f)

    # Extract face labels for this CAD file
    face_labels = annotations[cad_file]["face_labels"]
    return np.array(face_labels)
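One way to wire this into the encoding task from the Flow example is sketched below; my_labeled_encoder and annotations.json are hypothetical names, and the face_labels storage key follows the convention shown earlier.

@flowtask.transform(
    name="advanced_cad_encoder",
    inputs=["cad_file", "cad_loader", "storage"],
    outputs=["face_count", "edge_count"]
)
def my_labeled_encoder(cad_file, cad_loader, storage):
    face_count, edge_count = flow_model.encode_cad_data(cad_file, cad_loader, storage)

    # Attach one label per face before converting the encoded data to a graph
    face_labels = load_face_labels(cad_file, annotation_file="annotations.json")
    storage.save_data("face_labels", face_labels)

    return face_count, edge_count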
Step 3: Load Dataset
from hoops_ai.dataset import DatasetLoader
# Load preprocessed graphs
# Note: For node classification, labels are stored IN the graph files
dataset_loader = DatasetLoader(
    graph_files=["./flows/cadsynth_aag_162k_flowtask/dgl/*.ml"]
    # No separate label files - labels are per-face in graph
)
# Split into train/val/test
dataset_loader.split_data(train_ratio=0.7, val_ratio=0.15, test_ratio=0.15)
Step 4: Create Trainer
from hoops_ai.ml import FlowTrainer
trainer = FlowTrainer(
    flowmodel=flow_model,  # Same instance used in Flow
    datasetLoader=dataset_loader,
    batch_size=32,
    num_workers=4,
    experiment_name="machining_feature_recognition",
    accelerator='gpu',
    devices=1,
    max_epochs=100,
    result_dir="./experiments"
)
Step 5: Train Model
# Train and get best checkpoint
best_checkpoint = trainer.train()
print(f"Training complete! Best model: {best_checkpoint}")
# Evaluate on test set
trainer.test(trained_model_path=best_checkpoint)
# Access metrics
metrics = trainer.metrics_storage()
train_loss = metrics.get("train_loss")
val_accuracy = metrics.get("val_node_accuracy") # Note: node-level accuracy
Step 6: Monitor Training
# Launch TensorBoard
tensorboard --logdir=./experiments/ml_output/machining_feature_recognition/
Metrics to Monitor:
- train_loss: Per-face cross-entropy loss
- val_node_accuracy: Percentage of correctly classified faces
- val_per_class_accuracy: Accuracy for each feature class
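To inspect these curves outside TensorBoard, a quick sketch is shown below; it assumes metrics_storage().get() returns one value per epoch, which may differ in your installation.

import matplotlib.pyplot as plt

metrics = trainer.metrics_storage()
val_acc = metrics.get("val_node_accuracy")  # assumed: a sequence of per-epoch values

plt.plot(val_acc)
plt.xlabel("Epoch")
plt.ylabel("val_node_accuracy")
plt.title("Per-face classification accuracy")
plt.show()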
Inference Workflow
Single File Inference
from hoops_ai.ml import FlowInference
from hoops_ai.cadaccess import HOOPSLoader
# Setup
cad_loader = HOOPSLoader()
inference = FlowInference(
    cad_loader=cad_loader,
    flowmodel=flow_model,  # Same instance used in training
    log_file='inference_errors.log'
)
# Load trained model
inference.load_from_checkpoint("./experiments/best.ckpt")
# Predict on new CAD file
batch = inference.preprocess("new_part.step")
predictions = inference.predict_and_postprocess(batch)
# Interpret results (PER-FACE predictions)
face_predictions = predictions['node_predictions'] # [N_faces]
face_confidences = predictions['node_probabilities'] # [N_faces, N_classes]
num_faces = predictions['num_faces']
print(f"Model has {num_faces} faces")
for i in range(num_faces):
    pred_class = face_predictions[i]
    confidence = face_confidences[i][pred_class]
    print(f"Face {i}: Class {pred_class} (confidence: {confidence:.2%})")
Visualizing Face Predictions
# Map predictions to CAD model for visualization
feature_names = {
    0: "Base",
    1: "Hole",
    2: "Pocket",
    3: "Slot",
    # ... etc
}

# Simple RGB color per feature name (user-defined palette)
feature_colors = {
    "Base": (200, 200, 200),
    "Hole": (255, 0, 0),
    "Pocket": (0, 128, 255),
    "Slot": (0, 200, 0),
}

# Create color-coded face map
face_colors = []
for pred in face_predictions:
    feature = feature_names[pred]
    face_colors.append(feature_colors[feature])

# Export for visualization
# (Use HOOPS Communicator or other CAD viewer)
Hyperparameter Tuning
Architecture Hyperparameters
# Baseline (default)
flow_model = GraphNodeClassification(
    num_classes=25,
    n_layers_encode=8,
    dim_node=256,
    d_model=512,
    n_heads=32,
)

# Larger model for complex features
flow_model = GraphNodeClassification(
    num_classes=25,
    n_layers_encode=12,  # More layers
    dim_node=512,        # Larger embedding
    d_model=1024,        # Larger transformer
    n_heads=16,          # Fewer, wider attention heads
)
Training Hyperparameters
trainer = FlowTrainer(
    flowmodel=flow_model,
    datasetLoader=dataset_loader,

    # Batch size
    batch_size=64,    # Try: 16, 32, 64, 128

    # Epochs
    max_epochs=200,   # Try: 50, 100, 200

    # Device
    accelerator='gpu',
    devices=1,
)
Face Discretization Resolution
# In your encoding task:
def my_encoder(cad_file, cad_loader, storage):
    face_count, edge_count = flow_model.encode_cad_data(cad_file, cad_loader, storage)

    # Override the default encoding with a custom resolution: re-load the model
    # (encode_cad_data loads it internally) and push higher-density samples
    model = cad_loader.create_from_file(cad_file)
    brep_encoder = BrepEncoder(model.get_brep(body_index=0), storage)
    brep_encoder.push_face_discretization(pointsamples=50)  # Higher resolution (default: 25)
    brep_encoder.push_curvegrid(20)                         # Higher resolution (default: 10)

    return face_count, edge_count
Troubleshooting
Issue: Shape Mismatch During Training
Symptom:
RuntimeError: expected shape [B, 25, 7], got [B, 40, 7]
Cause:
Inconsistent face discretization resolution between files.
Solution:
# Ensure all files use the same number of sample points
brep_encoder.push_face_discretization(pointsamples=25)  # Always use 25 points
Issue: Missing Face Labels
Symptom:
KeyError: 'face_labels'
Cause:
Node classification requires per-face labels, but they are not present in the graph files.
Solution:
# Ensure face labels are saved during encoding
storage.save_data("face_labels", face_label_array)  # Array of shape (num_faces,)
Issue: Low Accuracy
Possible Causes:
Insufficient training data: Collect more annotated samples
Class imbalance: Use weighted loss or data augmentation
Poor feature encoding: Verify extended features are being extracted
Hyperparameters: Try different model sizes or learning rates
Solutions:
# Check feature encoding
explorer = DatasetExplorer(flow_output_file="...")  # Verify extended features are present

# Use class weights during training (requires model modification)
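For the class imbalance case, one common remedy is inverse-frequency class weighting fed into a weighted cross-entropy loss. The sketch below is not part of the GraphNodeClassification API; all_face_labels is a hypothetical array of training labels, and the loss would still need to be wired into the model's training step.

import numpy as np
import torch

# all_face_labels: concatenated per-face labels from the training split (placeholder data)
all_face_labels = np.random.randint(0, 25, size=10000)

counts = np.bincount(all_face_labels, minlength=25)
weights = counts.sum() / np.maximum(counts, 1)  # inverse-frequency weighting
weights = weights / weights.mean()              # normalize around 1.0

# Weighted per-face cross-entropy
criterion = torch.nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float32))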
Conclusion
GraphNodeClassification provides a production-ready implementation of a node-level classifier for CAD feature recognition. By following the FlowModel interface, it seamlessly integrates with HOOPS AI’s Flow framework for batch preprocessing and supports both training and inference workflows with guaranteed encoding consistency.
Key Takeaways:
Instantiate GraphNodeClassification once at module level
Wrap its methods in @flowtask decorated tasks
Use the same instance for training (FlowTrainer) and inference (FlowInference)
Ensure per-face labels are provided for node classification
Leverage rich feature encoding for better performance
Attribution: This implementation is based on a third-party architecture. When publishing research using this model, please refer to Acknowledgments for proper citation.