hoops_ai.ml.EXPERIMENTAL

Quick Overview

Classes

GraphClassification([num_classes, ...])

GraphClassification is a user-friendly wrapper around the UVNet model.

GraphNodeClassification([num_classes, ...])

GraphNodeClassification is a user-friendly wrapper around the BrepSeg model.

CustomFlowModel([encode_cad_data_fn, ...])

Child class with default 'dummy' implementations of the abstract methods.

FlowTrainer([flowmodel, datasetLoader, ...])

FlowInference(cad_loader, flowmodel[, log_file])

EmbeddingFlowModel([result_dir, log_file, ...])

FlowModel adapter so FlowTrainer can train embedding contrastive model.

class hoops_ai.ml.EXPERIMENTAL.EmbeddingFlowModel(result_dir=None, log_file='Embedding_model_training_errors.log', generate_stream_cache_for_visu=False, face_weight_mode='power', face_alpha=1.3, face_tau=2.0, face_mix_uniform=0.05, face_eps=1e-06, edge_weight_mode='power', edge_alpha=1.3, edge_tau=2.0, edge_mix_uniform=0.05, edge_eps=1e-06, use_face_type_onehot=True, face_type_num_classes=16, face_type_max=15, uv_channels=7, lr=0.0003, weight_decay=0.0, emb_dim=1024, proj_dim=512, surf_in_ch=23, curve_in_ch=6, proj_hidden=1024, use_bn=True, temp_init=0.05, temp_min=0.01, temp_max=0.2, surf_c=(32, 64, 128), curve_c=(32, 64, 128), conv_kernel=3, conv_pad=1, aug_noise_std=0.005, aug_scale_min=0.85, aug_scale_max=1.2, p_hflip=0.5, p_vflip=0.5, p_rot90=0.5, p_cutout=0.3, cutout_frac_min=0.08, cutout_frac_max=0.2, channel_jitter_std=0.005, curve_noise_std_xyz=0.005, curve_noise_std_other=0.003, curve_p_reverse=0.5, curve_p_dropout=0.04, curve_augment_smooth=False, node_weight_drop_prob=0.5, edge_weight_drop_prob=0.5, weight_beta_a=10.0, weight_beta_b=10.0, loss_type='hard', hard_pos_weight=2.0, hard_neg_weight=3.0, sc_weight=0.0, proj=False, load_checkpoint_using_nn_module=False)

Bases: FlowModel

FlowModel adapter so FlowTrainer can train embedding contrastive model.

static build_body_filepath(filename, body_index, default_extension='.ml')

Build a filename with a body index suffix (e.g. _0, _1).

The suffix _{body_index} is always appended to the stem of the filename. If filename has a file extension it is treated as a file path and the suffix is inserted before the extension. Otherwise the default_extension is appended after the suffix.

Parameters:
  • filename (str) – Original file path or hash id.

  • body_index (int) – Zero-based body index to append.

  • default_extension (str) – Extension to use when the filename has no extension. Defaults to ".ml".

Returns:

The filename with the body index suffix applied.

Return type:

str
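The suffix rule described above can be illustrated with a small standalone sketch. This is not the library's implementation: the function name `body_filepath_sketch` is hypothetical, and the sketch assumes that "has a file extension" means `os.path.splitext` finds one.

```python
import os


def body_filepath_sketch(filename: str, body_index: int,
                         default_extension: str = ".ml") -> str:
    """Hypothetical re-implementation of the documented suffix rule."""
    stem, ext = os.path.splitext(filename)
    if ext:
        # Extension present: insert the suffix before the extension.
        return f"{stem}_{body_index}{ext}"
    # No extension (e.g. a hash id): append the suffix, then the default extension.
    return f"{filename}_{body_index}{default_extension}"


print(body_filepath_sketch("parts/bracket.step", 0))  # parts/bracket_0.step
print(body_filepath_sketch("a1b2c3", 1))              # a1b2c3_1.ml
```

The actual static method may treat edge cases (multiple dots, hidden files) differently from this sketch.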

collate_function(batch)

Return a collated batch for this model.

Return type:

Any

convert_encoded_data_to_graph(storage, graph_handler, filename)

Converts encoded data from storage into a graph representation, which serves as input for the ML model.

Return type:

Dict[str, Any]

encode_cad_data(cad_file, cad_loader, storage)

Opens the CAD file and encodes its data into a format suitable for machine learning. Stores the encoded data using the provided storage handler.

Return type:

Tuple[int, int]

encode_label_data(label_storage, storage)

Uses the LabelStorage object to retrieve the labeling information for a given input, and stores the label data for the specific machine learning task.

Returns the str key when the label data is found in the storage object, together with the size of the label data.

Return type:

Tuple[str, int]

load_model_input_from_files(graph_file, data_id, label_file=None)

Loads the graph created by convert_encoded_data_to_graph from a file. The return value is exactly the input expected by the machine learning model. The label_file argument is optional; if it is not given, the method should still return a valid object.

The DatasetLoader calls this method multiple times. For inference, it is called with label_file set to None.

Parameters:
  • graph_file (str)

  • data_id (int)

  • label_file (str)

Return type:

Any

make_weights(a, mode='power', alpha=0.5, tau=2.0, mix_uniform=0.0, eps=1e-06)

metrics()

Publishes the ML metrics after training the model.

model_name()

Provides the name of the model.

Return type:

str

predict_and_postprocess(batch)

Post-processes and formats the raw model output into a structured prediction.

Return type:

Any

retrieve_model(check_point=None)

Retrieves the PyTorch Lightning model used in this flow.

Parameters:

check_point (str)

Return type:

pytorch_lightning.LightningModule

class hoops_ai.ml.EXPERIMENTAL.GraphClassification(num_classes=10, result_dir=None, log_file='flow_model_graph_training_errors.log', generate_stream_cache_for_visu=False, use_gnn_surface_encoder=True, layer_type='gat', num_heads=4, dropout=0.1, graph_emb_dim=128)

Bases: FlowModel

GraphClassification is a user-friendly wrapper around the UVNet model. It provides default hyperparameters and an interface for users to interact without directly accessing the UVNet class.

Parameters:
  • num_classes (int, optional) – Number of classes for classification. Default: 10

  • log_file (str, optional) – Path to the log file. Default: ‘flow_model_graph_training_errors.log’

  • layer_type (str, optional) – Type of graph layer: “conv”, “gat” (attention), or “transformer”. Default: “gat”

  • graph_emb_dim (int, optional) – Dimension of graph-level embeddings. Default: 128

  • result_dir (str)

  • generate_stream_cache_for_visu (bool)

  • use_gnn_surface_encoder (bool)

  • num_heads (int)

  • dropout (float)

static build_body_filepath(filename, body_index, default_extension='.ml')

Build a filename with a body index suffix (e.g. _0, _1).

The suffix _{body_index} is always appended to the stem of the filename. If filename has a file extension it is treated as a file path and the suffix is inserted before the extension. Otherwise the default_extension is appended after the suffix.

Parameters:
  • filename (str) – Original file path or hash id.

  • body_index (int) – Zero-based body index to append.

  • default_extension (str) – Extension to use when the filename has no extension. Defaults to ".ml".

Returns:

The filename with the body index suffix applied.

Return type:

str

collate_function(batch)

Return a collate function for this model.

Return type:

Any

convert_encoded_data_to_graph(storage, graph_handler, filename)

Converts encoded data from storage into a graph representation, which serves as input for the ML model.

Return type:

Dict[str, Any]

encode_cad_data(cad_file, cad_loader, storage)

Opens the CAD file and encodes its data into a format suitable for machine learning. Stores the encoded data using the provided storage handler.

Return type:

Tuple[int, int]

encode_label_data(label_storage, storage)

Uses the LabelStorage object to retrieve the labeling information for a given input, and stores the label data for the specific machine learning task.

Returns the str key when the label data is found in the storage object, together with the size of the label data.

Return type:

Tuple[str, int]

extract_embeddings(batch)

Extract graph-level embeddings (before classification layer).

This method enables the model to return part embeddings instead of class predictions. Set the model to embedding mode, run inference, then restore to prediction mode.

Parameters:

batch – Input batch containing graph data

Returns:

Graph embeddings with shape (batch_size, graph_emb_dim), where graph_emb_dim is set in the constructor (default: 128).

Return type:

numpy.ndarray

Example

>>> flow_model = GraphClassification(num_classes=10, graph_emb_dim=256)
>>> embeddings = flow_model.extract_embeddings(batch)
>>> print(embeddings.shape)  # (batch_size, 256)

load_model_input_from_files(graph_file, data_id, label_file=None)

Loads a single graph from a file to be used as input for the machine learning model.

Parameters:
  • graph_file (str)

  • data_id (int)

  • label_file (str)

Return type:

Any

metrics()

Publishes the ML metrics after training the model.

Return type:

MetricStorage

model_name()

Returns the model name with surface encoder and layer type information.

Return type:

str

predict_and_postprocess(batch)

Post-processes and formats the raw model output into a structured prediction. Returns a numpy array with top 3 predictions and their probability percentages as integers.

Returns:

Array with shape (batch_size, 2, 3), where:

  • First dimension: batch items

  • Second dimension [0]: class indices (int)

  • Second dimension [1]: probability percentages (int)

Return type:

numpy.ndarray
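As an illustration of the documented output layout, the sketch below unpacks an array of shape (batch_size, 2, 3) into (class index, percentage) pairs. The array values are made-up stand-ins for a real return value, and the helper name `decode_top3` is hypothetical; only the shape convention comes from the description above.

```python
import numpy as np

# Dummy stand-in for the returned array: batch_size=2, shape (2, 2, 3).
preds = np.array([
    [[4, 1, 7], [83, 12, 5]],    # item 0: class indices, then percentages
    [[0, 3, 9], [60, 25, 15]],   # item 1
])


def decode_top3(preds: np.ndarray) -> list:
    """Pair each class index with its percentage for every batch item."""
    decoded = []
    for item in preds:
        class_ids, percents = item[0], item[1]
        decoded.append([(int(c), int(p)) for c, p in zip(class_ids, percents)])
    return decoded


for i, pairs in enumerate(decode_top3(preds)):
    print(i, pairs)
```

Whether the three entries are ordered by decreasing probability is not stated in this reference, so the sketch does not rely on any particular ordering.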

retrieve_model(check_point=None)

Retrieves the PyTorch Lightning model used in this flow.

Parameters:

check_point (str)

Return type:

pytorch_lightning.LightningModule

class hoops_ai.ml.EXPERIMENTAL.GraphNodeClassification(num_classes=25, n_layers_encode=8, dim_node=256, d_model=512, n_heads=32, dropout=0.3, attention_dropout=0.3, act_dropout=0.3, learning_rate=0.002, optimizer_betas=(0.99, 0.999), scheduler_factor=0.5, scheduler_patience=5, scheduler_threshold=0.0001, scheduler_min_lr=1e-06, scheduler_cooldown=2, max_warmup_steps=5000, log_file='training_errors.log', result_dir=None, generate_stream_cache_for_visu=False, **kwargs)

Bases: FlowModel

GraphNodeClassification is a user-friendly wrapper around the BrepSeg model. It provides default hyperparameters and an interface for users to interact without directly accessing the BrepSeg class.

Parameters:
  • num_classes (int, optional) – Number of classes for classification. Default: 25

  • n_layers_encode (int, optional) – Number of encoding layers in the BrepEncoder. Default: 8

  • dim_node (int, optional) – Dimension of node embeddings. Default: 256

  • d_model (int, optional) – Dimension of the model in BrepEncoder. Default: 512

  • n_heads (int, optional) – Number of attention heads in BrepEncoder. Default: 32

  • dropout (float, optional) – Dropout rate for the classifier. Default: 0.3

  • attention_dropout (float, optional) – Dropout rate for attention layers. Default: 0.3

  • act_dropout (float, optional) – Dropout rate for activation layers. Default: 0.3

  • learning_rate (float, optional) – Initial learning rate for the optimizer. Default: 0.002

  • optimizer_betas (Tuple[float, float], optional) – Betas for the AdamW optimizer. Default: (0.99, 0.999)

  • scheduler_factor (float, optional) – Factor for ReduceLROnPlateau scheduler. Default: 0.5

  • scheduler_patience (int, optional) – Patience for ReduceLROnPlateau scheduler. Default: 5

  • scheduler_threshold (float, optional) – Threshold for ReduceLROnPlateau scheduler. Default: 1e-4

  • scheduler_min_lr (float, optional) – Minimum learning rate for scheduler. Default: 1e-6

  • scheduler_cooldown (int, optional) – Cooldown period for scheduler. Default: 2

  • max_warmup_steps (int, optional) – Number of warmup steps for learning rate scaling. Default: 5000

  • log_file (str, optional) – Path to the log file. Default: ‘training_errors.log’

  • **kwargs (Any) – Additional keyword arguments for BrepSeg.

  • result_dir (str)

  • generate_stream_cache_for_visu (bool)

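The max_warmup_steps parameter is described only as the number of warmup steps for learning rate scaling. As a rough illustration, the sketch below assumes a linear warmup, which this reference does not confirm; the function name `warmup_lr` is hypothetical and the defaults are copied from the constructor signature.

```python
def warmup_lr(step: int, base_lr: float = 0.002,
              max_warmup_steps: int = 5000) -> float:
    """Assumed linear warmup: scale base_lr from 0 up to its full value."""
    if max_warmup_steps <= 0:
        # No warmup configured: use the base learning rate directly.
        return base_lr
    scale = min(1.0, step / max_warmup_steps)
    return base_lr * scale


for step in (0, 2500, 5000, 10000):
    print(step, warmup_lr(step))
```

Under this assumption the learning rate reaches learning_rate after max_warmup_steps steps and stays there until the ReduceLROnPlateau scheduler lowers it. The wrapped BrepSeg model may use a different warmup shape.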

static build_body_filepath(filename, body_index, default_extension='.ml')

Build a filename with a body index suffix (e.g. _0, _1).

The suffix _{body_index} is always appended to the stem of the filename. If filename has a file extension it is treated as a file path and the suffix is inserted before the extension. Otherwise the default_extension is appended after the suffix.

Parameters:
  • filename (str) – Original file path or hash id.

  • body_index (int) – Zero-based body index to append.

  • default_extension (str) – Extension to use when the filename has no extension. Defaults to ".ml".

Returns:

The filename with the body index suffix applied.

Return type:

str

collate_function(batch)

Return a collate function for this model.

Return type:

Any

convert_encoded_data_to_graph(storage, graph_handler, filename)

Converts encoded data from storage into a graph representation, which serves as input for the ML model.

Return type:

Dict[str, Any]

encode_cad_data(cad_file, cad_loader, storage)

Opens the CAD file and encodes its data into a format suitable for machine learning. Stores the encoded data using the provided storage handler.

Return type:

Tuple[int, int]

encode_label_data(label_storage, storage)

Uses the LabelStorage object to retrieve the labeling information for a given input, and stores the label data for the specific machine learning task.

Returns the str key when the label data is found in the storage object, together with the size of the label data.

Return type:

Tuple[str, int]

load_model_input_from_files(graph_file, data_id, label_file=None)

Loads a single graph from a file to be used as input for the machine learning model.

Parameters:
  • graph_file (str)

  • data_id (int)

  • label_file (str)

Return type:

Any

metrics()

Publishes the ML metrics after training the model.

Return type:

MetricStorage

model_name()

Return type:

str

predict_and_postprocess(batch)

Post-processes and formats the raw model output into a structured prediction. Returns the predictions and their associated probabilities.

Return type:

Any

retrieve_model(check_point=None)

Retrieves the PyTorch Lightning model used in this flow.

Parameters:

check_point (str)

Return type:

pytorch_lightning.LightningModule