hoops_ai.ml.embeddings

Quick Overview

Classes

Embedding(values, model, dim[, metadata])

Single embedding result.

EmbeddingBatch(values, model, dim[, ids, ...])

Batch embedding result for multiple inputs.

VectorRecord(id, embedding[, metadata])

Record containing a vector embedding with metadata for storage.

VectorHit(id, score[, metadata])

Search result containing a matched vector from similarity search.

FaissVectorStore(dim)

FAISS-based in-memory vector store implementation for storage and retrieval of high-dimensional vectors.

HOOPSEmbeddings([cad_loader, model, device])

HOOPS AI shape embeddings using graph neural networks on B-Rep data.

Embeddings Module

This module provides embeddings functionality for CAD data analysis, including:
  • Shape embeddings from CAD geometry

  • Text embeddings for semantic search

  • Vector storage and similarity search

Key Classes:
  • HOOPSEmbeddings: Generate embeddings from CAD files using graph neural networks

  • ShapeEmbeddingsModel: Abstract base for shape embedding models

  • TextEmbeddingsModel: Abstract base for text embedding models

  • VectorStore: Abstract base for vector storage backends

  • FaissVectorStore: FAISS-based vector storage implementation

  • Embedding: Single embedding result data class

  • EmbeddingBatch: Batch embedding result data class

  • VectorRecord: Vector record for storage

  • VectorHit: Search result from similarity search

class hoops_ai.ml.embeddings.Embedding(values, model, dim, metadata=None)

Bases: object

Single embedding result.

Parameters:
  • values (numpy.ndarray) – Embedding vector of shape (dim,), dtype float32

  • model (str) – Model identifier (e.g., ‘hf:all-MiniLM-L6-v2’, ‘hoops:shape-v1’)

  • dim (int) – Dimensionality of the embedding vector

  • metadata (Dict[str, Any]) – Optional diagnostics (tokens, timings, processing info, etc.)

dim: int
metadata: Dict[str, Any] = None
model: str
values: numpy.ndarray
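
The field contract above (a 1-D float32 vector whose length matches dim) can be illustrated with a stand-in dataclass. This is a minimal sketch that does not import hoops_ai: EmbeddingSketch and make_embedding are hypothetical names that mirror the documented fields only, and whether the real class validates or coerces its inputs is an assumption, not something stated here.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

import numpy as np


@dataclass
class EmbeddingSketch:
    """Hypothetical stand-in mirroring the documented Embedding fields."""
    values: np.ndarray
    model: str
    dim: int
    metadata: Optional[Dict[str, Any]] = None


def make_embedding(values, model, metadata=None):
    """Assumed helper: coerce to float32 and check the documented (dim,) shape."""
    arr = np.asarray(values, dtype=np.float32)
    if arr.ndim != 1:
        raise ValueError(f"expected shape (dim,), got {arr.shape}")
    # dim is derived from the vector so the two can never disagree.
    return EmbeddingSketch(values=arr, model=model, dim=arr.shape[0], metadata=metadata)


emb = make_embedding([0.1, 0.2, 0.3], model="hoops:shape-v1", metadata={"source": "demo"})
print(emb.dim, emb.values.dtype)
```

Deriving dim from the array rather than passing it separately is a design choice of this sketch; the documented constructor takes dim as its own argument.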
class hoops_ai.ml.embeddings.EmbeddingBatch(values, model, dim, ids=None, metadata=None)

Bases: object

Batch embedding result for multiple inputs.

Parameters:
  • values (numpy.ndarray) – Embedding matrix of shape (n, dim), dtype float32

  • model (str) – Model identifier (e.g., ‘hf:all-MiniLM-L6-v2’, ‘hoops:shape-v1’)

  • dim (int) – Dimensionality of each embedding vector

  • ids (List[str] | None) – Optional identifiers for each embedding in the batch

  • metadata (Dict[str, Any]) – Optional batch-level diagnostics

dim: int
classmethod from_arrays(embeddings, model='unknown', ids=None, metadata=None)

Create EmbeddingBatch from xarray or numpy arrays.

Parameters:
  • embeddings (xarray.DataArray | numpy.ndarray) – Embedding matrix with shape (n, dim)

  • model (str) – Model identifier string

  • ids (xarray.DataArray | numpy.ndarray | List[str] | None) – Optional part IDs, one per row of the embedding matrix

  • metadata (Dict[str, Any] | None) – Optional batch-level metadata

Returns:

EmbeddingBatch instance

Raises:
  • TypeError – If arrays are not of supported types

  • ValueError – If embedding array is not 2D

Return type:

EmbeddingBatch

ids: List[str] | None = None
metadata: Dict[str, Any] = None
model: str
values: numpy.ndarray
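
The documented behavior of from_arrays (accept a 2D matrix, raise TypeError for unsupported types and ValueError for non-2D input) can be sketched as follows. EmbeddingBatchSketch is a hypothetical stand-in, not the library class. It handles numpy input only, leaving out the xarray branch, and the id-length check is an assumption, since the source does not say how a mismatch is treated.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

import numpy as np


@dataclass
class EmbeddingBatchSketch:
    """Hypothetical stand-in mirroring the documented EmbeddingBatch fields."""
    values: np.ndarray
    model: str
    dim: int
    ids: Optional[List[str]] = None
    metadata: Optional[Dict[str, Any]] = None

    @classmethod
    def from_arrays(cls, embeddings, model="unknown", ids=None, metadata=None):
        # Only numpy input is handled here; the documented method also accepts xarray.
        if not isinstance(embeddings, np.ndarray):
            raise TypeError(f"unsupported type: {type(embeddings).__name__}")
        if embeddings.ndim != 2:
            raise ValueError(f"expected a 2D array, got {embeddings.ndim}D")
        id_list = None if ids is None else [str(i) for i in ids]
        if id_list is not None and len(id_list) != embeddings.shape[0]:
            # Assumed check; the source does not describe length mismatches.
            raise ValueError("ids must have one entry per row")
        return cls(
            values=embeddings.astype(np.float32),
            model=model,
            dim=embeddings.shape[1],
            ids=id_list,
            metadata=metadata,
        )


batch = EmbeddingBatchSketch.from_arrays(
    np.zeros((4, 8)), model="hoops:shape-v1", ids=np.array(["a", "b", "c", "d"])
)
print(batch.values.shape, batch.dim, batch.ids)
```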
class hoops_ai.ml.embeddings.VectorHit(id, score, metadata=None)

Bases: object

Search result containing a matched vector from similarity search.

Parameters:
  • id (str) – Unique identifier of the matched vector

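  • score (float) – Similarity score, typically cosine similarity or an L2-based score. Higher is better; a raw L2 distance ranks the other way (smaller is closer), so confirm which metric the store reports before comparing scores.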

  • metadata (Dict[str, Any]) – Optional metadata dictionary from the stored record

id: str
metadata: Dict[str, Any] = None
score: float
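
To make the "higher is better" convention concrete, here is a brute-force cosine search that returns hits sorted best-first. It is a sketch under stated assumptions: VectorHitSketch and cosine_search are hypothetical names, the part ids are made up, and nothing here reflects how FaissVectorStore computes or orders its scores.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

import numpy as np


@dataclass
class VectorHitSketch:
    """Hypothetical stand-in mirroring the documented VectorHit fields."""
    id: str
    score: float
    metadata: Optional[Dict[str, Any]] = None


def cosine_search(query, ids, matrix, top_k=2):
    """Brute-force cosine similarity; assumes no zero-length vectors."""
    q = query / np.linalg.norm(query)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    scores = m @ q
    # Negate so argsort yields the highest similarity first.
    order = np.argsort(-scores)[:top_k]
    return [VectorHitSketch(id=ids[i], score=float(scores[i])) for i in order]


ids = ["bracket", "flange", "gear"]
matrix = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]], dtype=np.float32)
hits = cosine_search(np.array([1.0, 0.0], dtype=np.float32), ids, matrix)
for hit in hits:
    print(hit.id, round(hit.score, 3))
```

With a distance metric such as raw L2, the sort direction would flip, which is why the metric behind a score matters when ranking or thresholding hits.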
class hoops_ai.ml.embeddings.VectorRecord(id, embedding, metadata=None)

Bases: object

Record containing a vector embedding with metadata for storage.

Wraps an Embedding with an identifier for retrieval and domain-specific metadata.

Parameters:
  • id (str) – Unique identifier (e.g., HOOPS canonical id, document id)

  • embedding (Embedding) – The embedding result from a model

  • metadata (Dict[str, Any]) – Optional domain metadata (e.g., part name, properties, tags)

embedding: Embedding
id: str
metadata: Dict[str, Any] = None
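
The wrapping relationship described above (an identifier and domain metadata around one Embedding) can be sketched by turning the rows of an embedding matrix into records. The class names, the records_from_matrix helper, and the sample part ids are all hypothetical; the real library may offer a different route from a batch to storable records.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

import numpy as np


@dataclass
class EmbeddingSketch:
    """Hypothetical stand-in mirroring the documented Embedding fields."""
    values: np.ndarray
    model: str
    dim: int
    metadata: Optional[Dict[str, Any]] = None


@dataclass
class VectorRecordSketch:
    """Hypothetical stand-in mirroring the documented VectorRecord fields."""
    id: str
    embedding: EmbeddingSketch
    metadata: Optional[Dict[str, Any]] = None


def records_from_matrix(matrix, ids, model, metadata_by_id=None):
    """Assumed helper: pair each row with its id and optional domain metadata."""
    metadata_by_id = metadata_by_id or {}
    return [
        VectorRecordSketch(
            id=part_id,
            embedding=EmbeddingSketch(values=row, model=model, dim=row.shape[0]),
            metadata=metadata_by_id.get(part_id),
        )
        for part_id, row in zip(ids, matrix)
    ]


matrix = np.arange(6, dtype=np.float32).reshape(2, 3)
records = records_from_matrix(
    matrix, ["part-001", "part-002"], model="hoops:shape-v1",
    metadata_by_id={"part-001": {"name": "bracket"}},
)
print([(r.id, r.embedding.dim, r.metadata) for r in records])
```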