==============================
Data Visualization Experience
==============================

.. sidebar:: Quick Navigation

    - :ref:`viz-overview`
    - :ref:`viz-datasetviewer`
    - :ref:`viz-cadviewer`
    - :ref:`viz-utils`
    - :ref:`viz-examples`
    - :ref:`viz-best-practices`
    - :ref:`viz-troubleshooting`

.. _viz-overview:

Overview
========

After extracting features from CAD files and training models, you need to see what you're working with. Which parts have complex geometry? Are your model's predictions reasonable? How do different classes look visually? The Insights module answers these questions by bringing your CAD data to life.

The :mod:`hoops_ai.insights` module provides visualization tools that work **throughout your entire workflow**. Whether you're exploring raw CAD files, inspecting encoded datasets, validating filtered results, or analyzing model predictions, these tools let you see what's happening at every step.

**Visualize at any stage:**

- **During data exploration**: Browse CAD files before encoding to understand your dataset
- **After encoding**: Inspect generated features and verify data quality
- **While filtering**: Visualize query results to confirm you've selected the right subset
- **During training**: Check training/validation splits to ensure balanced distributions
- **After inference**: Color-code predictions on 3D geometry to validate model behavior

The Insights module transforms your CAD analysis from abstract data pipelines into visual, interactive workflows.

Think of Insights as your visual dashboard for CAD analysis. Instead of staring at numbers in spreadsheets, you can:

- Display grids of CAD previews to quickly scan through datasets
- Open interactive 3D models directly in Jupyter notebooks
- Visualize which parts match specific criteria (e.g., "show me all gears")
- Color-code model predictions on the actual 3D geometry

The module centers around three core tools:

**DatasetViewer** helps you explore datasets in bulk. After filtering with :class:`DatasetExplorer` (e.g., "find all parts with >50 faces labeled 'bracket'"), DatasetViewer shows you the results as image grids or interactive 3D viewers. You see what your queries return, not just file IDs.

**CADViewer** focuses on individual models. Load a single CAD file and interact with it in 3D - rotate, zoom, inspect. This is perfect for debugging: "Why did my model misclassify this part? Let me look at its geometry."

**ColorPalette utilities** solve a common visualization problem: how to assign meaningful colors to predictions. If your model classifies 26 part types, ColorPalette generates 26 distinct colors automatically. If you want to group predictions (e.g., "color all fasteners blue, all housings red"), it handles that too.

Together, these tools transform abstract data pipelines into visual workflows you can understand at a glance.

.. _viz-datasetviewer:

DatasetViewer
=============

Purpose
-------

:class:`DatasetViewer` is your window into dataset exploration. After running queries with :class:`DatasetExplorer` (e.g., "find all parts labeled as 'gear' with more than 30 faces"), you get back a list of file IDs. But IDs don't tell you much - you need to **see** the parts.

That's where DatasetViewer comes in. It takes those query results and shows them to you as:

- **Image grids**: Quick visual scans of 10, 25, or 100 parts at once
- **Interactive 3D viewers**: Rotate and inspect specific models inline in Jupyter
- **Filtered comparisons**: "Show me the 10 most complex gears vs. the 10 simplest"

:class:`DatasetViewer` enables visualization of multiple CAD files from dataset queries. It's designed to work seamlessly with :class:`DatasetExplorer`, allowing you to:

- Filter files based on criteria (labels, complexity, features)
- Visualize the filtered results as image grids or interactive 3D views
- Compare multiple parts side-by-side

Creating a DatasetViewer
------------------------

The easiest way is to create it directly from an existing :class:`DatasetExplorer`:

.. code-block:: python

    from hoops_ai.dataset import DatasetExplorer
    from hoops_ai.insights import DatasetViewer

    # Initialize explorer
    explorer = DatasetExplorer(flow_output_file="fabwave.flow")

    # Create viewer from explorer (automatically extracts visualization paths)
    viewer = DatasetViewer.from_explorer(explorer)

    # Check availability
    viewer.print_statistics()

The :meth:`from_explorer()` method reads your ``.flow`` file and extracts three critical pieces of information:

1. **PNG preview paths**: Pre-rendered images of each CAD file (for grid displays)
2. **3D cache paths**: Optimized 3D geometry files (for interactive viewing)
3. **File IDs and names**: Mapping between your data and the visualization assets

When you call :meth:`print_statistics()`, you see coverage metrics:

.. code-block:: text

    ==================================================
    Dataset Visualization Statistics
    ==================================================
    Total files: 234
    Files with PNG preview: 234 (100.0%)
    Files with 3D cache: 234 (100.0%)
    Overall coverage: 100.0%
    ==================================================

This tells you that all 234 files in your dataset have both PNG previews and 3D cache files ready. If coverage is less than 100%, some visualizations might fail for files without cached assets.

You can also create a viewer manually if you already have the paths:

.. code-block:: python

    from hoops_ai.insights import DatasetViewer

    # Extract data from explorer manually
    cache_df = explorer.get_stream_cache_paths()
    file_ids = cache_df['id'].astype(int).tolist()
    png_paths = cache_df['stream_cache_png'].tolist()
    scs_paths = cache_df['stream_cache_3d'].tolist()
    file_names = cache_df['name'].tolist()

    # Create viewer with explicit data
    viewer = DatasetViewer(file_ids, png_paths, scs_paths, file_names)

This is useful when you want to visualize a subset of files or when working with custom data sources.

Visualizing Image Grids
-----------------------

Once you have a viewer, the most common task is displaying query results as image grids. Say you filtered your dataset and got back a list of file IDs - now you want to see what those parts actually look like.

Start by getting the list of available files:

.. code-block:: python

    # Get all file IDs
    all_ids = viewer.get_available_file_ids()

    # Display first 25 files as image grid
    fig = viewer.show_preview_as_image(all_ids, k=25)

This creates a 5×5 grid showing PNG previews of 25 CAD files. The ``k`` parameter limits how many files to show (useful when you have thousands of results). The layout is automatic - DatasetViewer calculates the grid dimensions based on how many files you're displaying.

Want more control over the layout? Customize the grid:

.. code-block:: python

    # Custom 4-column grid with file names
    fig = viewer.show_preview_as_image(
        all_ids,
        k=20,
        grid_cols=4,
        title="Dataset Preview",
        label_format='name',
        show_labels=True
    )

Here's what each parameter does:

- ``file_ids``: Which files to show (from your query results)
- ``k``: Maximum number of files (if you have 1000 results but only want to see 20)
- ``grid_cols``: Force a specific number of columns (otherwise auto-calculated)
- ``figsize``: Figure size in inches as ``(width, height)`` (otherwise auto-calculated)
- ``show_labels``: Whether to overlay file information on each image
- ``label_format``: Show file ``'id'``, ``'name'``, or ``'both'``
- ``title``: Overall title for the entire grid
- ``missing_color``: RGB color tuple for files without PNG previews (default: gray)
- ``save_path``: Automatically save the figure to a file path

The label format controls what text appears on each image. You have three options:

.. code-block:: python

    # Show only file IDs
    viewer.show_preview_as_image(file_ids, label_format='id')
    # Labels: "ID: 42", "ID: 87", ...

    # Show only file names
    viewer.show_preview_as_image(file_ids, label_format='name')
    # Labels: "bracket.step", "housing.step", ...

    # Show both ID and name
    viewer.show_preview_as_image(file_ids, label_format='both')
    # Labels: "ID:42\nbracket.step"

The returned ``fig`` is a Matplotlib figure object, so you can customize it further. To write the grid to disk as it is created, pass ``save_path``:

.. code-block:: python

    # Create and save grid
    fig = viewer.show_preview_as_image(
        file_ids,
        k=100,
        save_path='results/dataset_preview.png'
    )

Interactive 3D Viewing
----------------------

Image grids are great for quick overviews, but sometimes you need to inspect geometry in detail. That's where 3D viewing comes in.

The :meth:`show_preview_as_3d()` method creates interactive 3D viewers inline in your Jupyter notebook:

.. code-block:: python

    # Open 3 interactive 3D viewers (inline in notebook)
    viewers_3d = viewer.show_preview_as_3d(file_ids, k=3)

    # Each viewer is a CADViewer instance
    print(f"Created {len(viewers_3d)} 3D viewers")

This displays 3 separate 3D viewers in your notebook, one for each file. You can rotate, zoom, and pan each model independently. The ``k`` parameter works the same way as in image grids - it limits how many viewers to create (you don't want 100 3D viewers clogging your notebook).

Want bigger viewers or different layouts? Customize the display:

.. code-block:: python

    # Larger inline viewers
    viewers_3d = viewer.show_preview_as_3d(
        file_ids,
        k=3,
        width=600,
        height=500
    )

    # Sidecar layout (opens in side panel) [AVAILABLE IN FUTURE RELEASES]
    viewers_3d = viewer.show_preview_as_3d(
        file_ids,
        k=5,
        display_mode='sidecar'
    )

Here's what you can control:

- ``file_ids``: Which files to show in 3D
- ``k``: Maximum number of 3D viewers (default: 5 - be conservative, each viewer uses resources)
- ``display_mode``: ``'inline'`` (in notebook), ``'sidecar'`` (side panel), or ``'none'`` (headless)
- ``layout``: ``'sequential'`` (one per cell) or ``'grid'`` (arranged in grid)
- ``host``: Server host address (default: ``'127.0.0.1'``)
- ``start_port``: Starting port for web servers (default: 8000, increments for each viewer)
- ``silent``: Suppress server output logs (default: ``True``)
- ``width``: Viewer width in pixels (default: 400)
- ``height``: Viewer height in pixels (default: 400)

The method returns a list of :class:`CADViewer` instances (one per file). You can interact with these viewers programmatically:

.. code-block:: python

    # Get selected faces from user interaction
    selected_faces = viewers_3d[0].get_selected_faces()
    print(f"Selected faces: {selected_faces}")

    # Color selected faces red
    viewers_3d[0].set_face_color(selected_faces, [255, 0, 0])

    # Clear all face colors
    viewers_3d[0].clear_face_colors()

    # Clean up when done
    for v in viewers_3d:
        v.terminate()

When you're done viewing, call :meth:`terminate()` on each viewer to shut down its web server and free up ports.

**Side-By-Side Comparison (AVAILABLE IN FUTURE RELEASES)**

You'll be able to load two CAD files and view them in a split-screen layout, making it easy to spot differences and similarities.

.. code-block:: python

    # Compare two specific files
    viewer_a, viewer_b = viewer.create_comparison_view(
        file_id_a=42,
        file_id_b=87,
        display_mode='sidecar'
    )

    # Highlight same features in both
    viewer_a.set_face_color([1, 2, 3], [255, 0, 0])
    viewer_b.set_face_color([1, 2, 3], [255, 0, 0])

Working with DatasetExplorer
-----------------------------

The real power of DatasetViewer comes from combining it with :class:`DatasetExplorer` filtering capabilities. You filter your dataset based on criteria (labels, complexity, geometry), then immediately visualize the results.

Here's the typical workflow:

**Filter -> Visualize**

.. code-block:: python

    from hoops_ai.dataset import DatasetExplorer
    from hoops_ai.insights import DatasetViewer

    # Step 1: Initialize explorer and viewer
    explorer = DatasetExplorer(flow_output_file="fabwave.flow")
    viewer = DatasetViewer.from_explorer(explorer)

    # Step 2: Define filter condition
    high_complexity = lambda ds: ds['num_nodes'] > 30

    # Step 3: Get file IDs matching condition
    complex_file_ids = explorer.get_file_list(
        group="graph",
        where=high_complexity
    )
    print(f"Found {len(complex_file_ids)} complex files")

    # Step 4: Visualize filtered results
    fig = viewer.show_preview_as_image(
        complex_file_ids,
        title="High Complexity Parts",
        grid_cols=5
    )

Let's break this down. First, you create both an explorer and a viewer from the same ``.flow`` file. The explorer handles querying (find files matching criteria), while the viewer handles visualization (show me what they look like).

Then you define a filter condition using a lambda function. Here, ``high_complexity`` says "keep only files where the number of graph nodes exceeds 30." The explorer applies this filter and returns matching file IDs.

Finally, you pass those IDs to the viewer, which displays them as an image grid. The result: you immediately see what "high complexity" parts look like in your dataset.

**Example: Filter by Label and Visualize**

You can filter by any criterion available in your dataset. For example, filter by label:

.. code-block:: python

    # Filter files with specific label
    pipe_fittings = lambda ds: ds['file_label'] == 15

    pipe_file_ids = explorer.get_file_list(
        group="file",
        where=pipe_fittings
    )
    print(f"Found {len(pipe_file_ids)} pipe fittings")

    # Show image grid
    viewer.show_preview_as_image(
        pipe_file_ids,
        k=16,
        title='Pipe Fittings (Label 15)',
        label_format='name'
    )

    # Show 3D views of first 4
    viewers_3d = viewer.show_preview_as_3d(pipe_file_ids, k=4)

Or combine multiple criteria:

**Example: Multi-Criteria Filtering**

.. code-block:: python

    # Complex query: high face count AND specific label
    def complex_brackets(ds):
        return (ds['num_nodes'] > 25) & (ds['file_label'] == 3)

    bracket_ids = explorer.get_file_list(group="graph", where=complex_brackets)

    # Visualize results
    viewer.show_preview_as_image(
        bracket_ids,
        k=20,
        title=f'Complex Brackets ({len(bracket_ids)} files)',
        grid_cols=4,
        save_path='results/complex_brackets.png'
    )

This filter-then-visualize pattern is powerful. You're not just looking at random parts; you're seeing exactly the subset you care about, filtered by any combination of attributes in your dataset.

Helper Methods
--------------

DatasetViewer provides several utility methods for inspecting your data:

Get File Information
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Get info for specific file
    file_info = viewer.get_file_info(42)
    print(f"Name: {file_info['name']}")
    print(f"PNG available: {file_info['png_path'] is not None}")
    print(f"3D available: {file_info['stream_cache_path'] is not None}")

Get Available Files
~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Get all file IDs with visualization data
    available_ids = viewer.get_available_file_ids()
    print(f"Total files with visualization: {len(available_ids)}")

Statistics
~~~~~~~~~~

.. code-block:: python

    # Get detailed statistics
    stats = viewer.get_statistics()
    print(f"Total files: {stats['total_files']}")
    print(f"PNG coverage: {stats['png_percentage']:.1f}%")
    print(f"3D coverage: {stats['3d_percentage']:.1f}%")

    # Pretty print
    viewer.print_statistics()

The :meth:`get_file_info()` method returns a dictionary with file name and paths to visualization assets (PNG and 3D cache). Use it to check if a specific file has visualization data before trying to display it.

The :meth:`get_available_file_ids()` method gives you all file IDs that have at least some visualization data. This is useful for iterating over visualizable files.

The :meth:`get_statistics()` method returns coverage metrics (how many files have PNGs, how many have 3D caches). The :meth:`print_statistics()` method shows the same info in a formatted table.

.. _viz-cadviewer:

CADViewer
=========

Purpose
-------

While DatasetViewer handles bulk visualization, :class:`CADViewer` focuses on individual models. It provides interactive 3D viewing with face coloring, selection, and manipulation.

This is perfect for:

- Detailed inspection of single CAD models (rotate, zoom, inspect geometry)
- Visualizing ML predictions on 3D geometry (color each face by its predicted class)
- Interactive feature highlighting (select faces, highlight regions)
- Educational demonstrations (show specific geometric features)

The key difference: CADViewer gives you programmatic control. You can color specific faces, retrieve user selections, and update the view dynamically based on predictions or analysis results.

Loading a CAD File
------------------

The simplest way to view a CAD file is to create a viewer, load a file, and display it:

.. code-block:: python

    from hoops_ai.insights import CADViewer

    # Create viewer (auto-finds free port)
    viewer = CADViewer()

    # Load CAD file
    viewer.load_cad_file("bracket.step")

    # Display in notebook
    viewer.show()

The viewer starts a local web server (automatically finding an available port between 8000 and 8099), loads your CAD file, and embeds an interactive 3D viewer in your Jupyter notebook. You can now rotate, zoom, and pan the model.

Quick View (One-Liner)
~~~~~~~~~~~~~~~~~~~~~~

For even simpler one-off viewing, use the :func:`quick_view()` convenience function:

.. code-block:: python

    from hoops_ai.insights import quick_view

    # Load and display in one call
    viewer = quick_view("bracket.step")

This creates the viewer and loads the file in a single line. The returned ``viewer`` object is still fully functional; you can color faces, get selections, etc.

Display Modes
-------------

CADViewer supports three display modes depending on your workflow:

.. code-block:: python

    # Inline display (embedded in notebook)
    viewer = CADViewer(display_mode='inline')
    viewer.load_cad_file("bracket.step")
    viewer.show(width=600, height=500)

    # Sidecar display (side panel)
    viewer = CADViewer(display_mode='sidecar')
    viewer.load_cad_file("bracket.step")
    viewer.show()

    # No display (server only)
    viewer = CADViewer(display_mode='none')
    viewer.load_cad_file("bracket.step")
    print(f"Viewer URL: {viewer.get_viewer_url()}")

- **Inline mode** (default) embeds the 3D viewer directly in your notebook output. This is great for documentation and sharing notebooks.
- **Sidecar mode** opens the viewer in JupyterLab's side panel, giving you a split-screen view. You can write code on one side while seeing the model on the other. This is ideal for interactive development. (Note: In classic Jupyter, this falls back to inline mode.)
- **None mode** runs the server without displaying anything. You get the viewer URL and can open it in a separate browser tab. This is useful for debugging or when you want manual control over display.

Managing Ports
--------------

By default, CADViewer automatically finds an available port:

.. code-block:: python

    # Auto-find free port (default, recommended)
    viewer = CADViewer()  # Finds port 8000-8099

The viewer scans ports 8000 through 8099 and picks the first available one. This prevents conflicts when running multiple viewers.

If you need a specific port (e.g., for firewall rules), you can specify it:

.. code-block:: python

    # Use specific port (strict mode)
    viewer = CADViewer(port=9000)  # Must be available or fails

In strict mode, if port 9000 is already in use, the viewer will fail rather than picking a different port. This ensures you always know which port is being used.

Automatic Cleanup
-----------------

Viewers create web servers that need to be shut down when you're done. The easiest way is using a context manager:

.. code-block:: python

    # Automatic resource cleanup
    with CADViewer() as viewer:
        viewer.load_cad_file("model.step")
        viewer.show()
        # ... interact with viewer ...
    # Automatically terminates on exit

When the ``with`` block exits, the viewer's server is automatically terminated and the port is freed. This is the recommended approach for scripts and notebooks to avoid port leaks.

Alternatively, you can manually terminate:

.. code-block:: python

    # Terminate viewer and release resources
    viewer.terminate()

    # Check if still active
    print(f"Active: {viewer.is_active}")  # False

Coloring and Selecting Faces
-----------------------------

The most powerful feature of CADViewer is the ability to color individual faces. This is essential for visualizing ML predictions, highlighting features, or showing analysis results.
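
The face-coloring examples in this section pass colors as ``[red, green, blue]`` lists with each channel in 0-255. Plotting libraries often produce floats in 0.0-1.0 instead, so a small conversion step can help. The following is a minimal sketch, not part of ``hoops_ai``: ``to_rgb255`` is a hypothetical helper name, and the rule that an all-float triple is treated as 0.0-1.0 values is an assumption made for illustration.

.. code-block:: python

    def to_rgb255(color):
        """Return an RGB triple as a list of three ints in 0-255.

        Hypothetical helper (not a hoops_ai API). Accepts either three
        floats in 0.0-1.0 or three ints in 0-255; anything else raises.
        """
        if len(color) != 3:
            raise ValueError("expected exactly 3 channels")
        # Assumption: an all-float triple uses the 0.0-1.0 convention
        if all(isinstance(c, float) for c in color):
            if not all(0.0 <= c <= 1.0 for c in color):
                raise ValueError("float channels must be in 0.0-1.0")
            return [round(c * 255) for c in color]
        # bool is a subclass of int, so reject it explicitly
        if all(isinstance(c, int) and not isinstance(c, bool) and 0 <= c <= 255
               for c in color):
            return list(color)
        raise ValueError("channels must be ints in 0-255 or floats in 0.0-1.0")

The result can then be passed wherever an RGB list is expected, for example as the color argument of ``set_face_color()``.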

Start by interacting with the viewer to select faces:

.. code-block:: python

    # User: Click faces in 3D viewer (Ctrl+Click for multiple)

    # Get selected face indices
    selected = viewer.get_selected_faces()
    print(f"Selected {len(selected)} faces: {selected}")

    # Color selected faces (examples of different colors)
    viewer.set_face_color(selected, [255, 0, 0])  # Red
    viewer.set_face_color(selected, [0, 255, 0])  # Green
    viewer.set_face_color(selected, [0, 0, 255])  # Blue

    # Default highlight color
    viewer.set_face_color(selected)  # Light blue

The workflow: you click faces in the 3D viewer (hold Ctrl/Cmd to select multiple), then call :meth:`get_selected_faces()` to retrieve their indices. With those indices, you can color them using :meth:`set_face_color()`.

Colors are specified as RGB lists: ``[red, green, blue]`` where each value is 0-255. If you don't specify a color, it defaults to light blue for highlighting.

You can also color faces directly by index without user interaction:

.. code-block:: python

    # Color faces by index
    hole_faces = [1, 2, 5, 7]
    viewer.set_face_color(hole_faces, [255, 100, 0])

This is useful when you have predictions from a model or results from an analysis and already know which faces to color.

To reset and remove all coloring:

.. code-block:: python

    # Remove all face coloring
    viewer.clear_face_colors()

This returns all faces to their default appearance.

Coloring Groups with Visual Feedback
-------------------------------------

When you have multiple groups of faces to color (like different feature types from a classifier), you can color them all at once with visual feedback:

.. code-block:: python

    # Define feature groups
    groups = [
        ([1, 2, 6], (255, 0, 0), 'through hole'),
        ([3, 4], (0, 0, 255), 'blind hole'),
        ([8, 9, 10, 11], (0, 255, 0), 'pocket')
    ]

    # Color with progress feedback
    viewer.color_faces_by_groups(
        groups,
        delay=0.5,     # Delay between groups (seconds)
        verbose=True   # Show colored terminal output
    )

Each group is a tuple of ``(face_indices, color, description)``. The :meth:`color_faces_by_groups()` method colors each group sequentially, with an optional delay between them so you can watch the colors appear.

When ``verbose=True``, you get colored terminal output:

.. code-block:: text

    🟥 through hole (3 faces)
    🟦 blind hole (2 faces)
    🟩 pocket (4 faces)

The delay is useful for presentations or debugging: you can watch each feature type get colored and verify the predictions make sense. For production use, set ``delay=0`` to color everything instantly.

Loading Different File Formats
-------------------------------

CAD Files (Auto-Convert to SCS)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Automatically converts STEP/IGES to SCS format
    viewer.load_cad_file("model.step", auto_convert=True)
    viewer.load_cad_file("model.iges", auto_convert=True)

When you call :meth:`load_cad_file()` with ``auto_convert=True``, HOOPS Exchange converts your STEP or IGES file to SCS format (HOOPS' optimized streaming format) before loading. This conversion happens once and is cached.

SCS Files (Direct Loading)
~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you already have an SCS file, load it directly for faster performance:

.. code-block:: python

    # Load pre-converted SCS file directly (faster)
    viewer.load_scs_file("model.scs")

SCS files load almost instantly because they're already in the viewer's native format. If you're loading the same model repeatedly, convert it once and reuse the SCS file.

Background Options
~~~~~~~~~~~~~~~~~~

You can also control the background color:

.. code-block:: python

    # White background (default, good for presentations)
    viewer.load_cad_file("model.step", white_background=True)

    # Black background (optional)
    viewer.load_cad_file("model.step", white_background=False)

White backgrounds (the default) work better for presentations and documentation. Black backgrounds can reduce eye strain during long analysis sessions.

Checking Viewer Status
-----------------------

You can query the viewer's current state at any time:

.. code-block:: python

    status = viewer.get_status()
    print(f"Active: {status['active']}")
    print(f"Model loaded: {status['model_loaded']}")
    print(f"Viewer URL: {status['viewer_url']}")
    print(f"Port: {status['port']}")

The status dictionary tells you whether the viewer's server is running (``active``), whether a model is currently loaded (``model_loaded``), the URL to access the viewer (``viewer_url``), and which port it's using (``port``). This is useful for debugging connection issues or verifying the viewer is ready before proceeding with operations.

Validating Colors
------------------

Before passing colors to the viewer, you can validate they're properly formatted RGB tuples:

.. code-block:: python

    from hoops_ai.insights import CADViewer

    # Check if color is valid RGB
    CADViewer.validate_color([255, 0, 0])    # True
    CADViewer.validate_color([255, 0, 256])  # False (out of range)
    CADViewer.validate_color([255, 0])       # False (wrong length)

Valid RGB colors are lists or tuples of exactly 3 integers, each in the range 0-255. The :meth:`validate_color()` method returns ``True`` if valid, ``False`` otherwise. This helps catch errors before they cause viewer issues.

Quick View Function
--------------------

Convenience function for one-line visualization:

.. code-block:: python

    from hoops_ai.insights import quick_view

    # Basic usage (auto-finds port)
    viewer = quick_view("model.step")

    # Inline display
    viewer = quick_view("model.step", display_mode='inline')

    # Sidecar display
    viewer = quick_view("model.step", display_mode='sidecar')

    # Specific port (strict mode)
    viewer = quick_view("model.step", port=9000)

.. _viz-utils:

Visualization Utils
===================

ColorPalette: Managing Label Colors
------------------------------------

When you're visualizing classification results, you need consistent colors for each class. :class:`ColorPalette` solves this by managing the mapping between label IDs and their colors.

Create a palette from your label definitions:

.. code-block:: python

    from hoops_ai.insights.utils import ColorPalette

    # Define label descriptions
    labels = {
        0: "background",
        1: "through hole",
        2: "blind hole",
        3: "pocket",
        4: "slot"
    }

    # Create palette with automatic colors
    palette = ColorPalette.from_labels(
        labels,
        cmap_name='hsv',  # Matplotlib colormap
        reserved_colors={
            0: (200, 200, 200),  # Gray for background
            1: (255, 0, 0)       # Red for through holes
        }
    )

The :meth:`from_labels()` method creates a palette from your label dictionary. You provide ``reserved_colors`` for the classes you want to fix explicitly, and the remaining classes are assigned distinct colors from the Matplotlib colormap named by ``cmap_name``. In this example, background (gray) and through holes (red) are set explicitly. The "blind hole", "pocket", and "slot" classes (labels 2-4) get auto-generated colors since none were specified.

Access Colors and Descriptions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once you have a palette, use it to look up colors and descriptions:

.. code-block:: python

    # Get color for label
    color = palette.get_color(1)  # (255, 0, 0)

    # Get description
    desc = palette.get_description(1)  # "through hole"

    # Or use alias
    label = palette.get_label(1)  # "through hole"

    # Get all mappings
    all_colors = palette.get_all_colors()  # {0: (200, 200, 200), 1: (255, 0, 0), ...}
    all_descs = palette.get_all_descriptions()  # {0: "background", 1: "through hole", ...}

The palette acts as a lookup table: given a label ID, you get its color and description. The :meth:`get_all_colors()` and :meth:`get_all_descriptions()` methods return dictionaries with all mappings.

Palette Operations
~~~~~~~~~~~~~~~~~~

You can also iterate over the palette:

.. code-block:: python

    # Check membership
    1 in palette  # True

    # Get size
    len(palette)  # 5

    # Iterate
    for label_id in palette:
        color = palette.get_color(label_id)
        desc = palette.get_description(label_id)
        print(f"Label {label_id}: {desc} = {color}")

    # Iterate with items
    for label_id, (color, desc) in palette.items():
        print(f"{label_id}: {desc} -> {color}")

This makes it easy to generate legends, reports, or validation summaries showing all your label-color associations.

Grouping Predictions for Visualization
---------------------------------------

After running a classifier on CAD faces, you get an array of predictions (one label per face). To visualize these predictions, you need to group faces by their predicted label and assign colors.

The :func:`group_predictions_by_label()` function does exactly this:

.. code-block:: python

    from hoops_ai.insights.utils import group_predictions_by_label
    import numpy as np

    # Predictions array (one per face)
    predictions = np.array([0, 1, 1, 2, 0, 2, 1, 3, 3])
    # Faces 0,4: background; Faces 1,2,6: through hole;
    # Faces 3,5: blind hole; Faces 7,8: pocket

    # Group by label with colors
    groups = group_predictions_by_label(
        predictions,
        palette,
        exclude_labels={0}  # Skip background
    )

    # Result format: [(face_indices, color, description), ...]
    # (colors for labels 2 and 3 are illustrative; they depend on the palette)
    # [
    #     ([1, 2, 6], (255, 0, 0), 'through hole'),
    #     ([3, 5], (0, 0, 255), 'blind hole'),
    #     ([7, 8], (0, 255, 0), 'pocket')
    # ]

    # Use directly with CADViewer
    viewer.color_faces_by_groups(groups, verbose=True)

The function takes your prediction array and groups faces that have the same predicted label. For each group, it looks up the color and description from the palette. The result is a list of tuples ready to pass to :meth:`color_faces_by_groups()`.

The ``exclude_labels`` parameter lets you skip certain labels (like background) that you don't want to color. In this example, faces predicted as "background" (label 0) are excluded, so only features are colored.

This is the standard workflow for visualizing ML predictions:

1. Run inference to get predictions (array of label IDs)
2. Create a ColorPalette with your label descriptions and colors
3. Group predictions using ``group_predictions_by_label()``
4. Color the CAD model using ``viewer.color_faces_by_groups()``

.. _viz-examples:

Complete Workflow Examples
===========================

Example 1: Dataset Exploration
-------------------------------

This example shows the complete workflow for exploring a dataset: load it, check coverage, visualize random samples, filter by complexity, and view results in both 2D grids and 3D viewers.

.. code-block:: python

    import random

    from hoops_ai.dataset import DatasetExplorer
    from hoops_ai.insights import DatasetViewer

    # Initialize
    explorer = DatasetExplorer(flow_output_file="fabwave.flow")
    viewer = DatasetViewer.from_explorer(explorer)

    # Print statistics
    viewer.print_statistics()

    # Get all files
    all_ids = viewer.get_available_file_ids()

    # Visualize random sample (capped so small datasets don't raise ValueError)
    sample_ids = random.sample(all_ids, min(25, len(all_ids)))
    viewer.show_preview_as_image(sample_ids, title='Random Sample')

    # Filter by complexity
    complex_parts = lambda ds: ds['num_nodes'] > 40
    complex_ids = explorer.get_file_list(group="graph", where=complex_parts)

    # Visualize complex parts
    viewer.show_preview_as_image(
        complex_ids,
        k=16,
        title=f'High Complexity Parts ({len(complex_ids)} files)',
        save_path='results/complex_parts.png'
    )

    # Interactive 3D view of first 3
    viewers_3d = viewer.show_preview_as_3d(complex_ids, k=3, width=500, height=400)

    # Cleanup
    for v in viewers_3d:
        v.terminate()
    explorer.close()

The workflow: First, print statistics to verify visualization coverage (are PNGs and 3D files available?). Then grab a random sample to get a feel for the dataset. Next, apply a complexity filter (more than 40 graph nodes) and visualize those results both as an image grid (saved to disk) and as interactive 3D viewers (for detailed inspection). Finally, clean up resources.
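
If any step before the cleanup lines raises an exception, the ``terminate()`` calls are never reached and the viewer servers keep their ports. One way to guard against that is the standard library's ``contextlib.ExitStack``. The following is a minimal sketch under stated assumptions: ``FakeViewer`` is a stand-in class so the snippet runs without ``hoops_ai``, ``run_with_cleanup`` is a hypothetical helper name, and the only viewer behavior relied on is the ``terminate()`` method shown above.

.. code-block:: python

    from contextlib import ExitStack


    class FakeViewer:
        """Stand-in for a viewer object; it only mimics terminate()."""

        def __init__(self, name):
            self.name = name
            self.terminated = False

        def terminate(self):
            self.terminated = True


    def run_with_cleanup(viewers, work):
        """Call work(viewers), then terminate every viewer even if work raises."""
        with ExitStack() as stack:
            # Register each terminate() so it runs when the block exits
            for v in viewers:
                stack.callback(v.terminate)
            return work(viewers)


    viewers = [FakeViewer("a"), FakeViewer("b")]
    try:
        run_with_cleanup(viewers, lambda vs: 1 / 0)  # simulate a failing step
    except ZeroDivisionError:
        pass
    print(all(v.terminated for v in viewers))

With real viewers, the list returned by ``show_preview_as_3d()`` would take the place of the ``FakeViewer`` list.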
Example 2: Label-Based Filtering
---------------------------------

This example demonstrates filtering by specific labels and creating high-resolution visualizations:

.. code-block:: python

    from hoops_ai.dataset import DatasetExplorer
    from hoops_ai.insights import DatasetViewer

    # Setup
    explorer = DatasetExplorer(flow_output_file="fabwave.flow")
    viewer = DatasetViewer.from_explorer(explorer)

    # Get label descriptions
    label_df = explorer.get_descriptions("file_label")
    print(label_df)

    # Filter by specific label
    pipe_fittings = lambda ds: ds['file_label'] == 15
    pipe_ids = explorer.get_file_list(group="file", where=pipe_fittings)

    print(f"\nFound {len(pipe_ids)} pipe fittings")

    # Create visualization
    fig = viewer.show_preview_as_image(
        pipe_ids,
        k=25,
        grid_cols=5,
        title='Pipe Fittings (Label 15)',
        label_format='name',
        figsize=(15, 8)
    )

    # Save high-resolution version
    fig.savefig('results/pipe_fittings_overview.png', dpi=300, bbox_inches='tight')

    # Cleanup
    explorer.close()

The workflow: First, examine available labels using ``get_descriptions()`` to understand what label 15 represents. Filter for that specific label. Create a detailed visualization with each preview captioned by name (``label_format='name'``), arranged in 5 columns. Save the result at high resolution (300 DPI) for presentation or publication.

Example 3: ML Prediction Visualization
---------------------------------------

This example shows the complete pipeline from model predictions to colored 3D visualization:

.. code-block:: python

    from hoops_ai.insights import CADViewer
    from hoops_ai.insights.utils import ColorPalette, group_predictions_by_label
    import numpy as np

    # Load model predictions (example)
    predictions = np.load('predictions.npy')  # Shape: (n_faces,)

    # Define label palette
    labels = {
        0: "no feature",
        17: "through hole",
        18: "blind hole",
        23: "pocket",
        24: "slot"
    }

    palette = ColorPalette.from_labels(
        labels,
        cmap_name='Set3',
        reserved_colors={
            0: (220, 220, 220),  # Light gray for no feature
            17: (255, 0, 0),     # Red for through holes
            18: (255, 165, 0)    # Orange for blind holes
        }
    )

    # Group predictions by label
    groups = group_predictions_by_label(
        predictions,
        palette,
        exclude_labels={0}
    )

    # Visualize on 3D model
    viewer = CADViewer()
    viewer.load_cad_file("test_part.step")
    viewer.show(display_mode='sidecar')

    # Color faces by prediction
    viewer.color_faces_by_groups(groups, delay=0.3, verbose=True)

    # Get statistics
    print("\nPrediction Distribution:")
    for indices, color, desc in groups:
        print(f"  {desc}: {len(indices)} faces")

    # Cleanup
    viewer.terminate()

The workflow: Load predictions from your trained model (a NumPy array with one label per face). Create a ColorPalette mapping each label ID to a color and description. Use ``group_predictions_by_label()`` to organize faces by their predicted label, excluding background (label 0). Load the CAD model in a viewer, then color it using the grouped predictions. The ``verbose=True`` option shows colored terminal output as each feature type is colored. Finally, print statistics showing how many faces were predicted for each class.

Example 4: Side-by-Side Comparison
---------------------------------------

This example opens two files in side-by-side viewers and highlights the same faces in both:

.. code-block:: python

    from hoops_ai.dataset import DatasetExplorer
    from hoops_ai.insights import DatasetViewer

    # Setup viewer
    explorer = DatasetExplorer(flow_output_file="dataset.flow")
    viewer = DatasetViewer.from_explorer(explorer)

    # Compare original vs optimized design
    viewer_original, viewer_optimized = viewer.create_comparison_view(
        file_id_a=100,  # Original design
        file_id_b=150,  # Optimized design
        display_mode='sidecar'
    )

    # Highlight same features in both
    critical_faces = [5, 7, 12, 18]
    viewer_original.set_face_color(critical_faces, [255, 0, 0])
    viewer_optimized.set_face_color(critical_faces, [255, 0, 0])

    # User can interact with both viewers simultaneously
    # Compare geometry, analyze changes, etc.

    # Cleanup
    viewer_original.terminate()
    viewer_optimized.terminate()
    explorer.close()

Example 5: Batch Processing with Visualization
-----------------------------------------------

This example bins the dataset by graph node count, plots the distribution, and saves a preview grid for each non-empty bin:

.. code-block:: python

    from hoops_ai.dataset import DatasetExplorer
    from hoops_ai.insights import DatasetViewer
    import matplotlib.pyplot as plt

    # Initialize
    explorer = DatasetExplorer(flow_output_file="dataset.flow")
    viewer = DatasetViewer.from_explorer(explorer)

    # Get distribution of face counts
    dist = explorer.create_distribution(
        key="num_nodes",
        bins=10,
        group="graph"
    )

    # Visualize distribution
    bin_centers = 0.5 * (dist['bin_edges'][1:] + dist['bin_edges'][:-1])
    plt.figure(figsize=(10, 5))
    plt.bar(bin_centers, dist['hist'], width=(dist['bin_edges'][1] - dist['bin_edges'][0]))
    plt.xlabel('Number of Faces')
    plt.ylabel('Count')
    plt.title('Face Count Distribution')
    plt.savefig('results/face_count_distribution.png', dpi=300)
    plt.show()

    # Visualize samples from each bin
    for i, bin_files in enumerate(dist['file_id_codes_in_bins']):
        if len(bin_files) > 0:
            # Take up to 9 samples from this bin
            sample_ids = bin_files[:9]

            # Create visualization
            fig = viewer.show_preview_as_image(
                sample_ids,
                k=9,
                grid_cols=3,
                title=f'Bin {i+1}: {int(dist["bin_edges"][i])}-{int(dist["bin_edges"][i+1])} faces',
                figsize=(9, 9)
            )

            # Save
            fig.savefig(f'results/bin_{i+1}_samples.png', dpi=150)
            plt.close(fig)

    print("Batch visualization complete!")
    explorer.close()

.. _viz-best-practices:

Best Practices
===================

Performance Tips
----------------

When working with large datasets or many CAD files, performance becomes critical. Here's how to keep your visualization workflows fast and responsive.

**Limit 3D Viewers**: Opening many 3D viewers consumes significant system resources: each viewer runs a separate server process and maintains an active web session. Keep the number reasonable:

.. code-block:: python

    # Good: Limit to 3-5 viewers
    viewers = viewer.show_preview_as_3d(file_ids, k=3)

    # Avoid: Too many simultaneous 3D viewers
    viewers = viewer.show_preview_as_3d(file_ids, k=50)  # May crash!

Why this matters: Each CADViewer instance spawns a hoops-viewer server process. With 50 viewers, you'd have 50 server processes competing for CPU and memory. Your system will slow to a crawl or run out of resources. Stick to 3-5 simultaneous viewers for responsive performance.

**Use Image Grids for Overview**: When you need to see many parts at once, image grids are far more efficient than 3D viewers:

.. code-block:: python

    # Efficiently preview 100 files
    viewer.show_preview_as_image(file_ids, k=100)

Image grids load pre-rendered PNGs, which are lightweight and fast. You can display 100+ parts in a single grid without the overhead of running server processes. Use this for quick overviews, then open 3D viewers for the specific parts you want to inspect in detail.

**Clean Up Resources**: Always terminate 3D viewers when done. Each viewer holds system resources (ports, memory, CPU) until explicitly closed:

.. code-block:: python

    # Manual cleanup
    for v in viewers_3d:
        v.terminate()

    # Or use context manager
    with CADViewer() as viewer:
        # ... use viewer ...
        pass
    # Auto-cleanup

The context manager (``with`` statement) is safer because it guarantees cleanup even if your code raises an exception.
Manual cleanup with ``terminate()`` works but requires discipline: it's easy to forget, especially during interactive experimentation in Jupyter notebooks.

**Filter Before Visualizing**: Reduce your data before creating visualizations. Don't visualize files that don't have the required resources:

.. code-block:: python

    # Filter first
    filtered_ids = viewer.filter_by_availability(
        all_ids,
        require_png=True
    )

    # Then visualize
    viewer.show_preview_as_image(filtered_ids, k=25)

This prevents errors from missing files and avoids wasting time trying to display files that don't exist. The ``filter_by_availability()`` method checks which file IDs actually have PNG/SCS files available, so you only work with valid data.

Color Scheme Guidelines
------------------------

Choosing the right colors for your visualizations isn't just aesthetic; it affects how quickly you can interpret results and spot patterns in your data.

**Use Reserved Colors for Important Labels**: Certain labels deserve specific colors for consistency and clarity:

.. code-block:: python

    palette = ColorPalette.from_labels(
        labels,
        reserved_colors={
            0: (200, 200, 200),  # Gray for background/no-label
            1: (255, 0, 0)       # Red for critical features
        }
    )

The reserved colors ensure that label 0 (typically background or "no feature") always appears gray, and label 1 (perhaps a critical feature type) always appears red. This consistency helps you quickly scan visualizations: you immediately know "gray = background, red = important thing to check."

**Choose Appropriate Colormaps**: Different colormaps work better for different types of data:

.. code-block:: python

    # Discrete labels (feature types): Use distinct colors
    palette = ColorPalette.from_labels(labels, colormap='tab20')  # 20 distinct colors
    # or 'Set3', 'Paired' for smaller label sets

    # Sequential data (continuous values): Use gradients
    palette = ColorPalette.from_labels(labels, colormap='viridis')  # Dark to bright
    # or 'plasma', 'cividis' for perceptually uniform gradients

    # Diverging data (values around a center): Use opposing colors
    palette = ColorPalette.from_labels(labels, colormap='RdBu')  # Red-white-blue
    # or 'coolwarm' for warm-to-cool transition

    # Many labels (10+): Use full spectrum
    palette = ColorPalette.from_labels(labels, colormap='hsv')  # Full hue range
    # Warning: Colors may be hard to distinguish with many labels

The colormap choice affects how easily you can distinguish different labels. 'tab20' and 'Paired' are qualitative colormaps with visually distinct colors, making it easy to tell labels apart. 'viridis' and 'plasma' show smooth progressions (useful for representing continuous values discretized into bins). 'RdBu' and 'coolwarm' emphasize differences from a midpoint (useful for showing deviations). 'hsv' covers the full color spectrum, but neighboring hues become hard to tell apart as the number of labels grows.

- **Discrete labels**: 'tab20', 'Set3', 'Paired'
- **Sequential data**: 'viridis', 'plasma', 'cividis'
- **Diverging data**: 'RdBu', 'coolwarm'
- **Many labels**: 'hsv' (but can be hard to distinguish)

**Exclude Background from Visualization**: Don't waste colors on background faces; they clutter the visualization and make actual features harder to see:

.. code-block:: python

    groups = group_predictions_by_label(
        predictions,
        palette,
        exclude_labels={0}  # Don't color background
    )

By excluding label 0 (background), those faces stay their default color (typically light gray from the base geometry). This keeps the focus on actual features. You immediately see which faces the model classified as features, while background faces fade into... well, the background.
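Conceptually, grouping with an exclusion set amounts to collecting face indices per predicted label and skipping the excluded labels. The sketch below illustrates that idea in plain Python. ``group_face_indices`` is a hypothetical stand-in written for this illustration; it is not the implementation of ``group_predictions_by_label()``, which also attaches the palette's color and description to each group.

.. code-block:: python

    from collections import defaultdict


    def group_face_indices(predictions, exclude_labels=frozenset()):
        """Map each predicted label to the list of face indices carrying it.

        Hypothetical illustration only; not the hoops_ai implementation.
        """
        groups = defaultdict(list)
        for face_index, label in enumerate(predictions):
            if label in exclude_labels:
                continue  # excluded faces are skipped and keep their default color
            groups[label].append(face_index)
        return dict(groups)


    # Toy predictions for six faces: 0 = background, 17 = through hole, 23 = pocket
    predictions = [0, 17, 17, 0, 23, 17]
    groups = group_face_indices(predictions, exclude_labels={0})
    print(groups)  # {17: [1, 2, 5], 23: [4]}

The excluded label never appears as a key, so the faces at indices 0 and 3 are left untouched by any subsequent coloring step.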
Integration Patterns
------------------------

The real power of the Insights module comes from combining its components in systematic workflows. Here are the three most common patterns you'll use.

Pattern 1: Explore → Filter → Visualize
----------------------------------------

This is the fundamental pattern for dataset exploration. You start by understanding what's in your dataset, filter down to interesting subsets, then visualize those subsets:

.. code-block:: python

    # 1. Explore dataset
    explorer = DatasetExplorer(flow_output_file="dataset.flow")
    explorer.print_table_of_contents()

    # 2. Filter files
    interesting_files = explorer.get_file_list(
        group="graph",
        where=lambda ds: ds['num_nodes'] > 30
    )

    # 3. Visualize
    viewer = DatasetViewer.from_explorer(explorer)
    viewer.show_preview_as_image(interesting_files, k=25)

The workflow: First, call ``print_table_of_contents()`` to see what data is available in your dataset: what groups exist, what attributes you can filter on, and how many files you have. This gives you the lay of the land. Next, define a filter condition (here: parts with more than 30 graph nodes, indicating higher geometric complexity). The ``get_file_list()`` method applies this filter and returns matching file IDs. Finally, create a viewer from the same explorer (this automatically extracts visualization paths) and display the filtered results as an image grid. You've gone from "I have 10,000 parts" to "here are the 150 complex ones, and this is what they look like."

Pattern 2: Analyze → Sample → Inspect
--------------------------------------

When you want to understand distributions and examine representative samples from different ranges:

.. code-block:: python

    # 1. Analyze distribution
    dist = explorer.create_distribution(key="num_nodes", bins=5)

    # 2. Sample from specific bin
    high_complexity_bin = dist['file_id_codes_in_bins'][-1]  # Last bin
    sample = high_complexity_bin[:10]

    # 3. Inspect in 3D
    viewers = viewer.show_preview_as_3d(sample, k=3)

The workflow: Use ``create_distribution()`` to bin your data by some attribute (here: number of graph nodes). This tells you how your parts are distributed: do you have mostly simple parts with a few complex ones, or is it evenly distributed? The distribution returns bins with file IDs in each bin. You can sample from specific bins; for example, the last bin contains the most complex parts. Take a sample from that bin (here: 10 files). Then create 3D viewers for a few of those samples (here: 3 viewers) to visually inspect what "high complexity" actually looks like in your dataset. This helps you understand whether your filters are capturing what you want.

Pattern 3: Predict → Visualize → Validate
------------------------------------------

This is the ML prediction workflow: run your model, visualize results on 3D geometry, and validate predictions visually:

.. code-block:: python

    # `model`, `test_data`, and `palette` are assumed to exist already

    # 1. Run predictions (from ML model)
    predictions = model.predict(test_data)

    # 2. Group by prediction
    groups = group_predictions_by_label(predictions, palette)

    # 3. Visualize on geometry
    cad_viewer = CADViewer()
    cad_viewer.load_cad_file("test_part.step")
    cad_viewer.show()
    cad_viewer.color_faces_by_groups(groups)

    # 4. Validate visually and correct if needed

The workflow: After training your model, run inference on test data to get predictions (one label per face). Create a ColorPalette mapping labels to colors, and use ``group_predictions_by_label()`` to organize faces by their predicted labels. Now comes validation: visually inspect the colored model. Do the predictions make sense? Are holes colored red as expected? If you see something wrong, you can click faces to get their predictions and understand what the model is doing. This visual feedback is crucial for debugging ML models: you immediately see when the model confuses similar features or misses edge cases.

.. _viz-troubleshooting:

Troubleshooting Common Issues
------------------------------

**Port already in use:** If you see errors about the port being unavailable, another process is using that port. Either let CADViewer pick a free port automatically or, on Windows PowerShell, find and kill the process holding the port:

.. code-block:: python

    # Problem: Specified port is busy
    viewer = CADViewer(port=8000)  # Error if port 8000 is busy

    # Solution 1: Use auto port selection (recommended)
    viewer = CADViewer()  # Auto-finds free port

    # Solution 2: Find and kill process using port
    # Windows PowerShell:
    #   netstat -ano | findstr :8000
    #   taskkill /F /PID <PID>

**3D viewer not displaying:** If the viewer window doesn't appear or you see a blank iframe, check if ``hoops-viewer`` is available in your environment:

.. code-block:: python

    # Check if hoops-viewer is installed
    from hoops_ai.insights.hoops_viewer_interface import is_viewer_available

    if not is_viewer_available():
        print("Install hoops-viewer: pip install hoops-viewer")

**Missing PNG or SCS files:** When using DatasetViewer with files from a dataset, you might encounter missing files if the dataset creation didn't generate all expected outputs. Filter your file list to only include files with available visualizations:

.. code-block:: python

    # Check availability
    stats = viewer.get_statistics()
    print(f"PNG coverage: {stats['png_percentage']:.1f}%")
    print(f"3D coverage: {stats['3d_percentage']:.1f}%")

    # Filter to available files only
    available = viewer.filter_by_availability(
        all_ids,
        require_png=True,
        require_3d=True
    )

This prevents errors from trying to display non-existent files. If most of your files are missing, check your dataset creation flow; the PNG and SCS conversion tasks might have failed or been skipped.

**Image grid not displaying in Jupyter:** Make sure the inline Matplotlib backend is active before creating the grid:

.. code-block:: python

    # Ensure the inline backend is active (IPython/Jupyter magic)
    %matplotlib inline

    import matplotlib.pyplot as plt
    plt.ion()  # Interactive mode

    # Then create visualization
    fig = viewer.show_preview_as_image(file_ids)
    plt.show()  # Explicitly show if needed

Summary
========

The **Insights** module provides a complete visualization solution for CAD datasets:

**DatasetViewer**

- ✅ Batch visualization of query results
- ✅ Image grids for quick overview
- ✅ Interactive 3D for detailed inspection
- ✅ Seamless DatasetExplorer integration
- ✅ Side-by-side comparison

**CADViewer**

- ✅ Interactive 3D viewing in notebooks
- ✅ Face coloring and selection
- ✅ ML prediction visualization
- ✅ Multiple display modes
- ✅ Automatic resource management

**Visualization Utils**

- ✅ ColorPalette for label-color management
- ✅ Automatic color generation
- ✅ Prediction grouping utilities
- ✅ Matplotlib colormap integration

**Typical Workflows**

.. code-block:: text

    DatasetExplorer → Filter Files → DatasetViewer → Image Grid / 3D Views
                                          ↓
                                      CADViewer → Face Coloring → Visual Analysis

The Insights module transforms data analysis into visual understanding, making it easy to explore large CAD datasets, validate ML predictions, and communicate findings effectively.

Next Steps
==========

You now understand how to visualize CAD data using the Insights module. Here's what to explore next:

**Explore datasets** - The :doc:`/programming_guide/explore-dataset` guide shows you how to structure and query your CAD datasets. You'll learn about the ``.flow`` format, how DatasetExplorer works, and what filtering operations are available. This is essential for understanding what data you can visualize.

**Build ML workflows** - Combine visualization with model training using the :doc:`/programming_guide/train` guide.
You'll see how to train classifiers on CAD features, run inference to generate predictions, and use the visualization tools from this guide to display results on 3D geometry.

**Interactive analysis** - The :doc:`/programming_guide/cad-data-encoding` guide explains how features are extracted from CAD files. Understanding the encoding process helps you interpret what your visualizations are showing and why certain faces get classified as specific feature types.

**Hands-on examples** - Try the complete tutorials in :doc:`/tutorials/index` to see end-to-end workflows combining data loading, model training, and result visualization.

.. seealso::

   - :doc:`/programming_guide/datasets` - Dataset exploration and querying
   - :doc:`/programming_guide/train` - ML model training and inference
   - :doc:`/api_ref/hoops_ai.insights` - Complete Insights API reference
   - :doc:`/tutorials/index` - End-to-end visualization examples