3. HOOPS AI - Minimal ETL Demo

This notebook demonstrates the core features of the HOOPS AI data engineering workflow:

Key Components

  • Schema-Based Dataset Organization: Define structured data schemas for consistent data merging

  • Parallel Task Decorators: Simplify CAD processing with type-safe task definitions

  • Generic Flow Orchestration: Automatically handle task dependencies and data flow (a minimal sketch of this pattern follows the list)

  • Automatic Dataset Merging: Process multiple files into a unified dataset structure

  • Integrated Exploration Tools: Analyze and prepare data for ML workflows

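To make the decorator-and-flow idea concrete before opening the notebook, the sketch below shows the general pattern in plain Python. It is an illustration only: the Flow class, the task decorator, and the file names are hypothetical placeholders and do not reflect the actual hoops_ai API. The real decorators, schema definitions, and dataset-merging utilities are demonstrated in the notebook itself.

    # Illustrative sketch only -- Flow, flow.task, and the .step file names
    # below are hypothetical and are NOT the hoops_ai API.
    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class Flow:
        """Minimal runner mimicking a generic flow orchestrator."""
        tasks: List[Callable] = field(default_factory=list)

        def task(self, fn: Callable) -> Callable:
            # Register a processing step; tasks run in declaration order.
            self.tasks.append(fn)
            return fn

        def run(self, files: List[str]) -> List[Dict]:
            # Run every registered task on every file and merge the
            # per-task outputs into one record per file.
            results = []
            for path in files:
                record: Dict = {"file": path}
                for fn in self.tasks:
                    record.update(fn(path))
                results.append(record)
            return results

    flow = Flow()

    @flow.task
    def extract_metadata(path: str) -> Dict:
        # Placeholder for CAD metadata extraction.
        return {"name": path.rsplit("/", 1)[-1]}

    @flow.task
    def compute_features(path: str) -> Dict:
        # Placeholder for geometric feature computation.
        return {"num_faces": 0}

    merged = flow.run(["part_a.step", "part_b.step"])  # one unified record per file
    print(merged)

The HOOPS AI version of this pattern adds what the sketch omits: type-safe task definitions, parallel execution across files, and schema-driven merging into a single dataset ready for exploration.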
Run the notebook within the hoops_ai_cpu environment described in Evaluate & Install.

The code and resources for this tutorial can be found in the HOOPS-AI-Tutorials GitHub repository.

Hint

Launch jupyter lab notebooks/3a_ETL_pipeline_using_flow.ipynb from the bundle root to experiment with the sample workflow.