hoops_ai.ml.context_layer.JsonContextProvider

class hoops_ai.ml.context_layer.JsonContextProvider(path, initial=None, *, create=True, store_path=None, id_filter=None, identifier_fields=('part_id',))

Bases: ContextProvider

JSON-backed ContextProvider for lightweight persistent metadata.

path may point to either:

  1. A store-shaped JSON file whose root is {part_id: payload}.

  2. A directory containing many single-part JSON files.

Directory mode is useful for source metadata corpora where each part has one JSON file. The source files are read, wrapped into a store-shaped mapping, and persisted to store_path. Source files are not modified.
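The directory-mode behaviour described above can be sketched without the library. This is a minimal illustration, not the JsonContextProvider implementation: wrap_directory is a hypothetical helper, and both the way the part id is chosen (first identifier field present in the payload) and the fallback to the file stem are assumptions.

```python
import json
import tempfile
from pathlib import Path


def wrap_directory(source_dir, store_path, identifier_fields=("part_id",)):
    """Hypothetical helper: build {part_id: payload} from single-part JSON files."""
    store = {}
    for file in sorted(Path(source_dir).glob("*.json")):
        payload = json.loads(file.read_text(encoding="utf-8"))
        # Assumption: use the first identifier field found in the payload,
        # and fall back to the file stem when none is present.
        part_id = next(
            (str(payload[field]) for field in identifier_fields if field in payload),
            file.stem,
        )
        store[part_id] = payload
    # Source files are only read; the wrapped mapping is written to store_path.
    Path(store_path).write_text(json.dumps(store, indent=2), encoding="utf-8")
    return store


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "parts"
    src.mkdir()
    (src / "a.json").write_text(json.dumps({"part_id": "P-001", "material": "steel"}))
    (src / "b.json").write_text(json.dumps({"mass_kg": 1.5}))
    store_file = Path(tmp) / "store.json"
    wrapped = wrap_directory(src, store_file)
    persisted = json.loads(store_file.read_text(encoding="utf-8"))
```

The resulting store file has the same {part_id: payload} shape as option 1, so it could in principle be passed back as path on a later run.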

Parameters:
  • path (str | Path)

  • initial (Mapping[str, dict] | None)

  • create (bool)

  • store_path (str | Path | None)

  • id_filter (Callable[[str], bool] | None)

  • identifier_fields (Sequence[str])

describe_for(part_ids)

Summarize key availability across part_ids.

Issues a single get_contexts() call and aggregates {observed, missing, coverage} per top-level key.

coverage = observed / len(part_ids). Ids that the store does not resolve count as missing for every key (consistent with get_contexts omitting unknown ids).

Partners with a cheaper schema endpoint can override this method and skip the get_contexts round-trip entirely.

Parameters:

part_ids (Sequence[str])

Return type:

dict[str, dict]
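The aggregation that describe_for performs can be approximated as follows. This is a sketch under stated assumptions, not the library's code: describe_for_sketch is a hypothetical function, it takes the result of one get_contexts(part_ids) call as input, it assumes part_ids contains no duplicates, and it counts a key as observed whenever it appears in a part's dict, whatever its value.

```python
from typing import Mapping, Sequence


def describe_for_sketch(contexts: Mapping[str, dict], part_ids: Sequence[str]) -> dict:
    """contexts is assumed to be the result of one get_contexts(part_ids) call."""
    total = len(part_ids)
    observed = {}
    for part_id in part_ids:
        # Unresolved ids are absent from contexts, so they add to no key's
        # observed count and end up counted as missing for every key.
        for key in contexts.get(part_id, {}):
            observed[key] = observed.get(key, 0) + 1
    return {
        key: {
            "observed": count,
            "missing": total - count,
            "coverage": count / total if total else 0.0,
        }
        for key, count in observed.items()
    }


contexts = {
    "p1": {"material": "steel", "mass_kg": 1.2},
    "p2": {"material": "aluminium"},
}
summary = describe_for_sketch(contexts, ["p1", "p2", "p3"])
```

Here "p3" does not resolve, so material is observed for 2 of 3 ids and mass_kg for 1 of 3.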

get_contexts(part_ids)

Return {part_id: attribute_dict} for ids that resolve.

Missing ids must be omitted from the result (not returned with None). Each value is a free-form dict whose top-level keys match the keys passed to ContextPredictor.infer(..., keys=[...]).

Implementations must treat this as the primary API and issue a single batched call against the backend, not one call per id in a loop.

Parameters:

part_ids (Sequence[str])

Return type:

Mapping[str, dict]
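The omission contract for get_contexts is easy to get wrong, so a small stand-in may help. InMemoryProvider below is hypothetical and is not the hoops_ai class; it only shows the return shape. With a real backend the lookup would be one batched query rather than the in-memory comprehension used here.

```python
from typing import Mapping, Sequence


class InMemoryProvider:
    """Hypothetical stand-in used only to illustrate the return contract."""

    def __init__(self, store: Mapping[str, dict]):
        self._store = dict(store)

    def get_contexts(self, part_ids: Sequence[str]) -> Mapping[str, dict]:
        # Unknown ids are skipped entirely; they are never mapped to None.
        return {pid: self._store[pid] for pid in part_ids if pid in self._store}


provider = InMemoryProvider({"p1": {"material": "steel"}, "p2": {"mass_kg": 0.4}})
result = provider.get_contexts(["p1", "p3"])
```

Callers can therefore detect unresolved ids with a plain membership test such as "p3" in result.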

list_numeric_keys()

Return the metadata keys that should be treated as numeric.

ContextPredictor consults this list to route unmapped keys to its numerical rule. Keys not returned here are treated as categorical. The default returns () (everything is categorical) so existing partner subclasses keep working unchanged; subclasses with a real schema should override.

Per-key rules supplied to ContextPredictor always win over this list, so callers can still override on a per-call basis without changing the provider.

Return type:

Sequence[str]
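The precedence described above (per-key rules first, then the numeric list, then the categorical default) can be sketched as a small lookup. resolve_rule is a hypothetical function, and the rule names "numerical" and "categorical" are placeholder strings chosen for illustration; how ContextPredictor represents rules internally is not stated here.

```python
from typing import Mapping, Sequence


def resolve_rule(key: str, per_key_rules: Mapping[str, str], numeric_keys: Sequence[str]) -> str:
    """Hypothetical precedence: explicit rule, then numeric list, else categorical."""
    if key in per_key_rules:
        # A per-key rule supplied by the caller always wins.
        return per_key_rules[key]
    return "numerical" if key in numeric_keys else "categorical"


numeric_keys = ("mass_kg", "volume_mm3")   # what list_numeric_keys() might return
rules = {"mass_kg": "categorical"}         # caller-supplied per-key override
routed = {
    key: resolve_rule(key, rules, numeric_keys)
    for key in ("mass_kg", "volume_mm3", "material")
}
```

In this example mass_kg is listed as numeric by the provider but is still routed to the categorical rule because the caller overrode it.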

set_contexts(updates)

Persist one update per part_id.

Merge semantics, durability, and authentication are the implementation’s choice. The predictor never calls this method.

Parameters:

updates (Mapping[str, dict])

Return type:

None
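Because merge semantics are left to the implementation, the following shows one possible choice rather than what JsonContextProvider does. set_contexts_sketch is a hypothetical function that assumes a store-shaped JSON file, applies a shallow per-part merge in which incoming keys overwrite existing ones, and writes through a temporary file before replacing the store.

```python
import json
import tempfile
from pathlib import Path
from typing import Mapping


def set_contexts_sketch(store_file: Path, updates: Mapping[str, dict]) -> None:
    """Hypothetical shallow-merge persist into a {part_id: payload} JSON file."""
    if store_file.exists():
        store = json.loads(store_file.read_text(encoding="utf-8"))
    else:
        store = {}
    for part_id, payload in updates.items():
        # Assumed policy: shallow merge, incoming keys overwrite existing ones.
        store.setdefault(part_id, {}).update(payload)
    # Write to a sibling file first, then swap it in, so a failed write
    # does not leave a truncated store behind.
    tmp_file = store_file.with_suffix(".tmp")
    tmp_file.write_text(json.dumps(store, indent=2), encoding="utf-8")
    tmp_file.replace(store_file)


with tempfile.TemporaryDirectory() as tmp:
    store_file = Path(tmp) / "store.json"
    store_file.write_text(json.dumps({"p1": {"material": "steel"}}), encoding="utf-8")
    set_contexts_sketch(store_file, {"p1": {"mass_kg": 1.2}, "p2": {"material": "brass"}})
    merged = json.loads(store_file.read_text(encoding="utf-8"))
```

A replace-whole-payload policy would be equally valid under the contract; the shallow merge is used here only because it preserves keys the update does not mention.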