hoops_ai.ml.context_layer

Quick Overview

Classes

AggregationRule()

Base class for context-aggregation rules.

CategoricalRule([temperature, score_floor, ...])

Predict categorical context by exponential score-weighting with consensus guard.

ContextPrediction(value, confidence[, ...])

A single predicted context value with confidence.

ContextPredictor(context_provider[, ...])

Predicts engineering context (metadata) for a query part from its nearest neighbors.

ContextProvider()

InMemoryContextProvider([initial, numeric_keys])

NearestNeighborRule([threshold])

Use the top-similarity hit's value when shape similarity is high enough.

NumericWeightedRule([log_scale, ...])

Predict continuous numeric context (e.g. cost, weight, time) from neighbors.

RelevanceWeighter(factors)

Adjusts hit scores based on metadata agreement with a query context.

JsonContextProvider(path[, initial, create, ...])

JSON-backed ContextProvider for lightweight persistent metadata.

Context Layer Module

Predicts engineering context (metadata) for query parts based on similarity search results.

ContextPredictor consumes a list of VectorHit objects from CADSearch and predicts metadata values (e.g. material, category) by analyzing what metadata is present in the nearest neighbors and how strong each match is.

Usage:

from hoops_ai.ml.context_layer import (
    ContextPredictor, InMemoryContextProvider,
    CategoricalRule, NumericWeightedRule,
)

store = InMemoryContextProvider(
    {"part-a": {"Material": "Steel", "Cost": 12.5}},
    numeric_keys=["Cost"],
)

# Zero-config: defaults are CategoricalRule + NumericWeightedRule.
# Dispatch is automatic via store.list_numeric_keys().
predictor = ContextPredictor(context_provider=store)
# `hits` is the list of VectorHit objects returned by CADSearch.search_by_shape.
predictions = predictor.infer(hits, keys=["Material", "Cost"])
# predictions["Material"] / predictions["Cost"] are ContextPrediction
# objects with .value/.confidence.

# Customise either default or override per-key:
predictor = ContextPredictor(
    context_provider=store,
    default_categorical_rule=CategoricalRule(min_margin=0.2),
    default_numerical_rule=NumericWeightedRule(log_scale=True),
    per_key_rules={"Material": CategoricalRule(min_confidence=0.8)},
)
class hoops_ai.ml.context_layer.AggregationRule

Bases: ABC

Base class for context-aggregation rules.

A rule turns a list of (value, score) pairs harvested from a single context key into one ContextPrediction. Subclasses must implement predict(); predict_with_context() is optional and defaults to ignoring the extra arguments and delegating to predict.

Override predict_with_context() when the rule wants to consume the query’s own attributes (query_context) or per-hit metadata (hits). NumericWeightedRule does this to run an internal RelevanceWeighter and/or an MLP without forcing the predictor to know about either.

Override bind() to receive a one-shot reference to the ContextProvider at predictor construction. The default is a no-op so most rules can ignore it.

abstract predict(values, scores, key)

Predict a context value from neighbor evidence.

Parameters:
  • values (list[Any]) – The context values collected from hits for this key, ordered by relevance (best match first).

  • scores (list[float]) – Similarity scores corresponding 1-to-1 with values (higher = more similar).

  • key (str) – The context key being inferred.

Returns:

ContextPrediction or None if insufficient evidence.

Return type:

ContextPrediction | None

predict_with_context(values, scores, key, query_context=None, *, hits=None)

Predict with optional query context and per-hit metadata.

Default implementation ignores query_context and hits and delegates to predict(). Override in subclasses that benefit from one or both.

Return type:

ContextPrediction | None

class hoops_ai.ml.context_layer.CategoricalRule(temperature=10.0, score_floor=0.0, min_confidence=0.0, min_margin=0.1, top1_dominance_threshold=1.0, top1_dominance_weight=0.85, min_top1_margin_for_dominance=0.08, normalize=None)

Bases: AggregationRule

Predict categorical context by exponential score-weighting with consensus guard.

Combines three key ideas:

  1. Exponential sharpening: top-scoring hits dominate the vote, making it safe to use large candidate pools (top-50+) without noise dilution from low-scoring neighbors.

  2. Top-1 exact-match dominance: when the best hit is a near-duplicate of the query (score ≥ top1_dominance_threshold) and clearly beats the runner-up (margin ≥ min_top1_margin_for_dominance), the rule short-circuits the softmax and gives that hit a fixed weight (top1_dominance_weight), splitting the remainder across the tail. This handles the “the query is already in the index” / self-hit case deterministically instead of letting softmax smear the result.

  3. Margin-based abstention: refuses to predict when the winner’s lead over the runner-up is too slim, avoiding unreliable coin-flip predictions in production pipelines.

weight_i = exp(temperature × score_i)                # softmax path
weights  = [w_top1, (1 - w_top1) * softmax(rest)]    # top-1 dominance path

A prediction is made only when:

  • The winner’s weighted share ≥ min_confidence

  • The gap between winner and runner-up ≥ min_margin

Special cases:

  • temperature=0 → uniform weighting (all hits equal)

  • min_margin=0 → always predicts (no consensus guard)

  • score_floor > 0 → discards distant hits before voting

  • top1_dominance_threshold >= 1.0 (default) → dominance disabled, pure softmax
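The softmax path and the two abstention checks can be illustrated with a few lines of plain Python. This is a minimal sketch under stated assumptions, not the library's implementation: the function name softmax_vote is hypothetical, the top-1 dominance path is left out, and the margin is taken as the difference between the two largest vote shares.

```python
import math
from collections import defaultdict


def softmax_vote(values, scores, temperature=10.0, score_floor=0.0,
                 min_confidence=0.0, min_margin=0.1):
    """Return (winner, share), or None when the sketch abstains."""
    # Discard hits whose score is below the floor before voting.
    kept = [(v, s) for v, s in zip(values, scores) if s >= score_floor]
    if not kept:
        return None
    # Subtracting the max score keeps exp() stable; the shares are unchanged.
    s_max = max(s for _, s in kept)
    weights = [math.exp(temperature * (s - s_max)) for _, s in kept]
    total = sum(weights)
    shares = defaultdict(float)
    for (value, _), weight in zip(kept, weights):
        shares[value] += weight / total
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    winner, top_share = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    # Abstain on a weak winner or a race that is too tight.
    if top_share < min_confidence or (top_share - runner_up) < min_margin:
        return None
    return winner, top_share


clear = softmax_vote(["Steel", "Steel", "Aluminum"], [0.95, 0.90, 0.60])
tie = softmax_vote(["Steel", "Aluminum"], [0.90, 0.90])
uniform = softmax_vote(["A", "A", "B"], [0.9, 0.1, 0.5], temperature=0.0)
```

With the default temperature the two Steel hits take nearly all of the mass, the evenly matched pair abstains, and temperature=0 reduces to a plain head count.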

Parameters:
  • temperature (float) – Controls exponential sharpness. Higher values focus more weight on top hits. Default 10.0.

  • score_floor (float) – Hits below this score are discarded entirely. Default 0.0.

  • min_confidence (float) – Minimum weighted vote share for a valid prediction. Predictions below this return None (abstain). Default 0.0.

  • min_margin (float) – Minimum confidence gap between winner and runner-up. If the race is tighter than this, returns None. Default 0.1.

  • top1_dominance_threshold (float) – Score at/above which the best hit is treated as a near-duplicate of the query and assigned top1_dominance_weight directly. Use 0.99 to enable. Default 1.0 (disabled).

  • top1_dominance_weight (float) – Weight assigned to the top-1 hit when dominance triggers. The remaining hits share 1 - this via softmax. Default 0.85.

  • min_top1_margin_for_dominance (float) – Minimum score gap between top-1 and top-2 for dominance to trigger. Prevents triggering when the second hit is also near-duplicate. Default 0.08.

  • normalize (Callable[[Any], Any] | Mapping[str, Callable[[Any], Any]] | None) – Optional callable mapping raw values to a canonical form for grouping (e.g. "steel 1018" and "S235JR" → "steel"). The rule groups by normalize(value) and reports the most frequent raw form per group as the prediction’s value. Default None (no normalization).
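The grouping behaviour described for normalize can be sketched as follows. The helper group_by_canonical and the lookup table inside to_family are hypothetical and exist only for illustration; the sketch shows grouping by the canonical form while reporting the most frequent raw spelling of each group.

```python
from collections import Counter, defaultdict


def group_by_canonical(values, normalize):
    """Group raw values by normalize(value); keep each group's most frequent raw form."""
    groups = defaultdict(Counter)
    for raw in values:
        groups[normalize(raw)][raw] += 1
    return {canon: counts.most_common(1)[0][0] for canon, counts in groups.items()}


def to_family(raw):
    # Hypothetical mapping from material grades to a material family.
    table = {"steel 1018": "steel", "s235jr": "steel", "al 6061": "aluminum"}
    cleaned = raw.strip().lower()
    return table.get(cleaned, cleaned)


representative = group_by_canonical(
    ["Steel 1018", "S235JR", "Steel 1018", "Al 6061"], to_family
)
```

Here both steel grades vote together, and the group is reported under its most common raw spelling.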

predict(values, scores, key)

Predict a context value from neighbor evidence.

Parameters:
  • values (list[Any]) – The context values collected from hits for this key, ordered by relevance (best match first).

  • scores (list[float]) – Similarity scores corresponding 1-to-1 with values (higher = more similar).

  • key (str) – The context key being inferred.

Returns:

ContextPrediction or None if insufficient evidence.

Return type:

ContextPrediction | None

predict_with_context(values, scores, key, query_context=None, *, hits=None)

Predict with optional query context and per-hit metadata.

Default implementation ignores query_context and hits and delegates to predict(). Override in subclasses that benefit from one or both.

Return type:

ContextPrediction | None

class hoops_ai.ml.context_layer.ContextPrediction(value, confidence, evidence_count=0, alternatives=<factory>, status=None, reasons=<factory>, coverage=None, evidence=None, injected_context=None)

Bases: object

A single predicted context value with confidence.

Parameters:
value

The inferred value for the context key.

Type:

Any

confidence

Probability/confidence in [0.0, 1.0].

Type:

float

evidence_count

Number of hits that contributed to this prediction.

Type:

int

alternatives

Runner-up predictions sorted by confidence descending.

Type:

list[dict[str, Any]]

status

Optional decision label — one of ready_to_propose, needs_review, or insufficient_evidence. None when the caller has not requested status evaluation.

Type:

str | None

reasons

Optional human-readable reasons explaining the status (empty when status is ready_to_propose or None).

Type:

list[str]

coverage

Optional coverage diagnostics about the hit pool that produced this prediction. Keys include observed_hits, total_hits, observed_score_weight, missing_score_weight. Successful predictions from ContextPredictor.infer include coverage; insufficient sentinels include coverage when status/evidence output was requested.

Type:

dict[str, Any] | None

evidence

Optional per-hit contribution records (rank, score, value, source). Populated only when return_evidence=True is passed to ContextPredictor.infer.

Type:

list[dict[str, Any]] | None

injected_context

Optional snapshot of the query_context the ContextPredictor actually fed to the rule for this key. Populated for numeric keys when ContextPredictor.infer forwards earlier ready_to_propose categorical predictions (or the caller’s query_context) so downstream rules can re-rank neighbors. None when no context was injected.

Type:

dict[str, Any] | None

alternatives: list[dict[str, Any]]
confidence: float
coverage: dict[str, Any] | None = None
evidence: list[dict[str, Any]] | None = None
evidence_count: int = 0
injected_context: dict[str, Any] | None = None
reasons: list[str]
status: str | None = None
value: Any
class hoops_ai.ml.context_layer.ContextPredictor(context_provider, default_categorical_rule=None, default_numerical_rule=None, per_key_rules=None)

Bases: object

Predicts engineering context (metadata) for a query part from its nearest neighbors.

Given a list of VectorHit objects (from CADSearch.search_by_shape) and a set of metadata keys to infer, this class fetches per-hit metadata from a ContextProvider, extracts evidence per key, and dispatches each key to an AggregationRule.

Dispatch precedence:

  1. per_key_rules[key] if present.

  2. default_numerical_rule if key in context_provider.list_numeric_keys().

  3. default_categorical_rule otherwise.

Defaults are constructed lazily: CategoricalRule() and NumericWeightedRule() if you do not pass them.
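The dispatch precedence above amounts to a three-step lookup. The sketch below uses a hypothetical helper, pick_rule, and plain strings standing in for rule objects; it only illustrates the ordering and is not the predictor's actual code.

```python
def pick_rule(key, per_key_rules, numeric_keys, numeric_rule, categorical_rule):
    """Choose a rule for `key` following the documented precedence."""
    if key in per_key_rules:          # 1. explicit per-key override
        return per_key_rules[key]
    if key in numeric_keys:           # 2. provider says the key is numeric
        return numeric_rule
    return categorical_rule           # 3. everything else is categorical


per_key = {"InternalFeatures": "nearest-neighbor"}
numeric_keys = ("Cost", "InternalFeatures")
chosen = {
    key: pick_rule(key, per_key, numeric_keys, "numeric-default", "categorical-default")
    for key in ["Material", "Cost", "InternalFeatures"]
}
```

Note that InternalFeatures is listed as numeric but still resolves to its per-key rule, because the override is checked first.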

Example

from hoops_ai.ml.context_layer import (
    ContextPredictor, InMemoryContextProvider,
    CategoricalRule, NumericWeightedRule,
)

store = InMemoryContextProvider(
    {"part-a": {"Material": "Steel", "Cost": 12.5}},
    numeric_keys=["Cost"],
)
predictor = ContextPredictor(context_provider=store)
predictions = predictor.infer(hits, keys=["Material", "Cost"])
# "Material" uses CategoricalRule (default), "Cost" uses NumericWeightedRule (default).

property context_provider: ContextProvider
property default_categorical_rule: AggregationRule
property default_numerical_rule: AggregationRule
infer(hits, keys, query_context=None, *, return_evidence=False, status_policy=<object object>)

Infer context values from a list of search hits.

Metadata is fetched from the configured ContextProvider in a single batched call; VectorHit.metadata is not consulted.

Every key in keys is mapped to a ContextPrediction — the result is never None. When the underlying rule cannot produce a value (margin guard fired, no usable evidence, empty hit list) the returned ContextPrediction carries value=None and status=STATUS_INSUFFICIENT so callers can branch on a single, well-typed object.

status_policy controls how status is gated:

  • Not passed → DEFAULT_STATUS_POLICY is applied. Lenient defaults give every non-abstaining prediction STATUS_READY; abstentions are STATUS_INSUFFICIENT.

  • Mapping → the caller’s policy is used (tighten any of min_observed_hits, min_observed_score_weight, min_top_share, min_margin).

  • Explicitly None → opt out of status evaluation entirely; predictions are still returned (never None) but their status is None and reasons is empty.
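Distinguishing "not passed" from an explicit None requires a sentinel default, which is what the <object object> in the signature suggests. The sketch below shows that pattern in isolation. The names _UNSET, ASSUMED_DEFAULT_POLICY and resolve_policy are hypothetical, the default values are placeholders rather than the real DEFAULT_STATUS_POLICY, and merging a caller mapping over the defaults is an assumption.

```python
_UNSET = object()  # hypothetical sentinel: "argument was not passed"

# Placeholder lenient defaults; the real DEFAULT_STATUS_POLICY values are not shown here.
ASSUMED_DEFAULT_POLICY = {
    "min_observed_hits": 0,
    "min_observed_score_weight": 0.0,
    "min_top_share": 0.0,
    "min_margin": 0.0,
}


def resolve_policy(status_policy=_UNSET):
    """Map the three documented cases onto an effective policy (or None)."""
    if status_policy is _UNSET:
        return dict(ASSUMED_DEFAULT_POLICY)      # not passed: use defaults
    if status_policy is None:
        return None                              # explicit None: skip status evaluation
    # Mapping: assumed to tighten individual thresholds on top of the defaults.
    return {**ASSUMED_DEFAULT_POLICY, **status_policy}


default_policy = resolve_policy()
opted_out = resolve_policy(None)
tightened = resolve_policy({"min_margin": 0.2})
```

A plain None default could not express the opt-out case, since it would be indistinguishable from omitting the argument.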

When keys mixes categorical and numeric entries, categoricals are inferred first and any prediction reaching STATUS_READY (ready_to_propose) is merged into the query_context passed to numeric rules. Numeric ContextPrediction objects carry the merged context on their injected_context field so callers can see what conditioned the numeric estimate. The output dict preserves the caller’s key order.

Return type:

dict[str, ContextPrediction]

property per_key_rules: Mapping[str, AggregationRule]
class hoops_ai.ml.context_layer.ContextProvider

Bases: ABC

describe_for(part_ids)

Summarize key availability across part_ids.

Issues a single get_contexts() call and aggregates {observed, missing, coverage} per top-level key.

coverage = observed / len(part_ids). Ids that the store does not resolve count as missing for every key (consistent with get_contexts omitting unknown ids).

Partners with a cheaper schema endpoint can override this method and skip the get_contexts round-trip entirely.
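The per-key summary can be sketched from a get_contexts-style mapping. The function name describe is hypothetical, and the sketch assumes that a key counts as observed whenever it is present in a part's dict; how the real method treats None values is not stated.

```python
def describe(contexts, part_ids):
    """Summarize key availability; `contexts` omits ids the store cannot resolve."""
    total = len(part_ids)
    keys = {key for pid in part_ids for key in contexts.get(pid, {})}
    summary = {}
    for key in sorted(keys):
        observed = sum(1 for pid in part_ids if key in contexts.get(pid, {}))
        summary[key] = {
            "observed": observed,
            "missing": total - observed,
            "coverage": observed / total if total else 0.0,
        }
    return summary


contexts = {
    "part-a": {"Material": "Steel", "Cost": 12.5},
    "part-b": {"Material": "Aluminum"},
}
# "part-c" is unknown to the store, so it counts as missing for every key.
summary = describe(contexts, ["part-a", "part-b", "part-c"])
```

Material is observed on two of three ids and Cost on one, so their coverage values are 2/3 and 1/3.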

Parameters:

part_ids (Sequence[str])

Return type:

dict[str, dict]

abstract get_contexts(part_ids)

Return {part_id: attribute_dict} for ids that resolve.

Missing ids must be omitted from the result (not returned with None). Each value is a free-form dict whose top-level keys match the keys passed to ContextPredictor.infer(..., keys=[...]).

Implementations must treat this as the primary API and issue a single batched call against the backend — not a loop.

Parameters:

part_ids (Sequence[str])

Return type:

Mapping[str, dict]

list_numeric_keys()

Return the metadata keys that should be treated as numeric.

ContextPredictor consults this list to route un-mapped keys to its numerical rule. Keys not returned here are treated as categorical. The default returns () (everything is categorical) so existing partner subclasses keep working unchanged; subclasses with a real schema should override.

Per-key rules supplied to ContextPredictor always win over this list, so callers can still override on a per-call basis without changing the provider.

Return type:

Sequence[str]

abstract set_contexts(updates)

Persist one update per part_id.

Merge semantics, durability, and authentication are the implementation’s choice. The predictor never calls this method.

Parameters:

updates (Mapping[str, dict])

Return type:

None

class hoops_ai.ml.context_layer.InMemoryContextProvider(initial=None, *, numeric_keys=None)

Bases: ContextProvider

Parameters:
  • initial (Optional[Mapping[str, dict]])

  • numeric_keys (Optional[Sequence[str]])

describe_for(part_ids)

Summarize key availability across part_ids.

Issues a single get_contexts() call and aggregates {observed, missing, coverage} per top-level key.

coverage = observed / len(part_ids). Ids that the store does not resolve count as missing for every key (consistent with get_contexts omitting unknown ids).

Partners with a cheaper schema endpoint can override this method and skip the get_contexts round-trip entirely.

Parameters:

part_ids (Sequence[str])

Return type:

dict[str, dict]

get_contexts(part_ids)

Return {part_id: attribute_dict} for ids that resolve.

Missing ids must be omitted from the result (not returned with None). Each value is a free-form dict whose top-level keys match the keys passed to ContextPredictor.infer(..., keys=[...]).

Implementations must treat this as the primary API and issue a single batched call against the backend — not a loop.

Parameters:

part_ids (Sequence[str])

Return type:

Mapping[str, dict]

list_numeric_keys()

Return the metadata keys that should be treated as numeric.

ContextPredictor consults this list to route un-mapped keys to its numerical rule. Keys not returned here are treated as categorical. The default returns () (everything is categorical) so existing partner subclasses keep working unchanged; subclasses with a real schema should override.

Per-key rules supplied to ContextPredictor always win over this list, so callers can still override on a per-call basis without changing the provider.

Return type:

Sequence[str]

set_contexts(updates)

Persist one update per part_id.

Merge semantics, durability, and authentication are the implementation’s choice. The predictor never calls this method.

Parameters:

updates (Mapping[str, dict])

Return type:

None

class hoops_ai.ml.context_layer.JsonContextProvider(path, initial=None, *, create=True, store_path=None, id_filter=None, identifier_fields=('part_id',))

Bases: ContextProvider

JSON-backed ContextProvider for lightweight persistent metadata.

path may point to either:

  1. A store-shaped JSON file whose root is {part_id: payload}.

  2. A directory containing many single-part JSON files.

Directory mode is useful for source metadata corpora where each part has one JSON file. The source files are read, wrapped into a store-shaped mapping, and persisted to store_path. Source files are not modified.
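Directory mode can be pictured as a read, wrap and persist step. The sketch below is an illustration only: wrap_directory is a hypothetical helper, and it assumes that the part id is taken from the first identifier field present in each file and that files without one are skipped. The real provider's resolution rules are not documented here.

```python
import json
import tempfile
from pathlib import Path


def wrap_directory(source_dir, store_path, identifier_fields=("part_id",)):
    """Read single-part JSON files and persist a {part_id: payload} store."""
    store = {}
    for file in sorted(Path(source_dir).glob("*.json")):
        payload = json.loads(file.read_text(encoding="utf-8"))
        # Assumption: the first identifier field found names the part.
        part_id = next((payload[f] for f in identifier_fields if f in payload), None)
        if part_id is None:
            continue  # assumption: files without an identifier are skipped
        store[str(part_id)] = payload
    # Source files are only read; the wrapped mapping goes to store_path.
    Path(store_path).write_text(json.dumps(store, indent=2), encoding="utf-8")
    return store


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    (src / "a.json").write_text(
        json.dumps({"part_id": "part-a", "Material": "Steel"}), encoding="utf-8")
    (src / "b.json").write_text(
        json.dumps({"part_id": "part-b", "Cost": 12.5}), encoding="utf-8")
    store = wrap_directory(src, Path(tmp) / "store.json")
    persisted = json.loads((Path(tmp) / "store.json").read_text(encoding="utf-8"))
```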

Parameters:
  • path (str | Path)

  • initial (Mapping[str, dict] | None)

  • create (bool)

  • store_path (str | Path | None)

  • id_filter (Callable[[str], bool] | None)

  • identifier_fields (Sequence[str])

describe_for(part_ids)

Summarize key availability across part_ids.

Issues a single get_contexts() call and aggregates {observed, missing, coverage} per top-level key.

coverage = observed / len(part_ids). Ids that the store does not resolve count as missing for every key (consistent with get_contexts omitting unknown ids).

Partners with a cheaper schema endpoint can override this method and skip the get_contexts round-trip entirely.

Parameters:

part_ids (Sequence[str])

Return type:

dict[str, dict]

get_contexts(part_ids)

Return {part_id: attribute_dict} for ids that resolve.

Missing ids must be omitted from the result (not returned with None). Each value is a free-form dict whose top-level keys match the keys passed to ContextPredictor.infer(..., keys=[...]).

Implementations must treat this as the primary API and issue a single batched call against the backend — not a loop.

Parameters:

part_ids (Sequence[str])

Return type:

Mapping[str, dict]

list_numeric_keys()

Return the metadata keys that should be treated as numeric.

ContextPredictor consults this list to route un-mapped keys to its numerical rule. Keys not returned here are treated as categorical. The default returns () (everything is categorical) so existing partner subclasses keep working unchanged; subclasses with a real schema should override.

Per-key rules supplied to ContextPredictor always win over this list, so callers can still override on a per-call basis without changing the provider.

Return type:

Sequence[str]

set_contexts(updates)

Persist one update per part_id.

Merge semantics, durability, and authentication are the implementation’s choice. The predictor never calls this method.

Parameters:

updates (Mapping[str, dict])

Return type:

None

class hoops_ai.ml.context_layer.NearestNeighborRule(threshold=0.95)

Bases: AggregationRule

Use the top-similarity hit’s value when shape similarity is high enough.

Useful for keys whose value is essentially intrinsic to the geometry (number of internal features, hole count, surface area, bounding-box dimensions, and so on) and is therefore better borrowed from the closest shape match than averaged out of a neighborhood. When the top-ranked hit’s score is at or above threshold, this rule returns that hit’s value with confidence equal to the top score. Otherwise it abstains so the predictor returns a STATUS_INSUFFICIENT sentinel.

Pairs naturally with ContextPredictor’s cross-key injection: install this rule for an intrinsic geometric key (InternalFeatures) via per_key_rules, and the resulting STATUS_READY prediction is forwarded into the query_context of every downstream numeric key (Cost, Weight, …) so their RelevanceWeighter can re-rank neighbors with a matching feature count.

Parameters:

threshold (float) – Minimum top-hit similarity required to trust the borrowed value, in [0.0, 1.0]. Default 0.95.

Example

from hoops_ai.ml.context_layer import (
    ContextPredictor, NearestNeighborRule, NumericWeightedRule,
)

# `provider` is any ContextProvider instance.
predictor = ContextPredictor(
    provider,
    per_key_rules={
        "InternalFeatures": NearestNeighborRule(threshold=0.95),
    },
)
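The decision this rule makes is small enough to state directly. The sketch below is a stand-alone illustration with a hypothetical function name, borrow_top_hit; it assumes values and scores are ordered best match first, as the predict() contract describes.

```python
def borrow_top_hit(values, scores, threshold=0.95):
    """Return (value, confidence) from the best hit, or None to abstain."""
    if not values or scores[0] < threshold:
        return None
    # Confidence is the top hit's similarity score.
    return values[0], scores[0]


accepted = borrow_top_hit([4, 6, 4], [0.97, 0.91, 0.80])
rejected = borrow_top_hit([4, 6, 4], [0.90, 0.88, 0.80])
```

The second call abstains even though two of three neighbors agree, because this rule trusts only a sufficiently close top hit rather than a vote.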

predict(values, scores, key)

Predict a context value from neighbor evidence.

Parameters:
  • values (list[Any]) – The context values collected from hits for this key, ordered by relevance (best match first).

  • scores (list[float]) – Similarity scores corresponding 1-to-1 with values (higher = more similar).

  • key (str) – The context key being inferred.

Returns:

ContextPrediction or None if insufficient evidence.

Return type:

ContextPrediction | None

predict_with_context(values, scores, key, query_context=None, *, hits=None)

Predict with optional query context and per-hit metadata.

Default implementation ignores query_context and hits and delegates to predict(). Override in subclasses that benefit from one or both.

Return type:

ContextPrediction | None

property threshold: float
class hoops_ai.ml.context_layer.NumericWeightedRule(log_scale=True, min_evidence=3, interval_sigmas=1.0, relevance_weighter=None, auto_relevance_weight=True, score_temperature=None, nearest_neighbor_threshold=None)

Bases: AggregationRule

Predict continuous numeric context (e.g. cost, weight, time) from neighbors.

Three operating modes selected at construction:

  1. Plain weighted mean: relevance_weighter=None and auto_relevance_weight=False. Pure shape-weighted aggregation.

  2. Auto-fitted weighter (default) — the rule lazily fits a per-key RelevanceWeighter from the current call’s hit metadata.

  3. Explicit weighter — pass relevance_weighter=RelevanceWeighter(...) to skip auto-fit and use hand-tuned factors.

Confidence comes from the coefficient of variation (CV) of neighbor values: confidence = 1 / (1 + CV). Log-space is appropriate for multiplicative quantities (cost, weight, time); set log_scale=False for additive quantities.
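A plain weighted mean with CV-based confidence might look like the sketch below. It is an illustration under assumptions, not the rule's actual code: weighted_estimate is a hypothetical name, weights are linear in the scores, the CV is computed as the weighted standard deviation over the weighted mean of the raw values, and inputs are assumed to be positive.

```python
import math


def weighted_estimate(values, scores, log_scale=True, min_evidence=3):
    """Return (estimate, confidence), or None with too little evidence."""
    if len(values) < min_evidence:
        return None
    total = sum(scores)
    weights = [s / total for s in scores]
    # Aggregate in log-space for multiplicative quantities (geometric-style mean).
    xs = [math.log(v) for v in values] if log_scale else list(values)
    mean_x = sum(w * x for w, x in zip(weights, xs))
    estimate = math.exp(mean_x) if log_scale else mean_x
    # Assumption: CV = weighted std / weighted mean of the raw values.
    mean_v = sum(w * v for w, v in zip(weights, values))
    var_v = sum(w * (v - mean_v) ** 2 for w, v in zip(weights, values))
    cv = math.sqrt(var_v) / abs(mean_v)
    return estimate, 1.0 / (1.0 + cv)


tight = weighted_estimate([10.0, 10.0, 10.0], [0.9, 0.8, 0.7])
spread = weighted_estimate([10.0, 12.0, 40.0], [1.0, 1.0, 1.0])
too_few = weighted_estimate([10.0, 12.0], [0.9, 0.8])
```

Identical neighbor values give a CV of zero and a confidence of 1.0, while a wide spread lowers the confidence. In log-space the outlier at 40.0 pulls the estimate less than it would in an arithmetic mean.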

Parameters:
  • log_scale (bool) – If True (default), operates in log-space. Set False for additive quantities (counts, dimensions).

  • min_evidence (int) – Minimum hits required before predicting. Default 3.

  • interval_sigmas (float) – Prediction-interval width in σ. Default 1.0 (≈68%).

  • relevance_weighter (RelevanceWeighter | None) – Explicit weighter; disables auto-fit for keys this rule sees.

  • auto_relevance_weight (bool) – When True (default) and relevance_weighter is None, the rule lazily fits a per-key weighter the first time predict_with_context() runs for that key, using the call’s hit metadata.

  • score_temperature (float | None) – When set, neighbour weights are computed by a softmax exp(T·s) / Σ exp(T·s) over the adjusted scores instead of the default linear s / Σ s. Sharper weighting: a hit that matches on shape and on every injected metadata key takes almost all of the mass, so the estimate converges to that neighbour’s value. Useful when the cost surface is highly non-linear and you want a near-duplicate, full-metadata-match neighbour to dominate. None (default) preserves the linear behaviour. Typical values: 4.0 (moderate sharpening) to 12.0 (near-argmax). Must be a positive finite number.

  • nearest_neighbor_threshold (float | None) – When set, the rule first checks whether any neighbour reaches a normalised global agreement score adjusted_i / max_possible_boost ≥ threshold. The numerator is the weighter-adjusted score (shape × per-key boosts); the denominator is what a perfect-match neighbour would score, so the ratio lives in [0, 1]. When the gate fires the rule short-circuits to that neighbour’s value with confidence equal to the normalised score — the same idea as NearestNeighborRule but ranked by a global signal that also reflects Material / Process / InternalFeatures agreement instead of raw shape similarity alone. When no neighbour clears the gate the rule falls through to the softmax / linear path as usual. None (default) disables the gate. Must be in (0, 1] when set. With no weighter and no query_context the normalisation degenerates to 1.0 and this is exactly NearestNeighborRule(threshold) on the raw shape score.
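The difference between the linear default and the score_temperature softmax is easiest to see on a small set of adjusted scores. The helper neighbour_weights below is hypothetical and covers only the weight computation described for score_temperature.

```python
import math


def neighbour_weights(adjusted_scores, score_temperature=None):
    """Linear s / sum(s) by default, softmax exp(T*s) / sum(exp(T*s)) when T is set."""
    if score_temperature is None:
        total = sum(adjusted_scores)
        return [s / total for s in adjusted_scores]
    if not (score_temperature > 0 and math.isfinite(score_temperature)):
        raise ValueError("score_temperature must be a positive finite number")
    # Shift by the max score for numerical stability; the ratios are unchanged.
    s_max = max(adjusted_scores)
    exps = [math.exp(score_temperature * (s - s_max)) for s in adjusted_scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.8, 0.9, 0.6]
linear = neighbour_weights(scores)
sharp = neighbour_weights(scores, score_temperature=12.0)
```

Linear weighting gives the best neighbour a little over half of the mass, whereas a temperature of 12.0 gives it almost all of it, which is the near-argmax behaviour described above.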

predict(values, scores, key)

Predict a context value from neighbor evidence.

Parameters:
  • values (list[Any]) – The context values collected from hits for this key, ordered by relevance (best match first).

  • scores (list[float]) – Similarity scores corresponding 1-to-1 with values (higher = more similar).

  • key (str) – The context key being inferred.

Returns:

ContextPrediction or None if insufficient evidence.

Return type:

ContextPrediction | None

predict_with_context(values, scores, key, query_context=None, *, hits=None)

Adjust scores via the (auto- or hand-fit) weighter, then aggregate.

Pipeline:

  1. Resolve the weighter for key: explicit relevance_weighter wins; otherwise lazy-fit per-key if auto_relevance_weight.

  2. Auto-infer a baseline query_context from the top-K hits’ metadata using the weighter’s known factor keys, then overlay any caller-supplied query_context on top (caller wins). This lets a partial injection (e.g. {Material, Process} forwarded from earlier categorical predictions) still benefit from neighbor-derived numeric factors like InternalFeatures.

  3. Multiply scores by the weighter’s per-hit boosts.

  4. Fall through to predict() on the adjusted scores.

Return type:

ContextPrediction | None

property relevance_weighter: RelevanceWeighter | None

The explicit weighter passed at construction (None if auto-fitting).

class hoops_ai.ml.context_layer.RelevanceWeighter(factors)

Bases: object

Adjusts hit scores based on metadata agreement with a query context.

When predicting a target key (e.g. “Cost”), other known attributes of the query (Material, Process, InternalFeatures) can inform which neighbors are most trustworthy. A neighbor that matches on Material AND Process should be weighted higher than one differing on both.

For categorical attributes: exact match → full boost, mismatch → no boost. For numeric attributes: proximity-based boost using a Gaussian kernel.

The adjustment is multiplicative:

adjusted_score = base_score × product(boost_per_attribute)

Parameters:

factors (dict[str, dict[str, Any]]) – Dict mapping metadata keys to their boost configuration. Each entry is a dict with:

  • "weight": float – How much this attribute matters (0.0 to 1.0). 1.0 = full match doubles the score; 0.5 = half effect.

  • "type": "categorical" | "numeric" – How to compute similarity.

  • "scale": float (numeric only) – Gaussian kernel bandwidth.

Example

weighter = RelevanceWeighter(factors={
    "Material": {"weight": 1.0, "type": "categorical"},
    "Process": {"weight": 0.8, "type": "categorical"},
    "InternalFeatures": {"weight": 0.5, "type": "numeric", "scale": 5.0},
})
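The multiplicative adjustment can be worked through by hand. The sketch below is not the class's implementation: boost and adjust_score are hypothetical helpers, each attribute is assumed to contribute 1 + weight × similarity (consistent with a full match at weight 1.0 doubling the score), and the Gaussian form exp(-d²/2) with d scaled by "scale" is an assumption about the kernel.

```python
import math


def boost(factor, query_value, hit_value):
    """Per-attribute multiplier: 1.0 for no agreement, 1 + weight for a perfect match."""
    if factor["type"] == "categorical":
        similarity = 1.0 if hit_value == query_value else 0.0
    else:
        # Assumed Gaussian kernel; "scale" acts as the bandwidth.
        d = (hit_value - query_value) / factor["scale"]
        similarity = math.exp(-0.5 * d * d)
    return 1.0 + factor["weight"] * similarity


def adjust_score(base_score, factors, query_context, hit_metadata, target_key):
    adjusted = base_score
    for key, factor in factors.items():
        # Skip the target itself and attributes unknown on either side.
        if key == target_key or key not in query_context or key not in hit_metadata:
            continue
        adjusted *= boost(factor, query_context[key], hit_metadata[key])
    return adjusted


factors = {
    "Material": {"weight": 1.0, "type": "categorical"},
    "InternalFeatures": {"weight": 0.5, "type": "numeric", "scale": 5.0},
}
query = {"Material": "Steel", "InternalFeatures": 4}
match = adjust_score(
    0.8, factors, query, {"Material": "Steel", "InternalFeatures": 4}, "Cost")
mismatch = adjust_score(
    0.8, factors, query, {"Material": "Aluminum", "InternalFeatures": 4}, "Cost")
```

A neighbor agreeing on both attributes is scaled by 2.0 × 1.5, while the one with a different material keeps only the 1.5 factor from the matching feature count.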

adjust_scores(hits, base_scores, target_key, query_context)

Compute adjusted scores based on metadata agreement.

Return type:

list[float]

property factors: dict[str, dict[str, Any]]

The learned or manually-specified factor configuration.

classmethod fit(records, target_key, feature_keys=None, min_samples_per_group=2)

Learn importance factors from data.

Parameters:
  • records (list[dict[str, Any]]) – List of metadata dicts from indexed parts.

  • target_key (str) – The key to predict (e.g. “Cost”). Must be numeric.

  • feature_keys (list[str] | None) – Keys to evaluate as predictors. If None, uses all top-level record keys excluding target_key.

  • min_samples_per_group (int) – Minimum samples per categorical group.

Returns:

A configured RelevanceWeighter with learned factors.

Return type:

RelevanceWeighter

max_boost_for(target_key, query_context)

Return the multiplicative boost a perfect-match neighbour would receive.

For every factor key that is both known to the weighter and present in query_context (except target_key itself), a perfect match contributes (1 + weight). The product gives the ceiling against which an actual adjust_scores result can be normalised into [0, 1] — useful for nearest-neighbour-style gates that need a scale-invariant agreement signal.

Returns 1.0 when no relevant factor is present (the adjusted score then equals the raw shape score and no normalisation is needed).
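The ceiling is a product of (1 + weight) terms, which the sketch below computes with a hypothetical helper, max_boost. The factor values, the query context and the adjusted score of 2.7 are illustrative only.

```python
import math


def max_boost(factors, target_key, query_context):
    """Boost a perfect-match neighbour would receive for this query context."""
    relevant = [
        factor["weight"]
        for key, factor in factors.items()
        if key != target_key and key in query_context
    ]
    # The empty product is 1.0, matching the "no relevant factor" case.
    return math.prod((1.0 + w for w in relevant), start=1.0)


factors = {
    "Material": {"weight": 1.0, "type": "categorical"},
    "Process": {"weight": 0.8, "type": "categorical"},
}
ceiling = max_boost(factors, "Cost", {"Material": "Steel", "Process": "Milling"})
no_context = max_boost(factors, "Cost", {})
# Normalising an illustrative adjusted score of 2.7 against the ceiling.
normalised = 2.7 / ceiling
```

Dividing an adjusted score by this ceiling yields the scale-invariant agreement signal that a nearest-neighbour-style gate can compare against a threshold in (0, 1].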

Return type:

float