Inference batcher#

Inference batchers can be accessed from the Predictor metadata objects.

predictor.inference_batcher

InferenceBatcher #

Configuration of an inference batcher for a predictor.

PARAMETER DESCRIPTION
enabled

Whether the inference batcher is enabled or not. The default value is false.

TYPE: bool | None DEFAULT: None

max_batch_size

Maximum number of requests per batch.

TYPE: int | None DEFAULT: None

max_latency

Maximum latency for request batching.

TYPE: int | None DEFAULT: None

timeout

Maximum waiting time for request batching.

TYPE: int | None DEFAULT: None

RETURNS DESCRIPTION

InferenceBatcher. Configuration of an inference batcher.
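
To make the fields above concrete, here is a minimal self-contained sketch of a configuration object with the same parameters and defaults. The class name and implementation are illustrative only, not the library's actual code:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InferenceBatcherConfig:
    """Illustrative stand-in for an inference batcher configuration."""
    enabled: Optional[bool] = None         # unset is treated as disabled
    max_batch_size: Optional[int] = None   # maximum number of requests per batch
    max_latency: Optional[int] = None      # maximum latency for request batching
    timeout: Optional[int] = None          # maximum waiting time for batching

# Enable batching with an explicit batch size; the rest keep their defaults.
config = InferenceBatcherConfig(enabled=True, max_batch_size=32)
```

Leaving `enabled` unset corresponds to the documented default of `false`, i.e. no batching.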

enabled property writable #

enabled

Whether the inference batcher is enabled or not.

max_batch_size property writable #

max_batch_size

Maximum number of requests per batch.

max_latency property writable #

max_latency

Maximum latency for request batching.

timeout property writable #

timeout

Maximum waiting time for request batching.
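
Each attribute above is exposed as a read/write property, so a configuration can be adjusted after it is created. A minimal sketch of that pattern (illustrative only, not the library's implementation):

```python
class BatcherSketch:
    """Illustrative class showing a writable `enabled` property."""

    def __init__(self, enabled: bool = False):
        self._enabled = enabled

    @property
    def enabled(self) -> bool:
        """Whether the inference batcher is enabled or not."""
        return self._enabled

    @enabled.setter
    def enabled(self, value: bool) -> None:
        self._enabled = bool(value)

batcher = BatcherSketch()
batcher.enabled = True  # toggle batching on after construction
```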

describe #

describe()

Print a JSON description of the inference batcher.
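
As a rough sketch of what a `describe()`-style method could do, the example below serializes the configuration to JSON. The class and output format are hypothetical; the library's actual output may differ:

```python
import json

class BatcherDescribeSketch:
    """Illustrative stand-in with a describe() that prints JSON."""

    def __init__(self, enabled=None, max_batch_size=None,
                 max_latency=None, timeout=None):
        self.enabled = enabled
        self.max_batch_size = max_batch_size
        self.max_latency = max_latency
        self.timeout = timeout

    def describe(self) -> str:
        """Print and return a JSON description of the batcher."""
        description = json.dumps(self.__dict__, indent=2)
        print(description)
        return description

sketch = BatcherDescribeSketch(enabled=True, max_batch_size=16, timeout=60)
output = sketch.describe()
```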