Inference batcher#

Inference batchers can be accessed from the Predictor metadata objects.

predictor.inference_batcher

InferenceBatcher #

Configuration of an inference batcher for a predictor.

PARAMETER DESCRIPTION
enabled

Whether the inference batcher is enabled or not. The default value is false.

TYPE: bool | None DEFAULT: None

max_batch_size

Maximum number of requests per batch.

TYPE: int | None DEFAULT: None

max_latency

Maximum latency for request batching.

TYPE: int | None DEFAULT: None

timeout

Maximum waiting time for request batching.

TYPE: int | None DEFAULT: None

RETURNS DESCRIPTION

InferenceBatcher. Configuration of an inference batcher.
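
To make the fields above concrete, here is a minimal self-contained sketch of a configuration object with the same parameters and defaults. The class name and implementation are illustrative only, not the library's actual code:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InferenceBatcherConfig:
    """Illustrative stand-in for an inference batcher configuration."""
    enabled: Optional[bool] = None         # unset is treated as disabled
    max_batch_size: Optional[int] = None   # maximum number of requests per batch
    max_latency: Optional[int] = None      # maximum latency for request batching
    timeout: Optional[int] = None          # maximum waiting time for batching

# Enable batching with an explicit batch size; the rest keep their defaults.
config = InferenceBatcherConfig(enabled=True, max_batch_size=32)
```

Leaving `enabled` unset corresponds to the documented default of `false`, i.e. no batching.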

enabled property writable #

enabled

Whether the inference batcher is enabled or not.

max_batch_size property writable #

max_batch_size

Maximum number of requests per batch.

max_latency property writable #

max_latency

Maximum latency for request batching.

timeout property writable #

timeout

Maximum waiting time for request batching.
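
Each attribute above is exposed as a read/write property, so a configuration can be adjusted after it is created. A minimal sketch of that pattern (illustrative only, not the library's implementation):

```python
class BatcherSketch:
    """Illustrative class showing a writable `enabled` property."""

    def __init__(self, enabled: bool = False):
        self._enabled = enabled

    @property
    def enabled(self) -> bool:
        """Whether the inference batcher is enabled or not."""
        return self._enabled

    @enabled.setter
    def enabled(self, value: bool) -> None:
        self._enabled = bool(value)

batcher = BatcherSketch()
batcher.enabled = True  # toggle batching on after construction
```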

describe #

describe()

Print a JSON description of the inference batcher.
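
As a rough sketch of what a `describe()`-style method could do, the example below serializes the configuration to JSON. The class and output format are hypothetical; the library's actual output may differ:

```python
import json

class BatcherDescribeSketch:
    """Illustrative stand-in with a describe() that prints JSON."""

    def __init__(self, enabled=None, max_batch_size=None,
                 max_latency=None, timeout=None):
        self.enabled = enabled
        self.max_batch_size = max_batch_size
        self.max_latency = max_latency
        self.timeout = timeout

    def describe(self) -> str:
        """Print and return a JSON description of the batcher."""
        description = json.dumps(self.__dict__, indent=2)
        print(description)
        return description

sketch = BatcherDescribeSketch(enabled=True, max_batch_size=16, timeout=60)
output = sketch.describe()
```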