Deployment#
You can obtain a ModelServing using Project.get_model_serving. Then you can create a deployment using ModelServing.create_deployment, and retrieve existing deployments using ModelServing.get_deployment, ModelServing.get_deployment_by_id, and ModelServing.get_deployments.
You can also create a deployment by deploying a model via Model.deploy or from a predictor via Predictor.deploy.
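The retrieval and creation paths above can be sketched end to end. The snippet below is a minimal illustration of the call pattern only: the `Stub*` classes stand in for a live Hopsworks connection, and the deployment name `fraud_model` is a made-up example. With a real cluster you would start from `hopsworks.login()` and `project.get_model_serving()`.

```python
# Minimal sketch of the deployment retrieval/creation flow. The stub classes
# mimic the Hopsworks objects so the example runs without a cluster.
class StubDeployment:
    def __init__(self, name):
        self.name = name

class StubModelServing:
    def __init__(self):
        self._deployments = {}

    def create_deployment(self, predictor):
        # register a new deployment for the given predictor spec
        d = StubDeployment(predictor["name"])
        self._deployments[d.name] = d
        return d

    def get_deployment(self, name):
        # retrieve an existing deployment by name
        return self._deployments.get(name)

ms = StubModelServing()                      # stands in for project.get_model_serving()
created = ms.create_deployment({"name": "fraud_model"})
fetched = ms.get_deployment("fraud_model")
print(fetched is created)  # True
```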
Deployment #
Metadata object representing a deployment in Model Serving.
NOT_FOUND_ERROR_CODE class-attribute instance-attribute #
NOT_FOUND_ERROR_CODE = 240000
api_protocol property writable #
api_protocol
API protocol enabled in the deployment (e.g., HTTP or GRPC).
artifact_files_path property #
artifact_files_path
Path of the artifact files deployed by the predictor.
artifact_path property #
artifact_path
Path of the model artifact deployed by the predictor.
Deprecated
Artifact versions are deprecated in favor of deployment versions.
artifact_version property writable #
artifact_version
Artifact version deployed by the predictor.
Deprecated
Artifact versions are deprecated in favor of deployment versions.
config_file property writable #
config_file
Model server configuration file passed to the model deployment.
It can be accessed via CONFIG_FILE_PATH environment variable from a predictor or transformer script. For LLM deployments without a predictor script, this file is used to configure the vLLM engine.
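A predictor script can load the file through the `CONFIG_FILE_PATH` environment variable. The sketch below simulates that by writing a config file and setting the variable itself; in a real deployment Model Serving sets the variable for you. The JSON format and the `max_batch_size` key are assumptions for illustration only, as the actual content depends on the file you uploaded.

```python
import json
import os
import tempfile

# Simulate the file that Model Serving would mount for the deployment.
config = {"max_batch_size": 32}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(config, f)
    path = f.name

# In a real deployment this variable is set by Model Serving, not by you.
os.environ["CONFIG_FILE_PATH"] = path

# Inside a predictor/transformer script, read the config back.
with open(os.environ["CONFIG_FILE_PATH"]) as f:
    loaded = json.load(f)

print(loaded["max_batch_size"])  # 32
```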
inference_batcher property writable #
inference_batcher
Configuration of the inference batcher attached to this predictor.
inference_logger property writable #
inference_logger
Configuration of the inference logger attached to this predictor.
requested_instances property #
requested_instances
Total number of requested instances in the deployment.
delete #
delete(force=False)
Delete the deployment.
| PARAMETER | DESCRIPTION |
|---|---|
| force | Force the deletion of the deployment. If the deployment is running, it will be stopped and deleted automatically. DEFAULT: False |
Warning
A call to this method does not ask for a second confirmation.
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
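The `force` flag matters only while the deployment is running. The stub below sketches the semantics described above (stop first, or pass `force=True`); the `RunningError` name is invented for the sketch, while the real API surfaces errors as `hopsworks.client.exceptions.RestAPIError`.

```python
class RunningError(Exception):
    """Illustrative stand-in for the error a backend might raise."""

class StubDeployment:
    def __init__(self):
        self.running = True
        self.deleted = False

    def stop(self):
        self.running = False

    def delete(self, force=False):
        if self.running:
            if not force:
                raise RunningError("deployment is running; stop it or pass force=True")
            self.stop()  # force: stop automatically, then delete
        self.deleted = True

d = StubDeployment()
try:
    d.delete()            # running and force=False -> refused
except RunningError:
    pass
d.delete(force=True)      # stopped and deleted in one call
print(d.deleted)  # True
```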
download_artifact_files #
download_artifact_files(local_path=None)
Download the artifact files served by the deployment.
| PARAMETER | DESCRIPTION |
|---|---|
| local_path | Path in the local filesystem where the artifact files are downloaded. DEFAULT: None |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
get_logs #
get_logs(component='predictor', tail=10)
Print the deployment logs of the predictor or transformer.
| PARAMETER | DESCRIPTION |
|---|---|
| component | Deployment component to get the logs from (e.g., predictor or transformer). DEFAULT: 'predictor' |
| tail | Number of most recent lines to retrieve from the logs. DEFAULT: 10 |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
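The `tail` parameter selects the most recent lines, in the spirit of Unix `tail`. The sketch below shows the equivalent selection on a plain list of log lines (the lines themselves are fabricated for the example):

```python
from collections import deque

# Fabricated log lines standing in for a deployment's log output.
log_lines = [f"2024-01-01 00:00:{i:02d} INFO request served" for i in range(60)]

tail = 10
recent = list(deque(log_lines, maxlen=tail))  # keep only the last `tail` lines

print(len(recent))                   # 10
print(recent[-1] == log_lines[-1])   # True
```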
get_state #
get_state() -> PredictorState
Get the current state of the deployment.
| RETURNS | DESCRIPTION |
|---|---|
| PredictorState | The current state of the deployment. |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
is_created #
is_created() -> bool
Check whether the deployment is created.
| RETURNS | DESCRIPTION |
|---|---|
| bool | True if the deployment is created, False otherwise. |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
is_running #
is_running(or_idle=True, or_updating=True) -> bool
Check whether the deployment is ready to handle inference requests.
| PARAMETER | DESCRIPTION |
|---|---|
| or_idle | Whether the idle state is considered as running. DEFAULT: True |
| or_updating | Whether the updating state is considered as running. DEFAULT: True |
| RETURNS | DESCRIPTION |
|---|---|
| bool | True if the deployment is ready to handle inference requests, False otherwise. |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
is_stopped #
is_stopped(or_created=True) -> bool
Check whether the deployment is stopped.
| PARAMETER | DESCRIPTION |
|---|---|
| or_created | Whether the creating and created states are considered as stopped. DEFAULT: True |
| RETURNS | DESCRIPTION |
|---|---|
| bool | True if the deployment is stopped, False otherwise. |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
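`is_running()` and `is_stopped()` lend themselves to client-side polling, for example after starting a deployment without awaiting it. A generic helper is sketched below against a stub that reports running after a couple of polls; `wait_until_running` and `StubDeployment` are illustrative names, not part of the library.

```python
import time

class StubDeployment:
    """Stand-in that starts reporting running after a few is_running() calls."""
    def __init__(self, ready_after=2):
        self._calls = 0
        self._ready_after = ready_after

    def is_running(self, or_idle=True, or_updating=True):
        self._calls += 1
        return self._calls >= self._ready_after

def wait_until_running(deployment, timeout=10.0, poll_interval=0.01):
    """Poll is_running() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if deployment.is_running():
            return True
        time.sleep(poll_interval)
    return False

print(wait_until_running(StubDeployment()))  # True
```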
predict #
predict(data=None, inputs=None)
Send inference requests to the deployment.
One of the data or inputs parameters must be set. If both are set, inputs is ignored.
Example
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
# retrieve deployment by name
my_deployment = ms.get_deployment("my_deployment")
# (optional) retrieve model input example
my_model = project.get_model_registry().get_model(my_deployment.model_name, my_deployment.model_version)
# make predictions using model inputs (single or batch)
predictions = my_deployment.predict(inputs=my_model.input_example)
# or using more sophisticated inference request payloads
data = { "instances": [ my_model.input_example ], "key2": "value2" }
predictions = my_deployment.predict(data)
| PARAMETER | DESCRIPTION |
|---|---|
| data | Payload dictionary for the inference request, including the model input(s). TYPE: Dict |
| inputs | Model inputs used in the inference requests. |
| RETURNS | DESCRIPTION |
|---|---|
| Dict | The predictions returned by the model server. |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
save #
save(await_update: int | None = 600)
Persist this deployment including the predictor and metadata to Model Serving.
| PARAMETER | DESCRIPTION |
|---|---|
| await_update | If the deployment is running, awaiting time (in seconds) for the running instances to be updated. If the running instances are not updated within this timespan, the call to this method returns while the update continues in the background. TYPE: int or None DEFAULT: 600 |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
start #
start(await_running: int | None = 600)
Start the deployment.
| PARAMETER | DESCRIPTION |
|---|---|
| await_running | Awaiting time (in seconds) for the deployment to start. If the deployment has not started within this timespan, the call to this method returns while it deploys in the background. TYPE: int or None DEFAULT: 600 |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
stop #
stop(await_stopped: int | None = 600)
Stop the deployment.
| PARAMETER | DESCRIPTION |
|---|---|
| await_stopped | Awaiting time (in seconds) for the deployment to stop. If the deployment has not stopped within this timespan, the call to this method returns while it stops in the background. TYPE: int or None DEFAULT: 600 |
| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | In case the backend encounters an issue. |
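Putting start(), get_state(), and stop() together: the stub below models the lifecycle the methods above describe. The state names are illustrative strings; a real deployment reports its state through a PredictorState object.

```python
class StubDeployment:
    """Illustrative stand-in tracking the lifecycle states described above."""
    def __init__(self):
        self.status = "CREATED"

    def start(self, await_running=600):
        # a real call may return while the deployment is still starting
        self.status = "RUNNING"

    def stop(self, await_stopped=600):
        self.status = "STOPPED"

    def get_state(self):
        return self.status

d = StubDeployment()
d.start()
print(d.get_state())  # RUNNING
d.stop()
print(d.get_state())  # STOPPED
```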