Perform a server-side streaming online prediction request for Vertex LLM streaming.
Arguments
| Parameters | |
|---|---|
| endpoint | Required. The name of the Endpoint requested to serve the prediction. Format: |
| region | Required. Region of the HTTP endpoint. For example, if region is set to |
| body | Required. |
Raised exceptions
| Exceptions | |
|---|---|
| ConnectionError | In case of a network problem (such as DNS failure or refused connection). | 
| HttpError | If the response status is >= 400 (excluding 429 and 503). | 
| TimeoutError | If a long-running operation takes longer to finish than the specified timeout limit. | 
| TypeError | If an operation or function receives an argument of the wrong type. | 
| ValueError | If an operation or function receives an argument of the right type but an inappropriate value. For example, a negative timeout. | 
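The exceptions above can be caught with a Workflows `try`/`except` block. The following is a minimal sketch, not a definitive pattern: it assumes the call arguments arrive as hypothetical runtime arguments (`input.endpoint`, `input.region`, `input.body`), and the step names and returned messages are illustrative only. It checks the error's `tags` list to tell an `HttpError` from a `ConnectionError`, and re-raises anything else.

```yaml
main:
    params: [input]
    steps:
        - callStreamingPredict:
            try:
                call: googleapis.aiplatform.v1.projects.locations.endpoints.serverStreamingPredict
                args:
                    endpoint: ${input.endpoint}  # hypothetical runtime argument
                    region: ${input.region}      # hypothetical runtime argument
                    body: ${input.body}          # hypothetical runtime argument
                result: serverStreamingPredictResult
            except:
                as: e
                steps:
                    - classifyError:
                        switch:
                            # HTTP errors carry a status code in e.code
                            - condition: ${"HttpError" in e.tags}
                              return: ${"Prediction failed with HTTP status " + string(e.code)}
                            # Network-level failures such as DNS errors
                            - condition: ${"ConnectionError" in e.tags}
                              return: "Network problem while calling the endpoint."
                    # No condition matched: propagate the original error
                    - rethrow:
                        raise: ${e}
        - returnResult:
            return: ${serverStreamingPredictResult}
```

Returning a message from the `except` block ends the workflow successfully. To have the execution fail instead, replace the `return` entries with `raise`.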
Response
If successful, the response contains an instance of GoogleCloudAiplatformV1StreamingPredictResponse.
Subworkflow snippet
Some fields might be optional or required. To identify required fields, refer to the API documentation.
YAML
- serverStreamingPredict:
    call: googleapis.aiplatform.v1.projects.locations.endpoints.serverStreamingPredict
    args:
        endpoint: ...
        region: ...
        body:
            inputs: ...
            parameters:
                boolVal: ...
                bytesVal: ...
                doubleVal: ...
                dtype: ...
                floatVal: ...
                int64Val: ...
                intVal: ...
                listVal: ...
                shape: ...
                stringVal: ...
                structVal: ...
                tensorVal: ...
                uint64Val: ...
                uintVal: ...
    result: serverStreamingPredictResult
JSON
[
  {
    "serverStreamingPredict": {
      "call": "googleapis.aiplatform.v1.projects.locations.endpoints.serverStreamingPredict",
      "args": {
        "endpoint": "...",
        "region": "...",
        "body": {
          "inputs": "...",
          "parameters": {
            "boolVal": "...",
            "bytesVal": "...",
            "doubleVal": "...",
            "dtype": "...",
            "floatVal": "...",
            "int64Val": "...",
            "intVal": "...",
            "listVal": "...",
            "shape": "...",
            "stringVal": "...",
            "structVal": "...",
            "tensorVal": "...",
            "uint64Val": "...",
            "uintVal": "..."
          }
        }
      },
      "result": "serverStreamingPredictResult"
    }
  }
]