The ML.DISTANCE function
This document describes the ML.DISTANCE scalar function, which lets you
compute the distance between two vectors.
Syntax
ML.DISTANCE(vector1, vector2 [, type])
Arguments
ML.DISTANCE has the following arguments:
- vector1: an ARRAY value that represents the first vector, in one of the following forms:
  - ARRAY<Numerical type>
  - ARRAY<STRUCT<STRING, Numerical type>>
  - ARRAY<STRUCT<INT64, Numerical type>>

  where Numerical type is BIGNUMERIC, FLOAT64, INT64, or NUMERIC. For example, ARRAY<STRUCT<INT64, BIGNUMERIC>>.

  When a vector is expressed as ARRAY<Numerical type>, each element of the array denotes one dimension of the vector. An example of a four-dimensional vector is [0.0, 1.0, 1.0, 0.0].

  When a vector is expressed as ARRAY<STRUCT<STRING, Numerical type>> or ARRAY<STRUCT<INT64, Numerical type>>, each STRUCT array item denotes one dimension of the vector. An example of a three-dimensional vector is [("a", 0.0), ("b", 1.0), ("c", 1.0)].

  The initial INT64 or STRING value in the STRUCT is used as an identifier to match the STRUCT values in vector2. The ordering of data in the array doesn't matter; the values are matched by the identifier rather than by their position in the array. If either vector has any STRUCT values with duplicate identifiers, running this function returns an error.
- vector2: an ARRAY value that represents the second vector. vector2 must have the same type as vector1.

  For example, if vector1 is an ARRAY<STRUCT<STRING, FLOAT64>> column with three elements, like [("a", 0.0), ("b", 1.0), ("c", 1.0)], then vector2 must also be an ARRAY<STRUCT<STRING, FLOAT64>> column.

  When vector1 and vector2 are ARRAY<Numerical type> columns, they must have the same array length.
- type: a STRING value that specifies the type of distance to calculate. Valid values are EUCLIDEAN, MANHATTAN, and COSINE. If this argument isn't specified, the default value is EUCLIDEAN.
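The identifier matching described for STRUCT vectors can be sketched outside BigQuery in a few lines of Python. This is an illustration only, not the ML.DISTANCE implementation: the helper names (to_mapping, struct_distance) are hypothetical, cosine distance is assumed to be one minus cosine similarity, and an identifier that appears in only one vector is assumed to contribute 0.0 for the other, which this document does not state.

```python
import math


def to_mapping(vector):
    """Convert a list of (identifier, value) pairs to a dict, rejecting duplicates."""
    mapping = {}
    for identifier, value in vector:
        if identifier in mapping:
            # Mirrors the documented behavior: duplicate identifiers are an error.
            raise ValueError(f"duplicate identifier: {identifier!r}")
        mapping[identifier] = float(value)
    return mapping


def struct_distance(vector1, vector2, distance_type="EUCLIDEAN"):
    """Hypothetical stand-in for ML.DISTANCE on STRUCT-form vectors."""
    if vector1 is None or vector2 is None:
        return None  # NULL input yields NULL output
    a, b = to_mapping(vector1), to_mapping(vector2)
    # Match values by identifier, not by array position.
    keys = sorted(set(a) | set(b))
    # ASSUMPTION: an identifier missing from one vector counts as 0.0.
    x = [a.get(k, 0.0) for k in keys]
    y = [b.get(k, 0.0) for k in keys]
    if distance_type == "EUCLIDEAN":
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))
    if distance_type == "MANHATTAN":
        return sum(abs(p - q) for p, q in zip(x, y))
    if distance_type == "COSINE":
        # ASSUMPTION: cosine distance = 1 - cosine similarity.
        # A zero-length vector raises ZeroDivisionError in this sketch.
        dot = sum(p * q for p, q in zip(x, y))
        norm_x = math.sqrt(sum(p * p for p in x))
        norm_y = math.sqrt(sum(q * q for q in y))
        return 1.0 - dot / (norm_x * norm_y)
    raise ValueError(f"unsupported distance type: {distance_type!r}")


v1 = [("a", 0.0), ("b", 1.0), ("c", 1.0)]
v2 = [("c", 1.0), ("a", 0.0), ("b", 1.0)]  # same values, different order
print(struct_distance(v1, v2))  # prints 0.0
```

Because the values are paired by identifier, reordering the STRUCT items in either vector leaves the result unchanged.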
Output
ML.DISTANCE returns a FLOAT64 value that represents the distance between
the vectors. Returns NULL if either vector1 or vector2 is NULL.
Example
Get the Euclidean distance between two vectors of ARRAY<FLOAT64> values:
- Create the table t1:

    CREATE TABLE mydataset.t1 (
      v1 ARRAY<FLOAT64>,
      v2 ARRAY<FLOAT64>
    )

- Populate t1:

    INSERT mydataset.t1 (v1, v2)
    VALUES ([4.1, 0.5, 1.0], [3.0, 0.0, 2.5])

- Calculate the Euclidean distance between v1 and v2:

    SELECT v1, v2, ML.DISTANCE(v1, v2, 'EUCLIDEAN') AS output
    FROM mydataset.t1

  This query produces the following output:

    +---------------+---------------+-------------------+
    | v1            | v2            | output            |
    +---------------+---------------+-------------------+
    | [4.1,0.5,1.0] | [3.0,0.0,2.5] | 1.926136028425822 |
    +---------------+---------------+-------------------+
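The arithmetic behind this result can be checked by hand: the element-wise differences are 1.1, 0.5, and -1.5, their squares sum to about 3.71, and the square root of that sum is about 1.926136. The following Python sketch reproduces the calculation offline. It is a plain reference computation, not a call to BigQuery, and the Manhattan value is included only for comparison.

```python
import math

v1 = [4.1, 0.5, 1.0]
v2 = [3.0, 0.0, 2.5]

# Element-wise differences between the two vectors.
diffs = [a - b for a, b in zip(v1, v2)]

# Euclidean distance: square root of the sum of squared differences.
euclidean = math.sqrt(sum(d * d for d in diffs))

# Manhattan distance: sum of absolute differences (1.1 + 0.5 + 1.5).
manhattan = sum(abs(d) for d in diffs)

print(f"EUCLIDEAN: {euclidean:.6f}")  # prints EUCLIDEAN: 1.926136
print(f"MANHATTAN: {manhattan:.6f}")  # prints MANHATTAN: 3.100000
```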
What's next
- For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.