The training pipeline performs the following transformation functions.
The text is used as is: no change to case, punctuation, spelling, tense,
and so on.
Tokenize the text into words. Convert each word to a dictionary lookup
index and generate an embedding for each index. Combine the
embeddings of all elements into a single embedding using the mean.
Tokenization is based on Unicode script boundaries.
Missing values get their own lookup index and resulting embedding.
Stop words receive no special treatment and are not removed.
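The steps above can be sketched in Python. This is a minimal illustration, not the pipeline's actual implementation: the vocabulary, embedding table, and `EMBED_DIM` are hypothetical, and a whitespace split stands in for the real tokenizer, which segments on Unicode script boundaries. Note that stop words pass through unchanged and a missing (None) value maps to its own reserved index.

```python
import numpy as np

EMBED_DIM = 8        # hypothetical embedding width
MISSING_INDEX = 0    # reserved lookup index for missing values

def build_vocab(corpus):
    """Map each distinct token to a dictionary lookup index (0 is reserved)."""
    vocab = {}
    for text in corpus:
        for token in text.split():  # simplification of script-boundary tokenization
            vocab.setdefault(token, len(vocab) + 1)
    return vocab

def embed_text(text, vocab, table):
    """Tokenize, look up each word's index, embed it, then mean-pool."""
    if text is None:
        # Missing values get their own lookup index and resulting embedding.
        return table[MISSING_INDEX]
    indices = [vocab.get(tok, MISSING_INDEX) for tok in text.split()]
    # Combine the embeddings of all tokens into one vector using the mean.
    return table[indices].mean(axis=0)

rng = np.random.default_rng(0)
corpus = ["the cat sat", "the dog ran"]
vocab = build_vocab(corpus)
table = rng.normal(size=(len(vocab) + 1, EMBED_DIM))  # one row per index
vec = embed_text("the cat ran", vocab, table)         # shape: (EMBED_DIM,)
```

Because stop words are not removed, a frequent word like "the" contributes to the mean exactly like any other token.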
Last updated 2026-04-01 UTC.