Class LogisticRegression (2.29.0)

LogisticRegression(
    *,
    optimize_strategy: typing.Literal[
        "auto_strategy", "batch_gradient_descent"
    ] = "auto_strategy",
    fit_intercept: bool = True,
    l1_reg: typing.Optional[float] = None,
    l2_reg: float = 0.0,
    max_iterations: int = 20,
    warm_start: bool = False,
    learning_rate: typing.Optional[float] = None,
    learning_rate_strategy: typing.Literal["line_search", "constant"] = "line_search",
    tol: float = 0.01,
    ls_init_learning_rate: typing.Optional[float] = None,
    calculate_p_values: bool = False,
    enable_global_explain: bool = False,
    class_weight: typing.Optional[
        typing.Union[typing.Literal["balanced"], typing.Dict[str, float]]
    ] = None
)

Logistic Regression (aka logit, MaxEnt) classifier.

from bigframes.ml.linear_model import LogisticRegression
import bigframes.pandas as bpd

X = bpd.DataFrame({
    "feature0": [20, 21, 19, 18],
    "feature1": [0, 1, 1, 0],
    "feature2": [0.2, 0.3, 0.4, 0.5]})
y = bpd.DataFrame({"outcome": [0, 0, 1, 1]})

Create the LogisticRegression

model = LogisticRegression()
model.fit(X, y)  # displays: LogisticRegression()
model.predict(X)

   predicted_outcome                            predicted_outcome_probs  feature0  feature1  feature2
0                  0  [{'label': 1, 'prob': 3.1895929877221615e-07} ...        20         0       0.2
1                  0   [{'label': 1, 'prob': 5.662891265051953e-06} ...        21         1       0.3
2                  1   [{'label': 1, 'prob': 0.9999917826885262} {'l...        19         1       0.4
3                  1   [{'label': 1, 'prob': 0.9999999993659574} {'l...        18         0       0.5

[4 rows x 5 columns in total]

Score the model

score = model.score(X, y)
score

   precision  recall  accuracy  f1_score  log_loss  roc_auc
0        1.0     1.0       1.0       1.0  0.000004      1.0

[1 rows x 6 columns in total]

Methods

__repr__

__repr__()

Print the estimator's constructor with all non-default parameter values.
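
As a rough illustration of what "non-default parameter values" means, the sketch below uses a toy stand-in class (`ToyEstimator`, hypothetical and not part of bigframes) whose `__repr__` compares each attribute with the default in its constructor signature. It assumes the scikit-learn-style convention that every constructor argument is stored under an attribute of the same name; the actual bigframes implementation may differ.

```python
import inspect


class ToyEstimator:
    """Hypothetical stand-in, used only to illustrate the repr convention."""

    def __init__(self, *, fit_intercept=True, l2_reg=0.0, max_iterations=20):
        # Assumption: each constructor argument is stored under the same name.
        self.fit_intercept = fit_intercept
        self.l2_reg = l2_reg
        self.max_iterations = max_iterations

    def __repr__(self):
        # Keep only the parameters whose current value differs from the default.
        signature = inspect.signature(type(self).__init__)
        changed = [
            f"{name}={getattr(self, name)!r}"
            for name, param in signature.parameters.items()
            if name != "self" and getattr(self, name) != param.default
        ]
        return f"{type(self).__name__}({', '.join(changed)})"


default_repr = repr(ToyEstimator())
custom_repr = repr(ToyEstimator(l2_reg=0.5, max_iterations=50))
```

With all defaults the toy class prints an empty argument list, which matches the `LogisticRegression()` output shown in the example at the top of this page.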

fit

fit(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    y: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    X_eval: typing.Optional[
        typing.Union[
            bigframes.dataframe.DataFrame,
            bigframes.series.Series,
            pandas.core.frame.DataFrame,
            pandas.core.series.Series,
        ]
    ] = None,
    y_eval: typing.Optional[
        typing.Union[
            bigframes.dataframe.DataFrame,
            bigframes.series.Series,
            pandas.core.frame.DataFrame,
            pandas.core.series.Series,
        ]
    ] = None,
) -> bigframes.ml.base._T

Fit the model according to the given training data.

Parameters
Name Description
X bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series

Series or DataFrame of shape (n_samples, n_features). Training vector, where n_samples is the number of samples and n_features is the number of features.

y bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series

Series or DataFrame of shape (n_samples,). Target vector relative to X.

X_eval bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series

Series or DataFrame of shape (n_samples, n_features). Evaluation vector, where n_samples is the number of samples and n_features is the number of features.

y_eval bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series

Series or DataFrame of shape (n_samples,). Target vector relative to X_eval.

Returns
Type Description
LogisticRegression Fitted estimator.

get_params

get_params(deep: bool = True) -> typing.Dict[str, typing.Any]

Get parameters for this estimator.

Parameter
Name Description
deep bool, default True

If True, returns the parameters for this estimator and for any contained subobjects that are estimators.

Returns
Type Description
Dictionary A dictionary of parameter names mapped to their values.
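
The effect of `deep` is easiest to see on an estimator that holds another estimator. The sketch below uses two toy classes (`ToyScaler` and `ToyModel`, both hypothetical). It assumes the scikit-learn convention of exposing nested parameters under `<name>__<sub_name>` keys. `LogisticRegression` takes no estimator-valued arguments in the signature above, so `deep` probably makes no difference for it.

```python
class ToyScaler:
    """Hypothetical sub-estimator with a single parameter."""

    def __init__(self, with_mean=True):
        self.with_mean = with_mean

    def get_params(self, deep=True):
        return {"with_mean": self.with_mean}


class ToyModel:
    """Hypothetical estimator that contains another estimator."""

    def __init__(self, scaler, max_iterations=20):
        self.scaler = scaler
        self.max_iterations = max_iterations

    def get_params(self, deep=True):
        params = {"scaler": self.scaler, "max_iterations": self.max_iterations}
        if deep:
            # Assumed convention: nested parameters get "<name>__<sub_name>" keys.
            for name, value in list(params.items()):
                if hasattr(value, "get_params"):
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        params[f"{name}__{sub_name}"] = sub_value
        return params


model = ToyModel(ToyScaler(with_mean=False))
shallow = model.get_params(deep=False)
nested = model.get_params(deep=True)
```

Here `shallow` contains only `scaler` and `max_iterations`, while `nested` additionally contains `scaler__with_mean`.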

predict

predict(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
) -> bigframes.dataframe.DataFrame

Predict class labels for samples in X.

Returns
Type Description
bigframes.dataframe.DataFrame DataFrame of shape (n_samples, n_input_columns + n_prediction_columns). Returns predicted values.
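
In the example output at the top of this page, each row of `predicted_outcome_probs` holds a list of `{'label': ..., 'prob': ...}` records. The sketch below shows one way to read such a value in plain Python once it has been pulled out of the DataFrame. The helper names and the probabilities are hypothetical; it assumes one record per class.

```python
from typing import Any, Dict, List


def most_probable_label(probs: List[Dict[str, Any]]) -> Any:
    """Return the label whose 'prob' entry is largest."""
    return max(probs, key=lambda entry: entry["prob"])["label"]


def probability_of(probs: List[Dict[str, Any]], label: Any) -> float:
    """Return the probability recorded for `label`, or 0.0 if it is absent."""
    for entry in probs:
        if entry["label"] == label:
            return entry["prob"]
    return 0.0


# Hypothetical per-row value of predicted_outcome_probs for a binary model.
row_probs = [{"label": 1, "prob": 0.25}, {"label": 0, "prob": 0.75}]
top_label = most_probable_label(row_probs)
prob_positive = probability_of(row_probs, 1)
```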

predict_explain

predict_explain(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    *,
    top_k_features: int = 5
) -> bigframes.dataframe.DataFrame

Explain predictions for a logistic regression model.

Returns
Type Description
bigframes.dataframe.DataFrame The prediction DataFrame with explanation columns.

register

register(vertex_ai_model_id: typing.Optional[str] = None) -> bigframes.ml.base._T

Register the model to Vertex AI.

After registering, go to the Google Cloud console (https://console.cloud.google.com/vertex-ai/models) to manage the registered models. Refer to https://cloud.google.com/vertex-ai/docs/model-registry/introduction for more options.

Parameter
Name Description
vertex_ai_model_id Optional[str], default None

Optional string used as the model ID in Vertex AI. If not set, defaults to 'bigframes_{bq_model_id}'. The Vertex AI model ID is truncated to 63 characters because of a Vertex AI length limit.
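
The default and the truncation rule can be sketched in a few lines. `resolve_vertex_ai_model_id` is a hypothetical helper, not a bigframes function. It assumes that truncation keeps the first 63 characters and applies equally to a user-supplied ID, neither of which the description above states explicitly.

```python
from typing import Optional

MAX_VERTEX_ID_LENGTH = 63  # limit stated in the parameter description


def resolve_vertex_ai_model_id(
    bq_model_id: str, vertex_ai_model_id: Optional[str] = None
) -> str:
    """Hypothetical helper mirroring the documented default and truncation."""
    if vertex_ai_model_id is None:
        # Documented default when no explicit ID is given.
        vertex_ai_model_id = f"bigframes_{bq_model_id}"
    # Assumption: the ID is cut from the end, keeping the leading characters.
    return vertex_ai_model_id[:MAX_VERTEX_ID_LENGTH]


default_id = resolve_vertex_ai_model_id("my_model")
custom_id = resolve_vertex_ai_model_id("my_model", "churn_classifier")
long_id = resolve_vertex_ai_model_id("x" * 100)
```

Because two long BigQuery model IDs that share their first characters could collapse to the same truncated value under this assumption, passing a short explicit `vertex_ai_model_id` may be the safer choice.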

score

score(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    y: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
) -> bigframes.dataframe.DataFrame

Evaluate the model on the given test data and labels.

The result reports precision, recall, accuracy, f1_score, log_loss and roc_auc. In multi-label classification, the accuracy is the subset accuracy, which is a harsh metric because it requires the entire label set of each sample to be predicted correctly.

Returns
Type Description
bigframes.dataframe.DataFrame A DataFrame of the evaluation result.
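
To make the label-based columns of the result concrete, the sketch below computes precision, recall, accuracy and F1 for a binary problem using their textbook definitions. It is a plain-Python illustration with a hypothetical helper name. The service may aggregate or round differently, and `log_loss` and `roc_auc` are left out because they need predicted probabilities, not labels.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Textbook binary metrics, treating label 1 as the positive class.

    Assumes non-empty inputs of equal length.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1_score": f1}


# Labels from the example at the top of the page, with perfect predictions.
perfect = binary_metrics([0, 0, 1, 1], [0, 0, 1, 1])
# Same labels with one false positive.
one_error = binary_metrics([0, 0, 1, 1], [0, 1, 1, 1])
```

With perfect predictions every metric is 1.0, in line with the score output in the example above. With one false positive, precision drops to 2/3 and accuracy to 0.75 while recall stays at 1.0.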

to_gbq

to_gbq(
    model_name: str, replace: bool = False
) -> bigframes.ml.linear_model.LogisticRegression

Save the model to BigQuery.

Returns
Type Description
LogisticRegression Saved model.