Linear models. This module is styled after scikit-learn's linear_model module: https://scikit-learn.org/stable/modules/linear_model.html.
Classes
LinearRegression
LinearRegression(
*,
optimize_strategy: typing.Literal[
"auto_strategy", "batch_gradient_descent", "normal_equation"
] = "auto_strategy",
fit_intercept: bool = True,
l1_reg: typing.Optional[float] = None,
l2_reg: float = 0.0,
max_iterations: int = 20,
warm_start: bool = False,
learning_rate: typing.Optional[float] = None,
learning_rate_strategy: typing.Literal["line_search", "constant"] = "line_search",
tol: float = 0.01,
ls_init_learning_rate: typing.Optional[float] = None,
calculate_p_values: bool = False,
enable_global_explain: bool = False
)

Ordinary least squares Linear Regression.
LinearRegression fits a linear model with coefficients w = (w1, ..., wp) to minimize the residual sum of squares between the observed targets in the dataset, and the targets predicted by the linear approximation.
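Written out, the objective that fit minimizes is the residual sum of squares, plus optional penalties corresponding to the l1_reg and l2_reg parameters. A sketch in standard notation (illustrative; the exact BigQuery ML objective may differ in scaling):

$$\min_{w}\; \lVert Xw - y \rVert_2^2 \;+\; \lambda_1 \lVert w \rVert_1 \;+\; \lambda_2 \lVert w \rVert_2^2$$

where $X$ is the feature matrix, $y$ the observed targets, and $\lambda_1$, $\lambda_2$ correspond to l1_reg and l2_reg.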
Examples:
>>> from bigframes.ml.linear_model import LinearRegression
>>> import bigframes.pandas as bpd
>>> X = bpd.DataFrame({ "feature0": [20, 21, 19, 18], "feature1": [0, 1, 1, 0], "feature2": [0.2, 0.3, 0.4, 0.5]})
>>> y = bpd.DataFrame({"outcome": [0, 0, 1, 1]})
>>> # Create the linear model
>>> model = LinearRegression()
>>> model.fit(X, y)
LinearRegression()
>>> # Score the model
>>> score = model.score(X, y)
>>> print(score) # doctest:+SKIP
   mean_absolute_error  mean_squared_error  mean_squared_log_error  \
0             0.022812            0.000602                 0.00035

   median_absolute_error  r2_score  explained_variance
0               0.015077  0.997591            0.997591
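The fitted model can also produce predictions via predict. A minimal sketch reusing the model above (the output column name follows BigQuery ML's predicted_<label_column> convention, as in the LogisticRegression example below; values are omitted here rather than invented):

>>> # Predict on the training features; the result includes a predicted_outcome column
>>> predictions = model.predict(X)
>>> predictions["predicted_outcome"] # doctest:+SKIP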
LogisticRegression
LogisticRegression(
*,
optimize_strategy: typing.Literal[
"auto_strategy", "batch_gradient_descent"
] = "auto_strategy",
fit_intercept: bool = True,
l1_reg: typing.Optional[float] = None,
l2_reg: float = 0.0,
max_iterations: int = 20,
warm_start: bool = False,
learning_rate: typing.Optional[float] = None,
learning_rate_strategy: typing.Literal["line_search", "constant"] = "line_search",
tol: float = 0.01,
ls_init_learning_rate: typing.Optional[float] = None,
calculate_p_values: bool = False,
enable_global_explain: bool = False,
class_weight: typing.Optional[
typing.Union[typing.Literal["balanced"], typing.Dict[str, float]]
] = None
)

Logistic Regression (aka logit, MaxEnt) classifier.
Examples:

>>> from bigframes.ml.linear_model import LogisticRegression
>>> import bigframes.pandas as bpd
>>> X = bpd.DataFrame({ "feature0": [20, 21, 19, 18], "feature1": [0, 1, 1, 0], "feature2": [0.2, 0.3, 0.4, 0.5]})
>>> y = bpd.DataFrame({"outcome": [0, 0, 1, 1]})
>>> # Create the LogisticRegression
>>> model = LogisticRegression()
>>> model.fit(X, y)
LogisticRegression()
>>> model.predict(X) # doctest:+SKIP
   predicted_outcome                            predicted_outcome_probs  feature0  feature1  feature2
0                  0  [{'label': 1, 'prob': 3.1895929877221615e-07} ...        20         0       0.2
1                  0  [{'label': 1, 'prob': 5.662891265051953e-06} ...        21         1       0.3
2                  1  [{'label': 1, 'prob': 0.9999917826885262} {'l...         19         1       0.4
3                  1  [{'label': 1, 'prob': 0.9999999993659574} {'l...         18         0       0.5

[4 rows x 5 columns in total]
>>> # Score the model
>>> score = model.score(X, y)
>>> score # doctest:+SKIP
   precision  recall  accuracy  f1_score  log_loss  roc_auc
0        1.0     1.0       1.0       1.0  0.000004      1.0

[1 rows x 6 columns in total]
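Because the constructor accepts class_weight, imbalanced training data can be reweighted at fit time. A minimal sketch, reusing X and y from above ("balanced" adjusts weights inversely proportional to class frequencies, as in scikit-learn; output omitted):

>>> # Reweight classes inversely to their frequency in y
>>> balanced_model = LogisticRegression(class_weight="balanced")
>>> balanced_model.fit(X, y) # doctest:+SKIP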