Spark runtime version 2.2 components
Notes:
1. The 2.2 runtime uses UTF-8 as the default character encoding.
Spark runtime 2.2 libraries
The runtime includes machine learning libraries, such as TensorFlow, PyTorch, and XGBoost, and offers a ready-to-use environment for machine learning and data science applications.
The following sections list the library versions that are available in
Serverless for Apache Spark runtime version 2.2.
GPU-specific libraries
For Serverless for Apache Spark batch workloads that use GPU VMs, the following NVIDIA driver and libraries are available in the Serverless for Apache Spark container. You can use them to accomplish the following tasks:
- Accelerate Spark batch workloads with the NVIDIA Spark Rapids library
- Train machine learning workloads
- Run distributed batch inference using Spark
| Package Name | Version | 
|---|---|
| Spark Rapids | 24.04.0 | 
| NVIDIA Driver | 550.127.05 | 
| CUDA | 12.6 | 
| cublas | 12.8.4 | 
| cusolver | 11.7.3 | 
| cupti | 12.8.90 | 
| cusparse | 12.5.8 | 
| cuDNN | 9.2 | 
| NCCL | 2.22 | 
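
As a rough way to confirm that a GPU batch is actually using the Spark Rapids library listed above, the sketch below inspects Spark configuration pairs. It assumes the accelerator is activated through the `spark.plugins` and `spark.rapids.sql.enabled` properties documented by the RAPIDS Accelerator project; whether the runtime sets them automatically is not stated here. The function name `rapids_enabled` is hypothetical.

```python
from typing import Iterable, Tuple

# Plugin class name documented by the RAPIDS Accelerator for Apache Spark.
RAPIDS_PLUGIN = "com.nvidia.spark.SQLPlugin"


def rapids_enabled(conf_pairs: Iterable[Tuple[str, str]]) -> bool:
    """Return True if the given Spark conf pairs enable the RAPIDS SQL plugin.

    Assumption: spark.rapids.sql.enabled is treated as "true" when unset.
    """
    conf = dict(conf_pairs)
    # spark.plugins is a comma-separated list of plugin class names.
    plugins = [p.strip() for p in conf.get("spark.plugins", "").split(",")]
    sql_enabled = conf.get("spark.rapids.sql.enabled", "true").strip().lower() == "true"
    return RAPIDS_PLUGIN in plugins and sql_enabled


if __name__ == "__main__":
    # Stand-in pairs for illustration. Inside a batch workload, the pairs
    # could come from spark.sparkContext.getConf().getAll().
    sample = [
        ("spark.plugins", RAPIDS_PLUGIN),
        ("spark.rapids.sql.enabled", "true"),
    ]
    print(rapids_enabled(sample))
```

The check only reads configuration. It does not prove that a given query stage ran on the GPU.
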
XGBoost libraries
The following Maven package versions
are available in Serverless for Apache Spark runtime version 2.2 to use 
XGBoost with Spark in Java or Scala.
| Group ID | Package Name | Version | 
|---|---|---|
| ml.dmlc | xgboost4j-gpu_2.13 | 2.1.1 | 
| ml.dmlc | xgboost4j-spark-gpu_2.13 | 2.1.1 | 
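
The `_2.13` suffix in the package names above follows the usual Scala artifact naming convention, which suggests that Java or Scala applications should be built against Scala 2.13. The sketch below shows how the table rows map to Maven coordinates and how to read the Scala binary version from an artifact name. The helper names are hypothetical and exist only for illustration.

```python
def scala_binary_version(artifact_id: str) -> str:
    """Extract the Scala binary version from an artifact ID such as 'name_2.13'."""
    name, sep, suffix = artifact_id.rpartition("_")
    if not sep or not name:
        raise ValueError(f"no Scala version suffix in {artifact_id!r}")
    return suffix


def maven_coordinate(group_id: str, artifact_id: str, version: str) -> str:
    """Format a group:artifact:version coordinate string."""
    return f"{group_id}:{artifact_id}:{version}"


if __name__ == "__main__":
    # Rows copied from the table above.
    rows = [
        ("ml.dmlc", "xgboost4j-gpu_2.13", "2.1.1"),
        ("ml.dmlc", "xgboost4j-spark-gpu_2.13", "2.1.1"),
    ]
    for group_id, artifact_id, version in rows:
        print(maven_coordinate(group_id, artifact_id, version),
              "built for Scala", scala_binary_version(artifact_id))
```

A coordinate in this form can be used to declare a matching dependency version in a build file, so that the application compiles against the same version that the runtime provides.
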
Python libraries
The following Python library versions are included in
Serverless for Apache Spark runtime version 2.2.
| Package Name | Version | 
|---|---|
| accelerate | 0.33 | 
| bigframes | 1.7 | 
| cookiecutter | 2.6 | 
| cython | 3.0 | 
| dask | 2024.5 | 
| deepspeed | 0.14 | 
| delta-spark | 3.2 | 
| evaluate | 0.4 | 
| fastavro | 1.9 | 
| fastparquet | 2024.2 | 
| gcsfs | 2024.5 | 
| git | 2.45 | 
| google-auth-oauthlib | 1.2 | 
| google-cloud-aiplatform | 1.60 | 
| google-cloud-bigquery | 3.23 | 
| google-cloud-bigquery-storage | 2.25 | 
| google-cloud-bigtable | 2.23 | 
| google-cloud-container | 2.45 | 
| google-cloud-datacatalog | 3.19 | 
| google-cloud-dataproc | 5.9 | 
| google-cloud-datastore | 2.19 | 
| google-cloud-dlp | 3.22 | 
| google-cloud-language | 2.13 | 
| google-cloud-logging | 3.10 | 
| google-cloud-monitoring | 2.21 | 
| google-cloud-pubsub | 2.21 | 
| google-cloud-redis | 2.15 | 
| google-cloud-secret-manager | 2.20 | 
| google-cloud-spanner | 3.46 | 
| google-cloud-speech | 2.26 | 
| google-cloud-storage | 2.16 | 
| google-cloud-texttospeech | 2.16 | 
| google-cloud-translate | 3.15 | 
| google-cloud-vision | 3.7 | 
| httplib2 | 0.22 | 
| huggingface_hub | 0.27 | 
| ipyparallel | 8.8 | 
| ipython-sql | 0.3 | 
| ipywidgets | 8.1 | 
| jupyter_http_over_ws | 0.0 | 
| jupyterlab | 4.1 | 
| jupyterlab-git | 0.50 | 
| keyrings.google-artifactregistry-auth | 1.1 | 
| langchain | 0.2 | 
| lightgbm | 4.5 | 
| markdown | 3.6 | 
| matplotlib | 3.8 | 
| nbclassic | 1.0 | 
| nbconvert | 7.16 | 
| nbdime | 4.0 | 
| nltk | 3.8 | 
| nodejs | 20.12 | 
| numba | 0.59 | 
| numpy | 1.26 | 
| oauth2client | 4.1 | 
| onnx | 1.16 | 
| openblas | 0.3 | 
| opencv | 4.9 | 
| orc | 2.0 | 
| pandas | 2.2 | 
| papermill | 2.6 | 
| pyarrow | 15.0 | 
| pydot | 2.0 | 
| pyhive | 0.7 | 
| pymongo | 4.7 | 
| pynvml | 11.5 | 
| pytables | 3.9 | 
| pytorch-cpu | 2.3 | 
| regex | 2024.5 | 
| requests | 2.31 | 
| rtree | 1.2 | 
| scikit-image | 0.22 | 
| scikit-learn | 1.5 | 
| scipy | 1.11 | 
| seaborn | 0.12 | 
| sentence-transformers | 3.0 | 
| shap | 0.45 | 
| spark-tensorflow-distributor | 1.0 | 
| sparksql-magic | 0.0.3 | 
| sqlalchemy | 2.0 | 
| sympy | 1.12 | 
| tokenizers | 0.19 | 
| torcheval | 0.0.7 | 
| torchvision | 0.18 | 
| toree | 0.5 | 
| tornado | 6.4 | 
| transformers | 4.43 | 
| uritemplate | 4.1 | 
| virtualenv | 20.26 | 
| wordcloud | 1.9 | 
| xgboost | 2.0 | 
| ydata-profiling | 4.8 | 
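
To compare an environment against the versions listed above, a check along the following lines may help. It is a minimal sketch that uses the standard `importlib.metadata` module. The `LISTED` subset and the function names are illustrative, and the table gives only leading version components, so the comparison is a prefix match. Entries such as `git` and `nodejs` are not Python distributions, so a lookup like this would likely report them as missing.

```python
from importlib import metadata

# Illustrative subset of the table above; extend as needed.
LISTED = {"numpy": "1.26", "pandas": "2.2", "pyarrow": "15.0"}


def matches_listed(installed: str, listed: str) -> bool:
    """True if the installed version starts with the listed dot-separated components."""
    want = listed.split(".")
    have = installed.split(".")
    return have[:len(want)] == want


def check_packages(listed: dict[str, str]) -> dict[str, str]:
    """Map each package name to 'ok', 'missing', or 'differs (<installed version>)'."""
    report = {}
    for name, version in listed.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if matches_listed(installed, version) else f"differs ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_packages(LISTED).items():
        print(f"{name}: {status}")
```

Comparing whole components rather than raw string prefixes keeps a listed `2.2` from matching an installed `2.20.0`.
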
R libraries
The following R library versions are included in
Serverless for Apache Spark runtime version 2.2.
| Package Name | Version | 
|---|---|
| askpass | 1.2 | 
| assertthat | 0.2 | 
| backports | 1.5 | 
| bit | 4.0 | 
| bit64 | 4.0 | 
| blob | 1.2 | 
| boot | 1.3_30 | 
| brew | 1.0_10 | 
| broom | 1.0 | 
| callr | 3.7 | 
| caret | 6.0_94 | 
| cellranger | 1.1 | 
| chron | 2.3_61 | 
| class | 7.3_22 | 
| cli | 3.6 | 
| clipr | 0.8 | 
| cluster | 2.1 | 
| codetools | 0.2_20 | 
| colorspace | 2.1_0 | 
| commonmark | 1.9 | 
| cpp11 | 0.4 | 
| crayon | 1.5 | 
| curl | 5.1 | 
| data.table | 1.15 | 
| dbi | 1.2 | 
| dbplyr | 2.5 | 
| desc | 1.4 | 
| devtools | 2.4 | 
| digest | 0.6 | 
| dplyr | 1.1 | 
| ellipsis | 0.3 | 
| evaluate | 0.23 | 
| fansi | 1.0 | 
| fastmap | 1.2 | 
| forcats | 1.0 | 
| foreach | 1.5 | 
| foreign | 0.8_86 | 
| fs | 1.6 | 
| future | 1.33 | 
| generics | 0.1 | 
| ggplot2 | 3.5 | 
| gh | 1.4 | 
| glmnet | 4.1_8 | 
| globals | 0.16 | 
| glue | 1.7 | 
| gower | 1.0 | 
| gtable | 0.3 | 
| haven | 2.5 | 
| highr | 0.10 | 
| hms | 1.1 | 
| htmltools | 0.5.8 | 
| htmlwidgets | 1.6 | 
| httpuv | 1.6 | 
| httr | 1.4 | 
| hwriter | 1.3.2 | 
| ini | 0.3 | 
| ipred | 0.9_14 | 
| isoband | 0.2 | 
| iterators | 1.0 | 
| jsonlite | 1.8 | 
| kernsmooth | 2.23_24 | 
| knitr | 1.46 | 
| labeling | 0.4 | 
| later | 1.3 | 
| lattice | 0.22_6 | 
| lava | 1.7 | 
| lifecycle | 1.0 | 
| listenv | 0.9 | 
| lubridate | 1.9 | 
| magrittr | 2.0 | 
| markdown | 1.12 | 
| mass | 7.3_60 | 
| matrix | 1.6_5 | 
| memoise | 2.0 | 
| mgcv | 1.9_1 | 
| mime | 0.12 | 
| modelmetrics | 1.2.2 | 
| modelr | 0.1 | 
| munsell | 0.5 | 
| nlme | 3.1_164 | 
| nnet | 7.3_19 | 
| numderiv | 2016.8_1 | 
| openssl | 2.2 | 
| pillar | 1.9 | 
| pkgbuild | 1.4 | 
| pkgconfig | 2.0 | 
| pkgload | 1.3 | 
| plogr | 0.2 | 
| plyr | 1.8 | 
| praise | 1.0 | 
| prettyunits | 1.2 | 
| processx | 3.8 | 
| prodlim | 2023.08 | 
| progress | 1.2 | 
| promises | 1.3 | 
| proto | 1.0 | 
| ps | 1.7 | 
| purrr | 1.0 | 
| r6 | 2.5 | 
| randomforest | 4.7_1 | 
| rappdirs | 0.3 | 
| rcmdcheck | 1.4 | 
| rcolorbrewer | 1.1_3 | 
| rcpp | 1.0 | 
| rcurl | 1.98_1 | 
| readr | 2.1 | 
| readxl | 1.4 | 
| recipes | 1.0 | 
| rematch | 2.0 | 
| remotes | 2.5 | 
| reprex | 2.1 | 
| reshape2 | 1.4 | 
| rlang | 1.1 | 
| rmarkdown | 2.27 | 
| rodbc | 1.3_23 | 
| roxygen2 | 7.3 | 
| rpart | 4.1 | 
| rprojroot | 2.0 | 
| rserve | 1.8_7 | 
| rsqlite | 2.3 | 
| rstudioapi | 0.16 | 
| rvest | 1.0 | 
| scales | 1.3 | 
| selectr | 0.4_2 | 
| sessioninfo | 1.2 | 
| shape | 1.4.6 | 
| shiny | 1.8.1 | 
| sourcetools | 0.1 | 
| spatial | 7.3_17 | 
| squarem | 2021.1 | 
| stringi | 1.8 | 
| stringr | 1.5 | 
| survival | 3.6_4 | 
| sys | 3.4 | 
| teachingdemos | 2.12 | 
| testthat | 3.2.1 | 
| tibble | 3.2 | 
| tidyr | 1.3 | 
| tidyselect | 1.2 | 
| tidyverse | 2.0 | 
| timedate | 4032.109 | 
| tinytex | 0.51 | 
| usethis | 2.2 | 
| utf8 | 1.2 | 
| uuid | 1.2_0 | 
| vctrs | 0.6 | 
| whisker | 0.4 | 
| withr | 3.0 | 
| xfun | 0.44 | 
| xml2 | 1.3 | 
| xopen | 1.0 | 
| xtable | 1.8_4 | 
| yaml | 2.3 | 
| zip | 2.3 |