Note: We are merging Analytics Zoo into BigDL 2.0, and our future development will move to the BigDL project.
Analytics Zoo Documentation
Analytics Zoo is an open-source Big Data AI platform that includes the following features for scaling end-to-end AI to distributed Big Data:
- Orca: seamlessly scale out TensorFlow and PyTorch for Big Data (using Spark & Ray)
- RayOnSpark: run Ray programs directly on Big Data clusters
- BigDL Extensions: high-level Spark ML pipeline and Keras-like APIs for BigDL
- Chronos: scalable time series analysis using AutoML
- PPML: privacy-preserving Big Data analysis and machine learning (experimental)
Quick Start
User Guide
Common Use Case
- Use torch.distributed in Orca
- Use Spark Dataframe for Deep Learning
- Use Distributed Pandas for Deep Learning
- Use AutoTSEstimator for Time-Series Forecasting
- Use TSDataset and Forecaster for Time-Series Forecasting
- Use Anomaly Detector for Unsupervised Anomaly Detection
- Use Keras-Like API for BigDL
- Use Spark ML Pipeline for BigDL
- Enable AutoML for PyTorch
- Use AutoXGBoost to auto-tune XGBoost parameters
Orca Overview
Python API
Real-World Application