Analytics Zoo Documentation
Analytics Zoo is an open-source Big Data AI platform that includes the following features for scaling end-to-end AI to distributed Big Data:
Orca: seamlessly scale out TensorFlow and PyTorch for Big Data (using Spark & Ray)
RayOnSpark: run Ray programs directly on Big Data clusters
BigDL Extensions: high-level Spark ML pipeline and Keras-like APIs for BigDL
Chronos: scalable time series analysis using AutoML
PPML: privacy preserving big data analysis and machine learning (experimental)
- Use torch.distributed in Orca
- Use Spark Dataframe for Deep Learning
- Use Distributed Pandas for Deep Learning
- Use AutoML for Time-Series Forecasting
- Use TSDataset and Forecaster for Time-Series Forecasting
- Use Anomaly Detector for Unsupervised Anomaly Detection
- Use Keras-Like API for BigDL
- Use Spark ML Pipeline for BigDL
- Enable AutoML for PyTorch
- Use AutoXGBoost to auto-tune XGBoost parameters