Use Anomaly Detector for Unsupervised Anomaly Detection
In this guide, we demonstrate how to use the Chronos Anomaly Detector for time series anomaly detection in 3 simple steps.
Step 0: Prepare Environment
We recommend using conda to prepare the environment. Please refer to the install guide for more details.
conda create -n zoo python=3.7 # "zoo" is conda environment name, you can use any name you like.
conda activate zoo
pip install analytics-zoo[automl] # install either version 0.10 or latest nightly build
Step 1: Prepare dataset
For demonstration, we use the publicly available real-time traffic data from the Twin Cities Metro area in Minnesota, collected by the Minnesota Department of Transportation. Detailed information can be found here.
Next, we clean and preprocess the raw data. Note that this part may vary between datasets.
For this dataset, the preprocessing consists of 2 parts:
Change the time interval from irregular to 5 minutes.
Check missing values and handle missing data.
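from zoo.chronos.data import TSDataset
# df is the raw data as a pandas DataFrame with "timestamp" and "value" columns
tsdata = TSDataset.from_pandas(df, dt_col="timestamp", target_col="value")
# resample to a regular 5-minute interval, then fill missing values linearly
df = tsdata.resample("5min")\
           .impute(mode="linear")\
           .to_pandas()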
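To make the effect of the two preprocessing steps concrete, here is a minimal sketch using plain pandas rather than TSDataset. The timestamps and values are made up for illustration, and aggregating each 5-minute bin by its mean is an assumption of this sketch, not a statement about how TSDataset.resample works internally.

```python
import pandas as pd

# Hypothetical raw readings with irregular timestamps (values are made up).
raw = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-01-01 00:00:00",
        "2021-01-01 00:03:00",
        "2021-01-01 00:11:00",
        "2021-01-01 00:21:00",
    ]),
    "value": [10.0, 20.0, 30.0, 50.0],
})

# Part 1: put the series on a regular 5-minute grid.
# Each bin holds the mean of the readings that fall into it (an assumption
# of this sketch); bins without any reading become NaN.
regular = raw.set_index("timestamp")["value"].resample("5min").mean()

# Part 2: fill the empty bins by linear interpolation between neighbours.
filled = regular.interpolate(method="linear")

print("missing bins before imputation:", int(regular.isna().sum()))
print(filled)
```

In this toy series, the 00:05 and 00:15 bins receive no reading, so they are the ones filled in by interpolation.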
Step 2: Use Chronos Anomaly Detector
Chronos provides several anomaly detectors. Here we use DBScan as an example. More anomaly detectors can be found here.
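from zoo.chronos.detector.anomaly import DBScanDetector
ad = DBScanDetector(eps=0.3, min_samples=6)
# fit the detector on the 1-D array of preprocessed values
ad.fit(df['value'].to_numpy())
# retrieve the indexes of the samples detected as anomalies
anomaly_indexes = ad.anomaly_indexes()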
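For intuition about how a DBSCAN-based detector can flag anomalies, the sketch below applies scikit-learn's DBSCAN directly to a small, made-up 1-D series and treats the points labelled as noise (-1) as anomalies. This is an illustration under the assumption that noise points are what counts as anomalous; it is not DBScanDetector's implementation, which may differ.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical series: nine values close to 1.0 and one obvious spike.
values = np.array([1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 5.0, 1.0])

# DBSCAN expects a 2-D array of shape (n_samples, n_features).
labels = DBSCAN(eps=0.3, min_samples=6).fit(values.reshape(-1, 1)).labels_

# Points that belong to no dense cluster get the label -1 (noise).
noise_indexes = np.where(labels == -1)[0]

print(noise_indexes)  # positions of the points treated as anomalies
```

Here eps is the neighbourhood radius and min_samples is the number of points required within that radius for a point to count as part of a dense region. If anomaly_indexes holds integer positions into the fitted array, the corresponding rows can be looked up with df.iloc[anomaly_indexes].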