Simulator

DPGANSimulator

class zoo.chronos.simulator.doppelganger_simulator.DPGANSimulator(L_max, sample_len, feature_dim, num_real_attribute, discriminator_num_layers=5, discriminator_num_units=200, attr_discriminator_num_layers=5, attr_discriminator_num_units=200, attribute_num_units=100, attribute_num_layers=3, feature_num_units=100, feature_num_layers=1, attribute_input_noise_dim=5, addi_attribute_input_noise_dim=5, d_gp_coe=10, attr_d_gp_coe=10, g_attr_d_coe=1, d_lr=0.001, attr_d_lr=0.001, g_lr=0.001, g_rounds=1, d_rounds=1, seed=0, num_threads=None, ckpt_dir='.', checkpoint_every_n_epoch=0)

Bases: object

Doppelganger Simulator for time series generation. The code and algorithm are adapted from https://github.com/fjxmlzn/DoppelGANger.

Initialize a doppelganger simulator.

Parameters

L_max – the maximum length of the feature time series.
sample_len – the sample length that controls the LSTM length; it should be a divisor of L_max.
feature_dim – dimension of the feature.
num_real_attribute – the length of your attribute, which should be equal to len(data_attribute).
discriminator_num_layers – MLP layer num for discriminator.
discriminator_num_units – MLP hidden unit for discriminator.
attr_discriminator_num_layers – MLP layer num for attr discriminator.
attr_discriminator_num_units – MLP hidden unit for attr discriminator.
attribute_num_units – MLP hidden unit for attr generator/addi attr generator.
attribute_num_layers – MLP layer num for attr generator/addi attr generator.
feature_num_units – LSTM hidden unit for feature generator.
feature_num_layers – LSTM layer num for feature generator.
attribute_input_noise_dim – noise data dim for attr generator.
addi_attribute_input_noise_dim – noise data dim for addi attr generator.
d_gp_coe – gradient penalty coefficient for the discriminator loss.
attr_d_gp_coe – gradient penalty coefficient for the attr discriminator loss.
g_attr_d_coe – ratio between the feature loss and the attr loss in the generator loss.
d_lr – learning rate for discriminator.
attr_d_lr – learning rate for attr discriminator.
g_lr – learning rate for generators.
g_rounds – number of generator training rounds.
d_rounds – number of discriminator training rounds.
seed – random seed.
num_threads – number of threads to be used for training.
ckpt_dir – the checkpoint location; defaults to the working directory.
checkpoint_every_n_epoch – save a checkpoint every n epochs; defaults to 0, which means no checkpoints.
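
The requirement that sample_len be a divisor of L_max can be checked before the simulator is constructed. The following is a minimal sketch in plain Python; valid_sample_lens is a hypothetical helper written for illustration and is not part of the Chronos API.

```python
def valid_sample_lens(l_max):
    """Return every positive integer that divides l_max evenly.

    Hypothetical helper: any returned value satisfies the documented
    constraint that sample_len is a divisor of L_max.
    """
    if l_max <= 0:
        raise ValueError("l_max must be a positive integer")
    return [n for n in range(1, l_max + 1) if l_max % n == 0]


# Example: candidate sample_len values for series of maximum length 24.
candidates = valid_sample_lens(24)
print(candidates)
```

Any value in the returned list could be passed as sample_len together with the same L_max.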

fit(data_feature, data_attribute, data_gen_flag, feature_outputs, attribute_outputs, epoch=1, batch_size=32)

Fit on the training data (typically the private data).

Parameters

data_feature – Training features, in numpy float32 array format. The size is [(number of training samples) x (maximum length) x (total dimension of features)]. Categorical features are stored by one-hot encoding; for example, if a categorical feature has 3 possibilities, then it takes one of the values [1., 0., 0.], [0., 1., 0.], and [0., 0., 1.]. Each continuous feature should be normalized to [0, 1] or [-1, 1]. The array is padded with zeros after the time series ends.
data_attribute – Training attributes, in numpy float32 array format. The size is [(number of training samples) x (total dimension of attributes)]. Categorical attributes are stored by one-hot encoding; for example, if a categorical attribute has 3 possibilities, then it takes one of the values [1., 0., 0.], [0., 1., 0.], and [0., 0., 1.]. Each continuous attribute should be normalized to [0, 1] or [-1, 1].
data_gen_flag – Flags indicating the activation of features, in numpy float32 array format. The size is [(number of training samples) x (maximum length)]. 1 means the time series is active at this time step; 0 means the time series is inactive at this time step.
feature_outputs – A list of Output objects describing the metadata of data_feature.
attribute_outputs – A list of Output objects describing the metadata of data_attribute.
epoch – number of training epochs.
batch_size – training batch size.
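
The array layouts described above can be illustrated with a short NumPy sketch. It only prepares data_feature, data_gen_flag and data_attribute in the documented shapes; it does not call the simulator. The helpers pad_series, one_hot and min_max_scale, and the toy values, are assumptions made for illustration and are not part of the Chronos API.

```python
import numpy as np


def pad_series(series_list, l_max):
    """Zero-pad variable-length series to l_max and build the gen flags.

    Each item of series_list has shape (length, feature_dim).
    """
    feature_dim = series_list[0].shape[1]
    data_feature = np.zeros((len(series_list), l_max, feature_dim), dtype=np.float32)
    data_gen_flag = np.zeros((len(series_list), l_max), dtype=np.float32)
    for i, series in enumerate(series_list):
        length = series.shape[0]
        if length > l_max:
            raise ValueError("series is longer than l_max")
        data_feature[i, :length, :] = series  # values after the end stay zero
        data_gen_flag[i, :length] = 1.0       # 1 marks an active time step
    return data_feature, data_gen_flag


def one_hot(labels, num_classes):
    """One-hot encode integer class labels as float32 rows."""
    return np.eye(num_classes, dtype=np.float32)[np.asarray(labels)]


def min_max_scale(values):
    """Scale a 1-D sequence of continuous values to [0, 1]."""
    values = np.asarray(values, dtype=np.float32)
    low, high = values.min(), values.max()
    if high == low:
        return np.zeros_like(values)
    return (values - low) / (high - low)


# Two toy series with one continuous feature already normalized to [0, 1].
series_list = [
    np.array([[0.1], [0.5], [0.9]], dtype=np.float32),  # length 3
    np.array([[0.2], [0.4]], dtype=np.float32),         # length 2
]
data_feature, data_gen_flag = pad_series(series_list, l_max=4)

# One categorical attribute (3 classes) plus one continuous attribute.
category = one_hot([0, 2], num_classes=3)           # shape (2, 3)
continuous = min_max_scale([10.0, 30.0])[:, None]   # shape (2, 1)
data_attribute = np.concatenate([category, continuous], axis=1)

print(data_feature.shape, data_gen_flag.shape, data_attribute.shape)
```

In this sketch the total attribute dimension is 4: three one-hot columns followed by one normalized continuous column.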

generate(sample_num=1, batch_size=32)

Generate synthetic data with a distribution similar to that of the training data.

Parameters

sample_num – number of samples to generate.
batch_size – batch size used for generation.

save(path_dir)

Save the simulator.

Parameters

path_dir – saving path

load(path_dir, model_version='doppelganger.ckpt')

Load the simulator.

Parameters

path_dir – the path the simulator was saved to.
model_version – model version (filename) you would like to load.