Getting started
Welcome to deepdow! This tutorial is going to demonstrate all the essential features.
Before you continue, make sure to check out Basics to familiarize yourself with the core ideas
of deepdow. This hands-on tutorial is divided into 4 sections:
Dataset creation and loading
Network definition
Training
Evaluation and visualization of results
Preliminaries
Let us start by importing all the required dependencies.
from deepdow.benchmarks import Benchmark, OneOverN, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader, prepare_standard_scaler, Scale
from deepdow.data.synthetic import sin_single
from deepdow.experiments import Run
from deepdow.layers import SoftmaxAllocator
from deepdow.losses import MeanReturns, SharpeRatio, MaximumDrawdown
from deepdow.visualize import generate_metrics_table, generate_weights_table, plot_metrics, plot_weight_heatmap
import matplotlib.pyplot as plt
import numpy as np
import torch
In order to be able to reproduce all results we set both the numpy and torch seed.
torch.manual_seed(4)
np.random.seed(5)
Dataset creation and loading
In this example, we are going to be using a synthetic dataset. Asset returns are going to be sine functions where the frequency and phase are randomly selected for each asset. First of all let us set all the parameters relevant to data creation.
n_timesteps, n_assets = 1000, 20
lookback, gap, horizon = 40, 2, 20
n_samples = n_timesteps - lookback - horizon - gap + 1
Additionally, we will use approximately 80% of the data for training and 20% for testing.
split_ix = int(n_samples * 0.8)
indices_train = list(range(split_ix))
indices_test = list(range(split_ix + lookback + horizon, n_samples))
print('Train range: {}:{}\nTest range: {}:{}'.format(indices_train[0], indices_train[-1],
                                                     indices_test[0], indices_test[-1]))
Out:
Train range: 0:750
Test range: 811:938
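The index arithmetic above can be checked by hand. The sketch below recomputes the sample count and the split, then maps the boundary samples back to raw timesteps. `sample_span` is a hypothetical helper introduced only for this illustration, and it assumes the window layout used by the rolling-window loop in this tutorial (inputs cover `lookback` steps, targets start `gap` steps after the inputs end and cover `horizon` steps).

```python
n_timesteps, lookback, gap, horizon = 1000, 40, 2, 20

n_samples = n_timesteps - lookback - horizon - gap + 1
split_ix = int(n_samples * 0.8)

train_first, train_last = 0, split_ix - 1
test_first, test_last = split_ix + lookback + horizon, n_samples - 1


def sample_span(j):
    """Hypothetical helper: first and last raw timestep touched by sample j.

    Assumes sample j reads inputs from timesteps j .. j + lookback - 1 and
    targets from j + lookback + gap .. j + lookback + gap + horizon - 1.
    """
    return j, j + lookback + gap + horizon - 1


print(n_samples, split_ix)
print("train samples:", train_first, train_last)
print("test samples:", test_first, test_last)
print("last train span:", sample_span(train_last))
print("first test span:", sample_span(test_first))
```

Under the assumed layout, the last training sample reaches timestep 811 and the first test sample starts at timestep 811, so the two sets touch at a single timestep. Offsetting the test indices by `lookback + horizon + gap` instead would separate them completely.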
Now we can generate the synthetic asset returns with shape (n_timesteps, n_assets).
returns = np.array([sin_single(n_timesteps,
                               freq=1 / np.random.randint(3, lookback),
                               amplitude=0.05,
                               phase=np.random.randint(0, lookback)
                               ) for _ in range(n_assets)]).T
We also add some noise.
returns += np.random.normal(scale=0.02, size=returns.shape)
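To make the data-generating process concrete, here is a minimal NumPy sketch of one noisy sine series. `toy_sine` is a hypothetical stand-in written for this illustration; it is an assumption about what a sine generator of this kind computes, and deepdow's `sin_single` may differ in its exact formula and defaults.

```python
import numpy as np


def toy_sine(n_timesteps, freq, amplitude, phase):
    """Hypothetical sine generator; deepdow's `sin_single` may differ."""
    t = np.arange(n_timesteps)
    return amplitude * np.sin(2 * np.pi * freq * t + phase)


rng = np.random.default_rng(0)

# One asset with a period of 20 timesteps and the tutorial's amplitude
clean = toy_sine(n_timesteps=200, freq=1 / 20, amplitude=0.05, phase=0)

# Gaussian noise with the same scale as in the tutorial
noisy = clean + rng.normal(scale=0.02, size=clean.shape)

print(clean.shape, np.abs(clean).max())
print(np.std(noisy - clean))
```

The clean signal never exceeds the amplitude of 0.05, while the noise has a standard deviation of roughly 0.02, so the periodic pattern stays visible but is not trivially predictable.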
See below the first 100 timesteps of 2 assets.
plt.plot(returns[:100, [1, 2]])
Out:
[<matplotlib.lines.Line2D object at 0x7f621d2b2e10>, <matplotlib.lines.Line2D object at 0x7f621caf7dd0>]
To obtain the feature matrix X and the target y we apply the rolling window strategy.
X_list, y_list = [], []
for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(returns[i - lookback: i, :])
    y_list.append(returns[i + gap: i + gap + horizon, :])
X = np.stack(X_list, axis=0)[:, None, ...]
y = np.stack(y_list, axis=0)[:, None, ...]
print('X: {}, y: {}'.format(X.shape, y.shape))
Out:
X: (939, 1, 40, 20), y: (939, 1, 20, 20)
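The same loop is easier to follow on a toy series whose values equal their timestep index. The sketch below uses made-up small parameters (10 timesteps, 1 asset, `lookback=3`, `gap=1`, `horizon=2`) purely for illustration.

```python
import numpy as np

# Toy series: 10 timesteps, 1 asset, values equal to the timestep index
n_timesteps, lookback, gap, horizon = 10, 3, 1, 2
toy_returns = np.arange(n_timesteps, dtype=float).reshape(-1, 1)

X_toy, y_toy = [], []
for i in range(lookback, n_timesteps - horizon - gap + 1):
    # inputs: the `lookback` timesteps right before i
    X_toy.append(toy_returns[i - lookback: i, :])
    # targets: skip `gap` timesteps starting at i, then take `horizon` timesteps
    y_toy.append(toy_returns[i + gap: i + gap + horizon, :])

X_toy = np.stack(X_toy, axis=0)[:, None, ...]  # add the channel dimension
y_toy = np.stack(y_toy, axis=0)[:, None, ...]

print(X_toy.shape, y_toy.shape)
print(X_toy[0].ravel(), y_toy[0].ravel())
```

The first sample uses timesteps 0, 1 and 2 as input and timesteps 4 and 5 as target; timestep 3 falls into the gap and is used by neither.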
As is common in deep learning applications, we want to scale our input features to
be approximately centered around 0 and have a standard deviation of 1. In deepdow we
can achieve this with the prepare_standard_scaler function that computes the mean
and standard deviation of the input (for each channel). Additionally, we do not want to leak
any information from our test set and therefore we only compute these statistics over the
training set.
means, stds = prepare_standard_scaler(X, indices=indices_train)
print('mean: {}, std: {}'.format(means, stds))
Out:
mean: [-9.56904164e-07], std: [0.04066513]
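The idea behind per-channel standardization can be sketched in plain NumPy. This is not deepdow's implementation: it assumes that every value inside the training windows contributes to the statistics, which may differ from what `prepare_standard_scaler` does internally. The toy tensor and the names `X_toy`, `train_ix`, `toy_means` and `toy_stds` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tensor shaped like X in this tutorial: (n_samples, n_channels, lookback, n_assets)
X_toy = rng.normal(loc=0.01, scale=0.04, size=(100, 1, 40, 20))
train_ix = list(range(80))

# Per-channel statistics over the training samples only (axes: samples, lookback, assets).
# Assumption: all values in the training windows are used.
toy_means = X_toy[train_ix].mean(axis=(0, 2, 3))
toy_stds = X_toy[train_ix].std(axis=(0, 2, 3))

# Broadcast the per-channel statistics over the remaining axes
X_scaled = (X_toy - toy_means[None, :, None, None]) / toy_stds[None, :, None, None]

print(toy_means.shape, toy_stds.shape)
print(X_scaled[train_ix].mean(), X_scaled[train_ix].std())
```

The training portion ends up with mean 0 and standard deviation 1 (up to floating-point error), while the held-out samples are scaled with the same training statistics and therefore leak nothing into them.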
We can now construct the InRAMDataset. By providing the optional transform we
make sure that when the samples are streamed they are always scaled based on our computed
(training) statistics. See InRAMDataset for more details.
dataset = InRAMDataset(X, y, transform=Scale(means, stds))
Using the dataset we can now construct two dataloaders: one for training and the other one
for testing. For more details see Dataloaders.
dataloader_train = RigidDataLoader(dataset,
                                   indices=indices_train,
                                   batch_size=32)
dataloader_test = RigidDataLoader(dataset,
                                  indices=indices_test,
                                  batch_size=32)
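As a quick sanity check, the number of batches per epoch follows from the index ranges printed earlier. The sketch assumes that the final, smaller batch is kept rather than dropped; that assumption is consistent with the 24 iterations per epoch reported by the progress bar in the training output.

```python
import math

n_train_samples = 750 - 0 + 1    # train range 0:750
n_test_samples = 938 - 811 + 1   # test range 811:938
batch_size = 32

# Assumption: the last partial batch is kept, hence the ceiling
n_train_batches = math.ceil(n_train_samples / batch_size)
n_test_batches = math.ceil(n_test_samples / batch_size)

print(n_train_batches, n_test_batches)
```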
Network definition
Let us now write a custom network. See Writing custom networks.
class GreatNet(torch.nn.Module, Benchmark):
    def __init__(self, n_assets, lookback, p=0.5):
        super().__init__()
        n_features = n_assets * lookback
        self.dropout_layer = torch.nn.Dropout(p=p)
        self.dense_layer = torch.nn.Linear(n_features, n_assets, bias=True)
        self.allocate_layer = SoftmaxAllocator(temperature=None)
        self.temperature = torch.nn.Parameter(torch.ones(1), requires_grad=True)

    def forward(self, x):
        """Perform forward pass.

        Parameters
        ----------
        x : torch.Tensor
            Of shape (n_samples, 1, lookback, n_assets).

        Returns
        -------
        weights : torch.Tensor
            Tensor of shape (n_samples, n_assets).
        """
        n_samples, _, _, _ = x.shape
        x = x.view(n_samples, -1)  # flatten features
        x = self.dropout_layer(x)
        x = self.dense_layer(x)
        temperatures = torch.ones(n_samples).to(device=x.device, dtype=x.dtype) * self.temperature
        weights = self.allocate_layer(x, temperatures)
        return weights
So what is this network doing? First of all, we make an assumption that assets and lookback will
never change (the same shape and order at train and at inference time). This assumption
is justified since we are using RigidDataLoader.
We can learn n_assets linear models that have n_assets * lookback features. In
other words we have a dense layer that takes the flattened feature tensor x and returns
a vector of length n_assets. Since elements of this vector can range from \(-\infty\)
to \(\infty\) we turn it into an asset allocation via SoftmaxAllocator.
Additionally, we learn the temperature from the data. This will enable us to learn the
optimal trade-off between an equally weighted allocation (uniform distribution) and
single asset portfolios.
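The role of the temperature can be illustrated with a plain-Python sketch of a temperature-scaled softmax, softmax(scores / temperature). It is an assumption that this mirrors what SoftmaxAllocator computes; the function name and the scores below are made up for the illustration.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Temperature-scaled softmax: softmax(scores / temperature)."""
    scaled = [s / temperature for s in scores]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.0, -1.0]  # made-up dense layer outputs for 4 assets

for temperature in (100.0, 1.0, 0.01):
    weights = softmax_with_temperature(scores, temperature)
    print(temperature, [round(w, 3) for w in weights])
```

A large temperature pushes the weights towards the equally weighted allocation, whereas a temperature close to zero concentrates almost all weight on the asset with the highest score.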
network = GreatNet(n_assets, lookback)
print(network)
Out:
GreatNet(
  (dropout_layer): Dropout(p=0.5, inplace=False)
  (dense_layer): Linear(in_features=800, out_features=20, bias=True)
  (allocate_layer): SoftmaxAllocator(
    (layer): Softmax(dim=1)
  )
)
In torch networks are either in the train or eval mode. Since we are using
dropout it is essential that we set the mode correctly based on what we are trying to do.
network = network.train() # it is the default, however, just to make the distinction clear
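Why the mode matters for dropout can be seen in a small NumPy sketch of the mechanism. `toy_dropout` is a hypothetical helper that imitates inverted dropout; it is not torch's implementation and only illustrates the difference between the two modes.

```python
import numpy as np


def toy_dropout(x, p, training, rng):
    """Hypothetical inverted-dropout helper (not torch's implementation)."""
    if not training:
        return x  # eval mode: the input passes through unchanged
    keep_mask = rng.random(x.shape) >= p  # keep each element with probability 1 - p
    return x * keep_mask / (1.0 - p)      # rescale so the expected value is unchanged


rng = np.random.default_rng(0)
x = np.ones(10)

print(toy_dropout(x, p=0.5, training=True, rng=rng))   # each entry is either 0.0 or 2.0
print(toy_dropout(x, p=0.5, training=False, rng=rng))  # identical to x
```

In train mode the output is random, so two forward passes on the same input generally disagree; in eval mode the layer is deterministic, which is what we want when computing allocations for evaluation.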
Training
It is now time to define our loss. Let’s say we want to achieve multiple objectives at the same
time. We want to minimize the drawdowns, maximize the mean returns and also maximize the Sharpe
ratio. All of these losses are implemented in deepdow.losses. To avoid confusion, they
are always implemented in a way that the lower the value of the loss the better. To combine
multiple objectives we can simply sum all of the individual losses. Similarly, if we want to
assign more importance to one of them we can achieve this by multiplying by a constant. To learn
more see Losses.
loss = MaximumDrawdown() + 2 * MeanReturns() + SharpeRatio()
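One way such an arithmetic on losses can work is operator overloading. The sketch below is a toy illustration under that assumption, not deepdow's implementation: `ToyLoss`, `neg_mean_returns` and `largest_drop` are hypothetical names, and the two toy losses are much simpler than the real ones.

```python
class ToyLoss:
    """Hypothetical stand-in for a loss object; not deepdow's implementation."""

    def __init__(self, fn):
        self.fn = fn  # callable mapping a list of portfolio returns to a number

    def __call__(self, returns):
        return self.fn(returns)

    def __add__(self, other):
        # The sum of two losses is again a loss
        return ToyLoss(lambda r: self(r) + other(r))

    def __rmul__(self, constant):
        # constant * loss rescales the loss
        return ToyLoss(lambda r: constant * self(r))


def mean(values):
    return sum(values) / len(values)


# "Lower is better": a quantity we want to maximize enters with a minus sign
neg_mean_returns = ToyLoss(lambda r: -mean(r))
# Crude risk measure: size of the worst single-period return
largest_drop = ToyLoss(lambda r: -min(r))

combined = largest_drop + 2 * neg_mean_returns

portfolio_returns = [0.01, -0.02, 0.03, 0.04]
print(combined(portfolio_returns))
```

For these made-up returns the worst period contributes 0.02 and the doubled negative mean contributes -0.03, so the combined value is about -0.01. Doubling the weight of the mean-return term makes the optimizer care twice as much about it relative to the risk term.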
Note that by default all the losses assume that we input logarithmic returns
(input_type='log') and that they are in the 0th channel (returns_channel=0).
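For readers less familiar with the distinction, the following sketch shows how simple and logarithmic returns relate, using made-up numbers. It relies only on the standard identities log_return = log(1 + simple_return) and simple_return = exp(log_return) - 1.

```python
import math

simple_returns = [0.05, -0.02, 0.01]  # made-up simple (percentage) returns

# Simple -> logarithmic
log_returns = [math.log(1 + r) for r in simple_returns]

# Logarithmic returns add up over time ...
total_log_return = sum(log_returns)

# ... whereas simple returns compound multiplicatively
total_simple_return = 1.0
for r in simple_returns:
    total_simple_return *= 1 + r
total_simple_return -= 1

print(total_simple_return, math.exp(total_log_return) - 1)
```

Both printed numbers agree up to floating-point error, which is why a loss needs to know the return type before it aggregates returns over the horizon.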
We now have all the ingredients ready for training of the neural network. deepdow provides
a simple wrapper Run that implements the training loop and a minimal callback
framework. For further information see Experiments.
run = Run(network,
          loss,
          dataloader_train,
          val_dataloaders={'test': dataloader_test},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback(metric_name='loss',
                                           dataloader_name='test',
                                           patience=15)])
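The patience argument is easiest to understand through a generic sketch of patience-based early stopping. `first_stopping_epoch` is a hypothetical helper and the validation losses are made up; the exact stopping rule of EarlyStoppingCallback may differ from this assumption.

```python
def first_stopping_epoch(losses, patience):
    """Hypothetical helper: epoch at which patience-based early stopping triggers.

    Stops once `patience` consecutive epochs have passed without a new best
    (lowest) loss. Returns None if that never happens.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses (lower is better)
val_losses = [-0.30, -0.45, -0.60, -0.58, -0.59, -0.57, -0.62]

print(first_stopping_epoch(val_losses, patience=2))
print(first_stopping_epoch(val_losses, patience=15))
```

With a patience of 2 the toy run stops at epoch 4, two epochs after the best value at epoch 2, and never sees the later improvement at epoch 6. With a patience of 15 it never stops, which shows the trade-off between saving compute and tolerating noisy validation curves.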
To run the training loop, we call the launch method, where we specify the number of epochs.
history = run.launch(30)
Out:
Epoch 0: 100%|##########| 24/24 [00:00<00:00, 101.11it/s, loss=-0.19150, test_loss=-0.32334]
Epoch 1: 100%|##########| 24/24 [00:00<00:00, 101.81it/s, loss=-0.53542, test_loss=-0.48129]
Epoch 2: 100%|##########| 24/24 [00:00<00:00, 105.15it/s, loss=-0.68696, test_loss=-0.58786]
Epoch 3: 100%|##########| 24/24 [00:00<00:00, 104.31it/s, loss=-0.74774, test_loss=-0.63107]
Epoch 4: 100%|##########| 24/24 [00:00<00:00, 104.57it/s, loss=-0.78377, test_loss=-0.64322]
Epoch 5: 100%|##########| 24/24 [00:00<00:00, 104.91it/s, loss=-0.80707, test_loss=-0.69094]
Epoch 6: 100%|##########| 24/24 [00:00<00:00, 104.76it/s, loss=-0.81209, test_loss=-0.66785]
Epoch 7: 100%|##########| 24/24 [00:00<00:00, 104.93it/s, loss=-0.81451, test_loss=-0.68186]
Epoch 8: 100%|##########| 24/24 [00:00<00:00, 105.26it/s, loss=-0.83431, test_loss=-0.67144]
Epoch 9: 100%|##########| 24/24 [00:00<00:00, 100.19it/s, loss=-0.83320, test_loss=-0.72668]
Epoch 10: 100%|##########| 24/24 [00:00<00:00, 104.57it/s, loss=-0.83957, test_loss=-0.72519]
Epoch 11: 100%|##########| 24/24 [00:00<00:00, 104.82it/s, loss=-0.85705, test_loss=-0.74037]
Epoch 12: 100%|##########| 24/24 [00:00<00:00, 104.59it/s, loss=-0.83476, test_loss=-0.70057]
Epoch 13: 100%|##########| 24/24 [00:00<00:00, 104.57it/s, loss=-0.86633, test_loss=-0.73863]
Epoch 14: 100%|##########| 24/24 [00:00<00:00, 104.26it/s, loss=-0.86312, test_loss=-0.72877]
Epoch 15: 100%|##########| 24/24 [00:00<00:00, 105.00it/s, loss=-0.86605, test_loss=-0.72968]
Epoch 16: 100%|##########| 24/24 [00:00<00:00, 104.87it/s, loss=-0.86488, test_loss=-0.72985]
Epoch 17: 100%|##########| 24/24 [00:00<00:00, 105.12it/s, loss=-0.87660, test_loss=-0.73495]
Epoch 18: 100%|##########| 24/24 [00:00<00:00, 103.51it/s, loss=-0.88587, test_loss=-0.71993]
Epoch 19: 100%|##########| 24/24 [00:00<00:00, 105.11it/s, loss=-0.88894, test_loss=-0.72353]
Epoch 20: 100%|##########| 24/24 [00:00<00:00, 104.51it/s, loss=-0.88992, test_loss=-0.72201]
Epoch 21: 100%|##########| 24/24 [00:00<00:00, 104.28it/s, loss=-0.88759, test_loss=-0.71689]
Epoch 22: 100%|##########| 24/24 [00:00<00:00, 104.35it/s, loss=-0.87937, test_loss=-0.70188]
Epoch 23: 100%|##########| 24/24 [00:00<00:00, 104.55it/s, loss=-0.89662, test_loss=-0.70119]
Epoch 24: 100%|##########| 24/24 [00:00<00:00, 104.24it/s, loss=-0.89681, test_loss=-0.72673]
Epoch 25: 100%|##########| 24/24 [00:00<00:00, 104.94it/s, loss=-0.90185, test_loss=-0.71209]
Epoch 26: 100%|##########| 24/24 [00:00<00:00, 103.79it/s, loss=-0.89895, test_loss=-0.70702]
Training interrupted
Training stopped early because there was no improvement in test_loss for 15 epochs
Evaluation and visualization¶
The history object returned by launch contains a lot of useful information related to training. Specifically, the metrics property returns a comprehensive pd.DataFrame.
To display the average test loss for each epoch, we can run the following.
per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value']
print(per_epoch_results.count()) # double check number of samples each epoch
print(per_epoch_results.mean()) # mean loss per epoch
Out:
dataloader  metric  model    epoch
test        loss    network  0        128
                             1        128
                             2        128
                             3        128
                             4        128
                             5        128
                             6        128
                             7        128
                             8        128
                             9        128
                             10       128
                             11       128
                             12       128
                             13       128
                             14       128
                             15       128
                             16       128
                             17       128
                             18       128
                             19       128
                             20       128
                             21       128
                             22       128
                             23       128
                             24       128
                             25       128
                             26       128
Name: value, dtype: int64
dataloader  metric  model    epoch
test        loss    network  0       -0.323336
                             1       -0.481293
                             2       -0.587856
                             3       -0.631074
                             4       -0.643219
                             5       -0.690939
                             6       -0.667855
                             7       -0.681860
                             8       -0.671437
                             9       -0.726678
                             10      -0.725190
                             11      -0.740370
                             12      -0.700567
                             13      -0.738635
                             14      -0.728765
                             15      -0.729683
                             16      -0.729853
                             17      -0.734949
                             18      -0.719934
                             19      -0.723532
                             20      -0.722008
                             21      -0.716887
                             22      -0.701883
                             23      -0.701192
                             24      -0.726732
                             25      -0.712086
                             26      -0.707021
Name: value, dtype: float64
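The per-epoch means above also show why training stopped after epoch 26: the lowest test loss occurs at epoch 11, and the following 15 epochs never improve on it. The snippet below is a minimal sketch of such a patience rule in plain Python. The function early_stopping_epoch is a hypothetical helper written for illustration and is not the implementation of EarlyStoppingCallback; the patience of 15 is taken from the log message.

```python
def early_stopping_epoch(losses, patience):
    """Return the epoch at which a patience rule would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            # New best value: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch
    return None


# Mean test loss per epoch, copied from the output above.
test_losses = [
    -0.323336, -0.481293, -0.587856, -0.631074, -0.643219, -0.690939,
    -0.667855, -0.681860, -0.671437, -0.726678, -0.725190, -0.740370,
    -0.700567, -0.738635, -0.728765, -0.729683, -0.729853, -0.734949,
    -0.719934, -0.723532, -0.722008, -0.716887, -0.701883, -0.701192,
    -0.726732, -0.712086, -0.707021,
]

best_epoch = min(range(len(test_losses)), key=test_losses.__getitem__)
stop_epoch = early_stopping_epoch(test_losses, patience=15)
print(best_epoch, stop_epoch)  # prints: 11 26
```

Under this rule the best epoch is 11 and the stopping epoch is 26, which agrees with the training log.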
per_epoch_results.mean()['test']['loss']['network'].plot()
Out:
<AxesSubplot:xlabel='epoch'>
To get more insight into what our network predicts, we can use the deepdow.visualize module.
Before we start any further evaluation, let us make sure the network is in eval mode.
network = network.eval()
To put the performance of our network in context, we also utilize benchmarks. deepdow already offers multiple benchmarks. Additionally, one can provide custom simple benchmarks or pre-trained networks.
benchmarks = {
'1overN': OneOverN(), # each asset has weight 1 / n_assets
'random': Random(), # random allocation that is, however, close to 1overN
'network': network
}
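To make the two simple benchmarks more concrete, here is a rough sketch of what an equally weighted and a random allocation could look like. The helpers one_over_n_weights and random_weights are hypothetical and written in plain Python; in particular, normalizing uniform draws is only an assumption about how a random allocation might be produced, and the actual OneOverN and Random classes may work differently.

```python
import random


def one_over_n_weights(n_assets):
    """Equal allocation: every asset receives weight 1 / n_assets."""
    return [1.0 / n_assets] * n_assets


def random_weights(n_assets, rng):
    """Random allocation: uniform draws rescaled so that they sum to 1."""
    raw = [rng.random() for _ in range(n_assets)]
    total = sum(raw)
    return [x / total for x in raw]


rng = random.Random(0)  # seeded for reproducibility
equal = one_over_n_weights(20)
rand = random_weights(20, rng)

print(round(sum(equal), 6), round(sum(rand), 6))
```

Both allocations are non-negative and sum to one, so they can be compared with the network's output on equal terms.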
During training, the only mandatory metric/loss was the loss criterion that we tried to minimize. Naturally, one might be interested in many other metrics to evaluate the performance. See an example below.
metrics = {
'MaxDD': MaximumDrawdown(),
'Sharpe': SharpeRatio(),
'MeanReturn': MeanReturns()
}
Let us now use the objects created above. We first generate a table with all metrics over all samples and for all benchmarks. This is done via generate_metrics_table.
metrics_table = generate_metrics_table(benchmarks,
dataloader_test,
metrics)
We then plot it with plot_metrics.
plot_metrics(metrics_table)
Out:
array([<AxesSubplot:title={'center':'MaxDD'}, xlabel='timestamp'>,
<AxesSubplot:title={'center':'Sharpe'}, xlabel='timestamp'>,
<AxesSubplot:title={'center':'MeanReturn'}, xlabel='timestamp'>],
dtype=object)
Each plot represents a different metric. The x-axis represents the timestamps in our test set. Different colors correspond to different models. How is the value of a metric computed? We assume that the investor predicts the portfolio at time x and buys it. They then hold it for horizon timesteps. The actual metric is then computed over this time horizon.
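The buy-and-hold evaluation described above can be sketched for a single sample as follows. This is an illustrative plain-Python version under stated assumptions: asset returns are simple (not log) returns, the risk-free rate is zero, and the standard deviation is the population estimate. All function names are hypothetical, and the MeanReturns, SharpeRatio and MaximumDrawdown classes may use different conventions, for example a flipped sign so that lower is better.

```python
import math


def buy_and_hold_returns(weights, asset_returns):
    """Per-step simple returns of a portfolio bought once and then held."""
    holdings = list(weights)  # value of each position; the portfolio starts at 1
    portfolio_returns = []
    for step_returns in asset_returns:  # one row per timestep in the horizon
        previous_value = sum(holdings)
        holdings = [h * (1.0 + r) for h, r in zip(holdings, step_returns)]
        portfolio_returns.append(sum(holdings) / previous_value - 1.0)
    return portfolio_returns


def mean_return(returns):
    return sum(returns) / len(returns)


def sharpe_ratio(returns):
    mean = mean_return(returns)
    variance = sum((r - mean) ** 2 for r in returns) / len(returns)
    return mean / math.sqrt(variance)  # risk-free rate assumed to be 0


def maximum_drawdown(returns):
    """Largest relative drop from a running peak of the portfolio value."""
    value, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Toy example: 2 assets, horizon of 3 timesteps.
weights = [0.5, 0.5]
asset_returns = [[0.10, 0.00], [-0.10, 0.00], [0.05, 0.05]]

rets = buy_and_hold_returns(weights, asset_returns)
print(mean_return(rets), sharpe_ratio(rets), maximum_drawdown(rets))
```

In the plots, each point on the x-axis corresponds to one such computation, repeated for every model and every sample in the test set.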
Finally, we are also interested in what the allocation/prediction looks like at each time step. We can use the generate_weights_table function to create a pd.DataFrame.
weight_table = generate_weights_table(network, dataloader_test)
We then call plot_weight_heatmap to see a heatmap of the weights.
plot_weight_heatmap(weight_table,
add_sum_column=True,
time_format=None,
time_skips=25)
Out:
<AxesSubplot:>
The rows represent different timesteps in our test set. The columns are all the assets in our universe. The values represent the weight in the portfolio. Additionally, we add a sum column to show that we are really generating valid allocations.
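Why should the sum column equal one in every row? Assuming the network ends with a softmax layer (as with the SoftmaxAllocator imported at the top of this tutorial), the weights are positive and normalized by construction. The following plain-Python softmax is a sketch for illustration only, not the code of SoftmaxAllocator; the scores are made-up numbers.

```python
import math


def softmax(scores):
    """Map arbitrary real scores to positive weights that sum to 1."""
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, -1.0, 0.5, 0.0]  # hypothetical raw network outputs
weights = softmax(scores)

print([round(w, 4) for w in weights], round(sum(weights), 6))
```

Any real-valued scores give a valid long-only allocation, which is what the sum column in the heatmap confirms.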
Total running time of the script: ( 0 minutes 9.948 seconds)