Getting started

Welcome to deepdow! This tutorial will guide you through the basic (but essential) features. You will learn about the general pipeline through an end-to-end example, which consists of the following steps:

  1. Dataset creation and loading

  2. Network definition

  3. Training

  4. Evaluation and visualization of results

Preliminaries

Let us start by importing all the required dependencies.

from deepdow.benchmarks import Benchmark, OneOverN, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader, prepare_standard_scaler, Scale
from deepdow.data.synthetic import sin_single
from deepdow.experiments import Run
from deepdow.layers import SoftmaxAllocator
from deepdow.losses import MeanReturns, SharpeRatio, MaximumDrawdown
from deepdow.visualize import generate_metrics_table, generate_weights_table, plot_metrics, plot_weight_heatmap
import matplotlib.pyplot as plt
import numpy as np
import torch

To make all results reproducible, we set both the numpy and torch seeds.

torch.manual_seed(4)
np.random.seed(5)

Dataset creation and loading

In this example, we are going to use a synthetic dataset. The returns of each asset follow a sine function whose frequency and phase are randomly selected for that asset. First of all, let us set all the parameters relevant to data creation.

n_timesteps, n_assets = 1000, 20
lookback, gap, horizon = 40, 2, 20
n_samples = n_timesteps - lookback - horizon - gap + 1
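The sample count follows from requiring that each sample fits a full lookback window, a gap, and a horizon window inside the series. As a quick sanity check, the sketch below compares the closed-form count with a brute-force count of valid window positions. It uses only the parameter values above; the helper name valid_anchors is made up for illustration.

```python
n_timesteps = 1000
lookback, gap, horizon = 40, 2, 20

# Closed-form count used in the tutorial.
n_samples = n_timesteps - lookback - horizon - gap + 1

# Brute-force count: position i is valid when the lookback window
# [i - lookback, i) and the target window [i + gap, i + gap + horizon)
# both lie inside the series.
valid_anchors = [
    i for i in range(n_timesteps + 1)
    if i - lookback >= 0 and i + gap + horizon <= n_timesteps
]

print(n_samples, len(valid_anchors))  # 939 939
```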

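Additionally, we will use approximately the first 80% of the samples for training and the final samples for testing. We skip lookback + horizon samples between the two sets so that the test targets do not overlap with any timestep used in training. As a result, the test set is somewhat smaller than 20% of the data.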

split_ix = int(n_samples * 0.8)
indices_train = list(range(split_ix))
indices_test = list(range(split_ix + lookback + horizon, n_samples))

print('Train range: {}:{}\nTest range: {}:{}'.format(indices_train[0], indices_train[-1],
                                                     indices_test[0], indices_test[-1]))

Out:

Train range: 0:750
Test range: 811:938
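To see what the skipped samples buy us, the following sketch maps sample indices back to timesteps. It assumes, as in the rolling window construction used in this tutorial, that sample s reads features from timesteps s to s + lookback - 1 and targets starting at timestep s + lookback + gap. The helper names are hypothetical.

```python
lookback, gap, horizon = 40, 2, 20
n_samples = 939
split_ix = int(n_samples * 0.8)

indices_train = list(range(split_ix))
indices_test = list(range(split_ix + lookback + horizon, n_samples))


def last_timestep(sample_ix):
    """Last timestep (inclusive) touched by a sample, features and target combined."""
    return sample_ix + lookback + gap + horizon - 1


def first_target_timestep(sample_ix):
    """First timestep of the target window of a sample."""
    return sample_ix + lookback + gap


last_train_timestep = last_timestep(indices_train[-1])
first_test_target = first_target_timestep(indices_test[0])

print(len(indices_train), len(indices_test))   # 751 128
print(last_train_timestep, first_test_target)  # 811 853
```

The last timestep seen during training comes well before the first test target, so no test target is ever part of a training sample.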

Now we can generate the synthetic asset returns with shape (n_timesteps, n_assets).

returns = np.array([sin_single(n_timesteps,
                               freq=1 / np.random.randint(3, lookback),
                               amplitude=0.05,
                               phase=np.random.randint(0, lookback)
                               ) for _ in range(n_assets)]).T

We also add some noise.

returns += np.random.normal(scale=0.02, size=returns.shape)
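For intuition, here is a dependency-free sketch of what a noisy sine series of this kind might look like. The function sine_series is a hypothetical stand-in written for this illustration; it is an assumption that sin_single follows the same amplitude * sin(2 * pi * freq * t + phase) form.

```python
import math
import random


def sine_series(n_timesteps, freq, amplitude=1.0, phase=0.0):
    """Hypothetical sine generator: amplitude * sin(2 * pi * freq * t + phase)."""
    return [amplitude * math.sin(2 * math.pi * freq * t + phase)
            for t in range(n_timesteps)]


rng = random.Random(0)
lookback = 40

# np.random.randint(3, lookback) excludes the upper bound, so the period is at most lookback - 1.
period = rng.randint(3, lookback - 1)
phase = rng.randint(0, lookback - 1)

series = sine_series(200, freq=1 / period, amplitude=0.05, phase=phase)
noisy = [value + rng.gauss(0.0, 0.02) for value in series]
```

Because the period is always shorter than the lookback, every lookback window contains at least one full cycle of the underlying signal.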

See below the first 100 timesteps of 2 assets.

plt.plot(returns[:100, [1, 2]])

Out:

[<matplotlib.lines.Line2D object at 0x7f840ed67da0>, <matplotlib.lines.Line2D object at 0x7f840ed67eb8>]

To obtain the feature matrix X and the target y we apply the rolling window strategy.

X_list, y_list = [], []

for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(returns[i - lookback: i, :])
    y_list.append(returns[i + gap: i + gap + horizon, :])

X = np.stack(X_list, axis=0)[:, None, ...]
y = np.stack(y_list, axis=0)[:, None, ...]

print('X: {}, y: {}'.format(X.shape, y.shape))

Out:

X: (939, 1, 40, 20), y: (939, 1, 20, 20)
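The same rolling window logic is easier to follow on a toy example. The sketch below applies the loop above to a series of ten integers with made-up values for lookback, gap and horizon, so that each feature window and its target can be read off directly.

```python
series = list(range(10))  # toy series with 10 timesteps
lookback, gap, horizon = 3, 1, 2

X_list, y_list = [], []
for i in range(lookback, len(series) - horizon - gap + 1):
    X_list.append(series[i - lookback: i])          # the lookback timesteps before i
    y_list.append(series[i + gap: i + gap + horizon])  # horizon timesteps, after skipping the gap

for features, target in zip(X_list, y_list):
    print(features, '->', target)
```

Note that the timestep between the feature window and the target (for example 3 in the first sample) is skipped because of the gap.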

As is common in deep learning applications, we want to scale our input features so that they are approximately centered around 0 and have a standard deviation of 1. In deepdow we can achieve this with the prepare_standard_scaler function, which computes the mean and standard deviation of the input (for each channel). Additionally, we do not want to leak any information from our test set, and therefore we compute these statistics over the training set only.

means, stds = prepare_standard_scaler(X, indices=indices_train)
print('mean: {}, std: {}'.format(means, stds))

Out:

mean: [-9.56904164e-07], std: [0.04066513]
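The idea of fitting the scaler on the training samples only can be sketched in a few lines of plain Python. This is a one-dimensional toy illustration and not the deepdow implementation; in particular, the use of the population standard deviation is an assumption.

```python
import statistics

values = [0.01, -0.02, 0.03, 0.00, 0.50, -0.40]  # toy feature values, one per sample
indices_train = [0, 1, 2, 3]                      # the last two samples play the role of the test set

# Statistics are computed from the training samples only.
train_values = [values[i] for i in indices_train]
mean = statistics.fmean(train_values)
std = statistics.pstdev(train_values)  # population std (an assumption, see above)

# The same training statistics are then applied to every sample, including test ones.
scaled = [(v - mean) / std for v in values]
scaled_train = [scaled[i] for i in indices_train]

print(statistics.fmean(scaled_train), statistics.pstdev(scaled_train))
```

The scaled training samples have mean 0 and standard deviation 1 (up to floating point error), while the test samples are transformed with the same constants and can fall far outside that range.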

We can now construct the InRAMDataset. By providing the optional transform we make sure that when the samples are streamed they are always scaled based on our computed (training) statistics. See InRAMDataset for more details.

dataset = InRAMDataset(X, y, transform=Scale(means, stds))

Using the dataset we can now construct two dataloaders: one for training and one for testing. For more details see Dataloaders.

dataloader_train = RigidDataLoader(dataset,
                                   indices=indices_train,
                                   batch_size=32)

dataloader_test = RigidDataLoader(dataset,
                                  indices=indices_test,
                                  batch_size=32)
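With a batch size of 32, the number of batches per epoch can be worked out from the index lists. The sketch below assumes that the last, incomplete batch is kept rather than dropped, which is consistent with the 24 iterations per epoch shown in the training log later in this tutorial.

```python
import math

batch_size = 32
n_train, n_test = 751, 128  # lengths of indices_train and indices_test

batches_train = math.ceil(n_train / batch_size)
batches_test = math.ceil(n_test / batch_size)

print(batches_train, batches_test)  # 24 4
```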

Network definition

Let us now write a custom network. See Writing custom networks.

class GreatNet(torch.nn.Module, Benchmark):
    def __init__(self, n_assets, lookback, p=0.5):
        super().__init__()

        n_features = n_assets * lookback

        self.dropout_layer = torch.nn.Dropout(p=p)
        self.dense_layer = torch.nn.Linear(n_features, n_assets, bias=True)
        self.allocate_layer = SoftmaxAllocator(temperature=None)
        self.temperature = torch.nn.Parameter(torch.ones(1), requires_grad=True)

    def forward(self, x):
        """Perform forward pass.

        Parameters
        ----------
        x : torch.Tensor
            Of shape (n_samples, 1, lookback, n_assets).

        Returns
        -------
        weights : torch.Tensor
            Tensor of shape (n_samples, n_assets).

        """
        n_samples, _, _, _ = x.shape
        x = x.view(n_samples, -1)  # flatten features
        x = self.dropout_layer(x)
        x = self.dense_layer(x)

        temperatures = torch.ones(n_samples).to(device=x.device, dtype=x.dtype) * self.temperature
        weights = self.allocate_layer(x, temperatures)

        return weights

So what is this network doing? First of all, we assume that the assets and the lookback never change (the same shape and order at train and at inference time). This assumption is justified since we are using RigidDataLoader. We can therefore learn n_assets linear models, each with n_assets * lookback features. In other words, we have a dense layer that takes the flattened feature tensor x and returns a vector of length n_assets. Since elements of this vector can range from \(-\infty\) to \(\infty\), we turn it into an asset allocation via SoftmaxAllocator. Additionally, we learn the temperature from the data. This enables us to learn the optimal trade-off between an equally weighted allocation (uniform distribution) and single asset portfolios.

network = GreatNet(n_assets, lookback)
print(network)

Out:

GreatNet(
  (dropout_layer): Dropout(p=0.5, inplace=False)
  (dense_layer): Linear(in_features=800, out_features=20, bias=True)
  (allocate_layer): SoftmaxAllocator(
    (layer): Softmax(dim=1)
  )
)
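The role of the temperature is easiest to see numerically. The sketch below is a plain Python illustration and not the SoftmaxAllocator implementation; it assumes the usual convention that the scores are divided by the temperature before the softmax is applied.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Softmax of scores / temperature, stabilised by subtracting the maximum."""
    scaled = [s / temperature for s in scores]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.0, 3.0]
cold = softmax_with_temperature(scores, 0.1)    # low temperature
warm = softmax_with_temperature(scores, 1.0)
hot = softmax_with_temperature(scores, 100.0)   # high temperature

for name, weights in (('cold', cold), ('warm', warm), ('hot', hot)):
    print(name, [round(w, 3) for w in weights])
```

Under this convention, a low temperature concentrates almost all weight in the asset with the highest score, whereas a high temperature pushes the allocation towards equal weights.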

In torch, networks are either in train or eval mode. Since we are using dropout, it is essential that we set the mode correctly based on what we are trying to do.

network = network.train()  # it is the default, however, just to make the distinction clear
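Why the mode matters for dropout can be illustrated with a toy layer. ToyDropout below is a hypothetical, pure Python stand-in written for this explanation (it is not torch.nn.Dropout): in train mode it randomly zeroes inputs and rescales the survivors, in eval mode it passes the inputs through unchanged.

```python
import random


class ToyDropout:
    """Hypothetical inverted-dropout layer with a train/eval switch."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # identity at inference time
        keep = 1.0 - self.p
        # Keep each value with probability `keep` and rescale it; otherwise zero it out.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5, seed=0)
x = [1.0, 2.0, 3.0, 4.0]
train_out = layer.train()(x)
eval_out = layer.eval()(x)
print(train_out, eval_out)
```

Forgetting to switch to eval mode before making predictions would therefore produce noisy, non-deterministic allocations.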

Training

It is now time to define our loss. Let’s say we want to achieve multiple objectives at the same time. We want to minimize the drawdowns, maximize the mean returns and also maximize the Sharpe ratio. All of these losses are implemented in deepdow.losses. To avoid confusion, they are always implemented in a way that the lower the value of the loss the better. To combine multiple objectives we can simply sum all of the individual losses. Similarly, if we want to assign more importance to one of them we can achieve this by multiplying by a constant. To learn more see Losses.

loss = MaximumDrawdown() + 2 * MeanReturns() + SharpeRatio()
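One way such composition could work is through operator overloading. The ToyLoss class below is a hypothetical sketch of that idea and does not reproduce the deepdow.losses implementation; the two example losses are simplified, negated versions of the mean return and the Sharpe ratio so that lower is better.

```python
import statistics


class ToyLoss:
    """Hypothetical loss wrapper: a callable that can be summed and scaled."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, returns):
        return self.fn(returns)

    def __add__(self, other):
        return ToyLoss(lambda r: self(r) + other(r))

    def __mul__(self, constant):
        return ToyLoss(lambda r: constant * self(r))

    __rmul__ = __mul__  # allows `2 * loss` as well as `loss * 2`


# Negated so that a lower value of the loss is better.
mean_returns = ToyLoss(lambda r: -statistics.fmean(r))
sharpe_ratio = ToyLoss(lambda r: -statistics.fmean(r) / statistics.pstdev(r))

combined = 2 * mean_returns + sharpe_ratio

portfolio_returns = [0.01, 0.03, -0.01, 0.02]
print(combined(portfolio_returns))
```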

Note that by default all the losses assume that we input logarithmic returns (input_type='log') and that they are in the 0th channel (returns_channel=0).
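As a reminder of what logarithmic returns are, the sketch below converts a few made-up simple returns to log returns and checks that summing log returns gives the same cumulative growth as compounding simple returns.

```python
import math

simple_returns = [0.02, -0.01, 0.03]  # made-up values for illustration

# log return = log(1 + simple return)
log_returns = [math.log1p(r) for r in simple_returns]

growth_from_simple = math.prod(1 + r for r in simple_returns)
growth_from_log = math.exp(sum(log_returns))

print(growth_from_simple, growth_from_log)
```

This additivity over time is what makes log returns convenient to aggregate over a horizon.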

We now have all the ingredients needed to train the neural network. deepdow provides a simple wrapper, Run, that implements the training loop and a minimal callback framework. For further information see Experiments.

run = Run(network,
          loss,
          dataloader_train,
          val_dataloaders={'test': dataloader_test},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback(metric_name='loss',
                                           dataloader_name='test',
                                           patience=15)])
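The patience parameter can be understood with a small sketch of a typical early stopping rule. The function below is hypothetical and only assumes the common convention that training stops once the monitored loss has not improved on its best value for patience consecutive epochs; the exact rule used by EarlyStoppingCallback may differ. The loss values are the test losses of the first ten epochs of the run in this tutorial, rounded to two decimals.

```python
def early_stopping_epoch(losses, patience):
    """Return the epoch at which training would stop, or None if it never triggers."""
    best = float('inf')
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


test_losses = [-0.32, -0.48, -0.59, -0.63, -0.64, -0.69, -0.67, -0.68, -0.67, -0.73]

print(early_stopping_epoch(test_losses, patience=3))   # 8
print(early_stopping_epoch(test_losses, patience=15))  # None
```

With a patience of 3 this rule would stop at epoch 8, just before the improvement at epoch 9, which shows why a more generous patience such as 15 can be preferable when the loss is noisy.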

To run the training loop, we call the launch method and specify the number of epochs.

history = run.launch(30)

Out:

Epoch 0: 100%|##########| 24/24 [00:00<00:00, 60.15it/s, loss=-0.19150, test_loss=-0.32334]

Epoch 1: 100%|##########| 24/24 [00:00<00:00, 60.58it/s, loss=-0.53542, test_loss=-0.48129]

Epoch 2: 100%|##########| 24/24 [00:00<00:00, 57.39it/s, loss=-0.68696, test_loss=-0.58786]

Epoch 3: 100%|##########| 24/24 [00:00<00:00, 59.57it/s, loss=-0.74774, test_loss=-0.63107]

Epoch 4: 100%|##########| 24/24 [00:00<00:00, 61.37it/s, loss=-0.78377, test_loss=-0.64322]

Epoch 5: 100%|##########| 24/24 [00:00<00:00, 62.28it/s, loss=-0.80707, test_loss=-0.69094]

Epoch 6: 100%|##########| 24/24 [00:00<00:00, 61.07it/s, loss=-0.81209, test_loss=-0.66786]

Epoch 7: 100%|##########| 24/24 [00:00<00:00, 61.82it/s, loss=-0.81451, test_loss=-0.68186]

Epoch 8: 100%|##########| 24/24 [00:00<00:00, 62.30it/s, loss=-0.83431, test_loss=-0.67144]

Epoch 9: 100%|##########| 24/24 [00:00<00:00, 62.11it/s, loss=-0.83320, test_loss=-0.72668]

Epoch 10: 100%|##########| 24/24 [00:00<00:00, 59.32it/s, loss=-0.83957, test_loss=-0.72519]

Epoch 11: 100%|##########| 24/24 [00:00<00:00, 62.24it/s, loss=-0.85705, test_loss=-0.74037]

Epoch 12: 100%|##########| 24/24 [00:00<00:00, 60.63it/s, loss=-0.83475, test_loss=-0.70057]

Epoch 13: 100%|##########| 24/24 [00:00<00:00, 61.21it/s, loss=-0.86633, test_loss=-0.73864]

Epoch 14: 100%|##########| 24/24 [00:00<00:00, 61.76it/s, loss=-0.86312, test_loss=-0.72876]

Epoch 15: 100%|##########| 24/24 [00:00<00:00, 60.27it/s, loss=-0.86605, test_loss=-0.72969]

Epoch 16: 100%|##########| 24/24 [00:00<00:00, 63.79it/s, loss=-0.86489, test_loss=-0.72990]

Epoch 17: 100%|##########| 24/24 [00:00<00:00, 62.02it/s, loss=-0.87660, test_loss=-0.73495]

Epoch 18: 100%|##########| 24/24 [00:00<00:00, 62.77it/s, loss=-0.88589, test_loss=-0.71993]

Epoch 19: 100%|##########| 24/24 [00:00<00:00, 63.77it/s, loss=-0.88890, test_loss=-0.72349]

Epoch 20: 100%|##########| 24/24 [00:00<00:00, 63.24it/s, loss=-0.88990, test_loss=-0.72196]

Epoch 21: 100%|##########| 24/24 [00:00<00:00, 62.23it/s, loss=-0.88760, test_loss=-0.71699]

Epoch 22: 100%|##########| 24/24 [00:00<00:00, 58.71it/s, loss=-0.87943, test_loss=-0.70203]

Epoch 23: 100%|##########| 24/24 [00:00<00:00, 61.07it/s, loss=-0.89666, test_loss=-0.70125]

Epoch 24: 100%|##########| 24/24 [00:00<00:00, 62.40it/s, loss=-0.89684, test_loss=-0.72683]

Epoch 25: 100%|##########| 24/24 [00:00<00:00, 60.32it/s, loss=-0.90186, test_loss=-0.71204]

Epoch 26:   0%|          | 0/24 [00:00<?, ?it/s]
Epoch 26:   4%|4         | 1/24 [00:00<00:00, 66.96it/s, loss=-0.91968]
Epoch 26:   8%|8         | 2/24 [00:00<00:00, 68.37it/s, loss=-0.97279]
Epoch 26:  12%|#2        | 3/24 [00:00<00:00, 68.13it/s, loss=-0.92169]
Epoch 26:  17%|#6        | 4/24 [00:00<00:00, 68.91it/s, loss=-0.89866]
Epoch 26:  21%|##        | 5/24 [00:00<00:00, 69.33it/s, loss=-0.90680]
Epoch 26:  25%|##5       | 6/24 [00:00<00:00, 69.72it/s, loss=-0.91954]
Epoch 26:  29%|##9       | 7/24 [00:00<00:00, 68.70it/s, loss=-0.91954]
Epoch 26:  29%|##9       | 7/24 [00:00<00:00, 68.70it/s, loss=-0.92781]
Epoch 26:  33%|###3      | 8/24 [00:00<00:00, 68.70it/s, loss=-0.93902]
Epoch 26:  38%|###7      | 9/24 [00:00<00:00, 68.70it/s, loss=-0.90916]
Epoch 26:  42%|####1     | 10/24 [00:00<00:00, 68.70it/s, loss=-0.91263]
Epoch 26:  46%|####5     | 11/24 [00:00<00:00, 68.70it/s, loss=-0.90548]
Epoch 26:  50%|#####     | 12/24 [00:00<00:00, 68.70it/s, loss=-0.90226]
Epoch 26:  54%|#####4    | 13/24 [00:00<00:00, 68.70it/s, loss=-0.90128]
Epoch 26:  58%|#####8    | 14/24 [00:00<00:00, 68.70it/s, loss=-0.89972]
Epoch 26:  62%|######2   | 15/24 [00:00<00:00, 69.45it/s, loss=-0.89972]
Epoch 26:  62%|######2   | 15/24 [00:00<00:00, 69.45it/s, loss=-0.89367]
Epoch 26:  67%|######6   | 16/24 [00:00<00:00, 69.45it/s, loss=-0.88745]
Epoch 26:  71%|#######   | 17/24 [00:00<00:00, 69.45it/s, loss=-0.89042]
Epoch 26:  75%|#######5  | 18/24 [00:00<00:00, 69.45it/s, loss=-0.89174]
Epoch 26:  79%|#######9  | 19/24 [00:00<00:00, 69.45it/s, loss=-0.89651]
Epoch 26:  83%|########3 | 20/24 [00:00<00:00, 69.45it/s, loss=-0.89764]
Epoch 26:  88%|########7 | 21/24 [00:00<00:00, 69.45it/s, loss=-0.90672]
Epoch 26:  92%|#########1| 22/24 [00:00<00:00, 69.50it/s, loss=-0.90672]
Epoch 26:  92%|#########1| 22/24 [00:00<00:00, 69.50it/s, loss=-0.89739]
Epoch 26:  96%|#########5| 23/24 [00:00<00:00, 69.50it/s, loss=-0.89650]
Epoch 26: 100%|##########| 24/24 [00:00<00:00, 69.50it/s, loss=-0.89888]
Epoch 26: 100%|##########| 24/24 [00:00<00:00, 69.50it/s, loss=-0.89888, test_loss=-0.70681]
Epoch 26: 100%|##########| 24/24 [00:00<00:00, 62.91it/s, loss=-0.89888, test_loss=-0.70681]
Training interrupted
Training stopped early because there was no improvement in test_loss for 15 epochs
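
The stopping message can be reproduced with a generic patience rule. The following is a minimal sketch, not deepdow's EarlyStoppingCallback: the function name early_stopping_epoch is hypothetical, the patience of 15 is taken from the log message, and the losses are the per-epoch test means printed in the evaluation section.

```python
def early_stopping_epoch(losses, patience):
    """Return the epoch at which a patience-based rule would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            # New best value: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Per-epoch mean test losses copied from the tutorial output.
test_losses = [
    -0.323336, -0.481293, -0.587856, -0.631074, -0.643218, -0.690939,
    -0.667855, -0.681859, -0.671437, -0.726678, -0.725190, -0.740369,
    -0.700569, -0.738637, -0.728764, -0.729694, -0.729903, -0.734953,
    -0.719929, -0.723492, -0.721961, -0.716985, -0.702031, -0.701247,
    -0.726833, -0.712036, -0.706810,
]

best_epoch = min(range(len(test_losses)), key=test_losses.__getitem__)
stop_epoch = early_stopping_epoch(test_losses, patience=15)
print(best_epoch, stop_epoch)
```

Under this rule the best value is reached at epoch 11, and the 15 epochs that follow (12 to 26) bring no improvement, which is consistent with training ending after epoch 26.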

Evaluation and visualization

The history object returned by launch contains a lot of useful information related to training. Specifically, the metrics property returns a comprehensive pd.DataFrame. To display the average test loss for each epoch, we can run the following.

per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value']

print(per_epoch_results.count())  # double check number of samples each epoch
print(per_epoch_results.mean())  # mean loss per epoch

Out:

dataloader  metric  model    epoch
test        loss    network  0        128
                             1        128
                             2        128
                             3        128
                             4        128
                             5        128
                             6        128
                             7        128
                             8        128
                             9        128
                             10       128
                             11       128
                             12       128
                             13       128
                             14       128
                             15       128
                             16       128
                             17       128
                             18       128
                             19       128
                             20       128
                             21       128
                             22       128
                             23       128
                             24       128
                             25       128
                             26       128
Name: value, dtype: int64
dataloader  metric  model    epoch
test        loss    network  0       -0.323336
                             1       -0.481293
                             2       -0.587856
                             3       -0.631074
                             4       -0.643218
                             5       -0.690939
                             6       -0.667855
                             7       -0.681859
                             8       -0.671437
                             9       -0.726678
                             10      -0.725190
                             11      -0.740369
                             12      -0.700569
                             13      -0.738637
                             14      -0.728764
                             15      -0.729694
                             16      -0.729903
                             17      -0.734953
                             18      -0.719929
                             19      -0.723492
                             20      -0.721961
                             21      -0.716985
                             22      -0.702031
                             23      -0.701247
                             24      -0.726833
                             25      -0.712036
                             26      -0.706810
Name: value, dtype: float64
per_epoch_results.mean()['test']['loss']['network'].plot()

Out:

<matplotlib.axes._subplots.AxesSubplot object at 0x7f840d5dee10>

To get more insight into what our network predicts, we can use the deepdow.visualize module. Before we start any further evaluation, let us make sure the network is in eval mode.

network = network.eval()

To put the performance of our network in context, we also use several benchmarks. deepdow already offers multiple benchmarks. Additionally, one can provide custom simple benchmarks or pre-trained networks.

benchmarks = {
    '1overN': OneOverN(),  # each asset has weight 1 / n_assets
    'random': Random(),  # random allocation that is, however, close to 1OverN
    'network': network
}
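
To make the two simple benchmarks concrete, here is a minimal plain-Python sketch of the kind of allocation each one stands for. The helper names one_over_n and random_allocation are hypothetical and this is not deepdow's implementation; in particular, normalizing uniform draws is only an assumed way of producing a random allocation.

```python
import random


def one_over_n(n_assets):
    """Equal weight for every asset."""
    return [1.0 / n_assets] * n_assets


def random_allocation(n_assets, rng):
    """Uniform random draws rescaled so that the weights sum to one."""
    raw = [rng.random() for _ in range(n_assets)]
    total = sum(raw)
    return [value / total for value in raw]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
equal = one_over_n(20)
noisy = random_allocation(20, rng)
print(sum(equal), sum(noisy))
```

Both lists are non-negative and sum to one, so either can serve as a reference portfolio when judging the network.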

During training, the only mandatory metric/loss was the loss criterion that we tried to minimize. Naturally, one might be interested in many other metrics to evaluate the performance. See an example below.

metrics = {
    'MaxDD': MaximumDrawdown(),
    'Sharpe': SharpeRatio(),
    'MeanReturn': MeanReturns()
}
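
For readers unfamiliar with these quantities, the sketch below computes textbook versions of them on a single series of simple portfolio returns. It is an illustration under stated assumptions: the function names are hypothetical, and deepdow's own classes may differ in sign convention, scaling, and input format.

```python
import math


def mean_return(returns):
    """Arithmetic mean of simple returns."""
    return sum(returns) / len(returns)


def sharpe_ratio(returns, eps=1e-12):
    """Mean divided by population standard deviation (no risk-free rate)."""
    mu = mean_return(returns)
    variance = sum((r - mu) ** 2 for r in returns) / len(returns)
    return mu / (math.sqrt(variance) + eps)


def maximum_drawdown(returns):
    """Largest relative drop from a running peak of cumulative wealth."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, (peak - wealth) / peak)
    return worst


returns = [0.10, -0.50, 0.20]
print(mean_return(returns), sharpe_ratio(returns), maximum_drawdown(returns))
```

In this example wealth goes from 1.1 to 0.55 before partially recovering, so the maximum drawdown is one half.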

Let us now use the objects created above. We first generate a table with all metrics over all samples and for all benchmarks. This is done via generate_metrics_table.

metrics_table = generate_metrics_table(benchmarks,
                                       dataloader_test,
                                       metrics)

And then we plot it with plot_metrics.

plot_metrics(metrics_table)
MaxDD, Sharpe, MeanReturn

Out:

array([<matplotlib.axes._subplots.AxesSubplot object at 0x7f840d551e80>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f840d507630>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f840d4c9860>],
      dtype=object)

Each plot represents a different metric. The x-axis represents the timestamps in our test set, and the different colors represent different models. How is the value of a metric computed? We assume that the investor predicts the portfolio at time x and buys it. They then hold it for horizon timesteps. The actual metric is then computed over this time horizon.
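
The holding-period idea can be sketched in a few lines of plain Python. The function name portfolio_returns and the toy numbers are assumptions made for illustration, and the sketch assumes simple returns with the portfolio rebalanced back to the predicted weights at every step; deepdow's internal computation may differ.

```python
def portfolio_returns(weights, asset_returns):
    """Per-step portfolio returns for fixed weights.

    asset_returns has one row per timestep in the horizon and one
    simple return per asset in each row.
    """
    return [sum(w * r for w, r in zip(weights, row)) for row in asset_returns]


# Weights predicted at time x for two assets (toy values).
weights = [0.5, 0.5]

# Returns of the two assets over a horizon of three timesteps (toy values).
future_returns = [
    [0.02, 0.00],
    [-0.01, 0.03],
    [0.00, -0.02],
]

held = portfolio_returns(weights, future_returns)
mean_over_horizon = sum(held) / len(held)
print(held, mean_over_horizon)
```

Any metric, such as the mean return computed here, is then evaluated on the per-step returns of that single holding period, which gives one value per timestamp on the x-axis.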

Finally, we are also interested in what the allocation/prediction looks like at each time step. We can use the generate_weights_table function to create a pd.DataFrame.

weight_table = generate_weights_table(network, dataloader_test)

We then call plot_weight_heatmap to see a heatmap of weights.

plot_weight_heatmap(weight_table,
                    add_sum_column=True,
                    time_format=None,
                    time_skips=25)

Out:

<matplotlib.axes._subplots.AxesSubplot object at 0x7f840d4262b0>

The rows represent different timesteps in our test set. The columns are all the assets in our universe. The values represent the weight in the portfolio. Additionally, we add a sum column to show that we are really generating valid allocations.
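
What "valid allocation" means here can be checked directly: every weight is non-negative and each row sums to one. The sketch below is a plain-Python illustration with hypothetical helper names. The tutorial imports SoftmaxAllocator, and a softmax is one common way to obtain such weights, but whether it matches the network's final layer exactly is an assumption.

```python
import math


def softmax(scores):
    """Map arbitrary real scores to positive weights that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def is_valid_allocation(weights, tol=1e-9):
    """Long-only, fully invested: no negative weights and a unit sum."""
    return all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) <= tol


# Two toy rows of a weight table, one row per timestep.
weight_rows = [softmax([0.2, -1.0, 3.0]), softmax([0.0, 0.0, 0.0])]
row_sums = [sum(row) for row in weight_rows]
print(row_sums, [is_valid_allocation(row) for row in weight_rows])
```

The row sums play the same role as the sum column in the heatmap.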

Total running time of the script: ( 0 minutes 15.829 seconds)
