Vector autoregression

This example demonstrates how to validate deepdow on synthetic data. We model returns with a vector autoregression (VAR) model, which expresses future returns as a linear function of lagged returns. See [Lütkepohl2005] for more details. We use a stable VAR process with 12 lags and 8 assets, that is

\[r_t = A_1 r_{t-1} + \dots + A_{12} r_{t-12}\]
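The recursion above can be sketched in a few lines of NumPy. This is only an illustration, not the code used in this example (which relies on statsmodels): the function names, the random coefficient matrices, the rescaling to 0.9, and the Gaussian noise term added at each step are all assumptions made for the sketch. Stability is checked through the companion matrix, whose eigenvalues must all lie strictly inside the unit circle.

```python
import numpy as np


def make_stable_var(n_assets=8, n_lags=12, seed=0):
    """Draw hypothetical coefficient matrices A_1..A_p for a stable VAR(p)."""
    rng = np.random.default_rng(seed)
    coefs = rng.normal(size=(n_lags, n_assets, n_assets))
    # Rescale so the spectral norms of A_1..A_p sum to 0.9 (< 1),
    # which is a sufficient (not necessary) condition for stability.
    total_norm = sum(np.linalg.norm(a, ord=2) for a in coefs)
    return coefs * (0.9 / total_norm)


def spectral_radius(coefs):
    """Largest eigenvalue modulus of the VAR companion matrix."""
    n_lags, n_assets, _ = coefs.shape
    dim = n_lags * n_assets
    companion = np.zeros((dim, dim))
    companion[:n_assets, :] = np.hstack(list(coefs))        # [A_1 ... A_p]
    companion[n_assets:, :-n_assets] = np.eye(dim - n_assets)
    return float(np.max(np.abs(np.linalg.eigvals(companion))))


def simulate_var(coefs, n_steps, noise_std=0.01, seed=0):
    """Simulate r_t = A_1 r_{t-1} + ... + A_p r_{t-p} + noise (noise is assumed)."""
    rng = np.random.default_rng(seed)
    n_lags, n_assets, _ = coefs.shape
    returns = np.zeros((n_lags + n_steps, n_assets))
    for t in range(n_lags, n_lags + n_steps):
        lagged = returns[t - n_lags:t][::-1]                 # lagged[k] = r_{t-1-k}
        mean = np.einsum("kij,kj->i", coefs, lagged)         # sum_k A_{k+1} r_{t-1-k}
        returns[t] = mean + rng.normal(scale=noise_std, size=n_assets)
    return returns[n_lags:]                                  # drop the zero warm-up rows


coefs = make_stable_var()
radius = spectral_radius(coefs)
returns = simulate_var(coefs, n_steps=500)
```

Rescaling by the sum of spectral norms is a conservative way to guarantee stability; a real experiment might prefer coefficients estimated from data.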

For this task, we use the LinearNet network. It closely resembles VAR because it also fits a linear model on all lagged variables. However, it adds deep learning components such as dropout, batch normalization and a softmax allocator.
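To make the idea of a linear model followed by a softmax allocator concrete, here is a minimal NumPy sketch. It is not deepdow's LinearNet: dropout and batch normalization are deliberately left out, the parameters are random rather than trained, and the names `linear_allocate`, `weight` and `bias` are hypothetical. The lookback of 12 is an assumption chosen to match the number of lags.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def linear_allocate(lagged_returns, weight, bias):
    """Map lagged returns of shape (n_samples, lookback, n_assets) to weights."""
    features = lagged_returns.reshape(len(lagged_returns), -1)  # flatten all lags
    scores = features @ weight + bias                           # one score per asset
    return softmax(scores)                                      # non-negative, sums to 1


rng = np.random.default_rng(0)
n_samples, lookback, n_assets = 4, 12, 8
x = rng.normal(scale=0.01, size=(n_samples, lookback, n_assets))
weight = rng.normal(size=(lookback * n_assets, n_assets))       # untrained, illustrative
bias = np.zeros(n_assets)
weights = linear_allocate(x, weight, bias)
```

Because of the softmax, every row of `weights` is a long-only, fully invested portfolio, whatever the linear scores are.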

To put the performance of our network into context, we create a benchmark VARTrue that has access to the true parameters of the VAR process. It follows a simple investment rule: invest all resources into the asset with the highest forecast future return. We also consider the following benchmarks:

  • equally weighted portfolio

  • inverse volatility

  • random allocation
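The allocation rules listed above can be sketched as plain NumPy functions. These are illustrative stand-ins rather than deepdow's benchmark classes: all function names are hypothetical, the inverse volatility rule assumes the sample standard deviation of past returns, and the random rule assumes normalized uniform draws. The helper `squared_weights` computes the sum of squared weights, which is one plausible reading of the `sqweights` metric reported in the output.

```python
import numpy as np


def one_hot_best(predicted_returns):
    """Invest everything in the asset with the highest predicted return."""
    weights = np.zeros_like(predicted_returns, dtype=float)
    weights[np.argmax(predicted_returns)] = 1.0
    return weights


def equally_weighted(n_assets):
    """Same weight for every asset."""
    return np.full(n_assets, 1.0 / n_assets)


def inverse_volatility(past_returns):
    """Weights proportional to 1 / standard deviation of each asset."""
    inv_vol = 1.0 / past_returns.std(axis=0)
    return inv_vol / inv_vol.sum()


def random_allocation(n_assets, rng):
    """Normalized uniform draws (one of many possible random schemes)."""
    raw = rng.random(n_assets)
    return raw / raw.sum()


def squared_weights(weights):
    """Sum of squared weights: 1/n for equal weights, 1 for a single asset."""
    return float(np.sum(weights ** 2))


rng = np.random.default_rng(0)
past_returns = rng.normal(scale=0.01, size=(50, 8))   # synthetic history, 8 assets
predicted = rng.normal(scale=0.01, size=8)            # stand-in for a VAR forecast

w_var = one_hot_best(predicted)
w_eq = equally_weighted(8)
w_iv = inverse_volatility(past_returns)
w_rand = random_allocation(8, rng)
```

With 8 assets, the equally weighted portfolio has squared weights of 1/8 = 0.125 and the all-in rule has 1.0, so the metric acts as a simple concentration measure.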

References

Lütkepohl2005

Lütkepohl, Helmut. New introduction to multiple time series analysis. Springer Science & Business Media, 2005.

Warning

Note that we are using the statsmodels package to simulate the VAR process.

Validation loss

Out:

model       metric     epoch  dataloader
1overN      loss       -1     train         0.002
                              val          -0.002
            sqweights  -1     train         0.125
                              val           0.125
InverseVol  loss       -1     train         0.003
                              val          -0.002
            sqweights  -1     train         0.144
                              val           0.144
Random      loss       -1     train         0.003
                              val          -0.001
            sqweights  -1     train         0.166
                              val           0.166
VAR         loss       -1     train        -0.163
                              val          -0.168
            sqweights  -1     train         1.000
                              val           1.000

Epoch 0: 100%|##########| 20/20 [00:01<00:00, 10.33it/s, loss=0.00131, sqweights=0.16759, train_loss=0.00220, train_sqweights=0.12549, val_loss=-0.00163, val_sqweights=0.12549]

Epoch 1: 100%|##########| 20/20 [00:01<00:00, 10.92it/s, loss=-0.00494, sqweights=0.16868, train_loss=0.00170, train_sqweights=0.12583, val_loss=-0.00187, val_sqweights=0.12583]

Epoch 2: 100%|##########| 20/20 [00:01<00:00, 10.93it/s, loss=-0.01130, sqweights=0.17411, train_loss=0.00004, train_sqweights=0.12654, val_loss=-0.00323, val_sqweights=0.12652]

Epoch 3: 100%|##########| 20/20 [00:01<00:00, 10.31it/s, loss=-0.01851, sqweights=0.18417, train_loss=-0.00582, train_sqweights=0.12978, val_loss=-0.00840, val_sqweights=0.12974]

Epoch 4: 100%|##########| 20/20 [00:01<00:00, 10.88it/s, loss=-0.02662, sqweights=0.19658, train_loss=-0.02070, train_sqweights=0.14585, val_loss=-0.02163, val_sqweights=0.14583]

Epoch 5: 100%|##########| 20/20 [00:01<00:00, 10.89it/s, loss=-0.03343, sqweights=0.21067, train_loss=-0.03681, train_sqweights=0.17280, val_loss=-0.03550, val_sqweights=0.17265]

Epoch 6: 100%|##########| 20/20 [00:01<00:00, 10.33it/s, loss=-0.04077, sqweights=0.23212, train_loss=-0.04736, train_sqweights=0.19281, val_loss=-0.04405, val_sqweights=0.19233]

Epoch 7: 100%|##########| 20/20 [00:01<00:00, 10.91it/s, loss=-0.04826, sqweights=0.25475, train_loss=-0.05600, train_sqweights=0.20885, val_loss=-0.05106, val_sqweights=0.20784]

Epoch 8: 100%|##########| 20/20 [00:01<00:00, 10.87it/s, loss=-0.05395, sqweights=0.27728, train_loss=-0.06454, train_sqweights=0.22667, val_loss=-0.05799, val_sqweights=0.22481]

Epoch 9: 100%|##########| 20/20 [00:01<00:00, 10.31it/s, loss=-0.06052, sqweights=0.30306, train_loss=-0.07262, train_sqweights=0.24608, val_loss=-0.06450, val_sqweights=0.24367]

Epoch 10: 100%|##########| 20/20 [00:01<00:00, 10.94it/s, loss=-0.06515, sqweights=0.33266, train_loss=-0.08053, train_sqweights=0.26317, val_loss=-0.07084, val_sqweights=0.26089]

Epoch 11: 100%|##########| 20/20 [00:01<00:00, 10.94it/s, loss=-0.07348, sqweights=0.35566, train_loss=-0.08826, train_sqweights=0.28382, val_loss=-0.07691, val_sqweights=0.28064]

Epoch 12: 100%|##########| 20/20 [00:01<00:00, 10.26it/s, loss=-0.07984, sqweights=0.38211, train_loss=-0.09563, train_sqweights=0.30569, val_loss=-0.08251, val_sqweights=0.30096]

Epoch 13: 100%|##########| 20/20 [00:01<00:00, 10.64it/s, loss=-0.08154, sqweights=0.40615, train_loss=-0.10255, train_sqweights=0.32512, val_loss=-0.08769, val_sqweights=0.31982]

Epoch 14: 100%|##########| 20/20 [00:01<00:00, 10.88it/s, loss=-0.08499, sqweights=0.43039, train_loss=-0.10920, train_sqweights=0.34601, val_loss=-0.09255, val_sqweights=0.33983]

Epoch 15: 100%|##########| 20/20 [00:01<00:00, 10.19it/s, loss=-0.09205, sqweights=0.45096, train_loss=-0.11582, train_sqweights=0.36575, val_loss=-0.09757, val_sqweights=0.35991]

Epoch 16: 100%|##########| 20/20 [00:01<00:00, 10.74it/s, loss=-0.09380, sqweights=0.47742, train_loss=-0.12212, train_sqweights=0.38868, val_loss=-0.10184, val_sqweights=0.38294]

Epoch 17: 100%|##########| 20/20 [00:01<00:00, 10.85it/s, loss=-0.09585, sqweights=0.49735, train_loss=-0.12752, train_sqweights=0.41000, val_loss=-0.10547, val_sqweights=0.40299]

Epoch 18: 100%|##########| 20/20 [00:01<00:00, 10.25it/s, loss=-0.09880, sqweights=0.52258, train_loss=-0.13255, train_sqweights=0.43187, val_loss=-0.10932, val_sqweights=0.42405]

Epoch 19: 100%|##########| 20/20 [00:01<00:00, 10.89it/s, loss=-0.10804, sqweights=0.54433, train_loss=-0.13719, train_sqweights=0.45357, val_loss=-0.11208, val_sqweights=0.44489]

Epoch 20: 100%|##########| 20/20 [00:01<00:00, 10.84it/s, loss=-0.10829, sqweights=0.56287, train_loss=-0.14152, train_sqweights=0.47556, val_loss=-0.11502, val_sqweights=0.46690]

Epoch 21:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 21:   5%|5         | 1/20 [00:00<00:00, 23.32it/s, loss=-0.10201, sqweights=0.58913]
Epoch 21:  10%|#         | 2/20 [00:00<00:00, 26.89it/s, loss=-0.10228, sqweights=0.57973]
Epoch 21:  15%|#5        | 3/20 [00:00<00:00, 28.35it/s, loss=-0.10228, sqweights=0.57973]
Epoch 21:  15%|#5        | 3/20 [00:00<00:00, 28.35it/s, loss=-0.10636, sqweights=0.58451]
Epoch 21:  20%|##        | 4/20 [00:00<00:00, 28.35it/s, loss=-0.10978, sqweights=0.58866]
Epoch 21:  25%|##5       | 5/20 [00:00<00:00, 28.35it/s, loss=-0.11233, sqweights=0.58472]
Epoch 21:  30%|###       | 6/20 [00:00<00:00, 28.35it/s, loss=-0.11267, sqweights=0.58265]
Epoch 21:  35%|###5      | 7/20 [00:00<00:00, 29.13it/s, loss=-0.11267, sqweights=0.58265]
Epoch 21:  35%|###5      | 7/20 [00:00<00:00, 29.13it/s, loss=-0.11201, sqweights=0.58174]
Epoch 21:  40%|####      | 8/20 [00:00<00:00, 29.13it/s, loss=-0.11106, sqweights=0.57996]
Epoch 21:  45%|####5     | 9/20 [00:00<00:00, 29.13it/s, loss=-0.11328, sqweights=0.58007]
Epoch 21:  50%|#####     | 10/20 [00:00<00:00, 29.13it/s, loss=-0.11347, sqweights=0.58118]
Epoch 21:  55%|#####5    | 11/20 [00:00<00:00, 29.89it/s, loss=-0.11347, sqweights=0.58118]
Epoch 21:  55%|#####5    | 11/20 [00:00<00:00, 29.89it/s, loss=-0.11100, sqweights=0.58267]
Epoch 21:  60%|######    | 12/20 [00:00<00:00, 29.89it/s, loss=-0.11028, sqweights=0.58493]
Epoch 21:  65%|######5   | 13/20 [00:00<00:00, 29.89it/s, loss=-0.10930, sqweights=0.58536]
Epoch 21:  70%|#######   | 14/20 [00:00<00:00, 29.89it/s, loss=-0.10989, sqweights=0.58395]
Epoch 21:  75%|#######5  | 15/20 [00:00<00:00, 30.44it/s, loss=-0.10989, sqweights=0.58395]
Epoch 21:  75%|#######5  | 15/20 [00:00<00:00, 30.44it/s, loss=-0.10993, sqweights=0.58523]
Epoch 21:  80%|########  | 16/20 [00:00<00:00, 30.44it/s, loss=-0.11020, sqweights=0.58712]
Epoch 21:  85%|########5 | 17/20 [00:00<00:00, 30.44it/s, loss=-0.10654, sqweights=0.58703]
Epoch 21:  90%|######### | 18/20 [00:00<00:00, 30.44it/s, loss=-0.10766, sqweights=0.58630]
Epoch 21:  95%|#########5| 19/20 [00:00<00:00, 30.87it/s, loss=-0.10766, sqweights=0.58630]
Epoch 21:  95%|#########5| 19/20 [00:00<00:00, 30.87it/s, loss=-0.10886, sqweights=0.58690]
Epoch 21: 100%|##########| 20/20 [00:00<00:00, 30.87it/s, loss=-0.10943, sqweights=0.58850]
Epoch 21: 100%|##########| 20/20 [00:01<00:00, 30.87it/s, loss=-0.10943, sqweights=0.58850, train_loss=-0.14553, train_sqweights=0.49804, val_loss=-0.11772, val_sqweights=0.48875]
Epoch 21: 100%|##########| 20/20 [00:01<00:00, 10.23it/s, loss=-0.10943, sqweights=0.58850, train_loss=-0.14553, train_sqweights=0.49804, val_loss=-0.11772, val_sqweights=0.48875]

Epoch 22:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 22:   5%|5         | 1/20 [00:00<00:00, 23.13it/s, loss=-0.11312, sqweights=0.58540]
Epoch 22:  10%|#         | 2/20 [00:00<00:00, 26.97it/s, loss=-0.12094, sqweights=0.59594]
Epoch 22:  15%|#5        | 3/20 [00:00<00:00, 28.46it/s, loss=-0.12094, sqweights=0.59594]
Epoch 22:  15%|#5        | 3/20 [00:00<00:00, 28.46it/s, loss=-0.12329, sqweights=0.60307]
Epoch 22:  20%|##        | 4/20 [00:00<00:00, 28.46it/s, loss=-0.11980, sqweights=0.59740]
Epoch 22:  25%|##5       | 5/20 [00:00<00:00, 28.46it/s, loss=-0.11886, sqweights=0.60050]
Epoch 22:  30%|###       | 6/20 [00:00<00:00, 28.46it/s, loss=-0.12084, sqweights=0.60024]
Epoch 22:  35%|###5      | 7/20 [00:00<00:00, 29.37it/s, loss=-0.12084, sqweights=0.60024]
Epoch 22:  35%|###5      | 7/20 [00:00<00:00, 29.37it/s, loss=-0.11774, sqweights=0.59982]
Epoch 22:  40%|####      | 8/20 [00:00<00:00, 29.37it/s, loss=-0.11743, sqweights=0.60006]
Epoch 22:  45%|####5     | 9/20 [00:00<00:00, 29.37it/s, loss=-0.11732, sqweights=0.60096]
Epoch 22:  50%|#####     | 10/20 [00:00<00:00, 29.37it/s, loss=-0.11549, sqweights=0.60166]
Epoch 22:  55%|#####5    | 11/20 [00:00<00:00, 29.94it/s, loss=-0.11549, sqweights=0.60166]
Epoch 22:  55%|#####5    | 11/20 [00:00<00:00, 29.94it/s, loss=-0.11519, sqweights=0.60155]
Epoch 22:  60%|######    | 12/20 [00:00<00:00, 29.94it/s, loss=-0.11549, sqweights=0.60385]
Epoch 22:  65%|######5   | 13/20 [00:00<00:00, 29.94it/s, loss=-0.11460, sqweights=0.60311]
Epoch 22:  70%|#######   | 14/20 [00:00<00:00, 29.94it/s, loss=-0.11446, sqweights=0.60250]
Epoch 22:  75%|#######5  | 15/20 [00:00<00:00, 30.49it/s, loss=-0.11446, sqweights=0.60250]
Epoch 22:  75%|#######5  | 15/20 [00:00<00:00, 30.49it/s, loss=-0.11451, sqweights=0.60283]
Epoch 22:  80%|########  | 16/20 [00:00<00:00, 30.49it/s, loss=-0.11457, sqweights=0.60272]
Epoch 22:  85%|########5 | 17/20 [00:00<00:00, 30.49it/s, loss=-0.11420, sqweights=0.60395]
Epoch 22:  90%|######### | 18/20 [00:00<00:00, 30.49it/s, loss=-0.11141, sqweights=0.60478]
Epoch 22:  95%|#########5| 19/20 [00:00<00:00, 30.88it/s, loss=-0.11141, sqweights=0.60478]
Epoch 22:  95%|#########5| 19/20 [00:00<00:00, 30.88it/s, loss=-0.11049, sqweights=0.60467]
Epoch 22: 100%|##########| 20/20 [00:00<00:00, 30.88it/s, loss=-0.11228, sqweights=0.60454]
Epoch 22: 100%|##########| 20/20 [00:01<00:00, 30.88it/s, loss=-0.11228, sqweights=0.60454, train_loss=-0.14903, train_sqweights=0.51983, val_loss=-0.12015, val_sqweights=0.50934]
Epoch 22: 100%|##########| 20/20 [00:01<00:00, 10.90it/s, loss=-0.11228, sqweights=0.60454, train_loss=-0.14903, train_sqweights=0.51983, val_loss=-0.12015, val_sqweights=0.50934]

Epoch 23:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 23:   5%|5         | 1/20 [00:00<00:00, 23.43it/s, loss=-0.11234, sqweights=0.62286]
Epoch 23:  10%|#         | 2/20 [00:00<00:00, 27.05it/s, loss=-0.11787, sqweights=0.61855]
Epoch 23:  15%|#5        | 3/20 [00:00<00:00, 28.57it/s, loss=-0.11787, sqweights=0.61855]
Epoch 23:  15%|#5        | 3/20 [00:00<00:00, 28.57it/s, loss=-0.12291, sqweights=0.61450]
Epoch 23:  20%|##        | 4/20 [00:00<00:00, 28.57it/s, loss=-0.12016, sqweights=0.61965]
Epoch 23:  25%|##5       | 5/20 [00:00<00:00, 28.57it/s, loss=-0.11734, sqweights=0.62051]
Epoch 23:  30%|###       | 6/20 [00:00<00:00, 28.57it/s, loss=-0.11850, sqweights=0.61962]
Epoch 23:  35%|###5      | 7/20 [00:00<00:00, 29.40it/s, loss=-0.11850, sqweights=0.61962]
Epoch 23:  35%|###5      | 7/20 [00:00<00:00, 29.40it/s, loss=-0.11636, sqweights=0.61643]
Epoch 23:  40%|####      | 8/20 [00:00<00:00, 29.40it/s, loss=-0.11233, sqweights=0.61675]
Epoch 23:  45%|####5     | 9/20 [00:00<00:00, 29.40it/s, loss=-0.11348, sqweights=0.61868]
Epoch 23:  50%|#####     | 10/20 [00:00<00:00, 29.40it/s, loss=-0.11529, sqweights=0.62058]
Epoch 23:  55%|#####5    | 11/20 [00:00<00:00, 30.10it/s, loss=-0.11529, sqweights=0.62058]
Epoch 23:  55%|#####5    | 11/20 [00:00<00:00, 30.10it/s, loss=-0.11433, sqweights=0.62019]
Epoch 23:  60%|######    | 12/20 [00:00<00:00, 30.10it/s, loss=-0.11464, sqweights=0.62280]
Epoch 23:  65%|######5   | 13/20 [00:00<00:00, 30.10it/s, loss=-0.11472, sqweights=0.62333]
Epoch 23:  70%|#######   | 14/20 [00:00<00:00, 30.10it/s, loss=-0.11730, sqweights=0.62485]
Epoch 23:  75%|#######5  | 15/20 [00:00<00:00, 30.32it/s, loss=-0.11730, sqweights=0.62485]
Epoch 23:  75%|#######5  | 15/20 [00:00<00:00, 30.32it/s, loss=-0.11865, sqweights=0.62340]
Epoch 23:  80%|########  | 16/20 [00:00<00:00, 30.32it/s, loss=-0.11706, sqweights=0.62370]
Epoch 23:  85%|########5 | 17/20 [00:00<00:00, 30.32it/s, loss=-0.11630, sqweights=0.62437]
Epoch 23:  90%|######### | 18/20 [00:00<00:00, 30.32it/s, loss=-0.11570, sqweights=0.62436]
Epoch 23:  95%|#########5| 19/20 [00:00<00:00, 30.74it/s, loss=-0.11570, sqweights=0.62436]
Epoch 23:  95%|#########5| 19/20 [00:00<00:00, 30.74it/s, loss=-0.11541, sqweights=0.62317]
Epoch 23: 100%|##########| 20/20 [00:00<00:00, 30.74it/s, loss=-0.11433, sqweights=0.62364]
Epoch 23: 100%|##########| 20/20 [00:01<00:00, 30.74it/s, loss=-0.11433, sqweights=0.62364, train_loss=-0.15204, train_sqweights=0.54287, val_loss=-0.12220, val_sqweights=0.53209]
Epoch 23: 100%|##########| 20/20 [00:01<00:00, 10.84it/s, loss=-0.11433, sqweights=0.62364, train_loss=-0.15204, train_sqweights=0.54287, val_loss=-0.12220, val_sqweights=0.53209]

Epoch 24:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 24:   5%|5         | 1/20 [00:00<00:00, 23.24it/s, loss=-0.14692, sqweights=0.64503]
Epoch 24:  10%|#         | 2/20 [00:00<00:00, 26.89it/s, loss=-0.13386, sqweights=0.63772]
Epoch 24:  15%|#5        | 3/20 [00:00<00:00, 28.40it/s, loss=-0.13386, sqweights=0.63772]
Epoch 24:  15%|#5        | 3/20 [00:00<00:00, 28.40it/s, loss=-0.13510, sqweights=0.63615]
Epoch 24:  20%|##        | 4/20 [00:00<00:00, 28.40it/s, loss=-0.13076, sqweights=0.63725]
Epoch 24:  25%|##5       | 5/20 [00:00<00:00, 28.40it/s, loss=-0.12318, sqweights=0.63530]
Epoch 24:  30%|###       | 6/20 [00:00<00:00, 28.40it/s, loss=-0.11743, sqweights=0.63684]
Epoch 24:  35%|###5      | 7/20 [00:00<00:00, 29.39it/s, loss=-0.11743, sqweights=0.63684]
Epoch 24:  35%|###5      | 7/20 [00:00<00:00, 29.39it/s, loss=-0.11647, sqweights=0.64049]
Epoch 24:  40%|####      | 8/20 [00:00<00:00, 29.39it/s, loss=-0.11526, sqweights=0.64247]
Epoch 24:  45%|####5     | 9/20 [00:00<00:00, 29.39it/s, loss=-0.11521, sqweights=0.64251]
Epoch 24:  50%|#####     | 10/20 [00:00<00:00, 29.39it/s, loss=-0.11345, sqweights=0.64136]
Epoch 24:  55%|#####5    | 11/20 [00:00<00:00, 30.06it/s, loss=-0.11345, sqweights=0.64136]
Epoch 24:  55%|#####5    | 11/20 [00:00<00:00, 30.06it/s, loss=-0.11183, sqweights=0.63909]
Epoch 24:  60%|######    | 12/20 [00:00<00:00, 30.06it/s, loss=-0.11107, sqweights=0.63946]
Epoch 24:  65%|######5   | 13/20 [00:00<00:00, 30.06it/s, loss=-0.11027, sqweights=0.64157]
Epoch 24:  70%|#######   | 14/20 [00:00<00:00, 30.06it/s, loss=-0.11151, sqweights=0.64332]
Epoch 24:  75%|#######5  | 15/20 [00:00<00:00, 30.54it/s, loss=-0.11151, sqweights=0.64332]
Epoch 24:  75%|#######5  | 15/20 [00:00<00:00, 30.54it/s, loss=-0.11351, sqweights=0.64473]
Epoch 24:  80%|########  | 16/20 [00:00<00:00, 30.54it/s, loss=-0.11307, sqweights=0.64426]
Epoch 24:  85%|########5 | 17/20 [00:00<00:00, 30.54it/s, loss=-0.11422, sqweights=0.64497]
Epoch 24:  90%|######### | 18/20 [00:00<00:00, 30.54it/s, loss=-0.11456, sqweights=0.64416]
Epoch 24:  95%|#########5| 19/20 [00:00<00:00, 30.93it/s, loss=-0.11456, sqweights=0.64416]
Epoch 24:  95%|#########5| 19/20 [00:00<00:00, 30.93it/s, loss=-0.11573, sqweights=0.64339]
Epoch 24: 100%|##########| 20/20 [00:00<00:00, 30.93it/s, loss=-0.11413, sqweights=0.64361]
Epoch 24: 100%|##########| 20/20 [00:01<00:00, 30.93it/s, loss=-0.11413, sqweights=0.64361, train_loss=-0.15496, train_sqweights=0.56038, val_loss=-0.12434, val_sqweights=0.54957]
Epoch 24: 100%|##########| 20/20 [00:01<00:00, 10.92it/s, loss=-0.11413, sqweights=0.64361, train_loss=-0.15496, train_sqweights=0.56038, val_loss=-0.12434, val_sqweights=0.54957]

Epoch 25:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 25:   5%|5         | 1/20 [00:00<00:03,  6.32it/s]
Epoch 25:   5%|5         | 1/20 [00:00<00:03,  6.32it/s, loss=-0.10340, sqweights=0.65229]
Epoch 25:  10%|#         | 2/20 [00:00<00:02,  6.32it/s, loss=-0.11809, sqweights=0.65320]
Epoch 25:  15%|#5        | 3/20 [00:00<00:02,  6.32it/s, loss=-0.12081, sqweights=0.65220]
Epoch 25:  20%|##        | 4/20 [00:00<00:02,  6.32it/s, loss=-0.12270, sqweights=0.65141]
Epoch 25:  25%|##5       | 5/20 [00:00<00:01,  8.32it/s, loss=-0.12270, sqweights=0.65141]
Epoch 25:  25%|##5       | 5/20 [00:00<00:01,  8.32it/s, loss=-0.12148, sqweights=0.65587]
Epoch 25:  30%|###       | 6/20 [00:00<00:01,  8.32it/s, loss=-0.12068, sqweights=0.65815]
Epoch 25:  35%|###5      | 7/20 [00:00<00:01,  8.32it/s, loss=-0.12212, sqweights=0.65870]
Epoch 25:  40%|####      | 8/20 [00:00<00:01,  8.32it/s, loss=-0.11748, sqweights=0.65774]
Epoch 25:  45%|####5     | 9/20 [00:00<00:01, 10.68it/s, loss=-0.11748, sqweights=0.65774]
Epoch 25:  45%|####5     | 9/20 [00:00<00:01, 10.68it/s, loss=-0.11866, sqweights=0.65944]
Epoch 25:  50%|#####     | 10/20 [00:00<00:00, 10.68it/s, loss=-0.11549, sqweights=0.66052]
Epoch 25:  55%|#####5    | 11/20 [00:00<00:00, 10.68it/s, loss=-0.11682, sqweights=0.65915]
Epoch 25:  60%|######    | 12/20 [00:00<00:00, 10.68it/s, loss=-0.11615, sqweights=0.66029]
Epoch 25:  65%|######5   | 13/20 [00:00<00:00, 13.34it/s, loss=-0.11615, sqweights=0.66029]
Epoch 25:  65%|######5   | 13/20 [00:00<00:00, 13.34it/s, loss=-0.11843, sqweights=0.66148]
Epoch 25:  70%|#######   | 14/20 [00:00<00:00, 13.34it/s, loss=-0.11760, sqweights=0.66050]
Epoch 25:  75%|#######5  | 15/20 [00:00<00:00, 13.34it/s, loss=-0.11928, sqweights=0.65935]
Epoch 25:  80%|########  | 16/20 [00:00<00:00, 13.34it/s, loss=-0.12033, sqweights=0.65988]
Epoch 25:  85%|########5 | 17/20 [00:00<00:00, 16.17it/s, loss=-0.12033, sqweights=0.65988]
Epoch 25:  85%|########5 | 17/20 [00:00<00:00, 16.17it/s, loss=-0.11965, sqweights=0.65943]
Epoch 25:  90%|######### | 18/20 [00:00<00:00, 16.17it/s, loss=-0.11806, sqweights=0.66030]
Epoch 25:  95%|#########5| 19/20 [00:00<00:00, 16.17it/s, loss=-0.11887, sqweights=0.66048]
Epoch 25: 100%|##########| 20/20 [00:00<00:00, 16.17it/s, loss=-0.11923, sqweights=0.66084]
Epoch 25: 100%|##########| 20/20 [00:01<00:00, 16.17it/s, loss=-0.11923, sqweights=0.66084, train_loss=-0.15783, train_sqweights=0.57820, val_loss=-0.12633, val_sqweights=0.56667]
Epoch 25: 100%|##########| 20/20 [00:01<00:00, 10.26it/s, loss=-0.11923, sqweights=0.66084, train_loss=-0.15783, train_sqweights=0.57820, val_loss=-0.12633, val_sqweights=0.56667]

Epoch 26:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 26:   5%|5         | 1/20 [00:00<00:00, 23.44it/s, loss=-0.12017, sqweights=0.67970]
Epoch 26:  10%|#         | 2/20 [00:00<00:00, 27.17it/s, loss=-0.12703, sqweights=0.68861]
Epoch 26:  15%|#5        | 3/20 [00:00<00:00, 28.29it/s, loss=-0.12703, sqweights=0.68861]
Epoch 26:  15%|#5        | 3/20 [00:00<00:00, 28.29it/s, loss=-0.12209, sqweights=0.68597]
Epoch 26:  20%|##        | 4/20 [00:00<00:00, 28.29it/s, loss=-0.12596, sqweights=0.68261]
Epoch 26:  25%|##5       | 5/20 [00:00<00:00, 28.29it/s, loss=-0.12903, sqweights=0.68395]
Epoch 26:  30%|###       | 6/20 [00:00<00:00, 28.29it/s, loss=-0.13285, sqweights=0.67845]
Epoch 26:  35%|###5      | 7/20 [00:00<00:00, 29.25it/s, loss=-0.13285, sqweights=0.67845]
Epoch 26:  35%|###5      | 7/20 [00:00<00:00, 29.25it/s, loss=-0.13771, sqweights=0.68019]
Epoch 26:  40%|####      | 8/20 [00:00<00:00, 29.25it/s, loss=-0.13786, sqweights=0.67884]
Epoch 26:  45%|####5     | 9/20 [00:00<00:00, 29.25it/s, loss=-0.13246, sqweights=0.67924]
Epoch 26:  50%|#####     | 10/20 [00:00<00:00, 29.25it/s, loss=-0.12972, sqweights=0.67745]
Epoch 26:  55%|#####5    | 11/20 [00:00<00:00, 29.98it/s, loss=-0.12972, sqweights=0.67745]
Epoch 26:  55%|#####5    | 11/20 [00:00<00:00, 29.98it/s, loss=-0.12706, sqweights=0.67914]
Epoch 26:  60%|######    | 12/20 [00:00<00:00, 29.98it/s, loss=-0.12785, sqweights=0.67952]
Epoch 26:  65%|######5   | 13/20 [00:00<00:00, 29.98it/s, loss=-0.12515, sqweights=0.67970]
Epoch 26:  70%|#######   | 14/20 [00:00<00:00, 29.98it/s, loss=-0.12458, sqweights=0.67744]
Epoch 26:  75%|#######5  | 15/20 [00:00<00:00, 30.54it/s, loss=-0.12458, sqweights=0.67744]
Epoch 26:  75%|#######5  | 15/20 [00:00<00:00, 30.54it/s, loss=-0.12374, sqweights=0.67925]
Epoch 26:  80%|########  | 16/20 [00:00<00:00, 30.54it/s, loss=-0.12341, sqweights=0.67959]
Epoch 26:  85%|########5 | 17/20 [00:00<00:00, 30.54it/s, loss=-0.12431, sqweights=0.67836]
Epoch 26:  90%|######### | 18/20 [00:00<00:00, 30.54it/s, loss=-0.12528, sqweights=0.67927]
Epoch 26:  95%|#########5| 19/20 [00:00<00:00, 30.96it/s, loss=-0.12528, sqweights=0.67927]
Epoch 26:  95%|#########5| 19/20 [00:00<00:00, 30.96it/s, loss=-0.12493, sqweights=0.67873]
Epoch 26: 100%|##########| 20/20 [00:00<00:00, 30.96it/s, loss=-0.12431, sqweights=0.67914]
Epoch 26: 100%|##########| 20/20 [00:01<00:00, 30.96it/s, loss=-0.12431, sqweights=0.67914, train_loss=-0.16091, train_sqweights=0.59838, val_loss=-0.12825, val_sqweights=0.58771]
Epoch 26: 100%|##########| 20/20 [00:01<00:00, 10.87it/s, loss=-0.12431, sqweights=0.67914, train_loss=-0.16091, train_sqweights=0.59838, val_loss=-0.12825, val_sqweights=0.58771]

Epoch 27:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 27:   5%|5         | 1/20 [00:00<00:00, 22.79it/s, loss=-0.11453, sqweights=0.68539]
Epoch 27:  10%|#         | 2/20 [00:00<00:00, 26.12it/s, loss=-0.11908, sqweights=0.68985]
Epoch 27:  15%|#5        | 3/20 [00:00<00:00, 27.80it/s, loss=-0.11908, sqweights=0.68985]
Epoch 27:  15%|#5        | 3/20 [00:00<00:00, 27.80it/s, loss=-0.12580, sqweights=0.68551]
Epoch 27:  20%|##        | 4/20 [00:00<00:00, 27.80it/s, loss=-0.12148, sqweights=0.68564]
Epoch 27:  25%|##5       | 5/20 [00:00<00:00, 27.80it/s, loss=-0.11964, sqweights=0.68128]
Epoch 27:  30%|###       | 6/20 [00:00<00:00, 27.80it/s, loss=-0.12053, sqweights=0.67748]
Epoch 27:  35%|###5      | 7/20 [00:00<00:00, 28.78it/s, loss=-0.12053, sqweights=0.67748]
Epoch 27:  35%|###5      | 7/20 [00:00<00:00, 28.78it/s, loss=-0.12381, sqweights=0.67858]
Epoch 27:  40%|####      | 8/20 [00:00<00:00, 28.78it/s, loss=-0.12084, sqweights=0.68318]
Epoch 27:  45%|####5     | 9/20 [00:00<00:00, 28.78it/s, loss=-0.11882, sqweights=0.68469]
Epoch 27:  50%|#####     | 10/20 [00:00<00:00, 28.78it/s, loss=-0.11687, sqweights=0.68620]
Epoch 27:  55%|#####5    | 11/20 [00:00<00:00, 29.65it/s, loss=-0.11687, sqweights=0.68620]
Epoch 27:  55%|#####5    | 11/20 [00:00<00:00, 29.65it/s, loss=-0.11986, sqweights=0.68571]
Epoch 27:  60%|######    | 12/20 [00:00<00:00, 29.65it/s, loss=-0.12180, sqweights=0.68580]
Epoch 27:  65%|######5   | 13/20 [00:00<00:00, 29.65it/s, loss=-0.12105, sqweights=0.68548]
Epoch 27:  70%|#######   | 14/20 [00:00<00:00, 29.65it/s, loss=-0.12024, sqweights=0.68453]
Epoch 27:  75%|#######5  | 15/20 [00:00<00:00, 30.33it/s, loss=-0.12024, sqweights=0.68453]
Epoch 27:  75%|#######5  | 15/20 [00:00<00:00, 30.33it/s, loss=-0.12202, sqweights=0.68638]
Epoch 27:  80%|########  | 16/20 [00:00<00:00, 30.33it/s, loss=-0.12180, sqweights=0.68582]
Epoch 27:  85%|########5 | 17/20 [00:00<00:00, 30.33it/s, loss=-0.12239, sqweights=0.68674]
Epoch 27:  90%|######### | 18/20 [00:00<00:00, 30.33it/s, loss=-0.12131, sqweights=0.68693]
Epoch 27:  95%|#########5| 19/20 [00:00<00:00, 30.69it/s, loss=-0.12131, sqweights=0.68693]
Epoch 27:  95%|#########5| 19/20 [00:00<00:00, 30.69it/s, loss=-0.12008, sqweights=0.68727]
Epoch 27: 100%|##########| 20/20 [00:00<00:00, 30.69it/s, loss=-0.11983, sqweights=0.68663]
Epoch 27: 100%|##########| 20/20 [00:01<00:00, 30.69it/s, loss=-0.11983, sqweights=0.68663, train_loss=-0.16341, train_sqweights=0.62025, val_loss=-0.12971, val_sqweights=0.60960]
Epoch 27: 100%|##########| 20/20 [00:01<00:00, 10.91it/s, loss=-0.11983, sqweights=0.68663, train_loss=-0.16341, train_sqweights=0.62025, val_loss=-0.12971, val_sqweights=0.60960]

Epoch 28:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 28:   5%|5         | 1/20 [00:00<00:00, 23.36it/s, loss=-0.13017, sqweights=0.71145]
Epoch 28:  10%|#         | 2/20 [00:00<00:00, 26.99it/s, loss=-0.11853, sqweights=0.70978]
Epoch 28:  15%|#5        | 3/20 [00:00<00:00, 28.41it/s, loss=-0.11853, sqweights=0.70978]
Epoch 28:  15%|#5        | 3/20 [00:00<00:00, 28.41it/s, loss=-0.10730, sqweights=0.70850]
Epoch 28:  20%|##        | 4/20 [00:00<00:00, 28.41it/s, loss=-0.10602, sqweights=0.70826]
Epoch 28:  25%|##5       | 5/20 [00:00<00:00, 28.41it/s, loss=-0.10916, sqweights=0.70667]
Epoch 28:  30%|###       | 6/20 [00:00<00:00, 21.56it/s, loss=-0.10916, sqweights=0.70667]
Epoch 28:  30%|###       | 6/20 [00:00<00:00, 21.56it/s, loss=-0.11149, sqweights=0.70854]
Epoch 28:  35%|###5      | 7/20 [00:00<00:00, 21.56it/s, loss=-0.10887, sqweights=0.70422]
Epoch 28:  40%|####      | 8/20 [00:00<00:00, 21.56it/s, loss=-0.11416, sqweights=0.70886]
Epoch 28:  45%|####5     | 9/20 [00:00<00:00, 21.56it/s, loss=-0.11319, sqweights=0.70831]
Epoch 28:  50%|#####     | 10/20 [00:00<00:00, 23.87it/s, loss=-0.11319, sqweights=0.70831]
Epoch 28:  50%|#####     | 10/20 [00:00<00:00, 23.87it/s, loss=-0.11608, sqweights=0.70741]
Epoch 28:  55%|#####5    | 11/20 [00:00<00:00, 23.87it/s, loss=-0.11293, sqweights=0.70446]
Epoch 28:  60%|######    | 12/20 [00:00<00:00, 23.87it/s, loss=-0.11446, sqweights=0.70502]
Epoch 28:  65%|######5   | 13/20 [00:00<00:00, 23.87it/s, loss=-0.11230, sqweights=0.70497]
Epoch 28:  70%|#######   | 14/20 [00:00<00:00, 25.82it/s, loss=-0.11230, sqweights=0.70497]
Epoch 28:  70%|#######   | 14/20 [00:00<00:00, 25.82it/s, loss=-0.11157, sqweights=0.70561]
Epoch 28:  75%|#######5  | 15/20 [00:00<00:00, 25.82it/s, loss=-0.11065, sqweights=0.70601]
Epoch 28:  80%|########  | 16/20 [00:00<00:00, 25.82it/s, loss=-0.11035, sqweights=0.70752]
Epoch 28:  85%|########5 | 17/20 [00:00<00:00, 25.82it/s, loss=-0.11091, sqweights=0.70811]
Epoch 28:  90%|######### | 18/20 [00:00<00:00, 27.40it/s, loss=-0.11091, sqweights=0.70811]
Epoch 28:  90%|######### | 18/20 [00:00<00:00, 27.40it/s, loss=-0.11036, sqweights=0.70985]
Epoch 28:  95%|#########5| 19/20 [00:00<00:00, 27.40it/s, loss=-0.11214, sqweights=0.70982]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 27.40it/s, loss=-0.11115, sqweights=0.71012]
Epoch 28: 100%|##########| 20/20 [00:01<00:00, 27.40it/s, loss=-0.11115, sqweights=0.71012, train_loss=-0.16508, train_sqweights=0.63764, val_loss=-0.13121, val_sqweights=0.62584]
Epoch 28: 100%|##########| 20/20 [00:01<00:00, 10.25it/s, loss=-0.11115, sqweights=0.71012, train_loss=-0.16508, train_sqweights=0.63764, val_loss=-0.13121, val_sqweights=0.62584]

Epoch 29:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 29:   5%|5         | 1/20 [00:00<00:00, 22.85it/s, loss=-0.12253, sqweights=0.71643]
Epoch 29:  10%|#         | 2/20 [00:00<00:00, 26.67it/s, loss=-0.12315, sqweights=0.71470]
Epoch 29:  15%|#5        | 3/20 [00:00<00:00, 28.17it/s, loss=-0.12315, sqweights=0.71470]
Epoch 29:  15%|#5        | 3/20 [00:00<00:00, 28.17it/s, loss=-0.12195, sqweights=0.71153]
Epoch 29:  20%|##        | 4/20 [00:00<00:00, 28.17it/s, loss=-0.11480, sqweights=0.71483]
Epoch 29:  25%|##5       | 5/20 [00:00<00:00, 28.17it/s, loss=-0.11226, sqweights=0.71321]
Epoch 29:  30%|###       | 6/20 [00:00<00:00, 28.17it/s, loss=-0.11711, sqweights=0.71601]
Epoch 29:  35%|###5      | 7/20 [00:00<00:00, 29.25it/s, loss=-0.11711, sqweights=0.71601]
Epoch 29:  35%|###5      | 7/20 [00:00<00:00, 29.25it/s, loss=-0.11768, sqweights=0.71642]
Epoch 29:  40%|####      | 8/20 [00:00<00:00, 29.25it/s, loss=-0.11826, sqweights=0.71640]
Epoch 29:  45%|####5     | 9/20 [00:00<00:00, 29.25it/s, loss=-0.11813, sqweights=0.71681]
Epoch 29:  50%|#####     | 10/20 [00:00<00:00, 29.25it/s, loss=-0.12159, sqweights=0.71673]
Epoch 29:  55%|#####5    | 11/20 [00:00<00:00, 29.98it/s, loss=-0.12159, sqweights=0.71673]
Epoch 29:  55%|#####5    | 11/20 [00:00<00:00, 29.98it/s, loss=-0.12074, sqweights=0.71548]
Epoch 29:  60%|######    | 12/20 [00:00<00:00, 29.98it/s, loss=-0.12174, sqweights=0.71778]
Epoch 29:  65%|######5   | 13/20 [00:00<00:00, 29.98it/s, loss=-0.12198, sqweights=0.71780]
Epoch 29:  70%|#######   | 14/20 [00:00<00:00, 29.98it/s, loss=-0.12324, sqweights=0.71693]
Epoch 29:  75%|#######5  | 15/20 [00:00<00:00, 30.52it/s, loss=-0.12324, sqweights=0.71693]
Epoch 29:  75%|#######5  | 15/20 [00:00<00:00, 30.52it/s, loss=-0.12292, sqweights=0.71871]
Epoch 29:  80%|########  | 16/20 [00:00<00:00, 30.52it/s, loss=-0.12234, sqweights=0.72019]
Epoch 29:  85%|########5 | 17/20 [00:00<00:00, 30.52it/s, loss=-0.12163, sqweights=0.72085]
Epoch 29:  90%|######### | 18/20 [00:00<00:00, 30.52it/s, loss=-0.11993, sqweights=0.72280]
Epoch 29:  95%|#########5| 19/20 [00:00<00:00, 30.85it/s, loss=-0.11993, sqweights=0.72280]
Epoch 29:  95%|#########5| 19/20 [00:00<00:00, 30.85it/s, loss=-0.11916, sqweights=0.72258]
Epoch 29: 100%|##########| 20/20 [00:00<00:00, 30.85it/s, loss=-0.12035, sqweights=0.72384]
Epoch 29: 100%|##########| 20/20 [00:01<00:00, 30.85it/s, loss=-0.12035, sqweights=0.72384, train_loss=-0.16692, train_sqweights=0.65199, val_loss=-0.13189, val_sqweights=0.64101]
Epoch 29: 100%|##########| 20/20 [00:01<00:00, 10.88it/s, loss=-0.12035, sqweights=0.72384, train_loss=-0.16692, train_sqweights=0.65199, val_loss=-0.13189, val_sqweights=0.64101]

Epoch 30:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 30:   5%|5         | 1/20 [00:00<00:00, 24.83it/s, loss=-0.11766, sqweights=0.72685]
Epoch 30:  10%|#         | 2/20 [00:00<00:00, 27.84it/s, loss=-0.11115, sqweights=0.72648]
Epoch 30:  15%|#5        | 3/20 [00:00<00:00, 28.96it/s, loss=-0.11115, sqweights=0.72648]
Epoch 30:  15%|#5        | 3/20 [00:00<00:00, 28.96it/s, loss=-0.11272, sqweights=0.72733]
Epoch 30:  20%|##        | 4/20 [00:00<00:00, 28.96it/s, loss=-0.11106, sqweights=0.73402]
Epoch 30:  25%|##5       | 5/20 [00:00<00:00, 28.96it/s, loss=-0.11103, sqweights=0.73526]
Epoch 30:  30%|###       | 6/20 [00:00<00:00, 27.78it/s, loss=-0.11103, sqweights=0.73526]
Epoch 30:  30%|###       | 6/20 [00:00<00:00, 27.78it/s, loss=-0.11587, sqweights=0.73429]
Epoch 30:  35%|###5      | 7/20 [00:00<00:00, 27.78it/s, loss=-0.11425, sqweights=0.73231]
Epoch 30:  40%|####      | 8/20 [00:00<00:00, 27.78it/s, loss=-0.11639, sqweights=0.73387]
Epoch 30:  45%|####5     | 9/20 [00:00<00:00, 27.78it/s, loss=-0.11932, sqweights=0.73585]
Epoch 30:  50%|#####     | 10/20 [00:00<00:00, 28.67it/s, loss=-0.11932, sqweights=0.73585]
Epoch 30:  50%|#####     | 10/20 [00:00<00:00, 28.67it/s, loss=-0.11997, sqweights=0.73814]
Epoch 30:  55%|#####5    | 11/20 [00:00<00:00, 28.67it/s, loss=-0.11816, sqweights=0.73918]
Epoch 30:  60%|######    | 12/20 [00:00<00:00, 28.67it/s, loss=-0.11774, sqweights=0.73817]
Epoch 30:  65%|######5   | 13/20 [00:00<00:00, 28.67it/s, loss=-0.11586, sqweights=0.73847]
Epoch 30:  70%|#######   | 14/20 [00:00<00:00, 28.93it/s, loss=-0.11586, sqweights=0.73847]
Epoch 30:  70%|#######   | 14/20 [00:00<00:00, 28.93it/s, loss=-0.11690, sqweights=0.73880]
Epoch 30:  75%|#######5  | 15/20 [00:00<00:00, 28.93it/s, loss=-0.11681, sqweights=0.73872]
Epoch 30:  80%|########  | 16/20 [00:00<00:00, 28.93it/s, loss=-0.11728, sqweights=0.73709]
Epoch 30:  85%|########5 | 17/20 [00:00<00:00, 28.93it/s, loss=-0.11828, sqweights=0.73675]
Epoch 30:  90%|######### | 18/20 [00:00<00:00, 29.49it/s, loss=-0.11828, sqweights=0.73675]
Epoch 30:  90%|######### | 18/20 [00:00<00:00, 29.49it/s, loss=-0.11863, sqweights=0.73729]
Epoch 30:  95%|#########5| 19/20 [00:00<00:00, 29.49it/s, loss=-0.12000, sqweights=0.73686]
Epoch 30: 100%|##########| 20/20 [00:00<00:00, 29.49it/s, loss=-0.12109, sqweights=0.73694]
Epoch 30: 100%|##########| 20/20 [00:01<00:00, 29.49it/s, loss=-0.12109, sqweights=0.73694, train_loss=-0.16830, train_sqweights=0.66780, val_loss=-0.13208, val_sqweights=0.65663]
Epoch 30: 100%|##########| 20/20 [00:01<00:00, 10.66it/s, loss=-0.12109, sqweights=0.73694, train_loss=-0.16830, train_sqweights=0.66780, val_loss=-0.13208, val_sqweights=0.65663]

Epoch 31:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 31:   5%|5         | 1/20 [00:00<00:00, 23.54it/s, loss=-0.13797, sqweights=0.73318]
Epoch 31:  10%|#         | 2/20 [00:00<00:00, 27.12it/s, loss=-0.12897, sqweights=0.73796]
Epoch 31:  15%|#5        | 3/20 [00:00<00:00, 28.27it/s, loss=-0.12897, sqweights=0.73796]
Epoch 31:  15%|#5        | 3/20 [00:00<00:00, 28.27it/s, loss=-0.09705, sqweights=0.73672]
Epoch 31:  20%|##        | 4/20 [00:00<00:00, 28.27it/s, loss=-0.10484, sqweights=0.73612]
Epoch 31:  25%|##5       | 5/20 [00:00<00:00, 28.27it/s, loss=-0.11276, sqweights=0.73706]
Epoch 31:  30%|###       | 6/20 [00:00<00:00, 28.27it/s, loss=-0.11581, sqweights=0.73571]
Epoch 31:  35%|###5      | 7/20 [00:00<00:00, 29.21it/s, loss=-0.11581, sqweights=0.73571]
Epoch 31:  35%|###5      | 7/20 [00:00<00:00, 29.21it/s, loss=-0.11907, sqweights=0.73683]
Epoch 31:  40%|####      | 8/20 [00:00<00:00, 29.21it/s, loss=-0.11595, sqweights=0.73584]
Epoch 31:  45%|####5     | 9/20 [00:00<00:00, 29.21it/s, loss=-0.11755, sqweights=0.73437]
Epoch 31:  50%|#####     | 10/20 [00:00<00:00, 29.21it/s, loss=-0.11902, sqweights=0.73455]
Epoch 31:  55%|#####5    | 11/20 [00:00<00:00, 29.92it/s, loss=-0.11902, sqweights=0.73455]
Epoch 31:  55%|#####5    | 11/20 [00:00<00:00, 29.92it/s, loss=-0.11902, sqweights=0.73394]
Epoch 31:  60%|######    | 12/20 [00:00<00:00, 29.92it/s, loss=-0.12039, sqweights=0.73496]
Epoch 31:  65%|######5   | 13/20 [00:00<00:00, 29.92it/s, loss=-0.12105, sqweights=0.73454]
Epoch 31:  70%|#######   | 14/20 [00:00<00:00, 29.92it/s, loss=-0.12109, sqweights=0.73575]
Epoch 31:  75%|#######5  | 15/20 [00:00<00:00, 30.52it/s, loss=-0.12109, sqweights=0.73575]
Epoch 31:  75%|#######5  | 15/20 [00:00<00:00, 30.52it/s, loss=-0.12151, sqweights=0.73517]
Epoch 31:  80%|########  | 16/20 [00:00<00:00, 30.52it/s, loss=-0.12253, sqweights=0.73563]
Epoch 31:  85%|########5 | 17/20 [00:00<00:00, 30.52it/s, loss=-0.12319, sqweights=0.73586]
Epoch 31:  90%|######### | 18/20 [00:00<00:00, 30.52it/s, loss=-0.12319, sqweights=0.73783]
Epoch 31:  95%|#########5| 19/20 [00:00<00:00, 30.93it/s, loss=-0.12319, sqweights=0.73783]
Epoch 31:  95%|#########5| 19/20 [00:00<00:00, 30.93it/s, loss=-0.12182, sqweights=0.73800]
Epoch 31: 100%|##########| 20/20 [00:00<00:00, 30.93it/s, loss=-0.12157, sqweights=0.74018]
Epoch 31: 100%|##########| 20/20 [00:01<00:00, 30.93it/s, loss=-0.12157, sqweights=0.74018, train_loss=-0.16970, train_sqweights=0.68108, val_loss=-0.13295, val_sqweights=0.66976]
Epoch 31: 100%|##########| 20/20 [00:01<00:00, 10.26it/s, loss=-0.12157, sqweights=0.74018, train_loss=-0.16970, train_sqweights=0.68108, val_loss=-0.13295, val_sqweights=0.66976]

Epoch 32: 100%|##########| 20/20 [00:01<00:00, 10.89it/s, loss=-0.12063, sqweights=0.75719, train_loss=-0.17112, train_sqweights=0.69624, val_loss=-0.13357, val_sqweights=0.68571]

Epoch 33: 100%|##########| 20/20 [00:01<00:00, 10.91it/s, loss=-0.12390, sqweights=0.76686, train_loss=-0.17259, train_sqweights=0.71010, val_loss=-0.13450, val_sqweights=0.70059]

Epoch 34: 100%|##########| 20/20 [00:01<00:00, 10.23it/s, loss=-0.12380, sqweights=0.77809, train_loss=-0.17366, train_sqweights=0.72323, val_loss=-0.13524, val_sqweights=0.71317]

Epoch 35: 100%|##########| 20/20 [00:01<00:00, 10.80it/s, loss=-0.12584, sqweights=0.78823, train_loss=-0.17499, train_sqweights=0.73558, val_loss=-0.13543, val_sqweights=0.72579]

Epoch 36: 100%|##########| 20/20 [00:01<00:00, 10.88it/s, loss=-0.12768, sqweights=0.80238, train_loss=-0.17610, train_sqweights=0.75066, val_loss=-0.13577, val_sqweights=0.74243]

Epoch 37: 100%|##########| 20/20 [00:01<00:00, 10.19it/s, loss=-0.12111, sqweights=0.81259, train_loss=-0.17758, train_sqweights=0.76143, val_loss=-0.13609, val_sqweights=0.75286]

Epoch 38: 100%|##########| 20/20 [00:01<00:00, 10.86it/s, loss=-0.12410, sqweights=0.81904, train_loss=-0.17829, train_sqweights=0.76850, val_loss=-0.13643, val_sqweights=0.76189]

Epoch 39: 100%|##########| 20/20 [00:01<00:00, 10.88it/s, loss=-0.12744, sqweights=0.82509, train_loss=-0.17858, train_sqweights=0.77892, val_loss=-0.13638, val_sqweights=0.77451]
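
The sqweights values in the log above can be read as a concentration measure. The sketch below assumes that the SquaredWeights metric is the per-sample sum of squared portfolio weights, which is consistent with the 0.125 reported for the 1overN benchmark on 8 assets. The helper name squared_weights and the "effective number of assets" reading (the reciprocal of the statistic) are illustrative additions, not part of deepdow.

```python
import numpy as np


def squared_weights(weights):
    """Sum of squared portfolio weights along the last axis (hypothetical helper)."""
    weights = np.asarray(weights, dtype=float)
    return (weights ** 2).sum(axis=-1)


n_assets = 8

# Equally weighted portfolio: every asset gets 1 / n_assets.
equal = np.full(n_assets, 1.0 / n_assets)

# Fully concentrated portfolio: everything in a single asset.
one_hot = np.zeros(n_assets)
one_hot[0] = 1.0

sq_equal = squared_weights(equal)      # lower bound, 1 / n_assets
sq_one_hot = squared_weights(one_hot)  # upper bound, 1.0

# Reciprocal of the statistic as a rough "effective number of assets".
# 0.77451 is the val_sqweights value from the final epoch 39 line.
effective_n = 1.0 / 0.77451
print(sq_equal, sq_one_hot, round(effective_n, 2))
```

Under this reading, the validation sqweights rising from about 0.67 to about 0.77 over epochs 31 to 39 means the network is moving towards putting most of the portfolio into a single asset, which is the same kind of allocation the VARTrue benchmark produces.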


import numpy as np
import torch

import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VARProcess, forecast

from deepdow.benchmarks import OneOverN, Benchmark, InverseVolatility, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader
from deepdow.losses import MeanReturns, SquaredWeights
from deepdow.nn import LinearNet
from deepdow.experiments import Run


class VARTrue(Benchmark):
    """Benchmark representing the ground truth return process.

    Parameters
    ----------
    process : statsmodels.tsa.vector_ar.var_model.VARProcess
        The ground truth VAR process that generates the returns.

    """

    def __init__(self, process):
        self.process = process

    def __call__(self, x):
        """Invest all money into the asset with the highest return over the horizon."""
        n_samples, n_channels, lookback, n_assets = x.shape

        assert n_channels == 1

        x_np = x.detach().numpy()  # (n_samples, n_channels, lookback, n_assets)
        weights_list = [forecast(x_np[i, 0], self.process.coefs, None, 1).argmax() for i in range(n_samples)]

        result = torch.zeros(n_samples, n_assets).to(x.dtype)

        for i, w_ix in enumerate(weights_list):
            result[i, w_ix] = 1

        return result


coefs = np.load('var_coefs.npy')  # (lookback, n_assets, n_assets) = (12, 8, 8)

# Parameters
lookback, _, n_assets = coefs.shape
gap, horizon = 0, 1
batch_size = 256

# Simulate returns
process = VARProcess(coefs, None, np.eye(n_assets) * 1e-5)
data = process.simulate_var(10000)
n_timesteps = len(data)

# Create features and targets
X_list, y_list = [], []

for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(data[i - lookback: i, :])
    y_list.append(data[i + gap: i + gap + horizon, :])

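# Add a singleton channel axis, since the dataset expects 4D arrays
X = np.stack(X_list, axis=0)[:, None, ...]  # (n_samples, 1, lookback, n_assets)
y = np.stack(y_list, axis=0)[:, None, ...]  # (n_samples, 1, horizon, n_assets)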

# Setup deepdow framework
dataset = InRAMDataset(X, y)

network = LinearNet(1, lookback, n_assets, p=0.5)
dataloader = RigidDataLoader(dataset,
                             indices=list(range(5000)),
                             batch_size=batch_size,
                             lookback=lookback)
val_dataloaders = {'train': dataloader,
                   'val': RigidDataLoader(dataset,
                                          indices=list(range(5020, 9800)),
                                          batch_size=batch_size,
                                          lookback=lookback)}

run = Run(network,
          100 * MeanReturns(),
          dataloader,
          val_dataloaders=val_dataloaders,
          metrics={'sqweights': SquaredWeights()},
          benchmarks={'1overN': OneOverN(),
                      'VAR': VARTrue(process),
                      'Random': Random(),
                      'InverseVol': InverseVolatility()},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback('val', 'loss')]
          )

# Train for at most 40 epochs; the early stopping callback may end the run sooner
history = run.launch(40)

fig, ax = plt.subplots(1, 1)
ax.set_title('Validation loss')

per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value'].mean()['val']['loss']
our = per_epoch_results['network']
our.plot(ax=ax, label='network')

ax.hlines(y=per_epoch_results['VAR'], xmin=0, xmax=len(our), color='red', label='VAR')
ax.hlines(y=per_epoch_results['1overN'], xmin=0, xmax=len(our), color='green', label='1overN')
ax.hlines(y=per_epoch_results['Random'], xmin=0, xmax=len(our), color='yellow', label='Random')
ax.hlines(y=per_epoch_results['InverseVol'], xmin=0, xmax=len(our), color='black', label='InverseVol')

plt.legend()
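
The rule implemented by VARTrue can be sketched without statsmodels or torch. The block below is a minimal NumPy illustration, assuming a VAR process with no intercept and a lookback window ordered from oldest to newest observation, as in the script above. The function names and the toy VAR(2) coefficients are hypothetical and are not the values stored in var_coefs.npy.

```python
import numpy as np


def var_one_step_forecast(window, coefs):
    """One-step-ahead forecast of a VAR(p) process without intercept.

    window : array of shape (lookback, n_assets), oldest observation first.
    coefs  : array of shape (p, n_assets, n_assets), coefs[k - 1] is A_k.
    """
    p = coefs.shape[0]
    forecast = np.zeros(coefs.shape[1])
    for k in range(1, p + 1):
        # A_k multiplies the return observed k steps before the forecast date.
        forecast += coefs[k - 1] @ window[-k]
    return forecast


def one_hot_allocation(forecast):
    """Put the whole portfolio into the asset with the highest forecast."""
    weights = np.zeros_like(forecast)
    weights[forecast.argmax()] = 1.0
    return weights


# Hypothetical toy VAR(2) with 3 assets, used only for illustration.
rng = np.random.default_rng(0)
toy_coefs = 0.1 * rng.standard_normal((2, 3, 3))
toy_window = 0.01 * rng.standard_normal((2, 3))

r_hat = var_one_step_forecast(toy_window, toy_coefs)
weights = one_hot_allocation(r_hat)
print(r_hat, weights)
```

Because the benchmark sees the true coefficients, its forecast differs from the realized return only by the noise term, so it serves as a reference for how much of the predictable signal the network has captured.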

Total running time of the script: 1 minute 29.753 seconds
