Vector autoregression

This example demonstrates how to validate deepdow on synthetic data. We model returns with a vector autoregression (VAR) model, which expresses future returns as a linear function of lagged returns. See [Lütkepohl2005] for more details. We use a stable VAR process with 12 lags and 8 assets, that is,

\[r_t = A_1 r_{t-1} + \dots + A_{12} r_{t-12}\]
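To make the recursion above concrete, here is a minimal NumPy sketch of simulating such a process. It is not the code used in this example, which relies on statsmodels for the simulation. The function names, the coefficient scale, and the additive Gaussian noise term are all assumptions of this sketch; the noise term does not appear in the formula above. Stability is checked through the eigenvalues of the companion matrix.

```python
import numpy as np


def simulate_var(coefs, n_steps, noise_std=0.01, seed=0):
    """Simulate r_t = A_1 r_{t-1} + ... + A_p r_{t-p} + e_t.

    coefs has shape (p, n_assets, n_assets). The Gaussian noise e_t is an
    assumption of this sketch and is not part of the formula in the text.
    """
    rng = np.random.default_rng(seed)
    p, n_assets, _ = coefs.shape
    returns = np.zeros((p + n_steps, n_assets))  # first p rows are zero start-up lags
    for t in range(p, p + n_steps):
        # coefs[i] plays the role of A_{i+1}, returns[t - 1 - i] of r_{t-(i+1)}.
        lagged = sum(coefs[i] @ returns[t - 1 - i] for i in range(p))
        returns[t] = lagged + rng.normal(scale=noise_std, size=n_assets)
    return returns[p:]


def is_stable(coefs):
    """Return True if all companion-matrix eigenvalues lie inside the unit circle."""
    p, n_assets, _ = coefs.shape
    companion = np.zeros((p * n_assets, p * n_assets))
    companion[:n_assets] = np.hstack(list(coefs))  # top block row: [A_1 ... A_p]
    companion[n_assets:, :-n_assets] = np.eye((p - 1) * n_assets)  # shift older lags down
    return bool(np.max(np.abs(np.linalg.eigvals(companion))) < 1)


rng = np.random.default_rng(42)
n_lags, n_assets = 12, 8
# Hypothetical coefficient scale, chosen small enough that the process is stable.
coefs = rng.uniform(-0.01, 0.01, size=(n_lags, n_assets, n_assets))
returns = simulate_var(coefs, n_steps=500)
```

With coefficients this small, the row sums of the absolute values of all twelve matrices add up to less than one, which is a sufficient condition for stability.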

For this task, we use the LinearNet network. It is very similar to VAR because it also fits a linear model on all lagged variables. However, it additionally contains deep learning components such as dropout, batch normalization, and a softmax allocator.
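The core idea of a linear model followed by a softmax allocator can be sketched in a few lines of NumPy. This is an illustration only and not the LinearNet implementation: it leaves out dropout and batch normalization, and the function names, the lookback of 12, and the random parameters are assumptions made for the sketch.

```python
import numpy as np


def softmax(scores):
    """Map arbitrary scores to positive weights that sum to one per sample."""
    shifted = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def linear_allocator(lagged_returns, weight, bias):
    """Flatten the lookback window, apply one linear layer, then softmax.

    lagged_returns: (n_samples, lookback, n_assets)
    weight: (lookback * n_assets, n_assets), bias: (n_assets,)
    """
    n_samples = lagged_returns.shape[0]
    features = lagged_returns.reshape(n_samples, -1)
    return softmax(features @ weight + bias)


rng = np.random.default_rng(0)
lookback, n_assets = 12, 8  # lookback of 12 is an assumption matching the 12 lags
x = rng.normal(scale=0.01, size=(4, lookback, n_assets))
weight = rng.normal(scale=0.1, size=(lookback * n_assets, n_assets))
bias = np.zeros(n_assets)
weights = linear_allocator(x, weight, bias)
```

The softmax guarantees a long-only, fully invested portfolio, so the linear layer is free to output unconstrained scores.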

To put the performance of our network into context, we create a benchmark, VARTrue, that has access to the true parameters of the VAR process. It follows a simple investment rule: invest all resources in the asset with the highest predicted future return. We also consider the following benchmarks:

  • equally weighted portfolio

  • inverse volatility

  • random allocation
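The benchmark rules above can be sketched as follows. These are not the deepdow benchmark classes. All function names and the toy data are hypothetical, and the way random weights are drawn is an assumption. We also assume that the sqweights metric reported in the output is the sum of squared weights.

```python
import numpy as np


def equally_weighted(n_assets):
    """Same weight for every asset."""
    return np.full(n_assets, 1.0 / n_assets)


def inverse_volatility(past_returns):
    """Weights proportional to 1 / standard deviation of each asset's past returns."""
    inv_vol = 1.0 / past_returns.std(axis=0)
    return inv_vol / inv_vol.sum()


def random_allocation(n_assets, rng):
    """Uniform random draws, normalized to sum to one (an assumed sampling scheme)."""
    raw = rng.random(n_assets)
    return raw / raw.sum()


def best_predicted_asset(coefs, past_returns):
    """Put everything into the asset with the highest one-step VAR forecast.

    coefs: (p, n_assets, n_assets); past_returns: (>= p, n_assets), newest row last.
    """
    p = coefs.shape[0]
    forecast = sum(coefs[i] @ past_returns[-1 - i] for i in range(p))
    weights = np.zeros(coefs.shape[1])
    weights[np.argmax(forecast)] = 1.0
    return weights


rng = np.random.default_rng(0)
n_lags, n_assets = 12, 8
coefs = rng.uniform(-0.01, 0.01, size=(n_lags, n_assets, n_assets))  # toy "true" parameters
past_returns = rng.normal(scale=0.01, size=(50, n_assets))  # toy return history

allocations = {
    "1overN": equally_weighted(n_assets),
    "InverseVol": inverse_volatility(past_returns),
    "Random": random_allocation(n_assets, rng),
    "VARTrue": best_predicted_asset(coefs, past_returns),
}
# Sum of squared weights: 1/n for equal weights, 1 for a single-asset portfolio.
sqweights = {name: float(np.sum(w ** 2)) for name, w in allocations.items()}
```

Under that assumption, an equally weighted portfolio of 8 assets gives 8 × (1/8)² = 0.125 and a single-asset portfolio gives 1, which agrees with the sqweights values of 0.125 and 1.000 reported for the 1overN and VAR benchmarks in the output.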

References

[Lütkepohl2005] Lütkepohl, Helmut. New Introduction to Multiple Time Series Analysis. Springer Science & Business Media, 2005.

Warning

Note that we are using the statsmodels package to simulate the VAR process.

Validation loss

Out:

model       metric     epoch  dataloader
1overN      loss       -1     train         0.003
                              val           0.002
            sqweights  -1     train         0.125
                              val           0.125
InverseVol  loss       -1     train         0.001
                              val           0.004
            sqweights  -1     train         0.144
                              val           0.144
Random      loss       -1     train         0.003
                              val           0.001
            sqweights  -1     train         0.166
                              val           0.166
VAR         loss       -1     train        -0.171
                              val          -0.164
            sqweights  -1     train         1.000
                              val           1.000

Epoch 0: 100%|##########| 20/20 [00:00<00:00, 20.72it/s, loss=0.00302, sqweights=0.16675, train_loss=0.00254, train_sqweights=0.12554, val_loss=0.00225, val_sqweights=0.12554]

Epoch 1: 100%|##########| 20/20 [00:00<00:00, 22.92it/s, loss=-0.00546, sqweights=0.16817, train_loss=0.00217, train_sqweights=0.12564, val_loss=0.00189, val_sqweights=0.12564]

Epoch 2: 100%|##########| 20/20 [00:00<00:00, 22.83it/s, loss=-0.00950, sqweights=0.17264, train_loss=0.00072, train_sqweights=0.12599, val_loss=0.00056, val_sqweights=0.12599]

Epoch 3: 100%|##########| 20/20 [00:00<00:00, 21.02it/s, loss=-0.01543, sqweights=0.18027, train_loss=-0.00466, train_sqweights=0.12852, val_loss=-0.00432, val_sqweights=0.12851]

Epoch 4: 100%|##########| 20/20 [00:00<00:00, 22.94it/s, loss=-0.02161, sqweights=0.19095, train_loss=-0.01848, train_sqweights=0.14281, val_loss=-0.01677, val_sqweights=0.14256]

Epoch 5: 100%|##########| 20/20 [00:00<00:00, 22.93it/s, loss=-0.02946, sqweights=0.20533, train_loss=-0.03389, train_sqweights=0.16710, val_loss=-0.03050, val_sqweights=0.16615]

Epoch 6: 100%|##########| 20/20 [00:00<00:00, 21.13it/s, loss=-0.03697, sqweights=0.22094, train_loss=-0.04421, train_sqweights=0.18340, val_loss=-0.03965, val_sqweights=0.18176]

Epoch 7: 100%|##########| 20/20 [00:00<00:00, 22.99it/s, loss=-0.04550, sqweights=0.24180, train_loss=-0.05317, train_sqweights=0.19879, val_loss=-0.04749, val_sqweights=0.19642]

Epoch 8: 100%|##########| 20/20 [00:00<00:00, 22.97it/s, loss=-0.05021, sqweights=0.26305, train_loss=-0.06172, train_sqweights=0.21506, val_loss=-0.05491, val_sqweights=0.21180]

Epoch 9: 100%|##########| 20/20 [00:00<00:00, 21.01it/s, loss=-0.05580, sqweights=0.28738, train_loss=-0.07013, train_sqweights=0.23461, val_loss=-0.06212, val_sqweights=0.23043]

Epoch 10: 100%|##########| 20/20 [00:00<00:00, 22.96it/s, loss=-0.06556, sqweights=0.31780, train_loss=-0.07849, train_sqweights=0.25563, val_loss=-0.06894, val_sqweights=0.25042]

Epoch 11: 100%|##########| 20/20 [00:00<00:00, 22.91it/s, loss=-0.07279, sqweights=0.34641, train_loss=-0.08670, train_sqweights=0.27786, val_loss=-0.07574, val_sqweights=0.27149]

Epoch 12: 100%|##########| 20/20 [00:00<00:00, 21.07it/s, loss=-0.07744, sqweights=0.37093, train_loss=-0.09457, train_sqweights=0.30225, val_loss=-0.08213, val_sqweights=0.29512]

Epoch 13: 100%|##########| 20/20 [00:00<00:00, 22.94it/s, loss=-0.07979, sqweights=0.40358, train_loss=-0.10163, train_sqweights=0.32386, val_loss=-0.08767, val_sqweights=0.31607]

Epoch 14: 100%|##########| 20/20 [00:00<00:00, 22.89it/s, loss=-0.08708, sqweights=0.42805, train_loss=-0.10839, train_sqweights=0.34568, val_loss=-0.09305, val_sqweights=0.33726]

Epoch 15: 100%|##########| 20/20 [00:00<00:00, 21.07it/s, loss=-0.09152, sqweights=0.45297, train_loss=-0.11487, train_sqweights=0.36998, val_loss=-0.09807, val_sqweights=0.36087]

Epoch 16: 100%|##########| 20/20 [00:00<00:00, 22.96it/s, loss=-0.09300, sqweights=0.47924, train_loss=-0.12079, train_sqweights=0.39438, val_loss=-0.10267, val_sqweights=0.38483]

Epoch 17: 100%|##########| 20/20 [00:00<00:00, 22.89it/s, loss=-0.09885, sqweights=0.50302, train_loss=-0.12615, train_sqweights=0.41569, val_loss=-0.10689, val_sqweights=0.40580]

Epoch 18: 100%|##########| 20/20 [00:00<00:00, 20.93it/s, loss=-0.10383, sqweights=0.52507, train_loss=-0.13117, train_sqweights=0.43670, val_loss=-0.11038, val_sqweights=0.42646]

Epoch 19: 100%|##########| 20/20 [00:00<00:00, 22.79it/s, loss=-0.10345, sqweights=0.54703, train_loss=-0.13582, train_sqweights=0.45844, val_loss=-0.11390, val_sqweights=0.44765]

Epoch 20: 100%|##########| 20/20 [00:00<00:00, 22.86it/s, loss=-0.10498, sqweights=0.56376, train_loss=-0.13993, train_sqweights=0.48041, val_loss=-0.11708, val_sqweights=0.46924]

Epoch 21: 100%|##########| 20/20 [00:00<00:00, 20.91it/s, loss=-0.10770, sqweights=0.58922, train_loss=-0.14367, train_sqweights=0.49963, val_loss=-0.12016, val_sqweights=0.48839]

Epoch 22: 100%|##########| 20/20 [00:00<00:00, 22.89it/s, loss=-0.11161, sqweights=0.60914, train_loss=-0.14733, train_sqweights=0.52096, val_loss=-0.12274, val_sqweights=0.51023]

Epoch 23: 100%|##########| 20/20 [00:00<00:00, 22.92it/s, loss=-0.10959, sqweights=0.62822, train_loss=-0.15079, train_sqweights=0.53957, val_loss=-0.12452, val_sqweights=0.52951]

Epoch 24: 100%|##########| 20/20 [00:00<00:00, 22.88it/s, loss=-0.11682, sqweights=0.64500, train_loss=-0.15390, train_sqweights=0.56222, val_loss=-0.12632, val_sqweights=0.55202]

Epoch 25: 100%|##########| 20/20 [00:00<00:00, 20.97it/s, loss=-0.11864, sqweights=0.66051, train_loss=-0.15682, train_sqweights=0.58250, val_loss=-0.12819, val_sqweights=0.57231]

Epoch 26: 100%|##########| 20/20 [00:00<00:00, 22.94it/s, loss=-0.11867, sqweights=0.67311, train_loss=-0.15961, train_sqweights=0.60092, val_loss=-0.12996, val_sqweights=0.59055]

Epoch 27: 100%|##########| 20/20 [00:00<00:00, 22.82it/s, loss=-0.11323, sqweights=0.69140, train_loss=-0.16239, train_sqweights=0.61816, val_loss=-0.13177, val_sqweights=0.60865]

Epoch 28: 100%|##########| 20/20 [00:00<00:00, 20.89it/s, loss=-0.12100, sqweights=0.71216, train_loss=-0.16446, train_sqweights=0.63396, val_loss=-0.13316, val_sqweights=0.62402]

Epoch 29: 100%|##########| 20/20 [00:00<00:00, 22.82it/s, loss=-0.11814, sqweights=0.72400, train_loss=-0.16686, train_sqweights=0.65243, val_loss=-0.13435, val_sqweights=0.64331]

Epoch 30: 100%|##########| 20/20 [00:00<00:00, 22.54it/s, loss=-0.12223, sqweights=0.73404, train_loss=-0.16889, train_sqweights=0.66914, val_loss=-0.13579, val_sqweights=0.66045]

Epoch 31: 100%|##########| 20/20 [00:00<00:00, 20.58it/s, loss=-0.11985, sqweights=0.75470, train_loss=-0.17068, train_sqweights=0.68732, val_loss=-0.13670, val_sqweights=0.67888]

Epoch 32: 100%|##########| 20/20 [00:00<00:00, 22.83it/s, loss=-0.12444, sqweights=0.75931, train_loss=-0.17207, train_sqweights=0.70150, val_loss=-0.13754, val_sqweights=0.69383]

Epoch 33: 100%|##########| 20/20 [00:00<00:00, 22.79it/s, loss=-0.11831, sqweights=0.77504, train_loss=-0.17351, train_sqweights=0.71443, val_loss=-0.13893, val_sqweights=0.70731]

Epoch 34: 100%|##########| 20/20 [00:00<00:00, 20.76it/s, loss=-0.12124, sqweights=0.78668, train_loss=-0.17485, train_sqweights=0.72696, val_loss=-0.14022, val_sqweights=0.71958]

Epoch 35: 100%|##########| 20/20 [00:00<00:00, 22.84it/s, loss=-0.12331, sqweights=0.79596, train_loss=-0.17598, train_sqweights=0.74184, val_loss=-0.14088, val_sqweights=0.73432]

Epoch 36: 100%|##########| 20/20 [00:00<00:00, 64.18it/s, loss=-0.11944, sqweights=0.81471, train_loss=-0.17719, train_sqweights=0.75626, val_loss=-0.14117, val_sqweights=0.74964]
Epoch 36: 100%|##########| 20/20 [00:00<00:00, 22.94it/s, loss=-0.11944, sqweights=0.81471, train_loss=-0.17719, train_sqweights=0.75626, val_loss=-0.14117, val_sqweights=0.74964]

Epoch 37:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 37:   5%|5         | 1/20 [00:00<00:00, 47.35it/s, loss=-0.09669, sqweights=0.80371]
Epoch 37:  10%|#         | 2/20 [00:00<00:00, 54.48it/s, loss=-0.11420, sqweights=0.81049]
Epoch 37:  15%|#5        | 3/20 [00:00<00:00, 57.54it/s, loss=-0.12097, sqweights=0.81250]
Epoch 37:  20%|##        | 4/20 [00:00<00:00, 59.28it/s, loss=-0.12923, sqweights=0.81603]
Epoch 37:  25%|##5       | 5/20 [00:00<00:00, 60.41it/s, loss=-0.12845, sqweights=0.81897]
Epoch 37:  30%|###       | 6/20 [00:00<00:00, 61.33it/s, loss=-0.12446, sqweights=0.81689]
Epoch 37:  35%|###5      | 7/20 [00:00<00:00, 61.98it/s, loss=-0.12446, sqweights=0.81689]
Epoch 37:  35%|###5      | 7/20 [00:00<00:00, 61.98it/s, loss=-0.12119, sqweights=0.81777]
Epoch 37:  40%|####      | 8/20 [00:00<00:00, 61.98it/s, loss=-0.11886, sqweights=0.81662]
Epoch 37:  45%|####5     | 9/20 [00:00<00:00, 61.98it/s, loss=-0.11647, sqweights=0.81343]
Epoch 37:  50%|#####     | 10/20 [00:00<00:00, 61.98it/s, loss=-0.11688, sqweights=0.81170]
Epoch 37:  55%|#####5    | 11/20 [00:00<00:00, 61.98it/s, loss=-0.11829, sqweights=0.81052]
Epoch 37:  60%|######    | 12/20 [00:00<00:00, 61.98it/s, loss=-0.11920, sqweights=0.80982]
Epoch 37:  65%|######5   | 13/20 [00:00<00:00, 61.98it/s, loss=-0.11829, sqweights=0.80992]
Epoch 37:  70%|#######   | 14/20 [00:00<00:00, 64.12it/s, loss=-0.11829, sqweights=0.80992]
Epoch 37:  70%|#######   | 14/20 [00:00<00:00, 64.12it/s, loss=-0.11929, sqweights=0.81049]
Epoch 37:  75%|#######5  | 15/20 [00:00<00:00, 64.12it/s, loss=-0.12044, sqweights=0.81164]
Epoch 37:  80%|########  | 16/20 [00:00<00:00, 64.12it/s, loss=-0.12100, sqweights=0.81205]
Epoch 37:  85%|########5 | 17/20 [00:00<00:00, 64.12it/s, loss=-0.12035, sqweights=0.81363]
Epoch 37:  90%|######### | 18/20 [00:00<00:00, 64.12it/s, loss=-0.11999, sqweights=0.81255]
Epoch 37:  95%|#########5| 19/20 [00:00<00:00, 64.12it/s, loss=-0.12092, sqweights=0.81334]
Epoch 37: 100%|##########| 20/20 [00:00<00:00, 64.12it/s, loss=-0.12184, sqweights=0.81348]
Epoch 37: 100%|##########| 20/20 [00:00<00:00, 64.12it/s, loss=-0.12184, sqweights=0.81348, train_loss=-0.17800, train_sqweights=0.76916, val_loss=-0.14122, val_sqweights=0.76233]
Epoch 37: 100%|##########| 20/20 [00:00<00:00, 20.84it/s, loss=-0.12184, sqweights=0.81348, train_loss=-0.17800, train_sqweights=0.76916, val_loss=-0.14122, val_sqweights=0.76233]

Epoch 38:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 38:   5%|5         | 1/20 [00:00<00:00, 46.87it/s, loss=-0.13066, sqweights=0.83461]
Epoch 38:  10%|#         | 2/20 [00:00<00:00, 54.42it/s, loss=-0.12665, sqweights=0.82183]
Epoch 38:  15%|#5        | 3/20 [00:00<00:00, 56.31it/s, loss=-0.12674, sqweights=0.82148]
Epoch 38:  20%|##        | 4/20 [00:00<00:00, 58.36it/s, loss=-0.12358, sqweights=0.82614]
Epoch 38:  25%|##5       | 5/20 [00:00<00:00, 59.76it/s, loss=-0.12783, sqweights=0.82535]
Epoch 38:  30%|###       | 6/20 [00:00<00:00, 60.71it/s, loss=-0.12487, sqweights=0.82444]
Epoch 38:  35%|###5      | 7/20 [00:00<00:00, 61.42it/s, loss=-0.12487, sqweights=0.82444]
Epoch 38:  35%|###5      | 7/20 [00:00<00:00, 61.42it/s, loss=-0.12447, sqweights=0.82529]
Epoch 38:  40%|####      | 8/20 [00:00<00:00, 61.42it/s, loss=-0.12679, sqweights=0.82738]
Epoch 38:  45%|####5     | 9/20 [00:00<00:00, 61.42it/s, loss=-0.12726, sqweights=0.82864]
Epoch 38:  50%|#####     | 10/20 [00:00<00:00, 61.42it/s, loss=-0.12600, sqweights=0.82937]
Epoch 38:  55%|#####5    | 11/20 [00:00<00:00, 61.42it/s, loss=-0.12362, sqweights=0.83082]
Epoch 38:  60%|######    | 12/20 [00:00<00:00, 61.42it/s, loss=-0.12040, sqweights=0.82815]
Epoch 38:  65%|######5   | 13/20 [00:00<00:00, 61.42it/s, loss=-0.12115, sqweights=0.82795]
Epoch 38:  70%|#######   | 14/20 [00:00<00:00, 63.51it/s, loss=-0.12115, sqweights=0.82795]
Epoch 38:  70%|#######   | 14/20 [00:00<00:00, 63.51it/s, loss=-0.12025, sqweights=0.82806]
Epoch 38:  75%|#######5  | 15/20 [00:00<00:00, 63.51it/s, loss=-0.11985, sqweights=0.82882]
Epoch 38:  80%|########  | 16/20 [00:00<00:00, 63.51it/s, loss=-0.12022, sqweights=0.82882]
Epoch 38:  85%|########5 | 17/20 [00:00<00:00, 63.51it/s, loss=-0.11996, sqweights=0.82795]
Epoch 38:  90%|######### | 18/20 [00:00<00:00, 63.51it/s, loss=-0.11834, sqweights=0.82861]
Epoch 38:  95%|#########5| 19/20 [00:00<00:00, 63.51it/s, loss=-0.12000, sqweights=0.82894]
Epoch 38: 100%|##########| 20/20 [00:00<00:00, 63.51it/s, loss=-0.12019, sqweights=0.82741]
Epoch 38: 100%|##########| 20/20 [00:00<00:00, 63.51it/s, loss=-0.12019, sqweights=0.82741, train_loss=-0.17926, train_sqweights=0.78053, val_loss=-0.14188, val_sqweights=0.77323]
Epoch 38: 100%|##########| 20/20 [00:00<00:00, 22.80it/s, loss=-0.12019, sqweights=0.82741, train_loss=-0.17926, train_sqweights=0.78053, val_loss=-0.14188, val_sqweights=0.77323]

Epoch 39:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 39:   5%|5         | 1/20 [00:00<00:00, 46.64it/s, loss=-0.11934, sqweights=0.83920]
Epoch 39:  10%|#         | 2/20 [00:00<00:00, 54.02it/s, loss=-0.12744, sqweights=0.82929]
Epoch 39:  15%|#5        | 3/20 [00:00<00:00, 57.32it/s, loss=-0.13355, sqweights=0.83433]
Epoch 39:  20%|##        | 4/20 [00:00<00:00, 59.22it/s, loss=-0.13399, sqweights=0.83025]
Epoch 39:  25%|##5       | 5/20 [00:00<00:00, 60.35it/s, loss=-0.12489, sqweights=0.83485]
Epoch 39:  30%|###       | 6/20 [00:00<00:00, 61.18it/s, loss=-0.12600, sqweights=0.83713]
Epoch 39:  35%|###5      | 7/20 [00:00<00:00, 61.71it/s, loss=-0.12600, sqweights=0.83713]
Epoch 39:  35%|###5      | 7/20 [00:00<00:00, 61.71it/s, loss=-0.12480, sqweights=0.83511]
Epoch 39:  40%|####      | 8/20 [00:00<00:00, 61.71it/s, loss=-0.12381, sqweights=0.83628]
Epoch 39:  45%|####5     | 9/20 [00:00<00:00, 61.71it/s, loss=-0.12263, sqweights=0.83559]
Epoch 39:  50%|#####     | 10/20 [00:00<00:00, 61.71it/s, loss=-0.12247, sqweights=0.83682]
Epoch 39:  55%|#####5    | 11/20 [00:00<00:00, 61.71it/s, loss=-0.11934, sqweights=0.83594]
Epoch 39:  60%|######    | 12/20 [00:00<00:00, 61.71it/s, loss=-0.12342, sqweights=0.83625]
Epoch 39:  65%|######5   | 13/20 [00:00<00:00, 61.71it/s, loss=-0.12386, sqweights=0.83586]
Epoch 39:  70%|#######   | 14/20 [00:00<00:00, 63.49it/s, loss=-0.12386, sqweights=0.83586]
Epoch 39:  70%|#######   | 14/20 [00:00<00:00, 63.49it/s, loss=-0.12373, sqweights=0.83688]
Epoch 39:  75%|#######5  | 15/20 [00:00<00:00, 63.49it/s, loss=-0.12379, sqweights=0.83734]
Epoch 39:  80%|########  | 16/20 [00:00<00:00, 63.49it/s, loss=-0.12277, sqweights=0.83776]
Epoch 39:  85%|########5 | 17/20 [00:00<00:00, 63.49it/s, loss=-0.12233, sqweights=0.83761]
Epoch 39:  90%|######### | 18/20 [00:00<00:00, 63.49it/s, loss=-0.12244, sqweights=0.83630]
Epoch 39:  95%|#########5| 19/20 [00:00<00:00, 63.49it/s, loss=-0.12340, sqweights=0.83698]
Epoch 39: 100%|##########| 20/20 [00:00<00:00, 63.49it/s, loss=-0.12448, sqweights=0.83555]
Epoch 39: 100%|##########| 20/20 [00:00<00:00, 63.49it/s, loss=-0.12448, sqweights=0.83555, train_loss=-0.18044, train_sqweights=0.78962, val_loss=-0.14197, val_sqweights=0.78204]
Epoch 39: 100%|##########| 20/20 [00:00<00:00, 22.89it/s, loss=-0.12448, sqweights=0.83555, train_loss=-0.18044, train_sqweights=0.78962, val_loss=-0.14197, val_sqweights=0.78204]
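
The sqweights column in the log is easier to read against a reference scale. The following is a minimal sketch, assuming that sqweights is the sum of squared portfolio weights (this is consistent with the 0.125 reported for the 1overN benchmark on 8 assets). The helper name `squared_weights` is illustrative and is not part of deepdow.

```python
def squared_weights(weights):
    """Sum of squared portfolio weights (assumed definition of `sqweights`)."""
    return sum(w ** 2 for w in weights)


n_assets = 8

# Equally weighted portfolio, as in the 1overN benchmark
equal = [1 / n_assets] * n_assets

# Everything in a single asset, as in the VARTrue investment rule
one_hot = [1.0] + [0.0] * (n_assets - 1)

sq_equal = squared_weights(equal)      # lower bound: 1 / n_assets
sq_one_hot = squared_weights(one_hot)  # upper bound: 1

print(sq_equal, sq_one_hot)
```

Under this assumption the metric ranges from 1/8 (fully diversified) to 1 (fully concentrated), so the values of roughly 0.73 to 0.84 logged for epochs 35 to 39 suggest that the network puts most of its weight on one asset per sample.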

import numpy as np
import torch

import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VARProcess, forecast

from deepdow.benchmarks import OneOverN, Benchmark, InverseVolatility, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader
from deepdow.losses import MeanReturns, SquaredWeights
from deepdow.nn import LinearNet
from deepdow.experiments import Run


class VARTrue(Benchmark):
    """Benchmark representing the ground truth return process.

    Parameters
    ----------
    process : statsmodels.tsa.vector_ar.var_model.VARProcess
        The ground truth VAR process that generates the returns.

    """

    def __init__(self, process):
        self.process = process

    def __call__(self, x):
        """Invest all money into the asset with the highest return over the horizon."""
        n_samples, n_channels, lookback, n_assets = x.shape

        assert n_channels == 1

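        x_np = x.detach().numpy()  # (n_samples, n_channels, lookback, n_assets)
        # The one-step-ahead forecast of each sample has shape (1, n_assets); argmax selects the asset
        # with the highest predicted return.
        weights_list = [forecast(x_np[i, 0], self.process.coefs, None, 1).argmax() for i in range(n_samples)]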

        result = torch.zeros(n_samples, n_assets).to(x.dtype)

        for i, w_ix in enumerate(weights_list):
            result[i, w_ix] = 1

        return result


coefs = np.load('var_coefs.npy')  # (lookback, n_assets, n_assets) = (12, 8, 8)

# Parameters
lookback, _, n_assets = coefs.shape
gap, horizon = 0, 1
batch_size = 256

# Simulate returns
process = VARProcess(coefs, None, np.eye(n_assets) * 1e-5)
data = process.simulate_var(10000)
n_timesteps = len(data)

# Create features and targets
X_list, y_list = [], []

for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(data[i - lookback: i, :])
    y_list.append(data[i + gap: i + gap + horizon, :])

X = np.stack(X_list, axis=0)[:, None, ...]
y = np.stack(y_list, axis=0)[:, None, ...]

# Setup deepdow framework
dataset = InRAMDataset(X, y)

network = LinearNet(1, lookback, n_assets, p=0.5)
dataloader = RigidDataLoader(dataset,
                             indices=list(range(5000)),
                             batch_size=batch_size,
                             lookback=lookback)
val_dataloaders = {'train': dataloader,
                   'val': RigidDataLoader(dataset,
                                          indices=list(range(5020, 9800)),
                                          batch_size=batch_size,
                                          lookback=lookback)}

run = Run(network,
          100 * MeanReturns(),
          dataloader,
          val_dataloaders=val_dataloaders,
          metrics={'sqweights': SquaredWeights()},
          benchmarks={'1overN': OneOverN(),
                      'VAR': VARTrue(process),
                      'Random': Random(),
                      'InverseVol': InverseVolatility()},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback('val', 'loss')]
          )

history = run.launch(40)

fig, ax = plt.subplots(1, 1)
ax.set_title('Validation loss')

per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value'].mean()['val']['loss']
our = per_epoch_results['network']
our.plot(ax=ax, label='network')

ax.hlines(y=per_epoch_results['VAR'], xmin=0, xmax=len(our), color='red', label='VAR')
ax.hlines(y=per_epoch_results['1overN'], xmin=0, xmax=len(our), color='green', label='1overN')
ax.hlines(y=per_epoch_results['Random'], xmin=0, xmax=len(our), color='yellow', label='Random')
ax.hlines(y=per_epoch_results['InverseVol'], xmin=0, xmax=len(our), color='black', label='InverseVol')

plt.legend()
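
The feature and target construction in the script is easy to get wrong by one step, so it can help to replay the same loop on a toy series. The sketch below uses plain Python lists instead of the simulated returns. The toy sizes (20 timesteps, 2 assets) are assumptions chosen for readability; `lookback`, `gap` and `horizon` match the script.

```python
# Toy stand-in for the simulated returns: row t is [t, t + 100],
# so the position of every row inside a window can be read off directly.
n_timesteps, n_assets = 20, 2
data = [[t, t + 100] for t in range(n_timesteps)]

lookback, gap, horizon = 12, 0, 1  # same values as in the script

X_list, y_list = [], []
for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(data[i - lookback: i])             # rows i - lookback, ..., i - 1
    y_list.append(data[i + gap: i + gap + horizon])  # rows i + gap, ..., i + gap + horizon - 1

n_samples = len(X_list)

print(n_samples)                      # number of (window, target) pairs
print(X_list[0][-1], y_list[0][0])    # last feature row and target row of the first sample
```

Each sample pairs the `lookback` rows that end at step i - 1 with the target row at step i, and the loop yields n_timesteps - lookback - gap - horizon + 1 samples. Assuming `data` has 10000 rows in the script, that gives 9988 samples, which covers the training indices 0 to 4999 and the validation indices 5020 to 9799.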

Total running time of the script: ( 0 minutes 42.719 seconds)
