Vector autoregression

This example demonstrates how to validate deepdow on synthetic data. We model returns with a vector autoregression (VAR) model, which expresses future returns as a linear function of lagged returns. See [Lütkepohl2005] for more details. We use a stable VAR process with 12 lags and 8 assets, that is,

\[r_t = A_1 r_{t-1} + \dots + A_{12} r_{t-12}\]
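As a minimal sketch of the recursion above, the following NumPy code simulates a VAR process with 12 lags and 8 assets and checks stability through the eigenvalues of the companion matrix. It is not the code used in this example (which relies on statsmodels). The helper names, the random coefficient matrices, the rescaling factor 0.9 and the added Gaussian noise term are all assumptions made for illustration.

```python
import numpy as np


def is_stable(coefs):
    """Return True if all eigenvalues of the companion matrix lie inside the unit circle.

    coefs has shape (n_lags, n_assets, n_assets); coefs[k] plays the role of A_{k+1}.
    """
    n_lags, n_assets, _ = coefs.shape
    size = n_lags * n_assets
    companion = np.zeros((size, size))
    companion[:n_assets] = np.hstack(list(coefs))  # first block row: [A_1, ..., A_p]
    companion[n_assets:, :-n_assets] = np.eye(size - n_assets)  # shifted identity
    return bool(np.max(np.abs(np.linalg.eigvals(companion))) < 1)


def simulate_var(coefs, n_steps, noise_std=0.01, seed=0):
    """Simulate r_t = A_1 r_{t-1} + ... + A_p r_{t-p} + noise (noise is an assumption)."""
    rng = np.random.default_rng(seed)
    n_lags, n_assets, _ = coefs.shape
    r = np.zeros((n_steps + n_lags, n_assets))  # the first n_lags rows are zero initial values
    for t in range(n_lags, n_steps + n_lags):
        lagged = r[t - n_lags:t][::-1]  # lagged[k] is r_{t-k-1}
        r[t] = np.einsum("kij,kj->i", coefs, lagged)
        r[t] += noise_std * rng.standard_normal(n_assets)
    return r[n_lags:]


# Hypothetical coefficients: random matrices rescaled so that their spectral
# norms sum to 0.9, which is a sufficient condition for stability.
rng = np.random.default_rng(42)
coefs = rng.normal(size=(12, 8, 8))
coefs *= 0.9 / np.linalg.norm(coefs, ord=2, axis=(1, 2)).sum()

returns = simulate_var(coefs, n_steps=500)
print(is_stable(coefs), returns.shape)
```

Rescaling by the sum of spectral norms is a convenient but conservative way to obtain a stable process; any coefficient set whose companion matrix passes the eigenvalue check would do.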

For this specific task, we use the LinearNet network. It closely resembles VAR because it fits a linear model on all lagged variables. However, it also includes deep learning components such as dropout, batch normalization, and a softmax allocator.
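To illustrate the idea of a linear model on all lagged variables followed by a softmax allocator, here is a rough NumPy sketch of a forward pass. It is not the deepdow LinearNet implementation: dropout and batch normalization are left out, and the function names, shapes and random parameters are assumptions.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def linear_softmax_forward(x, weight, bias):
    """Map lagged returns of shape (n_samples, lookback, n_assets) to portfolio weights."""
    flat = x.reshape(len(x), -1)  # flatten all lagged variables into one feature vector
    scores = flat @ weight + bias  # one linear score per asset
    return softmax(scores)  # non-negative weights that sum to one


rng = np.random.default_rng(0)
n_samples, lookback, n_assets = 4, 12, 8
x = rng.normal(scale=0.01, size=(n_samples, lookback, n_assets))
weight = rng.normal(scale=0.1, size=(lookback * n_assets, n_assets))
bias = np.zeros(n_assets)

allocations = linear_softmax_forward(x, weight, bias)
print(allocations.sum(axis=1))
```

The softmax step is what turns unconstrained linear scores into a long-only, fully invested allocation.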

To put the performance of our network into context, we create a benchmark, VARTrue, that has access to the true parameters of the VAR process. It follows a simple investment rule: invest all resources in the asset with the highest predicted future return. We also consider the following benchmarks:

  • equally weighted portfolio

  • inverse volatility

  • random allocation
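The benchmark allocation rules can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the deepdow benchmark classes. The function names are hypothetical, the random allocation is assumed to be uniform draws normalized to sum to one, and `squared_weights` assumes that the `sqweights` metric reported in the output is the sum of squared weights.

```python
import numpy as np


def equally_weighted(n_assets):
    """Allocate 1/N to each asset."""
    return np.full(n_assets, 1.0 / n_assets)


def inverse_volatility(past_returns):
    """Weights proportional to the inverse of each asset's sample standard deviation."""
    inv_vol = 1.0 / past_returns.std(axis=0)
    return inv_vol / inv_vol.sum()


def random_allocation(n_assets, rng):
    """Uniform random draws normalized to sum to one (an assumption)."""
    raw = rng.random(n_assets)
    return raw / raw.sum()


def best_predicted_asset(coefs, past_returns):
    """Put everything into the asset with the highest one-step VAR forecast."""
    n_lags = coefs.shape[0]
    lagged = past_returns[-n_lags:][::-1]  # lagged[k] is r_{t-k-1}
    predicted = np.einsum("kij,kj->i", coefs, lagged)
    weights = np.zeros_like(predicted)
    weights[np.argmax(predicted)] = 1.0
    return weights


def squared_weights(weights):
    """Sum of squared weights, a simple concentration measure."""
    return float(np.sum(weights ** 2))


rng = np.random.default_rng(1)
past_returns = rng.normal(scale=0.01, size=(50, 8))  # hypothetical return history
coefs = rng.normal(scale=0.05, size=(12, 8, 8))  # hypothetical true VAR coefficients

w_equal = equally_weighted(8)
w_invvol = inverse_volatility(past_returns)
w_random = random_allocation(8, rng)
w_var = best_predicted_asset(coefs, past_returns)
print(squared_weights(w_equal), squared_weights(w_var))
```

Under this reading of the metric, the equally weighted portfolio over 8 assets gives 8 × (1/8)² = 0.125 and a single-asset allocation gives 1, which matches the `sqweights` values reported for 1overN and VAR in the output.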

References

Lütkepohl2005

Lütkepohl, Helmut. New introduction to multiple time series analysis. Springer Science & Business Media, 2005.

Warning

Note that we are using the statsmodels package to simulate the VAR process.

Validation loss

Out:

model       metric     epoch  dataloader
1overN      loss       -1     train         0.001
                              val           0.002
            sqweights  -1     train         0.125
                              val           0.125
InverseVol  loss       -1     train         0.000
                              val           0.004
            sqweights  -1     train         0.144
                              val           0.145
Random      loss       -1     train         0.001
                              val           0.002
            sqweights  -1     train         0.166
                              val           0.166
VAR         loss       -1     train        -0.172
                              val          -0.167
            sqweights  -1     train         1.000
                              val           1.000

Epoch 0: 100%|##########| 20/20 [00:01<00:00, 12.02it/s, loss=-0.00172, sqweights=0.16249, train_loss=0.00063, train_sqweights=0.12532, val_loss=0.00220, val_sqweights=0.12532]

Epoch 1: 100%|##########| 20/20 [00:01<00:00, 12.25it/s, loss=-0.00928, sqweights=0.16565, train_loss=0.00019, train_sqweights=0.12531, val_loss=0.00183, val_sqweights=0.12531]

Epoch 2: 100%|##########| 20/20 [00:01<00:00, 12.33it/s, loss=-0.01594, sqweights=0.17132, train_loss=-0.00156, train_sqweights=0.12557, val_loss=0.00029, val_sqweights=0.12557]

Epoch 3: 100%|##########| 20/20 [00:01<00:00, 12.02it/s, loss=-0.02038, sqweights=0.17877, train_loss=-0.00774, train_sqweights=0.12794, val_loss=-0.00520, val_sqweights=0.12793]

Epoch 4: 100%|##########| 20/20 [00:01<00:00, 12.27it/s, loss=-0.02774, sqweights=0.18889, train_loss=-0.02334, train_sqweights=0.14206, val_loss=-0.01895, val_sqweights=0.14183]

Epoch 5: 100%|##########| 20/20 [00:01<00:00, 12.01it/s, loss=-0.03667, sqweights=0.20507, train_loss=-0.04038, train_sqweights=0.16675, val_loss=-0.03342, val_sqweights=0.16583]

Epoch 6: 100%|##########| 20/20 [00:01<00:00, 12.20it/s, loss=-0.04034, sqweights=0.22201, train_loss=-0.05124, train_sqweights=0.18382, val_loss=-0.04261, val_sqweights=0.18231]

Epoch 7: 100%|##########| 20/20 [00:01<00:00, 12.40it/s, loss=-0.04968, sqweights=0.24066, train_loss=-0.06017, train_sqweights=0.19890, val_loss=-0.04989, val_sqweights=0.19696]

Epoch 8: 100%|##########| 20/20 [00:01<00:00, 12.32it/s, loss=-0.05694, sqweights=0.26741, train_loss=-0.06902, train_sqweights=0.21666, val_loss=-0.05709, val_sqweights=0.21412]

Epoch 9: 100%|##########| 20/20 [00:01<00:00, 11.65it/s, loss=-0.06525, sqweights=0.29319, train_loss=-0.07769, train_sqweights=0.23535, val_loss=-0.06413, val_sqweights=0.23225]

Epoch 10: 100%|##########| 20/20 [00:01<00:00, 11.77it/s, loss=-0.07267, sqweights=0.32071, train_loss=-0.08608, train_sqweights=0.25663, val_loss=-0.07086, val_sqweights=0.25297]

Epoch 11: 100%|##########| 20/20 [00:00<00:00, 34.72it/s, loss=-0.07558, sqweights=0.34559]
Epoch 11: 100%|##########| 20/20 [00:00<00:00, 34.72it/s, loss=-0.07525, sqweights=0.34636]
Epoch 11: 100%|##########| 20/20 [00:01<00:00, 34.72it/s, loss=-0.07525, sqweights=0.34636, train_loss=-0.09405, train_sqweights=0.27907, val_loss=-0.07708, val_sqweights=0.27466]
Epoch 11: 100%|##########| 20/20 [00:01<00:00, 11.78it/s, loss=-0.07525, sqweights=0.34636, train_loss=-0.09405, train_sqweights=0.27907, val_loss=-0.07708, val_sqweights=0.27466]

Epoch 12: 100%|##########| 20/20 [00:01<00:00, 11.79it/s, loss=-0.08443, sqweights=0.38126, train_loss=-0.10180, train_sqweights=0.30193, val_loss=-0.08318, val_sqweights=0.29689]

Epoch 13: 100%|##########| 20/20 [00:01<00:00, 11.79it/s, loss=-0.08588, sqweights=0.40595, train_loss=-0.10897, train_sqweights=0.32470, val_loss=-0.08895, val_sqweights=0.31896]

Epoch 14: 100%|##########| 20/20 [00:01<00:00, 11.30it/s, loss=-0.09227, sqweights=0.43638, train_loss=-0.11569, train_sqweights=0.34647, val_loss=-0.09421, val_sqweights=0.34063]

Epoch 15: 100%|##########| 20/20 [00:01<00:00, 11.80it/s, loss=-0.09623, sqweights=0.45778, train_loss=-0.12205, train_sqweights=0.37005, val_loss=-0.09897, val_sqweights=0.36407]

Epoch 16: 100%|##########| 20/20 [00:01<00:00, 11.75it/s, loss=-0.09932, sqweights=0.48396, train_loss=-0.12756, train_sqweights=0.39128, val_loss=-0.10340, val_sqweights=0.38561]

Epoch 17: 100%|##########| 20/20 [00:01<00:00, 11.64it/s, loss=-0.10129, sqweights=0.49923, train_loss=-0.13290, train_sqweights=0.41296, val_loss=-0.10744, val_sqweights=0.40769]

Epoch 18: 100%|##########| 20/20 [00:01<00:00, 11.99it/s, loss=-0.10773, sqweights=0.53027, train_loss=-0.13820, train_sqweights=0.43578, val_loss=-0.11128, val_sqweights=0.43038]

Epoch 19: 100%|##########| 20/20 [00:01<00:00, 11.43it/s, loss=-0.10260, sqweights=0.54521, train_loss=-0.14278, train_sqweights=0.45749, val_loss=-0.11450, val_sqweights=0.45201]

Epoch 20: 100%|##########| 20/20 [00:01<00:00, 11.78it/s, loss=-0.10924, sqweights=0.57214, train_loss=-0.14683, train_sqweights=0.47962, val_loss=-0.11690, val_sqweights=0.47393]

Epoch 21: 100%|##########| 20/20 [00:01<00:00, 11.69it/s, loss=-0.11431, sqweights=0.59472, train_loss=-0.15112, train_sqweights=0.50120, val_loss=-0.11953, val_sqweights=0.49543]

Epoch 22: 100%|##########| 20/20 [00:01<00:00, 11.97it/s, loss=-0.11043, sqweights=0.61396, train_loss=-0.15463, train_sqweights=0.52114, val_loss=-0.12157, val_sqweights=0.51515]

Epoch 23: 100%|##########| 20/20 [00:01<00:00, 11.78it/s, loss=-0.11899, sqweights=0.62829, train_loss=-0.15826, train_sqweights=0.54091, val_loss=-0.12349, val_sqweights=0.53488]

Epoch 24: 100%|##########| 20/20 [00:01<00:00, 11.94it/s, loss=-0.11983, sqweights=0.63895, train_loss=-0.16138, train_sqweights=0.55900, val_loss=-0.12553, val_sqweights=0.55314]

Epoch 25: 100%|##########| 20/20 [00:01<00:00, 11.93it/s, loss=-0.11699, sqweights=0.65842, train_loss=-0.16421, train_sqweights=0.57667, val_loss=-0.12700, val_sqweights=0.57164]

Epoch 26: 100%|##########| 20/20 [00:01<00:00, 11.74it/s, loss=-0.11984, sqweights=0.67661, train_loss=-0.16679, train_sqweights=0.59419, val_loss=-0.12801, val_sqweights=0.58968]

Epoch 27: 100%|##########| 20/20 [00:01<00:00, 11.83it/s, loss=-0.12337, sqweights=0.68671, train_loss=-0.16924, train_sqweights=0.61177, val_loss=-0.12970, val_sqweights=0.60755]

Epoch 28: 100%|##########| 20/20 [00:01<00:00, 11.72it/s, loss=-0.11968, sqweights=0.70293, train_loss=-0.17143, train_sqweights=0.62696, val_loss=-0.13104, val_sqweights=0.62227]

Epoch 29: 100%|##########| 20/20 [00:01<00:00, 11.84it/s, loss=-0.12145, sqweights=0.71413, train_loss=-0.17342, train_sqweights=0.64064, val_loss=-0.13218, val_sqweights=0.63481]

Epoch 30: 100%|##########| 20/20 [00:01<00:00, 11.75it/s, loss=-0.12451, sqweights=0.73140, train_loss=-0.17531, train_sqweights=0.65459, val_loss=-0.13349, val_sqweights=0.64882]

Epoch 31: 100%|##########| 20/20 [00:01<00:00, 11.11it/s, loss=-0.13027, sqweights=0.73921, train_loss=-0.17723, train_sqweights=0.67215, val_loss=-0.13439, val_sqweights=0.66581]

Epoch 32: 100%|##########| 20/20 [00:01<00:00, 11.96it/s, loss=-0.11959, sqweights=0.75088, train_loss=-0.17904, train_sqweights=0.68799, val_loss=-0.13515, val_sqweights=0.68209]

Epoch 33: 100%|##########| 20/20 [00:01<00:00, 11.73it/s, loss=-0.12832, sqweights=0.76948, train_loss=-0.18033, train_sqweights=0.70170, val_loss=-0.13570, val_sqweights=0.69625]

Epoch 34: 100%|##########| 20/20 [00:01<00:00, 11.70it/s, loss=-0.13116, sqweights=0.78031, train_loss=-0.18158, train_sqweights=0.71524, val_loss=-0.13612, val_sqweights=0.70912]

Epoch 35: 100%|##########| 20/20 [00:01<00:00, 11.84it/s, loss=-0.13148, sqweights=0.78392, train_loss=-0.18261, train_sqweights=0.72795, val_loss=-0.13654, val_sqweights=0.72101]

Epoch 36: 100%|##########| 20/20 [00:01<00:00, 11.84it/s, loss=-0.12839, sqweights=0.79105, train_loss=-0.18341, train_sqweights=0.74132, val_loss=-0.13665, val_sqweights=0.73435]

Epoch 37: 100%|##########| 20/20 [00:01<00:00, 11.88it/s, loss=-0.13111, sqweights=0.80867, train_loss=-0.18462, train_sqweights=0.75700, val_loss=-0.13740, val_sqweights=0.74987]

Epoch 38: 100%|##########| 20/20 [00:01<00:00, 11.88it/s, loss=-0.12887, sqweights=0.81736, train_loss=-0.18569, train_sqweights=0.76786, val_loss=-0.13782, val_sqweights=0.76115]

Epoch 39: 100%|##########| 20/20 [00:01<00:00, 11.92it/s, loss=-0.13043, sqweights=0.82390, train_loss=-0.18623, train_sqweights=0.77971, val_loss=-0.13841, val_sqweights=0.77265]

import numpy as np
import torch

import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VARProcess, forecast

from deepdow.benchmarks import OneOverN, Benchmark, InverseVolatility, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader
from deepdow.losses import MeanReturns, SquaredWeights
from deepdow.nn import LinearNet
from deepdow.experiments import Run


class VARTrue(Benchmark):
    """Benchmark representing the ground truth return process.

    Parameters
    ----------
    process : statsmodels.tsa.vector_ar.var_model.VARProcess
        The ground truth VAR process that generates the returns.

    """

    def __init__(self, process):
        self.process = process

    def __call__(self, x):
        """Invest everything in the asset with the highest one-step-ahead forecast return."""
        n_samples, n_channels, lookback, n_assets = x.shape

        assert n_channels == 1

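        # Move to CPU first so the conversion also works when x lives on a GPU
        x_np = x.detach().cpu().numpy()  # (n_samples, n_channels, lookback, n_assets)
        weights_list = [forecast(x_np[i, 0], self.process.coefs, None, 1).argmax() for i in range(n_samples)]

        # Allocate the weights with the same dtype and on the same device as the input
        result = torch.zeros(n_samples, n_assets, dtype=x.dtype, device=x.device)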

        for i, w_ix in enumerate(weights_list):
            result[i, w_ix] = 1

        return result


coefs = np.load('var_coefs.npy')  # (lookback, n_assets, n_assets) = (12, 8, 8)

# Parameters
lookback, _, n_assets = coefs.shape
gap, horizon = 0, 1
batch_size = 256

# Simulate returns
process = VARProcess(coefs, None, np.eye(n_assets) * 1e-5)
data = process.simulate_var(10000)
n_timesteps = len(data)

# Create features and targets
X_list, y_list = [], []

for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(data[i - lookback: i, :])
    y_list.append(data[i + gap: i + gap + horizon, :])

X = np.stack(X_list, axis=0)[:, None, ...]
y = np.stack(y_list, axis=0)[:, None, ...]

# Setup deepdow framework
dataset = InRAMDataset(X, y)

network = LinearNet(1, lookback, n_assets, p=0.5)
dataloader = RigidDataLoader(dataset,
                             indices=list(range(5000)),
                             batch_size=batch_size,
                             lookback=lookback)
val_dataloaders = {'train': dataloader,
                   'val': RigidDataLoader(dataset,
                                          indices=list(range(5020, 9800)),
                                          batch_size=batch_size,
                                          lookback=lookback)}

run = Run(network,
          100 * MeanReturns(),
          dataloader,
          val_dataloaders=val_dataloaders,
          metrics={'sqweights': SquaredWeights()},
          benchmarks={'1overN': OneOverN(),
                      'VAR': VARTrue(process),
                      'Random': Random(),
                      'InverseVol': InverseVolatility()},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback('val', 'loss')]
          )

history = run.launch(40)

fig, ax = plt.subplots(1, 1)
ax.set_title('Validation loss')

per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value'].mean()['val']['loss']
our = per_epoch_results['network']
our.plot(ax=ax, label='network')

ax.hlines(y=per_epoch_results['VAR'], xmin=0, xmax=len(our), color='red', label='VAR')
ax.hlines(y=per_epoch_results['1overN'], xmin=0, xmax=len(our), color='green', label='1overN')
ax.hlines(y=per_epoch_results['Random'], xmin=0, xmax=len(our), color='yellow', label='Random')
ax.hlines(y=per_epoch_results['InverseVol'], xmin=0, xmax=len(our), color='black', label='InverseVol')

plt.legend()
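
The allocation rule used by the VARTrue benchmark can also be written out without statsmodels. The following is only a minimal sketch, under these assumptions: the coefficient array has shape (n_lags, n_assets, n_assets), coefs[k] multiplies the return lagged by k + 1 steps, there is no intercept, and the rows of each lookback window run from oldest to newest. The helper names one_step_forecast and max_return_weights are hypothetical and are not part of deepdow or statsmodels.

```python
import numpy as np


def one_step_forecast(window, coefs):
    """Return r_hat_t = sum_k coefs[k] @ r_{t-1-k} for a single lookback window.

    window : array of shape (lookback, n_assets), rows ordered oldest to newest
    coefs  : array of shape (n_lags, n_assets, n_assets); coefs[k] multiplies lag k + 1
             (assumed layout, see the lead-in)
    """
    n_lags = coefs.shape[0]
    assert window.shape[0] >= n_lags, "window must cover all lags"

    r_hat = np.zeros(window.shape[1])
    for k in range(n_lags):
        # window[-1 - k] is the return observed k + 1 steps before the forecast date
        r_hat += coefs[k] @ window[-1 - k]
    return r_hat


def max_return_weights(x, coefs):
    """Map x of shape (n_samples, 1, lookback, n_assets) to one-hot weights."""
    n_samples, n_channels, _, n_assets = x.shape
    assert n_channels == 1

    weights = np.zeros((n_samples, n_assets))
    for i in range(n_samples):
        best_asset = one_step_forecast(x[i, 0], coefs).argmax()
        weights[i, best_asset] = 1.0  # put the whole budget into one asset
    return weights


# Toy process with 2 assets and 1 lag: each asset's next return equals
# the other asset's most recent return.
toy_coefs = np.array([[[0.0, 1.0],
                       [1.0, 0.0]]])
toy_x = np.array([[[[0.01, 0.03]]],    # forecast (0.03, 0.01): asset 0 wins
                  [[[0.02, -0.01]]]])  # forecast (-0.01, 0.02): asset 1 wins
toy_weights = max_return_weights(toy_x, toy_coefs)
```

Each row of toy_weights sums to one and is fully concentrated in a single asset, which would give a squared-weights value of 1, the maximum for a long-only allocation.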

Total running time of the script: 1 minute 21.651 seconds
