Vector autoregression

This example demonstrates how to validate deepdow on synthetic data. We model returns with a vector autoregression (VAR) model, which expresses the returns of each period as a linear function of lagged returns. See [Lütkepohl2005] for more details. We use a stable VAR process with 12 lags and 8 assets, that is,

\[r_t = A_1 r_{t-1} + \dots + A_{12} r_{t-12}\]
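To make the recursion above concrete, the following is a minimal NumPy sketch of simulating a stable VAR process with 12 lags and 8 assets. It is not the code used by this example, which relies on statsmodels for the simulation. The function names, the random coefficients, the rescaling trick, and the additive Gaussian noise term (which the equation above does not show) are all assumptions made for illustration.

```python
import numpy as np


def companion_matrix(coefs):
    """Stack VAR(p) coefficients of shape (p, n, n) into the (n*p, n*p) companion matrix."""
    p, n, _ = coefs.shape
    top = np.concatenate(list(coefs), axis=1)  # (n, n*p) block row [A_1 ... A_p]
    bottom = np.eye(n * (p - 1), n * p)        # identity blocks that shift the lags
    return np.vstack([top, bottom])


def is_stable(coefs):
    """A VAR(p) is stable if all companion-matrix eigenvalues lie inside the unit circle."""
    eigvals = np.linalg.eigvals(companion_matrix(coefs))
    return bool(np.max(np.abs(eigvals)) < 1.0)


def simulate_var(coefs, n_steps, noise_scale=0.01, seed=0):
    """Simulate r_t = A_1 r_{t-1} + ... + A_p r_{t-p} + noise (the noise is an assumption)."""
    rng = np.random.default_rng(seed)
    p, n, _ = coefs.shape
    r = np.zeros((n_steps + p, n))  # the first p rows are zero initial conditions
    for t in range(p, n_steps + p):
        lagged = sum(coefs[i] @ r[t - 1 - i] for i in range(p))
        r[t] = lagged + noise_scale * rng.standard_normal(n)
    return r[p:]


rng = np.random.default_rng(42)
n_lags, n_assets = 12, 8

# Hypothetical coefficients: random matrices rescaled so that their infinity
# norms sum to 0.9 < 1, which is a sufficient condition for stability.
coefs = rng.uniform(-1, 1, size=(n_lags, n_assets, n_assets))
coefs *= 0.9 / sum(np.linalg.norm(A, ord=np.inf) for A in coefs)

returns = simulate_var(coefs, n_steps=1000)
print(is_stable(coefs), returns.shape)
```

Here each A_i is an 8 × 8 matrix and r_t is the vector of the 8 asset returns at time t. The eigenvalue check is the standard stability criterion for a VAR process; the norm-based rescaling is only one convenient way to obtain coefficients that satisfy it.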

For this task, we use the LinearNet network. It closely resembles VAR because it fits a linear model on all lagged variables. However, it also includes purely deep learning components such as dropout, batch normalization, and a softmax allocator.
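As a rough illustration of what a softmax allocator does, the sketch below maps raw per-asset scores to portfolio weights that are positive and sum to one. It is a standalone example using only the standard library, not deepdow's implementation; the function name and the `temperature` parameter are assumptions introduced here.

```python
import math


def softmax_allocate(scores, temperature=1.0):
    """Map raw per-asset scores to long-only weights that sum to 1."""
    scaled = [s / temperature for s in scores]
    shift = max(scaled)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores, e.g. the output of a linear layer for four assets
weights = softmax_allocate([0.2, -0.1, 0.05, 0.0])
print(weights)
```

Because every weight is strictly positive, such an allocator can approach, but never exactly reach, an allocation that puts everything into a single asset.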

To put the performance of our network into context, we create a benchmark, VARTrue, that has access to the true parameters of the VAR process. It follows a simple investment rule: invest all resources in the asset with the highest predicted future return. We also consider the following benchmarks:

  • equally weighted portfolio

  • inverse volatility

  • random allocation
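The benchmark allocations described above can be sketched in a few lines of NumPy. This is an illustrative approximation rather than the benchmark classes used by the example: the function names, the stand-in coefficients and return history, and the particular way of drawing a random allocation are all assumptions.

```python
import numpy as np


def var_forecast(coefs, history):
    """One-step forecast from coefs (p, n, n) and history (at least p rows, latest last)."""
    p = coefs.shape[0]
    return sum(coefs[i] @ history[-1 - i] for i in range(p))


def one_hot_best(predicted_returns):
    """Put all resources into the asset with the highest predicted return."""
    w = np.zeros_like(predicted_returns, dtype=float)
    w[np.argmax(predicted_returns)] = 1.0
    return w


def equally_weighted(n_assets):
    return np.full(n_assets, 1.0 / n_assets)


def inverse_volatility(history):
    """Weights proportional to the inverse of each asset's sample volatility."""
    inv_vol = 1.0 / history.std(axis=0)
    return inv_vol / inv_vol.sum()


def random_allocation(n_assets, rng):
    """One possible random rule: normalize uniform draws so they sum to 1."""
    raw = rng.random(n_assets)
    return raw / raw.sum()


rng = np.random.default_rng(0)
n_lags, n_assets = 12, 8
coefs = rng.normal(scale=0.05, size=(n_lags, n_assets, n_assets))  # stand-in for the true parameters
history = rng.normal(scale=0.01, size=(50, n_assets))              # stand-in for past returns

allocations = {
    "VARTrue": one_hot_best(var_forecast(coefs, history)),
    "1overN": equally_weighted(n_assets),
    "InverseVol": inverse_volatility(history),
    "Random": random_allocation(n_assets, rng),
}
for name, w in allocations.items():
    print(f"{name:>10}: sum={w.sum():.3f}  sum of squares={np.sum(w ** 2):.3f}")
```

The sum of squared weights is a simple concentration measure: it equals 1/8 = 0.125 for an equally weighted portfolio of 8 assets and 1.0 when everything is invested in a single asset. If the sqweights metric in the results denotes this quantity (an assumption), this matches the values reported for the 1overN and VAR benchmarks.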

References

[Lütkepohl2005]

Lütkepohl, Helmut. New introduction to multiple time series analysis. Springer Science & Business Media, 2005.

Warning

Note that this example relies on the statsmodels package to simulate the VAR process.

Validation loss
model       metric     epoch  dataloader
1overN      loss       -1     train        -0.000
                              val          -0.001
            sqweights  -1     train         0.125
                              val           0.125
InverseVol  loss       -1     train        -0.001
                              val          -0.002
            sqweights  -1     train         0.144
                              val           0.144
Random      loss       -1     train         0.000
                              val          -0.002
            sqweights  -1     train         0.166
                              val           0.166
VAR         loss       -1     train        -0.165
                              val          -0.173
            sqweights  -1     train         1.000
                              val           1.000

Epoch 0: 100%|##########| 20/20 [00:00<00:00, 25.39it/s, loss=-0.00097, sqweights=0.16151, train_loss=-0.00042, train_sqweights=0.12532, val_loss=-0.00143, val_sqweights=0.12532]

Epoch 1: 100%|##########| 20/20 [00:00<00:00, 28.62it/s, loss=-0.00715, sqweights=0.16393, train_loss=-0.00081, train_sqweights=0.12558, val_loss=-0.00174, val_sqweights=0.12558]

Epoch 2: 100%|##########| 20/20 [00:00<00:00, 28.50it/s, loss=-0.01471, sqweights=0.16865, train_loss=-0.00236, train_sqweights=0.12602, val_loss=-0.00308, val_sqweights=0.12601]

Epoch 3: 100%|##########| 20/20 [00:00<00:00, 28.09it/s, loss=-0.02034, sqweights=0.17616, train_loss=-0.00807, train_sqweights=0.12868, val_loss=-0.00803, val_sqweights=0.12860]

Epoch 4: 100%|##########| 20/20 [00:00<00:00, 25.19it/s, loss=-0.02732, sqweights=0.18631, train_loss=-0.02261, train_sqweights=0.14234, val_loss=-0.02058, val_sqweights=0.14172]

Epoch 5: 100%|##########| 20/20 [00:00<00:00, 28.48it/s, loss=-0.03585, sqweights=0.20151, train_loss=-0.03857, train_sqweights=0.16563, val_loss=-0.03423, val_sqweights=0.16385]

Epoch 6: 100%|##########| 20/20 [00:00<00:00, 28.71it/s, loss=-0.04294, sqweights=0.21770, train_loss=-0.04907, train_sqweights=0.18192, val_loss=-0.04307, val_sqweights=0.17912]

Epoch 7: 100%|##########| 20/20 [00:00<00:00, 28.63it/s, loss=-0.05021, sqweights=0.24026, train_loss=-0.05846, train_sqweights=0.19793, val_loss=-0.05079, val_sqweights=0.19403]

Epoch 8: 100%|##########| 20/20 [00:00<00:00, 25.31it/s, loss=-0.05576, sqweights=0.26132, train_loss=-0.06729, train_sqweights=0.21366, val_loss=-0.05804, val_sqweights=0.20864]

Epoch 9: 100%|##########| 20/20 [00:00<00:00, 28.64it/s, loss=-0.06589, sqweights=0.28838, train_loss=-0.07634, train_sqweights=0.23347, val_loss=-0.06536, val_sqweights=0.22704]

Epoch 10: 100%|##########| 20/20 [00:00<00:00, 28.47it/s, loss=-0.06694, sqweights=0.31392, train_loss=-0.08468, train_sqweights=0.25275, val_loss=-0.07217, val_sqweights=0.24505]

Epoch 11: 100%|##########| 20/20 [00:00<00:00, 25.29it/s, loss=-0.07431, sqweights=0.34098, train_loss=-0.09252, train_sqweights=0.27437, val_loss=-0.07872, val_sqweights=0.26547]

Epoch 12: 100%|##########| 20/20 [00:00<00:00, 28.63it/s, loss=-0.08082, sqweights=0.37141, train_loss=-0.09992, train_sqweights=0.29623, val_loss=-0.08477, val_sqweights=0.28630]

Epoch 13: 100%|##########| 20/20 [00:00<00:00, 28.71it/s, loss=-0.08661, sqweights=0.39485, train_loss=-0.10717, train_sqweights=0.31931, val_loss=-0.09070, val_sqweights=0.30857]

Epoch 14: 100%|##########| 20/20 [00:00<00:00, 28.65it/s, loss=-0.09038, sqweights=0.42349, train_loss=-0.11393, train_sqweights=0.34360, val_loss=-0.09623, val_sqweights=0.33230]

Epoch 15: 100%|##########| 20/20 [00:00<00:00, 25.08it/s, loss=-0.09784, sqweights=0.45317, train_loss=-0.11962, train_sqweights=0.36693, val_loss=-0.10127, val_sqweights=0.35531]

Epoch 16: 100%|##########| 20/20 [00:00<00:00, 28.41it/s, loss=-0.09652, sqweights=0.47463, train_loss=-0.12510, train_sqweights=0.39131, val_loss=-0.10593, val_sqweights=0.37915]

Epoch 17: 100%|##########| 20/20 [00:00<00:00, 28.50it/s, loss=-0.10154, sqweights=0.49935, train_loss=-0.13009, train_sqweights=0.41318, val_loss=-0.11020, val_sqweights=0.40025]

Epoch 18: 100%|##########| 20/20 [00:00<00:00, 28.28it/s, loss=-0.10006, sqweights=0.51996, train_loss=-0.13460, train_sqweights=0.43363, val_loss=-0.11358, val_sqweights=0.42063]

Epoch 19: 100%|##########| 20/20 [00:00<00:00, 25.05it/s, loss=-0.10615, sqweights=0.54193, train_loss=-0.13871, train_sqweights=0.45245, val_loss=-0.11689, val_sqweights=0.43975]

Epoch 20: 100%|##########| 20/20 [00:00<00:00, 28.58it/s, loss=-0.10801, sqweights=0.56010, train_loss=-0.14286, train_sqweights=0.47312, val_loss=-0.12025, val_sqweights=0.46021]

Epoch 21: 100%|##########| 20/20 [00:00<00:00, 28.49it/s, loss=-0.11065, sqweights=0.58123, train_loss=-0.14646, train_sqweights=0.49422, val_loss=-0.12295, val_sqweights=0.48178]

Epoch 22: 100%|##########| 20/20 [00:00<00:00, 28.80it/s, loss=-0.11393, sqweights=0.60073, train_loss=-0.14965, train_sqweights=0.51333, val_loss=-0.12524, val_sqweights=0.50138]

Epoch 23: 100%|##########| 20/20 [00:00<00:00, 25.07it/s, loss=-0.11644, sqweights=0.61450, train_loss=-0.15257, train_sqweights=0.53450, val_loss=-0.12720, val_sqweights=0.52205]

Epoch 24: 100%|##########| 20/20 [00:00<00:00, 28.62it/s, loss=-0.11501, sqweights=0.63979, train_loss=-0.15528, train_sqweights=0.55425, val_loss=-0.12911, val_sqweights=0.54195]

Epoch 25: 100%|##########| 20/20 [00:00<00:00, 28.40it/s, loss=-0.11777, sqweights=0.65735, train_loss=-0.15804, train_sqweights=0.56977, val_loss=-0.13129, val_sqweights=0.55731]

Epoch 26: 100%|##########| 20/20 [00:00<00:00, 25.02it/s, loss=-0.11880, sqweights=0.67061, train_loss=-0.16052, train_sqweights=0.58600, val_loss=-0.13296, val_sqweights=0.57441]

Epoch 27: 100%|##########| 20/20 [00:00<00:00, 28.71it/s, loss=-0.12130, sqweights=0.68916, train_loss=-0.16300, train_sqweights=0.60217, val_loss=-0.13436, val_sqweights=0.59069]

Epoch 28: 100%|##########| 20/20 [00:00<00:00, 28.41it/s, loss=-0.11946, sqweights=0.69629, train_loss=-0.16474, train_sqweights=0.61958, val_loss=-0.13552, val_sqweights=0.60827]

Epoch 29: 100%|##########| 20/20 [00:00<00:00, 28.38it/s, loss=-0.12438, sqweights=0.71193, train_loss=-0.16721, train_sqweights=0.63485, val_loss=-0.13792, val_sqweights=0.62302]

Epoch 30: 100%|##########| 20/20 [00:00<00:00, 24.68it/s, loss=-0.12153, sqweights=0.72677, train_loss=-0.16903, train_sqweights=0.65153, val_loss=-0.13948, val_sqweights=0.64036]

Epoch 31: 100%|##########| 20/20 [00:00<00:00, 28.34it/s, loss=-0.12052, sqweights=0.73523, train_loss=-0.17078, train_sqweights=0.66463, val_loss=-0.14122, val_sqweights=0.65455]

Epoch 32: 100%|##########| 20/20 [00:00<00:00, 28.64it/s, loss=-0.12072, sqweights=0.74882, train_loss=-0.17233, train_sqweights=0.67712, val_loss=-0.14255, val_sqweights=0.66758]

Epoch 33: 100%|##########| 20/20 [00:00<00:00, 28.42it/s, loss=-0.12787, sqweights=0.75658, train_loss=-0.17346, train_sqweights=0.69138, val_loss=-0.14394, val_sqweights=0.68146]

Epoch 34: 100%|##########| 20/20 [00:00<00:00, 24.71it/s, loss=-0.12896, sqweights=0.77431, train_loss=-0.17472, train_sqweights=0.70752, val_loss=-0.14512, val_sqweights=0.69850]

Epoch 35: 100%|##########| 20/20 [00:00<00:00, 28.33it/s, loss=-0.12783, sqweights=0.78574, train_loss=-0.17617, train_sqweights=0.72451, val_loss=-0.14609, val_sqweights=0.71640]

Epoch 36: 100%|##########| 20/20 [00:00<00:00, 81.98it/s, loss=-0.12399, sqweights=0.79473, train_loss=-0.17754, train_sqweights=0.73596, val_loss=-0.14668, val_sqweights=0.72757]
Epoch 36: 100%|##########| 20/20 [00:00<00:00, 28.38it/s, loss=-0.12399, sqweights=0.79473, train_loss=-0.17754, train_sqweights=0.73596, val_loss=-0.14668, val_sqweights=0.72757]

Epoch 37: 100%|##########| 20/20 [00:00<00:00, 28.40it/s, loss=-0.12934, sqweights=0.80686, train_loss=-0.17867, train_sqweights=0.75027, val_loss=-0.14649, val_sqweights=0.74207]

Epoch 38: 100%|##########| 20/20 [00:00<00:00, 24.85it/s, loss=-0.12587, sqweights=0.81840, train_loss=-0.17957, train_sqweights=0.76224, val_loss=-0.14681, val_sqweights=0.75387]

Epoch 39: 100%|##########| 20/20 [00:00<00:00, 28.48it/s, loss=-0.13250, sqweights=0.82744, train_loss=-0.18083, train_sqweights=0.77741, val_loss=-0.14666, val_sqweights=0.76888]
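
The sqweights values in the log are easier to read against a reference scale. The sketch below assumes the metric is the sum of squared portfolio weights, which is consistent with the 0.125 reported for the equally weighted benchmark on 8 assets. The helper `squared_weights` is a hypothetical stand-in written for illustration, not deepdow's implementation.

```python
def squared_weights(weights):
    """Sum of squared portfolio weights (hypothetical helper, not deepdow's API)."""
    return sum(w ** 2 for w in weights)


n_assets = 8

# Equally weighted portfolio: every asset gets 1 / n_assets
uniform = [1 / n_assets] * n_assets

# Fully concentrated portfolio: everything in the first asset
one_hot = [1.0] + [0.0] * (n_assets - 1)

uniform_sq = squared_weights(uniform)  # lower bound, equals 1 / n_assets
one_hot_sq = squared_weights(one_hot)  # upper bound, equals 1

print(uniform_sq, one_hot_sq)
```

On this scale, a val_sqweights that rises from about 0.72 to 0.77 over epochs 35 to 39 suggests the network puts most of its allocation into a single asset, which is the same kind of all-in rule the VAR benchmark follows.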


import numpy as np
import torch

import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VARProcess, forecast

from deepdow.benchmarks import OneOverN, Benchmark, InverseVolatility, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader
from deepdow.losses import MeanReturns, SquaredWeights
from deepdow.nn import LinearNet
from deepdow.experiments import Run


class VARTrue(Benchmark):
    """Benchmark representing the ground truth return process.

    Parameters
    ----------
    process : statsmodels.tsa.vector_ar.var_model.VARProcess
        The ground truth VAR process that generates the returns.

    """

    def __init__(self, process):
        self.process = process

    def __call__(self, x):
        """Invest all money into the asset with the highest return over the horizon."""
        n_samples, n_channels, lookback, n_assets = x.shape

        assert n_channels == 1

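        # Move to the CPU first so the conversion also works for tensors on a GPU
        x_np = x.detach().cpu().numpy()  # (n_samples, n_channels, lookback, n_assets)
        # One-step-ahead forecast per sample; keep the index of the best asset
        weights_list = [forecast(x_np[i, 0], self.process.coefs, None, 1).argmax() for i in range(n_samples)]

        # Allocate directly with the dtype and device of the input
        result = torch.zeros(n_samples, n_assets, dtype=x.dtype, device=x.device)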

        for i, w_ix in enumerate(weights_list):
            result[i, w_ix] = 1

        return result


coefs = np.load('var_coefs.npy')  # (lookback, n_assets, n_assets) = (12, 8, 8)

# Parameters
lookback, _, n_assets = coefs.shape
gap, horizon = 0, 1
batch_size = 256

# Simulate returns
process = VARProcess(coefs, None, np.eye(n_assets) * 1e-5)
data = process.simulate_var(10000)
n_timesteps = len(data)

# Create features and targets
X_list, y_list = [], []

for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(data[i - lookback: i, :])
    y_list.append(data[i + gap: i + gap + horizon, :])

X = np.stack(X_list, axis=0)[:, None, ...]
y = np.stack(y_list, axis=0)[:, None, ...]

# Set up the deepdow framework
dataset = InRAMDataset(X, y)

network = LinearNet(1, lookback, n_assets, p=0.5)
dataloader = RigidDataLoader(dataset,
                             indices=list(range(5000)),
                             batch_size=batch_size,
                             lookback=lookback)
val_dataloaders = {'train': dataloader,
                   'val': RigidDataLoader(dataset,
                                          indices=list(range(5020, 9800)),
                                          batch_size=batch_size,
                                          lookback=lookback)}

run = Run(network,
          100 * MeanReturns(),
          dataloader,
          val_dataloaders=val_dataloaders,
          metrics={'sqweights': SquaredWeights()},
          benchmarks={'1overN': OneOverN(),
                      'VAR': VARTrue(process),
                      'Random': Random(),
                      'InverseVol': InverseVolatility()},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback('val', 'loss')]
          )

history = run.launch(40)

fig, ax = plt.subplots(1, 1)
ax.set_title('Validation loss')

per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value'].mean()['val']['loss']
our = per_epoch_results['network']
our.plot(ax=ax, label='network')

ax.hlines(y=per_epoch_results['VAR'], xmin=0, xmax=len(our), color='red', label='VAR')
ax.hlines(y=per_epoch_results['1overN'], xmin=0, xmax=len(our), color='green', label='1overN')
ax.hlines(y=per_epoch_results['Random'], xmin=0, xmax=len(our), color='yellow', label='Random')
ax.hlines(y=per_epoch_results['InverseVol'], xmin=0, xmax=len(our), color='black', label='InverseVol')

plt.legend()
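
To make the rule inside `VARTrue.__call__` more concrete, here is a minimal NumPy-only sketch of a one-step forecast r_t = A_1 r_{t-1} + ... + A_p r_{t-p} followed by the all-in allocation. The coefficients and the lookback window are random toy values, `one_step_forecast` is a hypothetical helper and not the statsmodels `forecast` function, and the sketch assumes a zero-mean process whose window stores the most recent observation in its last row.

```python
import numpy as np


def one_step_forecast(window, coefs):
    """Forecast the next return vector of a zero-mean VAR(p).

    window : array of shape (lookback, n_assets), last row is the newest observation
    coefs : array of shape (p, n_assets, n_assets), coefs[k] multiplies the
        return lagged by k + 1 steps
    """
    p = coefs.shape[0]
    # window[-(k + 1)] is the return observed k + 1 steps before the forecast
    return sum(coefs[k] @ window[-(k + 1)] for k in range(p))


# Toy setup (assumed values, far smaller than the 12 lags and 8 assets above)
rng = np.random.default_rng(0)
p, n_assets = 3, 4
coefs = 0.1 * rng.standard_normal((p, n_assets, n_assets))
window = rng.standard_normal((p, n_assets))

pred = one_step_forecast(window, coefs)

# All-in allocation: weight 1 on the asset with the highest forecast return
weights = np.zeros(n_assets)
weights[pred.argmax()] = 1.0

print(pred, weights)
```

Because the benchmark is given the true coefficients, its forecast differs from the realized return only by the noise term of the process, so its loss acts as a reference for what the network could reach by learning the linear map.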

Total running time of the script: ( 0 minutes 35.519 seconds)
