Vector autoregression

This example demonstrates how to validate deepdow on synthetic data. We model returns with a vector autoregression (VAR) model, which expresses future returns as a linear function of lagged returns. See [Lütkepohl2005] for more details. We use a stable VAR process with 12 lags and 8 assets, that is

\[r_t = A_1 r_{t-1} + \dots + A_{12} r_{t-12}\]
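To make the recursion concrete, here is a minimal NumPy sketch of simulating such a process. It is an illustration under stated assumptions, not the code used by this example (which relies on statsmodels): the helper names `make_stable_coefs`, `is_stable` and `simulate_var`, the random coefficient construction, and the added Gaussian noise term with `noise_std` are all hypothetical choices made here.

```python
import numpy as np


def make_stable_coefs(n_lags, n_assets, rng, total_norm=0.9):
    """Draw random lag matrices A_1..A_p whose spectral norms sum to total_norm.

    A sum of spectral norms below 1 is a sufficient (not necessary)
    condition for the VAR process to be stable.
    """
    coefs = rng.normal(size=(n_lags, n_assets, n_assets))
    norms = np.array([np.linalg.norm(a, 2) for a in coefs])
    # Rescale every matrix so that its spectral norm equals total_norm / n_lags.
    return coefs * ((total_norm / n_lags) / norms)[:, None, None]


def is_stable(coefs):
    """Stability check: all companion-matrix eigenvalues lie inside the unit circle."""
    n_lags, n_assets, _ = coefs.shape
    size = n_lags * n_assets
    companion = np.zeros((size, size))
    companion[:n_assets] = np.concatenate(coefs, axis=1)
    companion[n_assets:, :-n_assets] = np.eye(size - n_assets)
    return bool(np.max(np.abs(np.linalg.eigvals(companion))) < 1)


def simulate_var(coefs, n_steps, rng, noise_std=0.01):
    """Simulate r_t = A_1 r_{t-1} + ... + A_p r_{t-p} + noise (noise is an assumption)."""
    n_lags, n_assets, _ = coefs.shape
    returns = np.zeros((n_steps + n_lags, n_assets))
    for t in range(n_lags, n_steps + n_lags):
        lagged = returns[t - n_lags:t][::-1]  # lagged[i] holds r_{t-1-i}
        noise = rng.normal(scale=noise_std, size=n_assets)
        returns[t] = np.einsum("lij,lj->i", coefs, lagged) + noise
    return returns[n_lags:]  # drop the zero-valued warm-up rows


rng = np.random.default_rng(0)
coefs = make_stable_coefs(n_lags=12, n_assets=8, rng=rng)
returns = simulate_var(coefs, n_steps=1000, rng=rng)
print(coefs.shape, is_stable(coefs), returns.shape)
```

The noise term is included because a stable VAR without shocks would simply decay towards zero from any starting point.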

For this task, we use the LinearNet network. It is very similar to VAR in that it fits a linear model of all lagged variables. However, it also contains purely deep learning components such as dropout, batch normalization and a softmax allocator.
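The core idea of a linear model followed by a softmax allocator can be sketched in a few lines of NumPy. This is a simplified illustration, not the LinearNet implementation: dropout and batch normalization are omitted, and the function names and tensor shapes below are assumptions made for the example.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def linear_allocation(lagged_returns, weight, bias):
    """Map lagged returns to portfolio weights.

    lagged_returns: array of shape (batch, n_lags, n_assets)
    weight:         array of shape (n_lags * n_assets, n_assets)
    bias:           array of shape (n_assets,)
    """
    # Flatten all lagged variables into one feature vector per sample.
    flat = lagged_returns.reshape(lagged_returns.shape[0], -1)
    # Linear model of all lagged variables, one score per asset.
    scores = flat @ weight + bias
    # The softmax turns scores into nonnegative weights that sum to one.
    return softmax(scores)


rng = np.random.default_rng(0)
x = rng.normal(scale=0.01, size=(4, 12, 8))
w = linear_allocation(x, rng.normal(size=(12 * 8, 8)), np.zeros(8))
print(w.shape, w.sum(axis=1))
```

Unlike a fitted VAR, which outputs return forecasts, this mapping outputs portfolio weights directly, so it can be trained against a portfolio-level loss.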

To put the performance of our network into context, we create a benchmark VARTrue that has access to the true parameters of the VAR process. Its investment rule is simple: invest all resources into the asset with the highest predicted future return. Additionally, we consider the following benchmarks:

  • equally weighted portfolio

  • inverse volatility

  • random allocation
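The benchmark allocation rules described above can be sketched as follows. This is a hedged illustration in NumPy rather than the benchmark classes used by the example: all function names are hypothetical, the random allocation is assumed to be uniform draws normalized to sum to one, and the VARTrue rule is assumed to use the one-step forecast from the true coefficient matrices.

```python
import numpy as np


def equally_weighted(n_assets):
    """Assign the same weight 1/N to every asset."""
    return np.full(n_assets, 1.0 / n_assets)


def inverse_volatility(past_returns):
    """Weight each asset proportionally to 1 / std of its past returns.

    past_returns: array of shape (n_timesteps, n_assets)
    """
    inv_vol = 1.0 / past_returns.std(axis=0)
    return inv_vol / inv_vol.sum()


def random_allocation(n_assets, rng):
    """Random nonnegative weights normalized to sum to one (an assumed scheme)."""
    raw = rng.random(n_assets)
    return raw / raw.sum()


def var_true(coefs, lagged):
    """Invest everything in the asset with the highest forecast return.

    coefs:  true lag matrices, shape (n_lags, n_assets, n_assets)
    lagged: lagged returns, shape (n_lags, n_assets), lagged[i] = r_{t-1-i}
    """
    forecast = np.einsum("lij,lj->i", coefs, lagged)
    weights = np.zeros(forecast.shape[0])
    weights[np.argmax(forecast)] = 1.0
    return weights


def sum_squared_weights(weights):
    """Concentration measure: 1/N for equal weights, 1 for a single-asset portfolio."""
    return float(np.sum(np.asarray(weights) ** 2))


rng = np.random.default_rng(0)
past = rng.normal(scale=0.01, size=(50, 8))
print(sum_squared_weights(equally_weighted(8)))
print(sum_squared_weights(inverse_volatility(past)))
print(sum_squared_weights(random_allocation(8, rng)))
```

With 8 assets, equal weighting gives a sum of squared weights of 8 × (1/8)² = 0.125, and a single-asset portfolio gives 1. These agree with the sqweights values reported for 1overN and VAR in the output, assuming sqweights denotes the sum of squared weights.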

References

[Lütkepohl2005] Lütkepohl, Helmut. New Introduction to Multiple Time Series Analysis. Springer Science & Business Media, 2005.

Warning

Note that we are using the statsmodels package to simulate the VAR process.

Validation loss

Out:

model       metric     epoch  dataloader
1overN      loss       -1     train        -0.001
                              val           0.001
            sqweights  -1     train         0.125
                              val           0.125
InverseVol  loss       -1     train         0.000
                              val           0.001
            sqweights  -1     train         0.143
                              val           0.144
Random      loss       -1     train        -0.002
                              val          -0.000
            sqweights  -1     train         0.166
                              val           0.166
VAR         loss       -1     train        -0.165
                              val          -0.165
            sqweights  -1     train         1.000
                              val           1.000

Epoch 0: 100%|##########| 20/20 [00:01<00:00, 19.95it/s, loss=-0.00288, sqweights=0.16540, train_loss=-0.00077, train_sqweights=0.12547, val_loss=0.00060, val_sqweights=0.12547]

Epoch 1: 100%|##########| 20/20 [00:00<00:00, 20.37it/s, loss=-0.00955, sqweights=0.16779, train_loss=-0.00126, train_sqweights=0.12548, val_loss=0.00026, val_sqweights=0.12548]

Epoch 2: 100%|##########| 20/20 [00:00<00:00, 20.32it/s, loss=-0.01716, sqweights=0.17424, train_loss=-0.00309, train_sqweights=0.12583, val_loss=-0.00114, val_sqweights=0.12583]

Epoch 3: 100%|##########| 20/20 [00:00<00:00, 20.17it/s, loss=-0.02467, sqweights=0.18288, train_loss=-0.00955, train_sqweights=0.12860, val_loss=-0.00623, val_sqweights=0.12851]

Epoch 4: 100%|##########| 20/20 [00:00<00:00, 20.60it/s, loss=-0.03198, sqweights=0.19599, train_loss=-0.02559, train_sqweights=0.14419, val_loss=-0.01898, val_sqweights=0.14353]

Epoch 5: 100%|##########| 20/20 [00:00<00:00, 20.95it/s, loss=-0.03842, sqweights=0.21177, train_loss=-0.04246, train_sqweights=0.17091, val_loss=-0.03239, val_sqweights=0.16899]

Epoch 6: 100%|##########| 20/20 [00:00<00:00, 20.03it/s, loss=-0.04649, sqweights=0.23087, train_loss=-0.05350, train_sqweights=0.19012, val_loss=-0.04095, val_sqweights=0.18720]

Epoch 7: 100%|##########| 20/20 [00:00<00:00, 20.51it/s, loss=-0.05221, sqweights=0.25189, train_loss=-0.06237, train_sqweights=0.20745, val_loss=-0.04790, val_sqweights=0.20353]

Epoch 8: 100%|##########| 20/20 [00:01<00:00, 19.70it/s, loss=-0.06146, sqweights=0.27915, train_loss=-0.07103, train_sqweights=0.22669, val_loss=-0.05453, val_sqweights=0.22182]

Epoch 9: 100%|##########| 20/20 [00:00<00:00, 20.13it/s, loss=-0.06659, sqweights=0.30566, train_loss=-0.07920, train_sqweights=0.24630, val_loss=-0.06090, val_sqweights=0.24061]

Epoch 10: 100%|##########| 20/20 [00:00<00:00, 20.08it/s, loss=-0.07150, sqweights=0.32960, train_loss=-0.08704, train_sqweights=0.26821, val_loss=-0.06697, val_sqweights=0.26182]

Epoch 11: 100%|##########| 20/20 [00:00<00:00, 20.08it/s, loss=-0.07912, sqweights=0.36376, train_loss=-0.09448, train_sqweights=0.29043, val_loss=-0.07264, val_sqweights=0.28301]

Epoch 12: 100%|##########| 20/20 [00:01<00:00, 55.99it/s, loss=-0.08529, sqweights=0.38385, train_loss=-0.10177, train_sqweights=0.31316, val_loss=-0.07801, val_sqweights=0.30497]
Epoch 12: 100%|##########| 20/20 [00:01<00:00, 18.55it/s, loss=-0.08529, sqweights=0.38385, train_loss=-0.10177, train_sqweights=0.31316, val_loss=-0.07801, val_sqweights=0.30497]

Epoch 13: 100%|##########| 20/20 [00:00<00:00, 20.95it/s, loss=-0.08739, sqweights=0.41536, train_loss=-0.10866, train_sqweights=0.33491, val_loss=-0.08335, val_sqweights=0.32615]

Epoch 14: 100%|##########| 20/20 [00:00<00:00, 21.08it/s, loss=-0.09325, sqweights=0.43977, train_loss=-0.11493, train_sqweights=0.35693, val_loss=-0.08805, val_sqweights=0.34775]

Epoch 15: 100%|##########| 20/20 [00:00<00:00, 20.46it/s, loss=-0.09914, sqweights=0.46808, train_loss=-0.12084, train_sqweights=0.37954, val_loss=-0.09259, val_sqweights=0.37010]

Epoch 16: 100%|##########| 20/20 [00:00<00:00, 20.67it/s, loss=-0.10015, sqweights=0.48683, train_loss=-0.12640, train_sqweights=0.39960, val_loss=-0.09708, val_sqweights=0.39013]

Epoch 17: 100%|##########| 20/20 [00:00<00:00, 21.08it/s, loss=-0.10361, sqweights=0.51386, train_loss=-0.13193, train_sqweights=0.42244, val_loss=-0.10106, val_sqweights=0.41237]

Epoch 18: 100%|##########| 20/20 [00:00<00:00, 20.96it/s, loss=-0.10801, sqweights=0.53815, train_loss=-0.13675, train_sqweights=0.44299, val_loss=-0.10469, val_sqweights=0.43283]

Epoch 19: 100%|##########| 20/20 [00:00<00:00, 20.14it/s, loss=-0.10911, sqweights=0.54902, train_loss=-0.14124, train_sqweights=0.46551, val_loss=-0.10808, val_sqweights=0.45523]

Epoch 20: 100%|##########| 20/20 [00:00<00:00, 20.16it/s, loss=-0.11211, sqweights=0.57000, train_loss=-0.14540, train_sqweights=0.48705, val_loss=-0.11068, val_sqweights=0.47659]

Epoch 21: 100%|##########| 20/20 [00:00<00:00, 20.06it/s, loss=-0.11881, sqweights=0.59629, train_loss=-0.14933, train_sqweights=0.50852, val_loss=-0.11311, val_sqweights=0.49755]

Epoch 22: 100%|##########| 20/20 [00:01<00:00, 19.87it/s, loss=-0.11348, sqweights=0.61366, train_loss=-0.15279, train_sqweights=0.52925, val_loss=-0.11512, val_sqweights=0.51775]

Epoch 23: 100%|##########| 20/20 [00:00<00:00, 21.08it/s, loss=-0.11762, sqweights=0.63031, train_loss=-0.15616, train_sqweights=0.54901, val_loss=-0.11769, val_sqweights=0.53709]

Epoch 24: 100%|##########| 20/20 [00:00<00:00, 20.71it/s, loss=-0.11685, sqweights=0.64804, train_loss=-0.15927, train_sqweights=0.56680, val_loss=-0.11986, val_sqweights=0.55481]

Epoch 25: 100%|##########| 20/20 [00:00<00:00, 20.36it/s, loss=-0.11899, sqweights=0.66476, train_loss=-0.16168, train_sqweights=0.58507, val_loss=-0.12139, val_sqweights=0.57264]

Epoch 26: 100%|##########| 20/20 [00:01<00:00, 19.90it/s, loss=-0.12109, sqweights=0.67216, train_loss=-0.16399, train_sqweights=0.60200, val_loss=-0.12221, val_sqweights=0.59061]

Epoch 27: 100%|##########| 20/20 [00:01<00:00, 19.57it/s, loss=-0.12673, sqweights=0.69272, train_loss=-0.16650, train_sqweights=0.62133, val_loss=-0.12383, val_sqweights=0.61012]

Epoch 28: 100%|##########| 20/20 [00:01<00:00, 18.84it/s, loss=-0.12768, sqweights=0.71278, train_loss=-0.16876, train_sqweights=0.64257, val_loss=-0.12485, val_sqweights=0.63175]

Epoch 29: 100%|##########| 20/20 [00:01<00:00, 19.81it/s, loss=-0.12104, sqweights=0.72749, train_loss=-0.17131, train_sqweights=0.65909, val_loss=-0.12672, val_sqweights=0.64824]

Epoch 30: 100%|##########| 20/20 [00:00<00:00, 20.91it/s, loss=-0.12012, sqweights=0.73901, train_loss=-0.17311, train_sqweights=0.67373, val_loss=-0.12776, val_sqweights=0.66318]

Epoch 31: 100%|##########| 20/20 [00:01<00:00, 19.80it/s, loss=-0.12120, sqweights=0.74500, train_loss=-0.17489, train_sqweights=0.68688, val_loss=-0.12891, val_sqweights=0.67670]

Epoch 32: 100%|##########| 20/20 [00:00<00:00, 20.08it/s, loss=-0.12575, sqweights=0.76359, train_loss=-0.17586, train_sqweights=0.69895, val_loss=-0.12929, val_sqweights=0.68945]

Epoch 33: 100%|##########| 20/20 [00:00<00:00, 20.73it/s, loss=-0.12202, sqweights=0.77039, train_loss=-0.17729, train_sqweights=0.71010, val_loss=-0.12949, val_sqweights=0.70052]

Epoch 34: 100%|##########| 20/20 [00:00<00:00, 20.41it/s, loss=-0.13032, sqweights=0.78411, train_loss=-0.17825, train_sqweights=0.72441, val_loss=-0.13013, val_sqweights=0.71443]

Epoch 35: 100%|##########| 20/20 [00:00<00:00, 20.72it/s, loss=-0.12809, sqweights=0.79120, train_loss=-0.17925, train_sqweights=0.73658, val_loss=-0.13128, val_sqweights=0.72832]

Epoch 36: 100%|##########| 20/20 [00:00<00:00, 21.46it/s, loss=-0.12940, sqweights=0.80209, train_loss=-0.18108, train_sqweights=0.75063, val_loss=-0.13236, val_sqweights=0.74129]

Epoch 37: 100%|##########| 20/20 [00:00<00:00, 20.24it/s, loss=-0.12610, sqweights=0.81552, train_loss=-0.18246, train_sqweights=0.76487, val_loss=-0.13340, val_sqweights=0.75547]

Epoch 38: 100%|##########| 20/20 [00:00<00:00, 20.54it/s, loss=-0.13442, sqweights=0.82249, train_loss=-0.18311, train_sqweights=0.77367, val_loss=-0.13428, val_sqweights=0.76580]

Epoch 39: 100%|##########| 20/20 [00:01<00:00, 19.93it/s, loss=-0.12003, sqweights=0.82844, train_loss=-0.18411, train_sqweights=0.77734, val_loss=-0.13446, val_sqweights=0.77034]
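
The `sqweights` column in the log above can be read as a concentration measure. The following is a minimal sketch, assuming the metric is the sum of squared portfolio weights per sample; that assumption is consistent with the 0.125 reported for the equally weighted benchmark over 8 assets. The helper name `squared_weights` is hypothetical and is not part of the deepdow API.

```python
import numpy as np


def squared_weights(weights):
    """Sum of squared weights along the last axis (one row = one portfolio).

    Hypothetical helper for illustration only, not deepdow's implementation.
    """
    weights = np.asarray(weights, dtype=float)
    return (weights ** 2).sum(axis=-1)


n_assets = 8
equal = np.full(n_assets, 1 / n_assets)  # equally weighted portfolio
one_hot = np.eye(n_assets)[3]            # everything invested in a single asset

print(squared_weights(equal))    # 0.125
print(squared_weights(one_hot))  # 1.0
```

Under this reading, the values around 0.8 in the later epochs suggest that the network places most of its weight on a single asset per sample, moving toward the one-hot allocation used by the VAR benchmark.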


import numpy as np
import torch

import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VARProcess, forecast

from deepdow.benchmarks import OneOverN, Benchmark, InverseVolatility, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader
from deepdow.losses import MeanReturns, SquaredWeights
from deepdow.nn import LinearNet
from deepdow.experiments import Run


class VARTrue(Benchmark):
    """Benchmark representing the ground truth return process.

    Parameters
    ----------
    process : statsmodels.tsa.vector_ar.var_model.VARProcess
        The ground truth VAR process that generates the returns.

    """

    def __init__(self, process):
        self.process = process

    def __call__(self, x):
        """Invest all money into the asset with the highest return over the horizon."""
        n_samples, n_channels, lookback, n_assets = x.shape

        assert n_channels == 1

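        # Move to CPU first so that the conversion also works for tensors living on a GPU
        x_np = x.detach().cpu().numpy()  # (n_samples, n_channels, lookback, n_assets)

        # One-step-ahead forecast with the true coefficients, then pick the best asset
        weights_list = [forecast(x_np[i, 0], self.process.coefs, None, 1).argmax() for i in range(n_samples)]

        result = torch.zeros(n_samples, n_assets, dtype=x.dtype, device=x.device)

        for i, w_ix in enumerate(weights_list):
            result[i, w_ix] = 1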

        return result


coefs = np.load('../examples/var_coefs.npy')  # (lookback, n_assets, n_assets) = (12, 8, 8)

# Parameters
lookback, _, n_assets = coefs.shape
gap, horizon = 0, 1
batch_size = 256

# Simulate returns
process = VARProcess(coefs, None, np.eye(n_assets) * 1e-5)
data = process.simulate_var(10000)
n_timesteps = len(data)

# Create features and targets
X_list, y_list = [], []

for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(data[i - lookback: i, :])
    y_list.append(data[i + gap: i + gap + horizon, :])

X = np.stack(X_list, axis=0)[:, None, ...]
y = np.stack(y_list, axis=0)[:, None, ...]

# Setup deepdow framework
dataset = InRAMDataset(X, y)

network = LinearNet(1, lookback, n_assets, p=0.5)
dataloader = RigidDataLoader(dataset,
                             indices=list(range(5000)),
                             batch_size=batch_size,
                             lookback=lookback)
val_dataloaders = {'train': dataloader,
                   'val': RigidDataLoader(dataset,
                                          indices=list(range(5020, 9800)),
                                          batch_size=batch_size,
                                          lookback=lookback)}

run = Run(network,
          100 * MeanReturns(),
          dataloader,
          val_dataloaders=val_dataloaders,
          metrics={'sqweights': SquaredWeights()},
          benchmarks={'1overN': OneOverN(),
                      'VAR': VARTrue(process),
                      'Random': Random(),
                      'InverseVol': InverseVolatility()},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback('val', 'loss')]
          )

history = run.launch(40)

fig, ax = plt.subplots(1, 1)
ax.set_title('Validation loss')

per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value'].mean()['val']['loss']
our = per_epoch_results['network']
our.plot(ax=ax, label='network')

ax.hlines(y=per_epoch_results['VAR'], xmin=0, xmax=len(our), color='red', label='VAR')
ax.hlines(y=per_epoch_results['1overN'], xmin=0, xmax=len(our), color='green', label='1overN')
ax.hlines(y=per_epoch_results['Random'], xmin=0, xmax=len(our), color='yellow', label='Random')
ax.hlines(y=per_epoch_results['InverseVol'], xmin=0, xmax=len(our), color='black', label='InverseVol')

plt.legend()
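
The one-step forecast that `VARTrue` relies on can also be written out by hand. The following is a minimal NumPy sketch, assuming a zero-intercept VAR in which `coefs[k]` multiplies the return observed `k + 1` steps ago. The names `var_one_step` and `one_hot_allocation` are hypothetical helpers, and the toy coefficients are made up for illustration rather than loaded from `var_coefs.npy`.

```python
import numpy as np


def var_one_step(history, coefs):
    """One-step-ahead forecast of a zero-intercept VAR(p).

    history : array of shape (lookback, n_assets), oldest observation first
    coefs   : array of shape (p, n_assets, n_assets); coefs[k] is assumed
              to multiply the return observed k + 1 steps ago
    """
    forecast = np.zeros(coefs.shape[1])
    for k in range(coefs.shape[0]):
        forecast += coefs[k] @ history[-1 - k]
    return forecast


def one_hot_allocation(forecast):
    """Put the whole budget into the asset with the highest forecast."""
    weights = np.zeros_like(forecast)
    weights[forecast.argmax()] = 1.0
    return weights


# Toy example: 2 lags, 3 assets (illustrative numbers only)
coefs = np.stack([0.5 * np.eye(3), 0.1 * np.eye(3)])
history = np.array([[0.01, 0.02, 0.03],    # r_{t-2}
                    [0.04, -0.01, 0.02]])  # r_{t-1}

pred = var_one_step(history, coefs)
weights = one_hot_allocation(pred)
print(pred, weights)
```

In the toy example the first asset has the highest predicted return, so it receives the full allocation. Because the benchmark uses the true coefficients, its loss can serve as a reference point for what a learned linear model could reach on this data.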

Total running time of the script: ( 0 minutes 48.810 seconds)
