Vector autoregression

This example demonstrates how to validate deepdow on synthetic data. We model returns with a vector autoregression (VAR) model, which expresses future returns as a linear function of lagged returns. See [Lütkepohl2005] for more details. We use a stable VAR process with 12 lags and 8 assets, that is

\[r_t = A_1 r_{t-1} + \dots + A_{12} r_{t-12}\]
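As a rough illustration of the recursion above, the sketch below simulates such a process using only the Python standard library. The function names, the Gaussian noise term and the coefficient scaling are assumptions made for this illustration, not the code used in the example. Scaling the lag matrices so that their row-sum norms add up to less than one is a sufficient (not necessary) condition for stability.

```python
import random


def make_stable_coefs(n_lags, n_assets, total_norm=0.9, seed=0):
    """Draw random lag matrices A_1..A_p whose row-sum norms add up to total_norm.

    Keeping the sum of induced infinity norms below one is a sufficient
    condition for stability (an assumption of this sketch, not necessarily
    how the original example builds its coefficients).
    """
    rng = random.Random(seed)
    coefs = []
    for _ in range(n_lags):
        a = [[rng.uniform(-1.0, 1.0) for _ in range(n_assets)] for _ in range(n_assets)]
        norm = max(sum(abs(x) for x in row) for row in a)  # infinity norm
        scale = total_norm / (n_lags * norm)
        coefs.append([[x * scale for x in row] for row in a])
    return coefs


def simulate_var(coefs, n_steps, noise_std=0.01, seed=0):
    """Simulate r_t = A_1 r_{t-1} + ... + A_p r_{t-p} + noise, starting from zeros."""
    rng = random.Random(seed)
    n_lags, n_assets = len(coefs), len(coefs[0])
    history = [[0.0] * n_assets for _ in range(n_lags)]  # r_{-p}, ..., r_{-1}
    for _ in range(n_steps):
        # Noise term: assumed here so that the simulated path does not decay to zero.
        r_t = [rng.gauss(0.0, noise_std) for _ in range(n_assets)]
        for lag, a in enumerate(coefs, start=1):
            past = history[-lag]  # r_{t-lag}
            for i in range(n_assets):
                r_t[i] += sum(a[i][j] * past[j] for j in range(n_assets))
        history.append(r_t)
    return history[n_lags:]  # drop the zero initial values


coefs = make_stable_coefs(n_lags=12, n_assets=8)
returns = simulate_var(coefs, n_steps=500)
```

Here `returns` holds 500 time steps of 8 simulated asset returns each.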

For this task, we use the LinearNet network. It closely resembles VAR because it also fits a linear model of all lagged variables. However, it additionally contains typical deep learning components such as dropout, batch normalization and a softmax allocator.
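To make the comparison with VAR concrete, the following sketch shows the core idea in plain Python: flatten all lagged returns into one vector, apply a single linear map to obtain one score per asset, and turn the scores into portfolio weights with a softmax. This is not deepdow's LinearNet implementation. Dropout and batch normalization are omitted, and the function names and toy parameter values are hypothetical.

```python
import math


def softmax(scores):
    """Map arbitrary scores to positive weights that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_allocate(lagged_returns, weight, bias):
    """Linear map over all lagged returns followed by a softmax allocator.

    lagged_returns: list of n_lags rows, each holding n_assets returns
    weight: n_assets rows, each of length n_lags * n_assets
    bias: list of n_assets offsets
    """
    features = [x for row in lagged_returns for x in row]  # flatten the lags
    scores = [
        sum(w * x for w, x in zip(w_row, features)) + b
        for w_row, b in zip(weight, bias)
    ]
    return softmax(scores)


# Toy usage: 2 lags, 3 assets, and a weight matrix that scores each asset
# by its most recent return (the last row of lagged_returns).
lagged = [[0.01, -0.02, 0.00], [0.03, 0.01, -0.01]]
weight = [
    [0, 0, 0, 50, 0, 0],
    [0, 0, 0, 0, 50, 0],
    [0, 0, 0, 0, 0, 50],
]
weights = linear_allocate(lagged, weight, bias=[0.0, 0.0, 0.0])
```

Because of the softmax, the resulting weights are always positive and sum to one, so the network can only produce long-only, fully invested portfolios.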

To put the performance of our network into context, we create a benchmark VARTrue that has access to the true parameters of the VAR process. It follows a simple investment rule: invest all resources in the asset with the highest predicted future return. We also consider the following benchmarks:

  • equally weighted portfolio

  • inverse volatility

  • random allocation
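The allocation rules above can be sketched in a few lines of plain Python. The function names are hypothetical and the details are assumptions rather than deepdow's implementation: inverse volatility uses the sample standard deviation of each asset, and the random allocation normalizes independent uniform draws.

```python
import random
import statistics


def equally_weighted(n_assets):
    """Give every asset the same weight 1/N."""
    return [1.0 / n_assets] * n_assets


def inverse_volatility(returns):
    """Weight each asset proportionally to 1 / (sample std of its returns).

    returns: list of time steps, each a list of per-asset returns
    """
    n_assets = len(returns[0])
    inv_vols = [
        1.0 / statistics.stdev([step[i] for step in returns])
        for i in range(n_assets)
    ]
    total = sum(inv_vols)
    return [v / total for v in inv_vols]


def random_allocation(n_assets, rng):
    """Normalize independent uniform draws so that they sum to one."""
    draws = [rng.random() for _ in range(n_assets)]
    total = sum(draws)
    return [d / total for d in draws]


def all_in_on_best(predicted_returns):
    """Put the whole budget into the asset with the highest predicted return."""
    best = max(range(len(predicted_returns)), key=predicted_returns.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(predicted_returns))]


def sum_squared_weights(weights):
    """Concentration measure: 1/N for equal weights, 1 for a single asset."""
    return sum(w * w for w in weights)


# Toy usage with made-up numbers.
history = [[0.01, 0.04, -0.02], [-0.01, -0.02, 0.02], [0.02, 0.00, -0.04]]
w_equal = equally_weighted(8)
w_invvol = inverse_volatility(history)
w_random = random_allocation(8, random.Random(0))
w_best = all_in_on_best([0.001, 0.004, -0.002])
```

With 8 assets, `sum_squared_weights` gives 0.125 for the equally weighted portfolio and 1.0 for the all-in rule. Assuming the `sqweights` metric denotes the sum of squared weights, this is consistent with the values reported for 1overN and VAR in the output of this example.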

References

[Lütkepohl2005] Lütkepohl, Helmut. New Introduction to Multiple Time Series Analysis. Springer Science & Business Media, 2005.

Warning

Note that we are using the statsmodels package to simulate the VAR process.

Figure: Validation loss

Out:

model       metric     epoch  dataloader
1overN      loss       -1     train         0.001
                              val          -0.001
            sqweights  -1     train         0.125
                              val           0.125
InverseVol  loss       -1     train         0.001
                              val          -0.002
            sqweights  -1     train         0.144
                              val           0.145
Random      loss       -1     train         0.000
                              val           0.000
            sqweights  -1     train         0.166
                              val           0.166
VAR         loss       -1     train        -0.173
                              val          -0.174
            sqweights  -1     train         1.000
                              val           1.000

Epoch 0: 100%|##########| 20/20 [00:00<00:00, 21.34it/s, loss=-0.00089, sqweights=0.16791, train_loss=0.00094, train_sqweights=0.12549, val_loss=-0.00057, val_sqweights=0.12549]

Epoch 1: 100%|##########| 20/20 [00:00<00:00, 23.22it/s, loss=-0.00846, sqweights=0.17010, train_loss=0.00050, train_sqweights=0.12562, val_loss=-0.00090, val_sqweights=0.12563]

Epoch 2: 100%|##########| 20/20 [00:00<00:00, 23.26it/s, loss=-0.01353, sqweights=0.17665, train_loss=-0.00124, train_sqweights=0.12610, val_loss=-0.00224, val_sqweights=0.12610]

Epoch 3: 100%|##########| 20/20 [00:00<00:00, 21.45it/s, loss=-0.02061, sqweights=0.18466, train_loss=-0.00750, train_sqweights=0.12905, val_loss=-0.00721, val_sqweights=0.12900]

Epoch 4: 100%|##########| 20/20 [00:00<00:00, 23.24it/s, loss=-0.03172, sqweights=0.19712, train_loss=-0.02353, train_sqweights=0.14478, val_loss=-0.02007, val_sqweights=0.14437]

Epoch 5: 100%|##########| 20/20 [00:00<00:00, 23.28it/s, loss=-0.03510, sqweights=0.20913, train_loss=-0.04063, train_sqweights=0.17116, val_loss=-0.03392, val_sqweights=0.17005]

Epoch 6: 100%|##########| 20/20 [00:00<00:00, 21.38it/s, loss=-0.04151, sqweights=0.22852, train_loss=-0.05169, train_sqweights=0.18852, val_loss=-0.04297, val_sqweights=0.18683]

Epoch 7: 100%|##########| 20/20 [00:00<00:00, 23.18it/s, loss=-0.05086, sqweights=0.25049, train_loss=-0.06090, train_sqweights=0.20398, val_loss=-0.05062, val_sqweights=0.20173]

Epoch 8: 100%|##########| 20/20 [00:00<00:00, 23.28it/s, loss=-0.05826, sqweights=0.27189, train_loss=-0.07016, train_sqweights=0.22189, val_loss=-0.05829, val_sqweights=0.21920]

Epoch 9: 100%|##########| 20/20 [00:00<00:00, 21.37it/s, loss=-0.06579, sqweights=0.30222, train_loss=-0.07917, train_sqweights=0.24050, val_loss=-0.06552, val_sqweights=0.23722]

Epoch 10: 100%|##########| 20/20 [00:00<00:00, 23.27it/s, loss=-0.07370, sqweights=0.32880, train_loss=-0.08768, train_sqweights=0.26154, val_loss=-0.07237, val_sqweights=0.25764]

Epoch 11:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 11:   5%|5         | 1/20 [00:00<00:00, 48.43it/s, loss=-0.08275, sqweights=0.34105]
Epoch 11:  10%|#         | 2/20 [00:00<00:00, 55.99it/s, loss=-0.07369, sqweights=0.33976]
Epoch 11:  15%|#5        | 3/20 [00:00<00:00, 59.22it/s, loss=-0.07831, sqweights=0.34105]
Epoch 11:  20%|##        | 4/20 [00:00<00:00, 60.90it/s, loss=-0.08237, sqweights=0.34122]
Epoch 11:  25%|##5       | 5/20 [00:00<00:00, 61.90it/s, loss=-0.07932, sqweights=0.34337]
Epoch 11:  30%|###       | 6/20 [00:00<00:00, 62.56it/s, loss=-0.07868, sqweights=0.34455]
Epoch 11:  35%|###5      | 7/20 [00:00<00:00, 63.08it/s, loss=-0.07868, sqweights=0.34455]
Epoch 11:  35%|###5      | 7/20 [00:00<00:00, 63.08it/s, loss=-0.07493, sqweights=0.34321]
Epoch 11:  40%|####      | 8/20 [00:00<00:00, 63.08it/s, loss=-0.07448, sqweights=0.34361]
Epoch 11:  45%|####5     | 9/20 [00:00<00:00, 63.08it/s, loss=-0.07475, sqweights=0.34491]
Epoch 11:  50%|#####     | 10/20 [00:00<00:00, 63.08it/s, loss=-0.07513, sqweights=0.34512]
Epoch 11:  55%|#####5    | 11/20 [00:00<00:00, 63.08it/s, loss=-0.07662, sqweights=0.34650]
Epoch 11:  60%|######    | 12/20 [00:00<00:00, 63.08it/s, loss=-0.07747, sqweights=0.34814]
Epoch 11:  65%|######5   | 13/20 [00:00<00:00, 63.08it/s, loss=-0.07848, sqweights=0.34929]
Epoch 11:  70%|#######   | 14/20 [00:00<00:00, 64.78it/s, loss=-0.07848, sqweights=0.34929]
Epoch 11:  70%|#######   | 14/20 [00:00<00:00, 64.78it/s, loss=-0.07961, sqweights=0.35002]
Epoch 11:  75%|#######5  | 15/20 [00:00<00:00, 64.78it/s, loss=-0.07878, sqweights=0.35113]
Epoch 11:  80%|########  | 16/20 [00:00<00:00, 64.78it/s, loss=-0.07838, sqweights=0.35242]
Epoch 11:  85%|########5 | 17/20 [00:00<00:00, 64.78it/s, loss=-0.07842, sqweights=0.35351]
Epoch 11:  90%|######### | 18/20 [00:00<00:00, 64.78it/s, loss=-0.07788, sqweights=0.35405]
Epoch 11:  95%|#########5| 19/20 [00:00<00:00, 64.78it/s, loss=-0.07792, sqweights=0.35477]
Epoch 11: 100%|##########| 20/20 [00:00<00:00, 64.78it/s, loss=-0.07729, sqweights=0.35499]
Epoch 11: 100%|##########| 20/20 [00:00<00:00, 64.78it/s, loss=-0.07729, sqweights=0.35499, train_loss=-0.09582, train_sqweights=0.28274, val_loss=-0.07876, val_sqweights=0.27797]
Epoch 11: 100%|##########| 20/20 [00:00<00:00, 23.22it/s, loss=-0.07729, sqweights=0.35499, train_loss=-0.09582, train_sqweights=0.28274, val_loss=-0.07876, val_sqweights=0.27797]

Epoch 12:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 12:   5%|5         | 1/20 [00:00<00:00, 48.79it/s, loss=-0.08471, sqweights=0.36337]
Epoch 12:  10%|#         | 2/20 [00:00<00:00, 56.39it/s, loss=-0.08585, sqweights=0.36754]
Epoch 12:  15%|#5        | 3/20 [00:00<00:00, 58.76it/s, loss=-0.08223, sqweights=0.37128]
Epoch 12:  20%|##        | 4/20 [00:00<00:00, 60.58it/s, loss=-0.08584, sqweights=0.37608]
Epoch 12:  25%|##5       | 5/20 [00:00<00:00, 61.70it/s, loss=-0.08819, sqweights=0.37633]
Epoch 12:  30%|###       | 6/20 [00:00<00:00, 62.45it/s, loss=-0.08225, sqweights=0.37474]
Epoch 12:  35%|###5      | 7/20 [00:00<00:00, 63.04it/s, loss=-0.08225, sqweights=0.37474]
Epoch 12:  35%|###5      | 7/20 [00:00<00:00, 63.04it/s, loss=-0.08323, sqweights=0.37490]
Epoch 12:  40%|####      | 8/20 [00:00<00:00, 63.04it/s, loss=-0.08275, sqweights=0.37459]
Epoch 12:  45%|####5     | 9/20 [00:00<00:00, 63.04it/s, loss=-0.08151, sqweights=0.37539]
Epoch 12:  50%|#####     | 10/20 [00:00<00:00, 63.04it/s, loss=-0.08217, sqweights=0.37608]
Epoch 12:  55%|#####5    | 11/20 [00:00<00:00, 63.04it/s, loss=-0.08177, sqweights=0.37551]
Epoch 12:  60%|######    | 12/20 [00:00<00:00, 63.04it/s, loss=-0.08443, sqweights=0.37657]
Epoch 12:  65%|######5   | 13/20 [00:00<00:00, 63.04it/s, loss=-0.08378, sqweights=0.37572]
Epoch 12:  70%|#######   | 14/20 [00:00<00:00, 64.92it/s, loss=-0.08378, sqweights=0.37572]
Epoch 12:  70%|#######   | 14/20 [00:00<00:00, 64.92it/s, loss=-0.08411, sqweights=0.37670]
Epoch 12:  75%|#######5  | 15/20 [00:00<00:00, 64.92it/s, loss=-0.08491, sqweights=0.37815]
Epoch 12:  80%|########  | 16/20 [00:00<00:00, 64.92it/s, loss=-0.08561, sqweights=0.37964]
Epoch 12:  85%|########5 | 17/20 [00:00<00:00, 64.92it/s, loss=-0.08717, sqweights=0.37979]
Epoch 12:  90%|######### | 18/20 [00:00<00:00, 64.92it/s, loss=-0.08669, sqweights=0.38047]
Epoch 12:  95%|#########5| 19/20 [00:00<00:00, 64.92it/s, loss=-0.08588, sqweights=0.38009]
Epoch 12: 100%|##########| 20/20 [00:00<00:00, 64.92it/s, loss=-0.08574, sqweights=0.38137]
Epoch 12: 100%|##########| 20/20 [00:00<00:00, 64.92it/s, loss=-0.08574, sqweights=0.38137, train_loss=-0.10355, train_sqweights=0.30575, val_loss=-0.08490, val_sqweights=0.30023]
Epoch 12: 100%|##########| 20/20 [00:00<00:00, 21.38it/s, loss=-0.08574, sqweights=0.38137, train_loss=-0.10355, train_sqweights=0.30575, val_loss=-0.08490, val_sqweights=0.30023]

Epoch 13:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 13:   5%|5         | 1/20 [00:00<00:00, 48.05it/s, loss=-0.08998, sqweights=0.42066]
Epoch 13:  10%|#         | 2/20 [00:00<00:00, 55.49it/s, loss=-0.09197, sqweights=0.40202]
Epoch 13:  15%|#5        | 3/20 [00:00<00:00, 58.76it/s, loss=-0.09240, sqweights=0.40136]
Epoch 13:  20%|##        | 4/20 [00:00<00:00, 60.58it/s, loss=-0.08685, sqweights=0.39863]
Epoch 13:  25%|##5       | 5/20 [00:00<00:00, 61.69it/s, loss=-0.08815, sqweights=0.39772]
Epoch 13:  30%|###       | 6/20 [00:00<00:00, 62.28it/s, loss=-0.08772, sqweights=0.39826]
Epoch 13:  35%|###5      | 7/20 [00:00<00:00, 62.92it/s, loss=-0.08772, sqweights=0.39826]
Epoch 13:  35%|###5      | 7/20 [00:00<00:00, 62.92it/s, loss=-0.09007, sqweights=0.39858]
Epoch 13:  40%|####      | 8/20 [00:00<00:00, 62.92it/s, loss=-0.09228, sqweights=0.40014]
Epoch 13:  45%|####5     | 9/20 [00:00<00:00, 62.92it/s, loss=-0.09217, sqweights=0.39954]
Epoch 13:  50%|#####     | 10/20 [00:00<00:00, 62.92it/s, loss=-0.09081, sqweights=0.40007]
Epoch 13:  55%|#####5    | 11/20 [00:00<00:00, 62.92it/s, loss=-0.08810, sqweights=0.40162]
Epoch 13:  60%|######    | 12/20 [00:00<00:00, 62.92it/s, loss=-0.08876, sqweights=0.40201]
Epoch 13:  65%|######5   | 13/20 [00:00<00:00, 62.92it/s, loss=-0.08826, sqweights=0.40320]
Epoch 13:  70%|#######   | 14/20 [00:00<00:00, 64.84it/s, loss=-0.08826, sqweights=0.40320]
Epoch 13:  70%|#######   | 14/20 [00:00<00:00, 64.84it/s, loss=-0.08858, sqweights=0.40282]
Epoch 13:  75%|#######5  | 15/20 [00:00<00:00, 64.84it/s, loss=-0.08905, sqweights=0.40405]
Epoch 13:  80%|########  | 16/20 [00:00<00:00, 64.84it/s, loss=-0.08836, sqweights=0.40530]
Epoch 13:  85%|########5 | 17/20 [00:00<00:00, 64.84it/s, loss=-0.08875, sqweights=0.40705]
Epoch 13:  90%|######### | 18/20 [00:00<00:00, 64.84it/s, loss=-0.08897, sqweights=0.40731]
Epoch 13:  95%|#########5| 19/20 [00:00<00:00, 64.84it/s, loss=-0.08892, sqweights=0.40877]
Epoch 13: 100%|##########| 20/20 [00:00<00:00, 64.84it/s, loss=-0.08891, sqweights=0.40909]
Epoch 13: 100%|##########| 20/20 [00:00<00:00, 64.84it/s, loss=-0.08891, sqweights=0.40909, train_loss=-0.11080, train_sqweights=0.32952, val_loss=-0.09047, val_sqweights=0.32356]
Epoch 13: 100%|##########| 20/20 [00:00<00:00, 23.21it/s, loss=-0.08891, sqweights=0.40909, train_loss=-0.11080, train_sqweights=0.32952, val_loss=-0.09047, val_sqweights=0.32356]

Epoch 14:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 14:   5%|5         | 1/20 [00:00<00:00, 48.67it/s, loss=-0.09347, sqweights=0.43040]
Epoch 14:  10%|#         | 2/20 [00:00<00:00, 56.05it/s, loss=-0.09072, sqweights=0.42517]
Epoch 14:  15%|#5        | 3/20 [00:00<00:00, 59.15it/s, loss=-0.09520, sqweights=0.42528]
Epoch 14:  20%|##        | 4/20 [00:00<00:00, 60.83it/s, loss=-0.09392, sqweights=0.42286]
Epoch 14:  25%|##5       | 5/20 [00:00<00:00, 61.33it/s, loss=-0.10022, sqweights=0.42318]
Epoch 14:  30%|###       | 6/20 [00:00<00:00, 62.07it/s, loss=-0.09754, sqweights=0.42714]
Epoch 14:  35%|###5      | 7/20 [00:00<00:00, 62.51it/s, loss=-0.09754, sqweights=0.42714]
Epoch 14:  35%|###5      | 7/20 [00:00<00:00, 62.51it/s, loss=-0.09489, sqweights=0.42541]
Epoch 14:  40%|####      | 8/20 [00:00<00:00, 62.51it/s, loss=-0.09498, sqweights=0.42650]
Epoch 14:  45%|####5     | 9/20 [00:00<00:00, 62.51it/s, loss=-0.09590, sqweights=0.43020]
Epoch 14:  50%|#####     | 10/20 [00:00<00:00, 62.51it/s, loss=-0.09343, sqweights=0.43143]
Epoch 14:  55%|#####5    | 11/20 [00:00<00:00, 62.51it/s, loss=-0.09191, sqweights=0.43187]
Epoch 14:  60%|######    | 12/20 [00:00<00:00, 62.51it/s, loss=-0.09403, sqweights=0.43410]
Epoch 14:  65%|######5   | 13/20 [00:00<00:00, 62.51it/s, loss=-0.09397, sqweights=0.43501]
Epoch 14:  70%|#######   | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09397, sqweights=0.43501]
Epoch 14:  70%|#######   | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09208, sqweights=0.43505]
Epoch 14:  75%|#######5  | 15/20 [00:00<00:00, 64.95it/s, loss=-0.09298, sqweights=0.43851]
Epoch 14:  80%|########  | 16/20 [00:00<00:00, 64.95it/s, loss=-0.09483, sqweights=0.43947]
Epoch 14:  85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.09508, sqweights=0.43884]
Epoch 14:  90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.09639, sqweights=0.43967]
Epoch 14:  95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.09618, sqweights=0.44175]
Epoch 14: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.09701, sqweights=0.44113]
Epoch 14: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.09701, sqweights=0.44113, train_loss=-0.11767, train_sqweights=0.35347, val_loss=-0.09568, val_sqweights=0.34675]
Epoch 14: 100%|##########| 20/20 [00:00<00:00, 23.20it/s, loss=-0.09701, sqweights=0.44113, train_loss=-0.11767, train_sqweights=0.35347, val_loss=-0.09568, val_sqweights=0.34675]

Epoch 15:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 15:   5%|5         | 1/20 [00:00<00:00, 47.52it/s, loss=-0.07776, sqweights=0.44499]
Epoch 15:  10%|#         | 2/20 [00:00<00:00, 54.87it/s, loss=-0.10117, sqweights=0.45217]
Epoch 15:  15%|#5        | 3/20 [00:00<00:00, 58.29it/s, loss=-0.10106, sqweights=0.44935]
Epoch 15:  20%|##        | 4/20 [00:00<00:00, 60.31it/s, loss=-0.09864, sqweights=0.45530]
Epoch 15:  25%|##5       | 5/20 [00:00<00:00, 61.51it/s, loss=-0.10088, sqweights=0.45591]
Epoch 15:  30%|###       | 6/20 [00:00<00:00, 62.29it/s, loss=-0.10025, sqweights=0.45670]
Epoch 15:  35%|###5      | 7/20 [00:00<00:00, 62.87it/s, loss=-0.10025, sqweights=0.45670]
Epoch 15:  35%|###5      | 7/20 [00:00<00:00, 62.87it/s, loss=-0.09813, sqweights=0.45867]
Epoch 15:  40%|####      | 8/20 [00:00<00:00, 62.87it/s, loss=-0.09467, sqweights=0.45524]
Epoch 15:  45%|####5     | 9/20 [00:00<00:00, 62.87it/s, loss=-0.09798, sqweights=0.45479]
Epoch 15:  50%|#####     | 10/20 [00:00<00:00, 62.87it/s, loss=-0.09848, sqweights=0.45526]
Epoch 15:  55%|#####5    | 11/20 [00:00<00:00, 62.87it/s, loss=-0.09785, sqweights=0.45649]
Epoch 15:  60%|######    | 12/20 [00:00<00:00, 62.87it/s, loss=-0.09831, sqweights=0.45588]
Epoch 15:  65%|######5   | 13/20 [00:00<00:00, 62.87it/s, loss=-0.09704, sqweights=0.45700]
Epoch 15:  70%|#######   | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09704, sqweights=0.45700]
Epoch 15:  70%|#######   | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09717, sqweights=0.45790]
Epoch 15:  75%|#######5  | 15/20 [00:00<00:00, 64.95it/s, loss=-0.09846, sqweights=0.46001]
Epoch 15:  80%|########  | 16/20 [00:00<00:00, 64.95it/s, loss=-0.09781, sqweights=0.46047]
Epoch 15:  85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.09654, sqweights=0.46121]
Epoch 15:  90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.09635, sqweights=0.46112]
Epoch 15:  95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.09673, sqweights=0.46178]
Epoch 15: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.09626, sqweights=0.46267]
Epoch 15: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.09626, sqweights=0.46267, train_loss=-0.12389, train_sqweights=0.37840, val_loss=-0.10036, val_sqweights=0.37072]
Epoch 15: 100%|##########| 20/20 [00:00<00:00, 21.30it/s, loss=-0.09626, sqweights=0.46267, train_loss=-0.12389, train_sqweights=0.37840, val_loss=-0.10036, val_sqweights=0.37072]

Epoch 16:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 16:   5%|5         | 1/20 [00:00<00:00, 48.63it/s, loss=-0.09268, sqweights=0.46160]
Epoch 16:  10%|#         | 2/20 [00:00<00:00, 56.01it/s, loss=-0.08807, sqweights=0.47416]
Epoch 16:  15%|#5        | 3/20 [00:00<00:00, 59.09it/s, loss=-0.09411, sqweights=0.47416]
Epoch 16:  20%|##        | 4/20 [00:00<00:00, 60.34it/s, loss=-0.10074, sqweights=0.47628]
Epoch 16:  25%|##5       | 5/20 [00:00<00:00, 61.44it/s, loss=-0.10077, sqweights=0.48216]
Epoch 16:  30%|###       | 6/20 [00:00<00:00, 62.28it/s, loss=-0.10006, sqweights=0.48039]
Epoch 16:  35%|###5      | 7/20 [00:00<00:00, 62.65it/s, loss=-0.10006, sqweights=0.48039]
Epoch 16:  35%|###5      | 7/20 [00:00<00:00, 62.65it/s, loss=-0.10164, sqweights=0.48168]
Epoch 16:  40%|####      | 8/20 [00:00<00:00, 62.65it/s, loss=-0.09959, sqweights=0.48064]
Epoch 16:  45%|####5     | 9/20 [00:00<00:00, 62.65it/s, loss=-0.10108, sqweights=0.48233]
Epoch 16:  50%|#####     | 10/20 [00:00<00:00, 62.65it/s, loss=-0.10088, sqweights=0.48523]
Epoch 16:  55%|#####5    | 11/20 [00:00<00:00, 62.65it/s, loss=-0.09957, sqweights=0.48803]
Epoch 16:  60%|######    | 12/20 [00:00<00:00, 62.65it/s, loss=-0.09863, sqweights=0.48878]
Epoch 16:  65%|######5   | 13/20 [00:00<00:00, 62.65it/s, loss=-0.09849, sqweights=0.48949]
Epoch 16:  70%|#######   | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09849, sqweights=0.48949]
Epoch 16:  70%|#######   | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09799, sqweights=0.48896]
Epoch 16:  75%|#######5  | 15/20 [00:00<00:00, 64.95it/s, loss=-0.10049, sqweights=0.48989]
Epoch 16:  80%|########  | 16/20 [00:00<00:00, 64.95it/s, loss=-0.10252, sqweights=0.49118]
Epoch 16:  85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.10268, sqweights=0.49209]
Epoch 16:  90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.10146, sqweights=0.49103]
Epoch 16:  95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.10214, sqweights=0.49107]
Epoch 16: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.10438, sqweights=0.49232]
Epoch 16: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.10438, sqweights=0.49232, train_loss=-0.12939, train_sqweights=0.40008, val_loss=-0.10448, val_sqweights=0.39169]
Epoch 16: 100%|##########| 20/20 [00:00<00:00, 23.23it/s, loss=-0.10438, sqweights=0.49232, train_loss=-0.12939, train_sqweights=0.40008, val_loss=-0.10448, val_sqweights=0.39169]

Epoch 17:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 17:   5%|5         | 1/20 [00:00<00:00, 48.15it/s, loss=-0.07263, sqweights=0.47088]
Epoch 17:  10%|#         | 2/20 [00:00<00:00, 54.65it/s, loss=-0.07035, sqweights=0.48952]
Epoch 17:  15%|#5        | 3/20 [00:00<00:00, 57.84it/s, loss=-0.08838, sqweights=0.49690]
Epoch 17:  20%|##        | 4/20 [00:00<00:00, 59.79it/s, loss=-0.08892, sqweights=0.49608]
Epoch 17:  25%|##5       | 5/20 [00:00<00:00, 60.91it/s, loss=-0.09458, sqweights=0.50005]
Epoch 17:  30%|###       | 6/20 [00:00<00:00, 61.83it/s, loss=-0.09608, sqweights=0.50258]
Epoch 17:  35%|###5      | 7/20 [00:00<00:00, 62.50it/s, loss=-0.09608, sqweights=0.50258]
Epoch 17:  35%|###5      | 7/20 [00:00<00:00, 62.50it/s, loss=-0.09627, sqweights=0.50353]
Epoch 17:  40%|####      | 8/20 [00:00<00:00, 62.50it/s, loss=-0.09905, sqweights=0.50506]
Epoch 17:  45%|####5     | 9/20 [00:00<00:00, 62.50it/s, loss=-0.10124, sqweights=0.50349]
Epoch 17:  50%|#####     | 10/20 [00:00<00:00, 62.50it/s, loss=-0.10062, sqweights=0.50493]
Epoch 17:  55%|#####5    | 11/20 [00:00<00:00, 62.50it/s, loss=-0.10145, sqweights=0.50558]
Epoch 17:  60%|######    | 12/20 [00:00<00:00, 62.50it/s, loss=-0.10070, sqweights=0.50723]
Epoch 17:  65%|######5   | 13/20 [00:00<00:00, 62.50it/s, loss=-0.10075, sqweights=0.50763]
Epoch 17:  70%|#######   | 14/20 [00:00<00:00, 64.68it/s, loss=-0.10075, sqweights=0.50763]
Epoch 17:  70%|#######   | 14/20 [00:00<00:00, 64.68it/s, loss=-0.09942, sqweights=0.50832]
Epoch 17:  75%|#######5  | 15/20 [00:00<00:00, 64.68it/s, loss=-0.09991, sqweights=0.50942]
Epoch 17:  80%|########  | 16/20 [00:00<00:00, 64.68it/s, loss=-0.09898, sqweights=0.50909]
Epoch 17:  85%|########5 | 17/20 [00:00<00:00, 64.68it/s, loss=-0.09862, sqweights=0.50915]
Epoch 17:  90%|######### | 18/20 [00:00<00:00, 64.68it/s, loss=-0.10007, sqweights=0.50960]
Epoch 17:  95%|#########5| 19/20 [00:00<00:00, 64.68it/s, loss=-0.10178, sqweights=0.51104]
Epoch 17: 100%|##########| 20/20 [00:00<00:00, 64.68it/s, loss=-0.10369, sqweights=0.51329]
Epoch 17: 100%|##########| 20/20 [00:00<00:00, 64.68it/s, loss=-0.10369, sqweights=0.51329, train_loss=-0.13436, train_sqweights=0.42182, val_loss=-0.10850, val_sqweights=0.41218]
Epoch 17: 100%|##########| 20/20 [00:00<00:00, 23.19it/s, loss=-0.10369, sqweights=0.51329, train_loss=-0.13436, train_sqweights=0.42182, val_loss=-0.10850, val_sqweights=0.41218]

Epoch 18:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 18:   5%|5         | 1/20 [00:00<00:00, 48.39it/s, loss=-0.11609, sqweights=0.52120]
Epoch 18:  10%|#         | 2/20 [00:00<00:00, 55.85it/s, loss=-0.12475, sqweights=0.52755]
Epoch 18:  15%|#5        | 3/20 [00:00<00:00, 58.99it/s, loss=-0.11660, sqweights=0.52537]
Epoch 18:  20%|##        | 4/20 [00:00<00:00, 60.73it/s, loss=-0.11349, sqweights=0.52834]
Epoch 18:  25%|##5       | 5/20 [00:00<00:00, 61.86it/s, loss=-0.11398, sqweights=0.53109]
Epoch 18:  30%|###       | 6/20 [00:00<00:00, 62.64it/s, loss=-0.11380, sqweights=0.53258]
Epoch 18:  35%|###5      | 7/20 [00:00<00:00, 63.11it/s, loss=-0.11380, sqweights=0.53258]
Epoch 18:  35%|###5      | 7/20 [00:00<00:00, 63.11it/s, loss=-0.10940, sqweights=0.53307]
Epoch 18:  40%|####      | 8/20 [00:00<00:00, 63.11it/s, loss=-0.10953, sqweights=0.53211]
Epoch 18:  45%|####5     | 9/20 [00:00<00:00, 63.11it/s, loss=-0.10822, sqweights=0.53274]
Epoch 18:  50%|#####     | 10/20 [00:00<00:00, 63.11it/s, loss=-0.10701, sqweights=0.53329]
Epoch 18:  55%|#####5    | 11/20 [00:00<00:00, 63.11it/s, loss=-0.10468, sqweights=0.53457]
Epoch 18:  60%|######    | 12/20 [00:00<00:00, 63.11it/s, loss=-0.10439, sqweights=0.53363]
Epoch 18:  65%|######5   | 13/20 [00:00<00:00, 63.11it/s, loss=-0.10419, sqweights=0.53460]
Epoch 18:  70%|#######   | 14/20 [00:00<00:00, 64.45it/s, loss=-0.10419, sqweights=0.53460]
Epoch 18:  70%|#######   | 14/20 [00:00<00:00, 64.45it/s, loss=-0.10519, sqweights=0.53536]
Epoch 18:  75%|#######5  | 15/20 [00:00<00:00, 64.45it/s, loss=-0.10660, sqweights=0.53565]
Epoch 18:  80%|########  | 16/20 [00:00<00:00, 64.45it/s, loss=-0.10920, sqweights=0.53575]
Epoch 18:  85%|########5 | 17/20 [00:00<00:00, 64.45it/s, loss=-0.10902, sqweights=0.53885]
Epoch 18:  90%|######### | 18/20 [00:00<00:00, 64.45it/s, loss=-0.10874, sqweights=0.53856]
Epoch 18:  95%|#########5| 19/20 [00:00<00:00, 64.45it/s, loss=-0.10819, sqweights=0.53842]
Epoch 18: 100%|##########| 20/20 [00:00<00:00, 64.45it/s, loss=-0.10705, sqweights=0.54016]
Epoch 18: 100%|##########| 20/20 [00:00<00:00, 64.45it/s, loss=-0.10705, sqweights=0.54016, train_loss=-0.13901, train_sqweights=0.44371, val_loss=-0.11223, val_sqweights=0.43362]
Epoch 18: 100%|##########| 20/20 [00:00<00:00, 21.26it/s, loss=-0.10705, sqweights=0.54016, train_loss=-0.13901, train_sqweights=0.44371, val_loss=-0.11223, val_sqweights=0.43362]

Epoch 19:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 19:   5%|5         | 1/20 [00:00<00:00, 48.44it/s, loss=-0.10525, sqweights=0.53310]
Epoch 19:  10%|#         | 2/20 [00:00<00:00, 56.03it/s, loss=-0.10515, sqweights=0.53979]
Epoch 19:  15%|#5        | 3/20 [00:00<00:00, 58.46it/s, loss=-0.10749, sqweights=0.54270]
Epoch 19:  20%|##        | 4/20 [00:00<00:00, 59.96it/s, loss=-0.10776, sqweights=0.54263]
Epoch 19:  25%|##5       | 5/20 [00:00<00:00, 60.80it/s, loss=-0.10895, sqweights=0.54191]
Epoch 19:  30%|###       | 6/20 [00:00<00:00, 61.15it/s, loss=-0.11153, sqweights=0.54491]
Epoch 19:  35%|###5      | 7/20 [00:00<00:00, 61.90it/s, loss=-0.11153, sqweights=0.54491]
Epoch 19:  35%|###5      | 7/20 [00:00<00:00, 61.90it/s, loss=-0.11229, sqweights=0.54411]
Epoch 19:  40%|####      | 8/20 [00:00<00:00, 61.90it/s, loss=-0.11116, sqweights=0.54464]
Epoch 19:  45%|####5     | 9/20 [00:00<00:00, 61.90it/s, loss=-0.11167, sqweights=0.54863]
Epoch 19:  50%|#####     | 10/20 [00:00<00:00, 61.90it/s, loss=-0.11161, sqweights=0.54890]
Epoch 19:  55%|#####5    | 11/20 [00:00<00:00, 61.90it/s, loss=-0.11127, sqweights=0.54939]
Epoch 19:  60%|######    | 12/20 [00:00<00:00, 61.90it/s, loss=-0.11049, sqweights=0.55163]
Epoch 19:  65%|######5   | 13/20 [00:00<00:00, 61.90it/s, loss=-0.10825, sqweights=0.55279]
Epoch 19:  70%|#######   | 14/20 [00:00<00:00, 64.39it/s, loss=-0.10825, sqweights=0.55279]
Epoch 19:  70%|#######   | 14/20 [00:00<00:00, 64.39it/s, loss=-0.10745, sqweights=0.55160]
Epoch 19:  75%|#######5  | 15/20 [00:00<00:00, 64.39it/s, loss=-0.10612, sqweights=0.55343]
Epoch 19:  80%|########  | 16/20 [00:00<00:00, 64.39it/s, loss=-0.10608, sqweights=0.55331]
Epoch 19:  85%|########5 | 17/20 [00:00<00:00, 64.39it/s, loss=-0.10650, sqweights=0.55306]
Epoch 19:  90%|######### | 18/20 [00:00<00:00, 64.39it/s, loss=-0.10669, sqweights=0.55310]
Epoch 19:  95%|#########5| 19/20 [00:00<00:00, 64.39it/s, loss=-0.10846, sqweights=0.55464]
Epoch 19: 100%|##########| 20/20 [00:00<00:00, 64.39it/s, loss=-0.10757, sqweights=0.55479]
Epoch 19: 100%|##########| 20/20 [00:00<00:00, 64.39it/s, loss=-0.10757, sqweights=0.55479, train_loss=-0.14304, train_sqweights=0.46321, val_loss=-0.11539, val_sqweights=0.45265]
Epoch 19: 100%|##########| 20/20 [00:00<00:00, 23.12it/s, loss=-0.10757, sqweights=0.55479, train_loss=-0.14304, train_sqweights=0.46321, val_loss=-0.11539, val_sqweights=0.45265]

Epoch 20:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 20:   5%|5         | 1/20 [00:00<00:00, 47.63it/s, loss=-0.08769, sqweights=0.54733]
Epoch 20:  10%|#         | 2/20 [00:00<00:00, 55.14it/s, loss=-0.10949, sqweights=0.55174]
Epoch 20:  15%|#5        | 3/20 [00:00<00:00, 58.42it/s, loss=-0.10915, sqweights=0.55808]
Epoch 20:  20%|##        | 4/20 [00:00<00:00, 60.27it/s, loss=-0.11420, sqweights=0.56447]
Epoch 20:  25%|##5       | 5/20 [00:00<00:00, 61.46it/s, loss=-0.11304, sqweights=0.56131]
Epoch 20:  30%|###       | 6/20 [00:00<00:00, 62.11it/s, loss=-0.11304, sqweights=0.56218]
Epoch 20:  35%|###5      | 7/20 [00:00<00:00, 62.47it/s, loss=-0.11304, sqweights=0.56218]
Epoch 20:  35%|###5      | 7/20 [00:00<00:00, 62.47it/s, loss=-0.11231, sqweights=0.56449]
Epoch 20:  40%|####      | 8/20 [00:00<00:00, 62.47it/s, loss=-0.11506, sqweights=0.56712]
Epoch 20:  45%|####5     | 9/20 [00:00<00:00, 62.47it/s, loss=-0.11374, sqweights=0.56820]
Epoch 20:  50%|#####     | 10/20 [00:00<00:00, 62.47it/s, loss=-0.11277, sqweights=0.56822]
Epoch 20:  55%|#####5    | 11/20 [00:00<00:00, 62.47it/s, loss=-0.11133, sqweights=0.56869]
Epoch 20:  60%|######    | 12/20 [00:00<00:00, 62.47it/s, loss=-0.11155, sqweights=0.56952]
Epoch 20:  65%|######5   | 13/20 [00:00<00:00, 62.47it/s, loss=-0.11161, sqweights=0.56927]
Epoch 20:  70%|#######   | 14/20 [00:00<00:00, 64.69it/s, loss=-0.11161, sqweights=0.56927]
Epoch 20:  70%|#######   | 14/20 [00:00<00:00, 64.69it/s, loss=-0.11155, sqweights=0.57108]
Epoch 20:  75%|#######5  | 15/20 [00:00<00:00, 64.69it/s, loss=-0.11229, sqweights=0.57162]
Epoch 20:  80%|########  | 16/20 [00:00<00:00, 64.69it/s, loss=-0.11132, sqweights=0.57115]
Epoch 20:  85%|########5 | 17/20 [00:00<00:00, 64.69it/s, loss=-0.11103, sqweights=0.57277]
Epoch 20:  90%|######### | 18/20 [00:00<00:00, 64.69it/s, loss=-0.11058, sqweights=0.57316]
Epoch 20:  95%|#########5| 19/20 [00:00<00:00, 64.69it/s, loss=-0.11009, sqweights=0.57285]
Epoch 20: 100%|##########| 20/20 [00:00<00:00, 64.69it/s, loss=-0.10973, sqweights=0.57489]
Epoch 20: 100%|##########| 20/20 [00:00<00:00, 64.69it/s, loss=-0.10973, sqweights=0.57489, train_loss=-0.14681, train_sqweights=0.48355, val_loss=-0.11828, val_sqweights=0.47239]
Epoch 20: 100%|##########| 20/20 [00:00<00:00, 23.09it/s, loss=-0.10973, sqweights=0.57489, train_loss=-0.14681, train_sqweights=0.48355, val_loss=-0.11828, val_sqweights=0.47239]

Epoch 21:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 21:   5%|5         | 1/20 [00:00<00:00, 48.90it/s, loss=-0.12126, sqweights=0.60815]
Epoch 21:  10%|#         | 2/20 [00:00<00:00, 56.25it/s, loss=-0.11400, sqweights=0.59867]
Epoch 21:  15%|#5        | 3/20 [00:00<00:00, 59.40it/s, loss=-0.11547, sqweights=0.59577]
Epoch 21:  20%|##        | 4/20 [00:00<00:00, 61.08it/s, loss=-0.12237, sqweights=0.59427]
Epoch 21:  25%|##5       | 5/20 [00:00<00:00, 61.67it/s, loss=-0.11782, sqweights=0.59439]
Epoch 21:  30%|###       | 6/20 [00:00<00:00, 62.47it/s, loss=-0.11923, sqweights=0.59286]
Epoch 21:  35%|###5      | 7/20 [00:00<00:00, 63.04it/s, loss=-0.11923, sqweights=0.59286]
Epoch 21:  35%|###5      | 7/20 [00:00<00:00, 63.04it/s, loss=-0.12054, sqweights=0.59288]
Epoch 21:  40%|####      | 8/20 [00:00<00:00, 63.04it/s, loss=-0.12151, sqweights=0.59308]
Epoch 21:  45%|####5     | 9/20 [00:00<00:00, 63.04it/s, loss=-0.12133, sqweights=0.59469]
Epoch 21:  50%|#####     | 10/20 [00:00<00:00, 63.04it/s, loss=-0.11658, sqweights=0.59325]
Epoch 21:  55%|#####5    | 11/20 [00:00<00:00, 63.04it/s, loss=-0.11510, sqweights=0.59322]
Epoch 21:  60%|######    | 12/20 [00:00<00:00, 63.04it/s, loss=-0.11688, sqweights=0.59532]
Epoch 21:  65%|######5   | 13/20 [00:00<00:00, 63.04it/s, loss=-0.11525, sqweights=0.59465]
Epoch 21:  70%|#######   | 14/20 [00:00<00:00, 65.09it/s, loss=-0.11525, sqweights=0.59465]
Epoch 21:  70%|#######   | 14/20 [00:00<00:00, 65.09it/s, loss=-0.11524, sqweights=0.59489]
Epoch 21:  75%|#######5  | 15/20 [00:00<00:00, 65.09it/s, loss=-0.11512, sqweights=0.59458]
Epoch 21:  80%|########  | 16/20 [00:00<00:00, 65.09it/s, loss=-0.11539, sqweights=0.59398]
Epoch 21:  85%|########5 | 17/20 [00:00<00:00, 65.09it/s, loss=-0.11563, sqweights=0.59534]
Epoch 21:  90%|######### | 18/20 [00:00<00:00, 65.09it/s, loss=-0.11527, sqweights=0.59618]
Epoch 21:  95%|#########5| 19/20 [00:00<00:00, 65.09it/s, loss=-0.11463, sqweights=0.59684]
Epoch 21: 100%|##########| 20/20 [00:00<00:00, 65.09it/s, loss=-0.11392, sqweights=0.59738]
Epoch 21: 100%|##########| 20/20 [00:00<00:00, 65.09it/s, loss=-0.11392, sqweights=0.59738, train_loss=-0.15044, train_sqweights=0.50227, val_loss=-0.12109, val_sqweights=0.49056]
Epoch 21: 100%|##########| 20/20 [00:00<00:00, 21.23it/s, loss=-0.11392, sqweights=0.59738, train_loss=-0.15044, train_sqweights=0.50227, val_loss=-0.12109, val_sqweights=0.49056]

Epoch 22:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 22:   5%|5         | 1/20 [00:00<00:00, 48.10it/s, loss=-0.13117, sqweights=0.59707]
Epoch 22:  10%|#         | 2/20 [00:00<00:00, 55.70it/s, loss=-0.13134, sqweights=0.58748]
Epoch 22:  15%|#5        | 3/20 [00:00<00:00, 58.91it/s, loss=-0.12128, sqweights=0.59059]
Epoch 22:  20%|##        | 4/20 [00:00<00:00, 60.63it/s, loss=-0.12522, sqweights=0.59912]
Epoch 22:  25%|##5       | 5/20 [00:00<00:00, 61.72it/s, loss=-0.12619, sqweights=0.60234]
Epoch 22:  30%|###       | 6/20 [00:00<00:00, 62.52it/s, loss=-0.12257, sqweights=0.60437]
Epoch 22:  35%|###5      | 7/20 [00:00<00:00, 63.01it/s, loss=-0.12257, sqweights=0.60437]
Epoch 22:  35%|###5      | 7/20 [00:00<00:00, 63.01it/s, loss=-0.12130, sqweights=0.60778]
Epoch 22:  40%|####      | 8/20 [00:00<00:00, 63.01it/s, loss=-0.11961, sqweights=0.60704]
Epoch 22:  45%|####5     | 9/20 [00:00<00:00, 63.01it/s, loss=-0.12047, sqweights=0.60727]
Epoch 22:  50%|#####     | 10/20 [00:00<00:00, 63.01it/s, loss=-0.12367, sqweights=0.60889]
Epoch 22:  55%|#####5    | 11/20 [00:00<00:00, 63.01it/s, loss=-0.12254, sqweights=0.61065]
Epoch 22:  60%|######    | 12/20 [00:00<00:00, 63.01it/s, loss=-0.11959, sqweights=0.60985]
Epoch 22:  65%|######5   | 13/20 [00:00<00:00, 63.01it/s, loss=-0.11779, sqweights=0.61102]
Epoch 22:  70%|#######   | 14/20 [00:00<00:00, 64.75it/s, loss=-0.11779, sqweights=0.61102]
Epoch 22:  70%|#######   | 14/20 [00:00<00:00, 64.75it/s, loss=-0.11568, sqweights=0.61147]
Epoch 22:  75%|#######5  | 15/20 [00:00<00:00, 64.75it/s, loss=-0.11837, sqweights=0.61256]
Epoch 22:  80%|########  | 16/20 [00:00<00:00, 64.75it/s, loss=-0.11787, sqweights=0.61153]
Epoch 22:  85%|########5 | 17/20 [00:00<00:00, 64.75it/s, loss=-0.11755, sqweights=0.61269]
Epoch 22:  90%|######### | 18/20 [00:00<00:00, 64.75it/s, loss=-0.11595, sqweights=0.61204]
Epoch 22:  95%|#########5| 19/20 [00:00<00:00, 64.75it/s, loss=-0.11706, sqweights=0.61272]
Epoch 22: 100%|##########| 20/20 [00:00<00:00, 64.75it/s, loss=-0.11673, sqweights=0.61193]
Epoch 22: 100%|##########| 20/20 [00:00<00:00, 64.75it/s, loss=-0.11673, sqweights=0.61193, train_loss=-0.15389, train_sqweights=0.52299, val_loss=-0.12370, val_sqweights=0.51164]
Epoch 22: 100%|##########| 20/20 [00:00<00:00, 23.20it/s, loss=-0.11673, sqweights=0.61193, train_loss=-0.15389, train_sqweights=0.52299, val_loss=-0.12370, val_sqweights=0.51164]

Epoch 23:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 23:   5%|5         | 1/20 [00:00<00:00, 48.67it/s, loss=-0.13710, sqweights=0.64026]
Epoch 23:  10%|#         | 2/20 [00:00<00:00, 55.97it/s, loss=-0.13911, sqweights=0.63102]
Epoch 23:  15%|#5        | 3/20 [00:00<00:00, 59.04it/s, loss=-0.13081, sqweights=0.62900]
Epoch 23:  20%|##        | 4/20 [00:00<00:00, 60.78it/s, loss=-0.12694, sqweights=0.62430]
Epoch 23:  25%|##5       | 5/20 [00:00<00:00, 61.84it/s, loss=-0.12368, sqweights=0.62540]
Epoch 23:  30%|###       | 6/20 [00:00<00:00, 62.29it/s, loss=-0.12137, sqweights=0.62690]
Epoch 23:  35%|###5      | 7/20 [00:00<00:00, 62.92it/s, loss=-0.12137, sqweights=0.62690]
Epoch 23:  35%|###5      | 7/20 [00:00<00:00, 62.92it/s, loss=-0.12089, sqweights=0.62839]
Epoch 23:  40%|####      | 8/20 [00:00<00:00, 62.92it/s, loss=-0.12032, sqweights=0.63043]
Epoch 23:  45%|####5     | 9/20 [00:00<00:00, 62.92it/s, loss=-0.11969, sqweights=0.62983]
Epoch 23:  50%|#####     | 10/20 [00:00<00:00, 62.92it/s, loss=-0.12087, sqweights=0.62905]
Epoch 23:  55%|#####5    | 11/20 [00:00<00:00, 62.92it/s, loss=-0.12236, sqweights=0.63026]
Epoch 23:  60%|######    | 12/20 [00:00<00:00, 62.92it/s, loss=-0.12181, sqweights=0.63080]
Epoch 23:  65%|######5   | 13/20 [00:00<00:00, 62.92it/s, loss=-0.12125, sqweights=0.63185]
Epoch 23:  70%|#######   | 14/20 [00:00<00:00, 65.11it/s, loss=-0.12125, sqweights=0.63185]
Epoch 23:  70%|#######   | 14/20 [00:00<00:00, 65.11it/s, loss=-0.11966, sqweights=0.63187]
Epoch 23:  75%|#######5  | 15/20 [00:00<00:00, 65.11it/s, loss=-0.11937, sqweights=0.63101]
Epoch 23:  80%|########  | 16/20 [00:00<00:00, 65.11it/s, loss=-0.11905, sqweights=0.63181]
Epoch 23:  85%|########5 | 17/20 [00:00<00:00, 65.11it/s, loss=-0.11888, sqweights=0.63153]
Epoch 23:  90%|######### | 18/20 [00:00<00:00, 65.11it/s, loss=-0.11830, sqweights=0.63213]
Epoch 23:  95%|#########5| 19/20 [00:00<00:00, 65.11it/s, loss=-0.11642, sqweights=0.63196]
Epoch 23: 100%|##########| 20/20 [00:00<00:00, 65.11it/s, loss=-0.11721, sqweights=0.63158]
Epoch 23: 100%|##########| 20/20 [00:00<00:00, 65.11it/s, loss=-0.11721, sqweights=0.63158, train_loss=-0.15740, train_sqweights=0.54319, val_loss=-0.12589, val_sqweights=0.53165]
Epoch 23: 100%|##########| 20/20 [00:00<00:00, 23.15it/s, loss=-0.11721, sqweights=0.63158, train_loss=-0.15740, train_sqweights=0.54319, val_loss=-0.12589, val_sqweights=0.53165]

Epoch 24:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 24:   5%|5         | 1/20 [00:00<00:00, 47.44it/s, loss=-0.11339, sqweights=0.64813]
Epoch 24:  10%|#         | 2/20 [00:00<00:00, 55.15it/s, loss=-0.11591, sqweights=0.63407]
Epoch 24:  15%|#5        | 3/20 [00:00<00:00, 58.11it/s, loss=-0.11721, sqweights=0.64390]
Epoch 24:  20%|##        | 4/20 [00:00<00:00, 59.68it/s, loss=-0.12384, sqweights=0.64470]
Epoch 24:  25%|##5       | 5/20 [00:00<00:00, 60.75it/s, loss=-0.12393, sqweights=0.64383]
Epoch 24:  30%|###       | 6/20 [00:00<00:00, 61.59it/s, loss=-0.12190, sqweights=0.63774]
Epoch 24:  35%|###5      | 7/20 [00:00<00:00, 62.29it/s, loss=-0.12190, sqweights=0.63774]
Epoch 24:  35%|###5      | 7/20 [00:00<00:00, 62.29it/s, loss=-0.12355, sqweights=0.63842]
Epoch 24:  40%|####      | 8/20 [00:00<00:00, 62.29it/s, loss=-0.12555, sqweights=0.64039]
Epoch 24:  45%|####5     | 9/20 [00:00<00:00, 62.29it/s, loss=-0.12478, sqweights=0.64138]
Epoch 24:  50%|#####     | 10/20 [00:00<00:00, 62.29it/s, loss=-0.12270, sqweights=0.63830]
Epoch 24:  55%|#####5    | 11/20 [00:00<00:00, 62.29it/s, loss=-0.12330, sqweights=0.64155]
Epoch 24:  60%|######    | 12/20 [00:00<00:00, 62.29it/s, loss=-0.12258, sqweights=0.64255]
Epoch 24:  65%|######5   | 13/20 [00:00<00:00, 62.29it/s, loss=-0.12429, sqweights=0.64295]
Epoch 24:  70%|#######   | 14/20 [00:00<00:00, 64.43it/s, loss=-0.12429, sqweights=0.64295]
Epoch 24:  70%|#######   | 14/20 [00:00<00:00, 64.43it/s, loss=-0.12393, sqweights=0.64410]
Epoch 24:  75%|#######5  | 15/20 [00:00<00:00, 64.43it/s, loss=-0.12207, sqweights=0.64360]
Epoch 24:  80%|########  | 16/20 [00:00<00:00, 64.43it/s, loss=-0.11986, sqweights=0.64241]
Epoch 24:  85%|########5 | 17/20 [00:00<00:00, 64.43it/s, loss=-0.12027, sqweights=0.64287]
Epoch 24:  90%|######### | 18/20 [00:00<00:00, 64.43it/s, loss=-0.11946, sqweights=0.64427]
Epoch 24:  95%|#########5| 19/20 [00:00<00:00, 64.43it/s, loss=-0.11969, sqweights=0.64521]
Epoch 24: 100%|##########| 20/20 [00:00<00:00, 64.43it/s, loss=-0.12046, sqweights=0.64696]
Epoch 24: 100%|##########| 20/20 [00:00<00:00, 64.43it/s, loss=-0.12046, sqweights=0.64696, train_loss=-0.16006, train_sqweights=0.56194, val_loss=-0.12807, val_sqweights=0.55104]
Epoch 24: 100%|##########| 20/20 [00:00<00:00, 23.16it/s, loss=-0.12046, sqweights=0.64696, train_loss=-0.16006, train_sqweights=0.56194, val_loss=-0.12807, val_sqweights=0.55104]

Epoch 25:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 25:   5%|5         | 1/20 [00:00<00:01,  9.92it/s]
Epoch 25:   5%|5         | 1/20 [00:00<00:01,  9.92it/s, loss=-0.09722, sqweights=0.64730]
Epoch 25:  10%|#         | 2/20 [00:00<00:01,  9.92it/s, loss=-0.10720, sqweights=0.65466]
Epoch 25:  15%|#5        | 3/20 [00:00<00:01,  9.92it/s, loss=-0.11394, sqweights=0.65431]
Epoch 25:  20%|##        | 4/20 [00:00<00:01,  9.92it/s, loss=-0.11621, sqweights=0.65410]
Epoch 25:  25%|##5       | 5/20 [00:00<00:01,  9.92it/s, loss=-0.11728, sqweights=0.65711]
Epoch 25:  30%|###       | 6/20 [00:00<00:01,  9.92it/s, loss=-0.11581, sqweights=0.65812]
Epoch 25:  35%|###5      | 7/20 [00:00<00:01,  9.92it/s, loss=-0.11504, sqweights=0.66089]
Epoch 25:  40%|####      | 8/20 [00:00<00:00, 43.45it/s, loss=-0.11504, sqweights=0.66089]
Epoch 25:  40%|####      | 8/20 [00:00<00:00, 43.45it/s, loss=-0.11514, sqweights=0.66104]
Epoch 25:  45%|####5     | 9/20 [00:00<00:00, 43.45it/s, loss=-0.11718, sqweights=0.66307]
Epoch 25:  50%|#####     | 10/20 [00:00<00:00, 43.45it/s, loss=-0.11618, sqweights=0.66512]
Epoch 25:  55%|#####5    | 11/20 [00:00<00:00, 43.45it/s, loss=-0.11578, sqweights=0.66592]
Epoch 25:  60%|######    | 12/20 [00:00<00:00, 43.45it/s, loss=-0.11711, sqweights=0.66532]
Epoch 25:  65%|######5   | 13/20 [00:00<00:00, 43.45it/s, loss=-0.11704, sqweights=0.66706]
Epoch 25:  70%|#######   | 14/20 [00:00<00:00, 43.45it/s, loss=-0.11755, sqweights=0.66609]
Epoch 25:  75%|#######5  | 15/20 [00:00<00:00, 54.01it/s, loss=-0.11755, sqweights=0.66609]
Epoch 25:  75%|#######5  | 15/20 [00:00<00:00, 54.01it/s, loss=-0.11864, sqweights=0.66568]
Epoch 25:  80%|########  | 16/20 [00:00<00:00, 54.01it/s, loss=-0.11966, sqweights=0.66639]
Epoch 25:  85%|########5 | 17/20 [00:00<00:00, 54.01it/s, loss=-0.11941, sqweights=0.66633]
Epoch 25:  90%|######### | 18/20 [00:00<00:00, 54.01it/s, loss=-0.11978, sqweights=0.66714]
Epoch 25:  95%|#########5| 19/20 [00:00<00:00, 54.01it/s, loss=-0.11938, sqweights=0.66749]
Epoch 25: 100%|##########| 20/20 [00:00<00:00, 54.01it/s, loss=-0.11801, sqweights=0.66685]
Epoch 25: 100%|##########| 20/20 [00:00<00:00, 54.01it/s, loss=-0.11801, sqweights=0.66685, train_loss=-0.16262, train_sqweights=0.57806, val_loss=-0.13014, val_sqweights=0.56826]
Epoch 25: 100%|##########| 20/20 [00:00<00:00, 21.17it/s, loss=-0.11801, sqweights=0.66685, train_loss=-0.16262, train_sqweights=0.57806, val_loss=-0.13014, val_sqweights=0.56826]

Epoch 26:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 26:   5%|5         | 1/20 [00:00<00:00, 48.07it/s, loss=-0.08433, sqweights=0.64913]
Epoch 26:  10%|#         | 2/20 [00:00<00:00, 55.56it/s, loss=-0.09888, sqweights=0.66231]
Epoch 26:  15%|#5        | 3/20 [00:00<00:00, 57.57it/s, loss=-0.10227, sqweights=0.66639]
Epoch 26:  20%|##        | 4/20 [00:00<00:00, 59.56it/s, loss=-0.11543, sqweights=0.66631]
Epoch 26:  25%|##5       | 5/20 [00:00<00:00, 60.80it/s, loss=-0.12312, sqweights=0.66897]
Epoch 26:  30%|###       | 6/20 [00:00<00:00, 61.46it/s, loss=-0.12591, sqweights=0.67268]
Epoch 26:  35%|###5      | 7/20 [00:00<00:00, 62.22it/s, loss=-0.12591, sqweights=0.67268]
Epoch 26:  35%|###5      | 7/20 [00:00<00:00, 62.22it/s, loss=-0.12591, sqweights=0.67621]
Epoch 26:  40%|####      | 8/20 [00:00<00:00, 62.22it/s, loss=-0.12559, sqweights=0.67540]
Epoch 26:  45%|####5     | 9/20 [00:00<00:00, 62.22it/s, loss=-0.12478, sqweights=0.67549]
Epoch 26:  50%|#####     | 10/20 [00:00<00:00, 62.22it/s, loss=-0.12217, sqweights=0.67539]
Epoch 26:  55%|#####5    | 11/20 [00:00<00:00, 62.22it/s, loss=-0.12047, sqweights=0.67515]
Epoch 26:  60%|######    | 12/20 [00:00<00:00, 62.22it/s, loss=-0.12003, sqweights=0.67590]
Epoch 26:  65%|######5   | 13/20 [00:00<00:00, 62.22it/s, loss=-0.12017, sqweights=0.67498]
Epoch 26:  70%|#######   | 14/20 [00:00<00:00, 64.35it/s, loss=-0.12017, sqweights=0.67498]
Epoch 26:  70%|#######   | 14/20 [00:00<00:00, 64.35it/s, loss=-0.12006, sqweights=0.67425]
Epoch 26:  75%|#######5  | 15/20 [00:00<00:00, 64.35it/s, loss=-0.12067, sqweights=0.67477]
Epoch 26:  80%|########  | 16/20 [00:00<00:00, 64.35it/s, loss=-0.12006, sqweights=0.67509]
Epoch 26:  85%|########5 | 17/20 [00:00<00:00, 64.35it/s, loss=-0.11943, sqweights=0.67448]
Epoch 26:  90%|######### | 18/20 [00:00<00:00, 64.35it/s, loss=-0.12071, sqweights=0.67550]
Epoch 26:  95%|#########5| 19/20 [00:00<00:00, 64.35it/s, loss=-0.11972, sqweights=0.67616]
Epoch 26: 100%|##########| 20/20 [00:00<00:00, 64.35it/s, loss=-0.11896, sqweights=0.67645]
Epoch 26: 100%|##########| 20/20 [00:00<00:00, 64.35it/s, loss=-0.11896, sqweights=0.67645, train_loss=-0.16478, train_sqweights=0.59564, val_loss=-0.13194, val_sqweights=0.58620]
Epoch 26: 100%|##########| 20/20 [00:00<00:00, 23.11it/s, loss=-0.11896, sqweights=0.67645, train_loss=-0.16478, train_sqweights=0.59564, val_loss=-0.13194, val_sqweights=0.58620]

Epoch 27:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 27:   5%|5         | 1/20 [00:00<00:00, 48.00it/s, loss=-0.10956, sqweights=0.68584]
Epoch 27:  10%|#         | 2/20 [00:00<00:00, 55.57it/s, loss=-0.11183, sqweights=0.69164]
Epoch 27:  15%|#5        | 3/20 [00:00<00:00, 58.40it/s, loss=-0.10998, sqweights=0.69819]
Epoch 27:  20%|##        | 4/20 [00:00<00:00, 60.02it/s, loss=-0.10446, sqweights=0.69861]
Epoch 27:  25%|##5       | 5/20 [00:00<00:00, 61.23it/s, loss=-0.10985, sqweights=0.68821]
Epoch 27:  30%|###       | 6/20 [00:00<00:00, 62.04it/s, loss=-0.11288, sqweights=0.68643]
Epoch 27:  35%|###5      | 7/20 [00:00<00:00, 62.60it/s, loss=-0.11288, sqweights=0.68643]
Epoch 27:  35%|###5      | 7/20 [00:00<00:00, 62.60it/s, loss=-0.11756, sqweights=0.68755]
Epoch 27:  40%|####      | 8/20 [00:00<00:00, 62.60it/s, loss=-0.11457, sqweights=0.68463]
Epoch 27:  45%|####5     | 9/20 [00:00<00:00, 62.60it/s, loss=-0.11186, sqweights=0.68500]
Epoch 27:  50%|#####     | 10/20 [00:00<00:00, 62.60it/s, loss=-0.10952, sqweights=0.68651]
Epoch 27:  55%|#####5    | 11/20 [00:00<00:00, 62.60it/s, loss=-0.10973, sqweights=0.68796]
Epoch 27:  60%|######    | 12/20 [00:00<00:00, 62.60it/s, loss=-0.10793, sqweights=0.68942]
Epoch 27:  65%|######5   | 13/20 [00:00<00:00, 62.60it/s, loss=-0.10816, sqweights=0.68988]
Epoch 27:  70%|#######   | 14/20 [00:00<00:00, 64.76it/s, loss=-0.10816, sqweights=0.68988]
Epoch 27:  70%|#######   | 14/20 [00:00<00:00, 64.76it/s, loss=-0.11100, sqweights=0.69027]
Epoch 27:  75%|#######5  | 15/20 [00:00<00:00, 64.76it/s, loss=-0.11253, sqweights=0.69062]
Epoch 27:  80%|########  | 16/20 [00:00<00:00, 64.76it/s, loss=-0.11294, sqweights=0.68929]
Epoch 27:  85%|########5 | 17/20 [00:00<00:00, 64.76it/s, loss=-0.11321, sqweights=0.69015]
Epoch 27:  90%|######### | 18/20 [00:00<00:00, 64.76it/s, loss=-0.11289, sqweights=0.68904]
Epoch 27:  95%|#########5| 19/20 [00:00<00:00, 64.76it/s, loss=-0.11316, sqweights=0.68988]
Epoch 27: 100%|##########| 20/20 [00:00<00:00, 64.76it/s, loss=-0.11398, sqweights=0.69185]
Epoch 27: 100%|##########| 20/20 [00:00<00:00, 64.76it/s, loss=-0.11398, sqweights=0.69185, train_loss=-0.16673, train_sqweights=0.60894, val_loss=-0.13328, val_sqweights=0.59953]
Epoch 27: 100%|##########| 20/20 [00:00<00:00, 23.22it/s, loss=-0.11398, sqweights=0.69185, train_loss=-0.16673, train_sqweights=0.60894, val_loss=-0.13328, val_sqweights=0.59953]

Epoch 28:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 28:   5%|5         | 1/20 [00:00<00:00, 48.46it/s, loss=-0.11375, sqweights=0.67527]
Epoch 28:  10%|#         | 2/20 [00:00<00:00, 55.86it/s, loss=-0.11291, sqweights=0.67880]
Epoch 28:  15%|#5        | 3/20 [00:00<00:00, 58.90it/s, loss=-0.11675, sqweights=0.68919]
Epoch 28:  20%|##        | 4/20 [00:00<00:00, 60.64it/s, loss=-0.12165, sqweights=0.69292]
Epoch 28:  25%|##5       | 5/20 [00:00<00:00, 61.16it/s, loss=-0.12326, sqweights=0.69402]
Epoch 28:  30%|###       | 6/20 [00:00<00:00, 33.40it/s, loss=-0.12326, sqweights=0.69402]
Epoch 28:  30%|###       | 6/20 [00:00<00:00, 33.40it/s, loss=-0.12503, sqweights=0.69764]
Epoch 28:  35%|###5      | 7/20 [00:00<00:00, 33.40it/s, loss=-0.12784, sqweights=0.69900]
Epoch 28:  40%|####      | 8/20 [00:00<00:00, 33.40it/s, loss=-0.12642, sqweights=0.69312]
Epoch 28:  45%|####5     | 9/20 [00:00<00:00, 33.40it/s, loss=-0.12365, sqweights=0.69161]
Epoch 28:  50%|#####     | 10/20 [00:00<00:00, 33.40it/s, loss=-0.12369, sqweights=0.69037]
Epoch 28:  55%|#####5    | 11/20 [00:00<00:00, 33.40it/s, loss=-0.12331, sqweights=0.68953]
Epoch 28:  60%|######    | 12/20 [00:00<00:00, 33.40it/s, loss=-0.12429, sqweights=0.69089]
Epoch 28:  65%|######5   | 13/20 [00:00<00:00, 48.28it/s, loss=-0.12429, sqweights=0.69089]
Epoch 28:  65%|######5   | 13/20 [00:00<00:00, 48.28it/s, loss=-0.12511, sqweights=0.69272]
Epoch 28:  70%|#######   | 14/20 [00:00<00:00, 48.28it/s, loss=-0.12385, sqweights=0.69310]
Epoch 28:  75%|#######5  | 15/20 [00:00<00:00, 48.28it/s, loss=-0.12407, sqweights=0.69447]
Epoch 28:  80%|########  | 16/20 [00:00<00:00, 48.28it/s, loss=-0.12391, sqweights=0.69567]
Epoch 28:  85%|########5 | 17/20 [00:00<00:00, 48.28it/s, loss=-0.12378, sqweights=0.69594]
Epoch 28:  90%|######### | 18/20 [00:00<00:00, 48.28it/s, loss=-0.12297, sqweights=0.69583]
Epoch 28:  95%|#########5| 19/20 [00:00<00:00, 48.28it/s, loss=-0.12266, sqweights=0.69601]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 56.30it/s, loss=-0.12266, sqweights=0.69601]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 56.30it/s, loss=-0.12371, sqweights=0.69599]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 56.30it/s, loss=-0.12371, sqweights=0.69599, train_loss=-0.16859, train_sqweights=0.62391, val_loss=-0.13456, val_sqweights=0.61365]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 21.11it/s, loss=-0.12371, sqweights=0.69599, train_loss=-0.16859, train_sqweights=0.62391, val_loss=-0.13456, val_sqweights=0.61365]

Epoch 29:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 29:   5%|5         | 1/20 [00:00<00:00, 46.00it/s, loss=-0.09095, sqweights=0.71518]
Epoch 29:  10%|#         | 2/20 [00:00<00:00, 54.27it/s, loss=-0.12836, sqweights=0.70632]
Epoch 29:  15%|#5        | 3/20 [00:00<00:00, 57.64it/s, loss=-0.13019, sqweights=0.70184]
Epoch 29:  20%|##        | 4/20 [00:00<00:00, 59.69it/s, loss=-0.12817, sqweights=0.70611]
Epoch 29:  25%|##5       | 5/20 [00:00<00:00, 60.99it/s, loss=-0.12552, sqweights=0.70390]
Epoch 29:  30%|###       | 6/20 [00:00<00:00, 61.96it/s, loss=-0.12555, sqweights=0.70787]
Epoch 29:  35%|###5      | 7/20 [00:00<00:00, 62.62it/s, loss=-0.12555, sqweights=0.70787]
Epoch 29:  35%|###5      | 7/20 [00:00<00:00, 62.62it/s, loss=-0.12883, sqweights=0.70805]
Epoch 29:  40%|####      | 8/20 [00:00<00:00, 62.62it/s, loss=-0.12963, sqweights=0.70937]
Epoch 29:  45%|####5     | 9/20 [00:00<00:00, 62.62it/s, loss=-0.12931, sqweights=0.70821]
Epoch 29:  50%|#####     | 10/20 [00:00<00:00, 62.62it/s, loss=-0.12930, sqweights=0.70846]
Epoch 29:  55%|#####5    | 11/20 [00:00<00:00, 62.62it/s, loss=-0.12690, sqweights=0.71154]
Epoch 29:  60%|######    | 12/20 [00:00<00:00, 62.62it/s, loss=-0.12253, sqweights=0.71247]
Epoch 29:  65%|######5   | 13/20 [00:00<00:00, 62.62it/s, loss=-0.12340, sqweights=0.71475]
Epoch 29:  70%|#######   | 14/20 [00:00<00:00, 64.79it/s, loss=-0.12340, sqweights=0.71475]
Epoch 29:  70%|#######   | 14/20 [00:00<00:00, 64.79it/s, loss=-0.12301, sqweights=0.71411]
Epoch 29:  75%|#######5  | 15/20 [00:00<00:00, 64.79it/s, loss=-0.12328, sqweights=0.71284]
Epoch 29:  80%|########  | 16/20 [00:00<00:00, 64.79it/s, loss=-0.12240, sqweights=0.71429]
Epoch 29:  85%|########5 | 17/20 [00:00<00:00, 64.79it/s, loss=-0.12037, sqweights=0.71422]
Epoch 29:  90%|######### | 18/20 [00:00<00:00, 64.79it/s, loss=-0.12169, sqweights=0.71494]
Epoch 29:  95%|#########5| 19/20 [00:00<00:00, 64.79it/s, loss=-0.12256, sqweights=0.71632]
Epoch 29: 100%|##########| 20/20 [00:00<00:00, 64.79it/s, loss=-0.12297, sqweights=0.71641]
Epoch 29: 100%|##########| 20/20 [00:00<00:00, 64.79it/s, loss=-0.12297, sqweights=0.71641, train_loss=-0.17046, train_sqweights=0.64219, val_loss=-0.13577, val_sqweights=0.63182]
Epoch 29: 100%|##########| 20/20 [00:00<00:00, 23.16it/s, loss=-0.12297, sqweights=0.71641, train_loss=-0.17046, train_sqweights=0.64219, val_loss=-0.13577, val_sqweights=0.63182]

Epoch 30:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 30:   5%|5         | 1/20 [00:00<00:00, 48.34it/s, loss=-0.11191, sqweights=0.71765]
Epoch 30:  10%|#         | 2/20 [00:00<00:00, 55.38it/s, loss=-0.13057, sqweights=0.72857]
Epoch 30:  15%|#5        | 3/20 [00:00<00:00, 58.64it/s, loss=-0.12068, sqweights=0.73183]
Epoch 30:  20%|##        | 4/20 [00:00<00:00, 60.46it/s, loss=-0.12589, sqweights=0.72622]
Epoch 30:  25%|##5       | 5/20 [00:00<00:00, 61.59it/s, loss=-0.12757, sqweights=0.72263]
Epoch 30:  30%|###       | 6/20 [00:00<00:00, 61.90it/s, loss=-0.12593, sqweights=0.71821]
Epoch 30:  35%|###5      | 7/20 [00:00<00:00, 62.47it/s, loss=-0.12593, sqweights=0.71821]
Epoch 30:  35%|###5      | 7/20 [00:00<00:00, 62.47it/s, loss=-0.12480, sqweights=0.72193]
Epoch 30:  40%|####      | 8/20 [00:00<00:00, 62.47it/s, loss=-0.12018, sqweights=0.72129]
Epoch 30:  45%|####5     | 9/20 [00:00<00:00, 62.47it/s, loss=-0.12739, sqweights=0.72299]
Epoch 30:  50%|#####     | 10/20 [00:00<00:00, 62.47it/s, loss=-0.12500, sqweights=0.72325]
Epoch 30:  55%|#####5    | 11/20 [00:00<00:00, 62.47it/s, loss=-0.12493, sqweights=0.72350]
Epoch 30:  60%|######    | 12/20 [00:00<00:00, 62.47it/s, loss=-0.12285, sqweights=0.72373]
Epoch 30:  65%|######5   | 13/20 [00:00<00:00, 62.47it/s, loss=-0.12416, sqweights=0.72340]
Epoch 30:  70%|#######   | 14/20 [00:00<00:00, 64.52it/s, loss=-0.12416, sqweights=0.72340]
Epoch 30:  70%|#######   | 14/20 [00:00<00:00, 64.52it/s, loss=-0.12436, sqweights=0.72315]
Epoch 30:  75%|#######5  | 15/20 [00:00<00:00, 64.52it/s, loss=-0.12302, sqweights=0.72262]
Epoch 30:  80%|########  | 16/20 [00:00<00:00, 64.52it/s, loss=-0.12362, sqweights=0.72430]
Epoch 30:  85%|########5 | 17/20 [00:00<00:00, 64.52it/s, loss=-0.12189, sqweights=0.72353]
Epoch 30:  90%|######### | 18/20 [00:00<00:00, 64.52it/s, loss=-0.12234, sqweights=0.72307]
Epoch 30:  95%|#########5| 19/20 [00:00<00:00, 64.52it/s, loss=-0.12315, sqweights=0.72351]
Epoch 30: 100%|##########| 20/20 [00:00<00:00, 64.52it/s, loss=-0.12390, sqweights=0.72404]
Epoch 30: 100%|##########| 20/20 [00:00<00:00, 64.52it/s, loss=-0.12390, sqweights=0.72404, train_loss=-0.17213, train_sqweights=0.65782, val_loss=-0.13702, val_sqweights=0.64725]
Epoch 30: 100%|##########| 20/20 [00:00<00:00, 23.13it/s, loss=-0.12390, sqweights=0.72404, train_loss=-0.17213, train_sqweights=0.65782, val_loss=-0.13702, val_sqweights=0.64725]

Epoch 31:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 31:   5%|5         | 1/20 [00:00<00:00, 48.38it/s, loss=-0.09259, sqweights=0.72724]
Epoch 31:  10%|#         | 2/20 [00:00<00:00, 55.89it/s, loss=-0.10033, sqweights=0.73332]
Epoch 31:  15%|#5        | 3/20 [00:00<00:00, 57.96it/s, loss=-0.11162, sqweights=0.74179]
Epoch 31:  20%|##        | 4/20 [00:00<00:00, 59.87it/s, loss=-0.11613, sqweights=0.74979]
Epoch 31:  25%|##5       | 5/20 [00:00<00:00, 61.03it/s, loss=-0.12089, sqweights=0.75120]
Epoch 31:  30%|###       | 6/20 [00:00<00:00, 61.85it/s, loss=-0.11869, sqweights=0.74953]
Epoch 31:  35%|###5      | 7/20 [00:00<00:00, 62.53it/s, loss=-0.11869, sqweights=0.74953]
Epoch 31:  35%|###5      | 7/20 [00:00<00:00, 62.53it/s, loss=-0.12161, sqweights=0.74850]
Epoch 31:  40%|####      | 8/20 [00:00<00:00, 62.53it/s, loss=-0.12072, sqweights=0.74884]
Epoch 31:  45%|####5     | 9/20 [00:00<00:00, 62.53it/s, loss=-0.11640, sqweights=0.74645]
Epoch 31:  50%|#####     | 10/20 [00:00<00:00, 62.53it/s, loss=-0.11726, sqweights=0.74481]
Epoch 31:  55%|#####5    | 11/20 [00:00<00:00, 62.53it/s, loss=-0.11729, sqweights=0.74667]
Epoch 31:  60%|######    | 12/20 [00:00<00:00, 62.53it/s, loss=-0.11647, sqweights=0.74462]
Epoch 31:  65%|######5   | 13/20 [00:00<00:00, 62.53it/s, loss=-0.11742, sqweights=0.74466]
Epoch 31:  70%|#######   | 14/20 [00:00<00:00, 64.72it/s, loss=-0.11742, sqweights=0.74466]
Epoch 31:  70%|#######   | 14/20 [00:00<00:00, 64.72it/s, loss=-0.11848, sqweights=0.74543]
Epoch 31:  75%|#######5  | 15/20 [00:00<00:00, 64.72it/s, loss=-0.11978, sqweights=0.74593]
Epoch 31:  80%|########  | 16/20 [00:00<00:00, 64.72it/s, loss=-0.12041, sqweights=0.74576]
Epoch 31:  85%|########5 | 17/20 [00:00<00:00, 64.72it/s, loss=-0.12080, sqweights=0.74493]
Epoch 31:  90%|######### | 18/20 [00:00<00:00, 64.72it/s, loss=-0.12018, sqweights=0.74519]
Epoch 31:  95%|#########5| 19/20 [00:00<00:00, 64.72it/s, loss=-0.12111, sqweights=0.74666]
Epoch 31: 100%|##########| 20/20 [00:00<00:00, 64.72it/s, loss=-0.12054, sqweights=0.74705]
Epoch 31: 100%|##########| 20/20 [00:00<00:00, 64.72it/s, loss=-0.12054, sqweights=0.74705, train_loss=-0.17365, train_sqweights=0.67195, val_loss=-0.13798, val_sqweights=0.66128]
Epoch 31: 100%|##########| 20/20 [00:00<00:00, 21.21it/s, loss=-0.12054, sqweights=0.74705, train_loss=-0.17365, train_sqweights=0.67195, val_loss=-0.13798, val_sqweights=0.66128]

Epoch 32:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 32:   5%|5         | 1/20 [00:00<00:00, 45.71it/s, loss=-0.13522, sqweights=0.74692]
Epoch 32:  10%|#         | 2/20 [00:00<00:00, 52.80it/s, loss=-0.14122, sqweights=0.74113]
Epoch 32:  15%|#5        | 3/20 [00:00<00:00, 56.73it/s, loss=-0.12747, sqweights=0.74165]
Epoch 32:  20%|##        | 4/20 [00:00<00:00, 58.95it/s, loss=-0.12905, sqweights=0.74697]
Epoch 32:  25%|##5       | 5/20 [00:00<00:00, 60.17it/s, loss=-0.13142, sqweights=0.74708]
Epoch 32:  30%|###       | 6/20 [00:00<00:00, 61.02it/s, loss=-0.13001, sqweights=0.74666]
Epoch 32:  35%|###5      | 7/20 [00:00<00:00, 61.75it/s, loss=-0.13001, sqweights=0.74666]
Epoch 32:  35%|###5      | 7/20 [00:00<00:00, 61.75it/s, loss=-0.12976, sqweights=0.74630]
Epoch 32:  40%|####      | 8/20 [00:00<00:00, 61.75it/s, loss=-0.12813, sqweights=0.74604]
Epoch 32:  45%|####5     | 9/20 [00:00<00:00, 61.75it/s, loss=-0.12759, sqweights=0.74600]
Epoch 32:  50%|#####     | 10/20 [00:00<00:00, 61.75it/s, loss=-0.12956, sqweights=0.74829]
Epoch 32:  55%|#####5    | 11/20 [00:00<00:00, 61.75it/s, loss=-0.13121, sqweights=0.75011]
Epoch 32:  60%|######    | 12/20 [00:00<00:00, 61.75it/s, loss=-0.12879, sqweights=0.75109]
Epoch 32:  65%|######5   | 13/20 [00:00<00:00, 61.75it/s, loss=-0.12745, sqweights=0.74950]
Epoch 32:  70%|#######   | 14/20 [00:00<00:00, 64.14it/s, loss=-0.12745, sqweights=0.74950]
Epoch 32:  70%|#######   | 14/20 [00:00<00:00, 64.14it/s, loss=-0.12787, sqweights=0.74926]
Epoch 32:  75%|#######5  | 15/20 [00:00<00:00, 64.14it/s, loss=-0.12877, sqweights=0.75067]
Epoch 32:  80%|########  | 16/20 [00:00<00:00, 64.14it/s, loss=-0.12654, sqweights=0.75050]
Epoch 32:  85%|########5 | 17/20 [00:00<00:00, 64.14it/s, loss=-0.12588, sqweights=0.74972]
Epoch 32:  90%|######### | 18/20 [00:00<00:00, 64.14it/s, loss=-0.12632, sqweights=0.74991]
Epoch 32:  95%|#########5| 19/20 [00:00<00:00, 64.14it/s, loss=-0.12755, sqweights=0.75047]
Epoch 32: 100%|##########| 20/20 [00:00<00:00, 64.14it/s, loss=-0.12623, sqweights=0.75160]
Epoch 32: 100%|##########| 20/20 [00:00<00:00, 64.14it/s, loss=-0.12623, sqweights=0.75160, train_loss=-0.17534, train_sqweights=0.68555, val_loss=-0.13862, val_sqweights=0.67477]
Epoch 32: 100%|##########| 20/20 [00:00<00:00, 23.14it/s, loss=-0.12623, sqweights=0.75160, train_loss=-0.17534, train_sqweights=0.68555, val_loss=-0.13862, val_sqweights=0.67477]

Epoch 33:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 33:   5%|5         | 1/20 [00:00<00:00, 48.62it/s, loss=-0.12935, sqweights=0.74396]
Epoch 33:  10%|#         | 2/20 [00:00<00:00, 56.17it/s, loss=-0.13060, sqweights=0.75081]
Epoch 33:  15%|#5        | 3/20 [00:00<00:00, 58.44it/s, loss=-0.13392, sqweights=0.75941]
Epoch 33:  20%|##        | 4/20 [00:00<00:00, 60.19it/s, loss=-0.13369, sqweights=0.76078]
Epoch 33:  25%|##5       | 5/20 [00:00<00:00, 60.65it/s, loss=-0.13166, sqweights=0.75946]
Epoch 33:  30%|###       | 6/20 [00:00<00:00, 61.57it/s, loss=-0.12623, sqweights=0.76146]
Epoch 33:  35%|###5      | 7/20 [00:00<00:00, 62.23it/s, loss=-0.12623, sqweights=0.76146]
Epoch 33:  35%|###5      | 7/20 [00:00<00:00, 62.23it/s, loss=-0.12264, sqweights=0.76469]
Epoch 33:  40%|####      | 8/20 [00:00<00:00, 62.23it/s, loss=-0.12068, sqweights=0.76479]
Epoch 33:  45%|####5     | 9/20 [00:00<00:00, 62.23it/s, loss=-0.11919, sqweights=0.76266]
Epoch 33:  50%|#####     | 10/20 [00:00<00:00, 62.23it/s, loss=-0.12032, sqweights=0.76194]
Epoch 33:  55%|#####5    | 11/20 [00:00<00:00, 62.23it/s, loss=-0.12107, sqweights=0.76327]
Epoch 33:  60%|######    | 12/20 [00:00<00:00, 62.23it/s, loss=-0.12092, sqweights=0.76261]
Epoch 33:  65%|######5   | 13/20 [00:00<00:00, 62.23it/s, loss=-0.12264, sqweights=0.76134]
Epoch 33:  70%|#######   | 14/20 [00:00<00:00, 64.21it/s, loss=-0.12264, sqweights=0.76134]
Epoch 33:  70%|#######   | 14/20 [00:00<00:00, 64.21it/s, loss=-0.12231, sqweights=0.76107]
Epoch 33:  75%|#######5  | 15/20 [00:00<00:00, 64.21it/s, loss=-0.12110, sqweights=0.76196]
Epoch 33:  80%|########  | 16/20 [00:00<00:00, 64.21it/s, loss=-0.12291, sqweights=0.76391]
Epoch 33:  85%|########5 | 17/20 [00:00<00:00, 64.21it/s, loss=-0.12259, sqweights=0.76407]
Epoch 33:  90%|######### | 18/20 [00:00<00:00, 64.21it/s, loss=-0.12318, sqweights=0.76442]
Epoch 33:  95%|#########5| 19/20 [00:00<00:00, 64.21it/s, loss=-0.12400, sqweights=0.76425]
Epoch 33: 100%|##########| 20/20 [00:00<00:00, 64.21it/s, loss=-0.12517, sqweights=0.76387]
Epoch 33: 100%|##########| 20/20 [00:00<00:00, 64.21it/s, loss=-0.12517, sqweights=0.76387, train_loss=-0.17698, train_sqweights=0.69919, val_loss=-0.13990, val_sqweights=0.68942]
Epoch 33: 100%|##########| 20/20 [00:00<00:00, 23.15it/s, loss=-0.12517, sqweights=0.76387, train_loss=-0.17698, train_sqweights=0.69919, val_loss=-0.13990, val_sqweights=0.68942]

Epoch 34:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 34:   5%|5         | 1/20 [00:00<00:00, 47.65it/s, loss=-0.13294, sqweights=0.76685]
Epoch 34:  10%|#         | 2/20 [00:00<00:00, 55.13it/s, loss=-0.12388, sqweights=0.76946]
Epoch 34:  15%|#5        | 3/20 [00:00<00:00, 58.41it/s, loss=-0.11935, sqweights=0.77068]
Epoch 34:  20%|##        | 4/20 [00:00<00:00, 60.26it/s, loss=-0.12583, sqweights=0.77345]
Epoch 34:  25%|##5       | 5/20 [00:00<00:00, 61.50it/s, loss=-0.12629, sqweights=0.77351]
Epoch 34:  30%|###       | 6/20 [00:00<00:00, 62.27it/s, loss=-0.12823, sqweights=0.77563]
Epoch 34:  35%|###5      | 7/20 [00:00<00:00, 62.70it/s, loss=-0.12823, sqweights=0.77563]
Epoch 34:  35%|###5      | 7/20 [00:00<00:00, 62.70it/s, loss=-0.12418, sqweights=0.78051]
Epoch 34:  40%|####      | 8/20 [00:00<00:00, 62.70it/s, loss=-0.12477, sqweights=0.78011]
Epoch 34:  45%|####5     | 9/20 [00:00<00:00, 62.70it/s, loss=-0.12312, sqweights=0.77895]
Epoch 34:  50%|#####     | 10/20 [00:00<00:00, 62.70it/s, loss=-0.12440, sqweights=0.77896]
Epoch 34:  55%|#####5    | 11/20 [00:00<00:00, 62.70it/s, loss=-0.12242, sqweights=0.77810]
Epoch 34:  60%|######    | 12/20 [00:00<00:00, 62.70it/s, loss=-0.12372, sqweights=0.77726]
Epoch 34:  65%|######5   | 13/20 [00:00<00:00, 62.70it/s, loss=-0.12335, sqweights=0.77693]
Epoch 34:  70%|#######   | 14/20 [00:00<00:00, 64.49it/s, loss=-0.12335, sqweights=0.77693]
Epoch 34:  70%|#######   | 14/20 [00:00<00:00, 64.49it/s, loss=-0.12301, sqweights=0.77730]
Epoch 34:  75%|#######5  | 15/20 [00:00<00:00, 64.49it/s, loss=-0.12277, sqweights=0.77655]
Epoch 34:  80%|########  | 16/20 [00:00<00:00, 64.49it/s, loss=-0.12395, sqweights=0.77742]
Epoch 34:  85%|########5 | 17/20 [00:00<00:00, 64.49it/s, loss=-0.12553, sqweights=0.77770]
Epoch 34:  90%|######### | 18/20 [00:00<00:00, 64.49it/s, loss=-0.12614, sqweights=0.77760]
Epoch 34:  95%|#########5| 19/20 [00:00<00:00, 64.49it/s, loss=-0.12586, sqweights=0.77807]
Epoch 34: 100%|##########| 20/20 [00:00<00:00, 64.49it/s, loss=-0.12656, sqweights=0.77775]
Epoch 34: 100%|##########| 20/20 [00:00<00:00, 64.49it/s, loss=-0.12656, sqweights=0.77775, train_loss=-0.17830, train_sqweights=0.71368, val_loss=-0.14049, val_sqweights=0.70585]
Epoch 34: 100%|##########| 20/20 [00:00<00:00, 21.12it/s, loss=-0.12656, sqweights=0.77775, train_loss=-0.17830, train_sqweights=0.71368, val_loss=-0.14049, val_sqweights=0.70585]

Epoch 35:   0%|          | 0/20 [00:00<?, ?it/s]
Epoch 35:   5%|5         | 1/20 [00:00<00:00, 48.15it/s, loss=-0.12864, sqweights=0.78205]
Epoch 35:  10%|#         | 2/20 [00:00<00:00, 55.58it/s, loss=-0.14016, sqweights=0.78308]
Epoch 35:  15%|#5        | 3/20 [00:00<00:00, 58.32it/s, loss=-0.13643, sqweights=0.78184]
Epoch 35:  20%|##        | 4/20 [00:00<00:00, 60.14it/s, loss=-0.13149, sqweights=0.77668]
Epoch 35:  25%|##5       | 5/20 [00:00<00:00, 61.00it/s, loss=-0.12778, sqweights=0.77462]
Epoch 35:  30%|###       | 6/20 [00:00<00:00, 61.88it/s, loss=-0.12557, sqweights=0.77656]
Epoch 35:  35%|###5      | 7/20 [00:00<00:00, 62.27it/s, loss=-0.12557, sqweights=0.77656]
Epoch 35:  35%|###5      | 7/20 [00:00<00:00, 62.27it/s, loss=-0.12502, sqweights=0.77585]
Epoch 35:  40%|####      | 8/20 [00:00<00:00, 62.27it/s, loss=-0.12160, sqweights=0.77731]
Epoch 35:  45%|####5     | 9/20 [00:00<00:00, 62.27it/s, loss=-0.12744, sqweights=0.77763]
Epoch 35:  50%|#####     | 10/20 [00:00<00:00, 62.27it/s, loss=-0.12637, sqweights=0.77667]
Epoch 35:  55%|#####5    | 11/20 [00:00<00:00, 62.27it/s, loss=-0.12754, sqweights=0.77724]
Epoch 35:  60%|######    | 12/20 [00:00<00:00, 62.27it/s, loss=-0.12826, sqweights=0.77748]
Epoch 35:  65%|######5   | 13/20 [00:00<00:00, 62.27it/s, loss=-0.12708, sqweights=0.77690]
Epoch 35:  70%|#######   | 14/20 [00:00<00:00, 64.45it/s, loss=-0.12708, sqweights=0.77690]
Epoch 35:  70%|#######   | 14/20 [00:00<00:00, 64.45it/s, loss=-0.12727, sqweights=0.77724]
Epoch 35:  75%|#######5  | 15/20 [00:00<00:00, 64.45it/s, loss=-0.12587, sqweights=0.77780]
Epoch 35:  80%|########  | 16/20 [00:00<00:00, 64.45it/s, loss=-0.12542, sqweights=0.77894]
Epoch 35:  85%|########5 | 17/20 [00:00<00:00, 64.45it/s, loss=-0.12523, sqweights=0.77943]
Epoch 35:  90%|######### | 18/20 [00:00<00:00, 64.45it/s, loss=-0.12420, sqweights=0.77938]
Epoch 35:  95%|#########5| 19/20 [00:00<00:00, 64.45it/s, loss=-0.12445, sqweights=0.77954]
Epoch 35: 100%|##########| 20/20 [00:00<00:00, 64.45it/s, loss=-0.12175, sqweights=0.78087]
Epoch 35: 100%|##########| 20/20 [00:00<00:00, 64.45it/s, loss=-0.12175, sqweights=0.78087, train_loss=-0.17931, train_sqweights=0.72477, val_loss=-0.14091, val_sqweights=0.71813]
Epoch 35: 100%|##########| 20/20 [00:00<00:00, 23.15it/s, loss=-0.12175, sqweights=0.78087, train_loss=-0.17931, train_sqweights=0.72477, val_loss=-0.14091, val_sqweights=0.71813]

Epoch 36: 100%|##########| 20/20 [00:00<00:00, 23.21it/s, loss=-0.12216, sqweights=0.79527, train_loss=-0.18013, train_sqweights=0.73527, val_loss=-0.14157, val_sqweights=0.72981]

Epoch 37: 100%|##########| 20/20 [00:00<00:00, 21.09it/s, loss=-0.12675, sqweights=0.80672, train_loss=-0.18092, train_sqweights=0.74587, val_loss=-0.14174, val_sqweights=0.74047]

Epoch 38: 100%|##########| 20/20 [00:00<00:00, 23.17it/s, loss=-0.12958, sqweights=0.81021, train_loss=-0.18177, train_sqweights=0.75785, val_loss=-0.14198, val_sqweights=0.75335]

Epoch 39: 100%|##########| 20/20 [00:00<00:00, 23.15it/s, loss=-0.12951, sqweights=0.82506, train_loss=-0.18299, train_sqweights=0.77106, val_loss=-0.14267, val_sqweights=0.76645]

import numpy as np
import torch

import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VARProcess, forecast

from deepdow.benchmarks import OneOverN, Benchmark, InverseVolatility, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader
from deepdow.losses import MeanReturns, SquaredWeights
from deepdow.nn import LinearNet
from deepdow.experiments import Run


class VARTrue(Benchmark):
    """Benchmark representing the ground truth return process.

    Parameters
    ----------
    process : statsmodels.tsa.vector_ar.var_model.VARProcess
        The ground truth VAR process that generates the returns.

    """

    def __init__(self, process):
        self.process = process

    def __call__(self, x):
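        """Invest everything into the asset with the highest one-step forecast."""
        n_samples, n_channels, lookback, n_assets = x.shape

        assert n_channels == 1, 'VARTrue expects a single channel (returns)'

        # Move to CPU before converting so the benchmark also works with GPU tensors
        x_np = x.detach().cpu().numpy()  # (n_samples, n_channels, lookback, n_assets)
        # One-step forecast from the true coefficients; argmax picks the best asset
        weights_list = [forecast(x_np[i, 0], self.process.coefs, None, 1).argmax() for i in range(n_samples)]

        # One-hot weights: each sample puts its whole budget into the chosen asset
        result = torch.zeros(n_samples, n_assets, dtype=x.dtype, device=x.device)

        for i, w_ix in enumerate(weights_list):
            result[i, w_ix] = 1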

        return result


coefs = np.load('var_coefs.npy')  # (lookback, n_assets, n_assets) = (12, 8, 8)

# Parameters
lookback, _, n_assets = coefs.shape
gap, horizon = 0, 1
batch_size = 256

# Simulate 10000 timesteps of returns (Gaussian noise with covariance 1e-5 * I)
process = VARProcess(coefs, None, np.eye(n_assets) * 1e-5)
data = process.simulate_var(10000)
n_timesteps = len(data)

# Create features and targets
X_list, y_list = [], []

for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(data[i - lookback: i, :])
    y_list.append(data[i + gap: i + gap + horizon, :])

X = np.stack(X_list, axis=0)[:, None, ...]
y = np.stack(y_list, axis=0)[:, None, ...]

# Setup deepdow framework
dataset = InRAMDataset(X, y)

network = LinearNet(1, lookback, n_assets, p=0.5)
dataloader = RigidDataLoader(dataset,
                             indices=list(range(5000)),
                             batch_size=batch_size,
                             lookback=lookback)
val_dataloaders = {'train': dataloader,
                   'val': RigidDataLoader(dataset,
                                          indices=list(range(5020, 9800)),
                                          batch_size=batch_size,
                                          lookback=lookback)}

run = Run(network,
          100 * MeanReturns(),
          dataloader,
          val_dataloaders=val_dataloaders,
          metrics={'sqweights': SquaredWeights()},
          benchmarks={'1overN': OneOverN(),
                      'VAR': VARTrue(process),
                      'Random': Random(),
                      'InverseVol': InverseVolatility()},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback('val', 'loss')]
          )

history = run.launch(40)

fig, ax = plt.subplots(1, 1)
ax.set_title('Validation loss')

per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value'].mean()['val']['loss']
our = per_epoch_results['network']
our.plot(ax=ax, label='network')

ax.hlines(y=per_epoch_results['VAR'], xmin=0, xmax=len(our), color='red', label='VAR')
ax.hlines(y=per_epoch_results['1overN'], xmin=0, xmax=len(our), color='green', label='1overN')
ax.hlines(y=per_epoch_results['Random'], xmin=0, xmax=len(our), color='yellow', label='Random')
ax.hlines(y=per_epoch_results['InverseVol'], xmin=0, xmax=len(our), color='black', label='InverseVol')

plt.legend()
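
The VARTrue rule used in the script can be illustrated without statsmodels or deepdow. The following is a minimal sketch, assuming a zero-intercept VAR(p) and a lookback window whose rows are ordered oldest first, as in the script. The helper names (one_step_forecast, one_hot_allocation) and the toy numbers are hypothetical and chosen only for illustration; they are not part of deepdow and are not the coefficients stored in var_coefs.npy.

```python
import numpy as np


def one_step_forecast(window, coefs):
    """Return A_1 r_{t-1} + ... + A_p r_{t-p} for a zero-intercept VAR(p).

    window : array of shape (lookback, n_assets), oldest observation first
    coefs : array of shape (p, n_assets, n_assets), coefs[k - 1] is A_k
    """
    p = coefs.shape[0]
    if window.shape[0] < p:
        raise ValueError("window must contain at least p observations")
    # window[-k] is the return observed k steps before the forecast date
    return sum(coefs[k - 1] @ window[-k] for k in range(1, p + 1))


def one_hot_allocation(forecast):
    """Put the whole budget into the asset with the highest forecast return."""
    weights = np.zeros_like(forecast, dtype=float)
    weights[np.argmax(forecast)] = 1.0
    return weights


# Toy VAR(2) with 3 assets (hypothetical numbers for illustration only)
toy_coefs = np.array([
    [[0.5, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.1]],  # A_1
    [[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.0, 0.0, 0.0]],  # A_2
])
toy_window = np.array([
    [0.02, 0.00, 0.00],   # r_{t-2}
    [0.01, 0.03, -0.01],  # r_{t-1}
])

toy_forecast = one_step_forecast(toy_window, toy_coefs)
toy_weights = one_hot_allocation(toy_forecast)
print(toy_forecast)
print(toy_weights)
```

In this toy case the second asset has the highest forecast return, so it receives the full weight. The script applies the same idea per sample, using the statsmodels forecast function with the true coefficients of the simulated process.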

Total running time of the script: ( 0 minutes 42.128 seconds)
