Vector autoregression
This example demonstrates how one can validate deepdow
on synthetic data.
We model our returns with a vector autoregression (VAR) model,
which links future returns to lagged returns through a linear
relationship. See [Lütkepohl2005] for more details. We use a stable VAR
process with 12 lags and 8 assets.
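Written out, a VAR process with 12 lags has the standard textbook form below. The notation is an assumption made here for illustration: r_t is the vector of the 8 asset returns at time t, each A_k is an 8 x 8 coefficient matrix, and ε_t is a noise term. The second line is the usual stability condition for such a process.

```latex
\mathbf{r}_t = A_1 \mathbf{r}_{t-1} + A_2 \mathbf{r}_{t-2} + \dots + A_{12} \mathbf{r}_{t-12} + \boldsymbol{\varepsilon}_t,
\qquad \mathbf{r}_t \in \mathbb{R}^{8}, \quad A_k \in \mathbb{R}^{8 \times 8}

\det\left(I_8 - A_1 z - A_2 z^2 - \dots - A_{12} z^{12}\right) \neq 0
\quad \text{for all } |z| \le 1
```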
For this specific task, we use the LinearNet
network. It is very similar to VAR, since it also tries to find a linear
model of all lagged variables. However, it additionally has purely deep learning components such as dropout, batch
normalization, and a softmax allocator.
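The core idea, a linear map over all lagged returns followed by a softmax, can be sketched in plain Python. This is not deepdow's LinearNet implementation: the function names and the random parameters are hypothetical, and dropout and batch normalization are left out to keep the sketch short.

```python
import math
import random


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to one."""
    largest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_allocation(lagged_returns, weight_matrix, bias):
    """Map a (lookback x n_assets) window of returns to portfolio weights."""
    # Flatten the window so that every lagged variable becomes one feature.
    features = [x for row in lagged_returns for x in row]
    scores = [
        sum(w * x for w, x in zip(weight_row, features)) + b
        for weight_row, b in zip(weight_matrix, bias)
    ]
    return softmax(scores)


# Hypothetical inputs and untrained, randomly initialised parameters.
rng = random.Random(0)
lookback, n_assets = 12, 8
window = [[rng.gauss(0.0, 0.01) for _ in range(n_assets)] for _ in range(lookback)]
W = [[rng.gauss(0.0, 0.1) for _ in range(lookback * n_assets)] for _ in range(n_assets)]
b = [0.0] * n_assets

weights = linear_allocation(window, W, b)
```

Because of the softmax, the resulting weights are always positive and sum to one, so this kind of model produces long-only, fully invested portfolios.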
To put the performance of our network into context, we create a benchmark, VARTrue, that has access to the true parameters of the VAR process. It follows a simple investment rule: invest all resources into the asset with the highest future return. Additionally, we consider the following benchmarks:
- equally weighted portfolio
- inverse volatility
- random allocation
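The allocation rules above can be sketched in a few lines of plain Python. These are illustrative stand-ins, not deepdow's benchmark classes: all function names are hypothetical, the random scheme (normalised uniform draws) is only one possible choice, and `squared_weights` assumes that the sqweights metric reported in the output is the sum of squared portfolio weights.

```python
import random
import statistics


def equally_weighted(n_assets):
    """Give every asset the same weight."""
    return [1.0 / n_assets] * n_assets


def inverse_volatility(returns_history):
    """Weight each asset proportionally to 1 / (sample std of its returns)."""
    n_assets = len(returns_history[0])
    vols = [
        statistics.stdev([step[i] for step in returns_history])
        for i in range(n_assets)
    ]
    inverse = [1.0 / v for v in vols]
    total = sum(inverse)
    return [x / total for x in inverse]


def random_allocation(n_assets, rng):
    """Draw uniform numbers and normalise them to sum to one."""
    raw = [rng.random() for _ in range(n_assets)]
    total = sum(raw)
    return [x / total for x in raw]


def all_in_best(predicted_returns):
    """Put all resources into the asset with the highest predicted return."""
    best = max(range(len(predicted_returns)), key=predicted_returns.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(predicted_returns))]


def squared_weights(weights):
    """Sum of squared weights (assumed meaning of the sqweights metric)."""
    return sum(w * w for w in weights)


print(squared_weights(equally_weighted(8)))              # 0.125
print(squared_weights(all_in_best([0.01, 0.03, -0.02])))  # 1.0
```

The two printed values match the sqweights of 0.125 and 1.000 reported for the 1overN and VAR benchmarks in the output of this example, which is consistent with reading sqweights as a concentration measure.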
References
- [Lütkepohl2005] Lütkepohl, Helmut. New Introduction to Multiple Time Series Analysis. Springer Science & Business Media, 2005.
Warning
Note that we are using the statsmodels
package to simulate the VAR process.
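To see what such a simulation involves without installing statsmodels, the recursion can be written in plain Python. This is a dependency-free stand-in and does not use the statsmodels API. The function name, the noise level, and the coefficients (0.05 times the identity at every lag, chosen so that the process is stable) are all assumptions made for illustration and are not the parameters used in this example.

```python
import random


def simulate_var(coefs, n_steps, noise_std, seed=0):
    """Simulate r_t = sum_k A_k r_{t-k} + eps_t, starting from zero returns.

    coefs[k] is the n_assets x n_assets matrix applied to lag k + 1.
    """
    rng = random.Random(seed)
    n_lags = len(coefs)
    n_assets = len(coefs[0])
    history = [[0.0] * n_assets for _ in range(n_lags)]  # zero initial values
    for _ in range(n_steps):
        new_returns = []
        for i in range(n_assets):
            value = rng.gauss(0.0, noise_std)  # independent Gaussian noise
            for k in range(n_lags):
                lagged = history[-(k + 1)]  # returns observed k + 1 steps ago
                value += sum(coefs[k][i][j] * lagged[j] for j in range(n_assets))
            new_returns.append(value)
        history.append(new_returns)
    return history[n_lags:]  # drop the artificial initial values


n_lags, n_assets = 12, 8
# Hypothetical coefficients: 0.05 * identity at each of the 12 lags.
coefs = [
    [[0.05 if i == j else 0.0 for j in range(n_assets)] for i in range(n_assets)]
    for _ in range(n_lags)
]
returns = simulate_var(coefs, n_steps=500, noise_std=0.01)
```

With these diagonal coefficients each asset follows its own autoregression whose coefficients sum to 0.6 in absolute value, which is below one and therefore sufficient for stability.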
Out:
model       metric     epoch  dataloader
1overN      loss       -1     train         0.001
                              val          -0.001
            sqweights  -1     train         0.125
                              val           0.125
InverseVol  loss       -1     train         0.001
                              val          -0.002
            sqweights  -1     train         0.144
                              val           0.145
Random      loss       -1     train         0.000
                              val           0.000
            sqweights  -1     train         0.166
                              val           0.166
VAR         loss       -1     train        -0.173
                              val          -0.174
            sqweights  -1     train         1.000
                              val           1.000
Epoch 0: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 0: 5%|5 | 1/20 [00:00<00:00, 48.26it/s, loss=-0.00073, sqweights=0.16562]
Epoch 0: 10%|# | 2/20 [00:00<00:00, 54.73it/s, loss=-0.00287, sqweights=0.16567]
Epoch 0: 15%|#5 | 3/20 [00:00<00:00, 24.43it/s, loss=-0.00287, sqweights=0.16567]
Epoch 0: 15%|#5 | 3/20 [00:00<00:00, 24.43it/s, loss=-0.00210, sqweights=0.16546]
Epoch 0: 20%|## | 4/20 [00:00<00:00, 24.43it/s, loss=0.00233, sqweights=0.16570]
Epoch 0: 25%|##5 | 5/20 [00:00<00:00, 24.43it/s, loss=0.00064, sqweights=0.16639]
Epoch 0: 30%|### | 6/20 [00:00<00:00, 24.43it/s, loss=-0.00070, sqweights=0.16656]
Epoch 0: 35%|###5 | 7/20 [00:00<00:00, 24.43it/s, loss=-0.00123, sqweights=0.16696]
Epoch 0: 40%|#### | 8/20 [00:00<00:00, 24.43it/s, loss=-0.00011, sqweights=0.16724]
Epoch 0: 45%|####5 | 9/20 [00:00<00:00, 24.43it/s, loss=0.00071, sqweights=0.16691]
Epoch 0: 50%|##### | 10/20 [00:00<00:00, 47.36it/s, loss=0.00071, sqweights=0.16691]
Epoch 0: 50%|##### | 10/20 [00:00<00:00, 47.36it/s, loss=-0.00059, sqweights=0.16709]
Epoch 0: 55%|#####5 | 11/20 [00:00<00:00, 47.36it/s, loss=0.00039, sqweights=0.16712]
Epoch 0: 60%|###### | 12/20 [00:00<00:00, 47.36it/s, loss=0.00098, sqweights=0.16723]
Epoch 0: 65%|######5 | 13/20 [00:00<00:00, 47.36it/s, loss=0.00154, sqweights=0.16747]
Epoch 0: 70%|####### | 14/20 [00:00<00:00, 47.36it/s, loss=0.00035, sqweights=0.16746]
Epoch 0: 75%|#######5 | 15/20 [00:00<00:00, 47.36it/s, loss=0.00071, sqweights=0.16745]
Epoch 0: 80%|######## | 16/20 [00:00<00:00, 47.36it/s, loss=0.00016, sqweights=0.16770]
Epoch 0: 85%|########5 | 17/20 [00:00<00:00, 55.10it/s, loss=0.00016, sqweights=0.16770]
Epoch 0: 85%|########5 | 17/20 [00:00<00:00, 55.10it/s, loss=-0.00012, sqweights=0.16771]
Epoch 0: 90%|######### | 18/20 [00:00<00:00, 55.10it/s, loss=-0.00129, sqweights=0.16774]
Epoch 0: 95%|#########5| 19/20 [00:00<00:00, 55.10it/s, loss=-0.00113, sqweights=0.16795]
Epoch 0: 100%|##########| 20/20 [00:00<00:00, 55.10it/s, loss=-0.00089, sqweights=0.16791]
Epoch 0: 100%|##########| 20/20 [00:00<00:00, 55.10it/s, loss=-0.00089, sqweights=0.16791, train_loss=0.00094, train_sqweights=0.12549, val_loss=-0.00057, val_sqweights=0.12549]
Epoch 0: 100%|##########| 20/20 [00:00<00:00, 55.10it/s, loss=-0.00089, sqweights=0.16791, train_loss=0.00094, train_sqweights=0.12549, val_loss=-0.00057, val_sqweights=0.12549]
Epoch 0: 100%|##########| 20/20 [00:00<00:00, 21.34it/s, loss=-0.00089, sqweights=0.16791, train_loss=0.00094, train_sqweights=0.12549, val_loss=-0.00057, val_sqweights=0.12549]
Epoch 1: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 1: 5%|5 | 1/20 [00:00<00:00, 48.28it/s, loss=0.00714, sqweights=0.17062]
Epoch 1: 10%|# | 2/20 [00:00<00:00, 55.89it/s, loss=-0.00702, sqweights=0.16930]
Epoch 1: 15%|#5 | 3/20 [00:00<00:00, 59.07it/s, loss=-0.00641, sqweights=0.16876]
Epoch 1: 20%|## | 4/20 [00:00<00:00, 60.81it/s, loss=-0.00352, sqweights=0.16840]
Epoch 1: 25%|##5 | 5/20 [00:00<00:00, 61.75it/s, loss=-0.00707, sqweights=0.16876]
Epoch 1: 30%|### | 6/20 [00:00<00:00, 62.38it/s, loss=-0.00688, sqweights=0.16814]
Epoch 1: 35%|###5 | 7/20 [00:00<00:00, 62.99it/s, loss=-0.00688, sqweights=0.16814]
Epoch 1: 35%|###5 | 7/20 [00:00<00:00, 62.99it/s, loss=-0.00745, sqweights=0.16818]
Epoch 1: 40%|#### | 8/20 [00:00<00:00, 62.99it/s, loss=-0.00909, sqweights=0.16820]
Epoch 1: 45%|####5 | 9/20 [00:00<00:00, 62.99it/s, loss=-0.00936, sqweights=0.16836]
Epoch 1: 50%|##### | 10/20 [00:00<00:00, 62.99it/s, loss=-0.00923, sqweights=0.16866]
Epoch 1: 55%|#####5 | 11/20 [00:00<00:00, 62.99it/s, loss=-0.01065, sqweights=0.16898]
Epoch 1: 60%|###### | 12/20 [00:00<00:00, 62.99it/s, loss=-0.01114, sqweights=0.16935]
Epoch 1: 65%|######5 | 13/20 [00:00<00:00, 62.99it/s, loss=-0.01099, sqweights=0.16967]
Epoch 1: 70%|####### | 14/20 [00:00<00:00, 65.04it/s, loss=-0.01099, sqweights=0.16967]
Epoch 1: 70%|####### | 14/20 [00:00<00:00, 65.04it/s, loss=-0.01002, sqweights=0.16997]
Epoch 1: 75%|#######5 | 15/20 [00:00<00:00, 65.04it/s, loss=-0.01012, sqweights=0.16993]
Epoch 1: 80%|######## | 16/20 [00:00<00:00, 65.04it/s, loss=-0.01024, sqweights=0.16999]
Epoch 1: 85%|########5 | 17/20 [00:00<00:00, 65.04it/s, loss=-0.00971, sqweights=0.17009]
Epoch 1: 90%|######### | 18/20 [00:00<00:00, 65.04it/s, loss=-0.00941, sqweights=0.17010]
Epoch 1: 95%|#########5| 19/20 [00:00<00:00, 65.04it/s, loss=-0.00891, sqweights=0.17018]
Epoch 1: 100%|##########| 20/20 [00:00<00:00, 65.04it/s, loss=-0.00846, sqweights=0.17010]
Epoch 1: 100%|##########| 20/20 [00:00<00:00, 65.04it/s, loss=-0.00846, sqweights=0.17010, train_loss=0.00050, train_sqweights=0.12562, val_loss=-0.00090, val_sqweights=0.12563]
Epoch 1: 100%|##########| 20/20 [00:00<00:00, 65.04it/s, loss=-0.00846, sqweights=0.17010, train_loss=0.00050, train_sqweights=0.12562, val_loss=-0.00090, val_sqweights=0.12563]
Epoch 1: 100%|##########| 20/20 [00:00<00:00, 23.22it/s, loss=-0.00846, sqweights=0.17010, train_loss=0.00050, train_sqweights=0.12562, val_loss=-0.00090, val_sqweights=0.12563]
Epoch 2: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 2: 5%|5 | 1/20 [00:00<00:00, 47.44it/s, loss=-0.01247, sqweights=0.17019]
Epoch 2: 10%|# | 2/20 [00:00<00:00, 55.36it/s, loss=-0.00259, sqweights=0.17309]
Epoch 2: 15%|#5 | 3/20 [00:00<00:00, 58.66it/s, loss=-0.00639, sqweights=0.17346]
Epoch 2: 20%|## | 4/20 [00:00<00:00, 60.48it/s, loss=-0.01129, sqweights=0.17446]
Epoch 2: 25%|##5 | 5/20 [00:00<00:00, 61.03it/s, loss=-0.01252, sqweights=0.17383]
Epoch 2: 30%|### | 6/20 [00:00<00:00, 61.91it/s, loss=-0.01182, sqweights=0.17386]
Epoch 2: 35%|###5 | 7/20 [00:00<00:00, 62.48it/s, loss=-0.01182, sqweights=0.17386]
Epoch 2: 35%|###5 | 7/20 [00:00<00:00, 62.48it/s, loss=-0.01307, sqweights=0.17373]
Epoch 2: 40%|#### | 8/20 [00:00<00:00, 62.48it/s, loss=-0.01297, sqweights=0.17419]
Epoch 2: 45%|####5 | 9/20 [00:00<00:00, 62.48it/s, loss=-0.01251, sqweights=0.17465]
Epoch 2: 50%|##### | 10/20 [00:00<00:00, 62.48it/s, loss=-0.01242, sqweights=0.17460]
Epoch 2: 55%|#####5 | 11/20 [00:00<00:00, 62.48it/s, loss=-0.01039, sqweights=0.17455]
Epoch 2: 60%|###### | 12/20 [00:00<00:00, 62.48it/s, loss=-0.00968, sqweights=0.17424]
Epoch 2: 65%|######5 | 13/20 [00:00<00:00, 62.48it/s, loss=-0.00901, sqweights=0.17421]
Epoch 2: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.00901, sqweights=0.17421]
Epoch 2: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.00973, sqweights=0.17462]
Epoch 2: 75%|#######5 | 15/20 [00:00<00:00, 64.95it/s, loss=-0.00999, sqweights=0.17486]
Epoch 2: 80%|######## | 16/20 [00:00<00:00, 64.95it/s, loss=-0.01013, sqweights=0.17568]
Epoch 2: 85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.01168, sqweights=0.17588]
Epoch 2: 90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.01186, sqweights=0.17596]
Epoch 2: 95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.01284, sqweights=0.17611]
Epoch 2: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.01353, sqweights=0.17665]
Epoch 2: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.01353, sqweights=0.17665, train_loss=-0.00124, train_sqweights=0.12610, val_loss=-0.00224, val_sqweights=0.12610]
Epoch 2: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.01353, sqweights=0.17665, train_loss=-0.00124, train_sqweights=0.12610, val_loss=-0.00224, val_sqweights=0.12610]
Epoch 2: 100%|##########| 20/20 [00:00<00:00, 23.26it/s, loss=-0.01353, sqweights=0.17665, train_loss=-0.00124, train_sqweights=0.12610, val_loss=-0.00224, val_sqweights=0.12610]
Epoch 3: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 3: 5%|5 | 1/20 [00:00<00:00, 47.41it/s, loss=-0.00594, sqweights=0.17965]
Epoch 3: 10%|# | 2/20 [00:00<00:00, 55.51it/s, loss=-0.00667, sqweights=0.18086]
Epoch 3: 15%|#5 | 3/20 [00:00<00:00, 58.76it/s, loss=-0.01056, sqweights=0.18173]
Epoch 3: 20%|## | 4/20 [00:00<00:00, 60.33it/s, loss=-0.01214, sqweights=0.18164]
Epoch 3: 25%|##5 | 5/20 [00:00<00:00, 61.50it/s, loss=-0.01210, sqweights=0.18157]
Epoch 3: 30%|### | 6/20 [00:00<00:00, 62.35it/s, loss=-0.01416, sqweights=0.18190]
Epoch 3: 35%|###5 | 7/20 [00:00<00:00, 62.97it/s, loss=-0.01416, sqweights=0.18190]
Epoch 3: 35%|###5 | 7/20 [00:00<00:00, 62.97it/s, loss=-0.01541, sqweights=0.18159]
Epoch 3: 40%|#### | 8/20 [00:00<00:00, 62.97it/s, loss=-0.01638, sqweights=0.18150]
Epoch 3: 45%|####5 | 9/20 [00:00<00:00, 62.97it/s, loss=-0.01681, sqweights=0.18144]
Epoch 3: 50%|##### | 10/20 [00:00<00:00, 62.97it/s, loss=-0.01782, sqweights=0.18171]
Epoch 3: 55%|#####5 | 11/20 [00:00<00:00, 62.97it/s, loss=-0.02016, sqweights=0.18200]
Epoch 3: 60%|###### | 12/20 [00:00<00:00, 62.97it/s, loss=-0.02175, sqweights=0.18246]
Epoch 3: 65%|######5 | 13/20 [00:00<00:00, 62.97it/s, loss=-0.02144, sqweights=0.18269]
Epoch 3: 70%|####### | 14/20 [00:00<00:00, 46.64it/s, loss=-0.02144, sqweights=0.18269]
Epoch 3: 70%|####### | 14/20 [00:00<00:00, 46.64it/s, loss=-0.02170, sqweights=0.18283]
Epoch 3: 75%|#######5 | 15/20 [00:00<00:00, 46.64it/s, loss=-0.02162, sqweights=0.18297]
Epoch 3: 80%|######## | 16/20 [00:00<00:00, 46.64it/s, loss=-0.02147, sqweights=0.18374]
Epoch 3: 85%|########5 | 17/20 [00:00<00:00, 46.64it/s, loss=-0.02168, sqweights=0.18399]
Epoch 3: 90%|######### | 18/20 [00:00<00:00, 46.64it/s, loss=-0.02224, sqweights=0.18398]
Epoch 3: 95%|#########5| 19/20 [00:00<00:00, 46.64it/s, loss=-0.02130, sqweights=0.18428]
Epoch 3: 100%|##########| 20/20 [00:00<00:00, 46.64it/s, loss=-0.02061, sqweights=0.18466]
Epoch 3: 100%|##########| 20/20 [00:00<00:00, 46.64it/s, loss=-0.02061, sqweights=0.18466, train_loss=-0.00750, train_sqweights=0.12905, val_loss=-0.00721, val_sqweights=0.12900]
Epoch 3: 100%|##########| 20/20 [00:00<00:00, 46.64it/s, loss=-0.02061, sqweights=0.18466, train_loss=-0.00750, train_sqweights=0.12905, val_loss=-0.00721, val_sqweights=0.12900]
Epoch 3: 100%|##########| 20/20 [00:00<00:00, 21.45it/s, loss=-0.02061, sqweights=0.18466, train_loss=-0.00750, train_sqweights=0.12905, val_loss=-0.00721, val_sqweights=0.12900]
Epoch 4: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 4: 5%|5 | 1/20 [00:00<00:00, 48.47it/s, loss=-0.03263, sqweights=0.18674]
Epoch 4: 10%|# | 2/20 [00:00<00:00, 55.88it/s, loss=-0.03378, sqweights=0.18761]
Epoch 4: 15%|#5 | 3/20 [00:00<00:00, 58.58it/s, loss=-0.02888, sqweights=0.18915]
Epoch 4: 20%|## | 4/20 [00:00<00:00, 60.16it/s, loss=-0.03058, sqweights=0.18974]
Epoch 4: 25%|##5 | 5/20 [00:00<00:00, 61.37it/s, loss=-0.02790, sqweights=0.19008]
Epoch 4: 30%|### | 6/20 [00:00<00:00, 62.18it/s, loss=-0.02984, sqweights=0.19055]
Epoch 4: 35%|###5 | 7/20 [00:00<00:00, 62.78it/s, loss=-0.02984, sqweights=0.19055]
Epoch 4: 35%|###5 | 7/20 [00:00<00:00, 62.78it/s, loss=-0.02973, sqweights=0.19170]
Epoch 4: 40%|#### | 8/20 [00:00<00:00, 62.78it/s, loss=-0.02761, sqweights=0.19189]
Epoch 4: 45%|####5 | 9/20 [00:00<00:00, 62.78it/s, loss=-0.02741, sqweights=0.19285]
Epoch 4: 50%|##### | 10/20 [00:00<00:00, 62.78it/s, loss=-0.02864, sqweights=0.19266]
Epoch 4: 55%|#####5 | 11/20 [00:00<00:00, 62.78it/s, loss=-0.02779, sqweights=0.19319]
Epoch 4: 60%|###### | 12/20 [00:00<00:00, 62.78it/s, loss=-0.02879, sqweights=0.19370]
Epoch 4: 65%|######5 | 13/20 [00:00<00:00, 62.78it/s, loss=-0.02852, sqweights=0.19445]
Epoch 4: 70%|####### | 14/20 [00:00<00:00, 64.67it/s, loss=-0.02852, sqweights=0.19445]
Epoch 4: 70%|####### | 14/20 [00:00<00:00, 64.67it/s, loss=-0.02968, sqweights=0.19489]
Epoch 4: 75%|#######5 | 15/20 [00:00<00:00, 64.67it/s, loss=-0.02975, sqweights=0.19525]
Epoch 4: 80%|######## | 16/20 [00:00<00:00, 64.67it/s, loss=-0.02986, sqweights=0.19559]
Epoch 4: 85%|########5 | 17/20 [00:00<00:00, 64.67it/s, loss=-0.02901, sqweights=0.19565]
Epoch 4: 90%|######### | 18/20 [00:00<00:00, 64.67it/s, loss=-0.03082, sqweights=0.19607]
Epoch 4: 95%|#########5| 19/20 [00:00<00:00, 64.67it/s, loss=-0.03118, sqweights=0.19706]
Epoch 4: 100%|##########| 20/20 [00:00<00:00, 64.67it/s, loss=-0.03172, sqweights=0.19712]
Epoch 4: 100%|##########| 20/20 [00:00<00:00, 64.67it/s, loss=-0.03172, sqweights=0.19712, train_loss=-0.02353, train_sqweights=0.14478, val_loss=-0.02007, val_sqweights=0.14437]
Epoch 4: 100%|##########| 20/20 [00:00<00:00, 64.67it/s, loss=-0.03172, sqweights=0.19712, train_loss=-0.02353, train_sqweights=0.14478, val_loss=-0.02007, val_sqweights=0.14437]
Epoch 4: 100%|##########| 20/20 [00:00<00:00, 23.24it/s, loss=-0.03172, sqweights=0.19712, train_loss=-0.02353, train_sqweights=0.14478, val_loss=-0.02007, val_sqweights=0.14437]
Epoch 5: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 5: 5%|5 | 1/20 [00:00<00:00, 48.87it/s, loss=-0.03289, sqweights=0.19800]
Epoch 5: 10%|# | 2/20 [00:00<00:00, 56.31it/s, loss=-0.02880, sqweights=0.19834]
Epoch 5: 15%|#5 | 3/20 [00:00<00:00, 58.66it/s, loss=-0.03424, sqweights=0.20072]
Epoch 5: 20%|## | 4/20 [00:00<00:00, 60.47it/s, loss=-0.02995, sqweights=0.20181]
Epoch 5: 25%|##5 | 5/20 [00:00<00:00, 61.64it/s, loss=-0.02977, sqweights=0.20331]
Epoch 5: 30%|### | 6/20 [00:00<00:00, 62.42it/s, loss=-0.03077, sqweights=0.20445]
Epoch 5: 35%|###5 | 7/20 [00:00<00:00, 62.93it/s, loss=-0.03077, sqweights=0.20445]
Epoch 5: 35%|###5 | 7/20 [00:00<00:00, 62.93it/s, loss=-0.02941, sqweights=0.20508]
Epoch 5: 40%|#### | 8/20 [00:00<00:00, 62.93it/s, loss=-0.02796, sqweights=0.20516]
Epoch 5: 45%|####5 | 9/20 [00:00<00:00, 62.93it/s, loss=-0.02927, sqweights=0.20511]
Epoch 5: 50%|##### | 10/20 [00:00<00:00, 62.93it/s, loss=-0.02927, sqweights=0.20546]
Epoch 5: 55%|#####5 | 11/20 [00:00<00:00, 62.93it/s, loss=-0.03093, sqweights=0.20604]
Epoch 5: 60%|###### | 12/20 [00:00<00:00, 62.93it/s, loss=-0.03245, sqweights=0.20607]
Epoch 5: 65%|######5 | 13/20 [00:00<00:00, 62.93it/s, loss=-0.03233, sqweights=0.20628]
Epoch 5: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.03233, sqweights=0.20628]
Epoch 5: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.03398, sqweights=0.20662]
Epoch 5: 75%|#######5 | 15/20 [00:00<00:00, 64.95it/s, loss=-0.03454, sqweights=0.20745]
Epoch 5: 80%|######## | 16/20 [00:00<00:00, 64.95it/s, loss=-0.03502, sqweights=0.20787]
Epoch 5: 85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.03548, sqweights=0.20841]
Epoch 5: 90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.03521, sqweights=0.20887]
Epoch 5: 95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.03570, sqweights=0.20929]
Epoch 5: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.03510, sqweights=0.20913]
Epoch 5: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.03510, sqweights=0.20913, train_loss=-0.04063, train_sqweights=0.17116, val_loss=-0.03392, val_sqweights=0.17005]
Epoch 5: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.03510, sqweights=0.20913, train_loss=-0.04063, train_sqweights=0.17116, val_loss=-0.03392, val_sqweights=0.17005]
Epoch 5: 100%|##########| 20/20 [00:00<00:00, 23.28it/s, loss=-0.03510, sqweights=0.20913, train_loss=-0.04063, train_sqweights=0.17116, val_loss=-0.03392, val_sqweights=0.17005]
Epoch 6: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 6: 5%|5 | 1/20 [00:00<00:00, 48.57it/s, loss=-0.03736, sqweights=0.22457]
Epoch 6: 10%|# | 2/20 [00:00<00:00, 55.93it/s, loss=-0.04413, sqweights=0.22210]
Epoch 6: 15%|#5 | 3/20 [00:00<00:00, 58.82it/s, loss=-0.04409, sqweights=0.22338]
Epoch 6: 20%|## | 4/20 [00:00<00:00, 60.57it/s, loss=-0.04156, sqweights=0.22513]
Epoch 6: 25%|##5 | 5/20 [00:00<00:00, 61.71it/s, loss=-0.04160, sqweights=0.22603]
Epoch 6: 30%|### | 6/20 [00:00<00:00, 62.46it/s, loss=-0.04079, sqweights=0.22613]
Epoch 6: 35%|###5 | 7/20 [00:00<00:00, 63.00it/s, loss=-0.04079, sqweights=0.22613]
Epoch 6: 35%|###5 | 7/20 [00:00<00:00, 63.00it/s, loss=-0.04284, sqweights=0.22647]
Epoch 6: 40%|#### | 8/20 [00:00<00:00, 63.00it/s, loss=-0.04251, sqweights=0.22566]
Epoch 6: 45%|####5 | 9/20 [00:00<00:00, 63.00it/s, loss=-0.04233, sqweights=0.22548]
Epoch 6: 50%|##### | 10/20 [00:00<00:00, 63.00it/s, loss=-0.04204, sqweights=0.22502]
Epoch 6: 55%|#####5 | 11/20 [00:00<00:00, 63.00it/s, loss=-0.04084, sqweights=0.22452]
Epoch 6: 60%|###### | 12/20 [00:00<00:00, 63.00it/s, loss=-0.04225, sqweights=0.22520]
Epoch 6: 65%|######5 | 13/20 [00:00<00:00, 63.00it/s, loss=-0.04236, sqweights=0.22566]
Epoch 6: 70%|####### | 14/20 [00:00<00:00, 64.40it/s, loss=-0.04236, sqweights=0.22566]
Epoch 6: 70%|####### | 14/20 [00:00<00:00, 64.40it/s, loss=-0.04280, sqweights=0.22569]
Epoch 6: 75%|#######5 | 15/20 [00:00<00:00, 64.40it/s, loss=-0.04166, sqweights=0.22653]
Epoch 6: 80%|######## | 16/20 [00:00<00:00, 64.40it/s, loss=-0.04165, sqweights=0.22711]
Epoch 6: 85%|########5 | 17/20 [00:00<00:00, 64.40it/s, loss=-0.04157, sqweights=0.22790]
Epoch 6: 90%|######### | 18/20 [00:00<00:00, 64.40it/s, loss=-0.04233, sqweights=0.22851]
Epoch 6: 95%|#########5| 19/20 [00:00<00:00, 64.40it/s, loss=-0.04258, sqweights=0.22854]
Epoch 6: 100%|##########| 20/20 [00:00<00:00, 64.40it/s, loss=-0.04151, sqweights=0.22852]
Epoch 6: 100%|##########| 20/20 [00:00<00:00, 64.40it/s, loss=-0.04151, sqweights=0.22852, train_loss=-0.05169, train_sqweights=0.18852, val_loss=-0.04297, val_sqweights=0.18683]
Epoch 6: 100%|##########| 20/20 [00:00<00:00, 64.40it/s, loss=-0.04151, sqweights=0.22852, train_loss=-0.05169, train_sqweights=0.18852, val_loss=-0.04297, val_sqweights=0.18683]
Epoch 6: 100%|##########| 20/20 [00:00<00:00, 21.38it/s, loss=-0.04151, sqweights=0.22852, train_loss=-0.05169, train_sqweights=0.18852, val_loss=-0.04297, val_sqweights=0.18683]
Epoch 7: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 7: 5%|5 | 1/20 [00:00<00:00, 48.28it/s, loss=-0.05094, sqweights=0.24110]
Epoch 7: 10%|# | 2/20 [00:00<00:00, 55.22it/s, loss=-0.04334, sqweights=0.24224]
Epoch 7: 15%|#5 | 3/20 [00:00<00:00, 58.33it/s, loss=-0.03868, sqweights=0.24382]
Epoch 7: 20%|## | 4/20 [00:00<00:00, 60.13it/s, loss=-0.03967, sqweights=0.24180]
Epoch 7: 25%|##5 | 5/20 [00:00<00:00, 60.91it/s, loss=-0.04211, sqweights=0.24393]
Epoch 7: 30%|### | 6/20 [00:00<00:00, 61.62it/s, loss=-0.04368, sqweights=0.24600]
Epoch 7: 35%|###5 | 7/20 [00:00<00:00, 62.14it/s, loss=-0.04368, sqweights=0.24600]
Epoch 7: 35%|###5 | 7/20 [00:00<00:00, 62.14it/s, loss=-0.04502, sqweights=0.24593]
Epoch 7: 40%|#### | 8/20 [00:00<00:00, 62.14it/s, loss=-0.04501, sqweights=0.24651]
Epoch 7: 45%|####5 | 9/20 [00:00<00:00, 62.14it/s, loss=-0.04461, sqweights=0.24615]
Epoch 7: 50%|##### | 10/20 [00:00<00:00, 62.14it/s, loss=-0.04638, sqweights=0.24703]
Epoch 7: 55%|#####5 | 11/20 [00:00<00:00, 62.14it/s, loss=-0.04806, sqweights=0.24667]
Epoch 7: 60%|###### | 12/20 [00:00<00:00, 62.14it/s, loss=-0.04972, sqweights=0.24696]
Epoch 7: 65%|######5 | 13/20 [00:00<00:00, 62.14it/s, loss=-0.04989, sqweights=0.24748]
Epoch 7: 70%|####### | 14/20 [00:00<00:00, 64.82it/s, loss=-0.04989, sqweights=0.24748]
Epoch 7: 70%|####### | 14/20 [00:00<00:00, 64.82it/s, loss=-0.05053, sqweights=0.24776]
Epoch 7: 75%|#######5 | 15/20 [00:00<00:00, 64.82it/s, loss=-0.04909, sqweights=0.24859]
Epoch 7: 80%|######## | 16/20 [00:00<00:00, 64.82it/s, loss=-0.04959, sqweights=0.24856]
Epoch 7: 85%|########5 | 17/20 [00:00<00:00, 64.82it/s, loss=-0.05020, sqweights=0.24935]
Epoch 7: 90%|######### | 18/20 [00:00<00:00, 64.82it/s, loss=-0.05091, sqweights=0.24993]
Epoch 7: 95%|#########5| 19/20 [00:00<00:00, 64.82it/s, loss=-0.05083, sqweights=0.25018]
Epoch 7: 100%|##########| 20/20 [00:00<00:00, 64.82it/s, loss=-0.05086, sqweights=0.25049]
Epoch 7: 100%|##########| 20/20 [00:00<00:00, 64.82it/s, loss=-0.05086, sqweights=0.25049, train_loss=-0.06090, train_sqweights=0.20398, val_loss=-0.05062, val_sqweights=0.20173]
Epoch 7: 100%|##########| 20/20 [00:00<00:00, 64.82it/s, loss=-0.05086, sqweights=0.25049, train_loss=-0.06090, train_sqweights=0.20398, val_loss=-0.05062, val_sqweights=0.20173]
Epoch 7: 100%|##########| 20/20 [00:00<00:00, 23.18it/s, loss=-0.05086, sqweights=0.25049, train_loss=-0.06090, train_sqweights=0.20398, val_loss=-0.05062, val_sqweights=0.20173]
Epoch 8: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 8: 5%|5 | 1/20 [00:00<00:00, 48.41it/s, loss=-0.06106, sqweights=0.25068]
Epoch 8: 10%|# | 2/20 [00:00<00:00, 56.12it/s, loss=-0.05734, sqweights=0.25583]
Epoch 8: 15%|#5 | 3/20 [00:00<00:00, 59.32it/s, loss=-0.05527, sqweights=0.25912]
Epoch 8: 20%|## | 4/20 [00:00<00:00, 61.02it/s, loss=-0.05278, sqweights=0.25860]
Epoch 8: 25%|##5 | 5/20 [00:00<00:00, 62.12it/s, loss=-0.05690, sqweights=0.26283]
Epoch 8: 30%|### | 6/20 [00:00<00:00, 62.90it/s, loss=-0.05721, sqweights=0.26475]
Epoch 8: 35%|###5 | 7/20 [00:00<00:00, 63.49it/s, loss=-0.05721, sqweights=0.26475]
Epoch 8: 35%|###5 | 7/20 [00:00<00:00, 63.49it/s, loss=-0.05784, sqweights=0.26417]
Epoch 8: 40%|#### | 8/20 [00:00<00:00, 63.49it/s, loss=-0.05832, sqweights=0.26479]
Epoch 8: 45%|####5 | 9/20 [00:00<00:00, 63.49it/s, loss=-0.06019, sqweights=0.26632]
Epoch 8: 50%|##### | 10/20 [00:00<00:00, 63.49it/s, loss=-0.06087, sqweights=0.26709]
Epoch 8: 55%|#####5 | 11/20 [00:00<00:00, 63.49it/s, loss=-0.06042, sqweights=0.26791]
Epoch 8: 60%|###### | 12/20 [00:00<00:00, 63.49it/s, loss=-0.06031, sqweights=0.26906]
Epoch 8: 65%|######5 | 13/20 [00:00<00:00, 63.49it/s, loss=-0.05926, sqweights=0.26963]
Epoch 8: 70%|####### | 14/20 [00:00<00:00, 65.30it/s, loss=-0.05926, sqweights=0.26963]
Epoch 8: 70%|####### | 14/20 [00:00<00:00, 65.30it/s, loss=-0.05858, sqweights=0.27065]
Epoch 8: 75%|#######5 | 15/20 [00:00<00:00, 65.30it/s, loss=-0.05916, sqweights=0.27099]
Epoch 8: 80%|######## | 16/20 [00:00<00:00, 65.30it/s, loss=-0.05817, sqweights=0.27153]
Epoch 8: 85%|########5 | 17/20 [00:00<00:00, 65.30it/s, loss=-0.05768, sqweights=0.27176]
Epoch 8: 90%|######### | 18/20 [00:00<00:00, 65.30it/s, loss=-0.05766, sqweights=0.27158]
Epoch 8: 95%|#########5| 19/20 [00:00<00:00, 65.30it/s, loss=-0.05858, sqweights=0.27189]
Epoch 8: 100%|##########| 20/20 [00:00<00:00, 65.30it/s, loss=-0.05826, sqweights=0.27189]
Epoch 8: 100%|##########| 20/20 [00:00<00:00, 65.30it/s, loss=-0.05826, sqweights=0.27189, train_loss=-0.07016, train_sqweights=0.22189, val_loss=-0.05829, val_sqweights=0.21920]
Epoch 8: 100%|##########| 20/20 [00:00<00:00, 65.30it/s, loss=-0.05826, sqweights=0.27189, train_loss=-0.07016, train_sqweights=0.22189, val_loss=-0.05829, val_sqweights=0.21920]
Epoch 8: 100%|##########| 20/20 [00:00<00:00, 23.28it/s, loss=-0.05826, sqweights=0.27189, train_loss=-0.07016, train_sqweights=0.22189, val_loss=-0.05829, val_sqweights=0.21920]
Epoch 9: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 9: 5%|5 | 1/20 [00:00<00:00, 48.94it/s, loss=-0.06033, sqweights=0.28802]
Epoch 9: 10%|# | 2/20 [00:00<00:00, 56.28it/s, loss=-0.05749, sqweights=0.28628]
Epoch 9: 15%|#5 | 3/20 [00:00<00:00, 59.52it/s, loss=-0.05615, sqweights=0.28486]
Epoch 9: 20%|## | 4/20 [00:00<00:00, 60.83it/s, loss=-0.05787, sqweights=0.28554]
Epoch 9: 25%|##5 | 5/20 [00:00<00:00, 61.68it/s, loss=-0.06315, sqweights=0.28808]
Epoch 9: 30%|### | 6/20 [00:00<00:00, 62.47it/s, loss=-0.06044, sqweights=0.28879]
Epoch 9: 35%|###5 | 7/20 [00:00<00:00, 62.79it/s, loss=-0.06044, sqweights=0.28879]
Epoch 9: 35%|###5 | 7/20 [00:00<00:00, 62.79it/s, loss=-0.06317, sqweights=0.29010]
Epoch 9: 40%|#### | 8/20 [00:00<00:00, 62.79it/s, loss=-0.06203, sqweights=0.29283]
Epoch 9: 45%|####5 | 9/20 [00:00<00:00, 62.79it/s, loss=-0.06338, sqweights=0.29290]
Epoch 9: 50%|##### | 10/20 [00:00<00:00, 62.79it/s, loss=-0.06290, sqweights=0.29373]
Epoch 9: 55%|#####5 | 11/20 [00:00<00:00, 62.79it/s, loss=-0.06464, sqweights=0.29454]
Epoch 9: 60%|###### | 12/20 [00:00<00:00, 62.79it/s, loss=-0.06420, sqweights=0.29603]
Epoch 9: 65%|######5 | 13/20 [00:00<00:00, 62.79it/s, loss=-0.06390, sqweights=0.29653]
Epoch 9: 70%|####### | 14/20 [00:00<00:00, 65.08it/s, loss=-0.06390, sqweights=0.29653]
Epoch 9: 70%|####### | 14/20 [00:00<00:00, 65.08it/s, loss=-0.06496, sqweights=0.29755]
Epoch 9: 75%|#######5 | 15/20 [00:00<00:00, 65.08it/s, loss=-0.06599, sqweights=0.29979]
Epoch 9: 80%|######## | 16/20 [00:00<00:00, 65.08it/s, loss=-0.06666, sqweights=0.30110]
Epoch 9: 85%|########5 | 17/20 [00:00<00:00, 65.08it/s, loss=-0.06661, sqweights=0.30176]
Epoch 9: 90%|######### | 18/20 [00:00<00:00, 65.08it/s, loss=-0.06700, sqweights=0.30194]
Epoch 9: 95%|#########5| 19/20 [00:00<00:00, 65.08it/s, loss=-0.06643, sqweights=0.30218]
Epoch 9: 100%|##########| 20/20 [00:00<00:00, 65.08it/s, loss=-0.06579, sqweights=0.30222]
Epoch 9: 100%|##########| 20/20 [00:00<00:00, 65.08it/s, loss=-0.06579, sqweights=0.30222, train_loss=-0.07917, train_sqweights=0.24050, val_loss=-0.06552, val_sqweights=0.23722]
Epoch 9: 100%|##########| 20/20 [00:00<00:00, 65.08it/s, loss=-0.06579, sqweights=0.30222, train_loss=-0.07917, train_sqweights=0.24050, val_loss=-0.06552, val_sqweights=0.23722]
Epoch 9: 100%|##########| 20/20 [00:00<00:00, 21.37it/s, loss=-0.06579, sqweights=0.30222, train_loss=-0.07917, train_sqweights=0.24050, val_loss=-0.06552, val_sqweights=0.23722]
Epoch 10: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 10: 5%|5 | 1/20 [00:00<00:00, 47.57it/s, loss=-0.06753, sqweights=0.31177]
Epoch 10: 10%|# | 2/20 [00:00<00:00, 55.41it/s, loss=-0.06837, sqweights=0.30830]
Epoch 10: 15%|#5 | 3/20 [00:00<00:00, 58.61it/s, loss=-0.05926, sqweights=0.31371]
Epoch 10: 20%|## | 4/20 [00:00<00:00, 60.45it/s, loss=-0.06154, sqweights=0.31725]
Epoch 10: 25%|##5 | 5/20 [00:00<00:00, 61.65it/s, loss=-0.06618, sqweights=0.31853]
Epoch 10: 30%|### | 6/20 [00:00<00:00, 62.45it/s, loss=-0.06801, sqweights=0.31965]
Epoch 10: 35%|###5 | 7/20 [00:00<00:00, 63.05it/s, loss=-0.06801, sqweights=0.31965]
Epoch 10: 35%|###5 | 7/20 [00:00<00:00, 63.05it/s, loss=-0.06813, sqweights=0.32024]
Epoch 10: 40%|#### | 8/20 [00:00<00:00, 63.05it/s, loss=-0.06594, sqweights=0.31786]
Epoch 10: 45%|####5 | 9/20 [00:00<00:00, 63.05it/s, loss=-0.06848, sqweights=0.31912]
Epoch 10: 50%|##### | 10/20 [00:00<00:00, 63.05it/s, loss=-0.07142, sqweights=0.31989]
Epoch 10: 55%|#####5 | 11/20 [00:00<00:00, 63.05it/s, loss=-0.07231, sqweights=0.32073]
Epoch 10: 60%|###### | 12/20 [00:00<00:00, 63.05it/s, loss=-0.07296, sqweights=0.32141]
Epoch 10: 65%|######5 | 13/20 [00:00<00:00, 63.05it/s, loss=-0.07326, sqweights=0.32151]
Epoch 10: 70%|####### | 14/20 [00:00<00:00, 65.13it/s, loss=-0.07326, sqweights=0.32151]
Epoch 10: 70%|####### | 14/20 [00:00<00:00, 65.13it/s, loss=-0.07495, sqweights=0.32216]
Epoch 10: 75%|#######5 | 15/20 [00:00<00:00, 65.13it/s, loss=-0.07602, sqweights=0.32325]
Epoch 10: 80%|######## | 16/20 [00:00<00:00, 65.13it/s, loss=-0.07510, sqweights=0.32441]
Epoch 10: 85%|########5 | 17/20 [00:00<00:00, 65.13it/s, loss=-0.07478, sqweights=0.32520]
Epoch 10: 90%|######### | 18/20 [00:00<00:00, 65.13it/s, loss=-0.07478, sqweights=0.32687]
Epoch 10: 95%|#########5| 19/20 [00:00<00:00, 65.13it/s, loss=-0.07349, sqweights=0.32719]
Epoch 10: 100%|##########| 20/20 [00:00<00:00, 65.13it/s, loss=-0.07370, sqweights=0.32880]
Epoch 10: 100%|##########| 20/20 [00:00<00:00, 65.13it/s, loss=-0.07370, sqweights=0.32880, train_loss=-0.08768, train_sqweights=0.26154, val_loss=-0.07237, val_sqweights=0.25764]
Epoch 10: 100%|##########| 20/20 [00:00<00:00, 65.13it/s, loss=-0.07370, sqweights=0.32880, train_loss=-0.08768, train_sqweights=0.26154, val_loss=-0.07237, val_sqweights=0.25764]
Epoch 10: 100%|##########| 20/20 [00:00<00:00, 23.27it/s, loss=-0.07370, sqweights=0.32880, train_loss=-0.08768, train_sqweights=0.26154, val_loss=-0.07237, val_sqweights=0.25764]
Epoch 11: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 11: 5%|5 | 1/20 [00:00<00:00, 48.43it/s, loss=-0.08275, sqweights=0.34105]
Epoch 11: 10%|# | 2/20 [00:00<00:00, 55.99it/s, loss=-0.07369, sqweights=0.33976]
Epoch 11: 15%|#5 | 3/20 [00:00<00:00, 59.22it/s, loss=-0.07831, sqweights=0.34105]
Epoch 11: 20%|## | 4/20 [00:00<00:00, 60.90it/s, loss=-0.08237, sqweights=0.34122]
Epoch 11: 25%|##5 | 5/20 [00:00<00:00, 61.90it/s, loss=-0.07932, sqweights=0.34337]
Epoch 11: 30%|### | 6/20 [00:00<00:00, 62.56it/s, loss=-0.07868, sqweights=0.34455]
Epoch 11: 35%|###5 | 7/20 [00:00<00:00, 63.08it/s, loss=-0.07868, sqweights=0.34455]
Epoch 11: 35%|###5 | 7/20 [00:00<00:00, 63.08it/s, loss=-0.07493, sqweights=0.34321]
Epoch 11: 40%|#### | 8/20 [00:00<00:00, 63.08it/s, loss=-0.07448, sqweights=0.34361]
Epoch 11: 45%|####5 | 9/20 [00:00<00:00, 63.08it/s, loss=-0.07475, sqweights=0.34491]
Epoch 11: 50%|##### | 10/20 [00:00<00:00, 63.08it/s, loss=-0.07513, sqweights=0.34512]
Epoch 11: 55%|#####5 | 11/20 [00:00<00:00, 63.08it/s, loss=-0.07662, sqweights=0.34650]
Epoch 11: 60%|###### | 12/20 [00:00<00:00, 63.08it/s, loss=-0.07747, sqweights=0.34814]
Epoch 11: 65%|######5 | 13/20 [00:00<00:00, 63.08it/s, loss=-0.07848, sqweights=0.34929]
Epoch 11: 70%|####### | 14/20 [00:00<00:00, 64.78it/s, loss=-0.07848, sqweights=0.34929]
Epoch 11: 70%|####### | 14/20 [00:00<00:00, 64.78it/s, loss=-0.07961, sqweights=0.35002]
Epoch 11: 75%|#######5 | 15/20 [00:00<00:00, 64.78it/s, loss=-0.07878, sqweights=0.35113]
Epoch 11: 80%|######## | 16/20 [00:00<00:00, 64.78it/s, loss=-0.07838, sqweights=0.35242]
Epoch 11: 85%|########5 | 17/20 [00:00<00:00, 64.78it/s, loss=-0.07842, sqweights=0.35351]
Epoch 11: 90%|######### | 18/20 [00:00<00:00, 64.78it/s, loss=-0.07788, sqweights=0.35405]
Epoch 11: 95%|#########5| 19/20 [00:00<00:00, 64.78it/s, loss=-0.07792, sqweights=0.35477]
Epoch 11: 100%|##########| 20/20 [00:00<00:00, 64.78it/s, loss=-0.07729, sqweights=0.35499]
Epoch 11: 100%|##########| 20/20 [00:00<00:00, 64.78it/s, loss=-0.07729, sqweights=0.35499, train_loss=-0.09582, train_sqweights=0.28274, val_loss=-0.07876, val_sqweights=0.27797]
Epoch 11: 100%|##########| 20/20 [00:00<00:00, 64.78it/s, loss=-0.07729, sqweights=0.35499, train_loss=-0.09582, train_sqweights=0.28274, val_loss=-0.07876, val_sqweights=0.27797]
Epoch 11: 100%|##########| 20/20 [00:00<00:00, 23.22it/s, loss=-0.07729, sqweights=0.35499, train_loss=-0.09582, train_sqweights=0.28274, val_loss=-0.07876, val_sqweights=0.27797]
Epoch 12: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 12: 5%|5 | 1/20 [00:00<00:00, 48.79it/s, loss=-0.08471, sqweights=0.36337]
Epoch 12: 10%|# | 2/20 [00:00<00:00, 56.39it/s, loss=-0.08585, sqweights=0.36754]
Epoch 12: 15%|#5 | 3/20 [00:00<00:00, 58.76it/s, loss=-0.08223, sqweights=0.37128]
Epoch 12: 20%|## | 4/20 [00:00<00:00, 60.58it/s, loss=-0.08584, sqweights=0.37608]
Epoch 12: 25%|##5 | 5/20 [00:00<00:00, 61.70it/s, loss=-0.08819, sqweights=0.37633]
Epoch 12: 30%|### | 6/20 [00:00<00:00, 62.45it/s, loss=-0.08225, sqweights=0.37474]
Epoch 12: 35%|###5 | 7/20 [00:00<00:00, 63.04it/s, loss=-0.08225, sqweights=0.37474]
Epoch 12: 35%|###5 | 7/20 [00:00<00:00, 63.04it/s, loss=-0.08323, sqweights=0.37490]
Epoch 12: 40%|#### | 8/20 [00:00<00:00, 63.04it/s, loss=-0.08275, sqweights=0.37459]
Epoch 12: 45%|####5 | 9/20 [00:00<00:00, 63.04it/s, loss=-0.08151, sqweights=0.37539]
Epoch 12: 50%|##### | 10/20 [00:00<00:00, 63.04it/s, loss=-0.08217, sqweights=0.37608]
Epoch 12: 55%|#####5 | 11/20 [00:00<00:00, 63.04it/s, loss=-0.08177, sqweights=0.37551]
Epoch 12: 60%|###### | 12/20 [00:00<00:00, 63.04it/s, loss=-0.08443, sqweights=0.37657]
Epoch 12: 65%|######5 | 13/20 [00:00<00:00, 63.04it/s, loss=-0.08378, sqweights=0.37572]
Epoch 12: 70%|####### | 14/20 [00:00<00:00, 64.92it/s, loss=-0.08411, sqweights=0.37670]
Epoch 12: 75%|#######5 | 15/20 [00:00<00:00, 64.92it/s, loss=-0.08491, sqweights=0.37815]
Epoch 12: 80%|######## | 16/20 [00:00<00:00, 64.92it/s, loss=-0.08561, sqweights=0.37964]
Epoch 12: 85%|########5 | 17/20 [00:00<00:00, 64.92it/s, loss=-0.08717, sqweights=0.37979]
Epoch 12: 90%|######### | 18/20 [00:00<00:00, 64.92it/s, loss=-0.08669, sqweights=0.38047]
Epoch 12: 95%|#########5| 19/20 [00:00<00:00, 64.92it/s, loss=-0.08588, sqweights=0.38009]
Epoch 12: 100%|##########| 20/20 [00:00<00:00, 64.92it/s, loss=-0.08574, sqweights=0.38137]
Epoch 12: 100%|##########| 20/20 [00:00<00:00, 64.92it/s, loss=-0.08574, sqweights=0.38137, train_loss=-0.10355, train_sqweights=0.30575, val_loss=-0.08490, val_sqweights=0.30023]
Epoch 12: 100%|##########| 20/20 [00:00<00:00, 21.38it/s, loss=-0.08574, sqweights=0.38137, train_loss=-0.10355, train_sqweights=0.30575, val_loss=-0.08490, val_sqweights=0.30023]
Epoch 13: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 13: 5%|5 | 1/20 [00:00<00:00, 48.05it/s, loss=-0.08998, sqweights=0.42066]
Epoch 13: 10%|# | 2/20 [00:00<00:00, 55.49it/s, loss=-0.09197, sqweights=0.40202]
Epoch 13: 15%|#5 | 3/20 [00:00<00:00, 58.76it/s, loss=-0.09240, sqweights=0.40136]
Epoch 13: 20%|## | 4/20 [00:00<00:00, 60.58it/s, loss=-0.08685, sqweights=0.39863]
Epoch 13: 25%|##5 | 5/20 [00:00<00:00, 61.69it/s, loss=-0.08815, sqweights=0.39772]
Epoch 13: 30%|### | 6/20 [00:00<00:00, 62.28it/s, loss=-0.08772, sqweights=0.39826]
Epoch 13: 35%|###5 | 7/20 [00:00<00:00, 62.92it/s, loss=-0.09007, sqweights=0.39858]
Epoch 13: 40%|#### | 8/20 [00:00<00:00, 62.92it/s, loss=-0.09228, sqweights=0.40014]
Epoch 13: 45%|####5 | 9/20 [00:00<00:00, 62.92it/s, loss=-0.09217, sqweights=0.39954]
Epoch 13: 50%|##### | 10/20 [00:00<00:00, 62.92it/s, loss=-0.09081, sqweights=0.40007]
Epoch 13: 55%|#####5 | 11/20 [00:00<00:00, 62.92it/s, loss=-0.08810, sqweights=0.40162]
Epoch 13: 60%|###### | 12/20 [00:00<00:00, 62.92it/s, loss=-0.08876, sqweights=0.40201]
Epoch 13: 65%|######5 | 13/20 [00:00<00:00, 62.92it/s, loss=-0.08826, sqweights=0.40320]
Epoch 13: 70%|####### | 14/20 [00:00<00:00, 64.84it/s, loss=-0.08858, sqweights=0.40282]
Epoch 13: 75%|#######5 | 15/20 [00:00<00:00, 64.84it/s, loss=-0.08905, sqweights=0.40405]
Epoch 13: 80%|######## | 16/20 [00:00<00:00, 64.84it/s, loss=-0.08836, sqweights=0.40530]
Epoch 13: 85%|########5 | 17/20 [00:00<00:00, 64.84it/s, loss=-0.08875, sqweights=0.40705]
Epoch 13: 90%|######### | 18/20 [00:00<00:00, 64.84it/s, loss=-0.08897, sqweights=0.40731]
Epoch 13: 95%|#########5| 19/20 [00:00<00:00, 64.84it/s, loss=-0.08892, sqweights=0.40877]
Epoch 13: 100%|##########| 20/20 [00:00<00:00, 64.84it/s, loss=-0.08891, sqweights=0.40909]
Epoch 13: 100%|##########| 20/20 [00:00<00:00, 64.84it/s, loss=-0.08891, sqweights=0.40909, train_loss=-0.11080, train_sqweights=0.32952, val_loss=-0.09047, val_sqweights=0.32356]
Epoch 13: 100%|##########| 20/20 [00:00<00:00, 23.21it/s, loss=-0.08891, sqweights=0.40909, train_loss=-0.11080, train_sqweights=0.32952, val_loss=-0.09047, val_sqweights=0.32356]
Epoch 14: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 14: 5%|5 | 1/20 [00:00<00:00, 48.67it/s, loss=-0.09347, sqweights=0.43040]
Epoch 14: 10%|# | 2/20 [00:00<00:00, 56.05it/s, loss=-0.09072, sqweights=0.42517]
Epoch 14: 15%|#5 | 3/20 [00:00<00:00, 59.15it/s, loss=-0.09520, sqweights=0.42528]
Epoch 14: 20%|## | 4/20 [00:00<00:00, 60.83it/s, loss=-0.09392, sqweights=0.42286]
Epoch 14: 25%|##5 | 5/20 [00:00<00:00, 61.33it/s, loss=-0.10022, sqweights=0.42318]
Epoch 14: 30%|### | 6/20 [00:00<00:00, 62.07it/s, loss=-0.09754, sqweights=0.42714]
Epoch 14: 35%|###5 | 7/20 [00:00<00:00, 62.51it/s, loss=-0.09489, sqweights=0.42541]
Epoch 14: 40%|#### | 8/20 [00:00<00:00, 62.51it/s, loss=-0.09498, sqweights=0.42650]
Epoch 14: 45%|####5 | 9/20 [00:00<00:00, 62.51it/s, loss=-0.09590, sqweights=0.43020]
Epoch 14: 50%|##### | 10/20 [00:00<00:00, 62.51it/s, loss=-0.09343, sqweights=0.43143]
Epoch 14: 55%|#####5 | 11/20 [00:00<00:00, 62.51it/s, loss=-0.09191, sqweights=0.43187]
Epoch 14: 60%|###### | 12/20 [00:00<00:00, 62.51it/s, loss=-0.09403, sqweights=0.43410]
Epoch 14: 65%|######5 | 13/20 [00:00<00:00, 62.51it/s, loss=-0.09397, sqweights=0.43501]
Epoch 14: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09208, sqweights=0.43505]
Epoch 14: 75%|#######5 | 15/20 [00:00<00:00, 64.95it/s, loss=-0.09298, sqweights=0.43851]
Epoch 14: 80%|######## | 16/20 [00:00<00:00, 64.95it/s, loss=-0.09483, sqweights=0.43947]
Epoch 14: 85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.09508, sqweights=0.43884]
Epoch 14: 90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.09639, sqweights=0.43967]
Epoch 14: 95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.09618, sqweights=0.44175]
Epoch 14: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.09701, sqweights=0.44113]
Epoch 14: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.09701, sqweights=0.44113, train_loss=-0.11767, train_sqweights=0.35347, val_loss=-0.09568, val_sqweights=0.34675]
Epoch 14: 100%|##########| 20/20 [00:00<00:00, 23.20it/s, loss=-0.09701, sqweights=0.44113, train_loss=-0.11767, train_sqweights=0.35347, val_loss=-0.09568, val_sqweights=0.34675]
Epoch 15: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 15: 5%|5 | 1/20 [00:00<00:00, 47.52it/s, loss=-0.07776, sqweights=0.44499]
Epoch 15: 10%|# | 2/20 [00:00<00:00, 54.87it/s, loss=-0.10117, sqweights=0.45217]
Epoch 15: 15%|#5 | 3/20 [00:00<00:00, 58.29it/s, loss=-0.10106, sqweights=0.44935]
Epoch 15: 20%|## | 4/20 [00:00<00:00, 60.31it/s, loss=-0.09864, sqweights=0.45530]
Epoch 15: 25%|##5 | 5/20 [00:00<00:00, 61.51it/s, loss=-0.10088, sqweights=0.45591]
Epoch 15: 30%|### | 6/20 [00:00<00:00, 62.29it/s, loss=-0.10025, sqweights=0.45670]
Epoch 15: 35%|###5 | 7/20 [00:00<00:00, 62.87it/s, loss=-0.09813, sqweights=0.45867]
Epoch 15: 40%|#### | 8/20 [00:00<00:00, 62.87it/s, loss=-0.09467, sqweights=0.45524]
Epoch 15: 45%|####5 | 9/20 [00:00<00:00, 62.87it/s, loss=-0.09798, sqweights=0.45479]
Epoch 15: 50%|##### | 10/20 [00:00<00:00, 62.87it/s, loss=-0.09848, sqweights=0.45526]
Epoch 15: 55%|#####5 | 11/20 [00:00<00:00, 62.87it/s, loss=-0.09785, sqweights=0.45649]
Epoch 15: 60%|###### | 12/20 [00:00<00:00, 62.87it/s, loss=-0.09831, sqweights=0.45588]
Epoch 15: 65%|######5 | 13/20 [00:00<00:00, 62.87it/s, loss=-0.09704, sqweights=0.45700]
Epoch 15: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09717, sqweights=0.45790]
Epoch 15: 75%|#######5 | 15/20 [00:00<00:00, 64.95it/s, loss=-0.09846, sqweights=0.46001]
Epoch 15: 80%|######## | 16/20 [00:00<00:00, 64.95it/s, loss=-0.09781, sqweights=0.46047]
Epoch 15: 85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.09654, sqweights=0.46121]
Epoch 15: 90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.09635, sqweights=0.46112]
Epoch 15: 95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.09673, sqweights=0.46178]
Epoch 15: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.09626, sqweights=0.46267]
Epoch 15: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.09626, sqweights=0.46267, train_loss=-0.12389, train_sqweights=0.37840, val_loss=-0.10036, val_sqweights=0.37072]
Epoch 15: 100%|##########| 20/20 [00:00<00:00, 21.30it/s, loss=-0.09626, sqweights=0.46267, train_loss=-0.12389, train_sqweights=0.37840, val_loss=-0.10036, val_sqweights=0.37072]
Epoch 16: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 16: 5%|5 | 1/20 [00:00<00:00, 48.63it/s, loss=-0.09268, sqweights=0.46160]
Epoch 16: 10%|# | 2/20 [00:00<00:00, 56.01it/s, loss=-0.08807, sqweights=0.47416]
Epoch 16: 15%|#5 | 3/20 [00:00<00:00, 59.09it/s, loss=-0.09411, sqweights=0.47416]
Epoch 16: 20%|## | 4/20 [00:00<00:00, 60.34it/s, loss=-0.10074, sqweights=0.47628]
Epoch 16: 25%|##5 | 5/20 [00:00<00:00, 61.44it/s, loss=-0.10077, sqweights=0.48216]
Epoch 16: 30%|### | 6/20 [00:00<00:00, 62.28it/s, loss=-0.10006, sqweights=0.48039]
Epoch 16: 35%|###5 | 7/20 [00:00<00:00, 62.65it/s, loss=-0.10164, sqweights=0.48168]
Epoch 16: 40%|#### | 8/20 [00:00<00:00, 62.65it/s, loss=-0.09959, sqweights=0.48064]
Epoch 16: 45%|####5 | 9/20 [00:00<00:00, 62.65it/s, loss=-0.10108, sqweights=0.48233]
Epoch 16: 50%|##### | 10/20 [00:00<00:00, 62.65it/s, loss=-0.10088, sqweights=0.48523]
Epoch 16: 55%|#####5 | 11/20 [00:00<00:00, 62.65it/s, loss=-0.09957, sqweights=0.48803]
Epoch 16: 60%|###### | 12/20 [00:00<00:00, 62.65it/s, loss=-0.09863, sqweights=0.48878]
Epoch 16: 65%|######5 | 13/20 [00:00<00:00, 62.65it/s, loss=-0.09849, sqweights=0.48949]
Epoch 16: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.09799, sqweights=0.48896]
Epoch 16: 75%|#######5 | 15/20 [00:00<00:00, 64.95it/s, loss=-0.10049, sqweights=0.48989]
Epoch 16: 80%|######## | 16/20 [00:00<00:00, 64.95it/s, loss=-0.10252, sqweights=0.49118]
Epoch 16: 85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.10268, sqweights=0.49209]
Epoch 16: 90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.10146, sqweights=0.49103]
Epoch 16: 95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.10214, sqweights=0.49107]
Epoch 16: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.10438, sqweights=0.49232]
Epoch 16: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.10438, sqweights=0.49232, train_loss=-0.12939, train_sqweights=0.40008, val_loss=-0.10448, val_sqweights=0.39169]
Epoch 16: 100%|##########| 20/20 [00:00<00:00, 23.23it/s, loss=-0.10438, sqweights=0.49232, train_loss=-0.12939, train_sqweights=0.40008, val_loss=-0.10448, val_sqweights=0.39169]
Epoch 17: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 17: 5%|5 | 1/20 [00:00<00:00, 48.15it/s, loss=-0.07263, sqweights=0.47088]
Epoch 17: 10%|# | 2/20 [00:00<00:00, 54.65it/s, loss=-0.07035, sqweights=0.48952]
Epoch 17: 15%|#5 | 3/20 [00:00<00:00, 57.84it/s, loss=-0.08838, sqweights=0.49690]
Epoch 17: 20%|## | 4/20 [00:00<00:00, 59.79it/s, loss=-0.08892, sqweights=0.49608]
Epoch 17: 25%|##5 | 5/20 [00:00<00:00, 60.91it/s, loss=-0.09458, sqweights=0.50005]
Epoch 17: 30%|### | 6/20 [00:00<00:00, 61.83it/s, loss=-0.09608, sqweights=0.50258]
Epoch 17: 35%|###5 | 7/20 [00:00<00:00, 62.50it/s, loss=-0.09627, sqweights=0.50353]
Epoch 17: 40%|#### | 8/20 [00:00<00:00, 62.50it/s, loss=-0.09905, sqweights=0.50506]
Epoch 17: 45%|####5 | 9/20 [00:00<00:00, 62.50it/s, loss=-0.10124, sqweights=0.50349]
Epoch 17: 50%|##### | 10/20 [00:00<00:00, 62.50it/s, loss=-0.10062, sqweights=0.50493]
Epoch 17: 55%|#####5 | 11/20 [00:00<00:00, 62.50it/s, loss=-0.10145, sqweights=0.50558]
Epoch 17: 60%|###### | 12/20 [00:00<00:00, 62.50it/s, loss=-0.10070, sqweights=0.50723]
Epoch 17: 65%|######5 | 13/20 [00:00<00:00, 62.50it/s, loss=-0.10075, sqweights=0.50763]
Epoch 17: 70%|####### | 14/20 [00:00<00:00, 64.68it/s, loss=-0.09942, sqweights=0.50832]
Epoch 17: 75%|#######5 | 15/20 [00:00<00:00, 64.68it/s, loss=-0.09991, sqweights=0.50942]
Epoch 17: 80%|######## | 16/20 [00:00<00:00, 64.68it/s, loss=-0.09898, sqweights=0.50909]
Epoch 17: 85%|########5 | 17/20 [00:00<00:00, 64.68it/s, loss=-0.09862, sqweights=0.50915]
Epoch 17: 90%|######### | 18/20 [00:00<00:00, 64.68it/s, loss=-0.10007, sqweights=0.50960]
Epoch 17: 95%|#########5| 19/20 [00:00<00:00, 64.68it/s, loss=-0.10178, sqweights=0.51104]
Epoch 17: 100%|##########| 20/20 [00:00<00:00, 64.68it/s, loss=-0.10369, sqweights=0.51329]
Epoch 17: 100%|##########| 20/20 [00:00<00:00, 64.68it/s, loss=-0.10369, sqweights=0.51329, train_loss=-0.13436, train_sqweights=0.42182, val_loss=-0.10850, val_sqweights=0.41218]
Epoch 17: 100%|##########| 20/20 [00:00<00:00, 23.19it/s, loss=-0.10369, sqweights=0.51329, train_loss=-0.13436, train_sqweights=0.42182, val_loss=-0.10850, val_sqweights=0.41218]
Epoch 18: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 18: 5%|5 | 1/20 [00:00<00:00, 48.39it/s, loss=-0.11609, sqweights=0.52120]
Epoch 18: 10%|# | 2/20 [00:00<00:00, 55.85it/s, loss=-0.12475, sqweights=0.52755]
Epoch 18: 15%|#5 | 3/20 [00:00<00:00, 58.99it/s, loss=-0.11660, sqweights=0.52537]
Epoch 18: 20%|## | 4/20 [00:00<00:00, 60.73it/s, loss=-0.11349, sqweights=0.52834]
Epoch 18: 25%|##5 | 5/20 [00:00<00:00, 61.86it/s, loss=-0.11398, sqweights=0.53109]
Epoch 18: 30%|### | 6/20 [00:00<00:00, 62.64it/s, loss=-0.11380, sqweights=0.53258]
Epoch 18: 35%|###5 | 7/20 [00:00<00:00, 63.11it/s, loss=-0.10940, sqweights=0.53307]
Epoch 18: 40%|#### | 8/20 [00:00<00:00, 63.11it/s, loss=-0.10953, sqweights=0.53211]
Epoch 18: 45%|####5 | 9/20 [00:00<00:00, 63.11it/s, loss=-0.10822, sqweights=0.53274]
Epoch 18: 50%|##### | 10/20 [00:00<00:00, 63.11it/s, loss=-0.10701, sqweights=0.53329]
Epoch 18: 55%|#####5 | 11/20 [00:00<00:00, 63.11it/s, loss=-0.10468, sqweights=0.53457]
Epoch 18: 60%|###### | 12/20 [00:00<00:00, 63.11it/s, loss=-0.10439, sqweights=0.53363]
Epoch 18: 65%|######5 | 13/20 [00:00<00:00, 63.11it/s, loss=-0.10419, sqweights=0.53460]
Epoch 18: 70%|####### | 14/20 [00:00<00:00, 64.45it/s, loss=-0.10519, sqweights=0.53536]
Epoch 18: 75%|#######5 | 15/20 [00:00<00:00, 64.45it/s, loss=-0.10660, sqweights=0.53565]
Epoch 18: 80%|######## | 16/20 [00:00<00:00, 64.45it/s, loss=-0.10920, sqweights=0.53575]
Epoch 18: 85%|########5 | 17/20 [00:00<00:00, 64.45it/s, loss=-0.10902, sqweights=0.53885]
Epoch 18: 90%|######### | 18/20 [00:00<00:00, 64.45it/s, loss=-0.10874, sqweights=0.53856]
Epoch 18: 95%|#########5| 19/20 [00:00<00:00, 64.45it/s, loss=-0.10819, sqweights=0.53842]
Epoch 18: 100%|##########| 20/20 [00:00<00:00, 64.45it/s, loss=-0.10705, sqweights=0.54016]
Epoch 18: 100%|##########| 20/20 [00:00<00:00, 64.45it/s, loss=-0.10705, sqweights=0.54016, train_loss=-0.13901, train_sqweights=0.44371, val_loss=-0.11223, val_sqweights=0.43362]
Epoch 18: 100%|##########| 20/20 [00:00<00:00, 21.26it/s, loss=-0.10705, sqweights=0.54016, train_loss=-0.13901, train_sqweights=0.44371, val_loss=-0.11223, val_sqweights=0.43362]
Epoch 19: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 19: 5%|5 | 1/20 [00:00<00:00, 48.44it/s, loss=-0.10525, sqweights=0.53310]
Epoch 19: 10%|# | 2/20 [00:00<00:00, 56.03it/s, loss=-0.10515, sqweights=0.53979]
Epoch 19: 15%|#5 | 3/20 [00:00<00:00, 58.46it/s, loss=-0.10749, sqweights=0.54270]
Epoch 19: 20%|## | 4/20 [00:00<00:00, 59.96it/s, loss=-0.10776, sqweights=0.54263]
Epoch 19: 25%|##5 | 5/20 [00:00<00:00, 60.80it/s, loss=-0.10895, sqweights=0.54191]
Epoch 19: 30%|### | 6/20 [00:00<00:00, 61.15it/s, loss=-0.11153, sqweights=0.54491]
Epoch 19: 35%|###5 | 7/20 [00:00<00:00, 61.90it/s, loss=-0.11229, sqweights=0.54411]
Epoch 19: 40%|#### | 8/20 [00:00<00:00, 61.90it/s, loss=-0.11116, sqweights=0.54464]
Epoch 19: 45%|####5 | 9/20 [00:00<00:00, 61.90it/s, loss=-0.11167, sqweights=0.54863]
Epoch 19: 50%|##### | 10/20 [00:00<00:00, 61.90it/s, loss=-0.11161, sqweights=0.54890]
Epoch 19: 55%|#####5 | 11/20 [00:00<00:00, 61.90it/s, loss=-0.11127, sqweights=0.54939]
Epoch 19: 60%|###### | 12/20 [00:00<00:00, 61.90it/s, loss=-0.11049, sqweights=0.55163]
Epoch 19: 65%|######5 | 13/20 [00:00<00:00, 61.90it/s, loss=-0.10825, sqweights=0.55279]
Epoch 19: 70%|####### | 14/20 [00:00<00:00, 64.39it/s, loss=-0.10745, sqweights=0.55160]
Epoch 19: 75%|#######5 | 15/20 [00:00<00:00, 64.39it/s, loss=-0.10612, sqweights=0.55343]
Epoch 19: 80%|######## | 16/20 [00:00<00:00, 64.39it/s, loss=-0.10608, sqweights=0.55331]
Epoch 19: 85%|########5 | 17/20 [00:00<00:00, 64.39it/s, loss=-0.10650, sqweights=0.55306]
Epoch 19: 90%|######### | 18/20 [00:00<00:00, 64.39it/s, loss=-0.10669, sqweights=0.55310]
Epoch 19: 95%|#########5| 19/20 [00:00<00:00, 64.39it/s, loss=-0.10846, sqweights=0.55464]
Epoch 19: 100%|##########| 20/20 [00:00<00:00, 64.39it/s, loss=-0.10757, sqweights=0.55479]
Epoch 19: 100%|##########| 20/20 [00:00<00:00, 64.39it/s, loss=-0.10757, sqweights=0.55479, train_loss=-0.14304, train_sqweights=0.46321, val_loss=-0.11539, val_sqweights=0.45265]
Epoch 19: 100%|##########| 20/20 [00:00<00:00, 23.12it/s, loss=-0.10757, sqweights=0.55479, train_loss=-0.14304, train_sqweights=0.46321, val_loss=-0.11539, val_sqweights=0.45265]
Epoch 20: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 20: 5%|5 | 1/20 [00:00<00:00, 47.63it/s, loss=-0.08769, sqweights=0.54733]
Epoch 20: 10%|# | 2/20 [00:00<00:00, 55.14it/s, loss=-0.10949, sqweights=0.55174]
Epoch 20: 15%|#5 | 3/20 [00:00<00:00, 58.42it/s, loss=-0.10915, sqweights=0.55808]
Epoch 20: 20%|## | 4/20 [00:00<00:00, 60.27it/s, loss=-0.11420, sqweights=0.56447]
Epoch 20: 25%|##5 | 5/20 [00:00<00:00, 61.46it/s, loss=-0.11304, sqweights=0.56131]
Epoch 20: 30%|### | 6/20 [00:00<00:00, 62.11it/s, loss=-0.11304, sqweights=0.56218]
Epoch 20: 35%|###5 | 7/20 [00:00<00:00, 62.47it/s, loss=-0.11231, sqweights=0.56449]
Epoch 20: 40%|#### | 8/20 [00:00<00:00, 62.47it/s, loss=-0.11506, sqweights=0.56712]
Epoch 20: 45%|####5 | 9/20 [00:00<00:00, 62.47it/s, loss=-0.11374, sqweights=0.56820]
Epoch 20: 50%|##### | 10/20 [00:00<00:00, 62.47it/s, loss=-0.11277, sqweights=0.56822]
Epoch 20: 55%|#####5 | 11/20 [00:00<00:00, 62.47it/s, loss=-0.11133, sqweights=0.56869]
Epoch 20: 60%|###### | 12/20 [00:00<00:00, 62.47it/s, loss=-0.11155, sqweights=0.56952]
Epoch 20: 65%|######5 | 13/20 [00:00<00:00, 62.47it/s, loss=-0.11161, sqweights=0.56927]
Epoch 20: 70%|####### | 14/20 [00:00<00:00, 64.69it/s, loss=-0.11155, sqweights=0.57108]
Epoch 20: 75%|#######5 | 15/20 [00:00<00:00, 64.69it/s, loss=-0.11229, sqweights=0.57162]
Epoch 20: 80%|######## | 16/20 [00:00<00:00, 64.69it/s, loss=-0.11132, sqweights=0.57115]
Epoch 20: 85%|########5 | 17/20 [00:00<00:00, 64.69it/s, loss=-0.11103, sqweights=0.57277]
Epoch 20: 90%|######### | 18/20 [00:00<00:00, 64.69it/s, loss=-0.11058, sqweights=0.57316]
Epoch 20: 95%|#########5| 19/20 [00:00<00:00, 64.69it/s, loss=-0.11009, sqweights=0.57285]
Epoch 20: 100%|##########| 20/20 [00:00<00:00, 64.69it/s, loss=-0.10973, sqweights=0.57489]
Epoch 20: 100%|##########| 20/20 [00:00<00:00, 64.69it/s, loss=-0.10973, sqweights=0.57489, train_loss=-0.14681, train_sqweights=0.48355, val_loss=-0.11828, val_sqweights=0.47239]
Epoch 20: 100%|##########| 20/20 [00:00<00:00, 23.09it/s, loss=-0.10973, sqweights=0.57489, train_loss=-0.14681, train_sqweights=0.48355, val_loss=-0.11828, val_sqweights=0.47239]
Epoch 21: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 21: 5%|5 | 1/20 [00:00<00:00, 48.90it/s, loss=-0.12126, sqweights=0.60815]
Epoch 21: 10%|# | 2/20 [00:00<00:00, 56.25it/s, loss=-0.11400, sqweights=0.59867]
Epoch 21: 15%|#5 | 3/20 [00:00<00:00, 59.40it/s, loss=-0.11547, sqweights=0.59577]
Epoch 21: 20%|## | 4/20 [00:00<00:00, 61.08it/s, loss=-0.12237, sqweights=0.59427]
Epoch 21: 25%|##5 | 5/20 [00:00<00:00, 61.67it/s, loss=-0.11782, sqweights=0.59439]
Epoch 21: 30%|### | 6/20 [00:00<00:00, 62.47it/s, loss=-0.11923, sqweights=0.59286]
Epoch 21: 35%|###5 | 7/20 [00:00<00:00, 63.04it/s, loss=-0.12054, sqweights=0.59288]
Epoch 21: 40%|#### | 8/20 [00:00<00:00, 63.04it/s, loss=-0.12151, sqweights=0.59308]
Epoch 21: 45%|####5 | 9/20 [00:00<00:00, 63.04it/s, loss=-0.12133, sqweights=0.59469]
Epoch 21: 50%|##### | 10/20 [00:00<00:00, 63.04it/s, loss=-0.11658, sqweights=0.59325]
Epoch 21: 55%|#####5 | 11/20 [00:00<00:00, 63.04it/s, loss=-0.11510, sqweights=0.59322]
Epoch 21: 60%|###### | 12/20 [00:00<00:00, 63.04it/s, loss=-0.11688, sqweights=0.59532]
Epoch 21: 65%|######5 | 13/20 [00:00<00:00, 63.04it/s, loss=-0.11525, sqweights=0.59465]
Epoch 21: 70%|####### | 14/20 [00:00<00:00, 65.09it/s, loss=-0.11524, sqweights=0.59489]
Epoch 21: 75%|#######5 | 15/20 [00:00<00:00, 65.09it/s, loss=-0.11512, sqweights=0.59458]
Epoch 21: 80%|######## | 16/20 [00:00<00:00, 65.09it/s, loss=-0.11539, sqweights=0.59398]
Epoch 21: 85%|########5 | 17/20 [00:00<00:00, 65.09it/s, loss=-0.11563, sqweights=0.59534]
Epoch 21: 90%|######### | 18/20 [00:00<00:00, 65.09it/s, loss=-0.11527, sqweights=0.59618]
Epoch 21: 95%|#########5| 19/20 [00:00<00:00, 65.09it/s, loss=-0.11463, sqweights=0.59684]
Epoch 21: 100%|##########| 20/20 [00:00<00:00, 65.09it/s, loss=-0.11392, sqweights=0.59738]
Epoch 21: 100%|##########| 20/20 [00:00<00:00, 65.09it/s, loss=-0.11392, sqweights=0.59738, train_loss=-0.15044, train_sqweights=0.50227, val_loss=-0.12109, val_sqweights=0.49056]
Epoch 21: 100%|##########| 20/20 [00:00<00:00, 21.23it/s, loss=-0.11392, sqweights=0.59738, train_loss=-0.15044, train_sqweights=0.50227, val_loss=-0.12109, val_sqweights=0.49056]
Epoch 22: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 22: 5%|5 | 1/20 [00:00<00:00, 48.10it/s, loss=-0.13117, sqweights=0.59707]
Epoch 22: 10%|# | 2/20 [00:00<00:00, 55.70it/s, loss=-0.13134, sqweights=0.58748]
Epoch 22: 15%|#5 | 3/20 [00:00<00:00, 58.91it/s, loss=-0.12128, sqweights=0.59059]
Epoch 22: 20%|## | 4/20 [00:00<00:00, 60.63it/s, loss=-0.12522, sqweights=0.59912]
Epoch 22: 25%|##5 | 5/20 [00:00<00:00, 61.72it/s, loss=-0.12619, sqweights=0.60234]
Epoch 22: 30%|### | 6/20 [00:00<00:00, 62.52it/s, loss=-0.12257, sqweights=0.60437]
Epoch 22: 35%|###5 | 7/20 [00:00<00:00, 63.01it/s, loss=-0.12130, sqweights=0.60778]
Epoch 22: 40%|#### | 8/20 [00:00<00:00, 63.01it/s, loss=-0.11961, sqweights=0.60704]
Epoch 22: 45%|####5 | 9/20 [00:00<00:00, 63.01it/s, loss=-0.12047, sqweights=0.60727]
Epoch 22: 50%|##### | 10/20 [00:00<00:00, 63.01it/s, loss=-0.12367, sqweights=0.60889]
Epoch 22: 55%|#####5 | 11/20 [00:00<00:00, 63.01it/s, loss=-0.12254, sqweights=0.61065]
Epoch 22: 60%|###### | 12/20 [00:00<00:00, 63.01it/s, loss=-0.11959, sqweights=0.60985]
Epoch 22: 65%|######5 | 13/20 [00:00<00:00, 63.01it/s, loss=-0.11779, sqweights=0.61102]
Epoch 22: 70%|####### | 14/20 [00:00<00:00, 64.75it/s, loss=-0.11568, sqweights=0.61147]
Epoch 22: 75%|#######5 | 15/20 [00:00<00:00, 64.75it/s, loss=-0.11837, sqweights=0.61256]
Epoch 22: 80%|######## | 16/20 [00:00<00:00, 64.75it/s, loss=-0.11787, sqweights=0.61153]
Epoch 22: 85%|########5 | 17/20 [00:00<00:00, 64.75it/s, loss=-0.11755, sqweights=0.61269]
Epoch 22: 90%|######### | 18/20 [00:00<00:00, 64.75it/s, loss=-0.11595, sqweights=0.61204]
Epoch 22: 95%|#########5| 19/20 [00:00<00:00, 64.75it/s, loss=-0.11706, sqweights=0.61272]
Epoch 22: 100%|##########| 20/20 [00:00<00:00, 64.75it/s, loss=-0.11673, sqweights=0.61193]
Epoch 22: 100%|##########| 20/20 [00:00<00:00, 64.75it/s, loss=-0.11673, sqweights=0.61193, train_loss=-0.15389, train_sqweights=0.52299, val_loss=-0.12370, val_sqweights=0.51164]
Epoch 22: 100%|##########| 20/20 [00:00<00:00, 23.20it/s, loss=-0.11673, sqweights=0.61193, train_loss=-0.15389, train_sqweights=0.52299, val_loss=-0.12370, val_sqweights=0.51164]
Epoch 23: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 23: 5%|5 | 1/20 [00:00<00:00, 48.67it/s, loss=-0.13710, sqweights=0.64026]
Epoch 23: 10%|# | 2/20 [00:00<00:00, 55.97it/s, loss=-0.13911, sqweights=0.63102]
Epoch 23: 15%|#5 | 3/20 [00:00<00:00, 59.04it/s, loss=-0.13081, sqweights=0.62900]
Epoch 23: 20%|## | 4/20 [00:00<00:00, 60.78it/s, loss=-0.12694, sqweights=0.62430]
Epoch 23: 25%|##5 | 5/20 [00:00<00:00, 61.84it/s, loss=-0.12368, sqweights=0.62540]
Epoch 23: 30%|### | 6/20 [00:00<00:00, 62.29it/s, loss=-0.12137, sqweights=0.62690]
Epoch 23: 35%|###5 | 7/20 [00:00<00:00, 62.92it/s, loss=-0.12089, sqweights=0.62839]
Epoch 23: 40%|#### | 8/20 [00:00<00:00, 62.92it/s, loss=-0.12032, sqweights=0.63043]
Epoch 23: 45%|####5 | 9/20 [00:00<00:00, 62.92it/s, loss=-0.11969, sqweights=0.62983]
Epoch 23: 50%|##### | 10/20 [00:00<00:00, 62.92it/s, loss=-0.12087, sqweights=0.62905]
Epoch 23: 55%|#####5 | 11/20 [00:00<00:00, 62.92it/s, loss=-0.12236, sqweights=0.63026]
Epoch 23: 60%|###### | 12/20 [00:00<00:00, 62.92it/s, loss=-0.12181, sqweights=0.63080]
Epoch 23: 65%|######5 | 13/20 [00:00<00:00, 62.92it/s, loss=-0.12125, sqweights=0.63185]
Epoch 23: 70%|####### | 14/20 [00:00<00:00, 65.11it/s, loss=-0.11966, sqweights=0.63187]
Epoch 23: 75%|#######5 | 15/20 [00:00<00:00, 65.11it/s, loss=-0.11937, sqweights=0.63101]
Epoch 23: 80%|######## | 16/20 [00:00<00:00, 65.11it/s, loss=-0.11905, sqweights=0.63181]
Epoch 23: 85%|########5 | 17/20 [00:00<00:00, 65.11it/s, loss=-0.11888, sqweights=0.63153]
Epoch 23: 90%|######### | 18/20 [00:00<00:00, 65.11it/s, loss=-0.11830, sqweights=0.63213]
Epoch 23: 95%|#########5| 19/20 [00:00<00:00, 65.11it/s, loss=-0.11642, sqweights=0.63196]
Epoch 23: 100%|##########| 20/20 [00:00<00:00, 65.11it/s, loss=-0.11721, sqweights=0.63158]
Epoch 23: 100%|##########| 20/20 [00:00<00:00, 65.11it/s, loss=-0.11721, sqweights=0.63158, train_loss=-0.15740, train_sqweights=0.54319, val_loss=-0.12589, val_sqweights=0.53165]
Epoch 23: 100%|##########| 20/20 [00:00<00:00, 23.15it/s, loss=-0.11721, sqweights=0.63158, train_loss=-0.15740, train_sqweights=0.54319, val_loss=-0.12589, val_sqweights=0.53165]
Epoch 24: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 24: 5%|5 | 1/20 [00:00<00:00, 47.44it/s, loss=-0.11339, sqweights=0.64813]
Epoch 24: 10%|# | 2/20 [00:00<00:00, 55.15it/s, loss=-0.11591, sqweights=0.63407]
Epoch 24: 15%|#5 | 3/20 [00:00<00:00, 58.11it/s, loss=-0.11721, sqweights=0.64390]
Epoch 24: 20%|## | 4/20 [00:00<00:00, 59.68it/s, loss=-0.12384, sqweights=0.64470]
Epoch 24: 25%|##5 | 5/20 [00:00<00:00, 60.75it/s, loss=-0.12393, sqweights=0.64383]
Epoch 24: 30%|### | 6/20 [00:00<00:00, 61.59it/s, loss=-0.12190, sqweights=0.63774]
Epoch 24: 35%|###5 | 7/20 [00:00<00:00, 62.29it/s, loss=-0.12355, sqweights=0.63842]
Epoch 24: 40%|#### | 8/20 [00:00<00:00, 62.29it/s, loss=-0.12555, sqweights=0.64039]
Epoch 24: 45%|####5 | 9/20 [00:00<00:00, 62.29it/s, loss=-0.12478, sqweights=0.64138]
Epoch 24: 50%|##### | 10/20 [00:00<00:00, 62.29it/s, loss=-0.12270, sqweights=0.63830]
Epoch 24: 55%|#####5 | 11/20 [00:00<00:00, 62.29it/s, loss=-0.12330, sqweights=0.64155]
Epoch 24: 60%|###### | 12/20 [00:00<00:00, 62.29it/s, loss=-0.12258, sqweights=0.64255]
Epoch 24: 65%|######5 | 13/20 [00:00<00:00, 62.29it/s, loss=-0.12429, sqweights=0.64295]
Epoch 24: 70%|####### | 14/20 [00:00<00:00, 64.43it/s, loss=-0.12393, sqweights=0.64410]
Epoch 24: 75%|#######5 | 15/20 [00:00<00:00, 64.43it/s, loss=-0.12207, sqweights=0.64360]
Epoch 24: 80%|######## | 16/20 [00:00<00:00, 64.43it/s, loss=-0.11986, sqweights=0.64241]
Epoch 24: 85%|########5 | 17/20 [00:00<00:00, 64.43it/s, loss=-0.12027, sqweights=0.64287]
Epoch 24: 90%|######### | 18/20 [00:00<00:00, 64.43it/s, loss=-0.11946, sqweights=0.64427]
Epoch 24: 95%|#########5| 19/20 [00:00<00:00, 64.43it/s, loss=-0.11969, sqweights=0.64521]
Epoch 24: 100%|##########| 20/20 [00:00<00:00, 64.43it/s, loss=-0.12046, sqweights=0.64696]
Epoch 24: 100%|##########| 20/20 [00:00<00:00, 64.43it/s, loss=-0.12046, sqweights=0.64696, train_loss=-0.16006, train_sqweights=0.56194, val_loss=-0.12807, val_sqweights=0.55104]
Epoch 24: 100%|##########| 20/20 [00:00<00:00, 23.16it/s, loss=-0.12046, sqweights=0.64696, train_loss=-0.16006, train_sqweights=0.56194, val_loss=-0.12807, val_sqweights=0.55104]
Epoch 25: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 25: 5%|5 | 1/20 [00:00<00:01, 9.92it/s, loss=-0.09722, sqweights=0.64730]
Epoch 25: 10%|# | 2/20 [00:00<00:01, 9.92it/s, loss=-0.10720, sqweights=0.65466]
Epoch 25: 15%|#5 | 3/20 [00:00<00:01, 9.92it/s, loss=-0.11394, sqweights=0.65431]
Epoch 25: 20%|## | 4/20 [00:00<00:01, 9.92it/s, loss=-0.11621, sqweights=0.65410]
Epoch 25: 25%|##5 | 5/20 [00:00<00:01, 9.92it/s, loss=-0.11728, sqweights=0.65711]
Epoch 25: 30%|### | 6/20 [00:00<00:01, 9.92it/s, loss=-0.11581, sqweights=0.65812]
Epoch 25: 35%|###5 | 7/20 [00:00<00:01, 9.92it/s, loss=-0.11504, sqweights=0.66089]
Epoch 25: 40%|#### | 8/20 [00:00<00:00, 43.45it/s, loss=-0.11504, sqweights=0.66089]
Epoch 25: 40%|#### | 8/20 [00:00<00:00, 43.45it/s, loss=-0.11514, sqweights=0.66104]
Epoch 25: 45%|####5 | 9/20 [00:00<00:00, 43.45it/s, loss=-0.11718, sqweights=0.66307]
Epoch 25: 50%|##### | 10/20 [00:00<00:00, 43.45it/s, loss=-0.11618, sqweights=0.66512]
Epoch 25: 55%|#####5 | 11/20 [00:00<00:00, 43.45it/s, loss=-0.11578, sqweights=0.66592]
Epoch 25: 60%|###### | 12/20 [00:00<00:00, 43.45it/s, loss=-0.11711, sqweights=0.66532]
Epoch 25: 65%|######5 | 13/20 [00:00<00:00, 43.45it/s, loss=-0.11704, sqweights=0.66706]
Epoch 25: 70%|####### | 14/20 [00:00<00:00, 43.45it/s, loss=-0.11755, sqweights=0.66609]
Epoch 25: 75%|#######5 | 15/20 [00:00<00:00, 54.01it/s, loss=-0.11755, sqweights=0.66609]
Epoch 25: 75%|#######5 | 15/20 [00:00<00:00, 54.01it/s, loss=-0.11864, sqweights=0.66568]
Epoch 25: 80%|######## | 16/20 [00:00<00:00, 54.01it/s, loss=-0.11966, sqweights=0.66639]
Epoch 25: 85%|########5 | 17/20 [00:00<00:00, 54.01it/s, loss=-0.11941, sqweights=0.66633]
Epoch 25: 90%|######### | 18/20 [00:00<00:00, 54.01it/s, loss=-0.11978, sqweights=0.66714]
Epoch 25: 95%|#########5| 19/20 [00:00<00:00, 54.01it/s, loss=-0.11938, sqweights=0.66749]
Epoch 25: 100%|##########| 20/20 [00:00<00:00, 54.01it/s, loss=-0.11801, sqweights=0.66685]
Epoch 25: 100%|##########| 20/20 [00:00<00:00, 54.01it/s, loss=-0.11801, sqweights=0.66685, train_loss=-0.16262, train_sqweights=0.57806, val_loss=-0.13014, val_sqweights=0.56826]
Epoch 25: 100%|##########| 20/20 [00:00<00:00, 21.17it/s, loss=-0.11801, sqweights=0.66685, train_loss=-0.16262, train_sqweights=0.57806, val_loss=-0.13014, val_sqweights=0.56826]
Epoch 26: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 26: 5%|5 | 1/20 [00:00<00:00, 48.07it/s, loss=-0.08433, sqweights=0.64913]
Epoch 26: 10%|# | 2/20 [00:00<00:00, 55.56it/s, loss=-0.09888, sqweights=0.66231]
Epoch 26: 15%|#5 | 3/20 [00:00<00:00, 57.57it/s, loss=-0.10227, sqweights=0.66639]
Epoch 26: 20%|## | 4/20 [00:00<00:00, 59.56it/s, loss=-0.11543, sqweights=0.66631]
Epoch 26: 25%|##5 | 5/20 [00:00<00:00, 60.80it/s, loss=-0.12312, sqweights=0.66897]
Epoch 26: 30%|### | 6/20 [00:00<00:00, 61.46it/s, loss=-0.12591, sqweights=0.67268]
Epoch 26: 35%|###5 | 7/20 [00:00<00:00, 62.22it/s, loss=-0.12591, sqweights=0.67268]
Epoch 26: 35%|###5 | 7/20 [00:00<00:00, 62.22it/s, loss=-0.12591, sqweights=0.67621]
Epoch 26: 40%|#### | 8/20 [00:00<00:00, 62.22it/s, loss=-0.12559, sqweights=0.67540]
Epoch 26: 45%|####5 | 9/20 [00:00<00:00, 62.22it/s, loss=-0.12478, sqweights=0.67549]
Epoch 26: 50%|##### | 10/20 [00:00<00:00, 62.22it/s, loss=-0.12217, sqweights=0.67539]
Epoch 26: 55%|#####5 | 11/20 [00:00<00:00, 62.22it/s, loss=-0.12047, sqweights=0.67515]
Epoch 26: 60%|###### | 12/20 [00:00<00:00, 62.22it/s, loss=-0.12003, sqweights=0.67590]
Epoch 26: 65%|######5 | 13/20 [00:00<00:00, 62.22it/s, loss=-0.12017, sqweights=0.67498]
Epoch 26: 70%|####### | 14/20 [00:00<00:00, 64.35it/s, loss=-0.12017, sqweights=0.67498]
Epoch 26: 70%|####### | 14/20 [00:00<00:00, 64.35it/s, loss=-0.12006, sqweights=0.67425]
Epoch 26: 75%|#######5 | 15/20 [00:00<00:00, 64.35it/s, loss=-0.12067, sqweights=0.67477]
Epoch 26: 80%|######## | 16/20 [00:00<00:00, 64.35it/s, loss=-0.12006, sqweights=0.67509]
Epoch 26: 85%|########5 | 17/20 [00:00<00:00, 64.35it/s, loss=-0.11943, sqweights=0.67448]
Epoch 26: 90%|######### | 18/20 [00:00<00:00, 64.35it/s, loss=-0.12071, sqweights=0.67550]
Epoch 26: 95%|#########5| 19/20 [00:00<00:00, 64.35it/s, loss=-0.11972, sqweights=0.67616]
Epoch 26: 100%|##########| 20/20 [00:00<00:00, 64.35it/s, loss=-0.11896, sqweights=0.67645]
Epoch 26: 100%|##########| 20/20 [00:00<00:00, 64.35it/s, loss=-0.11896, sqweights=0.67645, train_loss=-0.16478, train_sqweights=0.59564, val_loss=-0.13194, val_sqweights=0.58620]
Epoch 26: 100%|##########| 20/20 [00:00<00:00, 23.11it/s, loss=-0.11896, sqweights=0.67645, train_loss=-0.16478, train_sqweights=0.59564, val_loss=-0.13194, val_sqweights=0.58620]
Epoch 27: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 27: 5%|5 | 1/20 [00:00<00:00, 48.00it/s, loss=-0.10956, sqweights=0.68584]
Epoch 27: 10%|# | 2/20 [00:00<00:00, 55.57it/s, loss=-0.11183, sqweights=0.69164]
Epoch 27: 15%|#5 | 3/20 [00:00<00:00, 58.40it/s, loss=-0.10998, sqweights=0.69819]
Epoch 27: 20%|## | 4/20 [00:00<00:00, 60.02it/s, loss=-0.10446, sqweights=0.69861]
Epoch 27: 25%|##5 | 5/20 [00:00<00:00, 61.23it/s, loss=-0.10985, sqweights=0.68821]
Epoch 27: 30%|### | 6/20 [00:00<00:00, 62.04it/s, loss=-0.11288, sqweights=0.68643]
Epoch 27: 35%|###5 | 7/20 [00:00<00:00, 62.60it/s, loss=-0.11288, sqweights=0.68643]
Epoch 27: 35%|###5 | 7/20 [00:00<00:00, 62.60it/s, loss=-0.11756, sqweights=0.68755]
Epoch 27: 40%|#### | 8/20 [00:00<00:00, 62.60it/s, loss=-0.11457, sqweights=0.68463]
Epoch 27: 45%|####5 | 9/20 [00:00<00:00, 62.60it/s, loss=-0.11186, sqweights=0.68500]
Epoch 27: 50%|##### | 10/20 [00:00<00:00, 62.60it/s, loss=-0.10952, sqweights=0.68651]
Epoch 27: 55%|#####5 | 11/20 [00:00<00:00, 62.60it/s, loss=-0.10973, sqweights=0.68796]
Epoch 27: 60%|###### | 12/20 [00:00<00:00, 62.60it/s, loss=-0.10793, sqweights=0.68942]
Epoch 27: 65%|######5 | 13/20 [00:00<00:00, 62.60it/s, loss=-0.10816, sqweights=0.68988]
Epoch 27: 70%|####### | 14/20 [00:00<00:00, 64.76it/s, loss=-0.10816, sqweights=0.68988]
Epoch 27: 70%|####### | 14/20 [00:00<00:00, 64.76it/s, loss=-0.11100, sqweights=0.69027]
Epoch 27: 75%|#######5 | 15/20 [00:00<00:00, 64.76it/s, loss=-0.11253, sqweights=0.69062]
Epoch 27: 80%|######## | 16/20 [00:00<00:00, 64.76it/s, loss=-0.11294, sqweights=0.68929]
Epoch 27: 85%|########5 | 17/20 [00:00<00:00, 64.76it/s, loss=-0.11321, sqweights=0.69015]
Epoch 27: 90%|######### | 18/20 [00:00<00:00, 64.76it/s, loss=-0.11289, sqweights=0.68904]
Epoch 27: 95%|#########5| 19/20 [00:00<00:00, 64.76it/s, loss=-0.11316, sqweights=0.68988]
Epoch 27: 100%|##########| 20/20 [00:00<00:00, 64.76it/s, loss=-0.11398, sqweights=0.69185]
Epoch 27: 100%|##########| 20/20 [00:00<00:00, 64.76it/s, loss=-0.11398, sqweights=0.69185, train_loss=-0.16673, train_sqweights=0.60894, val_loss=-0.13328, val_sqweights=0.59953]
Epoch 27: 100%|##########| 20/20 [00:00<00:00, 23.22it/s, loss=-0.11398, sqweights=0.69185, train_loss=-0.16673, train_sqweights=0.60894, val_loss=-0.13328, val_sqweights=0.59953]
Epoch 28: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 28: 5%|5 | 1/20 [00:00<00:00, 48.46it/s, loss=-0.11375, sqweights=0.67527]
Epoch 28: 10%|# | 2/20 [00:00<00:00, 55.86it/s, loss=-0.11291, sqweights=0.67880]
Epoch 28: 15%|#5 | 3/20 [00:00<00:00, 58.90it/s, loss=-0.11675, sqweights=0.68919]
Epoch 28: 20%|## | 4/20 [00:00<00:00, 60.64it/s, loss=-0.12165, sqweights=0.69292]
Epoch 28: 25%|##5 | 5/20 [00:00<00:00, 61.16it/s, loss=-0.12326, sqweights=0.69402]
Epoch 28: 30%|### | 6/20 [00:00<00:00, 33.40it/s, loss=-0.12326, sqweights=0.69402]
Epoch 28: 30%|### | 6/20 [00:00<00:00, 33.40it/s, loss=-0.12503, sqweights=0.69764]
Epoch 28: 35%|###5 | 7/20 [00:00<00:00, 33.40it/s, loss=-0.12784, sqweights=0.69900]
Epoch 28: 40%|#### | 8/20 [00:00<00:00, 33.40it/s, loss=-0.12642, sqweights=0.69312]
Epoch 28: 45%|####5 | 9/20 [00:00<00:00, 33.40it/s, loss=-0.12365, sqweights=0.69161]
Epoch 28: 50%|##### | 10/20 [00:00<00:00, 33.40it/s, loss=-0.12369, sqweights=0.69037]
Epoch 28: 55%|#####5 | 11/20 [00:00<00:00, 33.40it/s, loss=-0.12331, sqweights=0.68953]
Epoch 28: 60%|###### | 12/20 [00:00<00:00, 33.40it/s, loss=-0.12429, sqweights=0.69089]
Epoch 28: 65%|######5 | 13/20 [00:00<00:00, 48.28it/s, loss=-0.12429, sqweights=0.69089]
Epoch 28: 65%|######5 | 13/20 [00:00<00:00, 48.28it/s, loss=-0.12511, sqweights=0.69272]
Epoch 28: 70%|####### | 14/20 [00:00<00:00, 48.28it/s, loss=-0.12385, sqweights=0.69310]
Epoch 28: 75%|#######5 | 15/20 [00:00<00:00, 48.28it/s, loss=-0.12407, sqweights=0.69447]
Epoch 28: 80%|######## | 16/20 [00:00<00:00, 48.28it/s, loss=-0.12391, sqweights=0.69567]
Epoch 28: 85%|########5 | 17/20 [00:00<00:00, 48.28it/s, loss=-0.12378, sqweights=0.69594]
Epoch 28: 90%|######### | 18/20 [00:00<00:00, 48.28it/s, loss=-0.12297, sqweights=0.69583]
Epoch 28: 95%|#########5| 19/20 [00:00<00:00, 48.28it/s, loss=-0.12266, sqweights=0.69601]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 56.30it/s, loss=-0.12266, sqweights=0.69601]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 56.30it/s, loss=-0.12371, sqweights=0.69599]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 56.30it/s, loss=-0.12371, sqweights=0.69599, train_loss=-0.16859, train_sqweights=0.62391, val_loss=-0.13456, val_sqweights=0.61365]
Epoch 28: 100%|##########| 20/20 [00:00<00:00, 21.11it/s, loss=-0.12371, sqweights=0.69599, train_loss=-0.16859, train_sqweights=0.62391, val_loss=-0.13456, val_sqweights=0.61365]
Epoch 29: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 29: 5%|5 | 1/20 [00:00<00:00, 46.00it/s, loss=-0.09095, sqweights=0.71518]
Epoch 29: 10%|# | 2/20 [00:00<00:00, 54.27it/s, loss=-0.12836, sqweights=0.70632]
Epoch 29: 15%|#5 | 3/20 [00:00<00:00, 57.64it/s, loss=-0.13019, sqweights=0.70184]
Epoch 29: 20%|## | 4/20 [00:00<00:00, 59.69it/s, loss=-0.12817, sqweights=0.70611]
Epoch 29: 25%|##5 | 5/20 [00:00<00:00, 60.99it/s, loss=-0.12552, sqweights=0.70390]
Epoch 29: 30%|### | 6/20 [00:00<00:00, 61.96it/s, loss=-0.12555, sqweights=0.70787]
Epoch 29: 35%|###5 | 7/20 [00:00<00:00, 62.62it/s, loss=-0.12555, sqweights=0.70787]
Epoch 29: 35%|###5 | 7/20 [00:00<00:00, 62.62it/s, loss=-0.12883, sqweights=0.70805]
Epoch 29: 40%|#### | 8/20 [00:00<00:00, 62.62it/s, loss=-0.12963, sqweights=0.70937]
Epoch 29: 45%|####5 | 9/20 [00:00<00:00, 62.62it/s, loss=-0.12931, sqweights=0.70821]
Epoch 29: 50%|##### | 10/20 [00:00<00:00, 62.62it/s, loss=-0.12930, sqweights=0.70846]
Epoch 29: 55%|#####5 | 11/20 [00:00<00:00, 62.62it/s, loss=-0.12690, sqweights=0.71154]
Epoch 29: 60%|###### | 12/20 [00:00<00:00, 62.62it/s, loss=-0.12253, sqweights=0.71247]
Epoch 29: 65%|######5 | 13/20 [00:00<00:00, 62.62it/s, loss=-0.12340, sqweights=0.71475]
Epoch 29: 70%|####### | 14/20 [00:00<00:00, 64.79it/s, loss=-0.12340, sqweights=0.71475]
Epoch 29: 70%|####### | 14/20 [00:00<00:00, 64.79it/s, loss=-0.12301, sqweights=0.71411]
Epoch 29: 75%|#######5 | 15/20 [00:00<00:00, 64.79it/s, loss=-0.12328, sqweights=0.71284]
Epoch 29: 80%|######## | 16/20 [00:00<00:00, 64.79it/s, loss=-0.12240, sqweights=0.71429]
Epoch 29: 85%|########5 | 17/20 [00:00<00:00, 64.79it/s, loss=-0.12037, sqweights=0.71422]
Epoch 29: 90%|######### | 18/20 [00:00<00:00, 64.79it/s, loss=-0.12169, sqweights=0.71494]
Epoch 29: 95%|#########5| 19/20 [00:00<00:00, 64.79it/s, loss=-0.12256, sqweights=0.71632]
Epoch 29: 100%|##########| 20/20 [00:00<00:00, 64.79it/s, loss=-0.12297, sqweights=0.71641]
Epoch 29: 100%|##########| 20/20 [00:00<00:00, 64.79it/s, loss=-0.12297, sqweights=0.71641, train_loss=-0.17046, train_sqweights=0.64219, val_loss=-0.13577, val_sqweights=0.63182]
Epoch 29: 100%|##########| 20/20 [00:00<00:00, 23.16it/s, loss=-0.12297, sqweights=0.71641, train_loss=-0.17046, train_sqweights=0.64219, val_loss=-0.13577, val_sqweights=0.63182]
Epoch 30: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 30: 5%|5 | 1/20 [00:00<00:00, 48.34it/s, loss=-0.11191, sqweights=0.71765]
Epoch 30: 10%|# | 2/20 [00:00<00:00, 55.38it/s, loss=-0.13057, sqweights=0.72857]
Epoch 30: 15%|#5 | 3/20 [00:00<00:00, 58.64it/s, loss=-0.12068, sqweights=0.73183]
Epoch 30: 20%|## | 4/20 [00:00<00:00, 60.46it/s, loss=-0.12589, sqweights=0.72622]
Epoch 30: 25%|##5 | 5/20 [00:00<00:00, 61.59it/s, loss=-0.12757, sqweights=0.72263]
Epoch 30: 30%|### | 6/20 [00:00<00:00, 61.90it/s, loss=-0.12593, sqweights=0.71821]
Epoch 30: 35%|###5 | 7/20 [00:00<00:00, 62.47it/s, loss=-0.12593, sqweights=0.71821]
Epoch 30: 35%|###5 | 7/20 [00:00<00:00, 62.47it/s, loss=-0.12480, sqweights=0.72193]
Epoch 30: 40%|#### | 8/20 [00:00<00:00, 62.47it/s, loss=-0.12018, sqweights=0.72129]
Epoch 30: 45%|####5 | 9/20 [00:00<00:00, 62.47it/s, loss=-0.12739, sqweights=0.72299]
Epoch 30: 50%|##### | 10/20 [00:00<00:00, 62.47it/s, loss=-0.12500, sqweights=0.72325]
Epoch 30: 55%|#####5 | 11/20 [00:00<00:00, 62.47it/s, loss=-0.12493, sqweights=0.72350]
Epoch 30: 60%|###### | 12/20 [00:00<00:00, 62.47it/s, loss=-0.12285, sqweights=0.72373]
Epoch 30: 65%|######5 | 13/20 [00:00<00:00, 62.47it/s, loss=-0.12416, sqweights=0.72340]
Epoch 30: 70%|####### | 14/20 [00:00<00:00, 64.52it/s, loss=-0.12416, sqweights=0.72340]
Epoch 30: 70%|####### | 14/20 [00:00<00:00, 64.52it/s, loss=-0.12436, sqweights=0.72315]
Epoch 30: 75%|#######5 | 15/20 [00:00<00:00, 64.52it/s, loss=-0.12302, sqweights=0.72262]
Epoch 30: 80%|######## | 16/20 [00:00<00:00, 64.52it/s, loss=-0.12362, sqweights=0.72430]
Epoch 30: 85%|########5 | 17/20 [00:00<00:00, 64.52it/s, loss=-0.12189, sqweights=0.72353]
Epoch 30: 90%|######### | 18/20 [00:00<00:00, 64.52it/s, loss=-0.12234, sqweights=0.72307]
Epoch 30: 95%|#########5| 19/20 [00:00<00:00, 64.52it/s, loss=-0.12315, sqweights=0.72351]
Epoch 30: 100%|##########| 20/20 [00:00<00:00, 64.52it/s, loss=-0.12390, sqweights=0.72404]
Epoch 30: 100%|##########| 20/20 [00:00<00:00, 64.52it/s, loss=-0.12390, sqweights=0.72404, train_loss=-0.17213, train_sqweights=0.65782, val_loss=-0.13702, val_sqweights=0.64725]
Epoch 30: 100%|##########| 20/20 [00:00<00:00, 23.13it/s, loss=-0.12390, sqweights=0.72404, train_loss=-0.17213, train_sqweights=0.65782, val_loss=-0.13702, val_sqweights=0.64725]
Epoch 31: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 31: 5%|5 | 1/20 [00:00<00:00, 48.38it/s, loss=-0.09259, sqweights=0.72724]
Epoch 31: 10%|# | 2/20 [00:00<00:00, 55.89it/s, loss=-0.10033, sqweights=0.73332]
Epoch 31: 15%|#5 | 3/20 [00:00<00:00, 57.96it/s, loss=-0.11162, sqweights=0.74179]
Epoch 31: 20%|## | 4/20 [00:00<00:00, 59.87it/s, loss=-0.11613, sqweights=0.74979]
Epoch 31: 25%|##5 | 5/20 [00:00<00:00, 61.03it/s, loss=-0.12089, sqweights=0.75120]
Epoch 31: 30%|### | 6/20 [00:00<00:00, 61.85it/s, loss=-0.11869, sqweights=0.74953]
Epoch 31: 35%|###5 | 7/20 [00:00<00:00, 62.53it/s, loss=-0.11869, sqweights=0.74953]
Epoch 31: 35%|###5 | 7/20 [00:00<00:00, 62.53it/s, loss=-0.12161, sqweights=0.74850]
Epoch 31: 40%|#### | 8/20 [00:00<00:00, 62.53it/s, loss=-0.12072, sqweights=0.74884]
Epoch 31: 45%|####5 | 9/20 [00:00<00:00, 62.53it/s, loss=-0.11640, sqweights=0.74645]
Epoch 31: 50%|##### | 10/20 [00:00<00:00, 62.53it/s, loss=-0.11726, sqweights=0.74481]
Epoch 31: 55%|#####5 | 11/20 [00:00<00:00, 62.53it/s, loss=-0.11729, sqweights=0.74667]
Epoch 31: 60%|###### | 12/20 [00:00<00:00, 62.53it/s, loss=-0.11647, sqweights=0.74462]
Epoch 31: 65%|######5 | 13/20 [00:00<00:00, 62.53it/s, loss=-0.11742, sqweights=0.74466]
Epoch 31: 70%|####### | 14/20 [00:00<00:00, 64.72it/s, loss=-0.11742, sqweights=0.74466]
Epoch 31: 70%|####### | 14/20 [00:00<00:00, 64.72it/s, loss=-0.11848, sqweights=0.74543]
Epoch 31: 75%|#######5 | 15/20 [00:00<00:00, 64.72it/s, loss=-0.11978, sqweights=0.74593]
Epoch 31: 80%|######## | 16/20 [00:00<00:00, 64.72it/s, loss=-0.12041, sqweights=0.74576]
Epoch 31: 85%|########5 | 17/20 [00:00<00:00, 64.72it/s, loss=-0.12080, sqweights=0.74493]
Epoch 31: 90%|######### | 18/20 [00:00<00:00, 64.72it/s, loss=-0.12018, sqweights=0.74519]
Epoch 31: 95%|#########5| 19/20 [00:00<00:00, 64.72it/s, loss=-0.12111, sqweights=0.74666]
Epoch 31: 100%|##########| 20/20 [00:00<00:00, 64.72it/s, loss=-0.12054, sqweights=0.74705]
Epoch 31: 100%|##########| 20/20 [00:00<00:00, 64.72it/s, loss=-0.12054, sqweights=0.74705, train_loss=-0.17365, train_sqweights=0.67195, val_loss=-0.13798, val_sqweights=0.66128]
Epoch 31: 100%|##########| 20/20 [00:00<00:00, 21.21it/s, loss=-0.12054, sqweights=0.74705, train_loss=-0.17365, train_sqweights=0.67195, val_loss=-0.13798, val_sqweights=0.66128]
Epoch 32: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 32: 5%|5 | 1/20 [00:00<00:00, 45.71it/s, loss=-0.13522, sqweights=0.74692]
Epoch 32: 10%|# | 2/20 [00:00<00:00, 52.80it/s, loss=-0.14122, sqweights=0.74113]
Epoch 32: 15%|#5 | 3/20 [00:00<00:00, 56.73it/s, loss=-0.12747, sqweights=0.74165]
Epoch 32: 20%|## | 4/20 [00:00<00:00, 58.95it/s, loss=-0.12905, sqweights=0.74697]
Epoch 32: 25%|##5 | 5/20 [00:00<00:00, 60.17it/s, loss=-0.13142, sqweights=0.74708]
Epoch 32: 30%|### | 6/20 [00:00<00:00, 61.02it/s, loss=-0.13001, sqweights=0.74666]
Epoch 32: 35%|###5 | 7/20 [00:00<00:00, 61.75it/s, loss=-0.13001, sqweights=0.74666]
Epoch 32: 35%|###5 | 7/20 [00:00<00:00, 61.75it/s, loss=-0.12976, sqweights=0.74630]
Epoch 32: 40%|#### | 8/20 [00:00<00:00, 61.75it/s, loss=-0.12813, sqweights=0.74604]
Epoch 32: 45%|####5 | 9/20 [00:00<00:00, 61.75it/s, loss=-0.12759, sqweights=0.74600]
Epoch 32: 50%|##### | 10/20 [00:00<00:00, 61.75it/s, loss=-0.12956, sqweights=0.74829]
Epoch 32: 55%|#####5 | 11/20 [00:00<00:00, 61.75it/s, loss=-0.13121, sqweights=0.75011]
Epoch 32: 60%|###### | 12/20 [00:00<00:00, 61.75it/s, loss=-0.12879, sqweights=0.75109]
Epoch 32: 65%|######5 | 13/20 [00:00<00:00, 61.75it/s, loss=-0.12745, sqweights=0.74950]
Epoch 32: 70%|####### | 14/20 [00:00<00:00, 64.14it/s, loss=-0.12745, sqweights=0.74950]
Epoch 32: 70%|####### | 14/20 [00:00<00:00, 64.14it/s, loss=-0.12787, sqweights=0.74926]
Epoch 32: 75%|#######5 | 15/20 [00:00<00:00, 64.14it/s, loss=-0.12877, sqweights=0.75067]
Epoch 32: 80%|######## | 16/20 [00:00<00:00, 64.14it/s, loss=-0.12654, sqweights=0.75050]
Epoch 32: 85%|########5 | 17/20 [00:00<00:00, 64.14it/s, loss=-0.12588, sqweights=0.74972]
Epoch 32: 90%|######### | 18/20 [00:00<00:00, 64.14it/s, loss=-0.12632, sqweights=0.74991]
Epoch 32: 95%|#########5| 19/20 [00:00<00:00, 64.14it/s, loss=-0.12755, sqweights=0.75047]
Epoch 32: 100%|##########| 20/20 [00:00<00:00, 64.14it/s, loss=-0.12623, sqweights=0.75160]
Epoch 32: 100%|##########| 20/20 [00:00<00:00, 64.14it/s, loss=-0.12623, sqweights=0.75160, train_loss=-0.17534, train_sqweights=0.68555, val_loss=-0.13862, val_sqweights=0.67477]
Epoch 32: 100%|##########| 20/20 [00:00<00:00, 23.14it/s, loss=-0.12623, sqweights=0.75160, train_loss=-0.17534, train_sqweights=0.68555, val_loss=-0.13862, val_sqweights=0.67477]
Epoch 33: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 33: 5%|5 | 1/20 [00:00<00:00, 48.62it/s, loss=-0.12935, sqweights=0.74396]
Epoch 33: 10%|# | 2/20 [00:00<00:00, 56.17it/s, loss=-0.13060, sqweights=0.75081]
Epoch 33: 15%|#5 | 3/20 [00:00<00:00, 58.44it/s, loss=-0.13392, sqweights=0.75941]
Epoch 33: 20%|## | 4/20 [00:00<00:00, 60.19it/s, loss=-0.13369, sqweights=0.76078]
Epoch 33: 25%|##5 | 5/20 [00:00<00:00, 60.65it/s, loss=-0.13166, sqweights=0.75946]
Epoch 33: 30%|### | 6/20 [00:00<00:00, 61.57it/s, loss=-0.12623, sqweights=0.76146]
Epoch 33: 35%|###5 | 7/20 [00:00<00:00, 62.23it/s, loss=-0.12623, sqweights=0.76146]
Epoch 33: 35%|###5 | 7/20 [00:00<00:00, 62.23it/s, loss=-0.12264, sqweights=0.76469]
Epoch 33: 40%|#### | 8/20 [00:00<00:00, 62.23it/s, loss=-0.12068, sqweights=0.76479]
Epoch 33: 45%|####5 | 9/20 [00:00<00:00, 62.23it/s, loss=-0.11919, sqweights=0.76266]
Epoch 33: 50%|##### | 10/20 [00:00<00:00, 62.23it/s, loss=-0.12032, sqweights=0.76194]
Epoch 33: 55%|#####5 | 11/20 [00:00<00:00, 62.23it/s, loss=-0.12107, sqweights=0.76327]
Epoch 33: 60%|###### | 12/20 [00:00<00:00, 62.23it/s, loss=-0.12092, sqweights=0.76261]
Epoch 33: 65%|######5 | 13/20 [00:00<00:00, 62.23it/s, loss=-0.12264, sqweights=0.76134]
Epoch 33: 70%|####### | 14/20 [00:00<00:00, 64.21it/s, loss=-0.12264, sqweights=0.76134]
Epoch 33: 70%|####### | 14/20 [00:00<00:00, 64.21it/s, loss=-0.12231, sqweights=0.76107]
Epoch 33: 75%|#######5 | 15/20 [00:00<00:00, 64.21it/s, loss=-0.12110, sqweights=0.76196]
Epoch 33: 80%|######## | 16/20 [00:00<00:00, 64.21it/s, loss=-0.12291, sqweights=0.76391]
Epoch 33: 85%|########5 | 17/20 [00:00<00:00, 64.21it/s, loss=-0.12259, sqweights=0.76407]
Epoch 33: 90%|######### | 18/20 [00:00<00:00, 64.21it/s, loss=-0.12318, sqweights=0.76442]
Epoch 33: 95%|#########5| 19/20 [00:00<00:00, 64.21it/s, loss=-0.12400, sqweights=0.76425]
Epoch 33: 100%|##########| 20/20 [00:00<00:00, 64.21it/s, loss=-0.12517, sqweights=0.76387]
Epoch 33: 100%|##########| 20/20 [00:00<00:00, 64.21it/s, loss=-0.12517, sqweights=0.76387, train_loss=-0.17698, train_sqweights=0.69919, val_loss=-0.13990, val_sqweights=0.68942]
Epoch 33: 100%|##########| 20/20 [00:00<00:00, 23.15it/s, loss=-0.12517, sqweights=0.76387, train_loss=-0.17698, train_sqweights=0.69919, val_loss=-0.13990, val_sqweights=0.68942]
Epoch 34: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 34: 5%|5 | 1/20 [00:00<00:00, 47.65it/s, loss=-0.13294, sqweights=0.76685]
Epoch 34: 10%|# | 2/20 [00:00<00:00, 55.13it/s, loss=-0.12388, sqweights=0.76946]
Epoch 34: 15%|#5 | 3/20 [00:00<00:00, 58.41it/s, loss=-0.11935, sqweights=0.77068]
Epoch 34: 20%|## | 4/20 [00:00<00:00, 60.26it/s, loss=-0.12583, sqweights=0.77345]
Epoch 34: 25%|##5 | 5/20 [00:00<00:00, 61.50it/s, loss=-0.12629, sqweights=0.77351]
Epoch 34: 30%|### | 6/20 [00:00<00:00, 62.27it/s, loss=-0.12823, sqweights=0.77563]
Epoch 34: 35%|###5 | 7/20 [00:00<00:00, 62.70it/s, loss=-0.12823, sqweights=0.77563]
Epoch 34: 35%|###5 | 7/20 [00:00<00:00, 62.70it/s, loss=-0.12418, sqweights=0.78051]
Epoch 34: 40%|#### | 8/20 [00:00<00:00, 62.70it/s, loss=-0.12477, sqweights=0.78011]
Epoch 34: 45%|####5 | 9/20 [00:00<00:00, 62.70it/s, loss=-0.12312, sqweights=0.77895]
Epoch 34: 50%|##### | 10/20 [00:00<00:00, 62.70it/s, loss=-0.12440, sqweights=0.77896]
Epoch 34: 55%|#####5 | 11/20 [00:00<00:00, 62.70it/s, loss=-0.12242, sqweights=0.77810]
Epoch 34: 60%|###### | 12/20 [00:00<00:00, 62.70it/s, loss=-0.12372, sqweights=0.77726]
Epoch 34: 65%|######5 | 13/20 [00:00<00:00, 62.70it/s, loss=-0.12335, sqweights=0.77693]
Epoch 34: 70%|####### | 14/20 [00:00<00:00, 64.49it/s, loss=-0.12335, sqweights=0.77693]
Epoch 34: 70%|####### | 14/20 [00:00<00:00, 64.49it/s, loss=-0.12301, sqweights=0.77730]
Epoch 34: 75%|#######5 | 15/20 [00:00<00:00, 64.49it/s, loss=-0.12277, sqweights=0.77655]
Epoch 34: 80%|######## | 16/20 [00:00<00:00, 64.49it/s, loss=-0.12395, sqweights=0.77742]
Epoch 34: 85%|########5 | 17/20 [00:00<00:00, 64.49it/s, loss=-0.12553, sqweights=0.77770]
Epoch 34: 90%|######### | 18/20 [00:00<00:00, 64.49it/s, loss=-0.12614, sqweights=0.77760]
Epoch 34: 95%|#########5| 19/20 [00:00<00:00, 64.49it/s, loss=-0.12586, sqweights=0.77807]
Epoch 34: 100%|##########| 20/20 [00:00<00:00, 64.49it/s, loss=-0.12656, sqweights=0.77775]
Epoch 34: 100%|##########| 20/20 [00:00<00:00, 64.49it/s, loss=-0.12656, sqweights=0.77775, train_loss=-0.17830, train_sqweights=0.71368, val_loss=-0.14049, val_sqweights=0.70585]
Epoch 34: 100%|##########| 20/20 [00:00<00:00, 21.12it/s, loss=-0.12656, sqweights=0.77775, train_loss=-0.17830, train_sqweights=0.71368, val_loss=-0.14049, val_sqweights=0.70585]
Epoch 35: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 35: 5%|5 | 1/20 [00:00<00:00, 48.15it/s, loss=-0.12864, sqweights=0.78205]
Epoch 35: 10%|# | 2/20 [00:00<00:00, 55.58it/s, loss=-0.14016, sqweights=0.78308]
Epoch 35: 15%|#5 | 3/20 [00:00<00:00, 58.32it/s, loss=-0.13643, sqweights=0.78184]
Epoch 35: 20%|## | 4/20 [00:00<00:00, 60.14it/s, loss=-0.13149, sqweights=0.77668]
Epoch 35: 25%|##5 | 5/20 [00:00<00:00, 61.00it/s, loss=-0.12778, sqweights=0.77462]
Epoch 35: 30%|### | 6/20 [00:00<00:00, 61.88it/s, loss=-0.12557, sqweights=0.77656]
Epoch 35: 35%|###5 | 7/20 [00:00<00:00, 62.27it/s, loss=-0.12557, sqweights=0.77656]
Epoch 35: 35%|###5 | 7/20 [00:00<00:00, 62.27it/s, loss=-0.12502, sqweights=0.77585]
Epoch 35: 40%|#### | 8/20 [00:00<00:00, 62.27it/s, loss=-0.12160, sqweights=0.77731]
Epoch 35: 45%|####5 | 9/20 [00:00<00:00, 62.27it/s, loss=-0.12744, sqweights=0.77763]
Epoch 35: 50%|##### | 10/20 [00:00<00:00, 62.27it/s, loss=-0.12637, sqweights=0.77667]
Epoch 35: 55%|#####5 | 11/20 [00:00<00:00, 62.27it/s, loss=-0.12754, sqweights=0.77724]
Epoch 35: 60%|###### | 12/20 [00:00<00:00, 62.27it/s, loss=-0.12826, sqweights=0.77748]
Epoch 35: 65%|######5 | 13/20 [00:00<00:00, 62.27it/s, loss=-0.12708, sqweights=0.77690]
Epoch 35: 70%|####### | 14/20 [00:00<00:00, 64.45it/s, loss=-0.12708, sqweights=0.77690]
Epoch 35: 70%|####### | 14/20 [00:00<00:00, 64.45it/s, loss=-0.12727, sqweights=0.77724]
Epoch 35: 75%|#######5 | 15/20 [00:00<00:00, 64.45it/s, loss=-0.12587, sqweights=0.77780]
Epoch 35: 80%|######## | 16/20 [00:00<00:00, 64.45it/s, loss=-0.12542, sqweights=0.77894]
Epoch 35: 85%|########5 | 17/20 [00:00<00:00, 64.45it/s, loss=-0.12523, sqweights=0.77943]
Epoch 35: 90%|######### | 18/20 [00:00<00:00, 64.45it/s, loss=-0.12420, sqweights=0.77938]
Epoch 35: 95%|#########5| 19/20 [00:00<00:00, 64.45it/s, loss=-0.12445, sqweights=0.77954]
Epoch 35: 100%|##########| 20/20 [00:00<00:00, 64.45it/s, loss=-0.12175, sqweights=0.78087]
Epoch 35: 100%|##########| 20/20 [00:00<00:00, 64.45it/s, loss=-0.12175, sqweights=0.78087, train_loss=-0.17931, train_sqweights=0.72477, val_loss=-0.14091, val_sqweights=0.71813]
Epoch 35: 100%|##########| 20/20 [00:00<00:00, 23.15it/s, loss=-0.12175, sqweights=0.78087, train_loss=-0.17931, train_sqweights=0.72477, val_loss=-0.14091, val_sqweights=0.71813]
Epoch 36: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 36: 5%|5 | 1/20 [00:00<00:00, 48.24it/s, loss=-0.11660, sqweights=0.78016]
Epoch 36: 10%|# | 2/20 [00:00<00:00, 54.94it/s, loss=-0.12011, sqweights=0.78780]
Epoch 36: 15%|#5 | 3/20 [00:00<00:00, 58.39it/s, loss=-0.13007, sqweights=0.79828]
Epoch 36: 20%|## | 4/20 [00:00<00:00, 60.32it/s, loss=-0.13182, sqweights=0.79655]
Epoch 36: 25%|##5 | 5/20 [00:00<00:00, 61.51it/s, loss=-0.13046, sqweights=0.79330]
Epoch 36: 30%|### | 6/20 [00:00<00:00, 62.34it/s, loss=-0.12952, sqweights=0.79457]
Epoch 36: 35%|###5 | 7/20 [00:00<00:00, 62.96it/s, loss=-0.12952, sqweights=0.79457]
Epoch 36: 35%|###5 | 7/20 [00:00<00:00, 62.96it/s, loss=-0.11986, sqweights=0.79544]
Epoch 36: 40%|#### | 8/20 [00:00<00:00, 62.96it/s, loss=-0.12021, sqweights=0.79225]
Epoch 36: 45%|####5 | 9/20 [00:00<00:00, 62.96it/s, loss=-0.12150, sqweights=0.79106]
Epoch 36: 50%|##### | 10/20 [00:00<00:00, 62.96it/s, loss=-0.12218, sqweights=0.79176]
Epoch 36: 55%|#####5 | 11/20 [00:00<00:00, 62.96it/s, loss=-0.12355, sqweights=0.79211]
Epoch 36: 60%|###### | 12/20 [00:00<00:00, 62.96it/s, loss=-0.12257, sqweights=0.78987]
Epoch 36: 65%|######5 | 13/20 [00:00<00:00, 62.96it/s, loss=-0.12206, sqweights=0.79165]
Epoch 36: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.12206, sqweights=0.79165]
Epoch 36: 70%|####### | 14/20 [00:00<00:00, 64.95it/s, loss=-0.12369, sqweights=0.79161]
Epoch 36: 75%|#######5 | 15/20 [00:00<00:00, 64.95it/s, loss=-0.12333, sqweights=0.79395]
Epoch 36: 80%|######## | 16/20 [00:00<00:00, 64.95it/s, loss=-0.12314, sqweights=0.79348]
Epoch 36: 85%|########5 | 17/20 [00:00<00:00, 64.95it/s, loss=-0.12199, sqweights=0.79278]
Epoch 36: 90%|######### | 18/20 [00:00<00:00, 64.95it/s, loss=-0.12109, sqweights=0.79434]
Epoch 36: 95%|#########5| 19/20 [00:00<00:00, 64.95it/s, loss=-0.12139, sqweights=0.79414]
Epoch 36: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.12216, sqweights=0.79527]
Epoch 36: 100%|##########| 20/20 [00:00<00:00, 64.95it/s, loss=-0.12216, sqweights=0.79527, train_loss=-0.18013, train_sqweights=0.73527, val_loss=-0.14157, val_sqweights=0.72981]
Epoch 36: 100%|##########| 20/20 [00:00<00:00, 23.21it/s, loss=-0.12216, sqweights=0.79527, train_loss=-0.18013, train_sqweights=0.73527, val_loss=-0.14157, val_sqweights=0.72981]
Epoch 37: 0%| | 0/20 [00:00<?, ?it/s]
Epoch 37: 5%|5 | 1/20 [00:00<00:00, 48.39it/s, loss=-0.12368, sqweights=0.79008]
Epoch 37: 10%|# | 2/20 [00:00<00:00, 55.55it/s, loss=-0.13102, sqweights=0.78963]
Epoch 37: 15%|#5 | 3/20 [00:00<00:00, 58.78it/s, loss=-0.13559, sqweights=0.79313]
Epoch 37: 20%|## | 4/20 [00:00<00:00, 60.54it/s, loss=-0.12363, sqweights=0.79869]
Epoch 37: 25%|##5 | 5/20 [00:00<00:00, 61.67it/s, loss=-0.12969, sqweights=0.79935]
Epoch 37: 30%|### | 6/20 [00:00<00:00, 62.45it/s, loss=-0.12704, sqweights=0.80012]
Epoch 37: 35%|###5 | 7/20 [00:00<00:00, 63.03it/s, loss=-0.12704, sqweights=0.80012]
Epoch 37: 35%|###5 | 7/20 [00:00<00:00, 63.03it/s, loss=-0.13051, sqweights=0.80045]
Epoch 37: 40%|#### | 8/20 [00:00<00:00, 63.03it/s, loss=-0.13055, sqweights=0.80521]
Epoch 37: 45%|####5 | 9/20 [00:00<00:00, 63.03it/s, loss=-0.12914, sqweights=0.80261]
Epoch 37: 50%|##### | 10/20 [00:00<00:00, 63.03it/s, loss=-0.12759, sqweights=0.80055]
Epoch 37: 55%|#####5 | 11/20 [00:00<00:00, 63.03it/s, loss=-0.12790, sqweights=0.80067]
Epoch 37: 60%|###### | 12/20 [00:00<00:00, 63.03it/s, loss=-0.12941, sqweights=0.80171]
Epoch 37: 65%|######5 | 13/20 [00:00<00:00, 63.03it/s, loss=-0.12899, sqweights=0.80194]
Epoch 37: 70%|####### | 14/20 [00:00<00:00, 64.83it/s, loss=-0.12899, sqweights=0.80194]
Epoch 37: 70%|####### | 14/20 [00:00<00:00, 64.83it/s, loss=-0.12863, sqweights=0.80253]
Epoch 37: 75%|#######5 | 15/20 [00:00<00:00, 64.83it/s, loss=-0.12638, sqweights=0.80435]
Epoch 37: 80%|######## | 16/20 [00:00<00:00, 64.83it/s, loss=-0.12891, sqweights=0.80355]
Epoch 37: 85%|########5 | 17/20 [00:00<00:00, 64.83it/s, loss=-0.12834, sqweights=0.80343]
Epoch 37: 90%|######### | 18/20 [00:00<00:00, 64.83it/s, loss=-0.12633, sqweights=0.80364]
Epoch 37: 95%|#########5| 19/20 [00:00<00:00, 64.83it/s, loss=-0.12501, sqweights=0.80556]
Epoch 37: 100%|##########| 20/20 [00:00<00:00, 64.83it/s, loss=-0.12675, sqweights=0.80672]
Epoch 37: 100%|##########| 20/20 [00:00<00:00, 64.83it/s, loss=-0.12675, sqweights=0.80672, train_loss=-0.18092, train_sqweights=0.74587, val_loss=-0.14174, val_sqweights=0.74047]
Epoch 37: 100%|##########| 20/20 [00:00<00:00, 21.09it/s, loss=-0.12675, sqweights=0.80672, train_loss=-0.18092, train_sqweights=0.74587, val_loss=-0.14174, val_sqweights=0.74047]
Epoch 38: 100%|##########| 20/20 [00:00<00:00, 23.17it/s, loss=-0.12958, sqweights=0.81021, train_loss=-0.18177, train_sqweights=0.75785, val_loss=-0.14198, val_sqweights=0.75335]
Epoch 39: 100%|##########| 20/20 [00:00<00:00, 23.15it/s, loss=-0.12951, sqweights=0.82506, train_loss=-0.18299, train_sqweights=0.77106, val_loss=-0.14267, val_sqweights=0.76645]
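The sqweights column in the log can be read as a concentration measure. The sketch below assumes that SquaredWeights reports the sum of squared portfolio weights per sample, which is consistent with the benchmark table (0.125 for the equally weighted portfolio over 8 assets, 1.000 for the one-hot VAR benchmark). The helper sum_sq_weights is hypothetical and only illustrates the arithmetic; it is not deepdow's implementation.

```python
import numpy as np


def sum_sq_weights(weights):
    """Sum of squared weights for each row of a (n_samples, n_assets) array."""
    weights = np.asarray(weights, dtype=float)
    return (weights ** 2).sum(axis=-1)


n_assets = 8
equal = np.full((1, n_assets), 1 / n_assets)  # 1overN-style allocation
one_hot = np.eye(n_assets)[[3]]               # everything in a single asset

sq_equal = sum_sq_weights(equal)[0]      # 8 * (1/8)**2
sq_one_hot = sum_sq_weights(one_hot)[0]  # 1**2

# Inverse of the last running sqweights value logged in epoch 39:
# a rough "effective number of assets" held by the network.
effective_assets = 1 / 0.82506
```

Under this reading, a running value of about 0.83 corresponds to roughly 1.2 effective assets, so the network moves from a near-uniform allocation toward the single-asset rule used by the VAR benchmark.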
import numpy as np
import torch
import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VARProcess, forecast
from deepdow.benchmarks import OneOverN, Benchmark, InverseVolatility, Random
from deepdow.callbacks import EarlyStoppingCallback
from deepdow.data import InRAMDataset, RigidDataLoader
from deepdow.losses import MeanReturns, SquaredWeights
from deepdow.nn import LinearNet
from deepdow.experiments import Run
class VARTrue(Benchmark):
    """Benchmark representing the ground truth return process.

    Parameters
    ----------
    process : statsmodels.tsa.vector_ar.var_model.VARProcess
        The ground truth VAR process that generates the returns.
    """

    def __init__(self, process):
        self.process = process

    def __call__(self, x):
        """Invest all money into the asset with the highest return over the horizon."""
        n_samples, n_channels, lookback, n_assets = x.shape
        assert n_channels == 1

        x_np = x.detach().numpy()  # (n_samples, n_channels, lookback, n_assets)
        weights_list = [forecast(x_np[i, 0], self.process.coefs, None, 1).argmax() for i in range(n_samples)]
        result = torch.zeros(n_samples, n_assets).to(x.dtype)

        for i, w_ix in enumerate(weights_list):
            result[i, w_ix] = 1

        return result
coefs = np.load('var_coefs.npy') # (lookback, n_assets, n_assets) = (12, 8, 8)
# Parameters
lookback, _, n_assets = coefs.shape
gap, horizon = 0, 1
batch_size = 256
# Simulate returns
process = VARProcess(coefs, None, np.eye(n_assets) * 1e-5)
data = process.simulate_var(10000)
n_timesteps = len(data)
# Create features and targets
X_list, y_list = [], []
for i in range(lookback, n_timesteps - horizon - gap + 1):
    X_list.append(data[i - lookback: i, :])
    y_list.append(data[i + gap: i + gap + horizon, :])
X = np.stack(X_list, axis=0)[:, None, ...]
y = np.stack(y_list, axis=0)[:, None, ...]
# Setup deepdow framework
dataset = InRAMDataset(X, y)
network = LinearNet(1, lookback, n_assets, p=0.5)
dataloader = RigidDataLoader(dataset,
                             indices=list(range(5000)),
                             batch_size=batch_size,
                             lookback=lookback)
val_dataloaders = {'train': dataloader,
                   'val': RigidDataLoader(dataset,
                                          indices=list(range(5020, 9800)),
                                          batch_size=batch_size,
                                          lookback=lookback)}
run = Run(network,
          100 * MeanReturns(),
          dataloader,
          val_dataloaders=val_dataloaders,
          metrics={'sqweights': SquaredWeights()},
          benchmarks={'1overN': OneOverN(),
                      'VAR': VARTrue(process),
                      'Random': Random(),
                      'InverseVol': InverseVolatility()},
          optimizer=torch.optim.Adam(network.parameters(), amsgrad=True),
          callbacks=[EarlyStoppingCallback('val', 'loss')]
          )
history = run.launch(40)
fig, ax = plt.subplots(1, 1)
ax.set_title('Validation loss')
per_epoch_results = history.metrics.groupby(['dataloader', 'metric', 'model', 'epoch'])['value'].mean()['val']['loss']
our = per_epoch_results['network']
our.plot(ax=ax, label='network')
ax.hlines(y=per_epoch_results['VAR'], xmin=0, xmax=len(our), color='red', label='VAR')
ax.hlines(y=per_epoch_results['1overN'], xmin=0, xmax=len(our), color='green', label='1overN')
ax.hlines(y=per_epoch_results['Random'], xmin=0, xmax=len(our), color='yellow', label='Random')
ax.hlines(y=per_epoch_results['InverseVol'], xmin=0, xmax=len(our), color='black', label='InverseVol')
plt.legend()
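The feature/target construction and the VARTrue allocation rule can be checked on a toy example that needs neither deepdow nor statsmodels. The following is a minimal NumPy sketch under stated assumptions: the VAR has no intercept, coefs[j] multiplies the observation j + 1 steps back (the usual VAR(p) convention), and the helpers make_windows, one_step_forecast and argmax_weights are hypothetical names introduced only for illustration. The toy coefficients are made up and are not the ones stored in var_coefs.npy.

```python
import numpy as np


def make_windows(data, lookback, horizon=1, gap=0):
    """Slice (n_timesteps, n_assets) data into lagged features X and future targets y."""
    X_list, y_list = [], []
    for i in range(lookback, len(data) - horizon - gap + 1):
        X_list.append(data[i - lookback:i])
        y_list.append(data[i + gap:i + gap + horizon])
    # Add a singleton channel axis: (n_samples, 1, lookback or horizon, n_assets)
    return np.stack(X_list)[:, None], np.stack(y_list)[:, None]


def one_step_forecast(window, coefs):
    """Zero-intercept VAR forecast: sum over j of coefs[j] @ y_{t-1-j}."""
    # window[-1] is the most recent observation, so reverse it to align with coefs[0]
    return sum(A @ y_lag for A, y_lag in zip(coefs, window[::-1]))


def argmax_weights(X, coefs):
    """Put weight 1 on the asset with the highest one-step forecast."""
    n_samples, _, _, n_assets = X.shape
    weights = np.zeros((n_samples, n_assets))
    best = [one_step_forecast(X[i, 0], coefs).argmax() for i in range(n_samples)]
    weights[np.arange(n_samples), best] = 1.0
    return weights


# Toy data: 20 timesteps, 3 assets, 2 lags (made-up values)
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 3))
coefs = np.stack([0.5 * np.eye(3), 0.1 * np.eye(3)])

X, y = make_windows(data, lookback=2)
weights = argmax_weights(X, coefs)
```

With 20 timesteps, a lookback of 2 and a horizon of 1, the loop yields 18 samples, and each row of weights is a one-hot vector. This is why a benchmark of this kind reports a squared-weights value of exactly 1.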
Total running time of the script: (0 minutes 42.128 seconds)