| end of split 1 / 54 | epoch 1 | time: 4081.25s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000
| end of split 2 / 54 | epoch 1 | time: 4086.63s | valid loss 1.21 | valid ppl 3.34 | learning rate 20.0000
| end of split 3 / 54 | epoch 1 | time: 4090.54s | valid loss 1.14 | valid ppl 3.11 | learning rate 20.0000
| end of split 4 / 54 | epoch 1 | time: 4091.78s | valid loss 1.10 | valid ppl 3.00 | learning rate 20.0000
| end of split 5 / 54 | epoch 1 | time: 4082.69s | valid loss 1.09 | valid ppl 2.96 | learning rate 20.0000
| end of split 6 / 54 | epoch 1 | time: 4082.49s | valid loss 1.06 | valid ppl 2.88 | learning rate 20.0000
| end of split 7 / 54 | epoch 1 | time: 4082.99s | valid loss 1.04 | valid ppl 2.83 | learning rate 20.0000
| end of split 8 / 54 | epoch 1 | time: 4083.30s | valid loss 1.04 | valid ppl 2.82 | learning rate 20.0000
| end of split 9 / 54 | epoch 1 | time: 4085.84s | valid loss 1.03 | valid ppl 2.79 | learning rate 20.0000
| end of split 10 / 54 | epoch 1 | time: 4090.64s | valid loss 1.02 | valid ppl 2.77 | learning rate 20.0000
| end of split 11 / 54 | epoch 1 | time: 4090.81s | valid loss 1.01 | valid ppl 2.74 | learning rate 20.0000
| end of split 1 / 54 | epoch 1 | time: 4116.95s | valid loss 1.01 | valid ppl 2.74 | learning rate 20.0000
| end of split 2 / 54 | epoch 1 | time: 4121.16s | valid loss 1.00 | valid ppl 2.72 | learning rate 20.0000
| end of split 3 / 54 | epoch 1 | time: 4127.33s | valid loss 1.00 | valid ppl 2.71 | learning rate 20.0000
| end of split 4 / 54 | epoch 1 | time: 4127.00s | valid loss 0.99 | valid ppl 2.70 | learning rate 20.0000
| end of split 5 / 54 | epoch 1 | time: 4118.07s | valid loss 1.00 | valid ppl 2.71 | learning rate 20.0000
| end of split 6 / 54 | epoch 1 | time: 4121.31s | valid loss 0.99 | valid ppl 2.68 | learning rate 20.0000
| end of split 7 / 54 | epoch 1 | time: 4122.89s | valid loss 0.98 | valid ppl 2.67 | learning rate 20.0000
| end of split 8 / 54 | epoch 1 | time: 4117.48s | valid loss 0.98 | valid ppl 2.67 | learning rate 20.0000
| end of split 9 / 54 | epoch 1 | time: 4111.90s | valid loss 0.98 | valid ppl 2.66 | learning rate 20.0000
| end of split 10 / 54 | epoch 1 | time: 4094.19s | valid loss 0.98 | valid ppl 2.65 | learning rate 20.0000
| end of split 11 / 54 | epoch 1 | time: 4101.30s | valid loss 0.97 | valid ppl 2.64 | learning rate 20.0000
| end of split 12 / 54 | epoch 1 | time: 4090.08s | valid loss 0.98 | valid ppl 2.66 | learning rate 20.0000
| end of split 13 / 54 | epoch 1 | time: 4088.73s | valid loss 0.97 | valid ppl 2.64 | learning rate 20.0000
| end of split 14 / 54 | epoch 1 | time: 4085.64s | valid loss 0.97 | valid ppl 2.63 | learning rate 20.0000
| end of split 15 / 54 | epoch 1 | time: 4081.74s | valid loss 0.97 | valid ppl 2.63 | learning rate 20.0000
| end of split 16 / 54 | epoch 1 | time: 4081.89s | valid loss 0.96 | valid ppl 2.62 | learning rate 20.0000
| end of split 17 / 54 | epoch 1 | time: 4083.98s | valid loss 0.96 | valid ppl 2.61 | learning rate 20.0000
| end of split 18 / 54 | epoch 1 | time: 4083.58s | valid loss 0.96 | valid ppl 2.61 | learning rate 20.0000
| end of split 19 / 54 | epoch 1 | time: 4082.55s | valid loss 0.96 | valid ppl 2.61 | learning rate 20.0000
| end of split 20 / 54 | epoch 1 | time: 4087.93s | valid loss 0.96 | valid ppl 2.60 | learning rate 20.0000
| end of split 1 / 34 | epoch 1 | time: 4075.41s | valid loss 0.95 | valid ppl 2.60 | learning rate 20.0000
| end of split 2 / 34 | epoch 1 | time: 394.20s | valid loss 1.08 | valid ppl 2.94 | learning rate 20.0000
| end of split 3 / 34 | epoch 1 | time: 4099.18s | valid loss 0.96 | valid ppl 2.60 | learning rate 20.0000
| end of split 4 / 34 | epoch 1 | time: 4099.37s | valid loss 0.95 | valid ppl 2.59 | learning rate 20.0000
| end of split 5 / 34 | epoch 1 | time: 4090.51s | valid loss 0.95 | valid ppl 2.59 | learning rate 20.0000
| end of split 6 / 34 | epoch 1 | time: 4095.56s | valid loss 0.95 | valid ppl 2.58 | learning rate 20.0000
| end of split 7 / 34 | epoch 1 | time: 4088.98s | valid loss 0.95 | valid ppl 2.58 | learning rate 20.0000
| end of split 8 / 34 | epoch 1 | time: 4077.35s | valid loss 0.95 | valid ppl 2.58 | learning rate 20.0000
| end of split 9 / 34 | epoch 1 | time: 4080.62s | valid loss 0.95 | valid ppl 2.58 | learning rate 20.0000
| end of split 1 / 34 | epoch 1 | time: 4052.51s | valid loss 0.95 | valid ppl 2.58 | learning rate 20.0000
| end of split 2 / 34 | epoch 1 | time: 393.17s | valid loss 1.06 | valid ppl 2.89 | learning rate 20.0000
| end of split 3 / 34 | epoch 1 | time: 4087.88s | valid loss 0.95 | valid ppl 2.58 | learning rate 20.0000
| end of split 4 / 34 | epoch 1 | time: 4087.17s | valid loss 0.95 | valid ppl 2.57 | learning rate 20.0000
| end of split 5 / 34 | epoch 1 | time: 4086.04s | valid loss 0.95 | valid ppl 2.57 | learning rate 20.0000
| end of split 6 / 34 | epoch 1 | time: 4084.87s | valid loss 0.94 | valid ppl 2.57 | learning rate 20.0000
| end of split 7 / 34 | epoch 1 | time: 4088.46s | valid loss 0.94 | valid ppl 2.57 | learning rate 20.0000
| end of split 8 / 34 | epoch 1 | time: 4078.42s | valid loss 0.94 | valid ppl 2.56 | learning rate 20.0000
| end of split 9 / 34 | epoch 1 | time: 4066.07s | valid loss 0.94 | valid ppl 2.57 | learning rate 20.0000
| end of split 10 / 34 | epoch 1 | time: 4089.51s | valid loss 0.94 | valid ppl 2.56 | learning rate 20.0000
| end of split 11 / 34 | epoch 1 | time: 4083.77s | valid loss 0.94 | valid ppl 2.56 | learning rate 20.0000
| end of split 12 / 34 | epoch 1 | time: 393.53s | valid loss 0.94 | valid ppl 2.56 | learning rate 20.0000
| end of split 13 / 34 | epoch 1 | time: 4082.31s | valid loss 0.94 | valid ppl 2.56 | learning rate 20.0000
| end of split 14 / 34 | epoch 1 | time: 4083.36s | valid loss 0.94 | valid ppl 2.56 | learning rate 20.0000
| end of split 15 / 34 | epoch 1 | time: 4082.12s | valid loss 0.94 | valid ppl 2.56 | learning rate 20.0000
| end of split 16 / 34 | epoch 1 | time: 4083.76s | valid loss 0.94 | valid ppl 2.55 | learning rate 20.0000
| end of split 17 / 34 | epoch 1 | time: 4091.26s | valid loss 0.94 | valid ppl 2.55 | learning rate 20.0000
| end of split 18 / 34 | epoch 1 | time: 4086.95s | valid loss 0.94 | valid ppl 2.55 | learning rate 20.0000
| end of split 19 / 34 | epoch 1 | time: 4084.13s | valid loss 0.94 | valid ppl 2.55 | learning rate 20.0000
| end of split 20 / 34 | epoch 1 | time: 4084.99s | valid loss 0.94 | valid ppl 2.55 | learning rate 20.0000
| end of split 21 / 34 | epoch 1 | time: 4087.34s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 22 / 34 | epoch 1 | time: 4081.53s | valid loss 0.93 | valid ppl 2.55 | learning rate 20.0000
| end of split 23 / 34 | epoch 1 | time: 4082.18s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 24 / 34 | epoch 1 | time: 4090.12s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 25 / 34 | epoch 1 | time: 4084.61s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 26 / 34 | epoch 1 | time: 4085.58s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 27 / 34 | epoch 1 | time: 4085.31s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 28 / 34 | epoch 1 | time: 4084.61s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 29 / 34 | epoch 1 | time: 393.38s | valid loss 1.06 | valid ppl 2.89 | learning rate 20.0000
| end of split 30 / 34 | epoch 1 | time: 4080.38s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 31 / 34 | epoch 1 | time: 4078.00s | valid loss 0.93 | valid ppl 2.53 | learning rate 20.0000
| end of split 32 / 34 | epoch 1 | time: 4061.67s | valid loss 0.93 | valid ppl 2.53 | learning rate 20.0000
| end of split 33 / 34 | epoch 1 | time: 3523.66s | valid loss 0.93 | valid ppl 2.53 | learning rate 20.0000
| end of split 34 / 34 | epoch 1 | time: 4797.62s | valid loss 0.93 | valid ppl 2.54 | learning rate 20.0000
| end of split 1 / 34 | epoch 2 | time: 4080.82s | valid loss 0.93 | valid ppl 2.52 | learning rate 20.0000
| end of split 2 / 34 | epoch 2 | time: 4077.49s | valid loss 0.93 | valid ppl 2.53 | learning rate 20.0000
| end of split 3 / 34 | epoch 2 | time: 391.66s | valid loss 1.02 | valid ppl 2.76 | learning rate 20.0000
| end of split 4 / 34 | epoch 2 | time: 4064.35s | valid loss 0.93 | valid ppl 2.53 | learning rate 20.0000
| end of split 5 / 34 | epoch 2 | time: 4081.22s | valid loss 0.93 | valid ppl 2.53 | learning rate 20.0000
| end of split 6 / 34 | epoch 2 | time: 4081.34s | valid loss 0.93 | valid ppl 2.53 | learning rate 20.0000
| end of split 7 / 34 | epoch 2 | time: 391.54s | valid loss 0.93 | valid ppl 2.53 | learning rate 20.0000
| end of split 8 / 34 | epoch 2 | time: 4065.03s | valid loss 0.93 | valid ppl 2.52 | learning rate 20.0000
| end of split 9 / 34 | epoch 2 | time: 4079.89s | valid loss 0.92 | valid ppl 2.52 | learning rate 20.0000