| end of split 1 / 4 | epoch 1 | time: 8911.93s | valid loss 1.62 | valid ppl 5.07 | learning rate 20.0000
| end of split 2 / 4 | epoch 1 | time: 8940.43s | valid loss 1.51 | valid ppl 4.51 | learning rate 20.0000
| end of split 3 / 4 | epoch 1 | time: 8937.93s | valid loss 1.45 | valid ppl 4.28 | learning rate 20.0000
| end of split 4 / 4 | epoch 1 | time: 4290.56s | valid loss 1.43 | valid ppl 4.20 | learning rate 20.0000
| end of split 1 / 4 | epoch 2 | time: 8937.28s | valid loss 1.41 | valid ppl 4.09 | learning rate 20.0000
| end of split 2 / 4 | epoch 2 | time: 4299.22s | valid loss 1.40 | valid ppl 4.04 | learning rate 20.0000
| end of split 3 / 4 | epoch 2 | time: 8950.23s | valid loss 1.38 | valid ppl 3.97 | learning rate 20.0000
| end of split 4 / 4 | epoch 2 | time: 8955.98s | valid loss 1.36 | valid ppl 3.90 | learning rate 20.0000
| end of split 1 / 4 | epoch 3 | time: 8949.26s | valid loss 1.35 | valid ppl 3.86 | learning rate 20.0000
| end of split 2 / 4 | epoch 3 | time: 8952.69s | valid loss 1.34 | valid ppl 3.83 | learning rate 20.0000
| end of split 3 / 4 | epoch 3 | time: 4294.73s | valid loss 1.34 | valid ppl 3.82 | learning rate 20.0000
| end of split 4 / 4 | epoch 3 | time: 8936.01s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000
| end of split 1 / 4 | epoch 4 | time: 4295.71s | valid loss 1.33 | valid ppl 3.79 | learning rate 20.0000
| end of split 2 / 4 | epoch 4 | time: 8952.30s | valid loss 1.32 | valid ppl 3.76 | learning rate 20.0000
| end of split 3 / 4 | epoch 4 | time: 8949.57s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000
| end of split 4 / 4 | epoch 4 | time: 8948.47s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000
| end of split 1 / 4 | epoch 5 | time: 8952.53s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000
| end of split 2 / 4 | epoch 5 | time: 8967.77s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000
| end of split 3 / 4 | epoch 5 | time: 4303.62s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000
| end of split 4 / 4 | epoch 5 | time: 8955.62s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000
| end of split 1 / 4 | epoch 6 | time: 4293.63s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000
| end of split 2 / 4 | epoch 6 | time: 8950.83s | valid loss 1.30 | valid ppl 3.66 | learning rate 20.0000
| end of split 1 / 4 | epoch 1 | time: 8636.50s | valid loss 1.30 | valid ppl 3.66 | learning rate 20.0000
| end of split 2 / 4 | epoch 1 | time: 8670.66s | valid loss 1.29 | valid ppl 3.65 | learning rate 20.0000
| end of split 3 / 4 | epoch 1 | time: 8657.50s | valid loss 1.30 | valid ppl 3.66 | learning rate 20.0000
| end of split 4 / 4 | epoch 1 | time: 4144.61s | valid loss 1.29 | valid ppl 3.64 | learning rate 20.0000
| end of split 1 / 4 | epoch 2 | time: 4139.96s | valid loss 1.29 | valid ppl 3.65 | learning rate 20.0000
| end of split 2 / 4 | epoch 2 | time: 8646.65s | valid loss 1.29 | valid ppl 3.63 | learning rate 20.0000
| end of split 3 / 4 | epoch 2 | time: 8632.55s | valid loss 1.29 | valid ppl 3.64 | learning rate 20.0000
| end of split 4 / 4 | epoch 2 | time: 8626.28s | valid loss 1.29 | valid ppl 3.61 | learning rate 20.0000
| end of split 1 / 4 | epoch 3 | time: 4129.14s | valid loss 1.29 | valid ppl 3.62 | learning rate 20.0000
| end of split 2 / 4 | epoch 3 | time: 8621.19s | valid loss 1.29 | valid ppl 3.64 | learning rate 20.0000
| end of split 3 / 4 | epoch 3 | time: 8608.42s | valid loss 1.28 | valid ppl 3.61 | learning rate 20.0000
| end of split 4 / 4 | epoch 3 | time: 8622.47s | valid loss 1.28 | valid ppl 3.59 | learning rate 20.0000
| end of split 1 / 4 | epoch 4 | time: 8597.66s | valid loss 1.28 | valid ppl 3.60 | learning rate 20.0000
| end of split 2 / 4 | epoch 4 | time: 4125.73s | valid loss 1.28 | valid ppl 3.59 | learning rate 20.0000
| end of split 3 / 4 | epoch 4 | time: 8614.82s | valid loss 1.28 | valid ppl 3.59 | learning rate 20.0000
| end of split 4 / 4 | epoch 4 | time: 8618.80s | valid loss 1.28 | valid ppl 3.61 | learning rate 20.0000
| end of split 1 / 4 | epoch 5 | time: 4122.60s | valid loss 1.28 | valid ppl 3.58 | learning rate 20.0000
| end of split 2 / 4 | epoch 5 | time: 8603.78s | valid loss 1.28 | valid ppl 3.58 | learning rate 20.0000
| end of split 3 / 4 | epoch 5 | time: 8626.99s | valid loss 1.28 | valid ppl 3.60 | learning rate 20.0000
| end of split 4 / 4 | epoch 5 | time: 8616.73s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 1 / 4 | epoch 6 | time: 8618.37s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 2 / 4 | epoch 6 | time: 4126.58s | valid loss 1.28 | valid ppl 3.58 | learning rate 20.0000
| end of split 3 / 4 | epoch 6 | time: 8620.34s | valid loss 1.28 | valid ppl 3.60 | learning rate 20.0000
| end of split 4 / 4 | epoch 6 | time: 8602.34s | valid loss 1.27 | valid ppl 3.56 | learning rate 20.0000
| end of split 1 / 4 | epoch 7 | time: 8594.35s | valid loss 1.27 | valid ppl 3.56 | learning rate 20.0000
| end of split 2 / 4 | epoch 7 | time: 8614.04s | valid loss 1.28 | valid ppl 3.59 | learning rate 20.0000
| end of split 3 / 4 | epoch 7 | time: 4126.32s | valid loss 1.27 | valid ppl 3.56 | learning rate 20.0000
| end of split 4 / 4 | epoch 7 | time: 8620.86s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 1 / 4 | epoch 8 | time: 4127.76s | valid loss 1.27 | valid ppl 3.56 | learning rate 20.0000
| end of split 2 / 4 | epoch 8 | time: 8609.71s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 3 / 4 | epoch 8 | time: 8633.81s | valid loss 1.27 | valid ppl 3.54 | learning rate 20.0000
| end of split 4 / 4 | epoch 8 | time: 8618.93s | valid loss 1.26 | valid ppl 3.54 | learning rate 20.0000
| end of split 1 / 4 | epoch 9 | time: 8654.09s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 2 / 4 | epoch 9 | time: 4142.02s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 3 / 4 | epoch 9 | time: 8652.13s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 4 / 4 | epoch 9 | time: 8639.77s | valid loss 1.26 | valid ppl 3.53 | learning rate 20.0000
| end of split 1 / 4 | epoch 10 | time: 8636.90s | valid loss 1.26 | valid ppl 3.53 | learning rate 20.0000
| end of split 2 / 4 | epoch 10 | time: 4136.07s | valid loss 1.27 | valid ppl 3.54 | learning rate 20.0000
| end of split 3 / 4 | epoch 10 | time: 8643.53s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 4 / 4 | epoch 10 | time: 8652.19s | valid loss 1.26 | valid ppl 3.53 | learning rate 20.0000
| end of split 1 / 4 | epoch 11 | time: 8639.45s | valid loss 1.27 | valid ppl 3.56 | learning rate 20.0000
| end of split 2 / 4 | epoch 11 | time: 8631.37s | valid loss 1.26 | valid ppl 3.53 | learning rate 20.0000
| end of split 3 / 4 | epoch 11 | time: 8644.05s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 4 / 4 | epoch 11 | time: 4140.29s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 1 / 4 | epoch 12 | time: 4145.19s | valid loss 1.26 | valid ppl 3.54 | learning rate 20.0000
| end of split 2 / 4 | epoch 12 | time: 8640.26s | valid loss 1.27 | valid ppl 3.56 | learning rate 20.0000
| end of split 3 / 4 | epoch 12 | time: 8641.53s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 4 / 4 | epoch 12 | time: 8641.07s | valid loss 1.26 | valid ppl 3.51 | learning rate 20.0000
| end of split 1 / 4 | epoch 13 | time: 8639.52s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 2 / 4 | epoch 13 | time: 8640.65s | valid loss 1.26 | valid ppl 3.51 | learning rate 20.0000
| end of split 3 / 4 | epoch 13 | time: 4124.35s | valid loss 1.26 | valid ppl 3.51 | learning rate 20.0000
| end of split 4 / 4 | epoch 13 | time: 8626.82s | valid loss 1.25 | valid ppl 3.51 | learning rate 20.0000
| end of split 1 / 4 | epoch 14 | time: 4124.90s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 2 / 4 | epoch 14 | time: 8610.67s | valid loss 1.25 | valid ppl 3.51 | learning rate 20.0000
| end of split 3 / 4 | epoch 14 | time: 8636.56s | valid loss 1.26 | valid ppl 3.54 | learning rate 20.0000
| end of split 4 / 4 | epoch 14 | time: 8637.73s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 1 / 4 | epoch 15 | time: 8628.97s | valid loss 1.26 | valid ppl 3.51 | learning rate 20.0000
| end of split 2 / 4 | epoch 15 | time: 8613.07s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 3 / 4 | epoch 15 | time: 8623.37s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 4 / 4 | epoch 15 | time: 4117.84s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 1 / 4 | epoch 16 | time: 8635.45s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 2 / 4 | epoch 16 | time: 8622.71s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 3 / 4 | epoch 16 | time: 8633.11s | valid loss 1.25 | valid ppl 3.49 | learning rate 20.0000
| end of split 4 / 4 | epoch 16 | time: 4144.93s | valid loss 1.25 | valid ppl 3.51 | learning rate 20.0000
| end of split 1 / 4 | epoch 17 | time: 8631.78s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 2 / 4 | epoch 17 | time: 4128.16s | valid loss 1.25 | valid ppl 3.51 | learning rate 20.0000
| end of split 3 / 4 | epoch 17 | time: 8623.67s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 4 / 4 | epoch 17 | time: 8626.51s | valid loss 1.25 | valid ppl 3.49 | learning rate 20.0000
| end of split 1 / 4 | epoch 18 | time: 8635.98s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
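
The records above follow a fixed field layout, so they can be read back programmatically. The following is a minimal sketch, not part of the original log or of whatever tool produced it: the names `RECORD`, `parse_log`, `best_record`, and `ppl_matches_loss` are hypothetical, and the pattern is inferred only from the fields visible here. It tolerates records that wrap across line breaks, and it checks the usual relationship that perplexity is the exponential of the loss (assuming the loss is reported in nats and both values are rounded to two decimals).

```python
import math
import re

# Pattern inferred from the records in this log (an assumption, not a spec).
# Every gap is matched with \s so a record may wrap across a line break.
RECORD = re.compile(
    r"end\s+of\s+split\s+(?P<split>\d+)\s*/\s*(?P<n_splits>\d+)\s*\|"
    r"\s*epoch\s+(?P<epoch>\d+)\s*\|"
    r"\s*time:\s*(?P<time_s>\d+(?:\.\d+)?)s\s*\|"
    r"\s*valid\s+loss\s+(?P<loss>\d+(?:\.\d+)?)\s*\|"
    r"\s*valid\s+ppl\s+(?P<ppl>\d+(?:\.\d+)?)\s*\|"
    r"\s*learning\s+rate\s+(?P<lr>\d+(?:\.\d+)?)"
)


def parse_log(text):
    """Return one dict per validation record, in the order they appear."""
    records = []
    for m in RECORD.finditer(text):
        records.append({
            "split": int(m["split"]),
            "n_splits": int(m["n_splits"]),
            "epoch": int(m["epoch"]),
            "time_s": float(m["time_s"]),
            "loss": float(m["loss"]),
            "ppl": float(m["ppl"]),
            "lr": float(m["lr"]),
        })
    return records


def best_record(records):
    """Return the record with the lowest validation perplexity (first on ties)."""
    return min(records, key=lambda r: r["ppl"])


def ppl_matches_loss(record, tol=0.03):
    """Check ppl ~= exp(loss); tol absorbs the two-decimal rounding of both."""
    return abs(math.exp(record["loss"]) - record["ppl"]) <= tol


if __name__ == "__main__":
    import sys

    recs = parse_log(sys.stdin.read())
    if recs:
        print(len(recs), "records; lowest valid ppl:", best_record(recs))
```

Saved under a file name of your choice, the script can be fed the log on standard input. Note that the epoch counter restarts at 1 partway through the log, so `epoch` alone does not identify a record uniquely; the position in the returned list does.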