| end of split 1 / 4 | epoch 1 | time: 8562.76s | valid loss 1.63 | valid ppl 5.12 | learning rate 20.0000
| end of split 2 / 4 | epoch 1 | time: 8573.03s | valid loss 1.52 | valid ppl 4.58 | learning rate 20.0000
| end of split 3 / 4 | epoch 1 | time: 8582.24s | valid loss 1.46 | valid ppl 4.30 | learning rate 20.0000
| end of split 4 / 4 | epoch 1 | time: 4118.48s | valid loss 1.44 | valid ppl 4.24 | learning rate 20.0000
| end of split 1 / 4 | epoch 2 | time: 8571.48s | valid loss 1.41 | valid ppl 4.10 | learning rate 20.0000
| end of split 2 / 4 | epoch 2 | time: 4121.49s | valid loss 1.40 | valid ppl 4.07 | learning rate 20.0000
| end of split 3 / 4 | epoch 2 | time: 8579.78s | valid loss 1.38 | valid ppl 3.98 | learning rate 20.0000
| end of split 4 / 4 | epoch 2 | time: 8583.21s | valid loss 1.37 | valid ppl 3.92 | learning rate 20.0000
| end of split 1 / 4 | epoch 3 | time: 8581.73s | valid loss 1.35 | valid ppl 3.87 | learning rate 20.0000
| end of split 2 / 4 | epoch 3 | time: 4536.25s | valid loss 1.35 | valid ppl 3.86 | learning rate 20.0000
| end of split 3 / 4 | epoch 3 | time: 8578.51s | valid loss 1.34 | valid ppl 3.82 | learning rate 20.0000
| end of split 4 / 4 | epoch 3 | time: 8658.13s | valid loss 1.33 | valid ppl 3.79 | learning rate 20.0000
| end of split 1 / 4 | epoch 4 | time: 8580.88s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000
| end of split 2 / 4 | epoch 4 | time: 8593.00s | valid loss 1.32 | valid ppl 3.76 | learning rate 20.0000
| end of split 3 / 4 | epoch 4 | time: 8644.32s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000
| end of split 4 / 4 | epoch 4 | time: 4934.06s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000
| end of split 1 / 4 | epoch 5 | time: 8578.49s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000
| end of split 2 / 4 | epoch 5 | time: 8582.93s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000
| end of split 3 / 4 | epoch 5 | time: 4123.67s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000
| end of split 4 / 4 | epoch 5 | time: 8592.02s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000
| end of split 1 / 4 | epoch 6 | time: 4123.22s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000
| end of split 2 / 4 | epoch 6 | time: 8590.96s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000
| end of split 3 / 4 | epoch 6 | time: 8586.73s | valid loss 1.30 | valid ppl 3.66 | learning rate 20.0000
| end of split 4 / 4 | epoch 6 | time: 10499.46s | valid loss 1.30 | valid ppl 3.65 | learning rate 20.0000
| end of split 1 / 4 | epoch 7 | time: 8579.06s | valid loss 1.29 | valid ppl 3.64 | learning rate 20.0000
| end of split 2 / 4 | epoch 7 | time: 4133.54s | valid loss 1.29 | valid ppl 3.64 | learning rate 20.0000
| end of split 3 / 4 | epoch 7 | time: 8591.98s | valid loss 1.29 | valid ppl 3.63 | learning rate 20.0000
| end of split 4 / 4 | epoch 7 | time: 8591.59s | valid loss 1.29 | valid ppl 3.62 | learning rate 20.0000
| end of split 1 / 4 | epoch 8 | time: 8598.10s | valid loss 1.28 | valid ppl 3.61 | learning rate 20.0000
| end of split 2 / 4 | epoch 8 | time: 8592.66s | valid loss 1.28 | valid ppl 3.61 | learning rate 20.0000
| end of split 3 / 4 | epoch 8 | time: 8597.33s | valid loss 1.28 | valid ppl 3.61 | learning rate 20.0000
| end of split 4 / 4 | epoch 8 | time: 4124.64s | valid loss 1.28 | valid ppl 3.61 | learning rate 20.0000
| end of split 1 / 4 | epoch 9 | time: 8573.14s | valid loss 1.28 | valid ppl 3.60 | learning rate 20.0000
| end of split 2 / 4 | epoch 9 | time: 8569.17s | valid loss 1.28 | valid ppl 3.59 | learning rate 20.0000
| end of split 3 / 4 | epoch 9 | time: 8564.82s | valid loss 1.28 | valid ppl 3.59 | learning rate 20.0000
| end of split 4 / 4 | epoch 9 | time: 4113.20s | valid loss 1.28 | valid ppl 3.59 | learning rate 20.0000
| end of split 1 / 4 | epoch 10 | time: 8579.04s | valid loss 1.28 | valid ppl 3.58 | learning rate 20.0000
| end of split 2 / 4 | epoch 10 | time: 8586.46s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 3 / 4 | epoch 10 | time: 4121.25s | valid loss 1.28 | valid ppl 3.58 | learning rate 20.0000
| end of split 4 / 4 | epoch 10 | time: 8585.27s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 1 / 4 | epoch 11 | time: 8561.35s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 2 / 4 | epoch 11 | time: 4119.16s | valid loss 1.27 | valid ppl 3.57 | learning rate 20.0000
| end of split 3 / 4 | epoch 11 | time: 8567.49s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 4 / 4 | epoch 11 | time: 8604.60s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 1 / 4 | epoch 12 | time: 8570.95s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 2 / 4 | epoch 12 | time: 8592.50s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 3 / 4 | epoch 12 | time: 4129.76s | valid loss 1.27 | valid ppl 3.55 | learning rate 20.0000
| end of split 4 / 4 | epoch 12 | time: 8572.50s | valid loss 1.27 | valid ppl 3.54 | learning rate 20.0000
| end of split 1 / 4 | epoch 13 | time: 8585.99s | valid loss 1.26 | valid ppl 3.54 | learning rate 20.0000
| end of split 2 / 4 | epoch 13 | time: 8587.19s | valid loss 1.26 | valid ppl 3.53 | learning rate 20.0000
| end of split 3 / 4 | epoch 13 | time: 8582.92s | valid loss 1.26 | valid ppl 3.53 | learning rate 20.0000
| end of split 4 / 4 | epoch 13 | time: 4133.92s | valid loss 1.26 | valid ppl 3.54 | learning rate 20.0000
| end of split 1 / 4 | epoch 14 | time: 8601.24s | valid loss 1.26 | valid ppl 3.53 | learning rate 20.0000
| end of split 2 / 4 | epoch 14 | time: 8639.11s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 3 / 4 | epoch 14 | time: 4121.70s | valid loss 1.26 | valid ppl 3.53 | learning rate 20.0000
| end of split 4 / 4 | epoch 14 | time: 8582.64s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 1 / 4 | epoch 15 | time: 8578.50s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 2 / 4 | epoch 15 | time: 8576.53s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 3 / 4 | epoch 15 | time: 4119.30s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 4 / 4 | epoch 15 | time: 8590.29s | valid loss 1.26 | valid ppl 3.51 | learning rate 20.0000
| end of split 1 / 4 | epoch 16 | time: 4119.29s | valid loss 1.26 | valid ppl 3.52 | learning rate 20.0000
| end of split 2 / 4 | epoch 16 | time: 8588.29s | valid loss 1.26 | valid ppl 3.51 | learning rate 20.0000
| end of split 3 / 4 | epoch 16 | time: 8587.05s | valid loss 1.26 | valid ppl 3.51 | learning rate 20.0000
| end of split 4 / 4 | epoch 16 | time: 8586.71s | valid loss 1.25 | valid ppl 3.51 | learning rate 20.0000
| end of split 1 / 4 | epoch 17 | time: 8577.23s | valid loss 1.25 | valid ppl 3.51 | learning rate 20.0000
| end of split 2 / 4 | epoch 17 | time: 4121.60s | valid loss 1.26 | valid ppl 3.51 | learning rate 20.0000
| end of split 3 / 4 | epoch 17 | time: 8579.34s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 4 / 4 | epoch 17 | time: 8609.62s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 1 / 4 | epoch 18 | time: 4114.69s | valid loss 1.25 | valid ppl 3.51 | learning rate 20.0000
| end of split 2 / 4 | epoch 18 | time: 8575.62s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 3 / 4 | epoch 18 | time: 8597.99s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 4 / 4 | epoch 18 | time: 8581.70s | valid loss 1.25 | valid ppl 3.49 | learning rate 20.0000
| end of split 1 / 4 | epoch 19 | time: 8581.36s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 2 / 4 | epoch 19 | time: 8620.52s | valid loss 1.25 | valid ppl 3.49 | learning rate 20.0000
| end of split 3 / 4 | epoch 19 | time: 4133.90s | valid loss 1.25 | valid ppl 3.50 | learning rate 20.0000
| end of split 4 / 4 | epoch 19 | time: 8592.75s | valid loss 1.25 | valid ppl 3.49 | learning rate 20.0000
| end of split 1 / 4 | epoch 20 | time: 8593.67s | valid loss 1.25 | valid ppl 3.49 | learning rate 20.0000
| end of split 2 / 4 | epoch 20 | time: 4563.06s | valid loss 1.25 | valid ppl 3.49 | learning rate 20.0000
| end of split 3 / 4 | epoch 20 | time: 9311.19s | valid loss 1.25 | valid ppl 3.48 | learning rate 20.0000
| end of split 4 / 4 | epoch 20 | time: 10014.57s | valid loss 1.25 | valid ppl 3.48 | learning rate 20.0000
| end of split 1 / 4 | epoch 21 | time: 8581.80s | valid loss 1.25 | valid ppl 3.48 | learning rate 20.0000
| end of split 2 / 4 | epoch 21 | time: 4236.40s | valid loss 1.25 | valid ppl 3.48 | learning rate 20.0000
| end of split 3 / 4 | epoch 21 | time: 8579.34s | valid loss 1.25 | valid ppl 3.48 | learning rate 20.0000
| end of split 4 / 4 | epoch 21 | time: 8573.74s | valid loss 1.25 | valid ppl 3.48 | learning rate 20.0000
| end of split 1 / 4 | epoch 22 | time: 8588.42s | valid loss 1.25 | valid ppl 3.48 | learning rate 20.0000
| end of split 2 / 4 | epoch 22 | time: 4118.76s | valid loss 1.25 | valid ppl 3.48 | learning rate 20.0000
| end of split 3 / 4 | epoch 22 | time: 8565.39s | valid loss 1.25 | valid ppl 3.47 | learning rate 20.0000
| end of split 4 / 4 | epoch 22 | time: 8629.35s | valid loss 1.25 | valid ppl 3.47 | learning rate 20.0000
| end of split 1 / 4 | epoch 23 | time: 8569.01s | valid loss 1.25 | valid ppl 3.47 | learning rate 20.0000
| end of split 2 / 4 | epoch 23 | time: 8612.11s | valid loss 1.24 | valid ppl 3.47 | learning rate 20.0000
| end of split 3 / 4 | epoch 23 | time: 9536.87s | valid loss 1.24 | valid ppl 3.47 | learning rate 20.0000
| end of split 4 / 4 | epoch 23 | time: 4474.93s | valid loss 1.24 | valid ppl 3.47 | learning rate 20.0000
| end of split 1 / 4 | epoch 24 | time: 8560.14s | valid loss 1.24 | valid ppl 3.47 | learning rate 20.0000
| end of split 2 / 4 | epoch 24 | time: 8587.87s | valid loss 1.24 | valid ppl 3.47 | learning rate 20.0000
| end of split 3 / 4 | epoch 24 | time: 8614.77s | valid loss 1.24 | valid ppl 3.47 | learning rate 20.0000