lucataco/train-text-to-image-lora 🔢📝❓ → 🖼️

▶️ 26 runs 📅 Jul 2024 ⚙️ Cog 0.9.13
huggingface image-lora-training stable-diffusion

About

LoRA fine-tuner for Stable Diffusion v1.4, v1.5, v2.0, and v2.1, built on the Hugging Face Diffusers text-to-image training script.

Example Output

Performance Metrics

1057.56s Prediction Time
1132.38s Total Time
All Input Parameters
{
  "dataset": "lambdalabs/naruto-blip-captions",
  "base_model": "runwayml/stable-diffusion-v1-5",
  "resolution": 512,
  "hub_model_id": "naruto-lora",
  "lr_scheduler": "cosine",
  "learning_rate": 0.0001,
  "max_grad_norm": 1,
  "max_train_steps": 1000,
  "num_train_epochs": 100,
  "train_batch_size": 1,
  "validation_prompt": "A naruto with blue eyes.",
  "dataloader_num_workers": 8,
  "gradient_accumulation_steps": 4
}
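The execution log further down records the argument list that was handed to `accelerate launch` for these inputs. As a rough sketch of that mapping, the function below rebuilds the same list from the input dictionary. The function name `build_command` and the constant `EXAMPLE_INPUTS` are hypothetical, not taken from the model's source. The fixed flags (`--center_crop`, `--random_flip`, `--lr_warmup_steps 0`, the output directory) and the `--checkpointing_steps` value of 1001 are copied from the logged run; how the model actually derives them is not shown on this page.

```python
# Hypothetical reconstruction of the input -> CLI mapping, based only on the
# "Running with params" log line. Not the model's actual predictor code.
EXAMPLE_INPUTS = {
    "dataset": "lambdalabs/naruto-blip-captions",
    "base_model": "runwayml/stable-diffusion-v1-5",
    "resolution": 512,
    "hub_model_id": "naruto-lora",  # does not appear in the logged command
    "lr_scheduler": "cosine",
    "learning_rate": 0.0001,
    "max_grad_norm": 1,
    "max_train_steps": 1000,
    "num_train_epochs": 100,
    "train_batch_size": 1,
    "validation_prompt": "A naruto with blue eyes.",
    "dataloader_num_workers": 8,
    "gradient_accumulation_steps": 4,
}


def build_command(inputs: dict, seed: int, checkpointing_steps: int = 1001) -> list:
    """Return an argv list shaped like the one in the execution log."""
    return [
        "accelerate", "launch", "train_text_to_image_lora.py",
        "--pretrained_model_name_or_path", inputs["base_model"],
        "--dataset_name", inputs["dataset"],
        "--dataloader_num_workers", str(inputs["dataloader_num_workers"]),
        "--resolution", str(inputs["resolution"]),
        # Boolean flags that appear unconditionally in the logged run.
        "--center_crop", "--random_flip",
        "--train_batch_size", str(inputs["train_batch_size"]),
        "--num_train_epochs", str(inputs["num_train_epochs"]),
        "--gradient_accumulation_steps", str(inputs["gradient_accumulation_steps"]),
        "--max_train_steps", str(inputs["max_train_steps"]),
        # 1001 is the value observed in the log; its derivation is an assumption-free copy.
        "--checkpointing_steps", str(checkpointing_steps),
        "--learning_rate", str(inputs["learning_rate"]),
        # The log shows '1.0', so the value is rendered as a float here.
        "--max_grad_norm", str(float(inputs["max_grad_norm"])),
        "--lr_scheduler", inputs["lr_scheduler"],
        "--lr_warmup_steps", "0",
        "--output_dir", "/tmp/train-t2i-lora",
        "--validation_prompt", inputs["validation_prompt"],
        "--seed", str(seed),
    ]


command = build_command(EXAMPLE_INPUTS, seed=40529315)
```

Keeping the command as a list rather than a single shell string means a prompt containing spaces, such as the validation prompt here, is passed through as one argument.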
Input Parameters
seed (Type: integer)
Seed for reproducibility
dataset (Type: string, Default: lambdalabs/naruto-blip-captions)
Huggingface dataset to use
hf_token (Type: string)
Huggingface token
base_model (Type: string, Default: runwayml/stable-diffusion-v1-5)
Base huggingface model to use
resolution (Type: integer, Default: 512, Range: 128 - 1024)
Resolution for training
hub_model_id (Type: string, Default: naruto-lora)
Huggingface model id to upload to
lr_scheduler (Default: cosine)
Learning rate scheduler
learning_rate (Type: number, Default: 0.0001, Range: 0.0001 - 0.01)
Learning rate for training
max_grad_norm (Type: number, Default: 1, Range: 0.1 - 10)
Maximum gradient norm
max_train_steps (Type: integer, Default: 1000, Range: 1 - 100000)
Maximum number of training steps
num_train_epochs (Type: integer, Default: 100, Range: 1 - 10000)
Number of training epochs
train_batch_size (Type: integer, Default: 1, Range: 1 - 4)
Batch size for training
validation_prompt (Type: string, Default: A naruto with blue eyes.)
Validation prompt
dataloader_num_workers (Type: integer, Default: 8, Range: 1 - 16)
Number of workers for dataloader
gradient_accumulation_steps (Type: integer, Default: 4, Range: 1 - 8)
Gradient accumulation steps
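The numeric ranges listed above can be checked on the client side before a run is submitted. The following is a minimal sketch of such a check, assuming the ranges are inclusive at both ends; the names `PARAM_RANGES` and `validate_inputs` are hypothetical and are not part of the model or of any client library.

```python
# Client-side range check for the numeric inputs documented above.
# Assumption: both ends of each documented range are inclusive.
PARAM_RANGES = {
    # name: (requires_integer, minimum, maximum)
    "resolution": (True, 128, 1024),
    "learning_rate": (False, 0.0001, 0.01),
    "max_grad_norm": (False, 0.1, 10),
    "max_train_steps": (True, 1, 100000),
    "num_train_epochs": (True, 1, 10000),
    "train_batch_size": (True, 1, 4),
    "dataloader_num_workers": (True, 1, 16),
    "gradient_accumulation_steps": (True, 1, 8),
}


def validate_inputs(inputs: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, (requires_int, low, high) in PARAM_RANGES.items():
        if name not in inputs:
            continue  # omitted inputs fall back to the documented defaults
        value = inputs[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number, got {type(value).__name__}")
            continue
        if requires_int and not isinstance(value, int):
            problems.append(f"{name}: expected an integer, got {value!r}")
            continue
        if not low <= value <= high:
            problems.append(f"{name}: {value!r} is outside {low} - {high}")
    return problems


ok = validate_inputs({"resolution": 512, "train_batch_size": 1, "max_grad_norm": 1})
bad = validate_inputs({"train_batch_size": 8, "learning_rate": 0.00001})
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range value at once.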
Output Schema

Output

Type: string, Format: uri

Example Execution Logs
Using seed: 40529315
Running with params: ['accelerate', 'launch', 'train_text_to_image_lora.py', '--pretrained_model_name_or_path', 'runwayml/stable-diffusion-v1-5', '--dataset_name', 'lambdalabs/naruto-blip-captions', '--dataloader_num_workers', '8', '--resolution', '512', '--center_crop', '--random_flip', '--train_batch_size', '1', '--num_train_epochs', '100', '--gradient_accumulation_steps', '4', '--max_train_steps', '1000', '--checkpointing_steps', '1001', '--learning_rate', '0.0001', '--max_grad_norm', '1.0', '--lr_scheduler', 'cosine', '--lr_warmup_steps', '0', '--output_dir', '/tmp/train-t2i-lora', '--validation_prompt', 'A naruto with blue eyes.', '--seed', '40529315']
The following values were not passed to `accelerate launch` and had defaults used instead:
`--num_processes` was set to a value of `1`
`--num_machines` was set to a value of `1`
`--mixed_precision` was set to a value of `'no'`
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/accelerate/accelerator.py:406: UserWarning: `log_with=tensorboard` was passed but no supported trackers are currently installed.
warnings.warn(f"`log_with={log_with}` was passed but no supported trackers are currently installed.")
07/28/2024 20:35:39 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: no
{'thresholding', 'sample_max_value', 'prediction_type', 'rescale_betas_zero_snr', 'timestep_spacing', 'dynamic_thresholding_ratio', 'variance_type', 'clip_sample_range'} was not found in config. Values will be initialized to default values.
{'force_upcast', 'scaling_factor', 'use_post_quant_conv', 'use_quant_conv', 'latents_mean', 'latents_std', 'shift_factor'} was not found in config. Values will be initialized to default values.
{'encoder_hid_dim', 'conv_out_kernel', 'upcast_attention', 'num_attention_heads', 'mid_block_only_cross_attention', 'cross_attention_norm', 'conv_in_kernel', 'addition_embed_type', 'addition_time_embed_dim', 'num_class_embeds', 'mid_block_type', 'time_embedding_type', 'encoder_hid_dim_type', 'only_cross_attention', 'transformer_layers_per_block', 'dual_cross_attention', 'reverse_transformer_layers_per_block', 'attention_type', 'time_embedding_act_fn', 'class_embed_type', 'resnet_time_scale_shift', 'projection_class_embeddings_input_dim', 'time_cond_proj_dim', 'timestep_post_act', 'class_embeddings_concat', 'resnet_out_scale_factor', 'resnet_skip_time_act', 'use_linear_projection', 'addition_embed_type_num_heads', 'time_embedding_dim', 'dropout'} was not found in config. Values will be initialized to default values.
Downloading readme:   0%|          | 0.00/1.02k [00:00<?, ?B/s]
Downloading readme: 100%|██████████| 1.02k/1.02k [00:00<00:00, 10.5MB/s]
Repo card metadata block was not found. Setting CardData to empty.
07/28/2024 20:35:51 - WARNING - huggingface_hub.repocard - Repo card metadata block was not found. Setting CardData to empty.
Downloading metadata:   0%|          | 0.00/897 [00:00<?, ?B/s]
Downloading metadata: 100%|██████████| 897/897 [00:00<00:00, 10.0MB/s]
Downloading data:   0%|          | 0.00/344M [00:00<?, ?B/s]
Downloading data:   3%|▎         | 10.5M/344M [00:01<00:58, 5.74MB/s]
Downloading data:   6%|▌         | 21.0M/344M [00:02<00:28, 11.3MB/s]
Downloading data:  94%|█████████▍| 323M/344M [00:02<00:00, 260MB/s]  
Downloading data: 100%|█████████▉| 344M/344M [00:02<00:00, 153MB/s]
Downloading data:   0%|          | 0.00/357M [00:00<?, ?B/s]
Downloading data:   3%|▎         | 10.5M/357M [00:01<01:05, 5.29MB/s]
Downloading data:   6%|▌         | 21.4M/357M [00:02<00:30, 10.8MB/s]
Downloading data:  88%|████████▊ | 315M/357M [00:02<00:00, 239MB/s]  
Downloading data: 100%|█████████▉| 357M/357M [00:02<00:00, 148MB/s]
Generating train split:   0%|          | 0/1221 [00:00<?, ? examples/s]
Generating train split:  50%|█████     | 611/1221 [00:00<00:00, 933.52 examples/s]
Generating train split: 100%|██████████| 1221/1221 [00:01<00:00, 987.27 examples/s]
Generating train split: 100%|██████████| 1221/1221 [00:01<00:00, 949.74 examples/s]
07/28/2024 20:36:01 - INFO - __main__ - ***** Running training *****
07/28/2024 20:36:01 - INFO - __main__ -   Num examples = 1221
07/28/2024 20:36:01 - INFO - __main__ -   Num Epochs = 4
07/28/2024 20:36:01 - INFO - __main__ -   Instantaneous batch size per device = 1
07/28/2024 20:36:01 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 4
07/28/2024 20:36:01 - INFO - __main__ -   Gradient Accumulation steps = 4
07/28/2024 20:36:01 - INFO - __main__ -   Total optimization steps = 1000
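The header above reports 4 epochs even though `num_train_epochs` was set to 100, and the progress lines that follow show the learning rate drifting down from 0.0001. Both figures can be reproduced with a little arithmetic. The sketch below assumes the training script recomputes the epoch count from `max_train_steps` and applies a half-cosine decay with no warmup, which matches the logged numbers but is not confirmed by this page; `training_plan` and `cosine_lr` are hypothetical helper names.

```python
import math


def training_plan(num_examples: int, train_batch_size: int,
                  gradient_accumulation_steps: int, max_train_steps: int) -> dict:
    """Derive the figures in the 'Running training' header (single process)."""
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    # One optimizer update happens every `gradient_accumulation_steps` batches.
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    # Assumption: when max_train_steps is set, it determines the epoch count.
    epochs = math.ceil(max_train_steps / updates_per_epoch)
    return {
        "total_batch_size": train_batch_size * gradient_accumulation_steps,
        "updates_per_epoch": updates_per_epoch,
        "epochs": epochs,
    }


def cosine_lr(base_lr: float, step: int, total_steps: int) -> float:
    """Half-cosine decay from base_lr to 0 over total_steps, no warmup."""
    progress = step / max(1, total_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


plan = training_plan(num_examples=1221, train_batch_size=1,
                     gradient_accumulation_steps=4, max_train_steps=1000)
# 1221 examples / 4 accumulated batches -> 306 updates per epoch,
# and 1000 steps / 306 -> 4 epochs, as in the log header.
lr_after_15_steps = cosine_lr(1e-4, 15, 1000)  # about 9.99e-5
lr_after_25_steps = cosine_lr(1e-4, 25, 1000)  # about 9.98e-5
```

Under these assumptions the step limit, not the epoch setting, ends this run: 1000 updates cover a little over three passes through the 1221 examples.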
Steps:   0%|          | 0/1000 [00:00<?, ?it/s]
Steps:   0%|          | 0/1000 [00:01<?, ?it/s, lr=0.0001, step_loss=0.0449]
Steps:   0%|          | 0/1000 [00:01<?, ?it/s, lr=0.0001, step_loss=0.0205]
Steps:   0%|          | 0/1000 [00:01<?, ?it/s, lr=0.0001, step_loss=0.11]  
Steps:   0%|          | 1/1000 [00:01<32:03,  1.93s/it, lr=0.0001, step_loss=0.11]
Steps:   0%|          | 1/1000 [00:01<32:03,  1.93s/it, lr=0.0001, step_loss=0.0236]
Steps:   0%|          | 1/1000 [00:02<32:03,  1.93s/it, lr=0.0001, step_loss=0.395] 
Steps:   0%|          | 1/1000 [00:02<32:03,  1.93s/it, lr=0.0001, step_loss=0.0303]
Steps:   0%|          | 1/1000 [00:02<32:03,  1.93s/it, lr=0.0001, step_loss=0.0615]
Steps:   0%|          | 2/1000 [00:02<22:15,  1.34s/it, lr=0.0001, step_loss=0.0615]
Steps:   0%|          | 2/1000 [00:02<22:15,  1.34s/it, lr=0.0001, step_loss=0.051] 
Steps:   0%|          | 2/1000 [00:03<22:15,  1.34s/it, lr=0.0001, step_loss=0.113]
Steps:   0%|          | 2/1000 [00:03<22:15,  1.34s/it, lr=0.0001, step_loss=0.0491]
Steps:   0%|          | 2/1000 [00:03<22:15,  1.34s/it, lr=0.0001, step_loss=0.114] 
Steps:   0%|          | 3/1000 [00:03<19:23,  1.17s/it, lr=0.0001, step_loss=0.114]
Steps:   0%|          | 3/1000 [00:03<19:23,  1.17s/it, lr=0.0001, step_loss=0.117]
Steps:   0%|          | 3/1000 [00:04<19:23,  1.17s/it, lr=0.0001, step_loss=0.0117]
Steps:   0%|          | 3/1000 [00:04<19:23,  1.17s/it, lr=0.0001, step_loss=0.0704]
Steps:   0%|          | 3/1000 [00:04<19:23,  1.17s/it, lr=0.0001, step_loss=0.253] 
Steps:   0%|          | 4/1000 [00:04<18:01,  1.09s/it, lr=0.0001, step_loss=0.253]
Steps:   0%|          | 4/1000 [00:04<18:01,  1.09s/it, lr=0.0001, step_loss=0.00302]
Steps:   0%|          | 4/1000 [00:05<18:01,  1.09s/it, lr=0.0001, step_loss=0.0191] 
Steps:   0%|          | 4/1000 [00:05<18:01,  1.09s/it, lr=0.0001, step_loss=0.00939]
Steps:   0%|          | 4/1000 [00:05<18:01,  1.09s/it, lr=0.0001, step_loss=0.00298]
Steps:   0%|          | 5/1000 [00:05<17:17,  1.04s/it, lr=0.0001, step_loss=0.00298]
Steps:   0%|          | 5/1000 [00:05<17:17,  1.04s/it, lr=0.0001, step_loss=0.016]  
Steps:   0%|          | 5/1000 [00:06<17:17,  1.04s/it, lr=0.0001, step_loss=0.0444]
Steps:   0%|          | 5/1000 [00:06<17:17,  1.04s/it, lr=0.0001, step_loss=0.0858]
Steps:   0%|          | 5/1000 [00:06<17:17,  1.04s/it, lr=0.0001, step_loss=0.0568]
Steps:   1%|          | 6/1000 [00:06<16:50,  1.02s/it, lr=0.0001, step_loss=0.0568]
Steps:   1%|          | 6/1000 [00:06<16:50,  1.02s/it, lr=0.0001, step_loss=0.156] 
Steps:   1%|          | 6/1000 [00:06<16:50,  1.02s/it, lr=0.0001, step_loss=0.00715]
Steps:   1%|          | 6/1000 [00:07<16:50,  1.02s/it, lr=0.0001, step_loss=0.0184] 
Steps:   1%|          | 6/1000 [00:07<16:50,  1.02s/it, lr=0.0001, step_loss=0.0411]
Steps:   1%|          | 7/1000 [00:07<16:33,  1.00s/it, lr=0.0001, step_loss=0.0411]
Steps:   1%|          | 7/1000 [00:07<16:33,  1.00s/it, lr=0.0001, step_loss=0.0938]
Steps:   1%|          | 7/1000 [00:07<16:33,  1.00s/it, lr=0.0001, step_loss=0.228] 
Steps:   1%|          | 7/1000 [00:08<16:33,  1.00s/it, lr=0.0001, step_loss=0.0383]
Steps:   1%|          | 7/1000 [00:08<16:33,  1.00s/it, lr=0.0001, step_loss=0.0454]
Steps:   1%|          | 8/1000 [00:08<16:21,  1.01it/s, lr=0.0001, step_loss=0.0454]
Steps:   1%|          | 8/1000 [00:08<16:21,  1.01it/s, lr=0.0001, step_loss=0.242] 
Steps:   1%|          | 8/1000 [00:08<16:21,  1.01it/s, lr=0.0001, step_loss=0.00357]
Steps:   1%|          | 8/1000 [00:09<16:21,  1.01it/s, lr=0.0001, step_loss=0.0178] 
Steps:   1%|          | 8/1000 [00:09<16:21,  1.01it/s, lr=0.0001, step_loss=0.0471]
Steps:   1%|          | 9/1000 [00:09<16:13,  1.02it/s, lr=0.0001, step_loss=0.0471]
Steps:   1%|          | 9/1000 [00:09<16:13,  1.02it/s, lr=0.0001, step_loss=0.145] 
Steps:   1%|          | 9/1000 [00:09<16:13,  1.02it/s, lr=0.0001, step_loss=0.257]
Steps:   1%|          | 9/1000 [00:10<16:13,  1.02it/s, lr=0.0001, step_loss=0.0176]
Steps:   1%|          | 9/1000 [00:10<16:13,  1.02it/s, lr=0.0001, step_loss=0.0468]
Steps:   1%|          | 10/1000 [00:10<16:07,  1.02it/s, lr=0.0001, step_loss=0.0468]
Steps:   1%|          | 10/1000 [00:10<16:07,  1.02it/s, lr=0.0001, step_loss=0.0277]
Steps:   1%|          | 10/1000 [00:10<16:07,  1.02it/s, lr=0.0001, step_loss=0.041] 
Steps:   1%|          | 10/1000 [00:11<16:07,  1.02it/s, lr=0.0001, step_loss=0.0492]
Steps:   1%|          | 10/1000 [00:11<16:07,  1.02it/s, lr=0.0001, step_loss=0.00988]
Steps:   1%|          | 11/1000 [00:11<16:03,  1.03it/s, lr=0.0001, step_loss=0.00988]
Steps:   1%|          | 11/1000 [00:11<16:03,  1.03it/s, lr=0.0001, step_loss=0.0173] 
Steps:   1%|          | 11/1000 [00:11<16:03,  1.03it/s, lr=0.0001, step_loss=0.0485]
Steps:   1%|          | 11/1000 [00:12<16:03,  1.03it/s, lr=0.0001, step_loss=0.183] 
Steps:   1%|          | 11/1000 [00:12<16:03,  1.03it/s, lr=0.0001, step_loss=0.091]
Steps:   1%|          | 12/1000 [00:12<16:01,  1.03it/s, lr=0.0001, step_loss=0.091]
Steps:   1%|          | 12/1000 [00:12<16:01,  1.03it/s, lr=0.0001, step_loss=0.3]  
Steps:   1%|          | 12/1000 [00:12<16:01,  1.03it/s, lr=0.0001, step_loss=0.16]
Steps:   1%|          | 12/1000 [00:13<16:01,  1.03it/s, lr=0.0001, step_loss=0.00339]
Steps:   1%|          | 12/1000 [00:13<16:01,  1.03it/s, lr=0.0001, step_loss=0.0459] 
Steps:   1%|▏         | 13/1000 [00:13<15:58,  1.03it/s, lr=0.0001, step_loss=0.0459]
Steps:   1%|▏         | 13/1000 [00:13<15:58,  1.03it/s, lr=0.0001, step_loss=0.0063]
Steps:   1%|▏         | 13/1000 [00:13<15:58,  1.03it/s, lr=0.0001, step_loss=0.104] 
Steps:   1%|▏         | 13/1000 [00:14<15:58,  1.03it/s, lr=0.0001, step_loss=0.0186]
Steps:   1%|▏         | 13/1000 [00:14<15:58,  1.03it/s, lr=0.0001, step_loss=0.0986]
Steps:   1%|▏         | 14/1000 [00:14<15:56,  1.03it/s, lr=0.0001, step_loss=0.0986]
Steps:   1%|▏         | 14/1000 [00:14<15:56,  1.03it/s, lr=0.0001, step_loss=0.216] 
Steps:   1%|▏         | 14/1000 [00:14<15:56,  1.03it/s, lr=0.0001, step_loss=0.163]
Steps:   1%|▏         | 14/1000 [00:14<15:56,  1.03it/s, lr=0.0001, step_loss=0.0431]
Steps:   1%|▏         | 14/1000 [00:15<15:56,  1.03it/s, lr=0.0001, step_loss=0.00439]
Steps:   2%|▏         | 15/1000 [00:15<15:55,  1.03it/s, lr=0.0001, step_loss=0.00439]
Steps:   2%|▏         | 15/1000 [00:15<15:55,  1.03it/s, lr=9.99e-5, step_loss=0.0725]
Steps:   2%|▏         | 15/1000 [00:15<15:55,  1.03it/s, lr=9.99e-5, step_loss=0.0353]
Steps:   2%|▏         | 15/1000 [00:15<15:55,  1.03it/s, lr=9.99e-5, step_loss=0.015] 
Steps:   2%|▏         | 15/1000 [00:16<15:55,  1.03it/s, lr=9.99e-5, step_loss=0.0626]
Steps:   2%|▏         | 16/1000 [00:16<15:53,  1.03it/s, lr=9.99e-5, step_loss=0.0626]
Steps:   2%|▏         | 16/1000 [00:16<15:53,  1.03it/s, lr=9.99e-5, step_loss=0.106] 
Steps:   2%|▏         | 16/1000 [00:16<15:53,  1.03it/s, lr=9.99e-5, step_loss=0.0484]
Steps:   2%|▏         | 16/1000 [00:16<15:53,  1.03it/s, lr=9.99e-5, step_loss=0.0877]
Steps:   2%|▏         | 16/1000 [00:17<15:53,  1.03it/s, lr=9.99e-5, step_loss=0.111] 
Steps:   2%|▏         | 17/1000 [00:17<15:52,  1.03it/s, lr=9.99e-5, step_loss=0.111]
Steps:   2%|▏         | 17/1000 [00:17<15:52,  1.03it/s, lr=9.99e-5, step_loss=0.00685]
Steps:   2%|▏         | 17/1000 [00:17<15:52,  1.03it/s, lr=9.99e-5, step_loss=0.234]  
Steps:   2%|▏         | 17/1000 [00:17<15:52,  1.03it/s, lr=9.99e-5, step_loss=0.0229]
Steps:   2%|▏         | 17/1000 [00:18<15:52,  1.03it/s, lr=9.99e-5, step_loss=0.0277]
Steps:   2%|▏         | 18/1000 [00:18<15:51,  1.03it/s, lr=9.99e-5, step_loss=0.0277]
Steps:   2%|▏         | 18/1000 [00:18<15:51,  1.03it/s, lr=9.99e-5, step_loss=0.195] 
Steps:   2%|▏         | 18/1000 [00:18<15:51,  1.03it/s, lr=9.99e-5, step_loss=0.339]
Steps:   2%|▏         | 18/1000 [00:18<15:51,  1.03it/s, lr=9.99e-5, step_loss=0.0522]
Steps:   2%|▏         | 18/1000 [00:19<15:51,  1.03it/s, lr=9.99e-5, step_loss=0.253] 
Steps:   2%|▏         | 19/1000 [00:19<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.253]
Steps:   2%|▏         | 19/1000 [00:19<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.0575]
Steps:   2%|▏         | 19/1000 [00:19<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.0183]
Steps:   2%|▏         | 19/1000 [00:19<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.126] 
Steps:   2%|▏         | 19/1000 [00:20<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.119]
Steps:   2%|▏         | 20/1000 [00:20<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.119]
Steps:   2%|▏         | 20/1000 [00:20<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.158]
Steps:   2%|▏         | 20/1000 [00:20<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.00463]
Steps:   2%|▏         | 20/1000 [00:20<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.0842] 
Steps:   2%|▏         | 20/1000 [00:21<15:49,  1.03it/s, lr=9.99e-5, step_loss=0.0123]
Steps:   2%|▏         | 21/1000 [00:21<15:47,  1.03it/s, lr=9.99e-5, step_loss=0.0123]
Steps:   2%|▏         | 21/1000 [00:21<15:47,  1.03it/s, lr=9.99e-5, step_loss=0.0165]
Steps:   2%|▏         | 21/1000 [00:21<15:47,  1.03it/s, lr=9.99e-5, step_loss=0.00404]
Steps:   2%|▏         | 21/1000 [00:21<15:47,  1.03it/s, lr=9.99e-5, step_loss=0.00853]
Steps:   2%|▏         | 21/1000 [00:21<15:47,  1.03it/s, lr=9.99e-5, step_loss=0.274]  
Steps:   2%|▏         | 22/1000 [00:22<15:46,  1.03it/s, lr=9.99e-5, step_loss=0.274]
Steps:   2%|▏         | 22/1000 [00:22<15:46,  1.03it/s, lr=9.99e-5, step_loss=0.27] 
Steps:   2%|▏         | 22/1000 [00:22<15:46,  1.03it/s, lr=9.99e-5, step_loss=0.13]
Steps:   2%|▏         | 22/1000 [00:22<15:46,  1.03it/s, lr=9.99e-5, step_loss=0.019]
Steps:   2%|▏         | 22/1000 [00:22<15:46,  1.03it/s, lr=9.99e-5, step_loss=0.127]
Steps:   2%|▏         | 23/1000 [00:23<15:45,  1.03it/s, lr=9.99e-5, step_loss=0.127]
Steps:   2%|▏         | 23/1000 [00:23<15:45,  1.03it/s, lr=9.99e-5, step_loss=0.0133]
Steps:   2%|▏         | 23/1000 [00:23<15:45,  1.03it/s, lr=9.99e-5, step_loss=0.06]  
Steps:   2%|▏         | 23/1000 [00:23<15:45,  1.03it/s, lr=9.99e-5, step_loss=0.623]
Steps:   2%|▏         | 23/1000 [00:23<15:45,  1.03it/s, lr=9.99e-5, step_loss=0.092]
Steps:   2%|▏         | 24/1000 [00:24<15:43,  1.03it/s, lr=9.99e-5, step_loss=0.092]
Steps:   2%|▏         | 24/1000 [00:24<15:43,  1.03it/s, lr=9.99e-5, step_loss=0.198]
Steps:   2%|▏         | 24/1000 [00:24<15:43,  1.03it/s, lr=9.99e-5, step_loss=0.216]
Steps:   2%|▏         | 24/1000 [00:24<15:43,  1.03it/s, lr=9.99e-5, step_loss=0.00696]
Steps:   2%|▏         | 24/1000 [00:24<15:43,  1.03it/s, lr=9.99e-5, step_loss=0.0231] 
Steps:   2%|▎         | 25/1000 [00:25<15:42,  1.03it/s, lr=9.99e-5, step_loss=0.0231]
Steps:   2%|▎         | 25/1000 [00:25<15:42,  1.03it/s, lr=9.98e-5, step_loss=0.042] 
Steps:   2%|▎         | 25/1000 [00:25<15:42,  1.03it/s, lr=9.98e-5, step_loss=0.124]
Steps:   2%|▎         | 25/1000 [00:25<15:42,  1.03it/s, lr=9.98e-5, step_loss=0.0205]
Steps:   2%|▎         | 25/1000 [00:25<15:42,  1.03it/s, lr=9.98e-5, step_loss=0.251] 
Steps:   3%|▎         | 26/1000 [00:26<15:41,  1.03it/s, lr=9.98e-5, step_loss=0.251]
Steps:   3%|▎         | 26/1000 [00:26<15:41,  1.03it/s, lr=9.98e-5, step_loss=0.156]
Steps:   3%|▎         | 26/1000 [00:26<15:41,  1.03it/s, lr=9.98e-5, step_loss=0.0111]
Steps:   3%|▎         | 26/1000 [00:26<15:41,  1.03it/s, lr=9.98e-5, step_loss=0.16]  
Steps:   3%|▎         | 26/1000 [00:26<15:41,  1.03it/s, lr=9.98e-5, step_loss=0.0587]
Steps:   3%|▎         | 27/1000 [00:27<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.0587]
Steps:   3%|▎         | 27/1000 [00:27<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.0143]
Steps:   3%|▎         | 27/1000 [00:27<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.0206]
Steps:   3%|▎         | 27/1000 [00:27<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.03]  
Steps:   3%|▎         | 27/1000 [00:27<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.165]
Steps:   3%|▎         | 28/1000 [00:27<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.165]
Steps:   3%|▎         | 28/1000 [00:28<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.238]
Steps:   3%|▎         | 28/1000 [00:28<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.00729]
Steps:   3%|▎         | 28/1000 [00:28<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.0354] 
Steps:   3%|▎         | 28/1000 [00:28<15:40,  1.03it/s, lr=9.98e-5, step_loss=0.00228]
Steps:   3%|▎         | 29/1000 [00:28<15:39,  1.03it/s, lr=9.98e-5, step_loss=0.00228]
Steps:   3%|▎         | 29/1000 [00:28<15:39,  1.03it/s, lr=9.98e-5, step_loss=0.0157] 
Steps:   3%|▎         | 29/1000 [00:29<15:39,  1.03it/s, lr=9.98e-5, step_loss=0.0366]
Steps:   3%|▎         | 29/1000 [00:29<15:39,  1.03it/s, lr=9.98e-5, step_loss=0.181] 
Steps:   3%|▎         | 29/1000 [00:29<15:39,  1.03it/s, lr=9.98e-5, step_loss=0.00866]
Steps:   3%|▎         | 30/1000 [00:29<15:38,  1.03it/s, lr=9.98e-5, step_loss=0.00866]
Steps:   3%|▎         | 30/1000 [00:29<15:38,  1.03it/s, lr=9.98e-5, step_loss=0.0573] 
Steps:   3%|▎         | 30/1000 [00:30<15:38,  1.03it/s, lr=9.98e-5, step_loss=0.0443]
Steps:   3%|▎         | 30/1000 [00:30<15:38,  1.03it/s, lr=9.98e-5, step_loss=0.0434]
Steps:   3%|▎         | 30/1000 [00:30<15:38,  1.03it/s, lr=9.98e-5, step_loss=0.0801]
Steps:   3%|▎         | 31/1000 [00:30<15:37,  1.03it/s, lr=9.98e-5, step_loss=0.0801]
Steps:   3%|▎         | 31/1000 [00:30<15:37,  1.03it/s, lr=9.98e-5, step_loss=0.0274]
Steps:   3%|▎         | 31/1000 [00:31<15:37,  1.03it/s, lr=9.98e-5, step_loss=0.154] 
Steps:   3%|▎         | 31/1000 [00:31<15:37,  1.03it/s, lr=9.98e-5, step_loss=0.0032]
Steps:   3%|▎         | 31/1000 [00:31<15:37,  1.03it/s, lr=9.98e-5, step_loss=0.0307]
Steps:   3%|▎         | 32/1000 [00:31<15:36,  1.03it/s, lr=9.98e-5, step_loss=0.0307]
Steps:   3%|▎         | 32/1000 [00:31<15:36,  1.03it/s, lr=9.97e-5, step_loss=0.0387]
Steps:   3%|▎         | 32/1000 [00:32<15:36,  1.03it/s, lr=9.97e-5, step_loss=0.00565]
Steps:   3%|▎         | 32/1000 [00:32<15:36,  1.03it/s, lr=9.97e-5, step_loss=0.00356]
Steps:   3%|▎         | 32/1000 [00:32<15:36,  1.03it/s, lr=9.97e-5, step_loss=0.2]    
Steps:   3%|▎         | 33/1000 [00:32<15:35,  1.03it/s, lr=9.97e-5, step_loss=0.2]
Steps:   3%|▎         | 33/1000 [00:32<15:35,  1.03it/s, lr=9.97e-5, step_loss=0.0307]
Steps:   3%|▎         | 33/1000 [00:33<15:35,  1.03it/s, lr=9.97e-5, step_loss=0.0291]
Steps:   3%|▎         | 33/1000 [00:33<15:35,  1.03it/s, lr=9.97e-5, step_loss=0.195] 
Steps:   3%|▎         | 33/1000 [00:33<15:35,  1.03it/s, lr=9.97e-5, step_loss=0.0531]
Steps:   3%|▎         | 34/1000 [00:33<16:03,  1.00it/s, lr=9.97e-5, step_loss=0.0531]
Steps:   3%|▎         | 34/1000 [00:33<16:03,  1.00it/s, lr=9.97e-5, step_loss=0.0912]
Steps:   3%|▎         | 34/1000 [00:34<16:03,  1.00it/s, lr=9.97e-5, step_loss=0.0307]
Steps:   3%|▎         | 34/1000 [00:34<16:03,  1.00it/s, lr=9.97e-5, step_loss=0.256] 
Steps:   3%|▎         | 34/1000 [00:34<16:03,  1.00it/s, lr=9.97e-5, step_loss=0.0299]
Steps:   4%|▎         | 35/1000 [00:34<15:53,  1.01it/s, lr=9.97e-5, step_loss=0.0299]
Steps:   4%|▎         | 35/1000 [00:34<15:53,  1.01it/s, lr=9.97e-5, step_loss=0.511] 
Steps:   4%|▎         | 35/1000 [00:35<15:53,  1.01it/s, lr=9.97e-5, step_loss=0.154]
Steps:   4%|▎         | 35/1000 [00:35<15:53,  1.01it/s, lr=9.97e-5, step_loss=0.412]
Steps:   4%|▎         | 35/1000 [00:35<15:53,  1.01it/s, lr=9.97e-5, step_loss=0.0619]
Steps:   4%|▎         | 36/1000 [00:35<15:46,  1.02it/s, lr=9.97e-5, step_loss=0.0619]
Steps:   4%|▎         | 36/1000 [00:35<15:46,  1.02it/s, lr=9.97e-5, step_loss=0.14]  
Steps:   4%|▎         | 36/1000 [00:36<15:46,  1.02it/s, lr=9.97e-5, step_loss=0.124]
Steps:   4%|▎         | 36/1000 [00:36<15:46,  1.02it/s, lr=9.97e-5, step_loss=0.014]
Steps:   4%|▎         | 36/1000 [00:36<15:46,  1.02it/s, lr=9.97e-5, step_loss=0.178]
Steps:   4%|▎         | 37/1000 [00:36<15:40,  1.02it/s, lr=9.97e-5, step_loss=0.178]
Steps:   4%|▎         | 37/1000 [00:36<15:40,  1.02it/s, lr=9.97e-5, step_loss=0.015]
Steps:   4%|▎         | 37/1000 [00:37<15:40,  1.02it/s, lr=9.97e-5, step_loss=0.0246]
Steps:   4%|▎         | 37/1000 [00:37<15:40,  1.02it/s, lr=9.97e-5, step_loss=0.00584]
Steps:   4%|▎         | 37/1000 [00:37<15:40,  1.02it/s, lr=9.97e-5, step_loss=0.0602] 
Steps:   4%|▍         | 38/1000 [00:37<15:36,  1.03it/s, lr=9.97e-5, step_loss=0.0602]
Steps:   4%|▍         | 38/1000 [00:37<15:36,  1.03it/s, lr=9.96e-5, step_loss=0.143] 
Steps:   4%|▍         | 38/1000 [00:38<15:36,  1.03it/s, lr=9.96e-5, step_loss=0.323]
Steps:   4%|▍         | 38/1000 [00:38<15:36,  1.03it/s, lr=9.96e-5, step_loss=0.0109]
Steps:   4%|▍         | 38/1000 [00:38<15:36,  1.03it/s, lr=9.96e-5, step_loss=0.00413]
Steps:   4%|▍         | 39/1000 [00:38<15:34,  1.03it/s, lr=9.96e-5, step_loss=0.00413]
Steps:   4%|▍         | 39/1000 [00:38<15:34,  1.03it/s, lr=9.96e-5, step_loss=0.0191] 
Steps:   4%|▍         | 39/1000 [00:39<15:34,  1.03it/s, lr=9.96e-5, step_loss=0.156] 
Steps:   4%|▍         | 39/1000 [00:39<15:34,  1.03it/s, lr=9.96e-5, step_loss=0.102]
Steps:   4%|▍         | 39/1000 [00:39<15:34,  1.03it/s, lr=9.96e-5, step_loss=0.0599]
Steps:   4%|▍         | 40/1000 [00:39<15:32,  1.03it/s, lr=9.96e-5, step_loss=0.0599]
Steps:   4%|▍         | 40/1000 [00:39<15:32,  1.03it/s, lr=9.96e-5, step_loss=0.00613]
Steps:   4%|▍         | 40/1000 [00:39<15:32,  1.03it/s, lr=9.96e-5, step_loss=0.0253] 
Steps:   4%|▍         | 40/1000 [00:40<15:32,  1.03it/s, lr=9.96e-5, step_loss=0.0132]
Steps:   4%|▍         | 40/1000 [00:40<15:32,  1.03it/s, lr=9.96e-5, step_loss=0.0484]
Steps:   4%|▍         | 41/1000 [00:40<15:30,  1.03it/s, lr=9.96e-5, step_loss=0.0484]
Steps:   4%|▍         | 41/1000 [00:40<15:30,  1.03it/s, lr=9.96e-5, step_loss=0.0514]
Steps:   4%|▍         | 41/1000 [00:40<15:30,  1.03it/s, lr=9.96e-5, step_loss=0.0317]
Steps:   4%|▍         | 41/1000 [00:41<15:30,  1.03it/s, lr=9.96e-5, step_loss=0.0658]
Steps:   4%|▍         | 41/1000 [00:41<15:30,  1.03it/s, lr=9.96e-5, step_loss=0.336] 
Steps:   4%|▍         | 42/1000 [00:41<15:29,  1.03it/s, lr=9.96e-5, step_loss=0.336]
Steps:   4%|▍         | 42/1000 [00:41<15:29,  1.03it/s, lr=9.96e-5, step_loss=0.0844]
Steps:   4%|▍         | 42/1000 [00:41<15:29,  1.03it/s, lr=9.96e-5, step_loss=0.0272]
Steps:   4%|▍         | 42/1000 [00:42<15:29,  1.03it/s, lr=9.96e-5, step_loss=0.00773]
Steps:   4%|▍         | 42/1000 [00:42<15:29,  1.03it/s, lr=9.96e-5, step_loss=0.19]   
Steps:   4%|▍         | 43/1000 [00:42<15:28,  1.03it/s, lr=9.96e-5, step_loss=0.19]
Steps:   4%|▍         | 43/1000 [00:42<15:28,  1.03it/s, lr=9.95e-5, step_loss=0.0828]
Steps:   4%|▍         | 43/1000 [00:42<15:28,  1.03it/s, lr=9.95e-5, step_loss=0.0107]
Steps:   4%|▍         | 43/1000 [00:43<15:28,  1.03it/s, lr=9.95e-5, step_loss=0.176] 
Steps:   4%|▍         | 43/1000 [00:43<15:28,  1.03it/s, lr=9.95e-5, step_loss=0.0368]
Steps:   4%|▍         | 44/1000 [00:43<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.0368]
Steps:   4%|▍         | 44/1000 [00:43<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.0271]
Steps:   4%|▍         | 44/1000 [00:43<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.0664]
Steps:   4%|▍         | 44/1000 [00:44<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.0108]
Steps:   4%|▍         | 44/1000 [00:44<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.00935]
Steps:   4%|▍         | 45/1000 [00:44<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.00935]
Steps:   4%|▍         | 45/1000 [00:44<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.0771] 
Steps:   4%|▍         | 45/1000 [00:44<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.148] 
Steps:   4%|▍         | 45/1000 [00:45<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.289]
Steps:   4%|▍         | 45/1000 [00:45<15:26,  1.03it/s, lr=9.95e-5, step_loss=0.00651]
Steps:   5%|▍         | 46/1000 [00:45<15:25,  1.03it/s, lr=9.95e-5, step_loss=0.00651]
Steps:   5%|▍         | 46/1000 [00:45<15:25,  1.03it/s, lr=9.95e-5, step_loss=0.0104] 
Steps:   5%|▍         | 46/1000 [00:45<15:25,  1.03it/s, lr=9.95e-5, step_loss=0.0593]
Steps:   5%|▍         | 46/1000 [00:46<15:25,  1.03it/s, lr=9.95e-5, step_loss=0.0988]
Steps:   5%|▍         | 46/1000 [00:46<15:25,  1.03it/s, lr=9.95e-5, step_loss=0.0072]
Steps:   5%|▍         | 47/1000 [00:46<15:24,  1.03it/s, lr=9.95e-5, step_loss=0.0072]
Steps:   5%|▍         | 47/1000 [00:46<15:24,  1.03it/s, lr=9.95e-5, step_loss=0.0529]
Steps:   5%|▍         | 47/1000 [00:46<15:24,  1.03it/s, lr=9.95e-5, step_loss=0.198] 
Steps:   5%|▍         | 47/1000 [00:47<15:24,  1.03it/s, lr=9.95e-5, step_loss=0.021]
Steps:   5%|▍         | 47/1000 [00:47<15:24,  1.03it/s, lr=9.95e-5, step_loss=0.107]
Steps:   5%|▍         | 48/1000 [00:47<15:23,  1.03it/s, lr=9.95e-5, step_loss=0.107]
Steps:   5%|▍         | 48/1000 [00:47<15:23,  1.03it/s, lr=9.94e-5, step_loss=0.111]
Steps:   5%|▍         | 48/1000 [00:47<15:23,  1.03it/s, lr=9.94e-5, step_loss=0.0352]
Steps:   5%|▍         | 48/1000 [00:47<15:23,  1.03it/s, lr=9.94e-5, step_loss=0.00644]
Steps:   5%|▍         | 48/1000 [00:48<15:23,  1.03it/s, lr=9.94e-5, step_loss=0.0492] 
Steps:   5%|▍         | 49/1000 [00:48<15:22,  1.03it/s, lr=9.94e-5, step_loss=0.0492]
Steps:   5%|▍         | 49/1000 [00:48<15:22,  1.03it/s, lr=9.94e-5, step_loss=0.0614]
Steps:   5%|▍         | 49/1000 [00:48<15:22,  1.03it/s, lr=9.94e-5, step_loss=0.0105]
Steps:   5%|▍         | 49/1000 [00:48<15:22,  1.03it/s, lr=9.94e-5, step_loss=0.0136]
Steps:   5%|▍         | 49/1000 [00:49<15:22,  1.03it/s, lr=9.94e-5, step_loss=0.0298]
Steps:   5%|▌         | 50/1000 [00:49<15:21,  1.03it/s, lr=9.94e-5, step_loss=0.0298]
Steps:   5%|▌         | 50/1000 [00:49<15:21,  1.03it/s, lr=9.94e-5, step_loss=0.00372]
Steps:   5%|▌         | 50/1000 [00:49<15:21,  1.03it/s, lr=9.94e-5, step_loss=0.0437] 
Steps:   5%|▌         | 50/1000 [00:49<15:21,  1.03it/s, lr=9.94e-5, step_loss=0.00401]
Steps:   5%|▌         | 50/1000 [00:50<15:21,  1.03it/s, lr=9.94e-5, step_loss=0.0182] 
Steps:   5%|▌         | 51/1000 [00:50<15:20,  1.03it/s, lr=9.94e-5, step_loss=0.0182]
Steps:   5%|▌         | 51/1000 [00:50<15:20,  1.03it/s, lr=9.94e-5, step_loss=0.388] 
Steps:   5%|▌         | 51/1000 [00:50<15:20,  1.03it/s, lr=9.94e-5, step_loss=0.0868]
Steps:   5%|▌         | 51/1000 [00:50<15:20,  1.03it/s, lr=9.94e-5, step_loss=0.035] 
Steps:   5%|▌         | 51/1000 [00:51<15:20,  1.03it/s, lr=9.94e-5, step_loss=0.0642]
Steps:   5%|▌         | 52/1000 [00:51<15:19,  1.03it/s, lr=9.94e-5, step_loss=0.0642]
Steps:   5%|▌         | 52/1000 [00:51<15:19,  1.03it/s, lr=9.93e-5, step_loss=0.0532]
Steps:   5%|▌         | 52/1000 [00:51<15:19,  1.03it/s, lr=9.93e-5, step_loss=0.281] 
Steps:   5%|▌         | 52/1000 [00:51<15:19,  1.03it/s, lr=9.93e-5, step_loss=0.0103]
Steps:   5%|▌         | 52/1000 [00:52<15:19,  1.03it/s, lr=9.93e-5, step_loss=0.159] 
Steps:   5%|▌         | 53/1000 [00:52<15:18,  1.03it/s, lr=9.93e-5, step_loss=0.159]
Steps:   5%|▌         | 53/1000 [00:52<15:18,  1.03it/s, lr=9.93e-5, step_loss=0.0828]
Steps:   5%|▌         | 53/1000 [00:52<15:18,  1.03it/s, lr=9.93e-5, step_loss=0.056] 
Steps:   5%|▌         | 53/1000 [00:52<15:18,  1.03it/s, lr=9.93e-5, step_loss=0.0709]
Steps:   5%|▌         | 53/1000 [00:53<15:18,  1.03it/s, lr=9.93e-5, step_loss=0.0402]
Steps:   5%|▌         | 54/1000 [00:53<15:17,  1.03it/s, lr=9.93e-5, step_loss=0.0402]
Steps:   5%|▌         | 54/1000 [00:53<15:17,  1.03it/s, lr=9.93e-5, step_loss=0.059] 
Steps:   5%|▌         | 54/1000 [00:53<15:17,  1.03it/s, lr=9.93e-5, step_loss=0.00804]
Steps:   5%|▌         | 54/1000 [00:53<15:17,  1.03it/s, lr=9.93e-5, step_loss=0.121]  
Steps:   5%|▌         | 54/1000 [00:54<15:17,  1.03it/s, lr=9.93e-5, step_loss=0.0587]
Steps:   6%|▌         | 55/1000 [00:54<15:16,  1.03it/s, lr=9.93e-5, step_loss=0.0587]
Steps:   6%|▌         | 55/1000 [00:54<15:16,  1.03it/s, lr=9.93e-5, step_loss=0.0733]
Steps:   6%|▌         | 55/1000 [00:54<15:16,  1.03it/s, lr=9.93e-5, step_loss=0.0229]
Steps:   6%|▌         | 55/1000 [00:54<15:16,  1.03it/s, lr=9.93e-5, step_loss=0.045] 
Steps:   6%|▌         | 55/1000 [00:55<15:16,  1.03it/s, lr=9.93e-5, step_loss=0.00702]
Steps:   6%|▌         | 56/1000 [00:55<15:16,  1.03it/s, lr=9.93e-5, step_loss=0.00702]
Steps:   6%|▌         | 56/1000 [00:55<15:16,  1.03it/s, lr=9.92e-5, step_loss=0.169]  
Steps:   6%|▌         | 56/1000 [00:55<15:16,  1.03it/s, lr=9.92e-5, step_loss=0.0525]
Steps:   6%|▌         | 56/1000 [00:55<15:16,  1.03it/s, lr=9.92e-5, step_loss=0.221] 
Steps:   6%|▌         | 56/1000 [00:55<15:16,  1.03it/s, lr=9.92e-5, step_loss=0.16] 
Steps:   6%|▌         | 57/1000 [00:56<15:15,  1.03it/s, lr=9.92e-5, step_loss=0.16]
Steps:   6%|▌         | 57/1000 [00:56<15:15,  1.03it/s, lr=9.92e-5, step_loss=0.0553]
Steps:   6%|▌         | 57/1000 [00:56<15:15,  1.03it/s, lr=9.92e-5, step_loss=0.0911]
Steps:   6%|▌         | 57/1000 [00:56<15:15,  1.03it/s, lr=9.92e-5, step_loss=0.00762]
Steps:   6%|▌         | 57/1000 [00:56<15:15,  1.03it/s, lr=9.92e-5, step_loss=0.0412] 
Steps:   6%|▌         | 58/1000 [00:57<15:13,  1.03it/s, lr=9.92e-5, step_loss=0.0412]
Steps:   6%|▌         | 58/1000 [00:57<15:13,  1.03it/s, lr=9.92e-5, step_loss=0.023] 
Steps:   6%|▌         | 58/1000 [00:57<15:13,  1.03it/s, lr=9.92e-5, step_loss=0.0721]
Steps:   6%|▌         | 58/1000 [00:57<15:13,  1.03it/s, lr=9.92e-5, step_loss=0.153] 
Steps:   6%|▌         | 58/1000 [00:57<15:13,  1.03it/s, lr=9.92e-5, step_loss=0.0427]
Steps:   6%|▌         | 59/1000 [00:58<15:12,  1.03it/s, lr=9.92e-5, step_loss=0.0427]
Steps:   6%|▌         | 59/1000 [00:58<15:12,  1.03it/s, lr=9.91e-5, step_loss=0.302] 
Steps:   6%|▌         | 59/1000 [00:58<15:12,  1.03it/s, lr=9.91e-5, step_loss=0.00986]
Steps:   6%|▌         | 59/1000 [00:58<15:12,  1.03it/s, lr=9.91e-5, step_loss=0.166]  
Steps:   6%|▌         | 59/1000 [00:58<15:12,  1.03it/s, lr=9.91e-5, step_loss=0.00757]
Steps:   6%|▌         | 60/1000 [00:59<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.00757]
Steps:   6%|▌         | 60/1000 [00:59<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.0742] 
Steps:   6%|▌         | 60/1000 [00:59<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.0115]
Steps:   6%|▌         | 60/1000 [00:59<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.00248]
Steps:   6%|▌         | 60/1000 [00:59<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.0884] 
Steps:   6%|▌         | 61/1000 [01:00<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.0884]
Steps:   6%|▌         | 61/1000 [01:00<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.0141]
Steps:   6%|▌         | 61/1000 [01:00<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.019] 
Steps:   6%|▌         | 61/1000 [01:00<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.645]
Steps:   6%|▌         | 61/1000 [01:00<15:11,  1.03it/s, lr=9.91e-5, step_loss=0.142]
Steps:   6%|▌         | 62/1000 [01:01<15:10,  1.03it/s, lr=9.91e-5, step_loss=0.142]
Steps:   6%|▌         | 62/1000 [01:01<15:10,  1.03it/s, lr=9.91e-5, step_loss=0.0242]
Steps:   6%|▌         | 62/1000 [01:01<15:10,  1.03it/s, lr=9.91e-5, step_loss=0.0751]
Steps:   6%|▌         | 62/1000 [01:01<15:10,  1.03it/s, lr=9.91e-5, step_loss=0.0383]
Steps:   6%|▌         | 62/1000 [01:01<15:10,  1.03it/s, lr=9.91e-5, step_loss=0.0017]
Steps:   6%|▋         | 63/1000 [01:02<15:08,  1.03it/s, lr=9.91e-5, step_loss=0.0017]
Steps:   6%|▋         | 63/1000 [01:02<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.0376] 
Steps:   6%|▋         | 63/1000 [01:02<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.105] 
Steps:   6%|▋         | 63/1000 [01:02<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.0166]
Steps:   6%|▋         | 63/1000 [01:02<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.0184]
Steps:   6%|▋         | 64/1000 [01:02<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.0184]
Steps:   6%|▋         | 64/1000 [01:03<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.0347]
Steps:   6%|▋         | 64/1000 [01:03<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.0151]
Steps:   6%|▋         | 64/1000 [01:03<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.117] 
Steps:   6%|▋         | 64/1000 [01:03<15:08,  1.03it/s, lr=9.9e-5, step_loss=0.251]
Steps:   6%|▋         | 65/1000 [01:03<15:07,  1.03it/s, lr=9.9e-5, step_loss=0.251]
Steps:   6%|▋         | 65/1000 [01:03<15:07,  1.03it/s, lr=9.9e-5, step_loss=0.0547]
Steps:   6%|▋         | 65/1000 [01:04<15:07,  1.03it/s, lr=9.9e-5, step_loss=0.167] 
Steps:   6%|▋         | 65/1000 [01:04<15:07,  1.03it/s, lr=9.9e-5, step_loss=0.141]
Steps:   6%|▋         | 65/1000 [01:04<15:07,  1.03it/s, lr=9.9e-5, step_loss=0.0489]
Steps:   7%|▋         | 66/1000 [01:04<15:06,  1.03it/s, lr=9.9e-5, step_loss=0.0489]
Steps:   7%|▋         | 66/1000 [01:04<15:06,  1.03it/s, lr=9.89e-5, step_loss=0.0613]
Steps:   7%|▋         | 66/1000 [01:05<15:06,  1.03it/s, lr=9.89e-5, step_loss=0.0574]
Steps:   7%|▋         | 66/1000 [01:05<15:06,  1.03it/s, lr=9.89e-5, step_loss=0.00614]
Steps:   7%|▋         | 66/1000 [01:05<15:06,  1.03it/s, lr=9.89e-5, step_loss=0.203]  
Steps:   7%|▋         | 67/1000 [01:05<15:05,  1.03it/s, lr=9.89e-5, step_loss=0.203]
Steps:   7%|▋         | 67/1000 [01:05<15:05,  1.03it/s, lr=9.89e-5, step_loss=0.0181]
Steps:   7%|▋         | 67/1000 [01:06<15:05,  1.03it/s, lr=9.89e-5, step_loss=0.0044]
Steps:   7%|▋         | 67/1000 [01:06<15:05,  1.03it/s, lr=9.89e-5, step_loss=0.0236]
Steps:   7%|▋         | 67/1000 [01:06<15:05,  1.03it/s, lr=9.89e-5, step_loss=0.291] 
Steps:   7%|▋         | 68/1000 [01:06<15:04,  1.03it/s, lr=9.89e-5, step_loss=0.291]
Steps:   7%|▋         | 68/1000 [01:06<15:04,  1.03it/s, lr=9.89e-5, step_loss=0.0149]
Steps:   7%|▋         | 68/1000 [01:07<15:04,  1.03it/s, lr=9.89e-5, step_loss=0.0495]
Steps:   7%|▋         | 68/1000 [01:07<15:04,  1.03it/s, lr=9.89e-5, step_loss=0.176] 
Steps:   7%|▋         | 68/1000 [01:07<15:04,  1.03it/s, lr=9.89e-5, step_loss=0.0825]
Steps:   7%|▋         | 69/1000 [01:07<15:03,  1.03it/s, lr=9.89e-5, step_loss=0.0825]
Steps:   7%|▋         | 69/1000 [01:07<15:03,  1.03it/s, lr=9.88e-5, step_loss=0.00837]
Steps:   7%|▋         | 69/1000 [01:08<15:03,  1.03it/s, lr=9.88e-5, step_loss=0.0164] 
Steps:   7%|▋         | 69/1000 [01:08<15:03,  1.03it/s, lr=9.88e-5, step_loss=0.0585]
Steps:   7%|▋         | 69/1000 [01:08<15:03,  1.03it/s, lr=9.88e-5, step_loss=0.0571]
Steps:   7%|▋         | 70/1000 [01:08<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.0571]
Steps:   7%|▋         | 70/1000 [01:08<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.00679]
Steps:   7%|▋         | 70/1000 [01:09<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.0267] 
Steps:   7%|▋         | 70/1000 [01:09<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.0579]
Steps:   7%|▋         | 70/1000 [01:09<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.00787]
Steps:   7%|▋         | 71/1000 [01:09<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.00787]
Steps:   7%|▋         | 71/1000 [01:09<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.0785] 
Steps:   7%|▋         | 71/1000 [01:10<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.0291]
Steps:   7%|▋         | 71/1000 [01:10<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.684] 
Steps:   7%|▋         | 71/1000 [01:10<15:04,  1.03it/s, lr=9.88e-5, step_loss=0.267]
Steps:   7%|▋         | 72/1000 [01:10<15:02,  1.03it/s, lr=9.88e-5, step_loss=0.267]
Steps:   7%|▋         | 72/1000 [01:10<15:02,  1.03it/s, lr=9.87e-5, step_loss=0.0618]
Steps:   7%|▋         | 72/1000 [01:11<15:02,  1.03it/s, lr=9.87e-5, step_loss=0.122] 
Steps:   7%|▋         | 72/1000 [01:11<15:02,  1.03it/s, lr=9.87e-5, step_loss=0.00409]
Steps:   7%|▋         | 72/1000 [01:11<15:02,  1.03it/s, lr=9.87e-5, step_loss=0.509]  
Steps:   7%|▋         | 73/1000 [01:11<15:01,  1.03it/s, lr=9.87e-5, step_loss=0.509]
Steps:   7%|▋         | 73/1000 [01:11<15:01,  1.03it/s, lr=9.87e-5, step_loss=0.0414]
Steps:   7%|▋         | 73/1000 [01:12<15:01,  1.03it/s, lr=9.87e-5, step_loss=0.0683]
Steps:   7%|▋         | 73/1000 [01:12<15:01,  1.03it/s, lr=9.87e-5, step_loss=0.0542]
Steps:   7%|▋         | 73/1000 [01:12<15:01,  1.03it/s, lr=9.87e-5, step_loss=0.0166]
Steps:   7%|▋         | 74/1000 [01:12<15:00,  1.03it/s, lr=9.87e-5, step_loss=0.0166]
Steps:   7%|▋         | 74/1000 [01:12<15:00,  1.03it/s, lr=9.87e-5, step_loss=0.105] 
Steps:   7%|▋         | 74/1000 [01:12<15:00,  1.03it/s, lr=9.87e-5, step_loss=0.00465]
Steps:   7%|▋         | 74/1000 [01:13<15:00,  1.03it/s, lr=9.87e-5, step_loss=0.328]  
Steps:   7%|▋         | 74/1000 [01:13<15:00,  1.03it/s, lr=9.87e-5, step_loss=0.166]
Steps:   8%|▊         | 75/1000 [01:13<14:58,  1.03it/s, lr=9.87e-5, step_loss=0.166]
Steps:   8%|▊         | 75/1000 [01:13<14:58,  1.03it/s, lr=9.86e-5, step_loss=0.0108]
Steps:   8%|▊         | 75/1000 [01:13<14:58,  1.03it/s, lr=9.86e-5, step_loss=0.0607]
Steps:   8%|▊         | 75/1000 [01:14<14:58,  1.03it/s, lr=9.86e-5, step_loss=0.0192]
Steps:   8%|▊         | 75/1000 [01:14<14:58,  1.03it/s, lr=9.86e-5, step_loss=0.0102]
Steps:   8%|▊         | 76/1000 [01:14<14:57,  1.03it/s, lr=9.86e-5, step_loss=0.0102]
Steps:   8%|▊         | 76/1000 [01:14<14:57,  1.03it/s, lr=9.86e-5, step_loss=0.0161]
Steps:   8%|▊         | 76/1000 [01:14<14:57,  1.03it/s, lr=9.86e-5, step_loss=0.108] 
Steps:   8%|▊         | 76/1000 [01:15<14:57,  1.03it/s, lr=9.86e-5, step_loss=0.116]
Steps:   8%|▊         | 76/1000 [01:15<14:57,  1.03it/s, lr=9.86e-5, step_loss=0.126]
Steps:   8%|▊         | 77/1000 [01:15<14:56,  1.03it/s, lr=9.86e-5, step_loss=0.126]
Steps:   8%|▊         | 77/1000 [01:15<14:56,  1.03it/s, lr=9.85e-5, step_loss=0.0753]
Steps:   8%|▊         | 77/1000 [01:15<14:56,  1.03it/s, lr=9.85e-5, step_loss=0.00646]
Steps:   8%|▊         | 77/1000 [01:16<14:56,  1.03it/s, lr=9.85e-5, step_loss=0.076]  
Steps:   8%|▊         | 77/1000 [01:16<14:56,  1.03it/s, lr=9.85e-5, step_loss=0.138]
Steps:   8%|▊         | 78/1000 [01:16<14:55,  1.03it/s, lr=9.85e-5, step_loss=0.138]
Steps:   8%|▊         | 78/1000 [01:16<14:55,  1.03it/s, lr=9.85e-5, step_loss=0.172]
Steps:   8%|▊         | 78/1000 [01:16<14:55,  1.03it/s, lr=9.85e-5, step_loss=0.0508]
Steps:   8%|▊         | 78/1000 [01:17<14:55,  1.03it/s, lr=9.85e-5, step_loss=0.144] 
Steps:   8%|▊         | 78/1000 [01:17<14:55,  1.03it/s, lr=9.85e-5, step_loss=0.0417]
Steps:   8%|▊         | 79/1000 [01:17<14:53,  1.03it/s, lr=9.85e-5, step_loss=0.0417]
Steps:   8%|▊         | 79/1000 [01:17<14:53,  1.03it/s, lr=9.85e-5, step_loss=0.0786]
Steps:   8%|▊         | 79/1000 [01:17<14:53,  1.03it/s, lr=9.85e-5, step_loss=0.00585]
Steps:   8%|▊         | 79/1000 [01:18<14:53,  1.03it/s, lr=9.85e-5, step_loss=0.178]  
Steps:   8%|▊         | 79/1000 [01:18<14:53,  1.03it/s, lr=9.85e-5, step_loss=0.0687]
Steps:   8%|▊         | 80/1000 [01:18<14:53,  1.03it/s, lr=9.85e-5, step_loss=0.0687]
Steps:   8%|▊         | 80/1000 [01:18<14:53,  1.03it/s, lr=9.84e-5, step_loss=0.0614]
Steps:   8%|▊         | 80/1000 [01:18<14:53,  1.03it/s, lr=9.84e-5, step_loss=0.0166]
Steps:   8%|▊         | 80/1000 [01:19<14:53,  1.03it/s, lr=9.84e-5, step_loss=0.0718]
Steps:   8%|▊         | 80/1000 [01:19<14:53,  1.03it/s, lr=9.84e-5, step_loss=0.0732]
Steps:   8%|▊         | 81/1000 [01:19<14:52,  1.03it/s, lr=9.84e-5, step_loss=0.0732]
Steps:   8%|▊         | 81/1000 [01:19<14:52,  1.03it/s, lr=9.84e-5, step_loss=0.0289]
Steps:   8%|▊         | 81/1000 [01:19<14:52,  1.03it/s, lr=9.84e-5, step_loss=0.211] 
Steps:   8%|▊         | 81/1000 [01:20<14:52,  1.03it/s, lr=9.84e-5, step_loss=0.0049]
Steps:   8%|▊         | 81/1000 [01:20<14:52,  1.03it/s, lr=9.84e-5, step_loss=0.158] 
Steps:   8%|▊         | 82/1000 [01:20<14:51,  1.03it/s, lr=9.84e-5, step_loss=0.158]
Steps:   8%|▊         | 82/1000 [01:20<14:51,  1.03it/s, lr=9.84e-5, step_loss=0.00891]
Steps:   8%|▊         | 82/1000 [01:20<14:51,  1.03it/s, lr=9.84e-5, step_loss=0.0623] 
Steps:   8%|▊         | 82/1000 [01:20<14:51,  1.03it/s, lr=9.84e-5, step_loss=0.00695]
Steps:   8%|▊         | 82/1000 [01:21<14:51,  1.03it/s, lr=9.84e-5, step_loss=0.0403] 
Steps:   8%|▊         | 83/1000 [01:21<14:50,  1.03it/s, lr=9.84e-5, step_loss=0.0403]
Steps:   8%|▊         | 83/1000 [01:21<14:50,  1.03it/s, lr=9.83e-5, step_loss=0.0642]
Steps:   8%|▊         | 83/1000 [01:21<14:50,  1.03it/s, lr=9.83e-5, step_loss=0.0378]
Steps:   8%|▊         | 83/1000 [01:21<14:50,  1.03it/s, lr=9.83e-5, step_loss=0.00513]
Steps:   8%|▊         | 83/1000 [01:22<14:50,  1.03it/s, lr=9.83e-5, step_loss=0.041]  
Steps:   8%|▊         | 84/1000 [01:22<14:49,  1.03it/s, lr=9.83e-5, step_loss=0.041]
Steps:   8%|▊         | 84/1000 [01:22<14:49,  1.03it/s, lr=9.83e-5, step_loss=0.0821]
Steps:   8%|▊         | 84/1000 [01:22<14:49,  1.03it/s, lr=9.83e-5, step_loss=0.12]  
Steps:   8%|▊         | 84/1000 [01:22<14:49,  1.03it/s, lr=9.83e-5, step_loss=0.179]
Steps:   8%|▊         | 84/1000 [01:23<14:49,  1.03it/s, lr=9.83e-5, step_loss=0.0133]
Steps:   8%|▊         | 85/1000 [01:23<14:48,  1.03it/s, lr=9.83e-5, step_loss=0.0133]
Steps:   8%|▊         | 85/1000 [01:23<14:48,  1.03it/s, lr=9.82e-5, step_loss=0.00522]
Steps:   8%|▊         | 85/1000 [01:23<14:48,  1.03it/s, lr=9.82e-5, step_loss=0.0965] 
Steps:   8%|▊         | 85/1000 [01:23<14:48,  1.03it/s, lr=9.82e-5, step_loss=0.133] 
Steps:   8%|▊         | 85/1000 [01:24<14:48,  1.03it/s, lr=9.82e-5, step_loss=0.00513]
Steps:   9%|▊         | 86/1000 [01:24<14:47,  1.03it/s, lr=9.82e-5, step_loss=0.00513]
Steps:   9%|▊         | 86/1000 [01:24<14:47,  1.03it/s, lr=9.82e-5, step_loss=0.0455] 
Steps:   9%|▊         | 86/1000 [01:24<14:47,  1.03it/s, lr=9.82e-5, step_loss=0.0419]
Steps:   9%|▊         | 86/1000 [01:24<14:47,  1.03it/s, lr=9.82e-5, step_loss=0.198] 
Steps:   9%|▊         | 86/1000 [01:25<14:47,  1.03it/s, lr=9.82e-5, step_loss=0.0281]
Steps:   9%|▊         | 87/1000 [01:25<14:46,  1.03it/s, lr=9.82e-5, step_loss=0.0281]
Steps:   9%|▊         | 87/1000 [01:25<14:46,  1.03it/s, lr=9.81e-5, step_loss=0.186] 
Steps:   9%|▊         | 87/1000 [01:25<14:46,  1.03it/s, lr=9.81e-5, step_loss=0.0347]
Steps:   9%|▊         | 87/1000 [01:25<14:46,  1.03it/s, lr=9.81e-5, step_loss=0.00373]
Steps:   9%|▊         | 87/1000 [01:26<14:46,  1.03it/s, lr=9.81e-5, step_loss=0.044]  
Steps:   9%|▉         | 88/1000 [01:26<14:45,  1.03it/s, lr=9.81e-5, step_loss=0.044]
Steps:   9%|▉         | 88/1000 [01:26<14:45,  1.03it/s, lr=9.81e-5, step_loss=0.0597]
Steps:   9%|▉         | 88/1000 [01:26<14:45,  1.03it/s, lr=9.81e-5, step_loss=0.0111]
Steps:   9%|▉         | 88/1000 [01:26<14:45,  1.03it/s, lr=9.81e-5, step_loss=0.102] 
Steps:   9%|▉         | 88/1000 [01:27<14:45,  1.03it/s, lr=9.81e-5, step_loss=0.0051]
Steps:   9%|▉         | 89/1000 [01:27<14:44,  1.03it/s, lr=9.81e-5, step_loss=0.0051]
Steps:   9%|▉         | 89/1000 [01:27<14:44,  1.03it/s, lr=9.81e-5, step_loss=0.0268]
Steps:   9%|▉         | 89/1000 [01:27<14:44,  1.03it/s, lr=9.81e-5, step_loss=0.0729]
Steps:   9%|▉         | 89/1000 [01:27<14:44,  1.03it/s, lr=9.81e-5, step_loss=0.141] 
Steps:   9%|▉         | 89/1000 [01:28<14:44,  1.03it/s, lr=9.81e-5, step_loss=0.0104]
Steps:   9%|▉         | 90/1000 [01:28<14:44,  1.03it/s, lr=9.81e-5, step_loss=0.0104]
Steps:   9%|▉         | 90/1000 [01:28<14:44,  1.03it/s, lr=9.8e-5, step_loss=0.0521] 
Steps:   9%|▉         | 90/1000 [01:28<14:44,  1.03it/s, lr=9.8e-5, step_loss=0.0406]
Steps:   9%|▉         | 90/1000 [01:28<14:44,  1.03it/s, lr=9.8e-5, step_loss=0.828] 
Steps:   9%|▉         | 90/1000 [01:29<14:44,  1.03it/s, lr=9.8e-5, step_loss=0.279]
Steps:   9%|▉         | 91/1000 [01:29<14:42,  1.03it/s, lr=9.8e-5, step_loss=0.279]
Steps:   9%|▉         | 91/1000 [01:29<14:42,  1.03it/s, lr=9.8e-5, step_loss=0.047]
Steps:   9%|▉         | 91/1000 [01:29<14:42,  1.03it/s, lr=9.8e-5, step_loss=0.00922]
Steps:   9%|▉         | 91/1000 [01:29<14:42,  1.03it/s, lr=9.8e-5, step_loss=0.0166] 
Steps:   9%|▉         | 91/1000 [01:29<14:42,  1.03it/s, lr=9.8e-5, step_loss=0.195] 
Steps:   9%|▉         | 92/1000 [01:30<14:42,  1.03it/s, lr=9.8e-5, step_loss=0.195]
Steps:   9%|▉         | 92/1000 [01:30<14:42,  1.03it/s, lr=9.79e-5, step_loss=0.0454]
Steps:   9%|▉         | 92/1000 [01:30<14:42,  1.03it/s, lr=9.79e-5, step_loss=0.196] 
Steps:   9%|▉         | 92/1000 [01:30<14:42,  1.03it/s, lr=9.79e-5, step_loss=0.0916]
Steps:   9%|▉         | 92/1000 [01:30<14:42,  1.03it/s, lr=9.79e-5, step_loss=0.00442]
Steps:   9%|▉         | 93/1000 [01:31<14:41,  1.03it/s, lr=9.79e-5, step_loss=0.00442]
Steps:   9%|▉         | 93/1000 [01:31<14:41,  1.03it/s, lr=9.79e-5, step_loss=0.0981] 
Steps:   9%|▉         | 93/1000 [01:31<14:41,  1.03it/s, lr=9.79e-5, step_loss=0.0183]
Steps:   9%|▉         | 93/1000 [01:31<14:41,  1.03it/s, lr=9.79e-5, step_loss=0.00586]
Steps:   9%|▉         | 93/1000 [01:31<14:41,  1.03it/s, lr=9.79e-5, step_loss=0.0375] 
Steps:   9%|▉         | 94/1000 [01:32<14:40,  1.03it/s, lr=9.79e-5, step_loss=0.0375]
Steps:   9%|▉         | 94/1000 [01:32<14:40,  1.03it/s, lr=9.78e-5, step_loss=0.00335]
Steps:   9%|▉         | 94/1000 [01:32<14:40,  1.03it/s, lr=9.78e-5, step_loss=0.102]  
Steps:   9%|▉         | 94/1000 [01:32<14:40,  1.03it/s, lr=9.78e-5, step_loss=0.00826]
Steps:   9%|▉         | 94/1000 [01:32<14:40,  1.03it/s, lr=9.78e-5, step_loss=0.0117] 
Steps:  10%|▉         | 95/1000 [01:33<14:39,  1.03it/s, lr=9.78e-5, step_loss=0.0117]
Steps:  10%|▉         | 95/1000 [01:33<14:39,  1.03it/s, lr=9.78e-5, step_loss=0.0191]
Steps:  10%|▉         | 95/1000 [01:33<14:39,  1.03it/s, lr=9.78e-5, step_loss=0.59]  
Steps:  10%|▉         | 95/1000 [01:33<14:39,  1.03it/s, lr=9.78e-5, step_loss=0.00315]
Steps:  10%|▉         | 95/1000 [01:33<14:39,  1.03it/s, lr=9.78e-5, step_loss=0.0192] 
Steps:  10%|▉         | 96/1000 [01:34<14:38,  1.03it/s, lr=9.78e-5, step_loss=0.0192]
Steps:  10%|▉         | 96/1000 [01:34<14:38,  1.03it/s, lr=9.77e-5, step_loss=0.063] 
Steps:  10%|▉         | 96/1000 [01:34<14:38,  1.03it/s, lr=9.77e-5, step_loss=0.266]
Steps:  10%|▉         | 96/1000 [01:34<14:38,  1.03it/s, lr=9.77e-5, step_loss=0.00853]
Steps:  10%|▉         | 96/1000 [01:34<14:38,  1.03it/s, lr=9.77e-5, step_loss=0.0371] 
Steps:  10%|▉         | 97/1000 [01:35<14:37,  1.03it/s, lr=9.77e-5, step_loss=0.0371]
Steps:  10%|▉         | 97/1000 [01:35<14:37,  1.03it/s, lr=9.77e-5, step_loss=0.00954]
Steps:  10%|▉         | 97/1000 [01:35<14:37,  1.03it/s, lr=9.77e-5, step_loss=0.164]  
Steps:  10%|▉         | 97/1000 [01:35<14:37,  1.03it/s, lr=9.77e-5, step_loss=0.00815]
Steps:  10%|▉         | 97/1000 [01:35<14:37,  1.03it/s, lr=9.77e-5, step_loss=0.00315]
Steps:  10%|▉         | 98/1000 [01:36<14:36,  1.03it/s, lr=9.77e-5, step_loss=0.00315]
Steps:  10%|▉         | 98/1000 [01:36<14:36,  1.03it/s, lr=9.76e-5, step_loss=0.159]  
Steps:  10%|▉         | 98/1000 [01:36<14:36,  1.03it/s, lr=9.76e-5, step_loss=0.0201]
Steps:  10%|▉         | 98/1000 [01:36<14:36,  1.03it/s, lr=9.76e-5, step_loss=0.0525]
Steps:  10%|▉         | 98/1000 [01:36<14:36,  1.03it/s, lr=9.76e-5, step_loss=0.154] 
Steps:  10%|▉         | 99/1000 [01:36<14:35,  1.03it/s, lr=9.76e-5, step_loss=0.154]
Steps:  10%|▉         | 99/1000 [01:37<14:35,  1.03it/s, lr=9.76e-5, step_loss=0.208]
Steps:  10%|▉         | 99/1000 [01:37<14:35,  1.03it/s, lr=9.76e-5, step_loss=0.022]
Steps:  10%|▉         | 99/1000 [01:37<14:35,  1.03it/s, lr=9.76e-5, step_loss=0.399]
Steps:  10%|▉         | 99/1000 [01:37<14:35,  1.03it/s, lr=9.76e-5, step_loss=0.00202]
Steps:  10%|█         | 100/1000 [01:37<14:34,  1.03it/s, lr=9.76e-5, step_loss=0.00202]
Steps:  10%|█         | 100/1000 [01:37<14:34,  1.03it/s, lr=9.76e-5, step_loss=0.00353]
Steps:  10%|█         | 100/1000 [01:38<14:34,  1.03it/s, lr=9.76e-5, step_loss=0.284]  
Steps:  10%|█         | 100/1000 [01:38<14:34,  1.03it/s, lr=9.76e-5, step_loss=0.00648]
Steps:  10%|█         | 100/1000 [01:38<14:34,  1.03it/s, lr=9.76e-5, step_loss=0.247]  
Steps:  10%|█         | 101/1000 [01:38<14:33,  1.03it/s, lr=9.76e-5, step_loss=0.247]
Steps:  10%|█         | 101/1000 [01:38<14:33,  1.03it/s, lr=9.75e-5, step_loss=0.0629]
Steps:  10%|█         | 101/1000 [01:39<14:33,  1.03it/s, lr=9.75e-5, step_loss=0.181] 
Steps:  10%|█         | 101/1000 [01:39<14:33,  1.03it/s, lr=9.75e-5, step_loss=0.00919]
Steps:  10%|█         | 101/1000 [01:39<14:33,  1.03it/s, lr=9.75e-5, step_loss=0.0109] 
Steps:  10%|█         | 102/1000 [01:39<14:32,  1.03it/s, lr=9.75e-5, step_loss=0.0109]
Steps:  10%|█         | 102/1000 [01:39<14:32,  1.03it/s, lr=9.75e-5, step_loss=0.0798]
Steps:  10%|█         | 102/1000 [01:40<14:32,  1.03it/s, lr=9.75e-5, step_loss=0.0362]
Steps:  10%|█         | 102/1000 [01:40<14:32,  1.03it/s, lr=9.75e-5, step_loss=0.029] 
Steps:  10%|█         | 102/1000 [01:40<14:32,  1.03it/s, lr=9.75e-5, step_loss=0.737]
Steps:  10%|█         | 103/1000 [01:40<14:31,  1.03it/s, lr=9.75e-5, step_loss=0.737]
Steps:  10%|█         | 103/1000 [01:40<14:31,  1.03it/s, lr=9.74e-5, step_loss=0.0261]
Steps:  10%|█         | 103/1000 [01:41<14:31,  1.03it/s, lr=9.74e-5, step_loss=0.0395]
Steps:  10%|█         | 103/1000 [01:41<14:31,  1.03it/s, lr=9.74e-5, step_loss=0.00289]
Steps:  10%|█         | 103/1000 [01:41<14:31,  1.03it/s, lr=9.74e-5, step_loss=0.128]  
Steps:  10%|█         | 104/1000 [01:41<14:30,  1.03it/s, lr=9.74e-5, step_loss=0.128]
Steps:  10%|█         | 104/1000 [01:41<14:30,  1.03it/s, lr=9.74e-5, step_loss=0.0911]
Steps:  10%|█         | 104/1000 [01:42<14:30,  1.03it/s, lr=9.74e-5, step_loss=0.135] 
Steps:  10%|█         | 104/1000 [01:42<14:30,  1.03it/s, lr=9.74e-5, step_loss=0.0669]
Steps:  10%|█         | 104/1000 [01:42<14:30,  1.03it/s, lr=9.74e-5, step_loss=0.00982]
Steps:  10%|█         | 105/1000 [01:42<14:29,  1.03it/s, lr=9.74e-5, step_loss=0.00982]
Steps:  10%|█         | 105/1000 [01:42<14:29,  1.03it/s, lr=9.73e-5, step_loss=0.0752] 
Steps:  10%|█         | 105/1000 [01:43<14:29,  1.03it/s, lr=9.73e-5, step_loss=0.0177]
Steps:  10%|█         | 105/1000 [01:43<14:29,  1.03it/s, lr=9.73e-5, step_loss=0.0138]
Steps:  10%|█         | 105/1000 [01:43<14:29,  1.03it/s, lr=9.73e-5, step_loss=0.0353]
Steps:  11%|█         | 106/1000 [01:43<14:28,  1.03it/s, lr=9.73e-5, step_loss=0.0353]
Steps:  11%|█         | 106/1000 [01:43<14:28,  1.03it/s, lr=9.73e-5, step_loss=0.00279]
Steps:  11%|█         | 106/1000 [01:44<14:28,  1.03it/s, lr=9.73e-5, step_loss=0.0848] 
Steps:  11%|█         | 106/1000 [01:44<14:28,  1.03it/s, lr=9.73e-5, step_loss=0.012] 
Steps:  11%|█         | 106/1000 [01:44<14:28,  1.03it/s, lr=9.73e-5, step_loss=0.109]
Steps:  11%|█         | 107/1000 [01:44<14:27,  1.03it/s, lr=9.73e-5, step_loss=0.109]
Steps:  11%|█         | 107/1000 [01:44<14:27,  1.03it/s, lr=9.72e-5, step_loss=0.221]
Steps:  11%|█         | 107/1000 [01:45<14:27,  1.03it/s, lr=9.72e-5, step_loss=0.00195]
Steps:  11%|█         | 107/1000 [01:45<14:27,  1.03it/s, lr=9.72e-5, step_loss=0.156]  
Steps:  11%|█         | 107/1000 [01:45<14:27,  1.03it/s, lr=9.72e-5, step_loss=0.139]
Steps:  11%|█         | 108/1000 [01:45<14:26,  1.03it/s, lr=9.72e-5, step_loss=0.139]
Steps:  11%|█         | 108/1000 [01:45<14:26,  1.03it/s, lr=9.71e-5, step_loss=0.118]
Steps:  11%|█         | 108/1000 [01:46<14:26,  1.03it/s, lr=9.71e-5, step_loss=0.117]
Steps:  11%|█         | 108/1000 [01:46<14:26,  1.03it/s, lr=9.71e-5, step_loss=0.0672]
Steps:  11%|█         | 108/1000 [01:46<14:26,  1.03it/s, lr=9.71e-5, step_loss=0.639] 
Steps:  11%|█         | 109/1000 [01:46<14:25,  1.03it/s, lr=9.71e-5, step_loss=0.639]
Steps:  11%|█         | 109/1000 [01:46<14:25,  1.03it/s, lr=9.71e-5, step_loss=0.0852]
Steps:  11%|█         | 109/1000 [01:46<14:25,  1.03it/s, lr=9.71e-5, step_loss=0.00959]
Steps:  11%|█         | 109/1000 [01:47<14:25,  1.03it/s, lr=9.71e-5, step_loss=0.119]  
Steps:  11%|█         | 109/1000 [01:47<14:25,  1.03it/s, lr=9.71e-5, step_loss=0.17] 
Steps:  11%|█         | 110/1000 [01:47<14:25,  1.03it/s, lr=9.71e-5, step_loss=0.17]
Steps:  11%|█         | 110/1000 [01:47<14:25,  1.03it/s, lr=9.7e-5, step_loss=0.117]
Steps:  11%|█         | 110/1000 [01:47<14:25,  1.03it/s, lr=9.7e-5, step_loss=0.1]  
Steps:  11%|█         | 110/1000 [01:48<14:25,  1.03it/s, lr=9.7e-5, step_loss=0.0186]
Steps:  11%|█         | 110/1000 [01:48<14:25,  1.03it/s, lr=9.7e-5, step_loss=0.0227]
Steps:  11%|█         | 111/1000 [01:48<14:24,  1.03it/s, lr=9.7e-5, step_loss=0.0227]
Steps:  11%|█         | 111/1000 [01:48<14:24,  1.03it/s, lr=9.7e-5, step_loss=0.00301]
Steps:  11%|█         | 111/1000 [01:48<14:24,  1.03it/s, lr=9.7e-5, step_loss=0.476]  
Steps:  11%|█         | 111/1000 [01:49<14:24,  1.03it/s, lr=9.7e-5, step_loss=0.0364]
Steps:  11%|█         | 111/1000 [01:49<14:24,  1.03it/s, lr=9.7e-5, step_loss=0.0849]
Steps:  11%|█         | 112/1000 [01:49<14:23,  1.03it/s, lr=9.7e-5, step_loss=0.0849]
Steps:  11%|█         | 112/1000 [01:49<14:23,  1.03it/s, lr=9.69e-5, step_loss=0.0371]
Steps:  11%|█         | 112/1000 [01:49<14:23,  1.03it/s, lr=9.69e-5, step_loss=0.00415]
Steps:  11%|█         | 112/1000 [01:50<14:23,  1.03it/s, lr=9.69e-5, step_loss=0.00417]
Steps:  11%|█         | 112/1000 [01:50<14:23,  1.03it/s, lr=9.69e-5, step_loss=0.00581]
Steps:  11%|█▏        | 113/1000 [01:50<14:21,  1.03it/s, lr=9.69e-5, step_loss=0.00581]
Steps:  11%|█▏        | 113/1000 [01:50<14:21,  1.03it/s, lr=9.69e-5, step_loss=0.014]  
Steps:  11%|█▏        | 113/1000 [01:50<14:21,  1.03it/s, lr=9.69e-5, step_loss=0.0222]
Steps:  11%|█▏        | 113/1000 [01:51<14:21,  1.03it/s, lr=9.69e-5, step_loss=0.228] 
Steps:  11%|█▏        | 113/1000 [01:51<14:21,  1.03it/s, lr=9.69e-5, step_loss=0.0631]
Steps:  11%|█▏        | 114/1000 [01:51<14:20,  1.03it/s, lr=9.69e-5, step_loss=0.0631]
Steps:  11%|█▏        | 114/1000 [01:51<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.0514]
Steps:  11%|█▏        | 114/1000 [01:51<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.0423]
Steps:  11%|█▏        | 114/1000 [01:52<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.0449]
Steps:  11%|█▏        | 114/1000 [01:52<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.107] 
Steps:  12%|█▏        | 115/1000 [01:52<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.107]
Steps:  12%|█▏        | 115/1000 [01:52<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.0223]
Steps:  12%|█▏        | 115/1000 [01:52<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.0587]
Steps:  12%|█▏        | 115/1000 [01:53<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.28]  
Steps:  12%|█▏        | 115/1000 [01:53<14:20,  1.03it/s, lr=9.68e-5, step_loss=0.0564]
Steps:  12%|█▏        | 116/1000 [01:53<14:18,  1.03it/s, lr=9.68e-5, step_loss=0.0564]
Steps:  12%|█▏        | 116/1000 [01:53<14:18,  1.03it/s, lr=9.67e-5, step_loss=0.127] 
Steps:  12%|█▏        | 116/1000 [01:53<14:18,  1.03it/s, lr=9.67e-5, step_loss=0.264]
Steps:  12%|█▏        | 116/1000 [01:54<14:18,  1.03it/s, lr=9.67e-5, step_loss=0.0849]
Steps:  12%|█▏        | 116/1000 [01:54<14:18,  1.03it/s, lr=9.67e-5, step_loss=0.00366]
Steps:  12%|█▏        | 117/1000 [01:54<14:17,  1.03it/s, lr=9.67e-5, step_loss=0.00366]
Steps:  12%|█▏        | 117/1000 [01:54<14:17,  1.03it/s, lr=9.67e-5, step_loss=0.0316] 
Steps:  12%|█▏        | 117/1000 [01:54<14:17,  1.03it/s, lr=9.67e-5, step_loss=0.0686]
Steps:  12%|█▏        | 117/1000 [01:54<14:17,  1.03it/s, lr=9.67e-5, step_loss=0.192] 
Steps:  12%|█▏        | 117/1000 [01:55<14:17,  1.03it/s, lr=9.67e-5, step_loss=0.0659]
Steps:  12%|█▏        | 118/1000 [01:55<14:16,  1.03it/s, lr=9.67e-5, step_loss=0.0659]
Steps:  12%|█▏        | 118/1000 [01:55<14:16,  1.03it/s, lr=9.66e-5, step_loss=0.125] 
Steps:  12%|█▏        | 118/1000 [01:55<14:16,  1.03it/s, lr=9.66e-5, step_loss=0.0228]
Steps:  12%|█▏        | 118/1000 [01:55<14:16,  1.03it/s, lr=9.66e-5, step_loss=0.202] 
Steps:  12%|█▏        | 118/1000 [01:56<14:16,  1.03it/s, lr=9.66e-5, step_loss=0.0762]
Steps:  12%|█▏        | 119/1000 [01:56<14:16,  1.03it/s, lr=9.66e-5, step_loss=0.0762]
Steps:  12%|█▏        | 119/1000 [01:56<14:16,  1.03it/s, lr=9.65e-5, step_loss=0.00454]
Steps:  12%|█▏        | 119/1000 [01:56<14:16,  1.03it/s, lr=9.65e-5, step_loss=0.0246] 
Steps:  12%|█▏        | 119/1000 [01:56<14:16,  1.03it/s, lr=9.65e-5, step_loss=0.0384]
Steps:  12%|█▏        | 119/1000 [01:57<14:16,  1.03it/s, lr=9.65e-5, step_loss=0.164] 
Steps:  12%|█▏        | 120/1000 [01:57<14:15,  1.03it/s, lr=9.65e-5, step_loss=0.164]
Steps:  12%|█▏        | 120/1000 [01:57<14:15,  1.03it/s, lr=9.65e-5, step_loss=0.0318]
Steps:  12%|█▏        | 120/1000 [01:57<14:15,  1.03it/s, lr=9.65e-5, step_loss=0.014] 
Steps:  12%|█▏        | 120/1000 [01:57<14:15,  1.03it/s, lr=9.65e-5, step_loss=0.133]
Steps:  12%|█▏        | 120/1000 [01:58<14:15,  1.03it/s, lr=9.65e-5, step_loss=0.00613]
Steps:  12%|█▏        | 121/1000 [01:58<14:14,  1.03it/s, lr=9.65e-5, step_loss=0.00613]
Steps:  12%|█▏        | 121/1000 [01:58<14:14,  1.03it/s, lr=9.64e-5, step_loss=0.00443]
Steps:  12%|█▏        | 121/1000 [01:58<14:14,  1.03it/s, lr=9.64e-5, step_loss=0.0926] 
Steps:  12%|█▏        | 121/1000 [01:58<14:14,  1.03it/s, lr=9.64e-5, step_loss=0.0313]
Steps:  12%|█▏        | 121/1000 [01:59<14:14,  1.03it/s, lr=9.64e-5, step_loss=0.00667]
Steps:  12%|█▏        | 122/1000 [01:59<14:13,  1.03it/s, lr=9.64e-5, step_loss=0.00667]
Steps:  12%|█▏        | 122/1000 [01:59<14:13,  1.03it/s, lr=9.64e-5, step_loss=0.00261]
Steps:  12%|█▏        | 122/1000 [01:59<14:13,  1.03it/s, lr=9.64e-5, step_loss=0.0754] 
Steps:  12%|█▏        | 122/1000 [01:59<14:13,  1.03it/s, lr=9.64e-5, step_loss=0.0166]
Steps:  12%|█▏        | 122/1000 [02:00<14:13,  1.03it/s, lr=9.64e-5, step_loss=0.0995]
Steps:  12%|█▏        | 123/1000 [02:00<14:12,  1.03it/s, lr=9.64e-5, step_loss=0.0995]
Steps:  12%|█▏        | 123/1000 [02:00<14:12,  1.03it/s, lr=9.63e-5, step_loss=0.0611]
Steps:  12%|█▏        | 123/1000 [02:00<14:12,  1.03it/s, lr=9.63e-5, step_loss=0.4]   
Steps:  12%|█▏        | 123/1000 [02:00<14:12,  1.03it/s, lr=9.63e-5, step_loss=0.154]
Steps:  12%|█▏        | 123/1000 [02:01<14:12,  1.03it/s, lr=9.63e-5, step_loss=0.0208]
Steps:  12%|█▏        | 124/1000 [02:01<14:11,  1.03it/s, lr=9.63e-5, step_loss=0.0208]
Steps:  12%|█▏        | 124/1000 [02:01<14:11,  1.03it/s, lr=9.63e-5, step_loss=0.138] 
Steps:  12%|█▏        | 124/1000 [02:01<14:11,  1.03it/s, lr=9.63e-5, step_loss=0.00607]
Steps:  12%|█▏        | 124/1000 [02:01<14:11,  1.03it/s, lr=9.63e-5, step_loss=0.118]  
Steps:  12%|█▏        | 124/1000 [02:02<14:11,  1.03it/s, lr=9.63e-5, step_loss=0.00499]
Steps:  12%|█▎        | 125/1000 [02:02<14:10,  1.03it/s, lr=9.63e-5, step_loss=0.00499]
Steps:  12%|█▎        | 125/1000 [02:02<14:10,  1.03it/s, lr=9.62e-5, step_loss=0.00819]
Steps:  12%|█▎        | 125/1000 [02:02<14:10,  1.03it/s, lr=9.62e-5, step_loss=0.00379]
Steps:  12%|█▎        | 125/1000 [02:02<14:10,  1.03it/s, lr=9.62e-5, step_loss=0.036]  
Steps:  12%|█▎        | 125/1000 [02:03<14:10,  1.03it/s, lr=9.62e-5, step_loss=0.0661]
Steps:  13%|█▎        | 126/1000 [02:03<14:09,  1.03it/s, lr=9.62e-5, step_loss=0.0661]
Steps:  13%|█▎        | 126/1000 [02:03<14:09,  1.03it/s, lr=9.61e-5, step_loss=0.00879]
Steps:  13%|█▎        | 126/1000 [02:03<14:09,  1.03it/s, lr=9.61e-5, step_loss=0.00566]
Steps:  13%|█▎        | 126/1000 [02:03<14:09,  1.03it/s, lr=9.61e-5, step_loss=0.112]  
Steps:  13%|█▎        | 126/1000 [02:03<14:09,  1.03it/s, lr=9.61e-5, step_loss=0.00746]
Steps:  13%|█▎        | 127/1000 [02:04<14:08,  1.03it/s, lr=9.61e-5, step_loss=0.00746]
Steps:  13%|█▎        | 127/1000 [02:04<14:08,  1.03it/s, lr=9.61e-5, step_loss=0.0296] 
Steps:  13%|█▎        | 127/1000 [02:04<14:08,  1.03it/s, lr=9.61e-5, step_loss=0.0222]
Steps:  13%|█▎        | 127/1000 [02:04<14:08,  1.03it/s, lr=9.61e-5, step_loss=0.457] 
Steps:  13%|█▎        | 127/1000 [02:04<14:08,  1.03it/s, lr=9.61e-5, step_loss=0.0982]
Steps:  13%|█▎        | 128/1000 [02:05<14:07,  1.03it/s, lr=9.61e-5, step_loss=0.0982]
Steps:  13%|█▎        | 128/1000 [02:05<14:07,  1.03it/s, lr=9.6e-5, step_loss=0.0434] 
Steps:  13%|█▎        | 128/1000 [02:05<14:07,  1.03it/s, lr=9.6e-5, step_loss=0.0601]
Steps:  13%|█▎        | 128/1000 [02:05<14:07,  1.03it/s, lr=9.6e-5, step_loss=0.00739]
Steps:  13%|█▎        | 128/1000 [02:05<14:07,  1.03it/s, lr=9.6e-5, step_loss=0.0365] 
Steps:  13%|█▎        | 129/1000 [02:06<14:06,  1.03it/s, lr=9.6e-5, step_loss=0.0365]
Steps:  13%|█▎        | 129/1000 [02:06<14:06,  1.03it/s, lr=9.59e-5, step_loss=0.00965]
Steps:  13%|█▎        | 129/1000 [02:06<14:06,  1.03it/s, lr=9.59e-5, step_loss=0.0159] 
Steps:  13%|█▎        | 129/1000 [02:06<14:06,  1.03it/s, lr=9.59e-5, step_loss=0.13]  
Steps:  13%|█▎        | 129/1000 [02:06<14:06,  1.03it/s, lr=9.59e-5, step_loss=0.008]
Steps:  13%|█▎        | 130/1000 [02:07<14:05,  1.03it/s, lr=9.59e-5, step_loss=0.008]
Steps:  13%|█▎        | 130/1000 [02:07<14:05,  1.03it/s, lr=9.59e-5, step_loss=0.169]
Steps:  13%|█▎        | 130/1000 [02:07<14:05,  1.03it/s, lr=9.59e-5, step_loss=0.00483]
Steps:  13%|█▎        | 130/1000 [02:07<14:05,  1.03it/s, lr=9.59e-5, step_loss=0.00361]
Steps:  13%|█▎        | 130/1000 [02:07<14:05,  1.03it/s, lr=9.59e-5, step_loss=0.0115] 
Steps:  13%|█▎        | 131/1000 [02:08<14:04,  1.03it/s, lr=9.59e-5, step_loss=0.0115]
Steps:  13%|█▎        | 131/1000 [02:08<14:04,  1.03it/s, lr=9.58e-5, step_loss=0.154] 
Steps:  13%|█▎        | 131/1000 [02:08<14:04,  1.03it/s, lr=9.58e-5, step_loss=0.0183]
Steps:  13%|█▎        | 131/1000 [02:08<14:04,  1.03it/s, lr=9.58e-5, step_loss=0.172] 
Steps:  13%|█▎        | 131/1000 [02:08<14:04,  1.03it/s, lr=9.58e-5, step_loss=0.0214]
Steps:  13%|█▎        | 132/1000 [02:09<14:03,  1.03it/s, lr=9.58e-5, step_loss=0.0214]
Steps:  13%|█▎        | 132/1000 [02:09<14:03,  1.03it/s, lr=9.58e-5, step_loss=0.0506]
Steps:  13%|█▎        | 132/1000 [02:09<14:03,  1.03it/s, lr=9.58e-5, step_loss=0.0774]
Steps:  13%|█▎        | 132/1000 [02:09<14:03,  1.03it/s, lr=9.58e-5, step_loss=0.0574]
Steps:  13%|█▎        | 132/1000 [02:09<14:03,  1.03it/s, lr=9.58e-5, step_loss=0.0883]
Steps:  13%|█▎        | 133/1000 [02:10<14:02,  1.03it/s, lr=9.58e-5, step_loss=0.0883]
Steps:  13%|█▎        | 133/1000 [02:10<14:02,  1.03it/s, lr=9.57e-5, step_loss=0.17]  
Steps:  13%|█▎        | 133/1000 [02:10<14:02,  1.03it/s, lr=9.57e-5, step_loss=0.0553]
Steps:  13%|█▎        | 133/1000 [02:10<14:02,  1.03it/s, lr=9.57e-5, step_loss=0.0843]
Steps:  13%|█▎        | 133/1000 [02:10<14:02,  1.03it/s, lr=9.57e-5, step_loss=0.158] 
Steps:  13%|█▎        | 134/1000 [02:10<14:01,  1.03it/s, lr=9.57e-5, step_loss=0.158]
Steps:  13%|█▎        | 134/1000 [02:11<14:01,  1.03it/s, lr=9.56e-5, step_loss=0.0125]
Steps:  13%|█▎        | 134/1000 [02:11<14:01,  1.03it/s, lr=9.56e-5, step_loss=0.292] 
Steps:  13%|█▎        | 134/1000 [02:11<14:01,  1.03it/s, lr=9.56e-5, step_loss=0.323]
Steps:  13%|█▎        | 134/1000 [02:11<14:01,  1.03it/s, lr=9.56e-5, step_loss=0.0959]
Steps:  14%|█▎        | 135/1000 [02:11<14:00,  1.03it/s, lr=9.56e-5, step_loss=0.0959]
Steps:  14%|█▎        | 135/1000 [02:12<14:00,  1.03it/s, lr=9.56e-5, step_loss=0.0317]
Steps:  14%|█▎        | 135/1000 [02:12<14:00,  1.03it/s, lr=9.56e-5, step_loss=0.0906]
Steps:  14%|█▎        | 135/1000 [02:12<14:00,  1.03it/s, lr=9.56e-5, step_loss=0.00547]
Steps:  14%|█▎        | 135/1000 [02:12<14:00,  1.03it/s, lr=9.56e-5, step_loss=0.0238] 
Steps:  14%|█▎        | 136/1000 [02:12<13:59,  1.03it/s, lr=9.56e-5, step_loss=0.0238]
Steps:  14%|█▎        | 136/1000 [02:12<13:59,  1.03it/s, lr=9.55e-5, step_loss=0.00576]
Steps:  14%|█▎        | 136/1000 [02:13<13:59,  1.03it/s, lr=9.55e-5, step_loss=0.0464] 
Steps:  14%|█▎        | 136/1000 [02:13<13:59,  1.03it/s, lr=9.55e-5, step_loss=0.0659]
Steps:  14%|█▎        | 136/1000 [02:13<13:59,  1.03it/s, lr=9.55e-5, step_loss=0.07]  
Steps:  14%|█▎        | 137/1000 [02:13<13:59,  1.03it/s, lr=9.55e-5, step_loss=0.07]
Steps:  14%|█▎        | 137/1000 [02:13<13:59,  1.03it/s, lr=9.54e-5, step_loss=0.0412]
Steps:  14%|█▎        | 137/1000 [02:14<13:59,  1.03it/s, lr=9.54e-5, step_loss=0.204] 
Steps:  14%|█▎        | 137/1000 [02:14<13:59,  1.03it/s, lr=9.54e-5, step_loss=0.00571]
Steps:  14%|█▎        | 137/1000 [02:14<13:59,  1.03it/s, lr=9.54e-5, step_loss=0.0109] 
Steps:  14%|█▍        | 138/1000 [02:14<13:58,  1.03it/s, lr=9.54e-5, step_loss=0.0109]
Steps:  14%|█▍        | 138/1000 [02:14<13:58,  1.03it/s, lr=9.54e-5, step_loss=0.0988]
Steps:  14%|█▍        | 138/1000 [02:15<13:58,  1.03it/s, lr=9.54e-5, step_loss=0.0201]
Steps:  14%|█▍        | 138/1000 [02:15<13:58,  1.03it/s, lr=9.54e-5, step_loss=0.0331]
Steps:  14%|█▍        | 138/1000 [02:15<13:58,  1.03it/s, lr=9.54e-5, step_loss=0.00612]
Steps:  14%|█▍        | 139/1000 [02:15<13:56,  1.03it/s, lr=9.54e-5, step_loss=0.00612]
Steps:  14%|█▍        | 139/1000 [02:15<13:56,  1.03it/s, lr=9.53e-5, step_loss=0.0987] 
Steps:  14%|█▍        | 139/1000 [02:16<13:56,  1.03it/s, lr=9.53e-5, step_loss=0.0613]
Steps:  14%|█▍        | 139/1000 [02:16<13:56,  1.03it/s, lr=9.53e-5, step_loss=0.0288]
Steps:  14%|█▍        | 139/1000 [02:16<13:56,  1.03it/s, lr=9.53e-5, step_loss=0.0367]
Steps:  14%|█▍        | 140/1000 [02:16<13:55,  1.03it/s, lr=9.53e-5, step_loss=0.0367]
Steps:  14%|█▍        | 140/1000 [02:16<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.00268]
Steps:  14%|█▍        | 140/1000 [02:17<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.0435] 
Steps:  14%|█▍        | 140/1000 [02:17<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.285] 
Steps:  14%|█▍        | 140/1000 [02:17<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.0754]
Steps:  14%|█▍        | 141/1000 [02:17<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.0754]
Steps:  14%|█▍        | 141/1000 [02:17<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.559] 
Steps:  14%|█▍        | 141/1000 [02:18<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.1]  
Steps:  14%|█▍        | 141/1000 [02:18<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.429]
Steps:  14%|█▍        | 141/1000 [02:18<13:55,  1.03it/s, lr=9.52e-5, step_loss=0.0165]
Steps:  14%|█▍        | 142/1000 [02:18<13:54,  1.03it/s, lr=9.52e-5, step_loss=0.0165]
Steps:  14%|█▍        | 142/1000 [02:18<13:54,  1.03it/s, lr=9.51e-5, step_loss=0.0974]
Steps:  14%|█▍        | 142/1000 [02:19<13:54,  1.03it/s, lr=9.51e-5, step_loss=0.00487]
Steps:  14%|█▍        | 142/1000 [02:19<13:54,  1.03it/s, lr=9.51e-5, step_loss=0.0439] 
Steps:  14%|█▍        | 142/1000 [02:19<13:54,  1.03it/s, lr=9.51e-5, step_loss=0.108] 
Steps:  14%|█▍        | 143/1000 [02:19<13:53,  1.03it/s, lr=9.51e-5, step_loss=0.108]
Steps:  14%|█▍        | 143/1000 [02:19<13:53,  1.03it/s, lr=9.5e-5, step_loss=0.00457]
Steps:  14%|█▍        | 143/1000 [02:20<13:53,  1.03it/s, lr=9.5e-5, step_loss=0.00306]
Steps:  14%|█▍        | 143/1000 [02:20<13:53,  1.03it/s, lr=9.5e-5, step_loss=0.0544] 
Steps:  14%|█▍        | 143/1000 [02:20<13:53,  1.03it/s, lr=9.5e-5, step_loss=0.11]  
Steps:  14%|█▍        | 144/1000 [02:20<13:52,  1.03it/s, lr=9.5e-5, step_loss=0.11]
Steps:  14%|█▍        | 144/1000 [02:20<13:52,  1.03it/s, lr=9.5e-5, step_loss=0.0386]
Steps:  14%|█▍        | 144/1000 [02:20<13:52,  1.03it/s, lr=9.5e-5, step_loss=0.00512]
Steps:  14%|█▍        | 144/1000 [02:21<13:52,  1.03it/s, lr=9.5e-5, step_loss=0.271]  
Steps:  14%|█▍        | 144/1000 [02:21<13:52,  1.03it/s, lr=9.5e-5, step_loss=0.0496]
Steps:  14%|█▍        | 145/1000 [02:21<13:51,  1.03it/s, lr=9.5e-5, step_loss=0.0496]
Steps:  14%|█▍        | 145/1000 [02:21<13:51,  1.03it/s, lr=9.49e-5, step_loss=0.018]
Steps:  14%|█▍        | 145/1000 [02:21<13:51,  1.03it/s, lr=9.49e-5, step_loss=0.135]
Steps:  14%|█▍        | 145/1000 [02:22<13:51,  1.03it/s, lr=9.49e-5, step_loss=0.249]
Steps:  14%|█▍        | 145/1000 [02:22<13:51,  1.03it/s, lr=9.49e-5, step_loss=0.014]
Steps:  15%|█▍        | 146/1000 [02:22<13:50,  1.03it/s, lr=9.49e-5, step_loss=0.014]
Steps:  15%|█▍        | 146/1000 [02:22<13:50,  1.03it/s, lr=9.48e-5, step_loss=0.105]
Steps:  15%|█▍        | 146/1000 [02:22<13:50,  1.03it/s, lr=9.48e-5, step_loss=0.127]
Steps:  15%|█▍        | 146/1000 [02:23<13:50,  1.03it/s, lr=9.48e-5, step_loss=0.0885]
Steps:  15%|█▍        | 146/1000 [02:23<13:50,  1.03it/s, lr=9.48e-5, step_loss=0.0621]
Steps:  15%|█▍        | 147/1000 [02:23<13:49,  1.03it/s, lr=9.48e-5, step_loss=0.0621]
Steps:  15%|█▍        | 147/1000 [02:23<13:49,  1.03it/s, lr=9.48e-5, step_loss=0.00297]
Steps:  15%|█▍        | 147/1000 [02:23<13:49,  1.03it/s, lr=9.48e-5, step_loss=0.00669]
Steps:  15%|█▍        | 147/1000 [02:24<13:49,  1.03it/s, lr=9.48e-5, step_loss=0.153]  
Steps:  15%|█▍        | 147/1000 [02:24<13:49,  1.03it/s, lr=9.48e-5, step_loss=0.293]
Steps:  15%|█▍        | 148/1000 [02:24<13:48,  1.03it/s, lr=9.48e-5, step_loss=0.293]
Steps:  15%|█▍        | 148/1000 [02:24<13:48,  1.03it/s, lr=9.47e-5, step_loss=0.325]
Steps:  15%|█▍        | 148/1000 [02:24<13:48,  1.03it/s, lr=9.47e-5, step_loss=0.0623]
Steps:  15%|█▍        | 148/1000 [02:25<13:48,  1.03it/s, lr=9.47e-5, step_loss=0.138] 
Steps:  15%|█▍        | 148/1000 [02:25<13:48,  1.03it/s, lr=9.47e-5, step_loss=0.105]
Steps:  15%|█▍        | 149/1000 [02:25<13:47,  1.03it/s, lr=9.47e-5, step_loss=0.105]
Steps:  15%|█▍        | 149/1000 [02:25<13:47,  1.03it/s, lr=9.46e-5, step_loss=0.152]
Steps:  15%|█▍        | 149/1000 [02:25<13:47,  1.03it/s, lr=9.46e-5, step_loss=0.0916]
Steps:  15%|█▍        | 149/1000 [02:26<13:47,  1.03it/s, lr=9.46e-5, step_loss=0.00988]
Steps:  15%|█▍        | 149/1000 [02:26<13:47,  1.03it/s, lr=9.46e-5, step_loss=0.0108] 
Steps:  15%|█▌        | 150/1000 [02:26<13:46,  1.03it/s, lr=9.46e-5, step_loss=0.0108]
Steps:  15%|█▌        | 150/1000 [02:26<13:46,  1.03it/s, lr=9.46e-5, step_loss=0.108] 
Steps:  15%|█▌        | 150/1000 [02:26<13:46,  1.03it/s, lr=9.46e-5, step_loss=0.0699]
Steps:  15%|█▌        | 150/1000 [02:27<13:46,  1.03it/s, lr=9.46e-5, step_loss=0.107] 
Steps:  15%|█▌        | 150/1000 [02:27<13:46,  1.03it/s, lr=9.46e-5, step_loss=0.0377]
Steps:  15%|█▌        | 151/1000 [02:27<13:45,  1.03it/s, lr=9.46e-5, step_loss=0.0377]
Steps:  15%|█▌        | 151/1000 [02:27<13:45,  1.03it/s, lr=9.45e-5, step_loss=0.133] 
Steps:  15%|█▌        | 151/1000 [02:27<13:45,  1.03it/s, lr=9.45e-5, step_loss=0.0679]
Steps:  15%|█▌        | 151/1000 [02:28<13:45,  1.03it/s, lr=9.45e-5, step_loss=0.00922]
Steps:  15%|█▌        | 151/1000 [02:28<13:45,  1.03it/s, lr=9.45e-5, step_loss=0.274]  
Steps:  15%|█▌        | 152/1000 [02:28<13:44,  1.03it/s, lr=9.45e-5, step_loss=0.274]
Steps:  15%|█▌        | 152/1000 [02:28<13:44,  1.03it/s, lr=9.44e-5, step_loss=0.569]
Steps:  15%|█▌        | 152/1000 [02:28<13:44,  1.03it/s, lr=9.44e-5, step_loss=0.078]
Steps:  15%|█▌        | 152/1000 [02:29<13:44,  1.03it/s, lr=9.44e-5, step_loss=0.159]
Steps:  15%|█▌        | 152/1000 [02:29<13:44,  1.03it/s, lr=9.44e-5, step_loss=0.0243]
Steps:  15%|█▌        | 153/1000 [02:29<13:43,  1.03it/s, lr=9.44e-5, step_loss=0.0243]
Steps:  15%|█▌        | 153/1000 [02:29<13:43,  1.03it/s, lr=9.43e-5, step_loss=0.0426]
Steps:  15%|█▌        | 153/1000 [02:29<13:43,  1.03it/s, lr=9.43e-5, step_loss=0.0419]
Steps:  15%|█▌        | 153/1000 [02:29<13:43,  1.03it/s, lr=9.43e-5, step_loss=0.0121]
Steps:  15%|█▌        | 153/1000 [02:30<13:43,  1.03it/s, lr=9.43e-5, step_loss=0.00516]
Steps:  15%|█▌        | 154/1000 [02:30<13:42,  1.03it/s, lr=9.43e-5, step_loss=0.00516]
Steps:  15%|█▌        | 154/1000 [02:30<13:42,  1.03it/s, lr=9.43e-5, step_loss=0.0243] 
Steps:  15%|█▌        | 154/1000 [02:30<13:42,  1.03it/s, lr=9.43e-5, step_loss=0.0329]
Steps:  15%|█▌        | 154/1000 [02:30<13:42,  1.03it/s, lr=9.43e-5, step_loss=0.118] 
Steps:  15%|█▌        | 154/1000 [02:31<13:42,  1.03it/s, lr=9.43e-5, step_loss=0.104]
Steps:  16%|█▌        | 155/1000 [02:31<13:41,  1.03it/s, lr=9.43e-5, step_loss=0.104]
Steps:  16%|█▌        | 155/1000 [02:31<13:41,  1.03it/s, lr=9.42e-5, step_loss=0.31] 
Steps:  16%|█▌        | 155/1000 [02:31<13:41,  1.03it/s, lr=9.42e-5, step_loss=0.152]
Steps:  16%|█▌        | 155/1000 [02:31<13:41,  1.03it/s, lr=9.42e-5, step_loss=0.128]
Steps:  16%|█▌        | 155/1000 [02:32<13:41,  1.03it/s, lr=9.42e-5, step_loss=0.00568]
Steps:  16%|█▌        | 156/1000 [02:32<13:41,  1.03it/s, lr=9.42e-5, step_loss=0.00568]
Steps:  16%|█▌        | 156/1000 [02:32<13:41,  1.03it/s, lr=9.41e-5, step_loss=0.0935] 
Steps:  16%|█▌        | 156/1000 [02:32<13:41,  1.03it/s, lr=9.41e-5, step_loss=0.0601]
Steps:  16%|█▌        | 156/1000 [02:32<13:41,  1.03it/s, lr=9.41e-5, step_loss=0.00301]
Steps:  16%|█▌        | 156/1000 [02:33<13:41,  1.03it/s, lr=9.41e-5, step_loss=0.191]  
Steps:  16%|█▌        | 157/1000 [02:33<13:40,  1.03it/s, lr=9.41e-5, step_loss=0.191]
Steps:  16%|█▌        | 157/1000 [02:33<13:40,  1.03it/s, lr=9.4e-5, step_loss=0.115] 
Steps:  16%|█▌        | 157/1000 [02:33<13:40,  1.03it/s, lr=9.4e-5, step_loss=0.0077]
Steps:  16%|█▌        | 157/1000 [02:33<13:40,  1.03it/s, lr=9.4e-5, step_loss=0.123] 
Steps:  16%|█▌        | 157/1000 [02:34<13:40,  1.03it/s, lr=9.4e-5, step_loss=0.00866]
Steps:  16%|█▌        | 158/1000 [02:34<13:39,  1.03it/s, lr=9.4e-5, step_loss=0.00866]
Steps:  16%|█▌        | 158/1000 [02:34<13:39,  1.03it/s, lr=9.4e-5, step_loss=0.0217] 
Steps:  16%|█▌        | 158/1000 [02:34<13:39,  1.03it/s, lr=9.4e-5, step_loss=0.0294]
Steps:  16%|█▌        | 158/1000 [02:34<13:39,  1.03it/s, lr=9.4e-5, step_loss=0.0423]
Steps:  16%|█▌        | 158/1000 [02:35<13:39,  1.03it/s, lr=9.4e-5, step_loss=0.0579]
Steps:  16%|█▌        | 159/1000 [02:35<13:38,  1.03it/s, lr=9.4e-5, step_loss=0.0579]
Steps:  16%|█▌        | 159/1000 [02:35<13:38,  1.03it/s, lr=9.39e-5, step_loss=0.102]
Steps:  16%|█▌        | 159/1000 [02:35<13:38,  1.03it/s, lr=9.39e-5, step_loss=0.0175]
Steps:  16%|█▌        | 159/1000 [02:35<13:38,  1.03it/s, lr=9.39e-5, step_loss=0.0469]
Steps:  16%|█▌        | 159/1000 [02:36<13:38,  1.03it/s, lr=9.39e-5, step_loss=0.128] 
Steps:  16%|█▌        | 160/1000 [02:36<13:37,  1.03it/s, lr=9.39e-5, step_loss=0.128]
Steps:  16%|█▌        | 160/1000 [02:36<13:37,  1.03it/s, lr=9.38e-5, step_loss=0.0998]
Steps:  16%|█▌        | 160/1000 [02:36<13:37,  1.03it/s, lr=9.38e-5, step_loss=0.0164]
Steps:  16%|█▌        | 160/1000 [02:36<13:37,  1.03it/s, lr=9.38e-5, step_loss=0.00898]
Steps:  16%|█▌        | 160/1000 [02:37<13:37,  1.03it/s, lr=9.38e-5, step_loss=0.0557] 
Steps:  16%|█▌        | 161/1000 [02:37<13:36,  1.03it/s, lr=9.38e-5, step_loss=0.0557]
Steps:  16%|█▌        | 161/1000 [02:37<13:36,  1.03it/s, lr=9.37e-5, step_loss=0.0176]
Steps:  16%|█▌        | 161/1000 [02:37<13:36,  1.03it/s, lr=9.37e-5, step_loss=0.034] 
Steps:  16%|█▌        | 161/1000 [02:37<13:36,  1.03it/s, lr=9.37e-5, step_loss=0.00416]
Steps:  16%|█▌        | 161/1000 [02:38<13:36,  1.03it/s, lr=9.37e-5, step_loss=0.0117] 
Steps:  16%|█▌        | 162/1000 [02:38<13:35,  1.03it/s, lr=9.37e-5, step_loss=0.0117]
Steps:  16%|█▌        | 162/1000 [02:38<13:35,  1.03it/s, lr=9.37e-5, step_loss=0.0439]
Steps:  16%|█▌        | 162/1000 [02:38<13:35,  1.03it/s, lr=9.37e-5, step_loss=0.0029]
Steps:  16%|█▌        | 162/1000 [02:38<13:35,  1.03it/s, lr=9.37e-5, step_loss=0.0062]
Steps:  16%|█▌        | 162/1000 [02:38<13:35,  1.03it/s, lr=9.37e-5, step_loss=0.0827]
Steps:  16%|█▋        | 163/1000 [02:39<13:34,  1.03it/s, lr=9.37e-5, step_loss=0.0827]
Steps:  16%|█▋        | 163/1000 [02:39<13:34,  1.03it/s, lr=9.36e-5, step_loss=0.0167]
Steps:  16%|█▋        | 163/1000 [02:39<13:34,  1.03it/s, lr=9.36e-5, step_loss=0.032] 
Steps:  16%|█▋        | 163/1000 [02:39<13:34,  1.03it/s, lr=9.36e-5, step_loss=0.119]
Steps:  16%|█▋        | 163/1000 [02:39<13:34,  1.03it/s, lr=9.36e-5, step_loss=0.076]
Steps:  16%|█▋        | 164/1000 [02:40<13:34,  1.03it/s, lr=9.36e-5, step_loss=0.076]
Steps:  16%|█▋        | 164/1000 [02:40<13:34,  1.03it/s, lr=9.35e-5, step_loss=0.272]
Steps:  16%|█▋        | 164/1000 [02:40<13:34,  1.03it/s, lr=9.35e-5, step_loss=0.0627]
Steps:  16%|█▋        | 164/1000 [02:40<13:34,  1.03it/s, lr=9.35e-5, step_loss=0.0267]
Steps:  16%|█▋        | 164/1000 [02:40<13:34,  1.03it/s, lr=9.35e-5, step_loss=0.0103]
Steps:  16%|█▋        | 165/1000 [02:41<13:32,  1.03it/s, lr=9.35e-5, step_loss=0.0103]
Steps:  16%|█▋        | 165/1000 [02:41<13:32,  1.03it/s, lr=9.34e-5, step_loss=0.0208]
Steps:  16%|█▋        | 165/1000 [02:41<13:32,  1.03it/s, lr=9.34e-5, step_loss=0.328] 
Steps:  16%|█▋        | 165/1000 [02:41<13:32,  1.03it/s, lr=9.34e-5, step_loss=0.0225]
Steps:  16%|█▋        | 165/1000 [02:41<13:32,  1.03it/s, lr=9.34e-5, step_loss=0.0645]
Steps:  17%|█▋        | 166/1000 [02:42<13:31,  1.03it/s, lr=9.34e-5, step_loss=0.0645]
Steps:  17%|█▋        | 166/1000 [02:42<13:31,  1.03it/s, lr=9.34e-5, step_loss=0.0831]
Steps:  17%|█▋        | 166/1000 [02:42<13:31,  1.03it/s, lr=9.34e-5, step_loss=0.00906]
Steps:  17%|█▋        | 166/1000 [02:42<13:31,  1.03it/s, lr=9.34e-5, step_loss=0.0414] 
Steps:  17%|█▋        | 166/1000 [02:42<13:31,  1.03it/s, lr=9.34e-5, step_loss=0.0724]
Steps:  17%|█▋        | 167/1000 [02:43<13:30,  1.03it/s, lr=9.34e-5, step_loss=0.0724]
Steps:  17%|█▋        | 167/1000 [02:43<13:30,  1.03it/s, lr=9.33e-5, step_loss=0.0355]
Steps:  17%|█▋        | 167/1000 [02:43<13:30,  1.03it/s, lr=9.33e-5, step_loss=0.0177]
Steps:  17%|█▋        | 167/1000 [02:43<13:30,  1.03it/s, lr=9.33e-5, step_loss=0.274] 
Steps:  17%|█▋        | 167/1000 [02:43<13:30,  1.03it/s, lr=9.33e-5, step_loss=0.0907]
Steps:  17%|█▋        | 168/1000 [02:44<13:29,  1.03it/s, lr=9.33e-5, step_loss=0.0907]
Steps:  17%|█▋        | 168/1000 [02:44<13:29,  1.03it/s, lr=9.32e-5, step_loss=0.0882]
Steps:  17%|█▋        | 168/1000 [02:44<13:29,  1.03it/s, lr=9.32e-5, step_loss=0.429] 
Steps:  17%|█▋        | 168/1000 [02:44<13:29,  1.03it/s, lr=9.32e-5, step_loss=0.00458]
Steps:  17%|█▋        | 168/1000 [02:44<13:29,  1.03it/s, lr=9.32e-5, step_loss=0.0254] 
Steps:  17%|█▋        | 169/1000 [02:45<13:27,  1.03it/s, lr=9.32e-5, step_loss=0.0254]
Steps:  17%|█▋        | 169/1000 [02:45<13:27,  1.03it/s, lr=9.31e-5, step_loss=0.247] 
Steps:  17%|█▋        | 169/1000 [02:45<13:27,  1.03it/s, lr=9.31e-5, step_loss=0.0179]
Steps:  17%|█▋        | 169/1000 [02:45<13:27,  1.03it/s, lr=9.31e-5, step_loss=0.00721]
Steps:  17%|█▋        | 169/1000 [02:45<13:27,  1.03it/s, lr=9.31e-5, step_loss=0.00722]
Steps:  17%|█▋        | 170/1000 [02:46<13:26,  1.03it/s, lr=9.31e-5, step_loss=0.00722]
Steps:  17%|█▋        | 170/1000 [02:46<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.0137]  
Steps:  17%|█▋        | 170/1000 [02:46<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.138] 
Steps:  17%|█▋        | 170/1000 [02:46<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.0944]
Steps:  17%|█▋        | 170/1000 [02:46<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.0832]
Steps:  17%|█▋        | 171/1000 [02:46<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.0832]
Steps:  17%|█▋        | 171/1000 [02:47<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.105] 
Steps:  17%|█▋        | 171/1000 [02:47<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.084]
Steps:  17%|█▋        | 171/1000 [02:47<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.109]
Steps:  17%|█▋        | 171/1000 [02:47<13:26,  1.03it/s, lr=9.3e-5, step_loss=0.114]
Steps:  17%|█▋        | 172/1000 [02:47<13:25,  1.03it/s, lr=9.3e-5, step_loss=0.114]
Steps:  17%|█▋        | 172/1000 [02:47<13:25,  1.03it/s, lr=9.29e-5, step_loss=0.0399]
Steps:  17%|█▋        | 172/1000 [02:48<13:25,  1.03it/s, lr=9.29e-5, step_loss=0.105] 
Steps:  17%|█▋        | 172/1000 [02:48<13:25,  1.03it/s, lr=9.29e-5, step_loss=0.0848]
Steps:  17%|█▋        | 172/1000 [02:48<13:25,  1.03it/s, lr=9.29e-5, step_loss=0.0383]
Steps:  17%|█▋        | 173/1000 [02:48<13:24,  1.03it/s, lr=9.29e-5, step_loss=0.0383]
Steps:  17%|█▋        | 173/1000 [02:48<13:24,  1.03it/s, lr=9.28e-5, step_loss=0.0213]
Steps:  17%|█▋        | 173/1000 [02:49<13:24,  1.03it/s, lr=9.28e-5, step_loss=0.011] 
Steps:  17%|█▋        | 173/1000 [02:49<13:24,  1.03it/s, lr=9.28e-5, step_loss=0.0928]
Steps:  17%|█▋        | 173/1000 [02:49<13:24,  1.03it/s, lr=9.28e-5, step_loss=0.0562]
Steps:  17%|█▋        | 174/1000 [02:49<13:24,  1.03it/s, lr=9.28e-5, step_loss=0.0562]
Steps:  17%|█▋        | 174/1000 [02:49<13:24,  1.03it/s, lr=9.27e-5, step_loss=0.0935]
Steps:  17%|█▋        | 174/1000 [02:50<13:24,  1.03it/s, lr=9.27e-5, step_loss=0.0105]
Steps:  17%|█▋        | 174/1000 [02:50<13:24,  1.03it/s, lr=9.27e-5, step_loss=0.037] 
Steps:  17%|█▋        | 174/1000 [02:50<13:24,  1.03it/s, lr=9.27e-5, step_loss=0.193]
Steps:  18%|█▊        | 175/1000 [02:50<13:22,  1.03it/s, lr=9.27e-5, step_loss=0.193]
Steps:  18%|█▊        | 175/1000 [02:50<13:22,  1.03it/s, lr=9.26e-5, step_loss=0.0503]
Steps:  18%|█▊        | 175/1000 [02:51<13:22,  1.03it/s, lr=9.26e-5, step_loss=0.00758]
Steps:  18%|█▊        | 175/1000 [02:51<13:22,  1.03it/s, lr=9.26e-5, step_loss=0.031]  
Steps:  18%|█▊        | 175/1000 [02:51<13:22,  1.03it/s, lr=9.26e-5, step_loss=0.0586]
Steps:  18%|█▊        | 176/1000 [02:51<13:21,  1.03it/s, lr=9.26e-5, step_loss=0.0586]
Steps:  18%|█▊        | 176/1000 [02:51<13:21,  1.03it/s, lr=9.25e-5, step_loss=0.00809]
Steps:  18%|█▊        | 176/1000 [02:52<13:21,  1.03it/s, lr=9.25e-5, step_loss=0.156]  
Steps:  18%|█▊        | 176/1000 [02:52<13:21,  1.03it/s, lr=9.25e-5, step_loss=0.0281]
Steps:  18%|█▊        | 176/1000 [02:52<13:21,  1.03it/s, lr=9.25e-5, step_loss=0.164] 
Steps:  18%|█▊        | 177/1000 [02:52<13:20,  1.03it/s, lr=9.25e-5, step_loss=0.164]
Steps:  18%|█▊        | 177/1000 [02:52<13:20,  1.03it/s, lr=9.25e-5, step_loss=0.0328]
Steps:  18%|█▊        | 177/1000 [02:53<13:20,  1.03it/s, lr=9.25e-5, step_loss=0.192] 
Steps:  18%|█▊        | 177/1000 [02:53<13:20,  1.03it/s, lr=9.25e-5, step_loss=0.0491]
Steps:  18%|█▊        | 177/1000 [02:53<13:20,  1.03it/s, lr=9.25e-5, step_loss=0.156] 
Steps:  18%|█▊        | 178/1000 [02:53<13:19,  1.03it/s, lr=9.25e-5, step_loss=0.156]
Steps:  18%|█▊        | 178/1000 [02:53<13:19,  1.03it/s, lr=9.24e-5, step_loss=0.0416]
Steps:  18%|█▊        | 178/1000 [02:54<13:19,  1.03it/s, lr=9.24e-5, step_loss=0.0565]
Steps:  18%|█▊        | 178/1000 [02:54<13:19,  1.03it/s, lr=9.24e-5, step_loss=0.0514]
Steps:  18%|█▊        | 178/1000 [02:54<13:19,  1.03it/s, lr=9.24e-5, step_loss=0.103] 
Steps:  18%|█▊        | 179/1000 [02:54<13:18,  1.03it/s, lr=9.24e-5, step_loss=0.103]
Steps:  18%|█▊        | 179/1000 [02:54<13:18,  1.03it/s, lr=9.23e-5, step_loss=0.034]
Steps:  18%|█▊        | 179/1000 [02:55<13:18,  1.03it/s, lr=9.23e-5, step_loss=0.00666]
Steps:  18%|█▊        | 179/1000 [02:55<13:18,  1.03it/s, lr=9.23e-5, step_loss=0.128]  
Steps:  18%|█▊        | 179/1000 [02:55<13:18,  1.03it/s, lr=9.23e-5, step_loss=0.0874]
Steps:  18%|█▊        | 180/1000 [02:55<13:17,  1.03it/s, lr=9.23e-5, step_loss=0.0874]
Steps:  18%|█▊        | 180/1000 [02:55<13:17,  1.03it/s, lr=9.22e-5, step_loss=0.00198]
Steps:  18%|█▊        | 180/1000 [02:56<13:17,  1.03it/s, lr=9.22e-5, step_loss=0.343]  
Steps:  18%|█▊        | 180/1000 [02:56<13:17,  1.03it/s, lr=9.22e-5, step_loss=0.0573]
Steps:  18%|█▊        | 180/1000 [02:56<13:17,  1.03it/s, lr=9.22e-5, step_loss=0.126] 
Steps:  18%|█▊        | 181/1000 [02:56<13:16,  1.03it/s, lr=9.22e-5, step_loss=0.126]
Steps:  18%|█▊        | 181/1000 [02:56<13:16,  1.03it/s, lr=9.21e-5, step_loss=0.0491]
Steps:  18%|█▊        | 181/1000 [02:56<13:16,  1.03it/s, lr=9.21e-5, step_loss=0.0421]
Steps:  18%|█▊        | 181/1000 [02:57<13:16,  1.03it/s, lr=9.21e-5, step_loss=0.0939]
Steps:  18%|█▊        | 181/1000 [02:57<13:16,  1.03it/s, lr=9.21e-5, step_loss=0.0991]
Steps:  18%|█▊        | 182/1000 [02:57<13:16,  1.03it/s, lr=9.21e-5, step_loss=0.0991]
Steps:  18%|█▊        | 182/1000 [02:57<13:16,  1.03it/s, lr=9.2e-5, step_loss=0.00575]
Steps:  18%|█▊        | 182/1000 [02:57<13:16,  1.03it/s, lr=9.2e-5, step_loss=0.0069] 
Steps:  18%|█▊        | 182/1000 [02:58<13:16,  1.03it/s, lr=9.2e-5, step_loss=0.0371]
Steps:  18%|█▊        | 182/1000 [02:58<13:16,  1.03it/s, lr=9.2e-5, step_loss=0.00451]
Steps:  18%|█▊        | 183/1000 [02:58<13:15,  1.03it/s, lr=9.2e-5, step_loss=0.00451]
Steps:  18%|█▊        | 183/1000 [02:58<13:15,  1.03it/s, lr=9.2e-5, step_loss=0.081]  
Steps:  18%|█▊        | 183/1000 [02:58<13:15,  1.03it/s, lr=9.2e-5, step_loss=0.0246]
Steps:  18%|█▊        | 183/1000 [02:59<13:15,  1.03it/s, lr=9.2e-5, step_loss=0.071] 
Steps:  18%|█▊        | 183/1000 [02:59<13:15,  1.03it/s, lr=9.2e-5, step_loss=0.0595]
Steps:  18%|█▊        | 184/1000 [02:59<13:14,  1.03it/s, lr=9.2e-5, step_loss=0.0595]
Steps:  18%|█▊        | 184/1000 [02:59<13:14,  1.03it/s, lr=9.19e-5, step_loss=0.091]
Steps:  18%|█▊        | 184/1000 [02:59<13:14,  1.03it/s, lr=9.19e-5, step_loss=0.296]
Steps:  18%|█▊        | 184/1000 [03:00<13:14,  1.03it/s, lr=9.19e-5, step_loss=0.041]
Steps:  18%|█▊        | 184/1000 [03:00<13:14,  1.03it/s, lr=9.19e-5, step_loss=0.17] 
Steps:  18%|█▊        | 185/1000 [03:00<13:13,  1.03it/s, lr=9.19e-5, step_loss=0.17]
Steps:  18%|█▊        | 185/1000 [03:00<13:13,  1.03it/s, lr=9.18e-5, step_loss=0.551]
Steps:  18%|█▊        | 185/1000 [03:00<13:13,  1.03it/s, lr=9.18e-5, step_loss=0.568]
Steps:  18%|█▊        | 185/1000 [03:01<13:13,  1.03it/s, lr=9.18e-5, step_loss=0.0656]
Steps:  18%|█▊        | 185/1000 [03:01<13:13,  1.03it/s, lr=9.18e-5, step_loss=0.114] 
Steps:  19%|█▊        | 186/1000 [03:01<13:12,  1.03it/s, lr=9.18e-5, step_loss=0.114]
Steps:  19%|█▊        | 186/1000 [03:01<13:12,  1.03it/s, lr=9.17e-5, step_loss=0.0155]
Steps:  19%|█▊        | 186/1000 [03:01<13:12,  1.03it/s, lr=9.17e-5, step_loss=0.0597]
Steps:  19%|█▊        | 186/1000 [03:02<13:12,  1.03it/s, lr=9.17e-5, step_loss=0.0769]
Steps:  19%|█▊        | 186/1000 [03:02<13:12,  1.03it/s, lr=9.17e-5, step_loss=0.0689]
Steps:  19%|█▊        | 187/1000 [03:02<13:11,  1.03it/s, lr=9.17e-5, step_loss=0.0689]
Steps:  19%|█▊        | 187/1000 [03:02<13:11,  1.03it/s, lr=9.16e-5, step_loss=0.0309]
Steps:  19%|█▊        | 187/1000 [03:02<13:11,  1.03it/s, lr=9.16e-5, step_loss=0.0246]
Steps:  19%|█▊        | 187/1000 [03:03<13:11,  1.03it/s, lr=9.16e-5, step_loss=0.254] 
Steps:  19%|█▊        | 187/1000 [03:03<13:11,  1.03it/s, lr=9.16e-5, step_loss=0.0767]
Steps:  19%|█▉        | 188/1000 [03:03<13:10,  1.03it/s, lr=9.16e-5, step_loss=0.0767]
Steps:  19%|█▉        | 188/1000 [03:03<13:10,  1.03it/s, lr=9.15e-5, step_loss=0.0901]
Steps:  19%|█▉        | 188/1000 [03:03<13:10,  1.03it/s, lr=9.15e-5, step_loss=0.0492]
Steps:  19%|█▉        | 188/1000 [03:04<13:10,  1.03it/s, lr=9.15e-5, step_loss=0.00369]
Steps:  19%|█▉        | 188/1000 [03:04<13:10,  1.03it/s, lr=9.15e-5, step_loss=0.145]  
Steps:  19%|█▉        | 189/1000 [03:04<13:09,  1.03it/s, lr=9.15e-5, step_loss=0.145]
Steps:  19%|█▉        | 189/1000 [03:04<13:09,  1.03it/s, lr=9.14e-5, step_loss=0.0899]
Steps:  19%|█▉        | 189/1000 [03:04<13:09,  1.03it/s, lr=9.14e-5, step_loss=0.0773]
Steps:  19%|█▉        | 189/1000 [03:05<13:09,  1.03it/s, lr=9.14e-5, step_loss=0.00854]
Steps:  19%|█▉        | 189/1000 [03:05<13:09,  1.03it/s, lr=9.14e-5, step_loss=0.0765] 
Steps:  19%|█▉        | 190/1000 [03:05<13:08,  1.03it/s, lr=9.14e-5, step_loss=0.0765]
Steps:  19%|█▉        | 190/1000 [03:05<13:08,  1.03it/s, lr=9.14e-5, step_loss=0.0852]
Steps:  19%|█▉        | 190/1000 [03:05<13:08,  1.03it/s, lr=9.14e-5, step_loss=0.272] 
Steps:  19%|█▉        | 190/1000 [03:06<13:08,  1.03it/s, lr=9.14e-5, step_loss=0.00311]
Steps:  19%|█▉        | 190/1000 [03:06<13:08,  1.03it/s, lr=9.14e-5, step_loss=0.163]  
Steps:  19%|█▉        | 191/1000 [03:06<13:08,  1.03it/s, lr=9.14e-5, step_loss=0.163]
Steps:  19%|█▉        | 191/1000 [03:06<13:08,  1.03it/s, lr=9.13e-5, step_loss=0.391]
Steps:  19%|█▉        | 191/1000 [03:06<13:08,  1.03it/s, lr=9.13e-5, step_loss=0.027]
Steps:  19%|█▉        | 191/1000 [03:06<13:08,  1.03it/s, lr=9.13e-5, step_loss=0.042]
Steps:  19%|█▉        | 191/1000 [03:07<13:08,  1.03it/s, lr=9.13e-5, step_loss=0.187]
Steps:  19%|█▉        | 192/1000 [03:07<13:07,  1.03it/s, lr=9.13e-5, step_loss=0.187]
Steps:  19%|█▉        | 192/1000 [03:07<13:07,  1.03it/s, lr=9.12e-5, step_loss=0.0845]
Steps:  19%|█▉        | 192/1000 [03:07<13:07,  1.03it/s, lr=9.12e-5, step_loss=0.118] 
Steps:  19%|█▉        | 192/1000 [03:07<13:07,  1.03it/s, lr=9.12e-5, step_loss=0.0108]
Steps:  19%|█▉        | 192/1000 [03:08<13:07,  1.03it/s, lr=9.12e-5, step_loss=0.174] 
Steps:  19%|█▉        | 193/1000 [03:08<13:05,  1.03it/s, lr=9.12e-5, step_loss=0.174]
Steps:  19%|█▉        | 193/1000 [03:08<13:05,  1.03it/s, lr=9.11e-5, step_loss=0.177]
Steps:  19%|█▉        | 193/1000 [03:08<13:05,  1.03it/s, lr=9.11e-5, step_loss=0.0898]
Steps:  19%|█▉        | 193/1000 [03:08<13:05,  1.03it/s, lr=9.11e-5, step_loss=0.0275]
Steps:  19%|█▉        | 193/1000 [03:09<13:05,  1.03it/s, lr=9.11e-5, step_loss=0.317] 
Steps:  19%|█▉        | 194/1000 [03:09<13:05,  1.03it/s, lr=9.11e-5, step_loss=0.317]
Steps:  19%|█▉        | 194/1000 [03:09<13:05,  1.03it/s, lr=9.1e-5, step_loss=0.0622]
Steps:  19%|█▉        | 194/1000 [03:09<13:05,  1.03it/s, lr=9.1e-5, step_loss=0.0206]
Steps:  19%|█▉        | 194/1000 [03:09<13:05,  1.03it/s, lr=9.1e-5, step_loss=0.0127]
Steps:  19%|█▉        | 194/1000 [03:10<13:05,  1.03it/s, lr=9.1e-5, step_loss=0.00477]
Steps:  20%|█▉        | 195/1000 [03:10<13:04,  1.03it/s, lr=9.1e-5, step_loss=0.00477]
Steps:  20%|█▉        | 195/1000 [03:10<13:04,  1.03it/s, lr=9.09e-5, step_loss=0.0386]
Steps:  20%|█▉        | 195/1000 [03:10<13:04,  1.03it/s, lr=9.09e-5, step_loss=0.0225]
Steps:  20%|█▉        | 195/1000 [03:10<13:04,  1.03it/s, lr=9.09e-5, step_loss=0.0519]
Steps:  20%|█▉        | 195/1000 [03:11<13:04,  1.03it/s, lr=9.09e-5, step_loss=0.00273]
Steps:  20%|█▉        | 196/1000 [03:11<13:02,  1.03it/s, lr=9.09e-5, step_loss=0.00273]
Steps:  20%|█▉        | 196/1000 [03:11<13:02,  1.03it/s, lr=9.08e-5, step_loss=0.0377] 
Steps:  20%|█▉        | 196/1000 [03:11<13:02,  1.03it/s, lr=9.08e-5, step_loss=0.167] 
Steps:  20%|█▉        | 196/1000 [03:11<13:02,  1.03it/s, lr=9.08e-5, step_loss=0.014]
Steps:  20%|█▉        | 196/1000 [03:12<13:02,  1.03it/s, lr=9.08e-5, step_loss=0.0268]
Steps:  20%|█▉        | 197/1000 [03:12<13:01,  1.03it/s, lr=9.08e-5, step_loss=0.0268]
Steps:  20%|█▉        | 197/1000 [03:12<13:01,  1.03it/s, lr=9.07e-5, step_loss=0.0654]
Steps:  20%|█▉        | 197/1000 [03:12<13:01,  1.03it/s, lr=9.07e-5, step_loss=0.0341]
Steps:  20%|█▉        | 197/1000 [03:12<13:01,  1.03it/s, lr=9.07e-5, step_loss=0.00241]
Steps:  20%|█▉        | 197/1000 [03:13<13:01,  1.03it/s, lr=9.07e-5, step_loss=0.0721] 
Steps:  20%|█▉        | 198/1000 [03:13<13:00,  1.03it/s, lr=9.07e-5, step_loss=0.0721]
Steps:  20%|█▉        | 198/1000 [03:13<13:00,  1.03it/s, lr=9.06e-5, step_loss=0.0217]
Steps:  20%|█▉        | 198/1000 [03:13<13:00,  1.03it/s, lr=9.06e-5, step_loss=0.104] 
Steps:  20%|█▉        | 198/1000 [03:13<13:00,  1.03it/s, lr=9.06e-5, step_loss=0.0621]
Steps:  20%|█▉        | 198/1000 [03:14<13:00,  1.03it/s, lr=9.06e-5, step_loss=0.00787]
Steps:  20%|█▉        | 199/1000 [03:14<12:59,  1.03it/s, lr=9.06e-5, step_loss=0.00787]
Steps:  20%|█▉        | 199/1000 [03:14<12:59,  1.03it/s, lr=9.05e-5, step_loss=0.0455] 
Steps:  20%|█▉        | 199/1000 [03:14<12:59,  1.03it/s, lr=9.05e-5, step_loss=0.0169]
Steps:  20%|█▉        | 199/1000 [03:14<12:59,  1.03it/s, lr=9.05e-5, step_loss=0.104] 
Steps:  20%|█▉        | 199/1000 [03:15<12:59,  1.03it/s, lr=9.05e-5, step_loss=0.139]
Steps:  20%|██        | 200/1000 [03:15<12:58,  1.03it/s, lr=9.05e-5, step_loss=0.139]
Steps:  20%|██        | 200/1000 [03:15<12:58,  1.03it/s, lr=9.05e-5, step_loss=0.0564]
Steps:  20%|██        | 200/1000 [03:15<12:58,  1.03it/s, lr=9.05e-5, step_loss=0.0218]
Steps:  20%|██        | 200/1000 [03:15<12:58,  1.03it/s, lr=9.05e-5, step_loss=0.0378]
Steps:  20%|██        | 200/1000 [03:15<12:58,  1.03it/s, lr=9.05e-5, step_loss=0.423] 
Steps:  20%|██        | 201/1000 [03:16<12:57,  1.03it/s, lr=9.05e-5, step_loss=0.423]
Steps:  20%|██        | 201/1000 [03:16<12:57,  1.03it/s, lr=9.04e-5, step_loss=0.0884]
Steps:  20%|██        | 201/1000 [03:16<12:57,  1.03it/s, lr=9.04e-5, step_loss=0.00491]
Steps:  20%|██        | 201/1000 [03:16<12:57,  1.03it/s, lr=9.04e-5, step_loss=0.0208] 
Steps:  20%|██        | 201/1000 [03:16<12:57,  1.03it/s, lr=9.04e-5, step_loss=0.0834]
Steps:  20%|██        | 202/1000 [03:17<12:56,  1.03it/s, lr=9.04e-5, step_loss=0.0834]
Steps:  20%|██        | 202/1000 [03:17<12:56,  1.03it/s, lr=9.03e-5, step_loss=0.0674]
Steps:  20%|██        | 202/1000 [03:17<12:56,  1.03it/s, lr=9.03e-5, step_loss=0.033] 
Steps:  20%|██        | 202/1000 [03:17<12:56,  1.03it/s, lr=9.03e-5, step_loss=0.0146]
Steps:  20%|██        | 202/1000 [03:17<12:56,  1.03it/s, lr=9.03e-5, step_loss=0.00582]
Steps:  20%|██        | 203/1000 [03:18<12:55,  1.03it/s, lr=9.03e-5, step_loss=0.00582]
Steps:  20%|██        | 203/1000 [03:18<12:55,  1.03it/s, lr=9.02e-5, step_loss=0.0124] 
Steps:  20%|██        | 203/1000 [03:18<12:55,  1.03it/s, lr=9.02e-5, step_loss=0.00812]
Steps:  20%|██        | 203/1000 [03:18<12:55,  1.03it/s, lr=9.02e-5, step_loss=0.111]  
Steps:  20%|██        | 203/1000 [03:18<12:55,  1.03it/s, lr=9.02e-5, step_loss=0.0259]
Steps:  20%|██        | 204/1000 [03:19<12:54,  1.03it/s, lr=9.02e-5, step_loss=0.0259]
Steps:  20%|██        | 204/1000 [03:19<12:54,  1.03it/s, lr=9.01e-5, step_loss=0.00301]
Steps:  20%|██        | 204/1000 [03:19<12:54,  1.03it/s, lr=9.01e-5, step_loss=0.00527]
Steps:  20%|██        | 204/1000 [03:19<12:54,  1.03it/s, lr=9.01e-5, step_loss=0.117]  
Steps:  20%|██        | 204/1000 [03:19<12:54,  1.03it/s, lr=9.01e-5, step_loss=0.0303]
Steps:  20%|██        | 205/1000 [03:20<12:53,  1.03it/s, lr=9.01e-5, step_loss=0.0303]
Steps:  20%|██        | 205/1000 [03:20<12:53,  1.03it/s, lr=9e-5, step_loss=0.0196]   
Steps:  20%|██        | 205/1000 [03:20<12:53,  1.03it/s, lr=9e-5, step_loss=0.15]  
Steps:  20%|██        | 205/1000 [03:20<12:53,  1.03it/s, lr=9e-5, step_loss=0.00383]
Steps:  20%|██        | 205/1000 [03:20<12:53,  1.03it/s, lr=9e-5, step_loss=0.0356] 
Steps:  21%|██        | 206/1000 [03:21<12:52,  1.03it/s, lr=9e-5, step_loss=0.0356]
Steps:  21%|██        | 206/1000 [03:21<12:52,  1.03it/s, lr=8.99e-5, step_loss=0.0169]
Steps:  21%|██        | 206/1000 [03:21<12:52,  1.03it/s, lr=8.99e-5, step_loss=0.0573]
Steps:  21%|██        | 206/1000 [03:21<12:52,  1.03it/s, lr=8.99e-5, step_loss=0.00892]
Steps:  21%|██        | 206/1000 [03:21<12:52,  1.03it/s, lr=8.99e-5, step_loss=0.111]  
Steps:  21%|██        | 207/1000 [03:22<12:51,  1.03it/s, lr=8.99e-5, step_loss=0.111]
Steps:  21%|██        | 207/1000 [03:22<12:51,  1.03it/s, lr=8.98e-5, step_loss=0.0133]
Steps:  21%|██        | 207/1000 [03:22<12:51,  1.03it/s, lr=8.98e-5, step_loss=0.0722]
Steps:  21%|██        | 207/1000 [03:22<12:51,  1.03it/s, lr=8.98e-5, step_loss=0.113] 
Steps:  21%|██        | 207/1000 [03:22<12:51,  1.03it/s, lr=8.98e-5, step_loss=0.0261]
Steps:  21%|██        | 208/1000 [03:22<12:50,  1.03it/s, lr=8.98e-5, step_loss=0.0261]
Steps:  21%|██        | 208/1000 [03:23<12:50,  1.03it/s, lr=8.97e-5, step_loss=0.112] 
Steps:  21%|██        | 208/1000 [03:23<12:50,  1.03it/s, lr=8.97e-5, step_loss=0.249]
Steps:  21%|██        | 208/1000 [03:23<12:50,  1.03it/s, lr=8.97e-5, step_loss=0.00357]
Steps:  21%|██        | 208/1000 [03:23<12:50,  1.03it/s, lr=8.97e-5, step_loss=0.0522] 
Steps:  21%|██        | 209/1000 [03:23<12:49,  1.03it/s, lr=8.97e-5, step_loss=0.0522]
Steps:  21%|██        | 209/1000 [03:24<12:49,  1.03it/s, lr=8.96e-5, step_loss=0.297] 
Steps:  21%|██        | 209/1000 [03:24<12:49,  1.03it/s, lr=8.96e-5, step_loss=0.0982]
Steps:  21%|██        | 209/1000 [03:24<12:49,  1.03it/s, lr=8.96e-5, step_loss=0.187] 
Steps:  21%|██        | 209/1000 [03:24<12:49,  1.03it/s, lr=8.96e-5, step_loss=0.0525]
Steps:  21%|██        | 210/1000 [03:24<12:48,  1.03it/s, lr=8.96e-5, step_loss=0.0525]
Steps:  21%|██        | 210/1000 [03:24<12:48,  1.03it/s, lr=8.95e-5, step_loss=0.208] 
Steps:  21%|██        | 210/1000 [03:25<12:48,  1.03it/s, lr=8.95e-5, step_loss=0.0424]
Steps:  21%|██        | 210/1000 [03:25<12:48,  1.03it/s, lr=8.95e-5, step_loss=0.00562]
Steps:  21%|██        | 210/1000 [03:25<12:48,  1.03it/s, lr=8.95e-5, step_loss=0.0156] 
Steps:  21%|██        | 211/1000 [03:25<12:47,  1.03it/s, lr=8.95e-5, step_loss=0.0156]
Steps:  21%|██        | 211/1000 [03:25<12:47,  1.03it/s, lr=8.94e-5, step_loss=0.00695]
Steps:  21%|██        | 211/1000 [03:26<12:47,  1.03it/s, lr=8.94e-5, step_loss=0.00477]
Steps:  21%|██        | 211/1000 [03:26<12:47,  1.03it/s, lr=8.94e-5, step_loss=0.245]  
Steps:  21%|██        | 211/1000 [03:26<12:47,  1.03it/s, lr=8.94e-5, step_loss=0.0718]
Steps:  21%|██        | 212/1000 [03:26<12:46,  1.03it/s, lr=8.94e-5, step_loss=0.0718]
Steps:  21%|██        | 212/1000 [03:26<12:46,  1.03it/s, lr=8.93e-5, step_loss=0.0834]
Steps:  21%|██        | 212/1000 [03:27<12:46,  1.03it/s, lr=8.93e-5, step_loss=0.0899]
Steps:  21%|██        | 212/1000 [03:27<12:46,  1.03it/s, lr=8.93e-5, step_loss=0.0594]
Steps:  21%|██        | 212/1000 [03:27<12:46,  1.03it/s, lr=8.93e-5, step_loss=0.0314]
Steps:  21%|██▏       | 213/1000 [03:27<12:45,  1.03it/s, lr=8.93e-5, step_loss=0.0314]
Steps:  21%|██▏       | 213/1000 [03:27<12:45,  1.03it/s, lr=8.92e-5, step_loss=0.00268]
Steps:  21%|██▏       | 213/1000 [03:28<12:45,  1.03it/s, lr=8.92e-5, step_loss=0.00269]
Steps:  21%|██▏       | 213/1000 [03:28<12:45,  1.03it/s, lr=8.92e-5, step_loss=0.121]  
Steps:  21%|██▏       | 213/1000 [03:28<12:45,  1.03it/s, lr=8.92e-5, step_loss=0.11] 
Steps:  21%|██▏       | 214/1000 [03:28<12:44,  1.03it/s, lr=8.92e-5, step_loss=0.11]
Steps:  21%|██▏       | 214/1000 [03:28<12:44,  1.03it/s, lr=8.91e-5, step_loss=0.0824]
Steps:  21%|██▏       | 214/1000 [03:29<12:44,  1.03it/s, lr=8.91e-5, step_loss=0.0139]
Steps:  21%|██▏       | 214/1000 [03:29<12:44,  1.03it/s, lr=8.91e-5, step_loss=0.0628]
Steps:  21%|██▏       | 214/1000 [03:29<12:44,  1.03it/s, lr=8.91e-5, step_loss=0.0416]
Steps:  22%|██▏       | 215/1000 [03:29<12:43,  1.03it/s, lr=8.91e-5, step_loss=0.0416]
Steps:  22%|██▏       | 215/1000 [03:29<12:43,  1.03it/s, lr=8.9e-5, step_loss=0.234]  
Steps:  22%|██▏       | 215/1000 [03:30<12:43,  1.03it/s, lr=8.9e-5, step_loss=0.107]
Steps:  22%|██▏       | 215/1000 [03:30<12:43,  1.03it/s, lr=8.9e-5, step_loss=0.0364]
Steps:  22%|██▏       | 215/1000 [03:30<12:43,  1.03it/s, lr=8.9e-5, step_loss=0.0532]
Steps:  22%|██▏       | 216/1000 [03:30<12:42,  1.03it/s, lr=8.9e-5, step_loss=0.0532]
Steps:  22%|██▏       | 216/1000 [03:30<12:42,  1.03it/s, lr=8.89e-5, step_loss=0.247]
Steps:  22%|██▏       | 216/1000 [03:31<12:42,  1.03it/s, lr=8.89e-5, step_loss=0.0523]
Steps:  22%|██▏       | 216/1000 [03:31<12:42,  1.03it/s, lr=8.89e-5, step_loss=0.00264]
Steps:  22%|██▏       | 216/1000 [03:31<12:42,  1.03it/s, lr=8.89e-5, step_loss=0.106]  
Steps:  22%|██▏       | 217/1000 [03:31<12:41,  1.03it/s, lr=8.89e-5, step_loss=0.106]
Steps:  22%|██▏       | 217/1000 [03:31<12:41,  1.03it/s, lr=8.88e-5, step_loss=0.0713]
Steps:  22%|██▏       | 217/1000 [03:32<12:41,  1.03it/s, lr=8.88e-5, step_loss=0.102] 
Steps:  22%|██▏       | 217/1000 [03:32<12:41,  1.03it/s, lr=8.88e-5, step_loss=0.048]
Steps:  22%|██▏       | 217/1000 [03:32<12:41,  1.03it/s, lr=8.88e-5, step_loss=0.0449]
Steps:  22%|██▏       | 218/1000 [03:32<12:40,  1.03it/s, lr=8.88e-5, step_loss=0.0449]
Steps:  22%|██▏       | 218/1000 [03:32<12:40,  1.03it/s, lr=8.87e-5, step_loss=0.00413]
Steps:  22%|██▏       | 218/1000 [03:33<12:40,  1.03it/s, lr=8.87e-5, step_loss=0.00354]
Steps:  22%|██▏       | 218/1000 [03:33<12:40,  1.03it/s, lr=8.87e-5, step_loss=0.0048] 
Steps:  22%|██▏       | 218/1000 [03:33<12:40,  1.03it/s, lr=8.87e-5, step_loss=0.00927]
Steps:  22%|██▏       | 219/1000 [03:33<12:39,  1.03it/s, lr=8.87e-5, step_loss=0.00927]
Steps:  22%|██▏       | 219/1000 [03:33<12:39,  1.03it/s, lr=8.86e-5, step_loss=0.00289]
Steps:  22%|██▏       | 219/1000 [03:33<12:39,  1.03it/s, lr=8.86e-5, step_loss=0.0291] 
Steps:  22%|██▏       | 219/1000 [03:34<12:39,  1.03it/s, lr=8.86e-5, step_loss=0.0214]
Steps:  22%|██▏       | 219/1000 [03:34<12:39,  1.03it/s, lr=8.86e-5, step_loss=0.0159]
Steps:  22%|██▏       | 220/1000 [03:34<12:39,  1.03it/s, lr=8.86e-5, step_loss=0.0159]
Steps:  22%|██▏       | 220/1000 [03:34<12:39,  1.03it/s, lr=8.85e-5, step_loss=0.0215]
Steps:  22%|██▏       | 220/1000 [03:34<12:39,  1.03it/s, lr=8.85e-5, step_loss=0.195] 
Steps:  22%|██▏       | 220/1000 [03:35<12:39,  1.03it/s, lr=8.85e-5, step_loss=0.00267]
Steps:  22%|██▏       | 220/1000 [03:35<12:39,  1.03it/s, lr=8.85e-5, step_loss=0.00182]
Steps:  22%|██▏       | 221/1000 [03:35<12:38,  1.03it/s, lr=8.85e-5, step_loss=0.00182]
Steps:  22%|██▏       | 221/1000 [03:35<12:38,  1.03it/s, lr=8.84e-5, step_loss=0.0441] 
Steps:  22%|██▏       | 221/1000 [03:35<12:38,  1.03it/s, lr=8.84e-5, step_loss=0.00483]
Steps:  22%|██▏       | 221/1000 [03:36<12:38,  1.03it/s, lr=8.84e-5, step_loss=0.0959] 
Steps:  22%|██▏       | 221/1000 [03:36<12:38,  1.03it/s, lr=8.84e-5, step_loss=0.124] 
Steps:  22%|██▏       | 222/1000 [03:36<12:37,  1.03it/s, lr=8.84e-5, step_loss=0.124]
Steps:  22%|██▏       | 222/1000 [03:36<12:37,  1.03it/s, lr=8.83e-5, step_loss=0.0126]
Steps:  22%|██▏       | 222/1000 [03:36<12:37,  1.03it/s, lr=8.83e-5, step_loss=0.0721]
Steps:  22%|██▏       | 222/1000 [03:37<12:37,  1.03it/s, lr=8.83e-5, step_loss=0.0983]
Steps:  22%|██▏       | 222/1000 [03:37<12:37,  1.03it/s, lr=8.83e-5, step_loss=0.064] 
Steps:  22%|██▏       | 223/1000 [03:37<12:36,  1.03it/s, lr=8.83e-5, step_loss=0.064]
Steps:  22%|██▏       | 223/1000 [03:37<12:36,  1.03it/s, lr=8.82e-5, step_loss=0.00596]
Steps:  22%|██▏       | 223/1000 [03:37<12:36,  1.03it/s, lr=8.82e-5, step_loss=0.00665]
Steps:  22%|██▏       | 223/1000 [03:38<12:36,  1.03it/s, lr=8.82e-5, step_loss=0.00564]
Steps:  22%|██▏       | 223/1000 [03:38<12:36,  1.03it/s, lr=8.82e-5, step_loss=0.0422] 
Steps:  22%|██▏       | 224/1000 [03:38<12:35,  1.03it/s, lr=8.82e-5, step_loss=0.0422]
Steps:  22%|██▏       | 224/1000 [03:38<12:35,  1.03it/s, lr=8.81e-5, step_loss=0.204] 
Steps:  22%|██▏       | 224/1000 [03:38<12:35,  1.03it/s, lr=8.81e-5, step_loss=0.0253]
Steps:  22%|██▏       | 224/1000 [03:39<12:35,  1.03it/s, lr=8.81e-5, step_loss=0.0599]
Steps:  22%|██▏       | 224/1000 [03:39<12:35,  1.03it/s, lr=8.81e-5, step_loss=0.177] 
Steps:  22%|██▎       | 225/1000 [03:39<12:34,  1.03it/s, lr=8.81e-5, step_loss=0.177]
Steps:  22%|██▎       | 225/1000 [03:39<12:34,  1.03it/s, lr=8.8e-5, step_loss=0.0628]
Steps:  22%|██▎       | 225/1000 [03:39<12:34,  1.03it/s, lr=8.8e-5, step_loss=0.0468]
Steps:  22%|██▎       | 225/1000 [03:40<12:34,  1.03it/s, lr=8.8e-5, step_loss=0.0195]
Steps:  22%|██▎       | 225/1000 [03:40<12:34,  1.03it/s, lr=8.8e-5, step_loss=0.0239]
Steps:  23%|██▎       | 226/1000 [03:40<12:34,  1.03it/s, lr=8.8e-5, step_loss=0.0239]
Steps:  23%|██▎       | 226/1000 [03:40<12:34,  1.03it/s, lr=8.79e-5, step_loss=0.0108]
Steps:  23%|██▎       | 226/1000 [03:40<12:34,  1.03it/s, lr=8.79e-5, step_loss=0.192] 
Steps:  23%|██▎       | 226/1000 [03:41<12:34,  1.03it/s, lr=8.79e-5, step_loss=0.0108]
Steps:  23%|██▎       | 226/1000 [03:41<12:34,  1.03it/s, lr=8.79e-5, step_loss=0.403] 
Steps:  23%|██▎       | 227/1000 [03:41<12:32,  1.03it/s, lr=8.79e-5, step_loss=0.403]
Steps:  23%|██▎       | 227/1000 [03:41<12:32,  1.03it/s, lr=8.78e-5, step_loss=0.18] 
Steps:  23%|██▎       | 227/1000 [03:41<12:32,  1.03it/s, lr=8.78e-5, step_loss=0.183]
Steps:  23%|██▎       | 227/1000 [03:42<12:32,  1.03it/s, lr=8.78e-5, step_loss=0.251]
Steps:  23%|██▎       | 227/1000 [03:42<12:32,  1.03it/s, lr=8.78e-5, step_loss=0.0444]
Steps:  23%|██▎       | 228/1000 [03:42<12:31,  1.03it/s, lr=8.78e-5, step_loss=0.0444]
Steps:  23%|██▎       | 228/1000 [03:42<12:31,  1.03it/s, lr=8.77e-5, step_loss=0.00208]
Steps:  23%|██▎       | 228/1000 [03:42<12:31,  1.03it/s, lr=8.77e-5, step_loss=0.0232] 
Steps:  23%|██▎       | 228/1000 [03:42<12:31,  1.03it/s, lr=8.77e-5, step_loss=0.0405]
Steps:  23%|██▎       | 228/1000 [03:43<12:31,  1.03it/s, lr=8.77e-5, step_loss=0.00304]
Steps:  23%|██▎       | 229/1000 [03:43<12:31,  1.03it/s, lr=8.77e-5, step_loss=0.00304]
Steps:  23%|██▎       | 229/1000 [03:43<12:31,  1.03it/s, lr=8.76e-5, step_loss=0.00368]
Steps:  23%|██▎       | 229/1000 [03:43<12:31,  1.03it/s, lr=8.76e-5, step_loss=0.0353] 
Steps:  23%|██▎       | 229/1000 [03:43<12:31,  1.03it/s, lr=8.76e-5, step_loss=0.00491]
Steps:  23%|██▎       | 229/1000 [03:44<12:31,  1.03it/s, lr=8.76e-5, step_loss=0.0367] 
Steps:  23%|██▎       | 230/1000 [03:44<12:30,  1.03it/s, lr=8.76e-5, step_loss=0.0367]
Steps:  23%|██▎       | 230/1000 [03:44<12:30,  1.03it/s, lr=8.75e-5, step_loss=0.00375]
Steps:  23%|██▎       | 230/1000 [03:44<12:30,  1.03it/s, lr=8.75e-5, step_loss=0.159]  
Steps:  23%|██▎       | 230/1000 [03:44<12:30,  1.03it/s, lr=8.75e-5, step_loss=0.0476]
Steps:  23%|██▎       | 230/1000 [03:45<12:30,  1.03it/s, lr=8.75e-5, step_loss=0.00546]
Steps:  23%|██▎       | 231/1000 [03:45<12:28,  1.03it/s, lr=8.75e-5, step_loss=0.00546]
Steps:  23%|██▎       | 231/1000 [03:45<12:28,  1.03it/s, lr=8.74e-5, step_loss=0.00405]
Steps:  23%|██▎       | 231/1000 [03:45<12:28,  1.03it/s, lr=8.74e-5, step_loss=0.286]  
Steps:  23%|██▎       | 231/1000 [03:45<12:28,  1.03it/s, lr=8.74e-5, step_loss=0.216]
Steps:  23%|██▎       | 231/1000 [03:46<12:28,  1.03it/s, lr=8.74e-5, step_loss=0.00922]
Steps:  23%|██▎       | 232/1000 [03:46<12:27,  1.03it/s, lr=8.74e-5, step_loss=0.00922]
Steps:  23%|██▎       | 232/1000 [03:46<12:27,  1.03it/s, lr=8.73e-5, step_loss=0.0678] 
Steps:  23%|██▎       | 232/1000 [03:46<12:27,  1.03it/s, lr=8.73e-5, step_loss=0.00523]
Steps:  23%|██▎       | 232/1000 [03:46<12:27,  1.03it/s, lr=8.73e-5, step_loss=0.004]  
Steps:  23%|██▎       | 232/1000 [03:47<12:27,  1.03it/s, lr=8.73e-5, step_loss=0.0219]
Steps:  23%|██▎       | 233/1000 [03:47<12:27,  1.03it/s, lr=8.73e-5, step_loss=0.0219]
Steps:  23%|██▎       | 233/1000 [03:47<12:27,  1.03it/s, lr=8.72e-5, step_loss=0.113] 
Steps:  23%|██▎       | 233/1000 [03:47<12:27,  1.03it/s, lr=8.72e-5, step_loss=0.0124]
Steps:  23%|██▎       | 233/1000 [03:47<12:27,  1.03it/s, lr=8.72e-5, step_loss=0.0075]
Steps:  23%|██▎       | 233/1000 [03:48<12:27,  1.03it/s, lr=8.72e-5, step_loss=0.00584]
Steps:  23%|██▎       | 234/1000 [03:48<12:25,  1.03it/s, lr=8.72e-5, step_loss=0.00584]
Steps:  23%|██▎       | 234/1000 [03:48<12:25,  1.03it/s, lr=8.71e-5, step_loss=0.158]  
Steps:  23%|██▎       | 234/1000 [03:48<12:25,  1.03it/s, lr=8.71e-5, step_loss=0.142]
Steps:  23%|██▎       | 234/1000 [03:48<12:25,  1.03it/s, lr=8.71e-5, step_loss=0.253]
Steps:  23%|██▎       | 234/1000 [03:49<12:25,  1.03it/s, lr=8.71e-5, step_loss=0.168]
Steps:  24%|██▎       | 235/1000 [03:49<12:25,  1.03it/s, lr=8.71e-5, step_loss=0.168]
Steps:  24%|██▎       | 235/1000 [03:49<12:25,  1.03it/s, lr=8.7e-5, step_loss=0.0619]
Steps:  24%|██▎       | 235/1000 [03:49<12:25,  1.03it/s, lr=8.7e-5, step_loss=0.155] 
Steps:  24%|██▎       | 235/1000 [03:49<12:25,  1.03it/s, lr=8.7e-5, step_loss=0.00349]
Steps:  24%|██▎       | 235/1000 [03:50<12:25,  1.03it/s, lr=8.7e-5, step_loss=0.111]  
Steps:  24%|██▎       | 236/1000 [03:50<12:24,  1.03it/s, lr=8.7e-5, step_loss=0.111]
Steps:  24%|██▎       | 236/1000 [03:50<12:24,  1.03it/s, lr=8.69e-5, step_loss=0.00193]
Steps:  24%|██▎       | 236/1000 [03:50<12:24,  1.03it/s, lr=8.69e-5, step_loss=0.0591] 
Steps:  24%|██▎       | 236/1000 [03:50<12:24,  1.03it/s, lr=8.69e-5, step_loss=0.0153]
Steps:  24%|██▎       | 236/1000 [03:51<12:24,  1.03it/s, lr=8.69e-5, step_loss=0.752] 
Steps:  24%|██▎       | 237/1000 [03:51<12:24,  1.02it/s, lr=8.69e-5, step_loss=0.752]
Steps:  24%|██▎       | 237/1000 [03:51<12:24,  1.02it/s, lr=8.68e-5, step_loss=0.0442]
Steps:  24%|██▎       | 237/1000 [03:51<12:24,  1.02it/s, lr=8.68e-5, step_loss=0.0101]
Steps:  24%|██▎       | 237/1000 [03:51<12:24,  1.02it/s, lr=8.68e-5, step_loss=0.0551]
Steps:  24%|██▎       | 237/1000 [03:52<12:24,  1.02it/s, lr=8.68e-5, step_loss=0.0252]
Steps:  24%|██▍       | 238/1000 [03:52<12:23,  1.02it/s, lr=8.68e-5, step_loss=0.0252]
Steps:  24%|██▍       | 238/1000 [03:52<12:23,  1.02it/s, lr=8.67e-5, step_loss=0.248] 
Steps:  24%|██▍       | 238/1000 [03:52<12:23,  1.02it/s, lr=8.67e-5, step_loss=0.125]
Steps:  24%|██▍       | 238/1000 [03:52<12:23,  1.02it/s, lr=8.67e-5, step_loss=0.00364]
Steps:  24%|██▍       | 238/1000 [03:52<12:23,  1.02it/s, lr=8.67e-5, step_loss=0.0916] 
Steps:  24%|██▍       | 239/1000 [03:53<12:21,  1.03it/s, lr=8.67e-5, step_loss=0.0916]
Steps:  24%|██▍       | 239/1000 [03:53<12:21,  1.03it/s, lr=8.66e-5, step_loss=0.137] 
Steps:  24%|██▍       | 239/1000 [03:53<12:21,  1.03it/s, lr=8.66e-5, step_loss=0.0391]
Steps:  24%|██▍       | 239/1000 [03:53<12:21,  1.03it/s, lr=8.66e-5, step_loss=0.131] 
Steps:  24%|██▍       | 239/1000 [03:53<12:21,  1.03it/s, lr=8.66e-5, step_loss=0.057]
Steps:  24%|██▍       | 240/1000 [03:54<12:20,  1.03it/s, lr=8.66e-5, step_loss=0.057]
Steps:  24%|██▍       | 240/1000 [03:54<12:20,  1.03it/s, lr=8.64e-5, step_loss=0.195]
Steps:  24%|██▍       | 240/1000 [03:54<12:20,  1.03it/s, lr=8.64e-5, step_loss=0.00242]
Steps:  24%|██▍       | 240/1000 [03:54<12:20,  1.03it/s, lr=8.64e-5, step_loss=0.00255]
Steps:  24%|██▍       | 240/1000 [03:54<12:20,  1.03it/s, lr=8.64e-5, step_loss=0.00237]
Steps:  24%|██▍       | 241/1000 [03:55<12:19,  1.03it/s, lr=8.64e-5, step_loss=0.00237]
Steps:  24%|██▍       | 241/1000 [03:55<12:19,  1.03it/s, lr=8.63e-5, step_loss=0.282]  
Steps:  24%|██▍       | 241/1000 [03:55<12:19,  1.03it/s, lr=8.63e-5, step_loss=0.00539]
Steps:  24%|██▍       | 241/1000 [03:55<12:19,  1.03it/s, lr=8.63e-5, step_loss=0.0641] 
Steps:  24%|██▍       | 241/1000 [03:55<12:19,  1.03it/s, lr=8.63e-5, step_loss=0.0533]
Steps:  24%|██▍       | 242/1000 [03:56<12:17,  1.03it/s, lr=8.63e-5, step_loss=0.0533]
Steps:  24%|██▍       | 242/1000 [03:56<12:17,  1.03it/s, lr=8.62e-5, step_loss=0.0432]
Steps:  24%|██▍       | 242/1000 [03:56<12:17,  1.03it/s, lr=8.62e-5, step_loss=0.0329]
Steps:  24%|██▍       | 242/1000 [03:56<12:17,  1.03it/s, lr=8.62e-5, step_loss=0.0026]
Steps:  24%|██▍       | 242/1000 [03:56<12:17,  1.03it/s, lr=8.62e-5, step_loss=0.00226]
Steps:  24%|██▍       | 243/1000 [03:57<12:16,  1.03it/s, lr=8.62e-5, step_loss=0.00226]
Steps:  24%|██▍       | 243/1000 [03:57<12:16,  1.03it/s, lr=8.61e-5, step_loss=0.142]  
Steps:  24%|██▍       | 243/1000 [03:57<12:16,  1.03it/s, lr=8.61e-5, step_loss=0.104]
Steps:  24%|██▍       | 243/1000 [03:57<12:16,  1.03it/s, lr=8.61e-5, step_loss=0.182]
Steps:  24%|██▍       | 243/1000 [03:57<12:16,  1.03it/s, lr=8.61e-5, step_loss=0.00225]
Steps:  24%|██▍       | 244/1000 [03:58<12:15,  1.03it/s, lr=8.61e-5, step_loss=0.00225]
Steps:  24%|██▍       | 244/1000 [03:58<12:15,  1.03it/s, lr=8.6e-5, step_loss=0.0539]  
Steps:  24%|██▍       | 244/1000 [03:58<12:15,  1.03it/s, lr=8.6e-5, step_loss=0.0635]
Steps:  24%|██▍       | 244/1000 [03:58<12:15,  1.03it/s, lr=8.6e-5, step_loss=0.0393]
Steps:  24%|██▍       | 244/1000 [03:58<12:15,  1.03it/s, lr=8.6e-5, step_loss=0.0366]
Steps:  24%|██▍       | 245/1000 [03:59<12:14,  1.03it/s, lr=8.6e-5, step_loss=0.0366]
Steps:  24%|██▍       | 245/1000 [03:59<12:14,  1.03it/s, lr=8.59e-5, step_loss=0.0575]
Steps:  24%|██▍       | 245/1000 [03:59<12:14,  1.03it/s, lr=8.59e-5, step_loss=0.0954]
Steps:  24%|██▍       | 245/1000 [03:59<12:14,  1.03it/s, lr=8.59e-5, step_loss=0.0694]
Steps:  24%|██▍       | 245/1000 [03:59<12:14,  1.03it/s, lr=8.59e-5, step_loss=0.0107]
Steps:  25%|██▍       | 246/1000 [03:59<12:13,  1.03it/s, lr=8.59e-5, step_loss=0.0107]
Steps:  25%|██▍       | 246/1000 [04:00<12:13,  1.03it/s, lr=8.58e-5, step_loss=0.0352]
Steps:  25%|██▍       | 246/1000 [04:00<12:13,  1.03it/s, lr=8.58e-5, step_loss=0.0331]
Steps:  25%|██▍       | 246/1000 [04:00<12:13,  1.03it/s, lr=8.58e-5, step_loss=0.114] 
Steps:  25%|██▍       | 246/1000 [04:00<12:13,  1.03it/s, lr=8.58e-5, step_loss=0.0829]
Steps:  25%|██▍       | 247/1000 [04:00<12:12,  1.03it/s, lr=8.58e-5, step_loss=0.0829]
Steps:  25%|██▍       | 247/1000 [04:01<12:12,  1.03it/s, lr=8.57e-5, step_loss=0.00269]
Steps:  25%|██▍       | 247/1000 [04:01<12:12,  1.03it/s, lr=8.57e-5, step_loss=0.0817] 
Steps:  25%|██▍       | 247/1000 [04:01<12:12,  1.03it/s, lr=8.57e-5, step_loss=0.0109]
Steps:  25%|██▍       | 247/1000 [04:01<12:12,  1.03it/s, lr=8.57e-5, step_loss=0.0049]
Steps:  25%|██▍       | 248/1000 [04:01<12:11,  1.03it/s, lr=8.57e-5, step_loss=0.0049]
Steps:  25%|██▍       | 248/1000 [04:01<12:11,  1.03it/s, lr=8.56e-5, step_loss=0.0134]
Steps:  25%|██▍       | 248/1000 [04:02<12:11,  1.03it/s, lr=8.56e-5, step_loss=0.065] 
Steps:  25%|██▍       | 248/1000 [04:02<12:11,  1.03it/s, lr=8.56e-5, step_loss=0.311]
Steps:  25%|██▍       | 248/1000 [04:02<12:11,  1.03it/s, lr=8.56e-5, step_loss=0.535]
Steps:  25%|██▍       | 249/1000 [04:02<12:10,  1.03it/s, lr=8.56e-5, step_loss=0.535]
Steps:  25%|██▍       | 249/1000 [04:02<12:10,  1.03it/s, lr=8.55e-5, step_loss=0.142]
Steps:  25%|██▍       | 249/1000 [04:03<12:10,  1.03it/s, lr=8.55e-5, step_loss=0.051]
Steps:  25%|██▍       | 249/1000 [04:03<12:10,  1.03it/s, lr=8.55e-5, step_loss=0.124]
Steps:  25%|██▍       | 249/1000 [04:03<12:10,  1.03it/s, lr=8.55e-5, step_loss=0.0457]
Steps:  25%|██▌       | 250/1000 [04:03<12:09,  1.03it/s, lr=8.55e-5, step_loss=0.0457]
Steps:  25%|██▌       | 250/1000 [04:03<12:09,  1.03it/s, lr=8.54e-5, step_loss=0.123] 
Steps:  25%|██▌       | 250/1000 [04:04<12:09,  1.03it/s, lr=8.54e-5, step_loss=0.0114]
Steps:  25%|██▌       | 250/1000 [04:04<12:09,  1.03it/s, lr=8.54e-5, step_loss=0.0938]
Steps:  25%|██▌       | 250/1000 [04:04<12:09,  1.03it/s, lr=8.54e-5, step_loss=0.397] 
Steps:  25%|██▌       | 251/1000 [04:04<12:08,  1.03it/s, lr=8.54e-5, step_loss=0.397]
Steps:  25%|██▌       | 251/1000 [04:04<12:08,  1.03it/s, lr=8.52e-5, step_loss=0.0204]
Steps:  25%|██▌       | 251/1000 [04:05<12:08,  1.03it/s, lr=8.52e-5, step_loss=0.105] 
Steps:  25%|██▌       | 251/1000 [04:05<12:08,  1.03it/s, lr=8.52e-5, step_loss=0.0536]
Steps:  25%|██▌       | 251/1000 [04:05<12:08,  1.03it/s, lr=8.52e-5, step_loss=0.00637]
Steps:  25%|██▌       | 252/1000 [04:05<12:07,  1.03it/s, lr=8.52e-5, step_loss=0.00637]
Steps:  25%|██▌       | 252/1000 [04:05<12:07,  1.03it/s, lr=8.51e-5, step_loss=0.0956] 
Steps:  25%|██▌       | 252/1000 [04:06<12:07,  1.03it/s, lr=8.51e-5, step_loss=0.0225]
Steps:  25%|██▌       | 252/1000 [04:06<12:07,  1.03it/s, lr=8.51e-5, step_loss=0.0311]
Steps:  25%|██▌       | 252/1000 [04:06<12:07,  1.03it/s, lr=8.51e-5, step_loss=0.0634]
Steps:  25%|██▌       | 253/1000 [04:06<12:06,  1.03it/s, lr=8.51e-5, step_loss=0.0634]
Steps:  25%|██▌       | 253/1000 [04:06<12:06,  1.03it/s, lr=8.5e-5, step_loss=0.052]  
Steps:  25%|██▌       | 253/1000 [04:07<12:06,  1.03it/s, lr=8.5e-5, step_loss=0.0122]
Steps:  25%|██▌       | 253/1000 [04:07<12:06,  1.03it/s, lr=8.5e-5, step_loss=0.0108]
Steps:  25%|██▌       | 253/1000 [04:07<12:06,  1.03it/s, lr=8.5e-5, step_loss=0.0424]
Steps:  25%|██▌       | 254/1000 [04:07<12:05,  1.03it/s, lr=8.5e-5, step_loss=0.0424]
Steps:  25%|██▌       | 254/1000 [04:07<12:05,  1.03it/s, lr=8.49e-5, step_loss=0.18] 
Steps:  25%|██▌       | 254/1000 [04:08<12:05,  1.03it/s, lr=8.49e-5, step_loss=0.0151]
Steps:  25%|██▌       | 254/1000 [04:08<12:05,  1.03it/s, lr=8.49e-5, step_loss=0.324] 
Steps:  25%|██▌       | 254/1000 [04:08<12:05,  1.03it/s, lr=8.49e-5, step_loss=0.452]
Steps:  26%|██▌       | 255/1000 [04:08<12:04,  1.03it/s, lr=8.49e-5, step_loss=0.452]
Steps:  26%|██▌       | 255/1000 [04:08<12:04,  1.03it/s, lr=8.48e-5, step_loss=0.00736]
Steps:  26%|██▌       | 255/1000 [04:09<12:04,  1.03it/s, lr=8.48e-5, step_loss=0.057]  
Steps:  26%|██▌       | 255/1000 [04:09<12:04,  1.03it/s, lr=8.48e-5, step_loss=0.0261]
Steps:  26%|██▌       | 255/1000 [04:09<12:04,  1.03it/s, lr=8.48e-5, step_loss=0.211] 
Steps:  26%|██▌       | 256/1000 [04:09<12:03,  1.03it/s, lr=8.48e-5, step_loss=0.211]
Steps:  26%|██▌       | 256/1000 [04:09<12:03,  1.03it/s, lr=8.47e-5, step_loss=0.0371]
Steps:  26%|██▌       | 256/1000 [04:09<12:03,  1.03it/s, lr=8.47e-5, step_loss=0.0548]
Steps:  26%|██▌       | 256/1000 [04:10<12:03,  1.03it/s, lr=8.47e-5, step_loss=0.063] 
Steps:  26%|██▌       | 256/1000 [04:10<12:03,  1.03it/s, lr=8.47e-5, step_loss=0.372]
Steps:  26%|██▌       | 257/1000 [04:10<12:02,  1.03it/s, lr=8.47e-5, step_loss=0.372]
Steps:  26%|██▌       | 257/1000 [04:10<12:02,  1.03it/s, lr=8.46e-5, step_loss=0.031]
Steps:  26%|██▌       | 257/1000 [04:10<12:02,  1.03it/s, lr=8.46e-5, step_loss=0.0141]
Steps:  26%|██▌       | 257/1000 [04:11<12:02,  1.03it/s, lr=8.46e-5, step_loss=0.04]  
Steps:  26%|██▌       | 257/1000 [04:11<12:02,  1.03it/s, lr=8.46e-5, step_loss=0.0653]
Steps:  26%|██▌       | 258/1000 [04:11<12:01,  1.03it/s, lr=8.46e-5, step_loss=0.0653]
Steps:  26%|██▌       | 258/1000 [04:11<12:01,  1.03it/s, lr=8.45e-5, step_loss=0.167] 
Steps:  26%|██▌       | 258/1000 [04:11<12:01,  1.03it/s, lr=8.45e-5, step_loss=0.0642]
Steps:  26%|██▌       | 258/1000 [04:12<12:01,  1.03it/s, lr=8.45e-5, step_loss=0.21]  
Steps:  26%|██▌       | 258/1000 [04:12<12:01,  1.03it/s, lr=8.45e-5, step_loss=0.0516]
Steps:  26%|██▌       | 259/1000 [04:12<12:00,  1.03it/s, lr=8.45e-5, step_loss=0.0516]
Steps:  26%|██▌       | 259/1000 [04:12<12:00,  1.03it/s, lr=8.43e-5, step_loss=0.208] 
Steps:  26%|██▌       | 259/1000 [04:12<12:00,  1.03it/s, lr=8.43e-5, step_loss=0.131]
Steps:  26%|██▌       | 259/1000 [04:13<12:00,  1.03it/s, lr=8.43e-5, step_loss=0.689]
Steps:  26%|██▌       | 259/1000 [04:13<12:00,  1.03it/s, lr=8.43e-5, step_loss=0.135]
Steps:  26%|██▌       | 260/1000 [04:13<11:59,  1.03it/s, lr=8.43e-5, step_loss=0.135]
Steps:  26%|██▌       | 260/1000 [04:13<11:59,  1.03it/s, lr=8.42e-5, step_loss=0.255]
Steps:  26%|██▌       | 260/1000 [04:13<11:59,  1.03it/s, lr=8.42e-5, step_loss=0.0905]
Steps:  26%|██▌       | 260/1000 [04:14<11:59,  1.03it/s, lr=8.42e-5, step_loss=0.0482]
Steps:  26%|██▌       | 260/1000 [04:14<11:59,  1.03it/s, lr=8.42e-5, step_loss=0.0559]
Steps:  26%|██▌       | 261/1000 [04:14<11:58,  1.03it/s, lr=8.42e-5, step_loss=0.0559]
Steps:  26%|██▌       | 261/1000 [04:14<11:58,  1.03it/s, lr=8.41e-5, step_loss=0.0198]
Steps:  26%|██▌       | 261/1000 [04:14<11:58,  1.03it/s, lr=8.41e-5, step_loss=0.0385]
Steps:  26%|██▌       | 261/1000 [04:15<11:58,  1.03it/s, lr=8.41e-5, step_loss=0.16]  
Steps:  26%|██▌       | 261/1000 [04:15<11:58,  1.03it/s, lr=8.41e-5, step_loss=0.00742]
Steps:  26%|██▌       | 262/1000 [04:15<11:57,  1.03it/s, lr=8.41e-5, step_loss=0.00742]
Steps:  26%|██▌       | 262/1000 [04:15<11:57,  1.03it/s, lr=8.4e-5, step_loss=0.367]   
Steps:  26%|██▌       | 262/1000 [04:15<11:57,  1.03it/s, lr=8.4e-5, step_loss=0.218]
Steps:  26%|██▌       | 262/1000 [04:16<11:57,  1.03it/s, lr=8.4e-5, step_loss=0.0134]
Steps:  26%|██▌       | 262/1000 [04:16<11:57,  1.03it/s, lr=8.4e-5, step_loss=0.0566]
Steps:  26%|██▋       | 263/1000 [04:16<11:56,  1.03it/s, lr=8.4e-5, step_loss=0.0566]
Steps:  26%|██▋       | 263/1000 [04:16<11:56,  1.03it/s, lr=8.39e-5, step_loss=0.0785]
Steps:  26%|██▋       | 263/1000 [04:16<11:56,  1.03it/s, lr=8.39e-5, step_loss=0.505] 
Steps:  26%|██▋       | 263/1000 [04:17<11:56,  1.03it/s, lr=8.39e-5, step_loss=0.00261]
Steps:  26%|██▋       | 263/1000 [04:17<11:56,  1.03it/s, lr=8.39e-5, step_loss=0.0298] 
Steps:  26%|██▋       | 264/1000 [04:17<11:55,  1.03it/s, lr=8.39e-5, step_loss=0.0298]
Steps:  26%|██▋       | 264/1000 [04:17<11:55,  1.03it/s, lr=8.38e-5, step_loss=0.107] 
Steps:  26%|██▋       | 264/1000 [04:17<11:55,  1.03it/s, lr=8.38e-5, step_loss=0.177]
Steps:  26%|██▋       | 264/1000 [04:18<11:55,  1.03it/s, lr=8.38e-5, step_loss=0.0597]
Steps:  26%|██▋       | 264/1000 [04:18<11:55,  1.03it/s, lr=8.38e-5, step_loss=0.219] 
Steps:  26%|██▋       | 265/1000 [04:18<11:54,  1.03it/s, lr=8.38e-5, step_loss=0.219]
Steps:  26%|██▋       | 265/1000 [04:18<11:54,  1.03it/s, lr=8.37e-5, step_loss=0.0283]
Steps:  26%|██▋       | 265/1000 [04:18<11:54,  1.03it/s, lr=8.37e-5, step_loss=0.134] 
Steps:  26%|██▋       | 265/1000 [04:18<11:54,  1.03it/s, lr=8.37e-5, step_loss=0.00719]
Steps:  26%|██▋       | 265/1000 [04:19<11:54,  1.03it/s, lr=8.37e-5, step_loss=0.069]  
Steps:  27%|██▋       | 266/1000 [04:19<11:54,  1.03it/s, lr=8.37e-5, step_loss=0.069]
Steps:  27%|██▋       | 266/1000 [04:19<11:54,  1.03it/s, lr=8.35e-5, step_loss=0.11] 
Steps:  27%|██▋       | 266/1000 [04:19<11:54,  1.03it/s, lr=8.35e-5, step_loss=0.0678]
Steps:  27%|██▋       | 266/1000 [04:19<11:54,  1.03it/s, lr=8.35e-5, step_loss=0.00324]
Steps:  27%|██▋       | 266/1000 [04:20<11:54,  1.03it/s, lr=8.35e-5, step_loss=0.0198] 
Steps:  27%|██▋       | 267/1000 [04:20<11:53,  1.03it/s, lr=8.35e-5, step_loss=0.0198]
Steps:  27%|██▋       | 267/1000 [04:20<11:53,  1.03it/s, lr=8.34e-5, step_loss=0.0987]
Steps:  27%|██▋       | 267/1000 [04:20<11:53,  1.03it/s, lr=8.34e-5, step_loss=0.199] 
Steps:  27%|██▋       | 267/1000 [04:20<11:53,  1.03it/s, lr=8.34e-5, step_loss=0.0982]
Steps:  27%|██▋       | 267/1000 [04:21<11:53,  1.03it/s, lr=8.34e-5, step_loss=0.0574]
Steps:  27%|██▋       | 268/1000 [04:21<11:52,  1.03it/s, lr=8.34e-5, step_loss=0.0574]
Steps:  27%|██▋       | 268/1000 [04:21<11:52,  1.03it/s, lr=8.33e-5, step_loss=0.1]   
Steps:  27%|██▋       | 268/1000 [04:21<11:52,  1.03it/s, lr=8.33e-5, step_loss=0.00517]
Steps:  27%|██▋       | 268/1000 [04:21<11:52,  1.03it/s, lr=8.33e-5, step_loss=0.00931]
Steps:  27%|██▋       | 268/1000 [04:22<11:52,  1.03it/s, lr=8.33e-5, step_loss=0.244]  
Steps:  27%|██▋       | 269/1000 [04:22<11:51,  1.03it/s, lr=8.33e-5, step_loss=0.244]
Steps:  27%|██▋       | 269/1000 [04:22<11:51,  1.03it/s, lr=8.32e-5, step_loss=0.0121]
Steps:  27%|██▋       | 269/1000 [04:22<11:51,  1.03it/s, lr=8.32e-5, step_loss=0.106] 
Steps:  27%|██▋       | 269/1000 [04:22<11:51,  1.03it/s, lr=8.32e-5, step_loss=0.228]
Steps:  27%|██▋       | 269/1000 [04:23<11:51,  1.03it/s, lr=8.32e-5, step_loss=0.188]
Steps:  27%|██▋       | 270/1000 [04:23<11:50,  1.03it/s, lr=8.32e-5, step_loss=0.188]
Steps:  27%|██▋       | 270/1000 [04:23<11:50,  1.03it/s, lr=8.31e-5, step_loss=0.13] 
Steps:  27%|██▋       | 270/1000 [04:23<11:50,  1.03it/s, lr=8.31e-5, step_loss=0.0336]
Steps:  27%|██▋       | 270/1000 [04:23<11:50,  1.03it/s, lr=8.31e-5, step_loss=0.0774]
Steps:  27%|██▋       | 270/1000 [04:24<11:50,  1.03it/s, lr=8.31e-5, step_loss=0.0714]
Steps:  27%|██▋       | 271/1000 [04:24<11:49,  1.03it/s, lr=8.31e-5, step_loss=0.0714]
Steps:  27%|██▋       | 271/1000 [04:24<11:49,  1.03it/s, lr=8.29e-5, step_loss=0.194] 
Steps:  27%|██▋       | 271/1000 [04:24<11:49,  1.03it/s, lr=8.29e-5, step_loss=0.0118]
Steps:  27%|██▋       | 271/1000 [04:24<11:49,  1.03it/s, lr=8.29e-5, step_loss=0.0924]
Steps:  27%|██▋       | 271/1000 [04:25<11:49,  1.03it/s, lr=8.29e-5, step_loss=0.0544]
Steps:  27%|██▋       | 272/1000 [04:25<11:48,  1.03it/s, lr=8.29e-5, step_loss=0.0544]
Steps:  27%|██▋       | 272/1000 [04:25<11:48,  1.03it/s, lr=8.28e-5, step_loss=0.0134]
Steps:  27%|██▋       | 272/1000 [04:25<11:48,  1.03it/s, lr=8.28e-5, step_loss=0.0589]
Steps:  27%|██▋       | 272/1000 [04:25<11:48,  1.03it/s, lr=8.28e-5, step_loss=0.181] 
Steps:  27%|██▋       | 272/1000 [04:26<11:48,  1.03it/s, lr=8.28e-5, step_loss=0.0607]
Steps:  27%|██▋       | 273/1000 [04:26<11:46,  1.03it/s, lr=8.28e-5, step_loss=0.0607]
Steps:  27%|██▋       | 273/1000 [04:26<11:46,  1.03it/s, lr=8.27e-5, step_loss=0.0166]
Steps:  27%|██▋       | 273/1000 [04:26<11:46,  1.03it/s, lr=8.27e-5, step_loss=0.0388]
Steps:  27%|██▋       | 273/1000 [04:26<11:46,  1.03it/s, lr=8.27e-5, step_loss=0.124] 
Steps:  27%|██▋       | 273/1000 [04:27<11:46,  1.03it/s, lr=8.27e-5, step_loss=0.0777]
Steps:  27%|██▋       | 274/1000 [04:27<11:46,  1.03it/s, lr=8.27e-5, step_loss=0.0777]
Steps:  27%|██▋       | 274/1000 [04:27<11:46,  1.03it/s, lr=8.26e-5, step_loss=0.00911]
Steps:  27%|██▋       | 274/1000 [04:27<11:46,  1.03it/s, lr=8.26e-5, step_loss=0.0953] 
Steps:  27%|██▋       | 274/1000 [04:27<11:46,  1.03it/s, lr=8.26e-5, step_loss=0.0637]
Steps:  27%|██▋       | 274/1000 [04:27<11:46,  1.03it/s, lr=8.26e-5, step_loss=0.111] 
Steps:  28%|██▊       | 275/1000 [04:28<11:45,  1.03it/s, lr=8.26e-5, step_loss=0.111]
Steps:  28%|██▊       | 275/1000 [04:28<11:45,  1.03it/s, lr=8.25e-5, step_loss=0.00476]
Steps:  28%|██▊       | 275/1000 [04:28<11:45,  1.03it/s, lr=8.25e-5, step_loss=0.0559] 
Steps:  28%|██▊       | 275/1000 [04:28<11:45,  1.03it/s, lr=8.25e-5, step_loss=0.00242]
Steps:  28%|██▊       | 275/1000 [04:28<11:45,  1.03it/s, lr=8.25e-5, step_loss=0.0245] 
Steps:  28%|██▊       | 276/1000 [04:29<11:45,  1.03it/s, lr=8.25e-5, step_loss=0.0245]
Steps:  28%|██▊       | 276/1000 [04:29<11:45,  1.03it/s, lr=8.24e-5, step_loss=0.0173]
Steps:  28%|██▊       | 276/1000 [04:29<11:45,  1.03it/s, lr=8.24e-5, step_loss=0.154] 
Steps:  28%|██▊       | 276/1000 [04:29<11:45,  1.03it/s, lr=8.24e-5, step_loss=0.271]
Steps:  28%|██▊       | 276/1000 [04:29<11:45,  1.03it/s, lr=8.24e-5, step_loss=0.429]
Steps:  28%|██▊       | 277/1000 [04:30<11:44,  1.03it/s, lr=8.24e-5, step_loss=0.429]
Steps:  28%|██▊       | 277/1000 [04:30<11:44,  1.03it/s, lr=8.22e-5, step_loss=0.209]
Steps:  28%|██▊       | 277/1000 [04:30<11:44,  1.03it/s, lr=8.22e-5, step_loss=0.0989]
Steps:  28%|██▊       | 277/1000 [04:30<11:44,  1.03it/s, lr=8.22e-5, step_loss=0.0615]
Steps:  28%|██▊       | 277/1000 [04:30<11:44,  1.03it/s, lr=8.22e-5, step_loss=0.17]  
Steps:  28%|██▊       | 278/1000 [04:31<11:42,  1.03it/s, lr=8.22e-5, step_loss=0.17]
Steps:  28%|██▊       | 278/1000 [04:31<11:42,  1.03it/s, lr=8.21e-5, step_loss=0.0612]
Steps:  28%|██▊       | 278/1000 [04:31<11:42,  1.03it/s, lr=8.21e-5, step_loss=0.0593]
Steps:  28%|██▊       | 278/1000 [04:31<11:42,  1.03it/s, lr=8.21e-5, step_loss=0.0763]
Steps:  28%|██▊       | 278/1000 [04:31<11:42,  1.03it/s, lr=8.21e-5, step_loss=0.116] 
Steps:  28%|██▊       | 279/1000 [04:32<11:41,  1.03it/s, lr=8.21e-5, step_loss=0.116]
Steps:  28%|██▊       | 279/1000 [04:32<11:41,  1.03it/s, lr=8.2e-5, step_loss=0.0181]
Steps:  28%|██▊       | 279/1000 [04:32<11:41,  1.03it/s, lr=8.2e-5, step_loss=0.0506]
Steps:  28%|██▊       | 279/1000 [04:32<11:41,  1.03it/s, lr=8.2e-5, step_loss=0.0164]
Steps:  28%|██▊       | 279/1000 [04:32<11:41,  1.03it/s, lr=8.2e-5, step_loss=0.0652]
Steps:  28%|██▊       | 280/1000 [04:33<11:40,  1.03it/s, lr=8.2e-5, step_loss=0.0652]
Steps:  28%|██▊       | 280/1000 [04:33<11:40,  1.03it/s, lr=8.19e-5, step_loss=0.382]
Steps:  28%|██▊       | 280/1000 [04:33<11:40,  1.03it/s, lr=8.19e-5, step_loss=0.0145]
Steps:  28%|██▊       | 280/1000 [04:33<11:40,  1.03it/s, lr=8.19e-5, step_loss=0.0155]
Steps:  28%|██▊       | 280/1000 [04:33<11:40,  1.03it/s, lr=8.19e-5, step_loss=0.0214]
Steps:  28%|██▊       | 281/1000 [04:34<11:39,  1.03it/s, lr=8.19e-5, step_loss=0.0214]
Steps:  28%|██▊       | 281/1000 [04:34<11:39,  1.03it/s, lr=8.18e-5, step_loss=0.0792]
Steps:  28%|██▊       | 281/1000 [04:34<11:39,  1.03it/s, lr=8.18e-5, step_loss=0.00952]
Steps:  28%|██▊       | 281/1000 [04:34<11:39,  1.03it/s, lr=8.18e-5, step_loss=0.0476] 
Steps:  28%|██▊       | 281/1000 [04:34<11:39,  1.03it/s, lr=8.18e-5, step_loss=0.00307]
Steps:  28%|██▊       | 282/1000 [04:35<11:38,  1.03it/s, lr=8.18e-5, step_loss=0.00307]
Steps:  28%|██▊       | 282/1000 [04:35<11:38,  1.03it/s, lr=8.16e-5, step_loss=0.157]  
Steps:  28%|██▊       | 282/1000 [04:35<11:38,  1.03it/s, lr=8.16e-5, step_loss=0.0193]
Steps:  28%|██▊       | 282/1000 [04:35<11:38,  1.03it/s, lr=8.16e-5, step_loss=0.206] 
Steps:  28%|██▊       | 282/1000 [04:35<11:38,  1.03it/s, lr=8.16e-5, step_loss=0.00778]
Steps:  28%|██▊       | 283/1000 [04:35<11:37,  1.03it/s, lr=8.16e-5, step_loss=0.00778]
Steps:  28%|██▊       | 283/1000 [04:36<11:37,  1.03it/s, lr=8.15e-5, step_loss=0.314]  
Steps:  28%|██▊       | 283/1000 [04:36<11:37,  1.03it/s, lr=8.15e-5, step_loss=0.00185]
Steps:  28%|██▊       | 283/1000 [04:36<11:37,  1.03it/s, lr=8.15e-5, step_loss=0.00888]
Steps:  28%|██▊       | 283/1000 [04:36<11:37,  1.03it/s, lr=8.15e-5, step_loss=0.00625]
Steps:  28%|██▊       | 284/1000 [04:36<11:36,  1.03it/s, lr=8.15e-5, step_loss=0.00625]
Steps:  28%|██▊       | 284/1000 [04:36<11:36,  1.03it/s, lr=8.14e-5, step_loss=0.0493] 
Steps:  28%|██▊       | 284/1000 [04:37<11:36,  1.03it/s, lr=8.14e-5, step_loss=0.00647]
Steps:  28%|██▊       | 284/1000 [04:37<11:36,  1.03it/s, lr=8.14e-5, step_loss=0.0453] 
Steps:  28%|██▊       | 284/1000 [04:37<11:36,  1.03it/s, lr=8.14e-5, step_loss=0.0752]
Steps:  28%|██▊       | 285/1000 [04:37<11:35,  1.03it/s, lr=8.14e-5, step_loss=0.0752]
Steps:  28%|██▊       | 285/1000 [04:37<11:35,  1.03it/s, lr=8.13e-5, step_loss=0.0113]
Steps:  28%|██▊       | 285/1000 [04:38<11:35,  1.03it/s, lr=8.13e-5, step_loss=0.0245]
Steps:  28%|██▊       | 285/1000 [04:38<11:35,  1.03it/s, lr=8.13e-5, step_loss=0.199] 
Steps:  28%|██▊       | 285/1000 [04:38<11:35,  1.03it/s, lr=8.13e-5, step_loss=0.0519]
Steps:  29%|██▊       | 286/1000 [04:38<11:34,  1.03it/s, lr=8.13e-5, step_loss=0.0519]
Steps:  29%|██▊       | 286/1000 [04:38<11:34,  1.03it/s, lr=8.11e-5, step_loss=0.249] 
Steps:  29%|██▊       | 286/1000 [04:39<11:34,  1.03it/s, lr=8.11e-5, step_loss=0.0349]
Steps:  29%|██▊       | 286/1000 [04:39<11:34,  1.03it/s, lr=8.11e-5, step_loss=0.0225]
Steps:  29%|██▊       | 286/1000 [04:39<11:34,  1.03it/s, lr=8.11e-5, step_loss=0.00564]
Steps:  29%|██▊       | 287/1000 [04:39<11:33,  1.03it/s, lr=8.11e-5, step_loss=0.00564]
Steps:  29%|██▊       | 287/1000 [04:39<11:33,  1.03it/s, lr=8.1e-5, step_loss=0.0674]  
Steps:  29%|██▊       | 287/1000 [04:40<11:33,  1.03it/s, lr=8.1e-5, step_loss=0.0352]
Steps:  29%|██▊       | 287/1000 [04:40<11:33,  1.03it/s, lr=8.1e-5, step_loss=0.203] 
Steps:  29%|██▊       | 287/1000 [04:40<11:33,  1.03it/s, lr=8.1e-5, step_loss=0.116]
Steps:  29%|██▉       | 288/1000 [04:40<11:32,  1.03it/s, lr=8.1e-5, step_loss=0.116]
Steps:  29%|██▉       | 288/1000 [04:40<11:32,  1.03it/s, lr=8.09e-5, step_loss=0.123]
Steps:  29%|██▉       | 288/1000 [04:41<11:32,  1.03it/s, lr=8.09e-5, step_loss=0.0643]
Steps:  29%|██▉       | 288/1000 [04:41<11:32,  1.03it/s, lr=8.09e-5, step_loss=0.066] 
Steps:  29%|██▉       | 288/1000 [04:41<11:32,  1.03it/s, lr=8.09e-5, step_loss=0.107]
Steps:  29%|██▉       | 289/1000 [04:41<11:31,  1.03it/s, lr=8.09e-5, step_loss=0.107]
Steps:  29%|██▉       | 289/1000 [04:41<11:31,  1.03it/s, lr=8.08e-5, step_loss=0.0521]
Steps:  29%|██▉       | 289/1000 [04:42<11:31,  1.03it/s, lr=8.08e-5, step_loss=0.155] 
Steps:  29%|██▉       | 289/1000 [04:42<11:31,  1.03it/s, lr=8.08e-5, step_loss=0.0312]
Steps:  29%|██▉       | 289/1000 [04:42<11:31,  1.03it/s, lr=8.08e-5, step_loss=0.0777]
Steps:  29%|██▉       | 290/1000 [04:42<11:30,  1.03it/s, lr=8.08e-5, step_loss=0.0777]
Steps:  29%|██▉       | 290/1000 [04:42<11:30,  1.03it/s, lr=8.06e-5, step_loss=0.0038]
Steps:  29%|██▉       | 290/1000 [04:43<11:30,  1.03it/s, lr=8.06e-5, step_loss=0.0589]
Steps:  29%|██▉       | 290/1000 [04:43<11:30,  1.03it/s, lr=8.06e-5, step_loss=0.00569]
Steps:  29%|██▉       | 290/1000 [04:43<11:30,  1.03it/s, lr=8.06e-5, step_loss=0.00267]
Steps:  29%|██▉       | 291/1000 [04:43<11:30,  1.03it/s, lr=8.06e-5, step_loss=0.00267]
Steps:  29%|██▉       | 291/1000 [04:43<11:30,  1.03it/s, lr=8.05e-5, step_loss=0.0122] 
Steps:  29%|██▉       | 291/1000 [04:44<11:30,  1.03it/s, lr=8.05e-5, step_loss=0.0346]
Steps:  29%|██▉       | 291/1000 [04:44<11:30,  1.03it/s, lr=8.05e-5, step_loss=0.0808]
Steps:  29%|██▉       | 291/1000 [04:44<11:30,  1.03it/s, lr=8.05e-5, step_loss=0.13]  
Steps:  29%|██▉       | 292/1000 [04:44<11:29,  1.03it/s, lr=8.05e-5, step_loss=0.13]
Steps:  29%|██▉       | 292/1000 [04:44<11:29,  1.03it/s, lr=8.04e-5, step_loss=0.0693]
Steps:  29%|██▉       | 292/1000 [04:45<11:29,  1.03it/s, lr=8.04e-5, step_loss=0.0185]
Steps:  29%|██▉       | 292/1000 [04:45<11:29,  1.03it/s, lr=8.04e-5, step_loss=0.0764]
Steps:  29%|██▉       | 292/1000 [04:45<11:29,  1.03it/s, lr=8.04e-5, step_loss=0.0427]
Steps:  29%|██▉       | 293/1000 [04:45<11:28,  1.03it/s, lr=8.04e-5, step_loss=0.0427]
Steps:  29%|██▉       | 293/1000 [04:45<11:28,  1.03it/s, lr=8.03e-5, step_loss=0.0232]
Steps:  29%|██▉       | 293/1000 [04:45<11:28,  1.03it/s, lr=8.03e-5, step_loss=0.28]  
Steps:  29%|██▉       | 293/1000 [04:46<11:28,  1.03it/s, lr=8.03e-5, step_loss=0.0981]
Steps:  29%|██▉       | 293/1000 [04:46<11:28,  1.03it/s, lr=8.03e-5, step_loss=0.139] 
Steps:  29%|██▉       | 294/1000 [04:46<11:26,  1.03it/s, lr=8.03e-5, step_loss=0.139]
Steps:  29%|██▉       | 294/1000 [04:46<11:26,  1.03it/s, lr=8.01e-5, step_loss=0.0328]
Steps:  29%|██▉       | 294/1000 [04:46<11:26,  1.03it/s, lr=8.01e-5, step_loss=0.0341]
Steps:  29%|██▉       | 294/1000 [04:47<11:26,  1.03it/s, lr=8.01e-5, step_loss=0.17]  
Steps:  29%|██▉       | 294/1000 [04:47<11:26,  1.03it/s, lr=8.01e-5, step_loss=0.0576]
Steps:  30%|██▉       | 295/1000 [04:47<11:25,  1.03it/s, lr=8.01e-5, step_loss=0.0576]
Steps:  30%|██▉       | 295/1000 [04:47<11:25,  1.03it/s, lr=8e-5, step_loss=0.115]    
Steps:  30%|██▉       | 295/1000 [04:47<11:25,  1.03it/s, lr=8e-5, step_loss=0.186]
Steps:  30%|██▉       | 295/1000 [04:48<11:25,  1.03it/s, lr=8e-5, step_loss=0.396]
Steps:  30%|██▉       | 295/1000 [04:48<11:25,  1.03it/s, lr=8e-5, step_loss=0.257]
Steps:  30%|██▉       | 296/1000 [04:48<11:24,  1.03it/s, lr=8e-5, step_loss=0.257]
Steps:  30%|██▉       | 296/1000 [04:48<11:24,  1.03it/s, lr=7.99e-5, step_loss=0.0229]
Steps:  30%|██▉       | 296/1000 [04:48<11:24,  1.03it/s, lr=7.99e-5, step_loss=0.0666]
Steps:  30%|██▉       | 296/1000 [04:49<11:24,  1.03it/s, lr=7.99e-5, step_loss=0.184] 
Steps:  30%|██▉       | 296/1000 [04:49<11:24,  1.03it/s, lr=7.99e-5, step_loss=0.0451]
Steps:  30%|██▉       | 297/1000 [04:49<11:23,  1.03it/s, lr=7.99e-5, step_loss=0.0451]
Steps:  30%|██▉       | 297/1000 [04:49<11:23,  1.03it/s, lr=7.98e-5, step_loss=0.0118]
Steps:  30%|██▉       | 297/1000 [04:49<11:23,  1.03it/s, lr=7.98e-5, step_loss=0.00842]
Steps:  30%|██▉       | 297/1000 [04:50<11:23,  1.03it/s, lr=7.98e-5, step_loss=0.103]  
Steps:  30%|██▉       | 297/1000 [04:50<11:23,  1.03it/s, lr=7.98e-5, step_loss=0.0579]
Steps:  30%|██▉       | 298/1000 [04:50<11:23,  1.03it/s, lr=7.98e-5, step_loss=0.0579]
Steps:  30%|██▉       | 298/1000 [04:50<11:23,  1.03it/s, lr=7.96e-5, step_loss=0.00595]
Steps:  30%|██▉       | 298/1000 [04:50<11:23,  1.03it/s, lr=7.96e-5, step_loss=0.0447] 
Steps:  30%|██▉       | 298/1000 [04:51<11:23,  1.03it/s, lr=7.96e-5, step_loss=0.198] 
Steps:  30%|██▉       | 298/1000 [04:51<11:23,  1.03it/s, lr=7.96e-5, step_loss=0.112]
Steps:  30%|██▉       | 299/1000 [04:51<11:22,  1.03it/s, lr=7.96e-5, step_loss=0.112]
Steps:  30%|██▉       | 299/1000 [04:51<11:22,  1.03it/s, lr=7.95e-5, step_loss=0.0128]
Steps:  30%|██▉       | 299/1000 [04:51<11:22,  1.03it/s, lr=7.95e-5, step_loss=0.00619]
Steps:  30%|██▉       | 299/1000 [04:52<11:22,  1.03it/s, lr=7.95e-5, step_loss=0.00582]
Steps:  30%|██▉       | 299/1000 [04:52<11:22,  1.03it/s, lr=7.95e-5, step_loss=0.0569] 
Steps:  30%|███       | 300/1000 [04:52<11:21,  1.03it/s, lr=7.95e-5, step_loss=0.0569]
Steps:  30%|███       | 300/1000 [04:52<11:21,  1.03it/s, lr=7.94e-5, step_loss=0.00862]
Steps:  30%|███       | 300/1000 [04:52<11:21,  1.03it/s, lr=7.94e-5, step_loss=0.103]  
Steps:  30%|███       | 300/1000 [04:53<11:21,  1.03it/s, lr=7.94e-5, step_loss=0.0749]
Steps:  30%|███       | 300/1000 [04:53<11:21,  1.03it/s, lr=7.94e-5, step_loss=0.0129]
Steps:  30%|███       | 301/1000 [04:53<11:20,  1.03it/s, lr=7.94e-5, step_loss=0.0129]
Steps:  30%|███       | 301/1000 [04:53<11:20,  1.03it/s, lr=7.93e-5, step_loss=0.0104]
Steps:  30%|███       | 301/1000 [04:53<11:20,  1.03it/s, lr=7.93e-5, step_loss=0.0272]
Steps:  30%|███       | 301/1000 [04:54<11:20,  1.03it/s, lr=7.93e-5, step_loss=0.0672]
Steps:  30%|███       | 301/1000 [04:54<11:20,  1.03it/s, lr=7.93e-5, step_loss=0.0389]
Steps:  30%|███       | 302/1000 [04:54<11:19,  1.03it/s, lr=7.93e-5, step_loss=0.0389]
Steps:  30%|███       | 302/1000 [04:54<11:19,  1.03it/s, lr=7.91e-5, step_loss=0.0323]
Steps:  30%|███       | 302/1000 [04:54<11:19,  1.03it/s, lr=7.91e-5, step_loss=0.0276]
Steps:  30%|███       | 302/1000 [04:55<11:19,  1.03it/s, lr=7.91e-5, step_loss=0.207] 
Steps:  30%|███       | 302/1000 [04:55<11:19,  1.03it/s, lr=7.91e-5, step_loss=0.025]
Steps:  30%|███       | 303/1000 [04:55<11:18,  1.03it/s, lr=7.91e-5, step_loss=0.025]
Steps:  30%|███       | 303/1000 [04:55<11:18,  1.03it/s, lr=7.9e-5, step_loss=0.0433]
Steps:  30%|███       | 303/1000 [04:55<11:18,  1.03it/s, lr=7.9e-5, step_loss=0.0407]
Steps:  30%|███       | 303/1000 [04:55<11:18,  1.03it/s, lr=7.9e-5, step_loss=0.00991]
Steps:  30%|███       | 303/1000 [04:56<11:18,  1.03it/s, lr=7.9e-5, step_loss=0.0784] 
Steps:  30%|███       | 304/1000 [04:56<11:17,  1.03it/s, lr=7.9e-5, step_loss=0.0784]
Steps:  30%|███       | 304/1000 [04:56<11:17,  1.03it/s, lr=7.89e-5, step_loss=0.0417]
Steps:  30%|███       | 304/1000 [04:56<11:17,  1.03it/s, lr=7.89e-5, step_loss=0.277] 
Steps:  30%|███       | 304/1000 [04:56<11:17,  1.03it/s, lr=7.89e-5, step_loss=0.213]
Steps:  30%|███       | 304/1000 [04:57<11:17,  1.03it/s, lr=7.89e-5, step_loss=0.0161]
Steps:  30%|███       | 305/1000 [04:57<11:16,  1.03it/s, lr=7.89e-5, step_loss=0.0161]
Steps:  30%|███       | 305/1000 [04:57<11:16,  1.03it/s, lr=7.88e-5, step_loss=0.209] 
Steps:  31%|███       | 306/1000 [04:57<08:48,  1.31it/s, lr=7.88e-5, step_loss=0.209]
Steps:  31%|███       | 306/1000 [04:57<08:48,  1.31it/s, lr=7.86e-5, step_loss=0.0595]
{'image_encoder', 'requires_safety_checker'} was not found in config. Values will be initialized to default values.
Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
{'timestep_spacing', 'prediction_type'} was not found in config. Values will be initialized to default values.
Loaded scheduler as PNDMScheduler from `scheduler` subfolder of runwayml/stable-diffusion-v1-5.
Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  43%|████▎     | 3/7 [00:00<00:00, 18.25it/s]
Loaded feature_extractor as CLIPImageProcessor from `feature_extractor` subfolder of runwayml/stable-diffusion-v1-5.
{'force_upcast', 'scaling_factor', 'use_post_quant_conv', 'use_quant_conv', 'latents_mean', 'latents_std', 'shift_factor'} was not found in config. Values will be initialized to default values.
Loaded vae as AutoencoderKL from `vae` subfolder of runwayml/stable-diffusion-v1-5.
Loaded safety_checker as StableDiffusionSafetyChecker from `safety_checker` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  86%|████████▌ | 6/7 [00:00<00:00, 14.09it/s]
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 15.37it/s]
07/28/2024 20:41:07 - INFO - __main__ - Running validation...
Generating 4 images with prompt: A naruto with blue eyes..
Steps:  31%|███       | 306/1000 [05:14<08:48,  1.31it/s, lr=7.86e-5, step_loss=0.0105]
Steps:  31%|███       | 306/1000 [05:14<08:48,  1.31it/s, lr=7.86e-5, step_loss=0.0669]
Steps:  31%|███       | 306/1000 [05:15<08:48,  1.31it/s, lr=7.86e-5, step_loss=0.0446]
Steps:  31%|███       | 307/1000 [05:15<1:06:59,  5.80s/it, lr=7.86e-5, step_loss=0.0446]
Steps:  31%|███       | 307/1000 [05:15<1:06:59,  5.80s/it, lr=7.85e-5, step_loss=0.175] 
Steps:  31%|███       | 307/1000 [05:15<1:06:59,  5.80s/it, lr=7.85e-5, step_loss=0.0188]
Steps:  31%|███       | 307/1000 [05:15<1:06:59,  5.80s/it, lr=7.85e-5, step_loss=0.00263]
Steps:  31%|███       | 307/1000 [05:15<1:06:59,  5.80s/it, lr=7.85e-5, step_loss=0.11]   
Steps:  31%|███       | 308/1000 [05:16<50:10,  4.35s/it, lr=7.85e-5, step_loss=0.11]  
Steps:  31%|███       | 308/1000 [05:16<50:10,  4.35s/it, lr=7.84e-5, step_loss=0.178]
Steps:  31%|███       | 308/1000 [05:16<50:10,  4.35s/it, lr=7.84e-5, step_loss=0.136]
Steps:  31%|███       | 308/1000 [05:16<50:10,  4.35s/it, lr=7.84e-5, step_loss=0.0985]
Steps:  31%|███       | 308/1000 [05:16<50:10,  4.35s/it, lr=7.84e-5, step_loss=0.0551]
Steps:  31%|███       | 309/1000 [05:17<38:24,  3.34s/it, lr=7.84e-5, step_loss=0.0551]
Steps:  31%|███       | 309/1000 [05:17<38:24,  3.34s/it, lr=7.82e-5, step_loss=0.0296]
Steps:  31%|███       | 309/1000 [05:17<38:24,  3.34s/it, lr=7.82e-5, step_loss=0.071] 
Steps:  31%|███       | 309/1000 [05:17<38:24,  3.34s/it, lr=7.82e-5, step_loss=0.0737]
Steps:  31%|███       | 309/1000 [05:17<38:24,  3.34s/it, lr=7.82e-5, step_loss=0.195] 
Steps:  31%|███       | 310/1000 [05:18<30:11,  2.63s/it, lr=7.82e-5, step_loss=0.195]
Steps:  31%|███       | 310/1000 [05:18<30:11,  2.63s/it, lr=7.81e-5, step_loss=0.00559]
Steps:  31%|███       | 310/1000 [05:18<30:11,  2.63s/it, lr=7.81e-5, step_loss=0.127]  
Steps:  31%|███       | 310/1000 [05:18<30:11,  2.63s/it, lr=7.81e-5, step_loss=0.109]
Steps:  31%|███       | 310/1000 [05:18<30:11,  2.63s/it, lr=7.81e-5, step_loss=0.227]
Steps:  31%|███       | 311/1000 [05:19<24:26,  2.13s/it, lr=7.81e-5, step_loss=0.227]
Steps:  31%|███       | 311/1000 [05:19<24:26,  2.13s/it, lr=7.8e-5, step_loss=0.296] 
Steps:  31%|███       | 311/1000 [05:19<24:26,  2.13s/it, lr=7.8e-5, step_loss=0.0116]
Steps:  31%|███       | 311/1000 [05:19<24:26,  2.13s/it, lr=7.8e-5, step_loss=0.149] 
Steps:  31%|███       | 311/1000 [05:19<24:26,  2.13s/it, lr=7.8e-5, step_loss=0.293]
Steps:  31%|███       | 312/1000 [05:20<20:25,  1.78s/it, lr=7.8e-5, step_loss=0.293]
Steps:  31%|███       | 312/1000 [05:20<20:25,  1.78s/it, lr=7.78e-5, step_loss=0.0586]
Steps:  31%|███       | 312/1000 [05:20<20:25,  1.78s/it, lr=7.78e-5, step_loss=0.215] 
Steps:  31%|███       | 312/1000 [05:20<20:25,  1.78s/it, lr=7.78e-5, step_loss=0.0205]
Steps:  31%|███       | 312/1000 [05:20<20:25,  1.78s/it, lr=7.78e-5, step_loss=0.228] 
Steps:  31%|███▏      | 313/1000 [05:21<17:36,  1.54s/it, lr=7.78e-5, step_loss=0.228]
Steps:  31%|███▏      | 313/1000 [05:21<17:36,  1.54s/it, lr=7.77e-5, step_loss=0.187]
Steps:  31%|███▏      | 313/1000 [05:21<17:36,  1.54s/it, lr=7.77e-5, step_loss=0.00925]
Steps:  31%|███▏      | 313/1000 [05:21<17:36,  1.54s/it, lr=7.77e-5, step_loss=0.0643] 
Steps:  31%|███▏      | 313/1000 [05:21<17:36,  1.54s/it, lr=7.77e-5, step_loss=0.0132]
Steps:  31%|███▏      | 314/1000 [05:22<15:39,  1.37s/it, lr=7.77e-5, step_loss=0.0132]
Steps:  31%|███▏      | 314/1000 [05:22<15:39,  1.37s/it, lr=7.76e-5, step_loss=0.0404]
Steps:  31%|███▏      | 314/1000 [05:22<15:39,  1.37s/it, lr=7.76e-5, step_loss=0.0882]
Steps:  31%|███▏      | 314/1000 [05:22<15:39,  1.37s/it, lr=7.76e-5, step_loss=0.159] 
Steps:  31%|███▏      | 314/1000 [05:22<15:39,  1.37s/it, lr=7.76e-5, step_loss=0.11] 
Steps:  32%|███▏      | 315/1000 [05:22<14:16,  1.25s/it, lr=7.76e-5, step_loss=0.11]
Steps:  32%|███▏      | 315/1000 [05:23<14:16,  1.25s/it, lr=7.75e-5, step_loss=0.0181]
Steps:  32%|███▏      | 315/1000 [05:23<14:16,  1.25s/it, lr=7.75e-5, step_loss=0.106] 
Steps:  32%|███▏      | 315/1000 [05:23<14:16,  1.25s/it, lr=7.75e-5, step_loss=0.0517]
Steps:  32%|███▏      | 315/1000 [05:23<14:16,  1.25s/it, lr=7.75e-5, step_loss=0.0689]
Steps:  32%|███▏      | 316/1000 [05:23<13:17,  1.17s/it, lr=7.75e-5, step_loss=0.0689]
Steps:  32%|███▏      | 316/1000 [05:23<13:17,  1.17s/it, lr=7.73e-5, step_loss=0.0483]
Steps:  32%|███▏      | 316/1000 [05:24<13:17,  1.17s/it, lr=7.73e-5, step_loss=0.0461]
Steps:  32%|███▏      | 316/1000 [05:24<13:17,  1.17s/it, lr=7.73e-5, step_loss=0.0226]
Steps:  32%|███▏      | 316/1000 [05:24<13:17,  1.17s/it, lr=7.73e-5, step_loss=0.265] 
Steps:  32%|███▏      | 317/1000 [05:24<12:36,  1.11s/it, lr=7.73e-5, step_loss=0.265]
Steps:  32%|███▏      | 317/1000 [05:24<12:36,  1.11s/it, lr=7.72e-5, step_loss=0.176]
Steps:  32%|███▏      | 317/1000 [05:25<12:36,  1.11s/it, lr=7.72e-5, step_loss=0.0894]
Steps:  32%|███▏      | 317/1000 [05:25<12:36,  1.11s/it, lr=7.72e-5, step_loss=0.402] 
Steps:  32%|███▏      | 317/1000 [05:25<12:36,  1.11s/it, lr=7.72e-5, step_loss=0.0293]
Steps:  32%|███▏      | 318/1000 [05:25<12:07,  1.07s/it, lr=7.72e-5, step_loss=0.0293]
Steps:  32%|███▏      | 318/1000 [05:25<12:07,  1.07s/it, lr=7.71e-5, step_loss=0.0532]
Steps:  32%|███▏      | 318/1000 [05:26<12:07,  1.07s/it, lr=7.71e-5, step_loss=0.00221]
Steps:  32%|███▏      | 318/1000 [05:26<12:07,  1.07s/it, lr=7.71e-5, step_loss=0.05]   
Steps:  32%|███▏      | 318/1000 [05:26<12:07,  1.07s/it, lr=7.71e-5, step_loss=0.125]
Steps:  32%|███▏      | 319/1000 [05:26<11:46,  1.04s/it, lr=7.71e-5, step_loss=0.125]
Steps:  32%|███▏      | 319/1000 [05:26<11:46,  1.04s/it, lr=7.69e-5, step_loss=0.129]
Steps:  32%|███▏      | 319/1000 [05:27<11:46,  1.04s/it, lr=7.69e-5, step_loss=0.00773]
Steps:  32%|███▏      | 319/1000 [05:27<11:46,  1.04s/it, lr=7.69e-5, step_loss=0.0759] 
Steps:  32%|███▏      | 319/1000 [05:27<11:46,  1.04s/it, lr=7.69e-5, step_loss=0.0154]
Steps:  32%|███▏      | 320/1000 [05:27<11:32,  1.02s/it, lr=7.69e-5, step_loss=0.0154]
Steps:  32%|███▏      | 320/1000 [05:27<11:32,  1.02s/it, lr=7.68e-5, step_loss=0.139] 
Steps:  32%|███▏      | 320/1000 [05:28<11:32,  1.02s/it, lr=7.68e-5, step_loss=0.0257]
Steps:  32%|███▏      | 320/1000 [05:28<11:32,  1.02s/it, lr=7.68e-5, step_loss=0.0279]
Steps:  32%|███▏      | 320/1000 [05:28<11:32,  1.02s/it, lr=7.68e-5, step_loss=0.0899]
Steps:  32%|███▏      | 321/1000 [05:28<11:21,  1.00s/it, lr=7.68e-5, step_loss=0.0899]
Steps:  32%|███▏      | 321/1000 [05:28<11:21,  1.00s/it, lr=7.67e-5, step_loss=0.378] 
Steps:  32%|███▏      | 321/1000 [05:29<11:21,  1.00s/it, lr=7.67e-5, step_loss=0.197]
Steps:  32%|███▏      | 321/1000 [05:29<11:21,  1.00s/it, lr=7.67e-5, step_loss=0.53] 
Steps:  32%|███▏      | 321/1000 [05:29<11:21,  1.00s/it, lr=7.67e-5, step_loss=0.143]
Steps:  32%|███▏      | 322/1000 [05:29<11:14,  1.01it/s, lr=7.67e-5, step_loss=0.143]
Steps:  32%|███▏      | 322/1000 [05:29<11:14,  1.01it/s, lr=7.65e-5, step_loss=0.0853]
Steps:  32%|███▏      | 322/1000 [05:30<11:14,  1.01it/s, lr=7.65e-5, step_loss=0.0303]
Steps:  32%|███▏      | 322/1000 [05:30<11:14,  1.01it/s, lr=7.65e-5, step_loss=0.0724]
Steps:  32%|███▏      | 322/1000 [05:30<11:14,  1.01it/s, lr=7.65e-5, step_loss=0.0382]
Steps:  32%|███▏      | 323/1000 [05:30<11:08,  1.01it/s, lr=7.65e-5, step_loss=0.0382]
Steps:  32%|███▏      | 323/1000 [05:30<11:08,  1.01it/s, lr=7.64e-5, step_loss=0.00649]
Steps:  32%|███▏      | 323/1000 [05:31<11:08,  1.01it/s, lr=7.64e-5, step_loss=0.16]   
Steps:  32%|███▏      | 323/1000 [05:31<11:08,  1.01it/s, lr=7.64e-5, step_loss=0.0473]
Steps:  32%|███▏      | 323/1000 [05:31<11:08,  1.01it/s, lr=7.64e-5, step_loss=0.12]  
Steps:  32%|███▏      | 324/1000 [05:31<11:04,  1.02it/s, lr=7.64e-5, step_loss=0.12]
Steps:  32%|███▏      | 324/1000 [05:31<11:04,  1.02it/s, lr=7.63e-5, step_loss=0.179]
Steps:  32%|███▏      | 324/1000 [05:32<11:04,  1.02it/s, lr=7.63e-5, step_loss=0.0175]
Steps:  32%|███▏      | 324/1000 [05:32<11:04,  1.02it/s, lr=7.63e-5, step_loss=0.0131]
Steps:  32%|███▏      | 324/1000 [05:32<11:04,  1.02it/s, lr=7.63e-5, step_loss=0.128] 
Steps:  32%|███▎      | 325/1000 [05:32<11:00,  1.02it/s, lr=7.63e-5, step_loss=0.128]
Steps:  32%|███▎      | 325/1000 [05:32<11:00,  1.02it/s, lr=7.61e-5, step_loss=0.115]
Steps:  32%|███▎      | 325/1000 [05:32<11:00,  1.02it/s, lr=7.61e-5, step_loss=0.00812]
Steps:  32%|███▎      | 325/1000 [05:33<11:00,  1.02it/s, lr=7.61e-5, step_loss=0.00299]
Steps:  32%|███▎      | 325/1000 [05:33<11:00,  1.02it/s, lr=7.61e-5, step_loss=0.00389]
Steps:  33%|███▎      | 326/1000 [05:33<10:58,  1.02it/s, lr=7.61e-5, step_loss=0.00389]
Steps:  33%|███▎      | 326/1000 [05:33<10:58,  1.02it/s, lr=7.6e-5, step_loss=0.129]   
Steps:  33%|███▎      | 326/1000 [05:33<10:58,  1.02it/s, lr=7.6e-5, step_loss=0.00966]
Steps:  33%|███▎      | 326/1000 [05:34<10:58,  1.02it/s, lr=7.6e-5, step_loss=0.265]  
Steps:  33%|███▎      | 326/1000 [05:34<10:58,  1.02it/s, lr=7.6e-5, step_loss=0.148]
Steps:  33%|███▎      | 327/1000 [05:34<10:56,  1.03it/s, lr=7.6e-5, step_loss=0.148]
Steps:  33%|███▎      | 327/1000 [05:34<10:56,  1.03it/s, lr=7.59e-5, step_loss=0.0163]
Steps:  33%|███▎      | 327/1000 [05:34<10:56,  1.03it/s, lr=7.59e-5, step_loss=0.0104]
Steps:  33%|███▎      | 327/1000 [05:35<10:56,  1.03it/s, lr=7.59e-5, step_loss=0.0254]
Steps:  33%|███▎      | 327/1000 [05:35<10:56,  1.03it/s, lr=7.59e-5, step_loss=0.0886]
Steps:  33%|███▎      | 328/1000 [05:35<10:55,  1.03it/s, lr=7.59e-5, step_loss=0.0886]
Steps:  33%|███▎      | 328/1000 [05:35<10:55,  1.03it/s, lr=7.57e-5, step_loss=0.583] 
Steps:  33%|███▎      | 328/1000 [05:35<10:55,  1.03it/s, lr=7.57e-5, step_loss=0.0258]
Steps:  33%|███▎      | 328/1000 [05:36<10:55,  1.03it/s, lr=7.57e-5, step_loss=0.036] 
Steps:  33%|███▎      | 328/1000 [05:36<10:55,  1.03it/s, lr=7.57e-5, step_loss=0.00387]
Steps:  33%|███▎      | 329/1000 [05:36<10:54,  1.03it/s, lr=7.57e-5, step_loss=0.00387]
Steps:  33%|███▎      | 329/1000 [05:36<10:54,  1.03it/s, lr=7.56e-5, step_loss=0.0336] 
Steps:  33%|███▎      | 329/1000 [05:36<10:54,  1.03it/s, lr=7.56e-5, step_loss=0.0166]
Steps:  33%|███▎      | 329/1000 [05:37<10:54,  1.03it/s, lr=7.56e-5, step_loss=0.00358]
Steps:  33%|███▎      | 329/1000 [05:37<10:54,  1.03it/s, lr=7.56e-5, step_loss=0.117]  
Steps:  33%|███▎      | 330/1000 [05:37<10:53,  1.03it/s, lr=7.56e-5, step_loss=0.117]
Steps:  33%|███▎      | 330/1000 [05:37<10:53,  1.03it/s, lr=7.55e-5, step_loss=0.00768]
Steps:  33%|███▎      | 330/1000 [05:37<10:53,  1.03it/s, lr=7.55e-5, step_loss=0.113]  
Steps:  33%|███▎      | 330/1000 [05:38<10:53,  1.03it/s, lr=7.55e-5, step_loss=0.108]
Steps:  33%|███▎      | 330/1000 [05:38<10:53,  1.03it/s, lr=7.55e-5, step_loss=0.00339]
Steps:  33%|███▎      | 331/1000 [05:38<10:51,  1.03it/s, lr=7.55e-5, step_loss=0.00339]
Steps:  33%|███▎      | 331/1000 [05:38<10:51,  1.03it/s, lr=7.53e-5, step_loss=0.00368]
Steps:  33%|███▎      | 331/1000 [05:38<10:51,  1.03it/s, lr=7.53e-5, step_loss=0.215]  
Steps:  33%|███▎      | 331/1000 [05:39<10:51,  1.03it/s, lr=7.53e-5, step_loss=0.0496]
Steps:  33%|███▎      | 331/1000 [05:39<10:51,  1.03it/s, lr=7.53e-5, step_loss=0.0204]
Steps:  33%|███▎      | 332/1000 [05:39<10:51,  1.03it/s, lr=7.53e-5, step_loss=0.0204]
Steps:  33%|███▎      | 332/1000 [05:39<10:51,  1.03it/s, lr=7.52e-5, step_loss=0.00218]
Steps:  33%|███▎      | 332/1000 [05:39<10:51,  1.03it/s, lr=7.52e-5, step_loss=0.155]  
Steps:  33%|███▎      | 332/1000 [05:40<10:51,  1.03it/s, lr=7.52e-5, step_loss=0.00834]
Steps:  33%|███▎      | 332/1000 [05:40<10:51,  1.03it/s, lr=7.52e-5, step_loss=0.00512]
Steps:  33%|███▎      | 333/1000 [05:40<10:50,  1.03it/s, lr=7.52e-5, step_loss=0.00512]
Steps:  33%|███▎      | 333/1000 [05:40<10:50,  1.03it/s, lr=7.5e-5, step_loss=0.0445]  
Steps:  33%|███▎      | 333/1000 [05:40<10:50,  1.03it/s, lr=7.5e-5, step_loss=0.0218]
Steps:  33%|███▎      | 333/1000 [05:41<10:50,  1.03it/s, lr=7.5e-5, step_loss=0.251] 
Steps:  33%|███▎      | 333/1000 [05:41<10:50,  1.03it/s, lr=7.5e-5, step_loss=0.0463]
Steps:  33%|███▎      | 334/1000 [05:41<10:48,  1.03it/s, lr=7.5e-5, step_loss=0.0463]
Steps:  33%|███▎      | 334/1000 [05:41<10:48,  1.03it/s, lr=7.49e-5, step_loss=0.0873]
Steps:  33%|███▎      | 334/1000 [05:41<10:48,  1.03it/s, lr=7.49e-5, step_loss=0.0242]
Steps:  33%|███▎      | 334/1000 [05:41<10:48,  1.03it/s, lr=7.49e-5, step_loss=0.0718]
Steps:  33%|███▎      | 334/1000 [05:42<10:48,  1.03it/s, lr=7.49e-5, step_loss=0.0711]
Steps:  34%|███▎      | 335/1000 [05:42<10:47,  1.03it/s, lr=7.49e-5, step_loss=0.0711]
Steps:  34%|███▎      | 335/1000 [05:42<10:47,  1.03it/s, lr=7.48e-5, step_loss=0.0047]
Steps:  34%|███▎      | 335/1000 [05:42<10:47,  1.03it/s, lr=7.48e-5, step_loss=0.0674]
Steps:  34%|███▎      | 335/1000 [05:42<10:47,  1.03it/s, lr=7.48e-5, step_loss=0.00698]
Steps:  34%|███▎      | 335/1000 [05:43<10:47,  1.03it/s, lr=7.48e-5, step_loss=0.176]  
Steps:  34%|███▎      | 336/1000 [05:43<10:46,  1.03it/s, lr=7.48e-5, step_loss=0.176]
Steps:  34%|███▎      | 336/1000 [05:43<10:46,  1.03it/s, lr=7.46e-5, step_loss=0.121]
Steps:  34%|███▎      | 336/1000 [05:43<10:46,  1.03it/s, lr=7.46e-5, step_loss=0.035]
Steps:  34%|███▎      | 336/1000 [05:43<10:46,  1.03it/s, lr=7.46e-5, step_loss=0.107]
Steps:  34%|███▎      | 336/1000 [05:44<10:46,  1.03it/s, lr=7.46e-5, step_loss=0.074]
Steps:  34%|███▎      | 337/1000 [05:44<10:45,  1.03it/s, lr=7.46e-5, step_loss=0.074]
Steps:  34%|███▎      | 337/1000 [05:44<10:45,  1.03it/s, lr=7.45e-5, step_loss=0.0445]
Steps:  34%|███▎      | 337/1000 [05:44<10:45,  1.03it/s, lr=7.45e-5, step_loss=0.0458]
Steps:  34%|███▎      | 337/1000 [05:44<10:45,  1.03it/s, lr=7.45e-5, step_loss=0.139] 
Steps:  34%|███▎      | 337/1000 [05:45<10:45,  1.03it/s, lr=7.45e-5, step_loss=0.087]
Steps:  34%|███▍      | 338/1000 [05:45<10:45,  1.03it/s, lr=7.45e-5, step_loss=0.087]
Steps:  34%|███▍      | 338/1000 [05:45<10:45,  1.03it/s, lr=7.44e-5, step_loss=0.119]
Steps:  34%|███▍      | 338/1000 [05:45<10:45,  1.03it/s, lr=7.44e-5, step_loss=0.11] 
Steps:  34%|███▍      | 338/1000 [05:45<10:45,  1.03it/s, lr=7.44e-5, step_loss=0.0549]
Steps:  34%|███▍      | 338/1000 [05:46<10:45,  1.03it/s, lr=7.44e-5, step_loss=0.141] 
Steps:  34%|███▍      | 339/1000 [05:46<10:44,  1.03it/s, lr=7.44e-5, step_loss=0.141]
Steps:  34%|███▍      | 339/1000 [05:46<10:44,  1.03it/s, lr=7.42e-5, step_loss=0.0567]
Steps:  34%|███▍      | 339/1000 [05:46<10:44,  1.03it/s, lr=7.42e-5, step_loss=0.0273]
Steps:  34%|███▍      | 339/1000 [05:46<10:44,  1.03it/s, lr=7.42e-5, step_loss=0.0826]
Steps:  34%|███▍      | 339/1000 [05:47<10:44,  1.03it/s, lr=7.42e-5, step_loss=0.0231]
Steps:  34%|███▍      | 340/1000 [05:47<10:43,  1.03it/s, lr=7.42e-5, step_loss=0.0231]
Steps:  34%|███▍      | 340/1000 [05:47<10:43,  1.03it/s, lr=7.41e-5, step_loss=0.0446]
Steps:  34%|███▍      | 340/1000 [05:47<10:43,  1.03it/s, lr=7.41e-5, step_loss=0.243] 
Steps:  34%|███▍      | 340/1000 [05:47<10:43,  1.03it/s, lr=7.41e-5, step_loss=0.0077]
Steps:  34%|███▍      | 340/1000 [05:48<10:43,  1.03it/s, lr=7.41e-5, step_loss=0.205] 
Steps:  34%|███▍      | 341/1000 [05:48<10:42,  1.03it/s, lr=7.41e-5, step_loss=0.205]
Steps:  34%|███▍      | 341/1000 [05:48<10:42,  1.03it/s, lr=7.39e-5, step_loss=0.149]
Steps:  34%|███▍      | 341/1000 [05:48<10:42,  1.03it/s, lr=7.39e-5, step_loss=0.287]
Steps:  34%|███▍      | 341/1000 [05:48<10:42,  1.03it/s, lr=7.39e-5, step_loss=0.143]
Steps:  34%|███▍      | 341/1000 [05:49<10:42,  1.03it/s, lr=7.39e-5, step_loss=0.0985]
Steps:  34%|███▍      | 342/1000 [05:49<10:40,  1.03it/s, lr=7.39e-5, step_loss=0.0985]
Steps:  34%|███▍      | 342/1000 [05:49<10:40,  1.03it/s, lr=7.38e-5, step_loss=0.045] 
Steps:  34%|███▍      | 342/1000 [05:49<10:40,  1.03it/s, lr=7.38e-5, step_loss=0.234]
Steps:  34%|███▍      | 342/1000 [05:49<10:40,  1.03it/s, lr=7.38e-5, step_loss=0.0284]
Steps:  34%|███▍      | 342/1000 [05:50<10:40,  1.03it/s, lr=7.38e-5, step_loss=0.0425]
Steps:  34%|███▍      | 343/1000 [05:50<10:39,  1.03it/s, lr=7.38e-5, step_loss=0.0425]
Steps:  34%|███▍      | 343/1000 [05:50<10:39,  1.03it/s, lr=7.37e-5, step_loss=0.00321]
Steps:  34%|███▍      | 343/1000 [05:50<10:39,  1.03it/s, lr=7.37e-5, step_loss=0.212]  
Steps:  34%|███▍      | 343/1000 [05:50<10:39,  1.03it/s, lr=7.37e-5, step_loss=0.0205]
Steps:  34%|███▍      | 343/1000 [05:50<10:39,  1.03it/s, lr=7.37e-5, step_loss=0.0187]
Steps:  34%|███▍      | 344/1000 [05:51<10:38,  1.03it/s, lr=7.37e-5, step_loss=0.0187]
Steps:  34%|███▍      | 344/1000 [05:51<10:38,  1.03it/s, lr=7.35e-5, step_loss=0.00535]
Steps:  34%|███▍      | 344/1000 [05:51<10:38,  1.03it/s, lr=7.35e-5, step_loss=0.00365]
Steps:  34%|███▍      | 344/1000 [05:51<10:38,  1.03it/s, lr=7.35e-5, step_loss=0.353]  
Steps:  34%|███▍      | 344/1000 [05:51<10:38,  1.03it/s, lr=7.35e-5, step_loss=0.0564]
Steps:  34%|███▍      | 345/1000 [05:52<10:37,  1.03it/s, lr=7.35e-5, step_loss=0.0564]
Steps:  34%|███▍      | 345/1000 [05:52<10:37,  1.03it/s, lr=7.34e-5, step_loss=0.022] 
Steps:  34%|███▍      | 345/1000 [05:52<10:37,  1.03it/s, lr=7.34e-5, step_loss=0.00403]
Steps:  34%|███▍      | 345/1000 [05:52<10:37,  1.03it/s, lr=7.34e-5, step_loss=0.0773] 
Steps:  34%|███▍      | 345/1000 [05:52<10:37,  1.03it/s, lr=7.34e-5, step_loss=0.0334]
Steps:  35%|███▍      | 346/1000 [05:53<10:37,  1.03it/s, lr=7.34e-5, step_loss=0.0334]
Steps:  35%|███▍      | 346/1000 [05:53<10:37,  1.03it/s, lr=7.33e-5, step_loss=0.0192]
Steps:  35%|███▍      | 346/1000 [05:53<10:37,  1.03it/s, lr=7.33e-5, step_loss=0.0196]
Steps:  35%|███▍      | 346/1000 [05:53<10:37,  1.03it/s, lr=7.33e-5, step_loss=0.051] 
Steps:  35%|███▍      | 346/1000 [05:53<10:37,  1.03it/s, lr=7.33e-5, step_loss=0.00558]
Steps:  35%|███▍      | 347/1000 [05:54<10:35,  1.03it/s, lr=7.33e-5, step_loss=0.00558]
Steps:  35%|███▍      | 347/1000 [05:54<10:35,  1.03it/s, lr=7.31e-5, step_loss=0.0204] 
Steps:  35%|███▍      | 347/1000 [05:54<10:35,  1.03it/s, lr=7.31e-5, step_loss=0.159] 
Steps:  35%|███▍      | 347/1000 [05:54<10:35,  1.03it/s, lr=7.31e-5, step_loss=0.071]
Steps:  35%|███▍      | 347/1000 [05:54<10:35,  1.03it/s, lr=7.31e-5, step_loss=0.0836]
Steps:  35%|███▍      | 348/1000 [05:55<10:35,  1.03it/s, lr=7.31e-5, step_loss=0.0836]
Steps:  35%|███▍      | 348/1000 [05:55<10:35,  1.03it/s, lr=7.3e-5, step_loss=0.153]  
Steps:  35%|███▍      | 348/1000 [05:55<10:35,  1.03it/s, lr=7.3e-5, step_loss=0.319]
Steps:  35%|███▍      | 348/1000 [05:55<10:35,  1.03it/s, lr=7.3e-5, step_loss=0.0277]
Steps:  35%|███▍      | 348/1000 [05:55<10:35,  1.03it/s, lr=7.3e-5, step_loss=0.00488]
Steps:  35%|███▍      | 349/1000 [05:56<10:34,  1.03it/s, lr=7.3e-5, step_loss=0.00488]
Steps:  35%|███▍      | 349/1000 [05:56<10:34,  1.03it/s, lr=7.28e-5, step_loss=0.00342]
Steps:  35%|███▍      | 349/1000 [05:56<10:34,  1.03it/s, lr=7.28e-5, step_loss=0.00863]
Steps:  35%|███▍      | 349/1000 [05:56<10:34,  1.03it/s, lr=7.28e-5, step_loss=0.101]  
Steps:  35%|███▍      | 349/1000 [05:56<10:34,  1.03it/s, lr=7.28e-5, step_loss=0.0617]
Steps:  35%|███▌      | 350/1000 [05:57<10:32,  1.03it/s, lr=7.28e-5, step_loss=0.0617]
Steps:  35%|███▌      | 350/1000 [05:57<10:32,  1.03it/s, lr=7.27e-5, step_loss=0.00304]
Steps:  35%|███▌      | 350/1000 [05:57<10:32,  1.03it/s, lr=7.27e-5, step_loss=0.00432]
Steps:  35%|███▌      | 350/1000 [05:57<10:32,  1.03it/s, lr=7.27e-5, step_loss=0.43]   
Steps:  35%|███▌      | 350/1000 [05:57<10:32,  1.03it/s, lr=7.27e-5, step_loss=0.104]
Steps:  35%|███▌      | 351/1000 [05:58<10:32,  1.03it/s, lr=7.27e-5, step_loss=0.104]
Steps:  35%|███▌      | 351/1000 [05:58<10:32,  1.03it/s, lr=7.26e-5, step_loss=0.171]
Steps:  35%|███▌      | 351/1000 [05:58<10:32,  1.03it/s, lr=7.26e-5, step_loss=0.0124]
Steps:  35%|███▌      | 351/1000 [05:58<10:32,  1.03it/s, lr=7.26e-5, step_loss=0.024] 
Steps:  35%|███▌      | 351/1000 [05:58<10:32,  1.03it/s, lr=7.26e-5, step_loss=0.0774]
Steps:  35%|███▌      | 352/1000 [05:58<10:30,  1.03it/s, lr=7.26e-5, step_loss=0.0774]
Steps:  35%|███▌      | 352/1000 [05:59<10:30,  1.03it/s, lr=7.24e-5, step_loss=0.074] 
Steps:  35%|███▌      | 352/1000 [05:59<10:30,  1.03it/s, lr=7.24e-5, step_loss=0.0238]
Steps:  35%|███▌      | 352/1000 [05:59<10:30,  1.03it/s, lr=7.24e-5, step_loss=0.124] 
Steps:  35%|███▌      | 352/1000 [05:59<10:30,  1.03it/s, lr=7.24e-5, step_loss=0.0212]
Steps:  35%|███▌      | 353/1000 [05:59<10:29,  1.03it/s, lr=7.24e-5, step_loss=0.0212]
Steps:  35%|███▌      | 353/1000 [05:59<10:29,  1.03it/s, lr=7.23e-5, step_loss=0.0613]
Steps:  35%|███▌      | 353/1000 [06:00<10:29,  1.03it/s, lr=7.23e-5, step_loss=0.198] 
Steps:  35%|███▌      | 353/1000 [06:00<10:29,  1.03it/s, lr=7.23e-5, step_loss=0.0803]
Steps:  35%|███▌      | 353/1000 [06:00<10:29,  1.03it/s, lr=7.23e-5, step_loss=0.0702]
Steps:  35%|███▌      | 354/1000 [06:00<10:29,  1.03it/s, lr=7.23e-5, step_loss=0.0702]
Steps:  35%|███▌      | 354/1000 [06:00<10:29,  1.03it/s, lr=7.21e-5, step_loss=0.202] 
Steps:  35%|███▌      | 354/1000 [06:01<10:29,  1.03it/s, lr=7.21e-5, step_loss=0.0916]
Steps:  35%|███▌      | 354/1000 [06:01<10:29,  1.03it/s, lr=7.21e-5, step_loss=0.0309]
Steps:  35%|███▌      | 354/1000 [06:01<10:29,  1.03it/s, lr=7.21e-5, step_loss=0.326] 
Steps:  36%|███▌      | 355/1000 [06:01<10:27,  1.03it/s, lr=7.21e-5, step_loss=0.326]
Steps:  36%|███▌      | 355/1000 [06:01<10:27,  1.03it/s, lr=7.2e-5, step_loss=0.211] 
Steps:  36%|███▌      | 355/1000 [06:02<10:27,  1.03it/s, lr=7.2e-5, step_loss=0.0535]
Steps:  36%|███▌      | 355/1000 [06:02<10:27,  1.03it/s, lr=7.2e-5, step_loss=0.0283]
Steps:  36%|███▌      | 355/1000 [06:02<10:27,  1.03it/s, lr=7.2e-5, step_loss=0.106] 
Steps:  36%|███▌      | 356/1000 [06:02<10:26,  1.03it/s, lr=7.2e-5, step_loss=0.106]
Steps:  36%|███▌      | 356/1000 [06:02<10:26,  1.03it/s, lr=7.19e-5, step_loss=0.0147]
Steps:  36%|███▌      | 356/1000 [06:03<10:26,  1.03it/s, lr=7.19e-5, step_loss=0.0765]
Steps:  36%|███▌      | 356/1000 [06:03<10:26,  1.03it/s, lr=7.19e-5, step_loss=0.0437]
Steps:  36%|███▌      | 356/1000 [06:03<10:26,  1.03it/s, lr=7.19e-5, step_loss=0.0429]
Steps:  36%|███▌      | 357/1000 [06:03<10:25,  1.03it/s, lr=7.19e-5, step_loss=0.0429]
Steps:  36%|███▌      | 357/1000 [06:03<10:25,  1.03it/s, lr=7.17e-5, step_loss=0.0535]
Steps:  36%|███▌      | 357/1000 [06:04<10:25,  1.03it/s, lr=7.17e-5, step_loss=0.0816]
Steps:  36%|███▌      | 357/1000 [06:04<10:25,  1.03it/s, lr=7.17e-5, step_loss=0.00493]
Steps:  36%|███▌      | 357/1000 [06:04<10:25,  1.03it/s, lr=7.17e-5, step_loss=0.00867]
Steps:  36%|███▌      | 358/1000 [06:04<10:24,  1.03it/s, lr=7.17e-5, step_loss=0.00867]
Steps:  36%|███▌      | 358/1000 [06:04<10:24,  1.03it/s, lr=7.16e-5, step_loss=0.00927]
Steps:  36%|███▌      | 358/1000 [06:05<10:24,  1.03it/s, lr=7.16e-5, step_loss=0.0347] 
Steps:  36%|███▌      | 358/1000 [06:05<10:24,  1.03it/s, lr=7.16e-5, step_loss=0.00483]
Steps:  36%|███▌      | 358/1000 [06:05<10:24,  1.03it/s, lr=7.16e-5, step_loss=0.00831]
Steps:  36%|███▌      | 359/1000 [06:05<10:24,  1.03it/s, lr=7.16e-5, step_loss=0.00831]
Steps:  36%|███▌      | 359/1000 [06:05<10:24,  1.03it/s, lr=7.14e-5, step_loss=0.105]  
Steps:  36%|███▌      | 359/1000 [06:06<10:24,  1.03it/s, lr=7.14e-5, step_loss=0.115]
Steps:  36%|███▌      | 359/1000 [06:06<10:24,  1.03it/s, lr=7.14e-5, step_loss=0.0172]
Steps:  36%|███▌      | 359/1000 [06:06<10:24,  1.03it/s, lr=7.14e-5, step_loss=0.102] 
Steps:  36%|███▌      | 360/1000 [06:06<10:23,  1.03it/s, lr=7.14e-5, step_loss=0.102]
Steps:  36%|███▌      | 360/1000 [06:06<10:23,  1.03it/s, lr=7.13e-5, step_loss=0.0812]
Steps:  36%|███▌      | 360/1000 [06:07<10:23,  1.03it/s, lr=7.13e-5, step_loss=0.0587]
Steps:  36%|███▌      | 360/1000 [06:07<10:23,  1.03it/s, lr=7.13e-5, step_loss=0.118] 
Steps:  36%|███▌      | 360/1000 [06:07<10:23,  1.03it/s, lr=7.13e-5, step_loss=0.0459]
Steps:  36%|███▌      | 361/1000 [06:07<10:22,  1.03it/s, lr=7.13e-5, step_loss=0.0459]
Steps:  36%|███▌      | 361/1000 [06:07<10:22,  1.03it/s, lr=7.11e-5, step_loss=0.0643]
Steps:  36%|███▌      | 361/1000 [06:08<10:22,  1.03it/s, lr=7.11e-5, step_loss=0.0534]
Steps:  36%|███▌      | 361/1000 [06:08<10:22,  1.03it/s, lr=7.11e-5, step_loss=0.00596]
Steps:  36%|███▌      | 361/1000 [06:08<10:22,  1.03it/s, lr=7.11e-5, step_loss=0.0536] 
Steps:  36%|███▌      | 362/1000 [06:08<10:21,  1.03it/s, lr=7.11e-5, step_loss=0.0536]
Steps:  36%|███▌      | 362/1000 [06:08<10:21,  1.03it/s, lr=7.1e-5, step_loss=0.197]  
Steps:  36%|███▌      | 362/1000 [06:09<10:21,  1.03it/s, lr=7.1e-5, step_loss=0.0147]
Steps:  36%|███▌      | 362/1000 [06:09<10:21,  1.03it/s, lr=7.1e-5, step_loss=0.0309]
Steps:  36%|███▌      | 362/1000 [06:09<10:21,  1.03it/s, lr=7.1e-5, step_loss=0.0379]
Steps:  36%|███▋      | 363/1000 [06:09<10:22,  1.02it/s, lr=7.1e-5, step_loss=0.0379]
Steps:  36%|███▋      | 363/1000 [06:09<10:22,  1.02it/s, lr=7.09e-5, step_loss=0.322]
Steps:  36%|███▋      | 363/1000 [06:09<10:22,  1.02it/s, lr=7.09e-5, step_loss=0.0475]
Steps:  36%|███▋      | 363/1000 [06:10<10:22,  1.02it/s, lr=7.09e-5, step_loss=0.122] 
Steps:  36%|███▋      | 363/1000 [06:10<10:22,  1.02it/s, lr=7.09e-5, step_loss=0.152]
Steps:  36%|███▋      | 364/1000 [06:10<10:20,  1.02it/s, lr=7.09e-5, step_loss=0.152]
Steps:  36%|███▋      | 364/1000 [06:10<10:20,  1.02it/s, lr=7.07e-5, step_loss=0.249]
Steps:  36%|███▋      | 364/1000 [06:10<10:20,  1.02it/s, lr=7.07e-5, step_loss=0.0633]
Steps:  36%|███▋      | 364/1000 [06:11<10:20,  1.02it/s, lr=7.07e-5, step_loss=0.123] 
Steps:  36%|███▋      | 364/1000 [06:11<10:20,  1.02it/s, lr=7.07e-5, step_loss=0.257]
Steps:  36%|███▋      | 365/1000 [06:11<10:19,  1.03it/s, lr=7.07e-5, step_loss=0.257]
Steps:  36%|███▋      | 365/1000 [06:11<10:19,  1.03it/s, lr=7.06e-5, step_loss=0.129]
Steps:  36%|███▋      | 365/1000 [06:11<10:19,  1.03it/s, lr=7.06e-5, step_loss=0.0809]
Steps:  36%|███▋      | 365/1000 [06:12<10:19,  1.03it/s, lr=7.06e-5, step_loss=0.067] 
Steps:  36%|███▋      | 365/1000 [06:12<10:19,  1.03it/s, lr=7.06e-5, step_loss=0.00857]
Steps:  37%|███▋      | 366/1000 [06:12<10:18,  1.03it/s, lr=7.06e-5, step_loss=0.00857]
Steps:  37%|███▋      | 366/1000 [06:12<10:18,  1.03it/s, lr=7.04e-5, step_loss=0.0628] 
Steps:  37%|███▋      | 366/1000 [06:12<10:18,  1.03it/s, lr=7.04e-5, step_loss=0.0916]
Steps:  37%|███▋      | 366/1000 [06:13<10:18,  1.03it/s, lr=7.04e-5, step_loss=0.0284]
Steps:  37%|███▋      | 366/1000 [06:13<10:18,  1.03it/s, lr=7.04e-5, step_loss=0.113] 
Steps:  37%|███▋      | 367/1000 [06:13<10:16,  1.03it/s, lr=7.04e-5, step_loss=0.113]
Steps:  37%|███▋      | 367/1000 [06:13<10:16,  1.03it/s, lr=7.03e-5, step_loss=0.0572]
Steps:  37%|███▋      | 367/1000 [06:13<10:16,  1.03it/s, lr=7.03e-5, step_loss=0.0411]
Steps:  37%|███▋      | 367/1000 [06:14<10:16,  1.03it/s, lr=7.03e-5, step_loss=0.141] 
Steps:  37%|███▋      | 367/1000 [06:14<10:16,  1.03it/s, lr=7.03e-5, step_loss=0.0937]
Steps:  37%|███▋      | 368/1000 [06:14<10:15,  1.03it/s, lr=7.03e-5, step_loss=0.0937]
Steps:  37%|███▋      | 368/1000 [06:14<10:15,  1.03it/s, lr=7.01e-5, step_loss=0.00579]
Steps:  37%|███▋      | 368/1000 [06:14<10:15,  1.03it/s, lr=7.01e-5, step_loss=0.17]   
Steps:  37%|███▋      | 368/1000 [06:15<10:15,  1.03it/s, lr=7.01e-5, step_loss=0.413]
Steps:  37%|███▋      | 368/1000 [06:15<10:15,  1.03it/s, lr=7.01e-5, step_loss=0.186]
Steps:  37%|███▋      | 369/1000 [06:15<10:14,  1.03it/s, lr=7.01e-5, step_loss=0.186]
Steps:  37%|███▋      | 369/1000 [06:15<10:14,  1.03it/s, lr=7e-5, step_loss=0.0092]  
Steps:  37%|███▋      | 369/1000 [06:15<10:14,  1.03it/s, lr=7e-5, step_loss=0.0682]
Steps:  37%|███▋      | 369/1000 [06:16<10:14,  1.03it/s, lr=7e-5, step_loss=0.034] 
Steps:  37%|███▋      | 369/1000 [06:16<10:14,  1.03it/s, lr=7e-5, step_loss=0.0291]
Steps:  37%|███▋      | 370/1000 [06:16<10:14,  1.03it/s, lr=7e-5, step_loss=0.0291]
Steps:  37%|███▋      | 370/1000 [06:16<10:14,  1.03it/s, lr=6.99e-5, step_loss=0.018]
Steps:  37%|███▋      | 370/1000 [06:16<10:14,  1.03it/s, lr=6.99e-5, step_loss=0.0691]
Steps:  37%|███▋      | 370/1000 [06:17<10:14,  1.03it/s, lr=6.99e-5, step_loss=0.00437]
Steps:  37%|███▋      | 370/1000 [06:17<10:14,  1.03it/s, lr=6.99e-5, step_loss=0.0831] 
Steps:  37%|███▋      | 371/1000 [06:17<10:13,  1.03it/s, lr=6.99e-5, step_loss=0.0831]
Steps:  37%|███▋      | 371/1000 [06:17<10:13,  1.03it/s, lr=6.97e-5, step_loss=0.0157]
Steps:  37%|███▋      | 371/1000 [06:17<10:13,  1.03it/s, lr=6.97e-5, step_loss=0.178] 
Steps:  37%|███▋      | 371/1000 [06:18<10:13,  1.03it/s, lr=6.97e-5, step_loss=0.00768]
Steps:  37%|███▋      | 371/1000 [06:18<10:13,  1.03it/s, lr=6.97e-5, step_loss=0.236]  
Steps:  37%|███▋      | 372/1000 [06:18<10:12,  1.03it/s, lr=6.97e-5, step_loss=0.236]
Steps:  37%|███▋      | 372/1000 [06:18<10:12,  1.03it/s, lr=6.96e-5, step_loss=0.0377]
Steps:  37%|███▋      | 372/1000 [06:18<10:12,  1.03it/s, lr=6.96e-5, step_loss=0.0175]
Steps:  37%|███▋      | 372/1000 [06:19<10:12,  1.03it/s, lr=6.96e-5, step_loss=0.0488]
Steps:  37%|███▋      | 372/1000 [06:19<10:12,  1.03it/s, lr=6.96e-5, step_loss=0.0334]
Steps:  37%|███▋      | 373/1000 [06:19<10:10,  1.03it/s, lr=6.96e-5, step_loss=0.0334]
Steps:  37%|███▋      | 373/1000 [06:19<10:10,  1.03it/s, lr=6.94e-5, step_loss=0.376] 
Steps:  37%|███▋      | 373/1000 [06:19<10:10,  1.03it/s, lr=6.94e-5, step_loss=0.133]
Steps:  37%|███▋      | 373/1000 [06:19<10:10,  1.03it/s, lr=6.94e-5, step_loss=0.158]
Steps:  37%|███▋      | 373/1000 [06:20<10:10,  1.03it/s, lr=6.94e-5, step_loss=0.176]
Steps:  37%|███▋      | 374/1000 [06:20<10:09,  1.03it/s, lr=6.94e-5, step_loss=0.176]
Steps:  37%|███▋      | 374/1000 [06:20<10:09,  1.03it/s, lr=6.93e-5, step_loss=0.0507]
Steps:  37%|███▋      | 374/1000 [06:20<10:09,  1.03it/s, lr=6.93e-5, step_loss=0.018] 
Steps:  37%|███▋      | 374/1000 [06:20<10:09,  1.03it/s, lr=6.93e-5, step_loss=0.179]
Steps:  37%|███▋      | 374/1000 [06:21<10:09,  1.03it/s, lr=6.93e-5, step_loss=0.0578]
Steps:  38%|███▊      | 375/1000 [06:21<10:08,  1.03it/s, lr=6.93e-5, step_loss=0.0578]
Steps:  38%|███▊      | 375/1000 [06:21<10:08,  1.03it/s, lr=6.91e-5, step_loss=0.00297]
Steps:  38%|███▊      | 375/1000 [06:21<10:08,  1.03it/s, lr=6.91e-5, step_loss=0.0778] 
Steps:  38%|███▊      | 375/1000 [06:21<10:08,  1.03it/s, lr=6.91e-5, step_loss=0.00595]
Steps:  38%|███▊      | 375/1000 [06:22<10:08,  1.03it/s, lr=6.91e-5, step_loss=0.158]  
Steps:  38%|███▊      | 376/1000 [06:22<10:07,  1.03it/s, lr=6.91e-5, step_loss=0.158]
Steps:  38%|███▊      | 376/1000 [06:22<10:07,  1.03it/s, lr=6.9e-5, step_loss=0.246] 
Steps:  38%|███▊      | 376/1000 [06:22<10:07,  1.03it/s, lr=6.9e-5, step_loss=0.0471]
Steps:  38%|███▊      | 376/1000 [06:22<10:07,  1.03it/s, lr=6.9e-5, step_loss=0.0588]
Steps:  38%|███▊      | 376/1000 [06:23<10:07,  1.03it/s, lr=6.9e-5, step_loss=0.00751]
Steps:  38%|███▊      | 377/1000 [06:23<10:06,  1.03it/s, lr=6.9e-5, step_loss=0.00751]
Steps:  38%|███▊      | 377/1000 [06:23<10:06,  1.03it/s, lr=6.88e-5, step_loss=0.00295]
Steps:  38%|███▊      | 377/1000 [06:23<10:06,  1.03it/s, lr=6.88e-5, step_loss=0.0209] 
Steps:  38%|███▊      | 377/1000 [06:23<10:06,  1.03it/s, lr=6.88e-5, step_loss=0.0247]
Steps:  38%|███▊      | 377/1000 [06:24<10:06,  1.03it/s, lr=6.88e-5, step_loss=0.0725]
Steps:  38%|███▊      | 378/1000 [06:24<10:05,  1.03it/s, lr=6.88e-5, step_loss=0.0725]
Steps:  38%|███▊      | 378/1000 [06:24<10:05,  1.03it/s, lr=6.87e-5, step_loss=0.00602]
Steps:  38%|███▊      | 378/1000 [06:24<10:05,  1.03it/s, lr=6.87e-5, step_loss=0.0474] 
Steps:  38%|███▊      | 378/1000 [06:24<10:05,  1.03it/s, lr=6.87e-5, step_loss=0.0506]
Steps:  38%|███▊      | 378/1000 [06:25<10:05,  1.03it/s, lr=6.87e-5, step_loss=0.00175]
Steps:  38%|███▊      | 379/1000 [06:25<10:03,  1.03it/s, lr=6.87e-5, step_loss=0.00175]
Steps:  38%|███▊      | 379/1000 [06:25<10:03,  1.03it/s, lr=6.86e-5, step_loss=0.126]  
Steps:  38%|███▊      | 379/1000 [06:25<10:03,  1.03it/s, lr=6.86e-5, step_loss=0.0058]
Steps:  38%|███▊      | 379/1000 [06:25<10:03,  1.03it/s, lr=6.86e-5, step_loss=0.133] 
Steps:  38%|███▊      | 379/1000 [06:26<10:03,  1.03it/s, lr=6.86e-5, step_loss=0.0339]
Steps:  38%|███▊      | 380/1000 [06:26<10:02,  1.03it/s, lr=6.86e-5, step_loss=0.0339]
Steps:  38%|███▊      | 380/1000 [06:26<10:02,  1.03it/s, lr=6.84e-5, step_loss=0.151] 
Steps:  38%|███▊      | 380/1000 [06:26<10:02,  1.03it/s, lr=6.84e-5, step_loss=0.0298]
Steps:  38%|███▊      | 380/1000 [06:26<10:02,  1.03it/s, lr=6.84e-5, step_loss=0.0608]
Steps:  38%|███▊      | 380/1000 [06:27<10:02,  1.03it/s, lr=6.84e-5, step_loss=0.0141]
Steps:  38%|███▊      | 381/1000 [06:27<10:02,  1.03it/s, lr=6.84e-5, step_loss=0.0141]
Steps:  38%|███▊      | 381/1000 [06:27<10:02,  1.03it/s, lr=6.83e-5, step_loss=0.136] 
Steps:  38%|███▊      | 381/1000 [06:27<10:02,  1.03it/s, lr=6.83e-5, step_loss=0.00292]
Steps:  38%|███▊      | 381/1000 [06:27<10:02,  1.03it/s, lr=6.83e-5, step_loss=0.00894]
Steps:  38%|███▊      | 381/1000 [06:27<10:02,  1.03it/s, lr=6.83e-5, step_loss=0.0353] 
Steps:  38%|███▊      | 382/1000 [06:28<10:01,  1.03it/s, lr=6.83e-5, step_loss=0.0353]
Steps:  38%|███▊      | 382/1000 [06:28<10:01,  1.03it/s, lr=6.81e-5, step_loss=0.00336]
Steps:  38%|███▊      | 382/1000 [06:28<10:01,  1.03it/s, lr=6.81e-5, step_loss=0.0541] 
Steps:  38%|███▊      | 382/1000 [06:28<10:01,  1.03it/s, lr=6.81e-5, step_loss=0.274] 
Steps:  38%|███▊      | 382/1000 [06:28<10:01,  1.03it/s, lr=6.81e-5, step_loss=0.0538]
Steps:  38%|███▊      | 383/1000 [06:29<10:00,  1.03it/s, lr=6.81e-5, step_loss=0.0538]
Steps:  38%|███▊      | 383/1000 [06:29<10:00,  1.03it/s, lr=6.8e-5, step_loss=0.0073] 
Steps:  38%|███▊      | 383/1000 [06:29<10:00,  1.03it/s, lr=6.8e-5, step_loss=0.0434]
Steps:  38%|███▊      | 383/1000 [06:29<10:00,  1.03it/s, lr=6.8e-5, step_loss=0.0129]
Steps:  38%|███▊      | 383/1000 [06:29<10:00,  1.03it/s, lr=6.8e-5, step_loss=0.133] 
Steps:  38%|███▊      | 384/1000 [06:30<09:59,  1.03it/s, lr=6.8e-5, step_loss=0.133]
Steps:  38%|███▊      | 384/1000 [06:30<09:59,  1.03it/s, lr=6.78e-5, step_loss=0.0179]
Steps:  38%|███▊      | 384/1000 [06:30<09:59,  1.03it/s, lr=6.78e-5, step_loss=0.00626]
Steps:  38%|███▊      | 384/1000 [06:30<09:59,  1.03it/s, lr=6.78e-5, step_loss=0.00793]
Steps:  38%|███▊      | 384/1000 [06:30<09:59,  1.03it/s, lr=6.78e-5, step_loss=0.269]  
Steps:  38%|███▊      | 385/1000 [06:31<09:58,  1.03it/s, lr=6.78e-5, step_loss=0.269]
Steps:  38%|███▊      | 385/1000 [06:31<09:58,  1.03it/s, lr=6.77e-5, step_loss=0.546]
Steps:  38%|███▊      | 385/1000 [06:31<09:58,  1.03it/s, lr=6.77e-5, step_loss=0.436]
Steps:  38%|███▊      | 385/1000 [06:31<09:58,  1.03it/s, lr=6.77e-5, step_loss=0.0888]
Steps:  38%|███▊      | 385/1000 [06:31<09:58,  1.03it/s, lr=6.77e-5, step_loss=0.0158]
Steps:  39%|███▊      | 386/1000 [06:32<09:57,  1.03it/s, lr=6.77e-5, step_loss=0.0158]
Steps:  39%|███▊      | 386/1000 [06:32<09:57,  1.03it/s, lr=6.75e-5, step_loss=0.206] 
Steps:  39%|███▊      | 386/1000 [06:32<09:57,  1.03it/s, lr=6.75e-5, step_loss=0.698]
Steps:  39%|███▊      | 386/1000 [06:32<09:57,  1.03it/s, lr=6.75e-5, step_loss=0.0463]
Steps:  39%|███▊      | 386/1000 [06:32<09:57,  1.03it/s, lr=6.75e-5, step_loss=0.0305]
Steps:  39%|███▊      | 387/1000 [06:33<09:56,  1.03it/s, lr=6.75e-5, step_loss=0.0305]
Steps:  39%|███▊      | 387/1000 [06:33<09:56,  1.03it/s, lr=6.74e-5, step_loss=0.00186]
Steps:  39%|███▊      | 387/1000 [06:33<09:56,  1.03it/s, lr=6.74e-5, step_loss=0.0309] 
Steps:  39%|███▊      | 387/1000 [06:33<09:56,  1.03it/s, lr=6.74e-5, step_loss=0.0254]
Steps:  39%|███▊      | 387/1000 [06:33<09:56,  1.03it/s, lr=6.74e-5, step_loss=0.0953]
Steps:  39%|███▉      | 388/1000 [06:34<09:55,  1.03it/s, lr=6.74e-5, step_loss=0.0953]
Steps:  39%|███▉      | 388/1000 [06:34<09:55,  1.03it/s, lr=6.72e-5, step_loss=0.0336]
Steps:  39%|███▉      | 388/1000 [06:34<09:55,  1.03it/s, lr=6.72e-5, step_loss=0.0315]
Steps:  39%|███▉      | 388/1000 [06:34<09:55,  1.03it/s, lr=6.72e-5, step_loss=0.003] 
Steps:  39%|███▉      | 388/1000 [06:34<09:55,  1.03it/s, lr=6.72e-5, step_loss=0.387]
Steps:  39%|███▉      | 389/1000 [06:35<09:54,  1.03it/s, lr=6.72e-5, step_loss=0.387]
Steps:  39%|███▉      | 389/1000 [06:35<09:54,  1.03it/s, lr=6.71e-5, step_loss=0.0772]
Steps:  39%|███▉      | 389/1000 [06:35<09:54,  1.03it/s, lr=6.71e-5, step_loss=0.0311]
Steps:  39%|███▉      | 389/1000 [06:35<09:54,  1.03it/s, lr=6.71e-5, step_loss=0.0103]
Steps:  39%|███▉      | 389/1000 [06:35<09:54,  1.03it/s, lr=6.71e-5, step_loss=0.0728]
Steps:  39%|███▉      | 390/1000 [06:35<09:53,  1.03it/s, lr=6.71e-5, step_loss=0.0728]
Steps:  39%|███▉      | 390/1000 [06:36<09:53,  1.03it/s, lr=6.69e-5, step_loss=0.109] 
Steps:  39%|███▉      | 390/1000 [06:36<09:53,  1.03it/s, lr=6.69e-5, step_loss=0.253]
Steps:  39%|███▉      | 390/1000 [06:36<09:53,  1.03it/s, lr=6.69e-5, step_loss=0.187]
Steps:  39%|███▉      | 390/1000 [06:36<09:53,  1.03it/s, lr=6.69e-5, step_loss=0.15] 
Steps:  39%|███▉      | 391/1000 [06:36<09:52,  1.03it/s, lr=6.69e-5, step_loss=0.15]
Steps:  39%|███▉      | 391/1000 [06:36<09:52,  1.03it/s, lr=6.68e-5, step_loss=0.0035]
Steps:  39%|███▉      | 391/1000 [06:37<09:52,  1.03it/s, lr=6.68e-5, step_loss=0.176] 
Steps:  39%|███▉      | 391/1000 [06:37<09:52,  1.03it/s, lr=6.68e-5, step_loss=0.0144]
Steps:  39%|███▉      | 391/1000 [06:37<09:52,  1.03it/s, lr=6.68e-5, step_loss=0.0049]
Steps:  39%|███▉      | 392/1000 [06:37<09:51,  1.03it/s, lr=6.68e-5, step_loss=0.0049]
Steps:  39%|███▉      | 392/1000 [06:37<09:51,  1.03it/s, lr=6.66e-5, step_loss=0.134] 
Steps:  39%|███▉      | 392/1000 [06:38<09:51,  1.03it/s, lr=6.66e-5, step_loss=0.0027]
Steps:  39%|███▉      | 392/1000 [06:38<09:51,  1.03it/s, lr=6.66e-5, step_loss=0.0813]
Steps:  39%|███▉      | 392/1000 [06:38<09:51,  1.03it/s, lr=6.66e-5, step_loss=0.109] 
Steps:  39%|███▉      | 393/1000 [06:38<09:50,  1.03it/s, lr=6.66e-5, step_loss=0.109]
Steps:  39%|███▉      | 393/1000 [06:38<09:50,  1.03it/s, lr=6.65e-5, step_loss=0.25] 
Steps:  39%|███▉      | 393/1000 [06:39<09:50,  1.03it/s, lr=6.65e-5, step_loss=0.0166]
Steps:  39%|███▉      | 393/1000 [06:39<09:50,  1.03it/s, lr=6.65e-5, step_loss=0.0207]
Steps:  39%|███▉      | 393/1000 [06:39<09:50,  1.03it/s, lr=6.65e-5, step_loss=0.0188]
Steps:  39%|███▉      | 394/1000 [06:39<09:49,  1.03it/s, lr=6.65e-5, step_loss=0.0188]
Steps:  39%|███▉      | 394/1000 [06:39<09:49,  1.03it/s, lr=6.63e-5, step_loss=0.664] 
Steps:  39%|███▉      | 394/1000 [06:40<09:49,  1.03it/s, lr=6.63e-5, step_loss=0.0429]
Steps:  39%|███▉      | 394/1000 [06:40<09:49,  1.03it/s, lr=6.63e-5, step_loss=0.00918]
Steps:  39%|███▉      | 394/1000 [06:40<09:49,  1.03it/s, lr=6.63e-5, step_loss=0.0122] 
Steps:  40%|███▉      | 395/1000 [06:40<09:48,  1.03it/s, lr=6.63e-5, step_loss=0.0122]
Steps:  40%|███▉      | 395/1000 [06:40<09:48,  1.03it/s, lr=6.62e-5, step_loss=0.268] 
Steps:  40%|███▉      | 395/1000 [06:41<09:48,  1.03it/s, lr=6.62e-5, step_loss=0.0573]
Steps:  40%|███▉      | 395/1000 [06:41<09:48,  1.03it/s, lr=6.62e-5, step_loss=0.00764]
Steps:  40%|███▉      | 395/1000 [06:41<09:48,  1.03it/s, lr=6.62e-5, step_loss=0.00563]
Steps:  40%|███▉      | 396/1000 [06:41<09:47,  1.03it/s, lr=6.62e-5, step_loss=0.00563]
Steps:  40%|███▉      | 396/1000 [06:41<09:47,  1.03it/s, lr=6.6e-5, step_loss=0.0424]  
Steps:  40%|███▉      | 396/1000 [06:42<09:47,  1.03it/s, lr=6.6e-5, step_loss=0.212] 
Steps:  40%|███▉      | 396/1000 [06:42<09:47,  1.03it/s, lr=6.6e-5, step_loss=0.0951]
Steps:  40%|███▉      | 396/1000 [06:42<09:47,  1.03it/s, lr=6.6e-5, step_loss=0.0214]
Steps:  40%|███▉      | 397/1000 [06:42<09:46,  1.03it/s, lr=6.6e-5, step_loss=0.0214]
Steps:  40%|███▉      | 397/1000 [06:42<09:46,  1.03it/s, lr=6.59e-5, step_loss=0.0188]
Steps:  40%|███▉      | 397/1000 [06:43<09:46,  1.03it/s, lr=6.59e-5, step_loss=0.0611]
Steps:  40%|███▉      | 397/1000 [06:43<09:46,  1.03it/s, lr=6.59e-5, step_loss=0.122] 
Steps:  40%|███▉      | 397/1000 [06:43<09:46,  1.03it/s, lr=6.59e-5, step_loss=0.0248]
Steps:  40%|███▉      | 398/1000 [06:43<09:45,  1.03it/s, lr=6.59e-5, step_loss=0.0248]
Steps:  40%|███▉      | 398/1000 [06:43<09:45,  1.03it/s, lr=6.57e-5, step_loss=0.0145]
Steps:  40%|███▉      | 398/1000 [06:44<09:45,  1.03it/s, lr=6.57e-5, step_loss=0.0697]
Steps:  40%|███▉      | 398/1000 [06:44<09:45,  1.03it/s, lr=6.57e-5, step_loss=0.402] 
Steps:  40%|███▉      | 398/1000 [06:44<09:45,  1.03it/s, lr=6.57e-5, step_loss=0.0918]
Steps:  40%|███▉      | 399/1000 [06:44<09:44,  1.03it/s, lr=6.57e-5, step_loss=0.0918]
Steps:  40%|███▉      | 399/1000 [06:44<09:44,  1.03it/s, lr=6.56e-5, step_loss=0.0598]
Steps:  40%|███▉      | 399/1000 [06:45<09:44,  1.03it/s, lr=6.56e-5, step_loss=0.0288]
Steps:  40%|███▉      | 399/1000 [06:45<09:44,  1.03it/s, lr=6.56e-5, step_loss=0.298] 
Steps:  40%|███▉      | 399/1000 [06:45<09:44,  1.03it/s, lr=6.56e-5, step_loss=0.00619]
Steps:  40%|████      | 400/1000 [06:45<09:43,  1.03it/s, lr=6.56e-5, step_loss=0.00619]
Steps:  40%|████      | 400/1000 [06:45<09:43,  1.03it/s, lr=6.55e-5, step_loss=0.00208]
Steps:  40%|████      | 400/1000 [06:45<09:43,  1.03it/s, lr=6.55e-5, step_loss=0.0458] 
Steps:  40%|████      | 400/1000 [06:46<09:43,  1.03it/s, lr=6.55e-5, step_loss=0.0472]
Steps:  40%|████      | 400/1000 [06:46<09:43,  1.03it/s, lr=6.55e-5, step_loss=0.0141]
Steps:  40%|████      | 401/1000 [06:46<09:42,  1.03it/s, lr=6.55e-5, step_loss=0.0141]
Steps:  40%|████      | 401/1000 [06:46<09:42,  1.03it/s, lr=6.53e-5, step_loss=0.067] 
Steps:  40%|████      | 401/1000 [06:46<09:42,  1.03it/s, lr=6.53e-5, step_loss=0.367]
Steps:  40%|████      | 401/1000 [06:47<09:42,  1.03it/s, lr=6.53e-5, step_loss=0.00447]
Steps:  40%|████      | 401/1000 [06:47<09:42,  1.03it/s, lr=6.53e-5, step_loss=0.0562] 
Steps:  40%|████      | 402/1000 [06:47<09:41,  1.03it/s, lr=6.53e-5, step_loss=0.0562]
Steps:  40%|████      | 402/1000 [06:47<09:41,  1.03it/s, lr=6.52e-5, step_loss=0.0419]
Steps:  40%|████      | 402/1000 [06:47<09:41,  1.03it/s, lr=6.52e-5, step_loss=0.0898]
Steps:  40%|████      | 402/1000 [06:48<09:41,  1.03it/s, lr=6.52e-5, step_loss=0.111] 
Steps:  40%|████      | 402/1000 [06:48<09:41,  1.03it/s, lr=6.52e-5, step_loss=0.192]
Steps:  40%|████      | 403/1000 [06:48<09:40,  1.03it/s, lr=6.52e-5, step_loss=0.192]
Steps:  40%|████      | 403/1000 [06:48<09:40,  1.03it/s, lr=6.5e-5, step_loss=0.00441]
Steps:  40%|████      | 403/1000 [06:48<09:40,  1.03it/s, lr=6.5e-5, step_loss=0.0677] 
Steps:  40%|████      | 403/1000 [06:49<09:40,  1.03it/s, lr=6.5e-5, step_loss=0.0199]
Steps:  40%|████      | 403/1000 [06:49<09:40,  1.03it/s, lr=6.5e-5, step_loss=0.12]  
Steps:  40%|████      | 404/1000 [06:49<09:39,  1.03it/s, lr=6.5e-5, step_loss=0.12]
Steps:  40%|████      | 404/1000 [06:49<09:39,  1.03it/s, lr=6.49e-5, step_loss=0.00332]
Steps:  40%|████      | 404/1000 [06:49<09:39,  1.03it/s, lr=6.49e-5, step_loss=0.0326] 
Steps:  40%|████      | 404/1000 [06:50<09:39,  1.03it/s, lr=6.49e-5, step_loss=0.156] 
Steps:  40%|████      | 404/1000 [06:50<09:39,  1.03it/s, lr=6.49e-5, step_loss=0.107]
Steps:  40%|████      | 405/1000 [06:50<09:38,  1.03it/s, lr=6.49e-5, step_loss=0.107]
Steps:  40%|████      | 405/1000 [06:50<09:38,  1.03it/s, lr=6.47e-5, step_loss=0.203]
Steps:  40%|████      | 405/1000 [06:50<09:38,  1.03it/s, lr=6.47e-5, step_loss=0.0454]
Steps:  40%|████      | 405/1000 [06:51<09:38,  1.03it/s, lr=6.47e-5, step_loss=0.0124]
Steps:  40%|████      | 405/1000 [06:51<09:38,  1.03it/s, lr=6.47e-5, step_loss=0.291] 
Steps:  41%|████      | 406/1000 [06:51<09:37,  1.03it/s, lr=6.47e-5, step_loss=0.291]
Steps:  41%|████      | 406/1000 [06:51<09:37,  1.03it/s, lr=6.46e-5, step_loss=0.0697]
Steps:  41%|████      | 406/1000 [06:51<09:37,  1.03it/s, lr=6.46e-5, step_loss=0.0421]
Steps:  41%|████      | 406/1000 [06:52<09:37,  1.03it/s, lr=6.46e-5, step_loss=0.022] 
Steps:  41%|████      | 406/1000 [06:52<09:37,  1.03it/s, lr=6.46e-5, step_loss=0.0628]
Steps:  41%|████      | 407/1000 [06:52<09:36,  1.03it/s, lr=6.46e-5, step_loss=0.0628]
Steps:  41%|████      | 407/1000 [06:52<09:36,  1.03it/s, lr=6.44e-5, step_loss=0.0851]
Steps:  41%|████      | 407/1000 [06:52<09:36,  1.03it/s, lr=6.44e-5, step_loss=0.0626]
Steps:  41%|████      | 407/1000 [06:53<09:36,  1.03it/s, lr=6.44e-5, step_loss=0.261] 
Steps:  41%|████      | 407/1000 [06:53<09:36,  1.03it/s, lr=6.44e-5, step_loss=0.0073]
Steps:  41%|████      | 408/1000 [06:53<09:35,  1.03it/s, lr=6.44e-5, step_loss=0.0073]
Steps:  41%|████      | 408/1000 [06:53<09:35,  1.03it/s, lr=6.43e-5, step_loss=0.114] 
Steps:  41%|████      | 408/1000 [06:53<09:35,  1.03it/s, lr=6.43e-5, step_loss=0.11] 
Steps:  41%|████      | 408/1000 [06:54<09:35,  1.03it/s, lr=6.43e-5, step_loss=0.0171]
Steps:  41%|████      | 408/1000 [06:54<09:35,  1.03it/s, lr=6.43e-5, step_loss=0.071] 
Steps:  41%|████      | 409/1000 [06:54<09:34,  1.03it/s, lr=6.43e-5, step_loss=0.071]
Steps:  41%|████      | 409/1000 [06:54<09:34,  1.03it/s, lr=6.41e-5, step_loss=0.214]
Steps:  41%|████      | 409/1000 [06:54<09:34,  1.03it/s, lr=6.41e-5, step_loss=0.0312]
Steps:  41%|████      | 409/1000 [06:54<09:34,  1.03it/s, lr=6.41e-5, step_loss=0.0258]
Steps:  41%|████      | 409/1000 [06:55<09:34,  1.03it/s, lr=6.41e-5, step_loss=0.0103]
Steps:  41%|████      | 410/1000 [06:55<09:34,  1.03it/s, lr=6.41e-5, step_loss=0.0103]
Steps:  41%|████      | 410/1000 [06:55<09:34,  1.03it/s, lr=6.39e-5, step_loss=0.247] 
Steps:  41%|████      | 410/1000 [06:55<09:34,  1.03it/s, lr=6.39e-5, step_loss=0.0133]
Steps:  41%|████      | 410/1000 [06:55<09:34,  1.03it/s, lr=6.39e-5, step_loss=0.0635]
Steps:  41%|████      | 410/1000 [06:56<09:34,  1.03it/s, lr=6.39e-5, step_loss=0.0182]
Steps:  41%|████      | 411/1000 [06:56<09:33,  1.03it/s, lr=6.39e-5, step_loss=0.0182]
Steps:  41%|████      | 411/1000 [06:56<09:33,  1.03it/s, lr=6.38e-5, step_loss=0.0316]
Steps:  41%|████      | 411/1000 [06:56<09:33,  1.03it/s, lr=6.38e-5, step_loss=0.0534]
Steps:  41%|████      | 411/1000 [06:56<09:33,  1.03it/s, lr=6.38e-5, step_loss=0.318] 
Steps:  41%|████      | 411/1000 [06:57<09:33,  1.03it/s, lr=6.38e-5, step_loss=0.249]
Steps:  41%|████      | 412/1000 [06:57<09:32,  1.03it/s, lr=6.38e-5, step_loss=0.249]
Steps:  41%|████      | 412/1000 [06:57<09:32,  1.03it/s, lr=6.36e-5, step_loss=0.167]
Steps:  41%|████      | 412/1000 [06:57<09:32,  1.03it/s, lr=6.36e-5, step_loss=0.00756]
Steps:  41%|████      | 412/1000 [06:57<09:32,  1.03it/s, lr=6.36e-5, step_loss=0.113]  
Steps:  41%|████      | 412/1000 [06:58<09:32,  1.03it/s, lr=6.36e-5, step_loss=0.31] 
Steps:  41%|████▏     | 413/1000 [06:58<09:31,  1.03it/s, lr=6.36e-5, step_loss=0.31]
Steps:  41%|████▏     | 413/1000 [06:58<09:31,  1.03it/s, lr=6.35e-5, step_loss=0.273]
Steps:  41%|████▏     | 413/1000 [06:58<09:31,  1.03it/s, lr=6.35e-5, step_loss=0.0024]
Steps:  41%|████▏     | 413/1000 [06:58<09:31,  1.03it/s, lr=6.35e-5, step_loss=0.174] 
Steps:  41%|████▏     | 413/1000 [06:59<09:31,  1.03it/s, lr=6.35e-5, step_loss=0.108]
Steps:  41%|████▏     | 414/1000 [06:59<09:30,  1.03it/s, lr=6.35e-5, step_loss=0.108]
Steps:  41%|████▏     | 414/1000 [06:59<09:30,  1.03it/s, lr=6.33e-5, step_loss=0.274]
Steps:  41%|████▏     | 414/1000 [06:59<09:30,  1.03it/s, lr=6.33e-5, step_loss=0.0774]
Steps:  41%|████▏     | 414/1000 [06:59<09:30,  1.03it/s, lr=6.33e-5, step_loss=0.0147]
Steps:  41%|████▏     | 414/1000 [07:00<09:30,  1.03it/s, lr=6.33e-5, step_loss=0.147] 
Steps:  42%|████▏     | 415/1000 [07:00<09:29,  1.03it/s, lr=6.33e-5, step_loss=0.147]
Steps:  42%|████▏     | 415/1000 [07:00<09:29,  1.03it/s, lr=6.32e-5, step_loss=0.0241]
Steps:  42%|████▏     | 415/1000 [07:00<09:29,  1.03it/s, lr=6.32e-5, step_loss=0.0649]
Steps:  42%|████▏     | 415/1000 [07:00<09:29,  1.03it/s, lr=6.32e-5, step_loss=0.12]  
Steps:  42%|████▏     | 415/1000 [07:01<09:29,  1.03it/s, lr=6.32e-5, step_loss=0.00235]
Steps:  42%|████▏     | 416/1000 [07:01<09:28,  1.03it/s, lr=6.32e-5, step_loss=0.00235]
Steps:  42%|████▏     | 416/1000 [07:01<09:28,  1.03it/s, lr=6.3e-5, step_loss=0.0444]  
Steps:  42%|████▏     | 416/1000 [07:01<09:28,  1.03it/s, lr=6.3e-5, step_loss=0.128] 
Steps:  42%|████▏     | 416/1000 [07:01<09:28,  1.03it/s, lr=6.3e-5, step_loss=0.106]
Steps:  42%|████▏     | 416/1000 [07:02<09:28,  1.03it/s, lr=6.3e-5, step_loss=0.307]
Steps:  42%|████▏     | 417/1000 [07:02<09:27,  1.03it/s, lr=6.3e-5, step_loss=0.307]
Steps:  42%|████▏     | 417/1000 [07:02<09:27,  1.03it/s, lr=6.29e-5, step_loss=0.0993]
Steps:  42%|████▏     | 417/1000 [07:02<09:27,  1.03it/s, lr=6.29e-5, step_loss=0.195] 
Steps:  42%|████▏     | 417/1000 [07:02<09:27,  1.03it/s, lr=6.29e-5, step_loss=0.147]
Steps:  42%|████▏     | 417/1000 [07:03<09:27,  1.03it/s, lr=6.29e-5, step_loss=0.0199]
Steps:  42%|████▏     | 418/1000 [07:03<09:26,  1.03it/s, lr=6.29e-5, step_loss=0.0199]
Steps:  42%|████▏     | 418/1000 [07:03<09:26,  1.03it/s, lr=6.27e-5, step_loss=0.245] 
Steps:  42%|████▏     | 418/1000 [07:03<09:26,  1.03it/s, lr=6.27e-5, step_loss=0.0117]
Steps:  42%|████▏     | 418/1000 [07:03<09:26,  1.03it/s, lr=6.27e-5, step_loss=0.12]  
Steps:  42%|████▏     | 418/1000 [07:03<09:26,  1.03it/s, lr=6.27e-5, step_loss=0.0721]
Steps:  42%|████▏     | 419/1000 [07:04<09:25,  1.03it/s, lr=6.27e-5, step_loss=0.0721]
Steps:  42%|████▏     | 419/1000 [07:04<09:25,  1.03it/s, lr=6.26e-5, step_loss=0.0472]
Steps:  42%|████▏     | 419/1000 [07:04<09:25,  1.03it/s, lr=6.26e-5, step_loss=0.0597]
Steps:  42%|████▏     | 419/1000 [07:04<09:25,  1.03it/s, lr=6.26e-5, step_loss=0.0284]
Steps:  42%|████▏     | 419/1000 [07:04<09:25,  1.03it/s, lr=6.26e-5, step_loss=0.253] 
Steps:  42%|████▏     | 420/1000 [07:05<09:24,  1.03it/s, lr=6.26e-5, step_loss=0.253]
Steps:  42%|████▏     | 420/1000 [07:05<09:24,  1.03it/s, lr=6.24e-5, step_loss=0.118]
Steps:  42%|████▏     | 420/1000 [07:05<09:24,  1.03it/s, lr=6.24e-5, step_loss=0.234]
Steps:  42%|████▏     | 420/1000 [07:05<09:24,  1.03it/s, lr=6.24e-5, step_loss=0.0231]
Steps:  42%|████▏     | 420/1000 [07:05<09:24,  1.03it/s, lr=6.24e-5, step_loss=0.0385]
Steps:  42%|████▏     | 421/1000 [07:06<09:23,  1.03it/s, lr=6.24e-5, step_loss=0.0385]
Steps:  42%|████▏     | 421/1000 [07:06<09:23,  1.03it/s, lr=6.23e-5, step_loss=0.00304]
Steps:  42%|████▏     | 421/1000 [07:06<09:23,  1.03it/s, lr=6.23e-5, step_loss=0.0136] 
Steps:  42%|████▏     | 421/1000 [07:06<09:23,  1.03it/s, lr=6.23e-5, step_loss=0.0183]
Steps:  42%|████▏     | 421/1000 [07:06<09:23,  1.03it/s, lr=6.23e-5, step_loss=0.00367]
Steps:  42%|████▏     | 422/1000 [07:07<09:22,  1.03it/s, lr=6.23e-5, step_loss=0.00367]
Steps:  42%|████▏     | 422/1000 [07:07<09:22,  1.03it/s, lr=6.21e-5, step_loss=0.0579] 
Steps:  42%|████▏     | 422/1000 [07:07<09:22,  1.03it/s, lr=6.21e-5, step_loss=0.0943]
Steps:  42%|████▏     | 422/1000 [07:07<09:22,  1.03it/s, lr=6.21e-5, step_loss=0.0447]
Steps:  42%|████▏     | 422/1000 [07:07<09:22,  1.03it/s, lr=6.21e-5, step_loss=0.00818]
Steps:  42%|████▏     | 423/1000 [07:08<09:21,  1.03it/s, lr=6.21e-5, step_loss=0.00818]
Steps:  42%|████▏     | 423/1000 [07:08<09:21,  1.03it/s, lr=6.2e-5, step_loss=0.0124]  
Steps:  42%|████▏     | 423/1000 [07:08<09:21,  1.03it/s, lr=6.2e-5, step_loss=0.0244]
Steps:  42%|████▏     | 423/1000 [07:08<09:21,  1.03it/s, lr=6.2e-5, step_loss=0.0248]
Steps:  42%|████▏     | 423/1000 [07:08<09:21,  1.03it/s, lr=6.2e-5, step_loss=0.0261]
Steps:  42%|████▏     | 424/1000 [07:09<09:20,  1.03it/s, lr=6.2e-5, step_loss=0.0261]
Steps:  42%|████▏     | 424/1000 [07:09<09:20,  1.03it/s, lr=6.18e-5, step_loss=0.00676]
Steps:  42%|████▏     | 424/1000 [07:09<09:20,  1.03it/s, lr=6.18e-5, step_loss=0.123]  
Steps:  42%|████▏     | 424/1000 [07:09<09:20,  1.03it/s, lr=6.18e-5, step_loss=0.104]
Steps:  42%|████▏     | 424/1000 [07:09<09:20,  1.03it/s, lr=6.18e-5, step_loss=0.213]
Steps:  42%|████▎     | 425/1000 [07:10<09:19,  1.03it/s, lr=6.18e-5, step_loss=0.213]
Steps:  42%|████▎     | 425/1000 [07:10<09:19,  1.03it/s, lr=6.17e-5, step_loss=0.104]
Steps:  42%|████▎     | 425/1000 [07:10<09:19,  1.03it/s, lr=6.17e-5, step_loss=0.0438]
Steps:  42%|████▎     | 425/1000 [07:10<09:19,  1.03it/s, lr=6.17e-5, step_loss=0.00456]
Steps:  42%|████▎     | 425/1000 [07:10<09:19,  1.03it/s, lr=6.17e-5, step_loss=0.0503] 
Steps:  43%|████▎     | 426/1000 [07:11<09:18,  1.03it/s, lr=6.17e-5, step_loss=0.0503]
Steps:  43%|████▎     | 426/1000 [07:11<09:18,  1.03it/s, lr=6.15e-5, step_loss=0.00886]
Steps:  43%|████▎     | 426/1000 [07:11<09:18,  1.03it/s, lr=6.15e-5, step_loss=0.00771]
Steps:  43%|████▎     | 426/1000 [07:11<09:18,  1.03it/s, lr=6.15e-5, step_loss=0.163]  
Steps:  43%|████▎     | 426/1000 [07:11<09:18,  1.03it/s, lr=6.15e-5, step_loss=0.172]
Steps:  43%|████▎     | 427/1000 [07:11<09:17,  1.03it/s, lr=6.15e-5, step_loss=0.172]
Steps:  43%|████▎     | 427/1000 [07:12<09:17,  1.03it/s, lr=6.14e-5, step_loss=0.00243]
Steps:  43%|████▎     | 427/1000 [07:12<09:17,  1.03it/s, lr=6.14e-5, step_loss=0.0709] 
Steps:  43%|████▎     | 427/1000 [07:12<09:17,  1.03it/s, lr=6.14e-5, step_loss=0.0417]
Steps:  43%|████▎     | 427/1000 [07:12<09:17,  1.03it/s, lr=6.14e-5, step_loss=0.0133]
Steps:  43%|████▎     | 428/1000 [07:12<09:16,  1.03it/s, lr=6.14e-5, step_loss=0.0133]
Steps:  43%|████▎     | 428/1000 [07:12<09:16,  1.03it/s, lr=6.12e-5, step_loss=0.11]  
Steps:  43%|████▎     | 428/1000 [07:13<09:16,  1.03it/s, lr=6.12e-5, step_loss=0.0889]
Steps:  43%|████▎     | 428/1000 [07:13<09:16,  1.03it/s, lr=6.12e-5, step_loss=0.00253]
Steps:  43%|████▎     | 428/1000 [07:13<09:16,  1.03it/s, lr=6.12e-5, step_loss=0.0259] 
Steps:  43%|████▎     | 429/1000 [07:13<09:15,  1.03it/s, lr=6.12e-5, step_loss=0.0259]
Steps:  43%|████▎     | 429/1000 [07:13<09:15,  1.03it/s, lr=6.11e-5, step_loss=0.18]  
Steps:  43%|████▎     | 429/1000 [07:14<09:15,  1.03it/s, lr=6.11e-5, step_loss=0.0431]
Steps:  43%|████▎     | 429/1000 [07:14<09:15,  1.03it/s, lr=6.11e-5, step_loss=0.062] 
Steps:  43%|████▎     | 429/1000 [07:14<09:15,  1.03it/s, lr=6.11e-5, step_loss=0.563]
Steps:  43%|████▎     | 430/1000 [07:14<09:14,  1.03it/s, lr=6.11e-5, step_loss=0.563]
Steps:  43%|████▎     | 430/1000 [07:14<09:14,  1.03it/s, lr=6.09e-5, step_loss=0.104]
Steps:  43%|████▎     | 430/1000 [07:15<09:14,  1.03it/s, lr=6.09e-5, step_loss=0.00435]
Steps:  43%|████▎     | 430/1000 [07:15<09:14,  1.03it/s, lr=6.09e-5, step_loss=0.00191]
Steps:  43%|████▎     | 430/1000 [07:15<09:14,  1.03it/s, lr=6.09e-5, step_loss=0.0525] 
Steps:  43%|████▎     | 431/1000 [07:15<09:13,  1.03it/s, lr=6.09e-5, step_loss=0.0525]
Steps:  43%|████▎     | 431/1000 [07:15<09:13,  1.03it/s, lr=6.08e-5, step_loss=0.121] 
Steps:  43%|████▎     | 431/1000 [07:16<09:13,  1.03it/s, lr=6.08e-5, step_loss=0.112]
Steps:  43%|████▎     | 431/1000 [07:16<09:13,  1.03it/s, lr=6.08e-5, step_loss=0.0313]
Steps:  43%|████▎     | 431/1000 [07:16<09:13,  1.03it/s, lr=6.08e-5, step_loss=0.0759]
Steps:  43%|████▎     | 432/1000 [07:16<09:12,  1.03it/s, lr=6.08e-5, step_loss=0.0759]
Steps:  43%|████▎     | 432/1000 [07:16<09:12,  1.03it/s, lr=6.06e-5, step_loss=0.00442]
Steps:  43%|████▎     | 432/1000 [07:17<09:12,  1.03it/s, lr=6.06e-5, step_loss=0.004]  
Steps:  43%|████▎     | 432/1000 [07:17<09:12,  1.03it/s, lr=6.06e-5, step_loss=0.0697]
Steps:  43%|████▎     | 432/1000 [07:17<09:12,  1.03it/s, lr=6.06e-5, step_loss=0.0446]
Steps:  43%|████▎     | 433/1000 [07:17<09:11,  1.03it/s, lr=6.06e-5, step_loss=0.0446]
Steps:  43%|████▎     | 433/1000 [07:17<09:11,  1.03it/s, lr=6.04e-5, step_loss=0.226] 
Steps:  43%|████▎     | 433/1000 [07:18<09:11,  1.03it/s, lr=6.04e-5, step_loss=0.0901]
Steps:  43%|████▎     | 433/1000 [07:18<09:11,  1.03it/s, lr=6.04e-5, step_loss=0.263] 
Steps:  43%|████▎     | 433/1000 [07:18<09:11,  1.03it/s, lr=6.04e-5, step_loss=0.0151]
Steps:  43%|████▎     | 434/1000 [07:18<09:11,  1.03it/s, lr=6.04e-5, step_loss=0.0151]
Steps:  43%|████▎     | 434/1000 [07:18<09:11,  1.03it/s, lr=6.03e-5, step_loss=0.0576]
Steps:  43%|████▎     | 434/1000 [07:19<09:11,  1.03it/s, lr=6.03e-5, step_loss=0.00449]
Steps:  43%|████▎     | 434/1000 [07:19<09:11,  1.03it/s, lr=6.03e-5, step_loss=0.0179] 
Steps:  43%|████▎     | 434/1000 [07:19<09:11,  1.03it/s, lr=6.03e-5, step_loss=0.152] 
Steps:  44%|████▎     | 435/1000 [07:19<09:10,  1.03it/s, lr=6.03e-5, step_loss=0.152]
Steps:  44%|████▎     | 435/1000 [07:19<09:10,  1.03it/s, lr=6.01e-5, step_loss=0.0413]
Steps:  44%|████▎     | 435/1000 [07:20<09:10,  1.03it/s, lr=6.01e-5, step_loss=0.0382]
Steps:  44%|████▎     | 435/1000 [07:20<09:10,  1.03it/s, lr=6.01e-5, step_loss=0.00361]
Steps:  44%|████▎     | 435/1000 [07:20<09:10,  1.03it/s, lr=6.01e-5, step_loss=0.0169] 
Steps:  44%|████▎     | 436/1000 [07:20<09:09,  1.03it/s, lr=6.01e-5, step_loss=0.0169]
Steps:  44%|████▎     | 436/1000 [07:20<09:09,  1.03it/s, lr=6e-5, step_loss=0.0322]   
Steps:  44%|████▎     | 436/1000 [07:21<09:09,  1.03it/s, lr=6e-5, step_loss=0.0243]
Steps:  44%|████▎     | 436/1000 [07:21<09:09,  1.03it/s, lr=6e-5, step_loss=0.0527]
Steps:  44%|████▎     | 436/1000 [07:21<09:09,  1.03it/s, lr=6e-5, step_loss=0.0327]
Steps:  44%|████▎     | 437/1000 [07:21<09:08,  1.03it/s, lr=6e-5, step_loss=0.0327]
Steps:  44%|████▎     | 437/1000 [07:21<09:08,  1.03it/s, lr=5.98e-5, step_loss=0.0507]
Steps:  44%|████▎     | 437/1000 [07:22<09:08,  1.03it/s, lr=5.98e-5, step_loss=0.0134]
Steps:  44%|████▎     | 437/1000 [07:22<09:08,  1.03it/s, lr=5.98e-5, step_loss=0.177] 
Steps:  44%|████▎     | 437/1000 [07:22<09:08,  1.03it/s, lr=5.98e-5, step_loss=0.0365]
Steps:  44%|████▍     | 438/1000 [07:22<09:08,  1.03it/s, lr=5.98e-5, step_loss=0.0365]
Steps:  44%|████▍     | 438/1000 [07:22<09:08,  1.03it/s, lr=5.97e-5, step_loss=0.118] 
Steps:  44%|████▍     | 438/1000 [07:22<09:08,  1.03it/s, lr=5.97e-5, step_loss=0.0292]
Steps:  44%|████▍     | 438/1000 [07:23<09:08,  1.03it/s, lr=5.97e-5, step_loss=0.0739]
Steps:  44%|████▍     | 438/1000 [07:23<09:08,  1.03it/s, lr=5.97e-5, step_loss=0.177] 
Steps:  44%|████▍     | 439/1000 [07:23<09:07,  1.03it/s, lr=5.97e-5, step_loss=0.177]
Steps:  44%|████▍     | 439/1000 [07:23<09:07,  1.03it/s, lr=5.95e-5, step_loss=0.0248]
Steps:  44%|████▍     | 439/1000 [07:23<09:07,  1.03it/s, lr=5.95e-5, step_loss=0.0511]
Steps:  44%|████▍     | 439/1000 [07:24<09:07,  1.03it/s, lr=5.95e-5, step_loss=0.0188]
Steps:  44%|████▍     | 439/1000 [07:24<09:07,  1.03it/s, lr=5.95e-5, step_loss=0.0573]
Steps:  44%|████▍     | 440/1000 [07:24<09:06,  1.03it/s, lr=5.95e-5, step_loss=0.0573]
Steps:  44%|████▍     | 440/1000 [07:24<09:06,  1.03it/s, lr=5.94e-5, step_loss=0.00212]
Steps:  44%|████▍     | 440/1000 [07:24<09:06,  1.03it/s, lr=5.94e-5, step_loss=0.0438] 
Steps:  44%|████▍     | 440/1000 [07:25<09:06,  1.03it/s, lr=5.94e-5, step_loss=0.0233]
Steps:  44%|████▍     | 440/1000 [07:25<09:06,  1.03it/s, lr=5.94e-5, step_loss=0.00899]
Steps:  44%|████▍     | 441/1000 [07:25<09:05,  1.02it/s, lr=5.94e-5, step_loss=0.00899]
Steps:  44%|████▍     | 441/1000 [07:25<09:05,  1.02it/s, lr=5.92e-5, step_loss=0.065]  
Steps:  44%|████▍     | 441/1000 [07:25<09:05,  1.02it/s, lr=5.92e-5, step_loss=0.0756]
Steps:  44%|████▍     | 441/1000 [07:26<09:05,  1.02it/s, lr=5.92e-5, step_loss=0.00868]
Steps:  44%|████▍     | 441/1000 [07:26<09:05,  1.02it/s, lr=5.92e-5, step_loss=0.0954] 
Steps:  44%|████▍     | 442/1000 [07:26<09:04,  1.03it/s, lr=5.92e-5, step_loss=0.0954]
Steps:  44%|████▍     | 442/1000 [07:26<09:04,  1.03it/s, lr=5.91e-5, step_loss=0.00668]
Steps:  44%|████▍     | 442/1000 [07:26<09:04,  1.03it/s, lr=5.91e-5, step_loss=0.0592] 
Steps:  44%|████▍     | 442/1000 [07:27<09:04,  1.03it/s, lr=5.91e-5, step_loss=0.0717]
Steps:  44%|████▍     | 442/1000 [07:27<09:04,  1.03it/s, lr=5.91e-5, step_loss=0.0774]
Steps:  44%|████▍     | 443/1000 [07:27<09:03,  1.03it/s, lr=5.91e-5, step_loss=0.0774]
Steps:  44%|████▍     | 443/1000 [07:27<09:03,  1.03it/s, lr=5.89e-5, step_loss=0.039] 
Steps:  44%|████▍     | 443/1000 [07:27<09:03,  1.03it/s, lr=5.89e-5, step_loss=0.00686]
Steps:  44%|████▍     | 443/1000 [07:28<09:03,  1.03it/s, lr=5.89e-5, step_loss=0.0757] 
Steps:  44%|████▍     | 443/1000 [07:28<09:03,  1.03it/s, lr=5.89e-5, step_loss=0.0811]
Steps:  44%|████▍     | 444/1000 [07:28<09:02,  1.03it/s, lr=5.89e-5, step_loss=0.0811]
Steps:  44%|████▍     | 444/1000 [07:28<09:02,  1.03it/s, lr=5.88e-5, step_loss=0.0134]
Steps:  44%|████▍     | 444/1000 [07:28<09:02,  1.03it/s, lr=5.88e-5, step_loss=0.0604]
Steps:  44%|████▍     | 444/1000 [07:29<09:02,  1.03it/s, lr=5.88e-5, step_loss=0.0823]
Steps:  44%|████▍     | 444/1000 [07:29<09:02,  1.03it/s, lr=5.88e-5, step_loss=0.139] 
Steps:  44%|████▍     | 445/1000 [07:29<09:01,  1.03it/s, lr=5.88e-5, step_loss=0.139]
Steps:  44%|████▍     | 445/1000 [07:29<09:01,  1.03it/s, lr=5.86e-5, step_loss=0.0496]
Steps:  44%|████▍     | 445/1000 [07:29<09:01,  1.03it/s, lr=5.86e-5, step_loss=0.0906]
Steps:  44%|████▍     | 445/1000 [07:30<09:01,  1.03it/s, lr=5.86e-5, step_loss=0.0262]
Steps:  44%|████▍     | 445/1000 [07:30<09:01,  1.03it/s, lr=5.86e-5, step_loss=0.0191]
Steps:  45%|████▍     | 446/1000 [07:30<09:00,  1.03it/s, lr=5.86e-5, step_loss=0.0191]
Steps:  45%|████▍     | 446/1000 [07:30<09:00,  1.03it/s, lr=5.84e-5, step_loss=0.0451]
Steps:  45%|████▍     | 446/1000 [07:30<09:00,  1.03it/s, lr=5.84e-5, step_loss=0.00722]
Steps:  45%|████▍     | 446/1000 [07:31<09:00,  1.03it/s, lr=5.84e-5, step_loss=0.434]  
Steps:  45%|████▍     | 446/1000 [07:31<09:00,  1.03it/s, lr=5.84e-5, step_loss=0.126]
Steps:  45%|████▍     | 447/1000 [07:31<08:59,  1.03it/s, lr=5.84e-5, step_loss=0.126]
Steps:  45%|████▍     | 447/1000 [07:31<08:59,  1.03it/s, lr=5.83e-5, step_loss=0.32] 
Steps:  45%|████▍     | 447/1000 [07:31<08:59,  1.03it/s, lr=5.83e-5, step_loss=0.154]
Steps:  45%|████▍     | 447/1000 [07:31<08:59,  1.03it/s, lr=5.83e-5, step_loss=0.00872]
Steps:  45%|████▍     | 447/1000 [07:32<08:59,  1.03it/s, lr=5.83e-5, step_loss=0.00869]
Steps:  45%|████▍     | 448/1000 [07:32<08:58,  1.03it/s, lr=5.83e-5, step_loss=0.00869]
Steps:  45%|████▍     | 448/1000 [07:32<08:58,  1.03it/s, lr=5.81e-5, step_loss=0.0395] 
Steps:  45%|████▍     | 448/1000 [07:32<08:58,  1.03it/s, lr=5.81e-5, step_loss=0.0473]
Steps:  45%|████▍     | 448/1000 [07:32<08:58,  1.03it/s, lr=5.81e-5, step_loss=0.226] 
Steps:  45%|████▍     | 448/1000 [07:33<08:58,  1.03it/s, lr=5.81e-5, step_loss=0.0818]
Steps:  45%|████▍     | 449/1000 [07:33<08:57,  1.03it/s, lr=5.81e-5, step_loss=0.0818]
Steps:  45%|████▍     | 449/1000 [07:33<08:57,  1.03it/s, lr=5.8e-5, step_loss=0.0124] 
Steps:  45%|████▍     | 449/1000 [07:33<08:57,  1.03it/s, lr=5.8e-5, step_loss=0.0496]
Steps:  45%|████▍     | 449/1000 [07:33<08:57,  1.03it/s, lr=5.8e-5, step_loss=0.0028]
Steps:  45%|████▍     | 449/1000 [07:34<08:57,  1.03it/s, lr=5.8e-5, step_loss=0.00855]
Steps:  45%|████▌     | 450/1000 [07:34<08:56,  1.03it/s, lr=5.8e-5, step_loss=0.00855]
Steps:  45%|████▌     | 450/1000 [07:34<08:56,  1.03it/s, lr=5.78e-5, step_loss=0.148] 
Steps:  45%|████▌     | 450/1000 [07:34<08:56,  1.03it/s, lr=5.78e-5, step_loss=0.0175]
Steps:  45%|████▌     | 450/1000 [07:34<08:56,  1.03it/s, lr=5.78e-5, step_loss=0.198] 
Steps:  45%|████▌     | 450/1000 [07:35<08:56,  1.03it/s, lr=5.78e-5, step_loss=0.144]
Steps:  45%|████▌     | 451/1000 [07:35<08:55,  1.03it/s, lr=5.78e-5, step_loss=0.144]
Steps:  45%|████▌     | 451/1000 [07:35<08:55,  1.03it/s, lr=5.77e-5, step_loss=0.0383]
Steps:  45%|████▌     | 451/1000 [07:35<08:55,  1.03it/s, lr=5.77e-5, step_loss=0.0797]
Steps:  45%|████▌     | 451/1000 [07:35<08:55,  1.03it/s, lr=5.77e-5, step_loss=0.238] 
Steps:  45%|████▌     | 451/1000 [07:36<08:55,  1.03it/s, lr=5.77e-5, step_loss=0.141]
Steps:  45%|████▌     | 452/1000 [07:36<08:54,  1.03it/s, lr=5.77e-5, step_loss=0.141]
Steps:  45%|████▌     | 452/1000 [07:36<08:54,  1.03it/s, lr=5.75e-5, step_loss=0.198]
Steps:  45%|████▌     | 452/1000 [07:36<08:54,  1.03it/s, lr=5.75e-5, step_loss=0.0293]
Steps:  45%|████▌     | 452/1000 [07:36<08:54,  1.03it/s, lr=5.75e-5, step_loss=0.145] 
Steps:  45%|████▌     | 452/1000 [07:37<08:54,  1.03it/s, lr=5.75e-5, step_loss=0.0168]
Steps:  45%|████▌     | 453/1000 [07:37<08:53,  1.03it/s, lr=5.75e-5, step_loss=0.0168]
Steps:  45%|████▌     | 453/1000 [07:37<08:53,  1.03it/s, lr=5.74e-5, step_loss=0.0233]
Steps:  45%|████▌     | 453/1000 [07:37<08:53,  1.03it/s, lr=5.74e-5, step_loss=0.00739]
Steps:  45%|████▌     | 453/1000 [07:37<08:53,  1.03it/s, lr=5.74e-5, step_loss=0.0211] 
Steps:  45%|████▌     | 453/1000 [07:38<08:53,  1.03it/s, lr=5.74e-5, step_loss=0.0202]
Steps:  45%|████▌     | 454/1000 [07:38<08:51,  1.03it/s, lr=5.74e-5, step_loss=0.0202]
Steps:  45%|████▌     | 454/1000 [07:38<08:51,  1.03it/s, lr=5.72e-5, step_loss=0.471] 
Steps:  45%|████▌     | 454/1000 [07:38<08:51,  1.03it/s, lr=5.72e-5, step_loss=0.00229]
Steps:  45%|████▌     | 454/1000 [07:38<08:51,  1.03it/s, lr=5.72e-5, step_loss=0.0297] 
Steps:  45%|████▌     | 454/1000 [07:39<08:51,  1.03it/s, lr=5.72e-5, step_loss=0.00437]
Steps:  46%|████▌     | 455/1000 [07:39<08:50,  1.03it/s, lr=5.72e-5, step_loss=0.00437]
Steps:  46%|████▌     | 455/1000 [07:39<08:50,  1.03it/s, lr=5.7e-5, step_loss=0.0841]  
Steps:  46%|████▌     | 455/1000 [07:39<08:50,  1.03it/s, lr=5.7e-5, step_loss=0.193] 
Steps:  46%|████▌     | 455/1000 [07:39<08:50,  1.03it/s, lr=5.7e-5, step_loss=0.175]
Steps:  46%|████▌     | 455/1000 [07:40<08:50,  1.03it/s, lr=5.7e-5, step_loss=0.0307]
Steps:  46%|████▌     | 456/1000 [07:40<08:49,  1.03it/s, lr=5.7e-5, step_loss=0.0307]
Steps:  46%|████▌     | 456/1000 [07:40<08:49,  1.03it/s, lr=5.69e-5, step_loss=0.0563]
Steps:  46%|████▌     | 456/1000 [07:40<08:49,  1.03it/s, lr=5.69e-5, step_loss=0.00696]
Steps:  46%|████▌     | 456/1000 [07:40<08:49,  1.03it/s, lr=5.69e-5, step_loss=0.00281]
Steps:  46%|████▌     | 456/1000 [07:41<08:49,  1.03it/s, lr=5.69e-5, step_loss=0.0189] 
Steps:  46%|████▌     | 457/1000 [07:41<08:48,  1.03it/s, lr=5.69e-5, step_loss=0.0189]
Steps:  46%|████▌     | 457/1000 [07:41<08:48,  1.03it/s, lr=5.67e-5, step_loss=0.125] 
Steps:  46%|████▌     | 457/1000 [07:41<08:48,  1.03it/s, lr=5.67e-5, step_loss=0.445]
Steps:  46%|████▌     | 457/1000 [07:41<08:48,  1.03it/s, lr=5.67e-5, step_loss=0.0869]
Steps:  46%|████▌     | 457/1000 [07:41<08:48,  1.03it/s, lr=5.67e-5, step_loss=0.0074]
Steps:  46%|████▌     | 458/1000 [07:42<08:47,  1.03it/s, lr=5.67e-5, step_loss=0.0074]
Steps:  46%|████▌     | 458/1000 [07:42<08:47,  1.03it/s, lr=5.66e-5, step_loss=0.0263]
Steps:  46%|████▌     | 458/1000 [07:42<08:47,  1.03it/s, lr=5.66e-5, step_loss=0.0775]
Steps:  46%|████▌     | 458/1000 [07:42<08:47,  1.03it/s, lr=5.66e-5, step_loss=0.196] 
Steps:  46%|████▌     | 458/1000 [07:42<08:47,  1.03it/s, lr=5.66e-5, step_loss=0.0489]
Steps:  46%|████▌     | 459/1000 [07:43<08:46,  1.03it/s, lr=5.66e-5, step_loss=0.0489]
Steps:  46%|████▌     | 459/1000 [07:43<08:46,  1.03it/s, lr=5.64e-5, step_loss=0.0458]
Steps:  46%|████▌     | 459/1000 [07:43<08:46,  1.03it/s, lr=5.64e-5, step_loss=0.0346]
Steps:  46%|████▌     | 459/1000 [07:43<08:46,  1.03it/s, lr=5.64e-5, step_loss=0.0867]
Steps:  46%|████▌     | 459/1000 [07:43<08:46,  1.03it/s, lr=5.64e-5, step_loss=0.0797]
Steps:  46%|████▌     | 460/1000 [07:44<08:45,  1.03it/s, lr=5.64e-5, step_loss=0.0797]
Steps:  46%|████▌     | 460/1000 [07:44<08:45,  1.03it/s, lr=5.63e-5, step_loss=0.0556]
Steps:  46%|████▌     | 460/1000 [07:44<08:45,  1.03it/s, lr=5.63e-5, step_loss=0.334] 
Steps:  46%|████▌     | 460/1000 [07:44<08:45,  1.03it/s, lr=5.63e-5, step_loss=0.0298]
Steps:  46%|████▌     | 460/1000 [07:44<08:45,  1.03it/s, lr=5.63e-5, step_loss=0.0151]
Steps:  46%|████▌     | 461/1000 [07:45<08:44,  1.03it/s, lr=5.63e-5, step_loss=0.0151]
Steps:  46%|████▌     | 461/1000 [07:45<08:44,  1.03it/s, lr=5.61e-5, step_loss=0.196] 
Steps:  46%|████▌     | 461/1000 [07:45<08:44,  1.03it/s, lr=5.61e-5, step_loss=0.0435]
Steps:  46%|████▌     | 461/1000 [07:45<08:44,  1.03it/s, lr=5.61e-5, step_loss=0.0551]
Steps:  46%|████▌     | 461/1000 [07:45<08:44,  1.03it/s, lr=5.61e-5, step_loss=0.554] 
Steps:  46%|████▌     | 462/1000 [07:46<08:43,  1.03it/s, lr=5.61e-5, step_loss=0.554]
Steps:  46%|████▌     | 462/1000 [07:46<08:43,  1.03it/s, lr=5.6e-5, step_loss=0.00556]
Steps:  46%|████▌     | 462/1000 [07:46<08:43,  1.03it/s, lr=5.6e-5, step_loss=0.00471]
Steps:  46%|████▌     | 462/1000 [07:46<08:43,  1.03it/s, lr=5.6e-5, step_loss=0.075]  
Steps:  46%|████▌     | 462/1000 [07:46<08:43,  1.03it/s, lr=5.6e-5, step_loss=0.00688]
Steps:  46%|████▋     | 463/1000 [07:47<08:42,  1.03it/s, lr=5.6e-5, step_loss=0.00688]
Steps:  46%|████▋     | 463/1000 [07:47<08:42,  1.03it/s, lr=5.58e-5, step_loss=0.0135]
Steps:  46%|████▋     | 463/1000 [07:47<08:42,  1.03it/s, lr=5.58e-5, step_loss=0.011] 
Steps:  46%|████▋     | 463/1000 [07:47<08:42,  1.03it/s, lr=5.58e-5, step_loss=0.296]
Steps:  46%|████▋     | 463/1000 [07:47<08:42,  1.03it/s, lr=5.58e-5, step_loss=0.00946]
Steps:  46%|████▋     | 464/1000 [07:48<08:41,  1.03it/s, lr=5.58e-5, step_loss=0.00946]
Steps:  46%|████▋     | 464/1000 [07:48<08:41,  1.03it/s, lr=5.56e-5, step_loss=0.0214] 
Steps:  46%|████▋     | 464/1000 [07:48<08:41,  1.03it/s, lr=5.56e-5, step_loss=0.0193]
Steps:  46%|████▋     | 464/1000 [07:48<08:41,  1.03it/s, lr=5.56e-5, step_loss=0.199] 
Steps:  46%|████▋     | 464/1000 [07:48<08:41,  1.03it/s, lr=5.56e-5, step_loss=0.495]
Steps:  46%|████▋     | 465/1000 [07:49<08:41,  1.03it/s, lr=5.56e-5, step_loss=0.495]
Steps:  46%|████▋     | 465/1000 [07:49<08:41,  1.03it/s, lr=5.55e-5, step_loss=0.0249]
Steps:  46%|████▋     | 465/1000 [07:49<08:41,  1.03it/s, lr=5.55e-5, step_loss=0.0208]
Steps:  46%|████▋     | 465/1000 [07:49<08:41,  1.03it/s, lr=5.55e-5, step_loss=0.00744]
Steps:  46%|████▋     | 465/1000 [07:49<08:41,  1.03it/s, lr=5.55e-5, step_loss=0.00928]
Steps:  47%|████▋     | 466/1000 [07:49<08:41,  1.02it/s, lr=5.55e-5, step_loss=0.00928]
Steps:  47%|████▋     | 466/1000 [07:50<08:41,  1.02it/s, lr=5.53e-5, step_loss=0.0957] 
Steps:  47%|████▋     | 466/1000 [07:50<08:41,  1.02it/s, lr=5.53e-5, step_loss=0.0493]
Steps:  47%|████▋     | 466/1000 [07:50<08:41,  1.02it/s, lr=5.53e-5, step_loss=0.0634]
Steps:  47%|████▋     | 466/1000 [07:50<08:41,  1.02it/s, lr=5.53e-5, step_loss=0.0185]
Steps:  47%|████▋     | 467/1000 [07:50<08:40,  1.02it/s, lr=5.53e-5, step_loss=0.0185]
Steps:  47%|████▋     | 467/1000 [07:50<08:40,  1.02it/s, lr=5.52e-5, step_loss=0.105] 
Steps:  47%|████▋     | 467/1000 [07:51<08:40,  1.02it/s, lr=5.52e-5, step_loss=0.197]
Steps:  47%|████▋     | 467/1000 [07:51<08:40,  1.02it/s, lr=5.52e-5, step_loss=0.014]
Steps:  47%|████▋     | 467/1000 [07:51<08:40,  1.02it/s, lr=5.52e-5, step_loss=0.0306]
Steps:  47%|████▋     | 468/1000 [07:51<08:38,  1.03it/s, lr=5.52e-5, step_loss=0.0306]
Steps:  47%|████▋     | 468/1000 [07:51<08:38,  1.03it/s, lr=5.5e-5, step_loss=0.0217] 
Steps:  47%|████▋     | 468/1000 [07:52<08:38,  1.03it/s, lr=5.5e-5, step_loss=0.0446]
Steps:  47%|████▋     | 468/1000 [07:52<08:38,  1.03it/s, lr=5.5e-5, step_loss=0.0229]
Steps:  47%|████▋     | 468/1000 [07:52<08:38,  1.03it/s, lr=5.5e-5, step_loss=0.249] 
Steps:  47%|████▋     | 469/1000 [07:52<08:37,  1.03it/s, lr=5.5e-5, step_loss=0.249]
Steps:  47%|████▋     | 469/1000 [07:52<08:37,  1.03it/s, lr=5.49e-5, step_loss=0.012]
Steps:  47%|████▋     | 469/1000 [07:53<08:37,  1.03it/s, lr=5.49e-5, step_loss=0.104]
Steps:  47%|████▋     | 469/1000 [07:53<08:37,  1.03it/s, lr=5.49e-5, step_loss=0.175]
Steps:  47%|████▋     | 469/1000 [07:53<08:37,  1.03it/s, lr=5.49e-5, step_loss=0.128]
Steps:  47%|████▋     | 470/1000 [07:53<08:36,  1.03it/s, lr=5.49e-5, step_loss=0.128]
Steps:  47%|████▋     | 470/1000 [07:53<08:36,  1.03it/s, lr=5.47e-5, step_loss=0.378]
Steps:  47%|████▋     | 470/1000 [07:54<08:36,  1.03it/s, lr=5.47e-5, step_loss=0.00599]
Steps:  47%|████▋     | 470/1000 [07:54<08:36,  1.03it/s, lr=5.47e-5, step_loss=0.372]  
Steps:  47%|████▋     | 470/1000 [07:54<08:36,  1.03it/s, lr=5.47e-5, step_loss=0.0628]
Steps:  47%|████▋     | 471/1000 [07:54<08:35,  1.03it/s, lr=5.47e-5, step_loss=0.0628]
Steps:  47%|████▋     | 471/1000 [07:54<08:35,  1.03it/s, lr=5.45e-5, step_loss=0.056] 
Steps:  47%|████▋     | 471/1000 [07:55<08:35,  1.03it/s, lr=5.45e-5, step_loss=0.0396]
Steps:  47%|████▋     | 471/1000 [07:55<08:35,  1.03it/s, lr=5.45e-5, step_loss=0.112] 
Steps:  47%|████▋     | 471/1000 [07:55<08:35,  1.03it/s, lr=5.45e-5, step_loss=0.327]
Steps:  47%|████▋     | 472/1000 [07:55<08:34,  1.03it/s, lr=5.45e-5, step_loss=0.327]
Steps:  47%|████▋     | 472/1000 [07:55<08:34,  1.03it/s, lr=5.44e-5, step_loss=0.0635]
Steps:  47%|████▋     | 472/1000 [07:56<08:34,  1.03it/s, lr=5.44e-5, step_loss=0.129] 
Steps:  47%|████▋     | 472/1000 [07:56<08:34,  1.03it/s, lr=5.44e-5, step_loss=0.00766]
Steps:  47%|████▋     | 472/1000 [07:56<08:34,  1.03it/s, lr=5.44e-5, step_loss=0.193]  
Steps:  47%|████▋     | 473/1000 [07:56<08:33,  1.03it/s, lr=5.44e-5, step_loss=0.193]
Steps:  47%|████▋     | 473/1000 [07:56<08:33,  1.03it/s, lr=5.42e-5, step_loss=0.297]
Steps:  47%|████▋     | 473/1000 [07:57<08:33,  1.03it/s, lr=5.42e-5, step_loss=0.24] 
Steps:  47%|████▋     | 473/1000 [07:57<08:33,  1.03it/s, lr=5.42e-5, step_loss=0.14]
Steps:  47%|████▋     | 473/1000 [07:57<08:33,  1.03it/s, lr=5.42e-5, step_loss=0.161]
Steps:  47%|████▋     | 474/1000 [07:57<08:32,  1.03it/s, lr=5.42e-5, step_loss=0.161]
Steps:  47%|████▋     | 474/1000 [07:57<08:32,  1.03it/s, lr=5.41e-5, step_loss=0.072]
Steps:  47%|████▋     | 474/1000 [07:58<08:32,  1.03it/s, lr=5.41e-5, step_loss=0.0589]
Steps:  47%|████▋     | 474/1000 [07:58<08:32,  1.03it/s, lr=5.41e-5, step_loss=0.0165]
Steps:  47%|████▋     | 474/1000 [07:58<08:32,  1.03it/s, lr=5.41e-5, step_loss=0.0208]
Steps:  48%|████▊     | 475/1000 [07:58<08:30,  1.03it/s, lr=5.41e-5, step_loss=0.0208]
Steps:  48%|████▊     | 475/1000 [07:58<08:30,  1.03it/s, lr=5.39e-5, step_loss=0.101] 
Steps:  48%|████▊     | 475/1000 [07:59<08:30,  1.03it/s, lr=5.39e-5, step_loss=0.0718]
Steps:  48%|████▊     | 475/1000 [07:59<08:30,  1.03it/s, lr=5.39e-5, step_loss=0.00313]
Steps:  48%|████▊     | 475/1000 [07:59<08:30,  1.03it/s, lr=5.39e-5, step_loss=0.00846]
Steps:  48%|████▊     | 476/1000 [07:59<08:29,  1.03it/s, lr=5.39e-5, step_loss=0.00846]
Steps:  48%|████▊     | 476/1000 [07:59<08:29,  1.03it/s, lr=5.38e-5, step_loss=0.0333] 
Steps:  48%|████▊     | 476/1000 [07:59<08:29,  1.03it/s, lr=5.38e-5, step_loss=0.00974]
Steps:  48%|████▊     | 476/1000 [08:00<08:29,  1.03it/s, lr=5.38e-5, step_loss=0.0261] 
Steps:  48%|████▊     | 476/1000 [08:00<08:29,  1.03it/s, lr=5.38e-5, step_loss=0.276] 
Steps:  48%|████▊     | 477/1000 [08:00<08:29,  1.03it/s, lr=5.38e-5, step_loss=0.276]
Steps:  48%|████▊     | 477/1000 [08:00<08:29,  1.03it/s, lr=5.36e-5, step_loss=0.169]
Steps:  48%|████▊     | 477/1000 [08:00<08:29,  1.03it/s, lr=5.36e-5, step_loss=0.00289]
Steps:  48%|████▊     | 477/1000 [08:01<08:29,  1.03it/s, lr=5.36e-5, step_loss=0.0766] 
Steps:  48%|████▊     | 477/1000 [08:01<08:29,  1.03it/s, lr=5.36e-5, step_loss=0.00638]
Steps:  48%|████▊     | 478/1000 [08:01<08:27,  1.03it/s, lr=5.36e-5, step_loss=0.00638]
Steps:  48%|████▊     | 478/1000 [08:01<08:27,  1.03it/s, lr=5.35e-5, step_loss=0.0356] 
Steps:  48%|████▊     | 478/1000 [08:01<08:27,  1.03it/s, lr=5.35e-5, step_loss=0.0155]
Steps:  48%|████▊     | 478/1000 [08:02<08:27,  1.03it/s, lr=5.35e-5, step_loss=0.0128]
Steps:  48%|████▊     | 478/1000 [08:02<08:27,  1.03it/s, lr=5.35e-5, step_loss=0.116] 
Steps:  48%|████▊     | 479/1000 [08:02<08:26,  1.03it/s, lr=5.35e-5, step_loss=0.116]
Steps:  48%|████▊     | 479/1000 [08:02<08:26,  1.03it/s, lr=5.33e-5, step_loss=0.0603]
Steps:  48%|████▊     | 479/1000 [08:02<08:26,  1.03it/s, lr=5.33e-5, step_loss=0.673] 
Steps:  48%|████▊     | 479/1000 [08:03<08:26,  1.03it/s, lr=5.33e-5, step_loss=0.187]
Steps:  48%|████▊     | 479/1000 [08:03<08:26,  1.03it/s, lr=5.33e-5, step_loss=0.0307]
Steps:  48%|████▊     | 480/1000 [08:03<08:26,  1.03it/s, lr=5.33e-5, step_loss=0.0307]
Steps:  48%|████▊     | 480/1000 [08:03<08:26,  1.03it/s, lr=5.31e-5, step_loss=0.0913]
Steps:  48%|████▊     | 480/1000 [08:03<08:26,  1.03it/s, lr=5.31e-5, step_loss=0.0935]
Steps:  48%|████▊     | 480/1000 [08:04<08:26,  1.03it/s, lr=5.31e-5, step_loss=0.245] 
Steps:  48%|████▊     | 480/1000 [08:04<08:26,  1.03it/s, lr=5.31e-5, step_loss=0.00314]
Steps:  48%|████▊     | 481/1000 [08:04<08:24,  1.03it/s, lr=5.31e-5, step_loss=0.00314]
Steps:  48%|████▊     | 481/1000 [08:04<08:24,  1.03it/s, lr=5.3e-5, step_loss=0.0513]  
Steps:  48%|████▊     | 481/1000 [08:04<08:24,  1.03it/s, lr=5.3e-5, step_loss=0.00666]
Steps:  48%|████▊     | 481/1000 [08:05<08:24,  1.03it/s, lr=5.3e-5, step_loss=0.0462] 
Steps:  48%|████▊     | 481/1000 [08:05<08:24,  1.03it/s, lr=5.3e-5, step_loss=0.542] 
Steps:  48%|████▊     | 482/1000 [08:05<08:24,  1.03it/s, lr=5.3e-5, step_loss=0.542]
Steps:  48%|████▊     | 482/1000 [08:05<08:24,  1.03it/s, lr=5.28e-5, step_loss=0.235]
Steps:  48%|████▊     | 482/1000 [08:05<08:24,  1.03it/s, lr=5.28e-5, step_loss=0.212]
Steps:  48%|████▊     | 482/1000 [08:06<08:24,  1.03it/s, lr=5.28e-5, step_loss=0.232]
Steps:  48%|████▊     | 482/1000 [08:06<08:24,  1.03it/s, lr=5.28e-5, step_loss=0.072]
Steps:  48%|████▊     | 483/1000 [08:06<08:23,  1.03it/s, lr=5.28e-5, step_loss=0.072]
Steps:  48%|████▊     | 483/1000 [08:06<08:23,  1.03it/s, lr=5.27e-5, step_loss=0.267]
Steps:  48%|████▊     | 483/1000 [08:06<08:23,  1.03it/s, lr=5.27e-5, step_loss=0.0434]
Steps:  48%|████▊     | 483/1000 [08:07<08:23,  1.03it/s, lr=5.27e-5, step_loss=0.0873]
Steps:  48%|████▊     | 483/1000 [08:07<08:23,  1.03it/s, lr=5.27e-5, step_loss=0.0877]
Steps:  48%|████▊     | 484/1000 [08:07<08:22,  1.03it/s, lr=5.27e-5, step_loss=0.0877]
Steps:  48%|████▊     | 484/1000 [08:07<08:22,  1.03it/s, lr=5.25e-5, step_loss=0.00444]
Steps:  48%|████▊     | 484/1000 [08:07<08:22,  1.03it/s, lr=5.25e-5, step_loss=0.00958]
Steps:  48%|████▊     | 484/1000 [08:08<08:22,  1.03it/s, lr=5.25e-5, step_loss=0.00293]
Steps:  48%|████▊     | 484/1000 [08:08<08:22,  1.03it/s, lr=5.25e-5, step_loss=0.0119] 
Steps:  48%|████▊     | 485/1000 [08:08<08:21,  1.03it/s, lr=5.25e-5, step_loss=0.0119]
Steps:  48%|████▊     | 485/1000 [08:08<08:21,  1.03it/s, lr=5.24e-5, step_loss=0.105] 
Steps:  48%|████▊     | 485/1000 [08:08<08:21,  1.03it/s, lr=5.24e-5, step_loss=0.0459]
Steps:  48%|████▊     | 485/1000 [08:09<08:21,  1.03it/s, lr=5.24e-5, step_loss=0.0068]
Steps:  48%|████▊     | 485/1000 [08:09<08:21,  1.03it/s, lr=5.24e-5, step_loss=0.197] 
Steps:  49%|████▊     | 486/1000 [08:09<08:20,  1.03it/s, lr=5.24e-5, step_loss=0.197]
Steps:  49%|████▊     | 486/1000 [08:09<08:20,  1.03it/s, lr=5.22e-5, step_loss=0.0819]
Steps:  49%|████▊     | 486/1000 [08:09<08:20,  1.03it/s, lr=5.22e-5, step_loss=0.0622]
Steps:  49%|████▊     | 486/1000 [08:09<08:20,  1.03it/s, lr=5.22e-5, step_loss=0.412] 
Steps:  49%|████▊     | 486/1000 [08:10<08:20,  1.03it/s, lr=5.22e-5, step_loss=0.00212]
Steps:  49%|████▊     | 487/1000 [08:10<08:19,  1.03it/s, lr=5.22e-5, step_loss=0.00212]
Steps:  49%|████▊     | 487/1000 [08:10<08:19,  1.03it/s, lr=5.2e-5, step_loss=0.00663] 
Steps:  49%|████▊     | 487/1000 [08:10<08:19,  1.03it/s, lr=5.2e-5, step_loss=0.00395]
Steps:  49%|████▊     | 487/1000 [08:10<08:19,  1.03it/s, lr=5.2e-5, step_loss=0.0646] 
Steps:  49%|████▊     | 487/1000 [08:11<08:19,  1.03it/s, lr=5.2e-5, step_loss=0.0505]
Steps:  49%|████▉     | 488/1000 [08:11<08:18,  1.03it/s, lr=5.2e-5, step_loss=0.0505]
Steps:  49%|████▉     | 488/1000 [08:11<08:18,  1.03it/s, lr=5.19e-5, step_loss=0.0227]
Steps:  49%|████▉     | 488/1000 [08:11<08:18,  1.03it/s, lr=5.19e-5, step_loss=0.118] 
Steps:  49%|████▉     | 488/1000 [08:11<08:18,  1.03it/s, lr=5.19e-5, step_loss=0.256]
Steps:  49%|████▉     | 488/1000 [08:12<08:18,  1.03it/s, lr=5.19e-5, step_loss=0.135]
Steps:  49%|████▉     | 489/1000 [08:12<08:17,  1.03it/s, lr=5.19e-5, step_loss=0.135]
Steps:  49%|████▉     | 489/1000 [08:12<08:17,  1.03it/s, lr=5.17e-5, step_loss=0.276]
Steps:  49%|████▉     | 489/1000 [08:12<08:17,  1.03it/s, lr=5.17e-5, step_loss=0.0449]
Steps:  49%|████▉     | 489/1000 [08:12<08:17,  1.03it/s, lr=5.17e-5, step_loss=0.0059]
Steps:  49%|████▉     | 489/1000 [08:13<08:17,  1.03it/s, lr=5.17e-5, step_loss=0.00733]
Steps:  49%|████▉     | 490/1000 [08:13<08:16,  1.03it/s, lr=5.17e-5, step_loss=0.00733]
Steps:  49%|████▉     | 490/1000 [08:13<08:16,  1.03it/s, lr=5.16e-5, step_loss=0.077]  
Steps:  49%|████▉     | 490/1000 [08:13<08:16,  1.03it/s, lr=5.16e-5, step_loss=0.0654]
Steps:  49%|████▉     | 490/1000 [08:13<08:16,  1.03it/s, lr=5.16e-5, step_loss=0.177] 
Steps:  49%|████▉     | 490/1000 [08:14<08:16,  1.03it/s, lr=5.16e-5, step_loss=0.115]
Steps:  49%|████▉     | 491/1000 [08:14<08:15,  1.03it/s, lr=5.16e-5, step_loss=0.115]
Steps:  49%|████▉     | 491/1000 [08:14<08:15,  1.03it/s, lr=5.14e-5, step_loss=0.0561]
Steps:  49%|████▉     | 491/1000 [08:14<08:15,  1.03it/s, lr=5.14e-5, step_loss=0.0188]
Steps:  49%|████▉     | 491/1000 [08:14<08:15,  1.03it/s, lr=5.14e-5, step_loss=0.0126]
Steps:  49%|████▉     | 491/1000 [08:15<08:15,  1.03it/s, lr=5.14e-5, step_loss=0.00852]
Steps:  49%|████▉     | 492/1000 [08:15<08:14,  1.03it/s, lr=5.14e-5, step_loss=0.00852]
Steps:  49%|████▉     | 492/1000 [08:15<08:14,  1.03it/s, lr=5.13e-5, step_loss=0.0415] 
Steps:  49%|████▉     | 492/1000 [08:15<08:14,  1.03it/s, lr=5.13e-5, step_loss=0.00437]
Steps:  49%|████▉     | 492/1000 [08:15<08:14,  1.03it/s, lr=5.13e-5, step_loss=0.271]  
Steps:  49%|████▉     | 492/1000 [08:16<08:14,  1.03it/s, lr=5.13e-5, step_loss=0.0794]
Steps:  49%|████▉     | 493/1000 [08:16<08:13,  1.03it/s, lr=5.13e-5, step_loss=0.0794]
Steps:  49%|████▉     | 493/1000 [08:16<08:13,  1.03it/s, lr=5.11e-5, step_loss=0.0384]
Steps:  49%|████▉     | 493/1000 [08:16<08:13,  1.03it/s, lr=5.11e-5, step_loss=0.0201]
Steps:  49%|████▉     | 493/1000 [08:16<08:13,  1.03it/s, lr=5.11e-5, step_loss=0.013] 
Steps:  49%|████▉     | 493/1000 [08:17<08:13,  1.03it/s, lr=5.11e-5, step_loss=0.051]
Steps:  49%|████▉     | 494/1000 [08:17<08:12,  1.03it/s, lr=5.11e-5, step_loss=0.051]
Steps:  49%|████▉     | 494/1000 [08:17<08:12,  1.03it/s, lr=5.09e-5, step_loss=0.00807]
Steps:  49%|████▉     | 494/1000 [08:17<08:12,  1.03it/s, lr=5.09e-5, step_loss=0.00793]
Steps:  49%|████▉     | 494/1000 [08:17<08:12,  1.03it/s, lr=5.09e-5, step_loss=0.0103] 
Steps:  49%|████▉     | 494/1000 [08:18<08:12,  1.03it/s, lr=5.09e-5, step_loss=0.042] 
Steps:  50%|████▉     | 495/1000 [08:18<08:11,  1.03it/s, lr=5.09e-5, step_loss=0.042]
Steps:  50%|████▉     | 495/1000 [08:18<08:11,  1.03it/s, lr=5.08e-5, step_loss=0.0216]
Steps:  50%|████▉     | 495/1000 [08:18<08:11,  1.03it/s, lr=5.08e-5, step_loss=0.0171]
Steps:  50%|████▉     | 495/1000 [08:18<08:11,  1.03it/s, lr=5.08e-5, step_loss=0.014] 
Steps:  50%|████▉     | 495/1000 [08:18<08:11,  1.03it/s, lr=5.08e-5, step_loss=0.127]
Steps:  50%|████▉     | 496/1000 [08:19<08:10,  1.03it/s, lr=5.08e-5, step_loss=0.127]
Steps:  50%|████▉     | 496/1000 [08:19<08:10,  1.03it/s, lr=5.06e-5, step_loss=0.00786]
Steps:  50%|████▉     | 496/1000 [08:19<08:10,  1.03it/s, lr=5.06e-5, step_loss=0.158]  
Steps:  50%|████▉     | 496/1000 [08:19<08:10,  1.03it/s, lr=5.06e-5, step_loss=0.296]
Steps:  50%|████▉     | 496/1000 [08:19<08:10,  1.03it/s, lr=5.06e-5, step_loss=0.0108]
Steps:  50%|████▉     | 497/1000 [08:20<08:09,  1.03it/s, lr=5.06e-5, step_loss=0.0108]
Steps:  50%|████▉     | 497/1000 [08:20<08:09,  1.03it/s, lr=5.05e-5, step_loss=0.115] 
Steps:  50%|████▉     | 497/1000 [08:20<08:09,  1.03it/s, lr=5.05e-5, step_loss=0.185]
Steps:  50%|████▉     | 497/1000 [08:20<08:09,  1.03it/s, lr=5.05e-5, step_loss=0.269]
Steps:  50%|████▉     | 497/1000 [08:20<08:09,  1.03it/s, lr=5.05e-5, step_loss=0.158]
Steps:  50%|████▉     | 498/1000 [08:21<08:08,  1.03it/s, lr=5.05e-5, step_loss=0.158]
Steps:  50%|████▉     | 498/1000 [08:21<08:08,  1.03it/s, lr=5.03e-5, step_loss=0.0358]
Steps:  50%|████▉     | 498/1000 [08:21<08:08,  1.03it/s, lr=5.03e-5, step_loss=0.38]  
Steps:  50%|████▉     | 498/1000 [08:21<08:08,  1.03it/s, lr=5.03e-5, step_loss=0.00409]
Steps:  50%|████▉     | 498/1000 [08:21<08:08,  1.03it/s, lr=5.03e-5, step_loss=0.0736] 
Steps:  50%|████▉     | 499/1000 [08:22<08:07,  1.03it/s, lr=5.03e-5, step_loss=0.0736]
Steps:  50%|████▉     | 499/1000 [08:22<08:07,  1.03it/s, lr=5.02e-5, step_loss=0.00344]
Steps:  50%|████▉     | 499/1000 [08:22<08:07,  1.03it/s, lr=5.02e-5, step_loss=0.0172] 
Steps:  50%|████▉     | 499/1000 [08:22<08:07,  1.03it/s, lr=5.02e-5, step_loss=0.136] 
Steps:  50%|████▉     | 499/1000 [08:22<08:07,  1.03it/s, lr=5.02e-5, step_loss=0.115]
Steps:  50%|█████     | 500/1000 [08:23<08:06,  1.03it/s, lr=5.02e-5, step_loss=0.115]
Steps:  50%|█████     | 500/1000 [08:23<08:06,  1.03it/s, lr=5e-5, step_loss=0.112]   
Steps:  50%|█████     | 500/1000 [08:23<08:06,  1.03it/s, lr=5e-5, step_loss=0.00458]
Steps:  50%|█████     | 500/1000 [08:23<08:06,  1.03it/s, lr=5e-5, step_loss=0.0809] 
Steps:  50%|█████     | 500/1000 [08:23<08:06,  1.03it/s, lr=5e-5, step_loss=0.0279]
Steps:  50%|█████     | 501/1000 [08:24<08:05,  1.03it/s, lr=5e-5, step_loss=0.0279]
Steps:  50%|█████     | 501/1000 [08:24<08:05,  1.03it/s, lr=4.98e-5, step_loss=0.0198]
Steps:  50%|█████     | 501/1000 [08:24<08:05,  1.03it/s, lr=4.98e-5, step_loss=0.205] 
Steps:  50%|█████     | 501/1000 [08:24<08:05,  1.03it/s, lr=4.98e-5, step_loss=0.0115]
Steps:  50%|█████     | 501/1000 [08:24<08:05,  1.03it/s, lr=4.98e-5, step_loss=0.0578]
Steps:  50%|█████     | 502/1000 [08:25<08:04,  1.03it/s, lr=4.98e-5, step_loss=0.0578]
Steps:  50%|█████     | 502/1000 [08:25<08:04,  1.03it/s, lr=4.97e-5, step_loss=0.142] 
Steps:  50%|█████     | 502/1000 [08:25<08:04,  1.03it/s, lr=4.97e-5, step_loss=0.0314]
Steps:  50%|█████     | 502/1000 [08:25<08:04,  1.03it/s, lr=4.97e-5, step_loss=0.0138]
Steps:  50%|█████     | 502/1000 [08:25<08:04,  1.03it/s, lr=4.97e-5, step_loss=0.0245]
Steps:  50%|█████     | 503/1000 [08:25<08:03,  1.03it/s, lr=4.97e-5, step_loss=0.0245]
Steps:  50%|█████     | 503/1000 [08:26<08:03,  1.03it/s, lr=4.95e-5, step_loss=0.00426]
Steps:  50%|█████     | 503/1000 [08:26<08:03,  1.03it/s, lr=4.95e-5, step_loss=0.163]  
Steps:  50%|█████     | 503/1000 [08:26<08:03,  1.03it/s, lr=4.95e-5, step_loss=0.062]
Steps:  50%|█████     | 503/1000 [08:26<08:03,  1.03it/s, lr=4.95e-5, step_loss=0.0526]
Steps:  50%|█████     | 504/1000 [08:26<08:02,  1.03it/s, lr=4.95e-5, step_loss=0.0526]
Steps:  50%|█████     | 504/1000 [08:27<08:02,  1.03it/s, lr=4.94e-5, step_loss=0.0163]
Steps:  50%|█████     | 504/1000 [08:27<08:02,  1.03it/s, lr=4.94e-5, step_loss=0.00344]
Steps:  50%|█████     | 504/1000 [08:27<08:02,  1.03it/s, lr=4.94e-5, step_loss=0.167]  
Steps:  50%|█████     | 504/1000 [08:27<08:02,  1.03it/s, lr=4.94e-5, step_loss=0.0795]
Steps:  50%|█████     | 505/1000 [08:27<08:01,  1.03it/s, lr=4.94e-5, step_loss=0.0795]
Steps:  50%|█████     | 505/1000 [08:27<08:01,  1.03it/s, lr=4.92e-5, step_loss=0.201] 
Steps:  50%|█████     | 505/1000 [08:28<08:01,  1.03it/s, lr=4.92e-5, step_loss=0.058]
Steps:  50%|█████     | 505/1000 [08:28<08:01,  1.03it/s, lr=4.92e-5, step_loss=0.055]
Steps:  50%|█████     | 505/1000 [08:28<08:01,  1.03it/s, lr=4.92e-5, step_loss=0.00324]
Steps:  51%|█████     | 506/1000 [08:28<08:00,  1.03it/s, lr=4.92e-5, step_loss=0.00324]
Steps:  51%|█████     | 506/1000 [08:28<08:00,  1.03it/s, lr=4.91e-5, step_loss=0.0143] 
Steps:  51%|█████     | 506/1000 [08:29<08:00,  1.03it/s, lr=4.91e-5, step_loss=0.0599]
Steps:  51%|█████     | 506/1000 [08:29<08:00,  1.03it/s, lr=4.91e-5, step_loss=0.0342]
Steps:  51%|█████     | 506/1000 [08:29<08:00,  1.03it/s, lr=4.91e-5, step_loss=0.124] 
Steps:  51%|█████     | 507/1000 [08:29<07:59,  1.03it/s, lr=4.91e-5, step_loss=0.124]
Steps:  51%|█████     | 507/1000 [08:29<07:59,  1.03it/s, lr=4.89e-5, step_loss=0.00265]
Steps:  51%|█████     | 507/1000 [08:30<07:59,  1.03it/s, lr=4.89e-5, step_loss=0.139]  
Steps:  51%|█████     | 507/1000 [08:30<07:59,  1.03it/s, lr=4.89e-5, step_loss=0.165]
Steps:  51%|█████     | 507/1000 [08:30<07:59,  1.03it/s, lr=4.89e-5, step_loss=0.732]
Steps:  51%|█████     | 508/1000 [08:30<07:58,  1.03it/s, lr=4.89e-5, step_loss=0.732]
Steps:  51%|█████     | 508/1000 [08:30<07:58,  1.03it/s, lr=4.87e-5, step_loss=0.386]
Steps:  51%|█████     | 508/1000 [08:31<07:58,  1.03it/s, lr=4.87e-5, step_loss=0.185]
Steps:  51%|█████     | 508/1000 [08:31<07:58,  1.03it/s, lr=4.87e-5, step_loss=0.0237]
Steps:  51%|█████     | 508/1000 [08:31<07:58,  1.03it/s, lr=4.87e-5, step_loss=0.00771]
Steps:  51%|█████     | 509/1000 [08:31<07:57,  1.03it/s, lr=4.87e-5, step_loss=0.00771]
Steps:  51%|█████     | 509/1000 [08:31<07:57,  1.03it/s, lr=4.86e-5, step_loss=0.00576]
Steps:  51%|█████     | 509/1000 [08:32<07:57,  1.03it/s, lr=4.86e-5, step_loss=0.111]  
Steps:  51%|█████     | 509/1000 [08:32<07:57,  1.03it/s, lr=4.86e-5, step_loss=0.00848]
Steps:  51%|█████     | 509/1000 [08:32<07:57,  1.03it/s, lr=4.86e-5, step_loss=0.0128] 
Steps:  51%|█████     | 510/1000 [08:32<07:56,  1.03it/s, lr=4.86e-5, step_loss=0.0128]
Steps:  51%|█████     | 510/1000 [08:32<07:56,  1.03it/s, lr=4.84e-5, step_loss=0.102] 
Steps:  51%|█████     | 510/1000 [08:33<07:56,  1.03it/s, lr=4.84e-5, step_loss=0.417]
Steps:  51%|█████     | 510/1000 [08:33<07:56,  1.03it/s, lr=4.84e-5, step_loss=0.0104]
Steps:  51%|█████     | 510/1000 [08:33<07:56,  1.03it/s, lr=4.84e-5, step_loss=0.00606]
Steps:  51%|█████     | 511/1000 [08:33<07:55,  1.03it/s, lr=4.84e-5, step_loss=0.00606]
Steps:  51%|█████     | 511/1000 [08:33<07:55,  1.03it/s, lr=4.83e-5, step_loss=0.0309] 
Steps:  51%|█████     | 511/1000 [08:34<07:55,  1.03it/s, lr=4.83e-5, step_loss=0.00646]
Steps:  51%|█████     | 511/1000 [08:34<07:55,  1.03it/s, lr=4.83e-5, step_loss=0.0843] 
Steps:  51%|█████     | 511/1000 [08:34<07:55,  1.03it/s, lr=4.83e-5, step_loss=0.061] 
Steps:  51%|█████     | 512/1000 [08:34<07:54,  1.03it/s, lr=4.83e-5, step_loss=0.061]
Steps:  51%|█████     | 512/1000 [08:34<07:54,  1.03it/s, lr=4.81e-5, step_loss=0.0157]
Steps:  51%|█████     | 512/1000 [08:35<07:54,  1.03it/s, lr=4.81e-5, step_loss=0.00556]
Steps:  51%|█████     | 512/1000 [08:35<07:54,  1.03it/s, lr=4.81e-5, step_loss=0.0196] 
Steps:  51%|█████     | 512/1000 [08:35<07:54,  1.03it/s, lr=4.81e-5, step_loss=0.0732]
Steps:  51%|█████▏    | 513/1000 [08:35<07:53,  1.03it/s, lr=4.81e-5, step_loss=0.0732]
Steps:  51%|█████▏    | 513/1000 [08:35<07:53,  1.03it/s, lr=4.8e-5, step_loss=0.0105] 
Steps:  51%|█████▏    | 513/1000 [08:36<07:53,  1.03it/s, lr=4.8e-5, step_loss=0.00946]
Steps:  51%|█████▏    | 513/1000 [08:36<07:53,  1.03it/s, lr=4.8e-5, step_loss=0.161]  
Steps:  51%|█████▏    | 513/1000 [08:36<07:53,  1.03it/s, lr=4.8e-5, step_loss=0.219]
Steps:  51%|█████▏    | 514/1000 [08:36<07:52,  1.03it/s, lr=4.8e-5, step_loss=0.219]
Steps:  51%|█████▏    | 514/1000 [08:36<07:52,  1.03it/s, lr=4.78e-5, step_loss=0.0323]
Steps:  51%|█████▏    | 514/1000 [08:36<07:52,  1.03it/s, lr=4.78e-5, step_loss=0.00749]
Steps:  51%|█████▏    | 514/1000 [08:37<07:52,  1.03it/s, lr=4.78e-5, step_loss=0.00283]
Steps:  51%|█████▏    | 514/1000 [08:37<07:52,  1.03it/s, lr=4.78e-5, step_loss=0.123]  
Steps:  52%|█████▏    | 515/1000 [08:37<07:51,  1.03it/s, lr=4.78e-5, step_loss=0.123]
Steps:  52%|█████▏    | 515/1000 [08:37<07:51,  1.03it/s, lr=4.76e-5, step_loss=0.0551]
Steps:  52%|█████▏    | 515/1000 [08:37<07:51,  1.03it/s, lr=4.76e-5, step_loss=0.138] 
Steps:  52%|█████▏    | 515/1000 [08:38<07:51,  1.03it/s, lr=4.76e-5, step_loss=0.00869]
Steps:  52%|█████▏    | 515/1000 [08:38<07:51,  1.03it/s, lr=4.76e-5, step_loss=0.0146] 
Steps:  52%|█████▏    | 516/1000 [08:38<07:51,  1.03it/s, lr=4.76e-5, step_loss=0.0146]
Steps:  52%|█████▏    | 516/1000 [08:38<07:51,  1.03it/s, lr=4.75e-5, step_loss=0.203] 
Steps:  52%|█████▏    | 516/1000 [08:38<07:51,  1.03it/s, lr=4.75e-5, step_loss=0.00667]
Steps:  52%|█████▏    | 516/1000 [08:39<07:51,  1.03it/s, lr=4.75e-5, step_loss=0.193]  
Steps:  52%|█████▏    | 516/1000 [08:39<07:51,  1.03it/s, lr=4.75e-5, step_loss=0.0854]
Steps:  52%|█████▏    | 517/1000 [08:39<07:49,  1.03it/s, lr=4.75e-5, step_loss=0.0854]
Steps:  52%|█████▏    | 517/1000 [08:39<07:49,  1.03it/s, lr=4.73e-5, step_loss=0.0721]
Steps:  52%|█████▏    | 517/1000 [08:39<07:49,  1.03it/s, lr=4.73e-5, step_loss=0.121] 
Steps:  52%|█████▏    | 517/1000 [08:40<07:49,  1.03it/s, lr=4.73e-5, step_loss=0.0241]
Steps:  52%|█████▏    | 517/1000 [08:40<07:49,  1.03it/s, lr=4.73e-5, step_loss=0.0669]
Steps:  52%|█████▏    | 518/1000 [08:40<07:48,  1.03it/s, lr=4.73e-5, step_loss=0.0669]
Steps:  52%|█████▏    | 518/1000 [08:40<07:48,  1.03it/s, lr=4.72e-5, step_loss=0.247] 
Steps:  52%|█████▏    | 518/1000 [08:40<07:48,  1.03it/s, lr=4.72e-5, step_loss=0.0644]
Steps:  52%|█████▏    | 518/1000 [08:41<07:48,  1.03it/s, lr=4.72e-5, step_loss=0.0125]
Steps:  52%|█████▏    | 518/1000 [08:41<07:48,  1.03it/s, lr=4.72e-5, step_loss=0.0122]
Steps:  52%|█████▏    | 519/1000 [08:41<07:48,  1.03it/s, lr=4.72e-5, step_loss=0.0122]
Steps:  52%|█████▏    | 519/1000 [08:41<07:48,  1.03it/s, lr=4.7e-5, step_loss=0.489]  
Steps:  52%|█████▏    | 519/1000 [08:41<07:48,  1.03it/s, lr=4.7e-5, step_loss=0.0186]
Steps:  52%|█████▏    | 519/1000 [08:42<07:48,  1.03it/s, lr=4.7e-5, step_loss=0.211] 
Steps:  52%|█████▏    | 519/1000 [08:42<07:48,  1.03it/s, lr=4.7e-5, step_loss=0.0116]
Steps:  52%|█████▏    | 520/1000 [08:42<07:47,  1.03it/s, lr=4.7e-5, step_loss=0.0116]
Steps:  52%|█████▏    | 520/1000 [08:42<07:47,  1.03it/s, lr=4.69e-5, step_loss=0.435]
Steps:  52%|█████▏    | 520/1000 [08:42<07:47,  1.03it/s, lr=4.69e-5, step_loss=0.0792]
Steps:  52%|█████▏    | 520/1000 [08:43<07:47,  1.03it/s, lr=4.69e-5, step_loss=0.0102]
Steps:  52%|█████▏    | 520/1000 [08:43<07:47,  1.03it/s, lr=4.69e-5, step_loss=0.0344]
Steps:  52%|█████▏    | 521/1000 [08:43<07:46,  1.03it/s, lr=4.69e-5, step_loss=0.0344]
Steps:  52%|█████▏    | 521/1000 [08:43<07:46,  1.03it/s, lr=4.67e-5, step_loss=0.0244]
Steps:  52%|█████▏    | 521/1000 [08:43<07:46,  1.03it/s, lr=4.67e-5, step_loss=0.00841]
Steps:  52%|█████▏    | 521/1000 [08:44<07:46,  1.03it/s, lr=4.67e-5, step_loss=0.0164] 
Steps:  52%|█████▏    | 521/1000 [08:44<07:46,  1.03it/s, lr=4.67e-5, step_loss=0.0669]
Steps:  52%|█████▏    | 522/1000 [08:44<07:45,  1.03it/s, lr=4.67e-5, step_loss=0.0669]
Steps:  52%|█████▏    | 522/1000 [08:44<07:45,  1.03it/s, lr=4.65e-5, step_loss=0.0103]
Steps:  52%|█████▏    | 522/1000 [08:44<07:45,  1.03it/s, lr=4.65e-5, step_loss=0.0134]
Steps:  52%|█████▏    | 522/1000 [08:45<07:45,  1.03it/s, lr=4.65e-5, step_loss=0.067] 
Steps:  52%|█████▏    | 522/1000 [08:45<07:45,  1.03it/s, lr=4.65e-5, step_loss=0.0167]
Steps:  52%|█████▏    | 523/1000 [08:45<07:44,  1.03it/s, lr=4.65e-5, step_loss=0.0167]
Steps:  52%|█████▏    | 523/1000 [08:45<07:44,  1.03it/s, lr=4.64e-5, step_loss=0.0519]
Steps:  52%|█████▏    | 523/1000 [08:45<07:44,  1.03it/s, lr=4.64e-5, step_loss=0.144] 
Steps:  52%|█████▏    | 523/1000 [08:45<07:44,  1.03it/s, lr=4.64e-5, step_loss=0.356]
Steps:  52%|█████▏    | 523/1000 [08:46<07:44,  1.03it/s, lr=4.64e-5, step_loss=0.0342]
Steps:  52%|█████▏    | 524/1000 [08:46<07:43,  1.03it/s, lr=4.64e-5, step_loss=0.0342]
Steps:  52%|█████▏    | 524/1000 [08:46<07:43,  1.03it/s, lr=4.62e-5, step_loss=0.0105]
Steps:  52%|█████▏    | 524/1000 [08:46<07:43,  1.03it/s, lr=4.62e-5, step_loss=0.0842]
Steps:  52%|█████▏    | 524/1000 [08:46<07:43,  1.03it/s, lr=4.62e-5, step_loss=0.335] 
Steps:  52%|█████▏    | 524/1000 [08:47<07:43,  1.03it/s, lr=4.62e-5, step_loss=0.0892]
Steps:  52%|█████▎    | 525/1000 [08:47<07:42,  1.03it/s, lr=4.62e-5, step_loss=0.0892]
Steps:  52%|█████▎    | 525/1000 [08:47<07:42,  1.03it/s, lr=4.61e-5, step_loss=0.273] 
Steps:  52%|█████▎    | 525/1000 [08:47<07:42,  1.03it/s, lr=4.61e-5, step_loss=0.0225]
Steps:  52%|█████▎    | 525/1000 [08:47<07:42,  1.03it/s, lr=4.61e-5, step_loss=0.145] 
Steps:  52%|█████▎    | 525/1000 [08:48<07:42,  1.03it/s, lr=4.61e-5, step_loss=0.0138]
Steps:  53%|█████▎    | 526/1000 [08:48<07:41,  1.03it/s, lr=4.61e-5, step_loss=0.0138]
Steps:  53%|█████▎    | 526/1000 [08:48<07:41,  1.03it/s, lr=4.59e-5, step_loss=0.026] 
Steps:  53%|█████▎    | 526/1000 [08:48<07:41,  1.03it/s, lr=4.59e-5, step_loss=0.0529]
Steps:  53%|█████▎    | 526/1000 [08:48<07:41,  1.03it/s, lr=4.59e-5, step_loss=0.165] 
Steps:  53%|█████▎    | 526/1000 [08:49<07:41,  1.03it/s, lr=4.59e-5, step_loss=0.048]
Steps:  53%|█████▎    | 527/1000 [08:49<07:40,  1.03it/s, lr=4.59e-5, step_loss=0.048]
Steps:  53%|█████▎    | 527/1000 [08:49<07:40,  1.03it/s, lr=4.58e-5, step_loss=0.00579]
Steps:  53%|█████▎    | 527/1000 [08:49<07:40,  1.03it/s, lr=4.58e-5, step_loss=0.0212] 
Steps:  53%|█████▎    | 527/1000 [08:49<07:40,  1.03it/s, lr=4.58e-5, step_loss=0.00597]
Steps:  53%|█████▎    | 527/1000 [08:50<07:40,  1.03it/s, lr=4.58e-5, step_loss=0.00899]
Steps:  53%|█████▎    | 528/1000 [08:50<07:39,  1.03it/s, lr=4.58e-5, step_loss=0.00899]
Steps:  53%|█████▎    | 528/1000 [08:50<07:39,  1.03it/s, lr=4.56e-5, step_loss=0.0121] 
Steps:  53%|█████▎    | 528/1000 [08:50<07:39,  1.03it/s, lr=4.56e-5, step_loss=0.317] 
Steps:  53%|█████▎    | 528/1000 [08:50<07:39,  1.03it/s, lr=4.56e-5, step_loss=0.0119]
Steps:  53%|█████▎    | 528/1000 [08:51<07:39,  1.03it/s, lr=4.56e-5, step_loss=0.0934]
Steps:  53%|█████▎    | 529/1000 [08:51<07:38,  1.03it/s, lr=4.56e-5, step_loss=0.0934]
Steps:  53%|█████▎    | 529/1000 [08:51<07:38,  1.03it/s, lr=4.55e-5, step_loss=0.0343]
Steps:  53%|█████▎    | 529/1000 [08:51<07:38,  1.03it/s, lr=4.55e-5, step_loss=0.00536]
Steps:  53%|█████▎    | 529/1000 [08:51<07:38,  1.03it/s, lr=4.55e-5, step_loss=0.0799] 
Steps:  53%|█████▎    | 529/1000 [08:52<07:38,  1.03it/s, lr=4.55e-5, step_loss=0.0124]
Steps:  53%|█████▎    | 530/1000 [08:52<07:37,  1.03it/s, lr=4.55e-5, step_loss=0.0124]
Steps:  53%|█████▎    | 530/1000 [08:52<07:37,  1.03it/s, lr=4.53e-5, step_loss=0.0406]
Steps:  53%|█████▎    | 530/1000 [08:52<07:37,  1.03it/s, lr=4.53e-5, step_loss=0.416] 
Steps:  53%|█████▎    | 530/1000 [08:52<07:37,  1.03it/s, lr=4.53e-5, step_loss=0.0357]
Steps:  53%|█████▎    | 530/1000 [08:53<07:37,  1.03it/s, lr=4.53e-5, step_loss=0.573] 
Steps:  53%|█████▎    | 531/1000 [08:53<07:36,  1.03it/s, lr=4.53e-5, step_loss=0.573]
Steps:  53%|█████▎    | 531/1000 [08:53<07:36,  1.03it/s, lr=4.51e-5, step_loss=0.00567]
Steps:  53%|█████▎    | 531/1000 [08:53<07:36,  1.03it/s, lr=4.51e-5, step_loss=0.131]  
Steps:  53%|█████▎    | 531/1000 [08:53<07:36,  1.03it/s, lr=4.51e-5, step_loss=0.00271]
Steps:  53%|█████▎    | 531/1000 [08:54<07:36,  1.03it/s, lr=4.51e-5, step_loss=0.00743]
Steps:  53%|█████▎    | 532/1000 [08:54<07:35,  1.03it/s, lr=4.51e-5, step_loss=0.00743]
Steps:  53%|█████▎    | 532/1000 [08:54<07:35,  1.03it/s, lr=4.5e-5, step_loss=0.0769]  
Steps:  53%|█████▎    | 532/1000 [08:54<07:35,  1.03it/s, lr=4.5e-5, step_loss=0.0711]
Steps:  53%|█████▎    | 532/1000 [08:54<07:35,  1.03it/s, lr=4.5e-5, step_loss=0.00746]
Steps:  53%|█████▎    | 532/1000 [08:54<07:35,  1.03it/s, lr=4.5e-5, step_loss=0.162]  
Steps:  53%|█████▎    | 533/1000 [08:55<07:34,  1.03it/s, lr=4.5e-5, step_loss=0.162]
Steps:  53%|█████▎    | 533/1000 [08:55<07:34,  1.03it/s, lr=4.48e-5, step_loss=0.272]
Steps:  53%|█████▎    | 533/1000 [08:55<07:34,  1.03it/s, lr=4.48e-5, step_loss=0.0572]
Steps:  53%|█████▎    | 533/1000 [08:55<07:34,  1.03it/s, lr=4.48e-5, step_loss=0.00646]
Steps:  53%|█████▎    | 533/1000 [08:55<07:34,  1.03it/s, lr=4.48e-5, step_loss=0.0688] 
Steps:  53%|█████▎    | 534/1000 [08:56<07:33,  1.03it/s, lr=4.48e-5, step_loss=0.0688]
Steps:  53%|█████▎    | 534/1000 [08:56<07:33,  1.03it/s, lr=4.47e-5, step_loss=0.00452]
Steps:  53%|█████▎    | 534/1000 [08:56<07:33,  1.03it/s, lr=4.47e-5, step_loss=0.0396] 
Steps:  53%|█████▎    | 534/1000 [08:56<07:33,  1.03it/s, lr=4.47e-5, step_loss=0.032] 
Steps:  53%|█████▎    | 534/1000 [08:56<07:33,  1.03it/s, lr=4.47e-5, step_loss=0.2]  
Steps:  54%|█████▎    | 535/1000 [08:57<07:32,  1.03it/s, lr=4.47e-5, step_loss=0.2]
Steps:  54%|█████▎    | 535/1000 [08:57<07:32,  1.03it/s, lr=4.45e-5, step_loss=0.0464]
Steps:  54%|█████▎    | 535/1000 [08:57<07:32,  1.03it/s, lr=4.45e-5, step_loss=0.325] 
Steps:  54%|█████▎    | 535/1000 [08:57<07:32,  1.03it/s, lr=4.45e-5, step_loss=0.0291]
Steps:  54%|█████▎    | 535/1000 [08:57<07:32,  1.03it/s, lr=4.45e-5, step_loss=0.046] 
Steps:  54%|█████▎    | 536/1000 [08:58<07:31,  1.03it/s, lr=4.45e-5, step_loss=0.046]
Steps:  54%|█████▎    | 536/1000 [08:58<07:31,  1.03it/s, lr=4.44e-5, step_loss=0.0481]
Steps:  54%|█████▎    | 536/1000 [08:58<07:31,  1.03it/s, lr=4.44e-5, step_loss=0.131] 
Steps:  54%|█████▎    | 536/1000 [08:58<07:31,  1.03it/s, lr=4.44e-5, step_loss=0.161]
Steps:  54%|█████▎    | 536/1000 [08:58<07:31,  1.03it/s, lr=4.44e-5, step_loss=0.811]
Steps:  54%|█████▎    | 537/1000 [08:59<07:30,  1.03it/s, lr=4.44e-5, step_loss=0.811]
Steps:  54%|█████▎    | 537/1000 [08:59<07:30,  1.03it/s, lr=4.42e-5, step_loss=0.0125]
Steps:  54%|█████▎    | 537/1000 [08:59<07:30,  1.03it/s, lr=4.42e-5, step_loss=0.0407]
Steps:  54%|█████▎    | 537/1000 [08:59<07:30,  1.03it/s, lr=4.42e-5, step_loss=0.0151]
Steps:  54%|█████▎    | 537/1000 [08:59<07:30,  1.03it/s, lr=4.42e-5, step_loss=0.0239]
Steps:  54%|█████▍    | 538/1000 [09:00<07:29,  1.03it/s, lr=4.42e-5, step_loss=0.0239]
Steps:  54%|█████▍    | 538/1000 [09:00<07:29,  1.03it/s, lr=4.4e-5, step_loss=0.169]  
Steps:  54%|█████▍    | 538/1000 [09:00<07:29,  1.03it/s, lr=4.4e-5, step_loss=0.00605]
Steps:  54%|█████▍    | 538/1000 [09:00<07:29,  1.03it/s, lr=4.4e-5, step_loss=0.0982] 
Steps:  54%|█████▍    | 538/1000 [09:00<07:29,  1.03it/s, lr=4.4e-5, step_loss=0.163] 
Steps:  54%|█████▍    | 539/1000 [09:01<07:28,  1.03it/s, lr=4.4e-5, step_loss=0.163]
Steps:  54%|█████▍    | 539/1000 [09:01<07:28,  1.03it/s, lr=4.39e-5, step_loss=0.0187]
Steps:  54%|█████▍    | 539/1000 [09:01<07:28,  1.03it/s, lr=4.39e-5, step_loss=0.0699]
Steps:  54%|█████▍    | 539/1000 [09:01<07:28,  1.03it/s, lr=4.39e-5, step_loss=0.114] 
Steps:  54%|█████▍    | 539/1000 [09:01<07:28,  1.03it/s, lr=4.39e-5, step_loss=0.00241]
Steps:  54%|█████▍    | 540/1000 [09:01<07:27,  1.03it/s, lr=4.39e-5, step_loss=0.00241]
Steps:  54%|█████▍    | 540/1000 [09:02<07:27,  1.03it/s, lr=4.37e-5, step_loss=0.0625] 
Steps:  54%|█████▍    | 540/1000 [09:02<07:27,  1.03it/s, lr=4.37e-5, step_loss=0.00298]
Steps:  54%|█████▍    | 540/1000 [09:02<07:27,  1.03it/s, lr=4.37e-5, step_loss=0.228]  
Steps:  54%|█████▍    | 540/1000 [09:02<07:27,  1.03it/s, lr=4.37e-5, step_loss=0.0229]
Steps:  54%|█████▍    | 541/1000 [09:02<07:26,  1.03it/s, lr=4.37e-5, step_loss=0.0229]
Steps:  54%|█████▍    | 541/1000 [09:03<07:26,  1.03it/s, lr=4.36e-5, step_loss=0.0481]
Steps:  54%|█████▍    | 541/1000 [09:03<07:26,  1.03it/s, lr=4.36e-5, step_loss=0.0757]
Steps:  54%|█████▍    | 541/1000 [09:03<07:26,  1.03it/s, lr=4.36e-5, step_loss=0.0347]
Steps:  54%|█████▍    | 541/1000 [09:03<07:26,  1.03it/s, lr=4.36e-5, step_loss=0.19]  
Steps:  54%|█████▍    | 542/1000 [09:03<07:25,  1.03it/s, lr=4.36e-5, step_loss=0.19]
Steps:  54%|█████▍    | 542/1000 [09:03<07:25,  1.03it/s, lr=4.34e-5, step_loss=0.166]
Steps:  54%|█████▍    | 542/1000 [09:04<07:25,  1.03it/s, lr=4.34e-5, step_loss=0.107]
Steps:  54%|█████▍    | 542/1000 [09:04<07:25,  1.03it/s, lr=4.34e-5, step_loss=0.0285]
Steps:  54%|█████▍    | 542/1000 [09:04<07:25,  1.03it/s, lr=4.34e-5, step_loss=0.00532]
Steps:  54%|█████▍    | 543/1000 [09:04<07:24,  1.03it/s, lr=4.34e-5, step_loss=0.00532]
Steps:  54%|█████▍    | 543/1000 [09:04<07:24,  1.03it/s, lr=4.33e-5, step_loss=0.208]  
Steps:  54%|█████▍    | 543/1000 [09:05<07:24,  1.03it/s, lr=4.33e-5, step_loss=0.0586]
Steps:  54%|█████▍    | 543/1000 [09:05<07:24,  1.03it/s, lr=4.33e-5, step_loss=0.0515]
Steps:  54%|█████▍    | 543/1000 [09:05<07:24,  1.03it/s, lr=4.33e-5, step_loss=0.037] 
Steps:  54%|█████▍    | 544/1000 [09:05<07:23,  1.03it/s, lr=4.33e-5, step_loss=0.037]
Steps:  54%|█████▍    | 544/1000 [09:05<07:23,  1.03it/s, lr=4.31e-5, step_loss=0.0956]
Steps:  54%|█████▍    | 544/1000 [09:06<07:23,  1.03it/s, lr=4.31e-5, step_loss=0.031] 
Steps:  54%|█████▍    | 544/1000 [09:06<07:23,  1.03it/s, lr=4.31e-5, step_loss=0.0984]
Steps:  54%|█████▍    | 544/1000 [09:06<07:23,  1.03it/s, lr=4.31e-5, step_loss=0.11]  
Steps:  55%|█████▍    | 545/1000 [09:06<07:23,  1.03it/s, lr=4.31e-5, step_loss=0.11]
Steps:  55%|█████▍    | 545/1000 [09:06<07:23,  1.03it/s, lr=4.3e-5, step_loss=0.0116]
Steps:  55%|█████▍    | 545/1000 [09:07<07:23,  1.03it/s, lr=4.3e-5, step_loss=0.0207]
Steps:  55%|█████▍    | 545/1000 [09:07<07:23,  1.03it/s, lr=4.3e-5, step_loss=0.0456]
Steps:  55%|█████▍    | 545/1000 [09:07<07:23,  1.03it/s, lr=4.3e-5, step_loss=0.0907]
Steps:  55%|█████▍    | 546/1000 [09:07<07:22,  1.03it/s, lr=4.3e-5, step_loss=0.0907]
Steps:  55%|█████▍    | 546/1000 [09:07<07:22,  1.03it/s, lr=4.28e-5, step_loss=0.0275]
Steps:  55%|█████▍    | 546/1000 [09:08<07:22,  1.03it/s, lr=4.28e-5, step_loss=0.00289]
Steps:  55%|█████▍    | 546/1000 [09:08<07:22,  1.03it/s, lr=4.28e-5, step_loss=0.36]   
Steps:  55%|█████▍    | 546/1000 [09:08<07:22,  1.03it/s, lr=4.28e-5, step_loss=0.111]
Steps:  55%|█████▍    | 547/1000 [09:08<07:21,  1.03it/s, lr=4.28e-5, step_loss=0.111]
Steps:  55%|█████▍    | 547/1000 [09:08<07:21,  1.03it/s, lr=4.26e-5, step_loss=0.281]
Steps:  55%|█████▍    | 547/1000 [09:09<07:21,  1.03it/s, lr=4.26e-5, step_loss=0.00349]
Steps:  55%|█████▍    | 547/1000 [09:09<07:21,  1.03it/s, lr=4.26e-5, step_loss=0.483]  
Steps:  55%|█████▍    | 547/1000 [09:09<07:21,  1.03it/s, lr=4.26e-5, step_loss=0.137]
Steps:  55%|█████▍    | 548/1000 [09:09<07:20,  1.03it/s, lr=4.26e-5, step_loss=0.137]
Steps:  55%|█████▍    | 548/1000 [09:09<07:20,  1.03it/s, lr=4.25e-5, step_loss=0.39] 
Steps:  55%|█████▍    | 548/1000 [09:10<07:20,  1.03it/s, lr=4.25e-5, step_loss=0.00474]
Steps:  55%|█████▍    | 548/1000 [09:10<07:20,  1.03it/s, lr=4.25e-5, step_loss=0.0216] 
Steps:  55%|█████▍    | 548/1000 [09:10<07:20,  1.03it/s, lr=4.25e-5, step_loss=0.119] 
Steps:  55%|█████▍    | 549/1000 [09:10<07:19,  1.03it/s, lr=4.25e-5, step_loss=0.119]
Steps:  55%|█████▍    | 549/1000 [09:10<07:19,  1.03it/s, lr=4.23e-5, step_loss=0.00452]
Steps:  55%|█████▍    | 549/1000 [09:11<07:19,  1.03it/s, lr=4.23e-5, step_loss=0.0546] 
Steps:  55%|█████▍    | 549/1000 [09:11<07:19,  1.03it/s, lr=4.23e-5, step_loss=0.148] 
Steps:  55%|█████▍    | 549/1000 [09:11<07:19,  1.03it/s, lr=4.23e-5, step_loss=0.0188]
Steps:  55%|█████▌    | 550/1000 [09:11<07:18,  1.03it/s, lr=4.23e-5, step_loss=0.0188]
Steps:  55%|█████▌    | 550/1000 [09:11<07:18,  1.03it/s, lr=4.22e-5, step_loss=0.0279]
Steps:  55%|█████▌    | 550/1000 [09:12<07:18,  1.03it/s, lr=4.22e-5, step_loss=0.0459]
Steps:  55%|█████▌    | 550/1000 [09:12<07:18,  1.03it/s, lr=4.22e-5, step_loss=0.101] 
Steps:  55%|█████▌    | 550/1000 [09:12<07:18,  1.03it/s, lr=4.22e-5, step_loss=0.192]
Steps:  55%|█████▌    | 551/1000 [09:12<07:17,  1.03it/s, lr=4.22e-5, step_loss=0.192]
Steps:  55%|█████▌    | 551/1000 [09:12<07:17,  1.03it/s, lr=4.2e-5, step_loss=0.278] 
Steps:  55%|█████▌    | 551/1000 [09:12<07:17,  1.03it/s, lr=4.2e-5, step_loss=0.0246]
Steps:  55%|█████▌    | 551/1000 [09:13<07:17,  1.03it/s, lr=4.2e-5, step_loss=0.108] 
Steps:  55%|█████▌    | 551/1000 [09:13<07:17,  1.03it/s, lr=4.2e-5, step_loss=0.133]
Steps:  55%|█████▌    | 552/1000 [09:13<07:16,  1.03it/s, lr=4.2e-5, step_loss=0.133]
Steps:  55%|█████▌    | 552/1000 [09:13<07:16,  1.03it/s, lr=4.19e-5, step_loss=0.0701]
Steps:  55%|█████▌    | 552/1000 [09:13<07:16,  1.03it/s, lr=4.19e-5, step_loss=0.115] 
Steps:  55%|█████▌    | 552/1000 [09:14<07:16,  1.03it/s, lr=4.19e-5, step_loss=0.105]
Steps:  55%|█████▌    | 552/1000 [09:14<07:16,  1.03it/s, lr=4.19e-5, step_loss=0.363]
Steps:  55%|█████▌    | 553/1000 [09:14<07:15,  1.03it/s, lr=4.19e-5, step_loss=0.363]
Steps:  55%|█████▌    | 553/1000 [09:14<07:15,  1.03it/s, lr=4.17e-5, step_loss=0.0267]
Steps:  55%|█████▌    | 553/1000 [09:14<07:15,  1.03it/s, lr=4.17e-5, step_loss=0.0168]
Steps:  55%|█████▌    | 553/1000 [09:15<07:15,  1.03it/s, lr=4.17e-5, step_loss=0.126] 
Steps:  55%|█████▌    | 553/1000 [09:15<07:15,  1.03it/s, lr=4.17e-5, step_loss=0.0414]
Steps:  55%|█████▌    | 554/1000 [09:15<07:14,  1.03it/s, lr=4.17e-5, step_loss=0.0414]
Steps:  55%|█████▌    | 554/1000 [09:15<07:14,  1.03it/s, lr=4.16e-5, step_loss=0.0609]
Steps:  55%|█████▌    | 554/1000 [09:15<07:14,  1.03it/s, lr=4.16e-5, step_loss=0.0445]
Steps:  55%|█████▌    | 554/1000 [09:16<07:14,  1.03it/s, lr=4.16e-5, step_loss=0.182] 
Steps:  55%|█████▌    | 554/1000 [09:16<07:14,  1.03it/s, lr=4.16e-5, step_loss=0.0586]
Steps:  56%|█████▌    | 555/1000 [09:16<07:13,  1.03it/s, lr=4.16e-5, step_loss=0.0586]
Steps:  56%|█████▌    | 555/1000 [09:16<07:13,  1.03it/s, lr=4.14e-5, step_loss=0.0312]
Steps:  56%|█████▌    | 555/1000 [09:16<07:13,  1.03it/s, lr=4.14e-5, step_loss=0.41]  
Steps:  56%|█████▌    | 555/1000 [09:17<07:13,  1.03it/s, lr=4.14e-5, step_loss=0.132]
Steps:  56%|█████▌    | 555/1000 [09:17<07:13,  1.03it/s, lr=4.14e-5, step_loss=0.115]
Steps:  56%|█████▌    | 556/1000 [09:17<07:12,  1.03it/s, lr=4.14e-5, step_loss=0.115]
Steps:  56%|█████▌    | 556/1000 [09:17<07:12,  1.03it/s, lr=4.12e-5, step_loss=0.00393]
Steps:  56%|█████▌    | 556/1000 [09:17<07:12,  1.03it/s, lr=4.12e-5, step_loss=0.0433] 
Steps:  56%|█████▌    | 556/1000 [09:18<07:12,  1.03it/s, lr=4.12e-5, step_loss=0.0146]
Steps:  56%|█████▌    | 556/1000 [09:18<07:12,  1.03it/s, lr=4.12e-5, step_loss=0.319] 
Steps:  56%|█████▌    | 557/1000 [09:18<07:11,  1.03it/s, lr=4.12e-5, step_loss=0.319]
Steps:  56%|█████▌    | 557/1000 [09:18<07:11,  1.03it/s, lr=4.11e-5, step_loss=0.007]
Steps:  56%|█████▌    | 557/1000 [09:18<07:11,  1.03it/s, lr=4.11e-5, step_loss=0.0339]
Steps:  56%|█████▌    | 557/1000 [09:19<07:11,  1.03it/s, lr=4.11e-5, step_loss=0.0442]
Steps:  56%|█████▌    | 557/1000 [09:19<07:11,  1.03it/s, lr=4.11e-5, step_loss=0.0339]
Steps:  56%|█████▌    | 558/1000 [09:19<07:10,  1.03it/s, lr=4.11e-5, step_loss=0.0339]
Steps:  56%|█████▌    | 558/1000 [09:19<07:10,  1.03it/s, lr=4.09e-5, step_loss=0.0678]
Steps:  56%|█████▌    | 558/1000 [09:19<07:10,  1.03it/s, lr=4.09e-5, step_loss=0.0175]
Steps:  56%|█████▌    | 558/1000 [09:20<07:10,  1.03it/s, lr=4.09e-5, step_loss=0.572] 
Steps:  56%|█████▌    | 558/1000 [09:20<07:10,  1.03it/s, lr=4.09e-5, step_loss=0.0438]
Steps:  56%|█████▌    | 559/1000 [09:20<07:09,  1.03it/s, lr=4.09e-5, step_loss=0.0438]
Steps:  56%|█████▌    | 559/1000 [09:20<07:09,  1.03it/s, lr=4.08e-5, step_loss=0.0233]
Steps:  56%|█████▌    | 559/1000 [09:20<07:09,  1.03it/s, lr=4.08e-5, step_loss=0.00702]
Steps:  56%|█████▌    | 559/1000 [09:21<07:09,  1.03it/s, lr=4.08e-5, step_loss=0.00424]
Steps:  56%|█████▌    | 559/1000 [09:21<07:09,  1.03it/s, lr=4.08e-5, step_loss=0.0506] 
Steps:  56%|█████▌    | 560/1000 [09:21<07:08,  1.03it/s, lr=4.08e-5, step_loss=0.0506]
Steps:  56%|█████▌    | 560/1000 [09:21<07:08,  1.03it/s, lr=4.06e-5, step_loss=0.00388]
Steps:  56%|█████▌    | 560/1000 [09:21<07:08,  1.03it/s, lr=4.06e-5, step_loss=0.00815]
Steps:  56%|█████▌    | 560/1000 [09:22<07:08,  1.03it/s, lr=4.06e-5, step_loss=0.15]   
Steps:  56%|█████▌    | 560/1000 [09:22<07:08,  1.03it/s, lr=4.06e-5, step_loss=0.11]
Steps:  56%|█████▌    | 561/1000 [09:22<07:07,  1.03it/s, lr=4.06e-5, step_loss=0.11]
Steps:  56%|█████▌    | 561/1000 [09:22<07:07,  1.03it/s, lr=4.05e-5, step_loss=0.127]
Steps:  56%|█████▌    | 561/1000 [09:22<07:07,  1.03it/s, lr=4.05e-5, step_loss=0.024]
Steps:  56%|█████▌    | 561/1000 [09:22<07:07,  1.03it/s, lr=4.05e-5, step_loss=0.0508]
Steps:  56%|█████▌    | 561/1000 [09:23<07:07,  1.03it/s, lr=4.05e-5, step_loss=0.0328]
Steps:  56%|█████▌    | 562/1000 [09:23<07:06,  1.03it/s, lr=4.05e-5, step_loss=0.0328]
Steps:  56%|█████▌    | 562/1000 [09:23<07:06,  1.03it/s, lr=4.03e-5, step_loss=0.0716]
Steps:  56%|█████▌    | 562/1000 [09:23<07:06,  1.03it/s, lr=4.03e-5, step_loss=0.0424]
Steps:  56%|█████▌    | 562/1000 [09:23<07:06,  1.03it/s, lr=4.03e-5, step_loss=0.0414]
Steps:  56%|█████▌    | 562/1000 [09:24<07:06,  1.03it/s, lr=4.03e-5, step_loss=0.533] 
Steps:  56%|█████▋    | 563/1000 [09:24<07:05,  1.03it/s, lr=4.03e-5, step_loss=0.533]
Steps:  56%|█████▋    | 563/1000 [09:24<07:05,  1.03it/s, lr=4.02e-5, step_loss=0.00481]
Steps:  56%|█████▋    | 563/1000 [09:24<07:05,  1.03it/s, lr=4.02e-5, step_loss=0.0153] 
Steps:  56%|█████▋    | 563/1000 [09:24<07:05,  1.03it/s, lr=4.02e-5, step_loss=0.041] 
Steps:  56%|█████▋    | 563/1000 [09:25<07:05,  1.03it/s, lr=4.02e-5, step_loss=0.0169]
Steps:  56%|█████▋    | 564/1000 [09:25<07:04,  1.03it/s, lr=4.02e-5, step_loss=0.0169]
Steps:  56%|█████▋    | 564/1000 [09:25<07:04,  1.03it/s, lr=4e-5, step_loss=0.00767]  
Steps:  56%|█████▋    | 564/1000 [09:25<07:04,  1.03it/s, lr=4e-5, step_loss=0.166]  
Steps:  56%|█████▋    | 564/1000 [09:25<07:04,  1.03it/s, lr=4e-5, step_loss=0.0825]
Steps:  56%|█████▋    | 564/1000 [09:26<07:04,  1.03it/s, lr=4e-5, step_loss=0.185] 
Steps:  56%|█████▋    | 565/1000 [09:26<07:03,  1.03it/s, lr=4e-5, step_loss=0.185]
Steps:  56%|█████▋    | 565/1000 [09:26<07:03,  1.03it/s, lr=3.99e-5, step_loss=0.035]
Steps:  56%|█████▋    | 565/1000 [09:26<07:03,  1.03it/s, lr=3.99e-5, step_loss=0.092]
Steps:  56%|█████▋    | 565/1000 [09:26<07:03,  1.03it/s, lr=3.99e-5, step_loss=0.053]
Steps:  56%|█████▋    | 565/1000 [09:27<07:03,  1.03it/s, lr=3.99e-5, step_loss=0.0241]
Steps:  57%|█████▋    | 566/1000 [09:27<07:02,  1.03it/s, lr=3.99e-5, step_loss=0.0241]
Steps:  57%|█████▋    | 566/1000 [09:27<07:02,  1.03it/s, lr=3.97e-5, step_loss=0.019] 
Steps:  57%|█████▋    | 566/1000 [09:27<07:02,  1.03it/s, lr=3.97e-5, step_loss=0.0785]
Steps:  57%|█████▋    | 566/1000 [09:27<07:02,  1.03it/s, lr=3.97e-5, step_loss=0.00966]
Steps:  57%|█████▋    | 566/1000 [09:28<07:02,  1.03it/s, lr=3.97e-5, step_loss=0.0165] 
Steps:  57%|█████▋    | 567/1000 [09:28<07:01,  1.03it/s, lr=3.97e-5, step_loss=0.0165]
Steps:  57%|█████▋    | 567/1000 [09:28<07:01,  1.03it/s, lr=3.96e-5, step_loss=0.00496]
Steps:  57%|█████▋    | 567/1000 [09:28<07:01,  1.03it/s, lr=3.96e-5, step_loss=0.222]  
Steps:  57%|█████▋    | 567/1000 [09:28<07:01,  1.03it/s, lr=3.96e-5, step_loss=0.175]
Steps:  57%|█████▋    | 567/1000 [09:29<07:01,  1.03it/s, lr=3.96e-5, step_loss=0.0129]
Steps:  57%|█████▋    | 568/1000 [09:29<07:01,  1.03it/s, lr=3.96e-5, step_loss=0.0129]
Steps:  57%|█████▋    | 568/1000 [09:29<07:01,  1.03it/s, lr=3.94e-5, step_loss=0.0086]
Steps:  57%|█████▋    | 568/1000 [09:29<07:01,  1.03it/s, lr=3.94e-5, step_loss=0.00613]
Steps:  57%|█████▋    | 568/1000 [09:29<07:01,  1.03it/s, lr=3.94e-5, step_loss=0.0689] 
Steps:  57%|█████▋    | 568/1000 [09:30<07:01,  1.03it/s, lr=3.94e-5, step_loss=0.156] 
Steps:  57%|█████▋    | 569/1000 [09:30<07:00,  1.03it/s, lr=3.94e-5, step_loss=0.156]
Steps:  57%|█████▋    | 569/1000 [09:30<07:00,  1.03it/s, lr=3.92e-5, step_loss=0.0259]
Steps:  57%|█████▋    | 569/1000 [09:30<07:00,  1.03it/s, lr=3.92e-5, step_loss=0.0034]
Steps:  57%|█████▋    | 569/1000 [09:30<07:00,  1.03it/s, lr=3.92e-5, step_loss=0.105] 
Steps:  57%|█████▋    | 569/1000 [09:31<07:00,  1.03it/s, lr=3.92e-5, step_loss=0.326]
Steps:  57%|█████▋    | 570/1000 [09:31<06:58,  1.03it/s, lr=3.92e-5, step_loss=0.326]
Steps:  57%|█████▋    | 570/1000 [09:31<06:58,  1.03it/s, lr=3.91e-5, step_loss=0.0234]
Steps:  57%|█████▋    | 570/1000 [09:31<06:58,  1.03it/s, lr=3.91e-5, step_loss=0.0388]
Steps:  57%|█████▋    | 570/1000 [09:31<06:58,  1.03it/s, lr=3.91e-5, step_loss=0.393] 
Steps:  57%|█████▋    | 570/1000 [09:31<06:58,  1.03it/s, lr=3.91e-5, step_loss=0.0403]
Steps:  57%|█████▋    | 571/1000 [09:32<06:57,  1.03it/s, lr=3.91e-5, step_loss=0.0403]
Steps:  57%|█████▋    | 571/1000 [09:32<06:57,  1.03it/s, lr=3.89e-5, step_loss=0.0134]
Steps:  57%|█████▋    | 571/1000 [09:32<06:57,  1.03it/s, lr=3.89e-5, step_loss=0.098] 
Steps:  57%|█████▋    | 571/1000 [09:32<06:57,  1.03it/s, lr=3.89e-5, step_loss=0.215]
Steps:  57%|█████▋    | 571/1000 [09:32<06:57,  1.03it/s, lr=3.89e-5, step_loss=0.116]
Steps:  57%|█████▋    | 572/1000 [09:33<06:57,  1.03it/s, lr=3.89e-5, step_loss=0.116]
Steps:  57%|█████▋    | 572/1000 [09:33<06:57,  1.03it/s, lr=3.88e-5, step_loss=0.161]
Steps:  57%|█████▋    | 572/1000 [09:33<06:57,  1.03it/s, lr=3.88e-5, step_loss=0.0853]
Steps:  57%|█████▋    | 572/1000 [09:33<06:57,  1.03it/s, lr=3.88e-5, step_loss=0.0113]
Steps:  57%|█████▋    | 572/1000 [09:33<06:57,  1.03it/s, lr=3.88e-5, step_loss=0.00293]
Steps:  57%|█████▋    | 573/1000 [09:34<06:55,  1.03it/s, lr=3.88e-5, step_loss=0.00293]
Steps:  57%|█████▋    | 573/1000 [09:34<06:55,  1.03it/s, lr=3.86e-5, step_loss=0.0858] 
Steps:  57%|█████▋    | 573/1000 [09:34<06:55,  1.03it/s, lr=3.86e-5, step_loss=0.201] 
Steps:  57%|█████▋    | 573/1000 [09:34<06:55,  1.03it/s, lr=3.86e-5, step_loss=0.0603]
Steps:  57%|█████▋    | 573/1000 [09:34<06:55,  1.03it/s, lr=3.86e-5, step_loss=0.00386]
Steps:  57%|█████▋    | 574/1000 [09:35<06:55,  1.03it/s, lr=3.86e-5, step_loss=0.00386]
Steps:  57%|█████▋    | 574/1000 [09:35<06:55,  1.03it/s, lr=3.85e-5, step_loss=0.0101] 
Steps:  57%|█████▋    | 574/1000 [09:35<06:55,  1.03it/s, lr=3.85e-5, step_loss=0.0768]
Steps:  57%|█████▋    | 574/1000 [09:35<06:55,  1.03it/s, lr=3.85e-5, step_loss=0.0916]
Steps:  57%|█████▋    | 574/1000 [09:35<06:55,  1.03it/s, lr=3.85e-5, step_loss=0.0208]
Steps:  57%|█████▊    | 575/1000 [09:36<06:54,  1.03it/s, lr=3.85e-5, step_loss=0.0208]
Steps:  57%|█████▊    | 575/1000 [09:36<06:54,  1.03it/s, lr=3.83e-5, step_loss=0.0086]
Steps:  57%|█████▊    | 575/1000 [09:36<06:54,  1.03it/s, lr=3.83e-5, step_loss=0.11]  
Steps:  57%|█████▊    | 575/1000 [09:36<06:54,  1.03it/s, lr=3.83e-5, step_loss=0.0303]
Steps:  57%|█████▊    | 575/1000 [09:36<06:54,  1.03it/s, lr=3.83e-5, step_loss=0.0599]
Steps:  58%|█████▊    | 576/1000 [09:37<06:53,  1.03it/s, lr=3.83e-5, step_loss=0.0599]
Steps:  58%|█████▊    | 576/1000 [09:37<06:53,  1.03it/s, lr=3.82e-5, step_loss=0.00424]
Steps:  58%|█████▊    | 576/1000 [09:37<06:53,  1.03it/s, lr=3.82e-5, step_loss=0.0126] 
Steps:  58%|█████▊    | 576/1000 [09:37<06:53,  1.03it/s, lr=3.82e-5, step_loss=0.0036]
Steps:  58%|█████▊    | 576/1000 [09:37<06:53,  1.03it/s, lr=3.82e-5, step_loss=0.115] 
Steps:  58%|█████▊    | 577/1000 [09:38<06:52,  1.03it/s, lr=3.82e-5, step_loss=0.115]
Steps:  58%|█████▊    | 577/1000 [09:38<06:52,  1.03it/s, lr=3.8e-5, step_loss=0.0669]
Steps:  58%|█████▊    | 577/1000 [09:38<06:52,  1.03it/s, lr=3.8e-5, step_loss=0.0738]
Steps:  58%|█████▊    | 577/1000 [09:38<06:52,  1.03it/s, lr=3.8e-5, step_loss=0.0303]
Steps:  58%|█████▊    | 577/1000 [09:38<06:52,  1.03it/s, lr=3.8e-5, step_loss=0.0536]
Steps:  58%|█████▊    | 578/1000 [09:39<06:51,  1.03it/s, lr=3.8e-5, step_loss=0.0536]
Steps:  58%|█████▊    | 578/1000 [09:39<06:51,  1.03it/s, lr=3.79e-5, step_loss=0.00301]
Steps:  58%|█████▊    | 578/1000 [09:39<06:51,  1.03it/s, lr=3.79e-5, step_loss=0.203]  
Steps:  58%|█████▊    | 578/1000 [09:39<06:51,  1.03it/s, lr=3.79e-5, step_loss=0.00838]
Steps:  58%|█████▊    | 578/1000 [09:39<06:51,  1.03it/s, lr=3.79e-5, step_loss=0.188]  
Steps:  58%|█████▊    | 579/1000 [09:39<06:50,  1.03it/s, lr=3.79e-5, step_loss=0.188]
Steps:  58%|█████▊    | 579/1000 [09:40<06:50,  1.03it/s, lr=3.77e-5, step_loss=0.119]
Steps:  58%|█████▊    | 579/1000 [09:40<06:50,  1.03it/s, lr=3.77e-5, step_loss=0.259]
Steps:  58%|█████▊    | 579/1000 [09:40<06:50,  1.03it/s, lr=3.77e-5, step_loss=0.0222]
Steps:  58%|█████▊    | 579/1000 [09:40<06:50,  1.03it/s, lr=3.77e-5, step_loss=0.228] 
Steps:  58%|█████▊    | 580/1000 [09:40<06:49,  1.03it/s, lr=3.77e-5, step_loss=0.228]
Steps:  58%|█████▊    | 580/1000 [09:41<06:49,  1.03it/s, lr=3.76e-5, step_loss=0.0163]
Steps:  58%|█████▊    | 580/1000 [09:41<06:49,  1.03it/s, lr=3.76e-5, step_loss=0.065] 
Steps:  58%|█████▊    | 580/1000 [09:41<06:49,  1.03it/s, lr=3.76e-5, step_loss=0.00997]
Steps:  58%|█████▊    | 580/1000 [09:41<06:49,  1.03it/s, lr=3.76e-5, step_loss=0.276]  
Steps:  58%|█████▊    | 581/1000 [09:41<06:48,  1.03it/s, lr=3.76e-5, step_loss=0.276]
Steps:  58%|█████▊    | 581/1000 [09:41<06:48,  1.03it/s, lr=3.74e-5, step_loss=0.141]
Steps:  58%|█████▊    | 581/1000 [09:42<06:48,  1.03it/s, lr=3.74e-5, step_loss=0.0772]
Steps:  58%|█████▊    | 581/1000 [09:42<06:48,  1.03it/s, lr=3.74e-5, step_loss=0.136] 
Steps:  58%|█████▊    | 581/1000 [09:42<06:48,  1.03it/s, lr=3.74e-5, step_loss=0.216]
Steps:  58%|█████▊    | 582/1000 [09:42<06:47,  1.02it/s, lr=3.74e-5, step_loss=0.216]
Steps:  58%|█████▊    | 582/1000 [09:42<06:47,  1.02it/s, lr=3.73e-5, step_loss=0.025]
Steps:  58%|█████▊    | 582/1000 [09:43<06:47,  1.02it/s, lr=3.73e-5, step_loss=0.00727]
Steps:  58%|█████▊    | 582/1000 [09:43<06:47,  1.02it/s, lr=3.73e-5, step_loss=0.0135] 
Steps:  58%|█████▊    | 582/1000 [09:43<06:47,  1.02it/s, lr=3.73e-5, step_loss=0.184] 
Steps:  58%|█████▊    | 583/1000 [09:43<06:46,  1.02it/s, lr=3.73e-5, step_loss=0.184]
Steps:  58%|█████▊    | 583/1000 [09:43<06:46,  1.02it/s, lr=3.71e-5, step_loss=0.0292]
Steps:  58%|█████▊    | 583/1000 [09:44<06:46,  1.02it/s, lr=3.71e-5, step_loss=0.0561]
Steps:  58%|█████▊    | 583/1000 [09:44<06:46,  1.02it/s, lr=3.71e-5, step_loss=0.053] 
Steps:  58%|█████▊    | 583/1000 [09:44<06:46,  1.02it/s, lr=3.71e-5, step_loss=0.375]
Steps:  58%|█████▊    | 584/1000 [09:44<06:45,  1.02it/s, lr=3.71e-5, step_loss=0.375]
Steps:  58%|█████▊    | 584/1000 [09:44<06:45,  1.02it/s, lr=3.7e-5, step_loss=0.0399]
Steps:  58%|█████▊    | 584/1000 [09:45<06:45,  1.02it/s, lr=3.7e-5, step_loss=0.111] 
Steps:  58%|█████▊    | 584/1000 [09:45<06:45,  1.02it/s, lr=3.7e-5, step_loss=0.019]
Steps:  58%|█████▊    | 584/1000 [09:45<06:45,  1.02it/s, lr=3.7e-5, step_loss=0.00396]
Steps:  58%|█████▊    | 585/1000 [09:45<06:44,  1.03it/s, lr=3.7e-5, step_loss=0.00396]
Steps:  58%|█████▊    | 585/1000 [09:45<06:44,  1.03it/s, lr=3.68e-5, step_loss=0.087] 
Steps:  58%|█████▊    | 585/1000 [09:46<06:44,  1.03it/s, lr=3.68e-5, step_loss=0.00626]
Steps:  58%|█████▊    | 585/1000 [09:46<06:44,  1.03it/s, lr=3.68e-5, step_loss=0.71]   
Steps:  58%|█████▊    | 585/1000 [09:46<06:44,  1.03it/s, lr=3.68e-5, step_loss=0.228]
Steps:  59%|█████▊    | 586/1000 [09:46<06:43,  1.03it/s, lr=3.68e-5, step_loss=0.228]
Steps:  59%|█████▊    | 586/1000 [09:46<06:43,  1.03it/s, lr=3.67e-5, step_loss=0.0107]
Steps:  59%|█████▊    | 586/1000 [09:47<06:43,  1.03it/s, lr=3.67e-5, step_loss=0.135] 
Steps:  59%|█████▊    | 586/1000 [09:47<06:43,  1.03it/s, lr=3.67e-5, step_loss=0.0625]
Steps:  59%|█████▊    | 586/1000 [09:47<06:43,  1.03it/s, lr=3.67e-5, step_loss=0.051] 
Steps:  59%|█████▊    | 587/1000 [09:47<06:42,  1.03it/s, lr=3.67e-5, step_loss=0.051]
Steps:  59%|█████▊    | 587/1000 [09:47<06:42,  1.03it/s, lr=3.65e-5, step_loss=0.11] 
Steps:  59%|█████▊    | 587/1000 [09:48<06:42,  1.03it/s, lr=3.65e-5, step_loss=0.0959]
Steps:  59%|█████▊    | 587/1000 [09:48<06:42,  1.03it/s, lr=3.65e-5, step_loss=0.0361]
Steps:  59%|█████▊    | 587/1000 [09:48<06:42,  1.03it/s, lr=3.65e-5, step_loss=0.0879]
Steps:  59%|█████▉    | 588/1000 [09:48<06:41,  1.03it/s, lr=3.65e-5, step_loss=0.0879]
Steps:  59%|█████▉    | 588/1000 [09:48<06:41,  1.03it/s, lr=3.64e-5, step_loss=0.0036]
Steps:  59%|█████▉    | 588/1000 [09:49<06:41,  1.03it/s, lr=3.64e-5, step_loss=0.0144]
Steps:  59%|█████▉    | 588/1000 [09:49<06:41,  1.03it/s, lr=3.64e-5, step_loss=0.298] 
Steps:  59%|█████▉    | 588/1000 [09:49<06:41,  1.03it/s, lr=3.64e-5, step_loss=0.0929]
Steps:  59%|█████▉    | 589/1000 [09:49<06:40,  1.03it/s, lr=3.64e-5, step_loss=0.0929]
Steps:  59%|█████▉    | 589/1000 [09:49<06:40,  1.03it/s, lr=3.62e-5, step_loss=0.00217]
Steps:  59%|█████▉    | 589/1000 [09:50<06:40,  1.03it/s, lr=3.62e-5, step_loss=0.0996] 
Steps:  59%|█████▉    | 589/1000 [09:50<06:40,  1.03it/s, lr=3.62e-5, step_loss=0.189] 
Steps:  59%|█████▉    | 589/1000 [09:50<06:40,  1.03it/s, lr=3.62e-5, step_loss=0.0699]
Steps:  59%|█████▉    | 590/1000 [09:50<06:39,  1.03it/s, lr=3.62e-5, step_loss=0.0699]
Steps:  59%|█████▉    | 590/1000 [09:50<06:39,  1.03it/s, lr=3.61e-5, step_loss=0.0653]
Steps:  59%|█████▉    | 590/1000 [09:50<06:39,  1.03it/s, lr=3.61e-5, step_loss=0.0205]
Steps:  59%|█████▉    | 590/1000 [09:51<06:39,  1.03it/s, lr=3.61e-5, step_loss=0.0746]
Steps:  59%|█████▉    | 590/1000 [09:51<06:39,  1.03it/s, lr=3.61e-5, step_loss=0.0128]
Steps:  59%|█████▉    | 591/1000 [09:51<06:38,  1.03it/s, lr=3.61e-5, step_loss=0.0128]
Steps:  59%|█████▉    | 591/1000 [09:51<06:38,  1.03it/s, lr=3.59e-5, step_loss=0.431] 
Steps:  59%|█████▉    | 591/1000 [09:51<06:38,  1.03it/s, lr=3.59e-5, step_loss=0.0524]
Steps:  59%|█████▉    | 591/1000 [09:52<06:38,  1.03it/s, lr=3.59e-5, step_loss=0.0817]
Steps:  59%|█████▉    | 591/1000 [09:52<06:38,  1.03it/s, lr=3.59e-5, step_loss=0.0777]
Steps:  59%|█████▉    | 592/1000 [09:52<06:37,  1.03it/s, lr=3.59e-5, step_loss=0.0777]
Steps:  59%|█████▉    | 592/1000 [09:52<06:37,  1.03it/s, lr=3.57e-5, step_loss=0.0886]
Steps:  59%|█████▉    | 592/1000 [09:52<06:37,  1.03it/s, lr=3.57e-5, step_loss=0.159] 
Steps:  59%|█████▉    | 592/1000 [09:53<06:37,  1.03it/s, lr=3.57e-5, step_loss=0.00682]
Steps:  59%|█████▉    | 592/1000 [09:53<06:37,  1.03it/s, lr=3.57e-5, step_loss=0.0959] 
Steps:  59%|█████▉    | 593/1000 [09:53<06:36,  1.03it/s, lr=3.57e-5, step_loss=0.0959]
Steps:  59%|█████▉    | 593/1000 [09:53<06:36,  1.03it/s, lr=3.56e-5, step_loss=0.0143]
Steps:  59%|█████▉    | 593/1000 [09:53<06:36,  1.03it/s, lr=3.56e-5, step_loss=0.0315]
Steps:  59%|█████▉    | 593/1000 [09:54<06:36,  1.03it/s, lr=3.56e-5, step_loss=0.269] 
Steps:  59%|█████▉    | 593/1000 [09:54<06:36,  1.03it/s, lr=3.56e-5, step_loss=0.0337]
Steps:  59%|█████▉    | 594/1000 [09:54<06:35,  1.03it/s, lr=3.56e-5, step_loss=0.0337]
Steps:  59%|█████▉    | 594/1000 [09:54<06:35,  1.03it/s, lr=3.54e-5, step_loss=0.093] 
Steps:  59%|█████▉    | 594/1000 [09:54<06:35,  1.03it/s, lr=3.54e-5, step_loss=0.492]
Steps:  59%|█████▉    | 594/1000 [09:55<06:35,  1.03it/s, lr=3.54e-5, step_loss=0.0292]
Steps:  59%|█████▉    | 594/1000 [09:55<06:35,  1.03it/s, lr=3.54e-5, step_loss=0.0736]
Steps:  60%|█████▉    | 595/1000 [09:55<06:34,  1.03it/s, lr=3.54e-5, step_loss=0.0736]
Steps:  60%|█████▉    | 595/1000 [09:55<06:34,  1.03it/s, lr=3.53e-5, step_loss=0.0857]
Steps:  60%|█████▉    | 595/1000 [09:55<06:34,  1.03it/s, lr=3.53e-5, step_loss=0.00812]
Steps:  60%|█████▉    | 595/1000 [09:56<06:34,  1.03it/s, lr=3.53e-5, step_loss=0.0233] 
Steps:  60%|█████▉    | 595/1000 [09:56<06:34,  1.03it/s, lr=3.53e-5, step_loss=0.0379]
Steps:  60%|█████▉    | 596/1000 [09:56<06:33,  1.03it/s, lr=3.53e-5, step_loss=0.0379]
Steps:  60%|█████▉    | 596/1000 [09:56<06:33,  1.03it/s, lr=3.51e-5, step_loss=0.0603]
Steps:  60%|█████▉    | 596/1000 [09:56<06:33,  1.03it/s, lr=3.51e-5, step_loss=0.0132]
Steps:  60%|█████▉    | 596/1000 [09:57<06:33,  1.03it/s, lr=3.51e-5, step_loss=0.00472]
Steps:  60%|█████▉    | 596/1000 [09:57<06:33,  1.03it/s, lr=3.51e-5, step_loss=0.065]  
Steps:  60%|█████▉    | 597/1000 [09:57<06:32,  1.03it/s, lr=3.51e-5, step_loss=0.065]
Steps:  60%|█████▉    | 597/1000 [09:57<06:32,  1.03it/s, lr=3.5e-5, step_loss=0.0538]
Steps:  60%|█████▉    | 597/1000 [09:57<06:32,  1.03it/s, lr=3.5e-5, step_loss=0.00342]
Steps:  60%|█████▉    | 597/1000 [09:58<06:32,  1.03it/s, lr=3.5e-5, step_loss=0.0258] 
Steps:  60%|█████▉    | 597/1000 [09:58<06:32,  1.03it/s, lr=3.5e-5, step_loss=0.0637]
Steps:  60%|█████▉    | 598/1000 [09:58<06:31,  1.03it/s, lr=3.5e-5, step_loss=0.0637]
Steps:  60%|█████▉    | 598/1000 [09:58<06:31,  1.03it/s, lr=3.48e-5, step_loss=0.018]
Steps:  60%|█████▉    | 598/1000 [09:58<06:31,  1.03it/s, lr=3.48e-5, step_loss=0.0397]
Steps:  60%|█████▉    | 598/1000 [09:59<06:31,  1.03it/s, lr=3.48e-5, step_loss=0.0736]
Steps:  60%|█████▉    | 598/1000 [09:59<06:31,  1.03it/s, lr=3.48e-5, step_loss=0.121] 
Steps:  60%|█████▉    | 599/1000 [09:59<06:30,  1.03it/s, lr=3.48e-5, step_loss=0.121]
Steps:  60%|█████▉    | 599/1000 [09:59<06:30,  1.03it/s, lr=3.47e-5, step_loss=0.134]
Steps:  60%|█████▉    | 599/1000 [09:59<06:30,  1.03it/s, lr=3.47e-5, step_loss=0.021]
Steps:  60%|█████▉    | 599/1000 [10:00<06:30,  1.03it/s, lr=3.47e-5, step_loss=0.189]
Steps:  60%|█████▉    | 599/1000 [10:00<06:30,  1.03it/s, lr=3.47e-5, step_loss=0.00424]
Steps:  60%|██████    | 600/1000 [10:00<06:29,  1.03it/s, lr=3.47e-5, step_loss=0.00424]
Steps:  60%|██████    | 600/1000 [10:00<06:29,  1.03it/s, lr=3.45e-5, step_loss=0.0183] 
Steps:  60%|██████    | 600/1000 [10:00<06:29,  1.03it/s, lr=3.45e-5, step_loss=0.00272]
Steps:  60%|██████    | 600/1000 [10:00<06:29,  1.03it/s, lr=3.45e-5, step_loss=0.00219]
Steps:  60%|██████    | 600/1000 [10:01<06:29,  1.03it/s, lr=3.45e-5, step_loss=0.0163] 
Steps:  60%|██████    | 601/1000 [10:01<06:28,  1.03it/s, lr=3.45e-5, step_loss=0.0163]
Steps:  60%|██████    | 601/1000 [10:01<06:28,  1.03it/s, lr=3.44e-5, step_loss=0.00425]
Steps:  60%|██████    | 601/1000 [10:01<06:28,  1.03it/s, lr=3.44e-5, step_loss=0.00463]
Steps:  60%|██████    | 601/1000 [10:01<06:28,  1.03it/s, lr=3.44e-5, step_loss=0.133]  
Steps:  60%|██████    | 601/1000 [10:02<06:28,  1.03it/s, lr=3.44e-5, step_loss=0.0398]
Steps:  60%|██████    | 602/1000 [10:02<06:27,  1.03it/s, lr=3.44e-5, step_loss=0.0398]
Steps:  60%|██████    | 602/1000 [10:02<06:27,  1.03it/s, lr=3.43e-5, step_loss=0.122] 
Steps:  60%|██████    | 602/1000 [10:02<06:27,  1.03it/s, lr=3.43e-5, step_loss=0.0307]
Steps:  60%|██████    | 602/1000 [10:02<06:27,  1.03it/s, lr=3.43e-5, step_loss=0.00509]
Steps:  60%|██████    | 602/1000 [10:03<06:27,  1.03it/s, lr=3.43e-5, step_loss=0.235]  
Steps:  60%|██████    | 603/1000 [10:03<06:26,  1.03it/s, lr=3.43e-5, step_loss=0.235]
Steps:  60%|██████    | 603/1000 [10:03<06:26,  1.03it/s, lr=3.41e-5, step_loss=0.212]
Steps:  60%|██████    | 603/1000 [10:03<06:26,  1.03it/s, lr=3.41e-5, step_loss=0.0453]
Steps:  60%|██████    | 603/1000 [10:03<06:26,  1.03it/s, lr=3.41e-5, step_loss=0.0069]
Steps:  60%|██████    | 603/1000 [10:04<06:26,  1.03it/s, lr=3.41e-5, step_loss=0.0344]
Steps:  60%|██████    | 604/1000 [10:04<06:25,  1.03it/s, lr=3.41e-5, step_loss=0.0344]
Steps:  60%|██████    | 604/1000 [10:04<06:25,  1.03it/s, lr=3.4e-5, step_loss=0.0437] 
Steps:  60%|██████    | 604/1000 [10:04<06:25,  1.03it/s, lr=3.4e-5, step_loss=0.298] 
Steps:  60%|██████    | 604/1000 [10:04<06:25,  1.03it/s, lr=3.4e-5, step_loss=0.231]
Steps:  60%|██████    | 604/1000 [10:05<06:25,  1.03it/s, lr=3.4e-5, step_loss=0.0472]
Steps:  60%|██████    | 605/1000 [10:05<06:24,  1.03it/s, lr=3.4e-5, step_loss=0.0472]
Steps:  60%|██████    | 605/1000 [10:05<06:24,  1.03it/s, lr=3.38e-5, step_loss=0.00991]
Steps:  60%|██████    | 605/1000 [10:05<06:24,  1.03it/s, lr=3.38e-5, step_loss=0.0996] 
Steps:  60%|██████    | 605/1000 [10:05<06:24,  1.03it/s, lr=3.38e-5, step_loss=0.0113]
Steps:  60%|██████    | 605/1000 [10:06<06:24,  1.03it/s, lr=3.38e-5, step_loss=0.129] 
Steps:  61%|██████    | 606/1000 [10:06<06:23,  1.03it/s, lr=3.38e-5, step_loss=0.129]
Steps:  61%|██████    | 606/1000 [10:06<06:23,  1.03it/s, lr=3.37e-5, step_loss=0.0285]
Steps:  61%|██████    | 606/1000 [10:06<06:23,  1.03it/s, lr=3.37e-5, step_loss=0.269] 
Steps:  61%|██████    | 606/1000 [10:06<06:23,  1.03it/s, lr=3.37e-5, step_loss=0.0152]
Steps:  61%|██████    | 606/1000 [10:07<06:23,  1.03it/s, lr=3.37e-5, step_loss=0.317] 
Steps:  61%|██████    | 607/1000 [10:07<06:23,  1.03it/s, lr=3.37e-5, step_loss=0.317]
Steps:  61%|██████    | 607/1000 [10:07<06:23,  1.03it/s, lr=3.35e-5, step_loss=0.0192]
Steps:  61%|██████    | 607/1000 [10:07<06:23,  1.03it/s, lr=3.35e-5, step_loss=0.0304]
Steps:  61%|██████    | 607/1000 [10:07<06:23,  1.03it/s, lr=3.35e-5, step_loss=0.00485]
Steps:  61%|██████    | 607/1000 [10:08<06:23,  1.03it/s, lr=3.35e-5, step_loss=0.066]  
Steps:  61%|██████    | 608/1000 [10:08<06:21,  1.03it/s, lr=3.35e-5, step_loss=0.066]
Steps:  61%|██████    | 608/1000 [10:08<06:21,  1.03it/s, lr=3.34e-5, step_loss=0.00598]
Steps:  61%|██████    | 608/1000 [10:08<06:21,  1.03it/s, lr=3.34e-5, step_loss=0.146]  
Steps:  61%|██████    | 608/1000 [10:08<06:21,  1.03it/s, lr=3.34e-5, step_loss=0.093]
Steps:  61%|██████    | 608/1000 [10:09<06:21,  1.03it/s, lr=3.34e-5, step_loss=0.155]
Steps:  61%|██████    | 609/1000 [10:09<06:20,  1.03it/s, lr=3.34e-5, step_loss=0.155]
Steps:  61%|██████    | 609/1000 [10:09<06:20,  1.03it/s, lr=3.32e-5, step_loss=0.00424]
Steps:  61%|██████    | 609/1000 [10:09<06:20,  1.03it/s, lr=3.32e-5, step_loss=0.0236] 
Steps:  61%|██████    | 609/1000 [10:09<06:20,  1.03it/s, lr=3.32e-5, step_loss=0.0286]
Steps:  61%|██████    | 609/1000 [10:09<06:20,  1.03it/s, lr=3.32e-5, step_loss=0.00787]
Steps:  61%|██████    | 610/1000 [10:10<06:19,  1.03it/s, lr=3.32e-5, step_loss=0.00787]
Steps:  61%|██████    | 610/1000 [10:10<06:19,  1.03it/s, lr=3.31e-5, step_loss=0.02]   
Steps:  61%|██████    | 610/1000 [10:10<06:19,  1.03it/s, lr=3.31e-5, step_loss=0.315]
Steps:  61%|██████    | 610/1000 [10:10<06:19,  1.03it/s, lr=3.31e-5, step_loss=0.127]
Steps:  61%|██████    | 610/1000 [10:10<06:19,  1.03it/s, lr=3.31e-5, step_loss=0.0119]
Steps:  61%|██████    | 611/1000 [10:11<06:19,  1.03it/s, lr=3.31e-5, step_loss=0.0119]
Steps:  61%|██████    | 611/1000 [10:11<06:19,  1.03it/s, lr=3.29e-5, step_loss=0.00288]
Steps:  61%|██████    | 612/1000 [10:11<04:59,  1.29it/s, lr=3.29e-5, step_loss=0.00288]
Steps:  61%|██████    | 612/1000 [10:11<04:59,  1.29it/s, lr=3.28e-5, step_loss=0.129]
{'image_encoder', 'requires_safety_checker'} was not found in config. Values will be initialized to default values.
Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
{'timestep_spacing', 'prediction_type'} was not found in config. Values will be initialized to default values.
Loaded scheduler as PNDMScheduler from `scheduler` subfolder of runwayml/stable-diffusion-v1-5.
Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  43%|████▎     | 3/7 [00:00<00:00, 18.28it/s]
Loaded feature_extractor as CLIPImageProcessor from `feature_extractor` subfolder of runwayml/stable-diffusion-v1-5.
{'force_upcast', 'scaling_factor', 'use_post_quant_conv', 'use_quant_conv', 'latents_mean', 'latents_std', 'shift_factor'} was not found in config. Values will be initialized to default values.
Loaded vae as AutoencoderKL from `vae` subfolder of runwayml/stable-diffusion-v1-5.
Loaded safety_checker as StableDiffusionSafetyChecker from `safety_checker` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  86%|████████▌ | 6/7 [00:00<00:00, 14.00it/s]
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 15.23it/s]
07/28/2024 20:46:14 - INFO - __main__ - Running validation...
Generating 4 images with prompt: A naruto with blue eyes..
Steps:  61%|██████    | 612/1000 [10:21<04:59,  1.29it/s, lr=3.28e-5, step_loss=0.103]
Steps:  61%|██████    | 612/1000 [10:21<04:59,  1.29it/s, lr=3.28e-5, step_loss=0.0698]
Steps:  61%|██████    | 612/1000 [10:21<04:59,  1.29it/s, lr=3.28e-5, step_loss=0.0928]
Steps:  61%|██████▏   | 613/1000 [10:22<24:02,  3.73s/it, lr=3.28e-5, step_loss=0.0928]
Steps:  61%|██████▏   | 613/1000 [10:22<24:02,  3.73s/it, lr=3.26e-5, step_loss=0.0185]
Steps:  61%|██████▏   | 613/1000 [10:22<24:02,  3.73s/it, lr=3.26e-5, step_loss=0.00468]
Steps:  61%|██████▏   | 613/1000 [10:22<24:02,  3.73s/it, lr=3.26e-5, step_loss=0.0121] 
Steps:  61%|██████▏   | 613/1000 [10:22<24:02,  3.73s/it, lr=3.26e-5, step_loss=0.00441]
Steps:  61%|██████▏   | 614/1000 [10:23<18:39,  2.90s/it, lr=3.26e-5, step_loss=0.00441]
Steps:  61%|██████▏   | 614/1000 [10:23<18:39,  2.90s/it, lr=3.25e-5, step_loss=0.104]  
Steps:  61%|██████▏   | 614/1000 [10:23<18:39,  2.90s/it, lr=3.25e-5, step_loss=0.0108]
Steps:  61%|██████▏   | 614/1000 [10:23<18:39,  2.90s/it, lr=3.25e-5, step_loss=0.0138]
Steps:  61%|██████▏   | 614/1000 [10:23<18:39,  2.90s/it, lr=3.25e-5, step_loss=0.165] 
Steps:  62%|██████▏   | 615/1000 [10:24<14:53,  2.32s/it, lr=3.25e-5, step_loss=0.165]
Steps:  62%|██████▏   | 615/1000 [10:24<14:53,  2.32s/it, lr=3.23e-5, step_loss=0.0235]
Steps:  62%|██████▏   | 615/1000 [10:24<14:53,  2.32s/it, lr=3.23e-5, step_loss=0.0447]
Steps:  62%|██████▏   | 615/1000 [10:24<14:53,  2.32s/it, lr=3.23e-5, step_loss=0.0412]
Steps:  62%|██████▏   | 615/1000 [10:24<14:53,  2.32s/it, lr=3.23e-5, step_loss=0.147] 
Steps:  62%|██████▏   | 616/1000 [10:25<12:16,  1.92s/it, lr=3.23e-5, step_loss=0.147]
Steps:  62%|██████▏   | 616/1000 [10:25<12:16,  1.92s/it, lr=3.22e-5, step_loss=0.0224]
Steps:  62%|██████▏   | 616/1000 [10:25<12:16,  1.92s/it, lr=3.22e-5, step_loss=0.0371]
Steps:  62%|██████▏   | 616/1000 [10:25<12:16,  1.92s/it, lr=3.22e-5, step_loss=0.0475]
Steps:  62%|██████▏   | 616/1000 [10:25<12:16,  1.92s/it, lr=3.22e-5, step_loss=0.0668]
Steps:  62%|██████▏   | 617/1000 [10:25<10:25,  1.63s/it, lr=3.22e-5, step_loss=0.0668]
Steps:  62%|██████▏   | 617/1000 [10:26<10:25,  1.63s/it, lr=3.2e-5, step_loss=0.0182] 
Steps:  62%|██████▏   | 617/1000 [10:26<10:25,  1.63s/it, lr=3.2e-5, step_loss=0.0145]
Steps:  62%|██████▏   | 617/1000 [10:26<10:25,  1.63s/it, lr=3.2e-5, step_loss=0.0239]
Steps:  62%|██████▏   | 617/1000 [10:26<10:25,  1.63s/it, lr=3.2e-5, step_loss=0.0852]
Steps:  62%|██████▏   | 618/1000 [10:26<09:08,  1.44s/it, lr=3.2e-5, step_loss=0.0852]
Steps:  62%|██████▏   | 618/1000 [10:26<09:08,  1.44s/it, lr=3.19e-5, step_loss=0.109]
Steps:  62%|██████▏   | 618/1000 [10:27<09:08,  1.44s/it, lr=3.19e-5, step_loss=0.042]
Steps:  62%|██████▏   | 618/1000 [10:27<09:08,  1.44s/it, lr=3.19e-5, step_loss=0.283]
Steps:  62%|██████▏   | 618/1000 [10:27<09:08,  1.44s/it, lr=3.19e-5, step_loss=0.00473]
Steps:  62%|██████▏   | 619/1000 [10:27<08:14,  1.30s/it, lr=3.19e-5, step_loss=0.00473]
Steps:  62%|██████▏   | 619/1000 [10:27<08:14,  1.30s/it, lr=3.17e-5, step_loss=0.418]  
Steps:  62%|██████▏   | 619/1000 [10:28<08:14,  1.30s/it, lr=3.17e-5, step_loss=0.00277]
Steps:  62%|██████▏   | 619/1000 [10:28<08:14,  1.30s/it, lr=3.17e-5, step_loss=0.0857] 
Steps:  62%|██████▏   | 619/1000 [10:28<08:14,  1.30s/it, lr=3.17e-5, step_loss=0.261] 
Steps:  62%|██████▏   | 620/1000 [10:28<07:36,  1.20s/it, lr=3.17e-5, step_loss=0.261]
Steps:  62%|██████▏   | 620/1000 [10:28<07:36,  1.20s/it, lr=3.16e-5, step_loss=0.144]
Steps:  62%|██████▏   | 620/1000 [10:29<07:36,  1.20s/it, lr=3.16e-5, step_loss=0.0688]
Steps:  62%|██████▏   | 620/1000 [10:29<07:36,  1.20s/it, lr=3.16e-5, step_loss=0.0188]
Steps:  62%|██████▏   | 620/1000 [10:29<07:36,  1.20s/it, lr=3.16e-5, step_loss=0.134] 
Steps:  62%|██████▏   | 621/1000 [10:29<07:09,  1.13s/it, lr=3.16e-5, step_loss=0.134]
Steps:  62%|██████▏   | 621/1000 [10:29<07:09,  1.13s/it, lr=3.14e-5, step_loss=0.0741]
Steps:  62%|██████▏   | 621/1000 [10:30<07:09,  1.13s/it, lr=3.14e-5, step_loss=0.0782]
Steps:  62%|██████▏   | 621/1000 [10:30<07:09,  1.13s/it, lr=3.14e-5, step_loss=0.0612]
Steps:  62%|██████▏   | 621/1000 [10:30<07:09,  1.13s/it, lr=3.14e-5, step_loss=0.0448]
Steps:  62%|██████▏   | 622/1000 [10:30<06:49,  1.08s/it, lr=3.14e-5, step_loss=0.0448]
Steps:  62%|██████▏   | 622/1000 [10:30<06:49,  1.08s/it, lr=3.13e-5, step_loss=0.0238]
Steps:  62%|██████▏   | 622/1000 [10:31<06:49,  1.08s/it, lr=3.13e-5, step_loss=0.107] 
Steps:  62%|██████▏   | 622/1000 [10:31<06:49,  1.08s/it, lr=3.13e-5, step_loss=0.106]
Steps:  62%|██████▏   | 622/1000 [10:31<06:49,  1.08s/it, lr=3.13e-5, step_loss=0.0542]
Steps:  62%|██████▏   | 623/1000 [10:31<06:36,  1.05s/it, lr=3.13e-5, step_loss=0.0542]
Steps:  62%|██████▏   | 623/1000 [10:31<06:36,  1.05s/it, lr=3.12e-5, step_loss=0.00626]
Steps:  62%|██████▏   | 623/1000 [10:32<06:36,  1.05s/it, lr=3.12e-5, step_loss=0.0156] 
Steps:  62%|██████▏   | 623/1000 [10:32<06:36,  1.05s/it, lr=3.12e-5, step_loss=0.00733]
Steps:  62%|██████▏   | 623/1000 [10:32<06:36,  1.05s/it, lr=3.12e-5, step_loss=0.0444] 
Steps:  62%|██████▏   | 624/1000 [10:32<06:26,  1.03s/it, lr=3.12e-5, step_loss=0.0444]
Steps:  62%|██████▏   | 624/1000 [10:32<06:26,  1.03s/it, lr=3.1e-5, step_loss=0.0995] 
Steps:  62%|██████▏   | 624/1000 [10:33<06:26,  1.03s/it, lr=3.1e-5, step_loss=0.0993]
Steps:  62%|██████▏   | 624/1000 [10:33<06:26,  1.03s/it, lr=3.1e-5, step_loss=0.0616]
Steps:  62%|██████▏   | 624/1000 [10:33<06:26,  1.03s/it, lr=3.1e-5, step_loss=0.00919]
Steps:  62%|██████▎   | 625/1000 [10:33<06:19,  1.01s/it, lr=3.1e-5, step_loss=0.00919]
Steps:  62%|██████▎   | 625/1000 [10:33<06:19,  1.01s/it, lr=3.09e-5, step_loss=0.00569]
Steps:  62%|██████▎   | 625/1000 [10:34<06:19,  1.01s/it, lr=3.09e-5, step_loss=0.0435] 
Steps:  62%|██████▎   | 625/1000 [10:34<06:19,  1.01s/it, lr=3.09e-5, step_loss=0.0538]
Steps:  62%|██████▎   | 625/1000 [10:34<06:19,  1.01s/it, lr=3.09e-5, step_loss=0.0158]
Steps:  63%|██████▎   | 626/1000 [10:34<06:14,  1.00s/it, lr=3.09e-5, step_loss=0.0158]
Steps:  63%|██████▎   | 626/1000 [10:34<06:14,  1.00s/it, lr=3.07e-5, step_loss=0.272] 
Steps:  63%|██████▎   | 626/1000 [10:35<06:14,  1.00s/it, lr=3.07e-5, step_loss=0.0908]
Steps:  63%|██████▎   | 626/1000 [10:35<06:14,  1.00s/it, lr=3.07e-5, step_loss=0.0428]
Steps:  63%|██████▎   | 626/1000 [10:35<06:14,  1.00s/it, lr=3.07e-5, step_loss=0.0122]
Steps:  63%|██████▎   | 627/1000 [10:35<06:10,  1.01it/s, lr=3.07e-5, step_loss=0.0122]
Steps:  63%|██████▎   | 627/1000 [10:35<06:10,  1.01it/s, lr=3.06e-5, step_loss=0.152] 
Steps:  63%|██████▎   | 627/1000 [10:36<06:10,  1.01it/s, lr=3.06e-5, step_loss=0.198]
Steps:  63%|██████▎   | 627/1000 [10:36<06:10,  1.01it/s, lr=3.06e-5, step_loss=0.00371]
Steps:  63%|██████▎   | 627/1000 [10:36<06:10,  1.01it/s, lr=3.06e-5, step_loss=0.13]   
Steps:  63%|██████▎   | 628/1000 [10:36<06:07,  1.01it/s, lr=3.06e-5, step_loss=0.13]
Steps:  63%|██████▎   | 628/1000 [10:36<06:07,  1.01it/s, lr=3.04e-5, step_loss=0.0453]
Steps:  63%|██████▎   | 628/1000 [10:36<06:07,  1.01it/s, lr=3.04e-5, step_loss=0.0581]
Steps:  63%|██████▎   | 628/1000 [10:37<06:07,  1.01it/s, lr=3.04e-5, step_loss=0.00987]
Steps:  63%|██████▎   | 628/1000 [10:37<06:07,  1.01it/s, lr=3.04e-5, step_loss=0.181]  
Steps:  63%|██████▎   | 629/1000 [10:37<06:04,  1.02it/s, lr=3.04e-5, step_loss=0.181]
Steps:  63%|██████▎   | 629/1000 [10:37<06:04,  1.02it/s, lr=3.03e-5, step_loss=0.0493]
Steps:  63%|██████▎   | 629/1000 [10:37<06:04,  1.02it/s, lr=3.03e-5, step_loss=0.0457]
Steps:  63%|██████▎   | 629/1000 [10:38<06:04,  1.02it/s, lr=3.03e-5, step_loss=0.00316]
Steps:  63%|██████▎   | 629/1000 [10:38<06:04,  1.02it/s, lr=3.03e-5, step_loss=0.134]  
Steps:  63%|██████▎   | 630/1000 [10:38<06:02,  1.02it/s, lr=3.03e-5, step_loss=0.134]
Steps:  63%|██████▎   | 630/1000 [10:38<06:02,  1.02it/s, lr=3.01e-5, step_loss=0.207]
Steps:  63%|██████▎   | 630/1000 [10:38<06:02,  1.02it/s, lr=3.01e-5, step_loss=0.0176]
Steps:  63%|██████▎   | 630/1000 [10:39<06:02,  1.02it/s, lr=3.01e-5, step_loss=0.45]  
Steps:  63%|██████▎   | 630/1000 [10:39<06:02,  1.02it/s, lr=3.01e-5, step_loss=0.0224]
Steps:  63%|██████▎   | 631/1000 [10:39<06:00,  1.02it/s, lr=3.01e-5, step_loss=0.0224]
Steps:  63%|██████▎   | 631/1000 [10:39<06:00,  1.02it/s, lr=3e-5, step_loss=0.139]    
Steps:  63%|██████▎   | 631/1000 [10:39<06:00,  1.02it/s, lr=3e-5, step_loss=0.0413]
Steps:  63%|██████▎   | 631/1000 [10:40<06:00,  1.02it/s, lr=3e-5, step_loss=0.192] 
Steps:  63%|██████▎   | 631/1000 [10:40<06:00,  1.02it/s, lr=3e-5, step_loss=0.164]
Steps:  63%|██████▎   | 632/1000 [10:40<05:59,  1.02it/s, lr=3e-5, step_loss=0.164]
Steps:  63%|██████▎   | 632/1000 [10:40<05:59,  1.02it/s, lr=2.99e-5, step_loss=0.216]
Steps:  63%|██████▎   | 632/1000 [10:40<05:59,  1.02it/s, lr=2.99e-5, step_loss=0.00257]
Steps:  63%|██████▎   | 632/1000 [10:41<05:59,  1.02it/s, lr=2.99e-5, step_loss=0.038]  
Steps:  63%|██████▎   | 632/1000 [10:41<05:59,  1.02it/s, lr=2.99e-5, step_loss=0.0871]
Steps:  63%|██████▎   | 633/1000 [10:41<05:58,  1.02it/s, lr=2.99e-5, step_loss=0.0871]
Steps:  63%|██████▎   | 633/1000 [10:41<05:58,  1.02it/s, lr=2.97e-5, step_loss=0.316] 
Steps:  63%|██████▎   | 633/1000 [10:41<05:58,  1.02it/s, lr=2.97e-5, step_loss=0.0118]
Steps:  63%|██████▎   | 633/1000 [10:42<05:58,  1.02it/s, lr=2.97e-5, step_loss=0.0477]
Steps:  63%|██████▎   | 633/1000 [10:42<05:58,  1.02it/s, lr=2.97e-5, step_loss=0.305] 
Steps:  63%|██████▎   | 634/1000 [10:42<05:57,  1.02it/s, lr=2.97e-5, step_loss=0.305]
Steps:  63%|██████▎   | 634/1000 [10:42<05:57,  1.02it/s, lr=2.96e-5, step_loss=0.14] 
Steps:  63%|██████▎   | 634/1000 [10:42<05:57,  1.02it/s, lr=2.96e-5, step_loss=0.0685]
Steps:  63%|██████▎   | 634/1000 [10:43<05:57,  1.02it/s, lr=2.96e-5, step_loss=0.0563]
Steps:  63%|██████▎   | 634/1000 [10:43<05:57,  1.02it/s, lr=2.96e-5, step_loss=0.00855]
Steps:  64%|██████▎   | 635/1000 [10:43<05:55,  1.03it/s, lr=2.96e-5, step_loss=0.00855]
Steps:  64%|██████▎   | 635/1000 [10:43<05:55,  1.03it/s, lr=2.94e-5, step_loss=0.295]  
Steps:  64%|██████▎   | 635/1000 [10:43<05:55,  1.03it/s, lr=2.94e-5, step_loss=0.0125]
Steps:  64%|██████▎   | 635/1000 [10:44<05:55,  1.03it/s, lr=2.94e-5, step_loss=0.147] 
Steps:  64%|██████▎   | 635/1000 [10:44<05:55,  1.03it/s, lr=2.94e-5, step_loss=0.0628]
Steps:  64%|██████▎   | 636/1000 [10:44<05:54,  1.03it/s, lr=2.94e-5, step_loss=0.0628]
Steps:  64%|██████▎   | 636/1000 [10:44<05:54,  1.03it/s, lr=2.93e-5, step_loss=0.0258]
Steps:  64%|██████▎   | 636/1000 [10:44<05:54,  1.03it/s, lr=2.93e-5, step_loss=0.0969]
Steps:  64%|██████▎   | 636/1000 [10:45<05:54,  1.03it/s, lr=2.93e-5, step_loss=0.0528]
Steps:  64%|██████▎   | 636/1000 [10:45<05:54,  1.03it/s, lr=2.93e-5, step_loss=0.0287]
Steps:  64%|██████▎   | 637/1000 [10:45<05:53,  1.03it/s, lr=2.93e-5, step_loss=0.0287]
Steps:  64%|██████▎   | 637/1000 [10:45<05:53,  1.03it/s, lr=2.91e-5, step_loss=0.473] 
Steps:  64%|██████▎   | 637/1000 [10:45<05:53,  1.03it/s, lr=2.91e-5, step_loss=0.037]
Steps:  64%|██████▎   | 637/1000 [10:45<05:53,  1.03it/s, lr=2.91e-5, step_loss=0.108]
Steps:  64%|██████▎   | 637/1000 [10:46<05:53,  1.03it/s, lr=2.91e-5, step_loss=0.00994]
Steps:  64%|██████▍   | 638/1000 [10:46<05:52,  1.03it/s, lr=2.91e-5, step_loss=0.00994]
Steps:  64%|██████▍   | 638/1000 [10:46<05:52,  1.03it/s, lr=2.9e-5, step_loss=0.00844] 
Steps:  64%|██████▍   | 638/1000 [10:46<05:52,  1.03it/s, lr=2.9e-5, step_loss=0.0197] 
Steps:  64%|██████▍   | 638/1000 [10:46<05:52,  1.03it/s, lr=2.9e-5, step_loss=0.201] 
Steps:  64%|██████▍   | 638/1000 [10:47<05:52,  1.03it/s, lr=2.9e-5, step_loss=0.158]
Steps:  64%|██████▍   | 639/1000 [10:47<05:51,  1.03it/s, lr=2.9e-5, step_loss=0.158]
Steps:  64%|██████▍   | 639/1000 [10:47<05:51,  1.03it/s, lr=2.89e-5, step_loss=0.0189]
Steps:  64%|██████▍   | 639/1000 [10:47<05:51,  1.03it/s, lr=2.89e-5, step_loss=0.0271]
Steps:  64%|██████▍   | 639/1000 [10:47<05:51,  1.03it/s, lr=2.89e-5, step_loss=0.0834]
Steps:  64%|██████▍   | 639/1000 [10:48<05:51,  1.03it/s, lr=2.89e-5, step_loss=0.432] 
Steps:  64%|██████▍   | 640/1000 [10:48<05:50,  1.03it/s, lr=2.89e-5, step_loss=0.432]
Steps:  64%|██████▍   | 640/1000 [10:48<05:50,  1.03it/s, lr=2.87e-5, step_loss=0.363]
Steps:  64%|██████▍   | 640/1000 [10:48<05:50,  1.03it/s, lr=2.87e-5, step_loss=0.0207]
Steps:  64%|██████▍   | 640/1000 [10:48<05:50,  1.03it/s, lr=2.87e-5, step_loss=0.0688]
Steps:  64%|██████▍   | 640/1000 [10:49<05:50,  1.03it/s, lr=2.87e-5, step_loss=0.0657]
Steps:  64%|██████▍   | 641/1000 [10:49<05:49,  1.03it/s, lr=2.87e-5, step_loss=0.0657]
Steps:  64%|██████▍   | 641/1000 [10:49<05:49,  1.03it/s, lr=2.86e-5, step_loss=0.0347]
Steps:  64%|██████▍   | 641/1000 [10:49<05:49,  1.03it/s, lr=2.86e-5, step_loss=0.0816]
Steps:  64%|██████▍   | 641/1000 [10:49<05:49,  1.03it/s, lr=2.86e-5, step_loss=0.00267]
Steps:  64%|██████▍   | 641/1000 [10:50<05:49,  1.03it/s, lr=2.86e-5, step_loss=0.198]  
Steps:  64%|██████▍   | 642/1000 [10:50<05:48,  1.03it/s, lr=2.86e-5, step_loss=0.198]
Steps:  64%|██████▍   | 642/1000 [10:50<05:48,  1.03it/s, lr=2.84e-5, step_loss=0.0259]
Steps:  64%|██████▍   | 642/1000 [10:50<05:48,  1.03it/s, lr=2.84e-5, step_loss=0.00302]
Steps:  64%|██████▍   | 642/1000 [10:50<05:48,  1.03it/s, lr=2.84e-5, step_loss=0.0356] 
Steps:  64%|██████▍   | 642/1000 [10:51<05:48,  1.03it/s, lr=2.84e-5, step_loss=0.297] 
Steps:  64%|██████▍   | 643/1000 [10:51<05:47,  1.03it/s, lr=2.84e-5, step_loss=0.297]
Steps:  64%|██████▍   | 643/1000 [10:51<05:47,  1.03it/s, lr=2.83e-5, step_loss=0.0686]
Steps:  64%|██████▍   | 643/1000 [10:51<05:47,  1.03it/s, lr=2.83e-5, step_loss=0.0533]
Steps:  64%|██████▍   | 643/1000 [10:51<05:47,  1.03it/s, lr=2.83e-5, step_loss=0.124] 
Steps:  64%|██████▍   | 643/1000 [10:52<05:47,  1.03it/s, lr=2.83e-5, step_loss=0.436]
Steps:  64%|██████▍   | 644/1000 [10:52<05:46,  1.03it/s, lr=2.83e-5, step_loss=0.436]
Steps:  64%|██████▍   | 644/1000 [10:52<05:46,  1.03it/s, lr=2.81e-5, step_loss=0.118]
Steps:  64%|██████▍   | 644/1000 [10:52<05:46,  1.03it/s, lr=2.81e-5, step_loss=0.0869]
Steps:  64%|██████▍   | 644/1000 [10:52<05:46,  1.03it/s, lr=2.81e-5, step_loss=0.0491]
Steps:  64%|██████▍   | 644/1000 [10:53<05:46,  1.03it/s, lr=2.81e-5, step_loss=0.211] 
Steps:  64%|██████▍   | 645/1000 [10:53<05:45,  1.03it/s, lr=2.81e-5, step_loss=0.211]
Steps:  64%|██████▍   | 645/1000 [10:53<05:45,  1.03it/s, lr=2.8e-5, step_loss=0.0654]
Steps:  64%|██████▍   | 645/1000 [10:53<05:45,  1.03it/s, lr=2.8e-5, step_loss=0.0256]
Steps:  64%|██████▍   | 645/1000 [10:53<05:45,  1.03it/s, lr=2.8e-5, step_loss=0.0428]
Steps:  64%|██████▍   | 645/1000 [10:54<05:45,  1.03it/s, lr=2.8e-5, step_loss=0.107] 
Steps:  65%|██████▍   | 646/1000 [10:54<05:44,  1.03it/s, lr=2.8e-5, step_loss=0.107]
Steps:  65%|██████▍   | 646/1000 [10:54<05:44,  1.03it/s, lr=2.79e-5, step_loss=0.0131]
Steps:  65%|██████▍   | 646/1000 [10:54<05:44,  1.03it/s, lr=2.79e-5, step_loss=0.0347]
Steps:  65%|██████▍   | 646/1000 [10:54<05:44,  1.03it/s, lr=2.79e-5, step_loss=0.111] 
Steps:  65%|██████▍   | 646/1000 [10:54<05:44,  1.03it/s, lr=2.79e-5, step_loss=0.0502]
Steps:  65%|██████▍   | 647/1000 [10:55<05:43,  1.03it/s, lr=2.79e-5, step_loss=0.0502]
Steps:  65%|██████▍   | 647/1000 [10:55<05:43,  1.03it/s, lr=2.77e-5, step_loss=0.314] 
Steps:  65%|██████▍   | 647/1000 [10:55<05:43,  1.03it/s, lr=2.77e-5, step_loss=0.00477]
Steps:  65%|██████▍   | 647/1000 [10:55<05:43,  1.03it/s, lr=2.77e-5, step_loss=0.0176] 
Steps:  65%|██████▍   | 647/1000 [10:55<05:43,  1.03it/s, lr=2.77e-5, step_loss=0.097] 
Steps:  65%|██████▍   | 648/1000 [10:56<05:42,  1.03it/s, lr=2.77e-5, step_loss=0.097]
Steps:  65%|██████▍   | 648/1000 [10:56<05:42,  1.03it/s, lr=2.76e-5, step_loss=0.0296]
Steps:  65%|██████▍   | 648/1000 [10:56<05:42,  1.03it/s, lr=2.76e-5, step_loss=0.0698]
Steps:  65%|██████▍   | 648/1000 [10:56<05:42,  1.03it/s, lr=2.76e-5, step_loss=0.00596]
Steps:  65%|██████▍   | 648/1000 [10:56<05:42,  1.03it/s, lr=2.76e-5, step_loss=0.0603] 
Steps:  65%|██████▍   | 649/1000 [10:57<05:42,  1.03it/s, lr=2.76e-5, step_loss=0.0603]
Steps:  65%|██████▍   | 649/1000 [10:57<05:42,  1.03it/s, lr=2.74e-5, step_loss=0.017] 
Steps:  65%|██████▍   | 649/1000 [10:57<05:42,  1.03it/s, lr=2.74e-5, step_loss=0.0119]
Steps:  65%|██████▍   | 649/1000 [10:57<05:42,  1.03it/s, lr=2.74e-5, step_loss=0.005] 
Steps:  65%|██████▍   | 649/1000 [10:57<05:42,  1.03it/s, lr=2.74e-5, step_loss=0.0581]
Steps:  65%|██████▌   | 650/1000 [10:58<05:41,  1.03it/s, lr=2.74e-5, step_loss=0.0581]
Steps:  65%|██████▌   | 650/1000 [10:58<05:41,  1.03it/s, lr=2.73e-5, step_loss=0.00605]
Steps:  65%|██████▌   | 650/1000 [10:58<05:41,  1.03it/s, lr=2.73e-5, step_loss=0.105]  
Steps:  65%|██████▌   | 650/1000 [10:58<05:41,  1.03it/s, lr=2.73e-5, step_loss=0.0224]
Steps:  65%|██████▌   | 650/1000 [10:58<05:41,  1.03it/s, lr=2.73e-5, step_loss=0.0527]
Steps:  65%|██████▌   | 651/1000 [10:59<05:40,  1.03it/s, lr=2.73e-5, step_loss=0.0527]
Steps:  65%|██████▌   | 651/1000 [10:59<05:40,  1.03it/s, lr=2.72e-5, step_loss=0.00288]
Steps:  65%|██████▌   | 651/1000 [10:59<05:40,  1.03it/s, lr=2.72e-5, step_loss=0.245]  
Steps:  65%|██████▌   | 651/1000 [10:59<05:40,  1.03it/s, lr=2.72e-5, step_loss=0.0114]
Steps:  65%|██████▌   | 651/1000 [10:59<05:40,  1.03it/s, lr=2.72e-5, step_loss=0.0479]
Steps:  65%|██████▌   | 652/1000 [11:00<05:39,  1.03it/s, lr=2.72e-5, step_loss=0.0479]
Steps:  65%|██████▌   | 652/1000 [11:00<05:39,  1.03it/s, lr=2.7e-5, step_loss=0.175]  
Steps:  65%|██████▌   | 652/1000 [11:00<05:39,  1.03it/s, lr=2.7e-5, step_loss=0.0082]
Steps:  65%|██████▌   | 652/1000 [11:00<05:39,  1.03it/s, lr=2.7e-5, step_loss=0.255] 
Steps:  65%|██████▌   | 652/1000 [11:00<05:39,  1.03it/s, lr=2.7e-5, step_loss=0.0141]
Steps:  65%|██████▌   | 653/1000 [11:01<05:37,  1.03it/s, lr=2.7e-5, step_loss=0.0141]
Steps:  65%|██████▌   | 653/1000 [11:01<05:37,  1.03it/s, lr=2.69e-5, step_loss=0.0546]
Steps:  65%|██████▌   | 653/1000 [11:01<05:37,  1.03it/s, lr=2.69e-5, step_loss=0.00472]
Steps:  65%|██████▌   | 653/1000 [11:01<05:37,  1.03it/s, lr=2.69e-5, step_loss=0.0179] 
Steps:  65%|██████▌   | 653/1000 [11:01<05:37,  1.03it/s, lr=2.69e-5, step_loss=0.125] 
Steps:  65%|██████▌   | 654/1000 [11:02<05:37,  1.03it/s, lr=2.69e-5, step_loss=0.125]
Steps:  65%|██████▌   | 654/1000 [11:02<05:37,  1.03it/s, lr=2.67e-5, step_loss=0.19] 
Steps:  65%|██████▌   | 654/1000 [11:02<05:37,  1.03it/s, lr=2.67e-5, step_loss=0.0475]
Steps:  65%|██████▌   | 654/1000 [11:02<05:37,  1.03it/s, lr=2.67e-5, step_loss=0.0437]
Steps:  65%|██████▌   | 654/1000 [11:02<05:37,  1.03it/s, lr=2.67e-5, step_loss=0.0498]
Steps:  66%|██████▌   | 655/1000 [11:02<05:36,  1.03it/s, lr=2.67e-5, step_loss=0.0498]
Steps:  66%|██████▌   | 655/1000 [11:03<05:36,  1.03it/s, lr=2.66e-5, step_loss=0.067] 
Steps:  66%|██████▌   | 655/1000 [11:03<05:36,  1.03it/s, lr=2.66e-5, step_loss=0.555]
Steps:  66%|██████▌   | 655/1000 [11:03<05:36,  1.03it/s, lr=2.66e-5, step_loss=0.0499]
Steps:  66%|██████▌   | 655/1000 [11:03<05:36,  1.03it/s, lr=2.66e-5, step_loss=0.0121]
Steps:  66%|██████▌   | 656/1000 [11:03<05:34,  1.03it/s, lr=2.66e-5, step_loss=0.0121]
Steps:  66%|██████▌   | 656/1000 [11:04<05:34,  1.03it/s, lr=2.65e-5, step_loss=0.201] 
Steps:  66%|██████▌   | 656/1000 [11:04<05:34,  1.03it/s, lr=2.65e-5, step_loss=0.27] 
Steps:  66%|██████▌   | 656/1000 [11:04<05:34,  1.03it/s, lr=2.65e-5, step_loss=0.0717]
Steps:  66%|██████▌   | 656/1000 [11:04<05:34,  1.03it/s, lr=2.65e-5, step_loss=0.0028]
Steps:  66%|██████▌   | 657/1000 [11:04<05:34,  1.03it/s, lr=2.65e-5, step_loss=0.0028]
Steps:  66%|██████▌   | 657/1000 [11:04<05:34,  1.03it/s, lr=2.63e-5, step_loss=0.0169]
Steps:  66%|██████▌   | 657/1000 [11:05<05:34,  1.03it/s, lr=2.63e-5, step_loss=0.19]  
Steps:  66%|██████▌   | 657/1000 [11:05<05:34,  1.03it/s, lr=2.63e-5, step_loss=0.0556]
Steps:  66%|██████▌   | 657/1000 [11:05<05:34,  1.03it/s, lr=2.63e-5, step_loss=0.00565]
Steps:  66%|██████▌   | 658/1000 [11:05<05:33,  1.03it/s, lr=2.63e-5, step_loss=0.00565]
Steps:  66%|██████▌   | 658/1000 [11:05<05:33,  1.03it/s, lr=2.62e-5, step_loss=0.035]  
Steps:  66%|██████▌   | 658/1000 [11:06<05:33,  1.03it/s, lr=2.62e-5, step_loss=0.0369]
Steps:  66%|██████▌   | 658/1000 [11:06<05:33,  1.03it/s, lr=2.62e-5, step_loss=0.158] 
Steps:  66%|██████▌   | 658/1000 [11:06<05:33,  1.03it/s, lr=2.62e-5, step_loss=0.00834]
Steps:  66%|██████▌   | 659/1000 [11:06<05:32,  1.03it/s, lr=2.62e-5, step_loss=0.00834]
Steps:  66%|██████▌   | 659/1000 [11:06<05:32,  1.03it/s, lr=2.61e-5, step_loss=0.0194] 
Steps:  66%|██████▌   | 659/1000 [11:07<05:32,  1.03it/s, lr=2.61e-5, step_loss=0.261] 
Steps:  66%|██████▌   | 659/1000 [11:07<05:32,  1.03it/s, lr=2.61e-5, step_loss=0.0144]
Steps:  66%|██████▌   | 659/1000 [11:07<05:32,  1.03it/s, lr=2.61e-5, step_loss=0.0493]
Steps:  66%|██████▌   | 660/1000 [11:07<05:31,  1.03it/s, lr=2.61e-5, step_loss=0.0493]
Steps:  66%|██████▌   | 660/1000 [11:07<05:31,  1.03it/s, lr=2.59e-5, step_loss=0.035] 
Steps:  66%|██████▌   | 660/1000 [11:08<05:31,  1.03it/s, lr=2.59e-5, step_loss=0.0783]
Steps:  66%|██████▌   | 660/1000 [11:08<05:31,  1.03it/s, lr=2.59e-5, step_loss=0.113] 
Steps:  66%|██████▌   | 660/1000 [11:08<05:31,  1.03it/s, lr=2.59e-5, step_loss=0.0316]
Steps:  66%|██████▌   | 661/1000 [11:08<05:30,  1.03it/s, lr=2.59e-5, step_loss=0.0316]
Steps:  66%|██████▌   | 661/1000 [11:08<05:30,  1.03it/s, lr=2.58e-5, step_loss=0.0586]
Steps:  66%|██████▌   | 661/1000 [11:09<05:30,  1.03it/s, lr=2.58e-5, step_loss=0.0338]
Steps:  66%|██████▌   | 661/1000 [11:09<05:30,  1.03it/s, lr=2.58e-5, step_loss=0.492] 
Steps:  66%|██████▌   | 661/1000 [11:09<05:30,  1.03it/s, lr=2.58e-5, step_loss=0.0426]
Steps:  66%|██████▌   | 662/1000 [11:09<05:30,  1.02it/s, lr=2.58e-5, step_loss=0.0426]
Steps:  66%|██████▌   | 662/1000 [11:09<05:30,  1.02it/s, lr=2.56e-5, step_loss=0.00339]
Steps:  66%|██████▌   | 662/1000 [11:10<05:30,  1.02it/s, lr=2.56e-5, step_loss=0.00793]
Steps:  66%|██████▌   | 662/1000 [11:10<05:30,  1.02it/s, lr=2.56e-5, step_loss=0.144]  
Steps:  66%|██████▌   | 662/1000 [11:10<05:30,  1.02it/s, lr=2.56e-5, step_loss=0.131]
Steps:  66%|██████▋   | 663/1000 [11:10<05:28,  1.02it/s, lr=2.56e-5, step_loss=0.131]
Steps:  66%|██████▋   | 663/1000 [11:10<05:28,  1.02it/s, lr=2.55e-5, step_loss=0.0801]
Steps:  66%|██████▋   | 663/1000 [11:11<05:28,  1.02it/s, lr=2.55e-5, step_loss=0.00697]
Steps:  66%|██████▋   | 663/1000 [11:11<05:28,  1.02it/s, lr=2.55e-5, step_loss=0.0201] 
Steps:  66%|██████▋   | 663/1000 [11:11<05:28,  1.02it/s, lr=2.55e-5, step_loss=0.0902]
Steps:  66%|██████▋   | 664/1000 [11:11<05:27,  1.03it/s, lr=2.55e-5, step_loss=0.0902]
Steps:  66%|██████▋   | 664/1000 [11:11<05:27,  1.03it/s, lr=2.54e-5, step_loss=0.0493]
Steps:  66%|██████▋   | 664/1000 [11:12<05:27,  1.03it/s, lr=2.54e-5, step_loss=0.0688]
Steps:  66%|██████▋   | 664/1000 [11:12<05:27,  1.03it/s, lr=2.54e-5, step_loss=0.109] 
Steps:  66%|██████▋   | 664/1000 [11:12<05:27,  1.03it/s, lr=2.54e-5, step_loss=0.0444]
Steps:  66%|██████▋   | 665/1000 [11:12<05:26,  1.03it/s, lr=2.54e-5, step_loss=0.0444]
Steps:  66%|██████▋   | 665/1000 [11:12<05:26,  1.03it/s, lr=2.52e-5, step_loss=0.421] 
Steps:  66%|██████▋   | 665/1000 [11:13<05:26,  1.03it/s, lr=2.52e-5, step_loss=0.189]
Steps:  66%|██████▋   | 665/1000 [11:13<05:26,  1.03it/s, lr=2.52e-5, step_loss=0.00328]
Steps:  66%|██████▋   | 665/1000 [11:13<05:26,  1.03it/s, lr=2.52e-5, step_loss=0.0104] 
Steps:  67%|██████▋   | 666/1000 [11:13<05:25,  1.03it/s, lr=2.52e-5, step_loss=0.0104]
Steps:  67%|██████▋   | 666/1000 [11:13<05:25,  1.03it/s, lr=2.51e-5, step_loss=0.081] 
Steps:  67%|██████▋   | 666/1000 [11:13<05:25,  1.03it/s, lr=2.51e-5, step_loss=0.157]
Steps:  67%|██████▋   | 666/1000 [11:14<05:25,  1.03it/s, lr=2.51e-5, step_loss=0.047]
Steps:  67%|██████▋   | 666/1000 [11:14<05:25,  1.03it/s, lr=2.51e-5, step_loss=0.00455]
Steps:  67%|██████▋   | 667/1000 [11:14<05:24,  1.03it/s, lr=2.51e-5, step_loss=0.00455]
Steps:  67%|██████▋   | 667/1000 [11:14<05:24,  1.03it/s, lr=2.5e-5, step_loss=0.0212]  
Steps:  67%|██████▋   | 667/1000 [11:14<05:24,  1.03it/s, lr=2.5e-5, step_loss=0.0738]
Steps:  67%|██████▋   | 667/1000 [11:15<05:24,  1.03it/s, lr=2.5e-5, step_loss=0.0188]
Steps:  67%|██████▋   | 667/1000 [11:15<05:24,  1.03it/s, lr=2.5e-5, step_loss=0.0619]
Steps:  67%|██████▋   | 668/1000 [11:15<05:23,  1.03it/s, lr=2.5e-5, step_loss=0.0619]
Steps:  67%|██████▋   | 668/1000 [11:15<05:23,  1.03it/s, lr=2.48e-5, step_loss=0.0465]
Steps:  67%|██████▋   | 668/1000 [11:15<05:23,  1.03it/s, lr=2.48e-5, step_loss=0.00278]
Steps:  67%|██████▋   | 668/1000 [11:16<05:23,  1.03it/s, lr=2.48e-5, step_loss=0.0492] 
Steps:  67%|██████▋   | 668/1000 [11:16<05:23,  1.03it/s, lr=2.48e-5, step_loss=0.0145]
Steps:  67%|██████▋   | 669/1000 [11:16<05:22,  1.03it/s, lr=2.48e-5, step_loss=0.0145]
Steps:  67%|██████▋   | 669/1000 [11:16<05:22,  1.03it/s, lr=2.47e-5, step_loss=0.0511]
Steps:  67%|██████▋   | 669/1000 [11:16<05:22,  1.03it/s, lr=2.47e-5, step_loss=0.0184]
Steps:  67%|██████▋   | 669/1000 [11:17<05:22,  1.03it/s, lr=2.47e-5, step_loss=0.0223]
Steps:  67%|██████▋   | 669/1000 [11:17<05:22,  1.03it/s, lr=2.47e-5, step_loss=0.192] 
Steps:  67%|██████▋   | 670/1000 [11:17<05:21,  1.03it/s, lr=2.47e-5, step_loss=0.192]
Steps:  67%|██████▋   | 670/1000 [11:17<05:21,  1.03it/s, lr=2.45e-5, step_loss=0.0698]
Steps:  67%|██████▋   | 670/1000 [11:17<05:21,  1.03it/s, lr=2.45e-5, step_loss=0.00357]
Steps:  67%|██████▋   | 670/1000 [11:18<05:21,  1.03it/s, lr=2.45e-5, step_loss=0.0487] 
Steps:  67%|██████▋   | 670/1000 [11:18<05:21,  1.03it/s, lr=2.45e-5, step_loss=0.00209]
Steps:  67%|██████▋   | 671/1000 [11:18<05:20,  1.03it/s, lr=2.45e-5, step_loss=0.00209]
Steps:  67%|██████▋   | 671/1000 [11:18<05:20,  1.03it/s, lr=2.44e-5, step_loss=0.0406] 
Steps:  67%|██████▋   | 671/1000 [11:18<05:20,  1.03it/s, lr=2.44e-5, step_loss=0.0332]
Steps:  67%|██████▋   | 671/1000 [11:19<05:20,  1.03it/s, lr=2.44e-5, step_loss=0.0389]
Steps:  67%|██████▋   | 671/1000 [11:19<05:20,  1.03it/s, lr=2.44e-5, step_loss=0.00704]
Steps:  67%|██████▋   | 672/1000 [11:19<05:19,  1.03it/s, lr=2.44e-5, step_loss=0.00704]
Steps:  67%|██████▋   | 672/1000 [11:19<05:19,  1.03it/s, lr=2.43e-5, step_loss=0.00475]
Steps:  67%|██████▋   | 672/1000 [11:19<05:19,  1.03it/s, lr=2.43e-5, step_loss=0.00585]
Steps:  67%|██████▋   | 672/1000 [11:20<05:19,  1.03it/s, lr=2.43e-5, step_loss=0.298]  
Steps:  67%|██████▋   | 672/1000 [11:20<05:19,  1.03it/s, lr=2.43e-5, step_loss=0.00848]
Steps:  67%|██████▋   | 673/1000 [11:20<05:18,  1.03it/s, lr=2.43e-5, step_loss=0.00848]
Steps:  67%|██████▋   | 673/1000 [11:20<05:18,  1.03it/s, lr=2.41e-5, step_loss=0.0415] 
Steps:  67%|██████▋   | 673/1000 [11:20<05:18,  1.03it/s, lr=2.41e-5, step_loss=0.00704]
Steps:  67%|██████▋   | 673/1000 [11:21<05:18,  1.03it/s, lr=2.41e-5, step_loss=0.131]  
Steps:  67%|██████▋   | 673/1000 [11:21<05:18,  1.03it/s, lr=2.41e-5, step_loss=0.0587]
Steps:  67%|██████▋   | 674/1000 [11:21<05:17,  1.03it/s, lr=2.41e-5, step_loss=0.0587]
Steps:  67%|██████▋   | 674/1000 [11:21<05:17,  1.03it/s, lr=2.4e-5, step_loss=0.101]  
Steps:  67%|██████▋   | 674/1000 [11:21<05:17,  1.03it/s, lr=2.4e-5, step_loss=0.0946]
Steps:  67%|██████▋   | 674/1000 [11:22<05:17,  1.03it/s, lr=2.4e-5, step_loss=0.0993]
Steps:  67%|██████▋   | 674/1000 [11:22<05:17,  1.03it/s, lr=2.4e-5, step_loss=0.0478]
Steps:  68%|██████▊   | 675/1000 [11:22<05:16,  1.03it/s, lr=2.4e-5, step_loss=0.0478]
Steps:  68%|██████▊   | 675/1000 [11:22<05:16,  1.03it/s, lr=2.39e-5, step_loss=0.00537]
Steps:  68%|██████▊   | 675/1000 [11:22<05:16,  1.03it/s, lr=2.39e-5, step_loss=0.00211]
Steps:  68%|██████▊   | 675/1000 [11:23<05:16,  1.03it/s, lr=2.39e-5, step_loss=0.00651]
Steps:  68%|██████▊   | 675/1000 [11:23<05:16,  1.03it/s, lr=2.39e-5, step_loss=0.328]  
Steps:  68%|██████▊   | 676/1000 [11:23<05:15,  1.03it/s, lr=2.39e-5, step_loss=0.328]
Steps:  68%|██████▊   | 676/1000 [11:23<05:15,  1.03it/s, lr=2.37e-5, step_loss=0.192]
Steps:  68%|██████▊   | 676/1000 [11:23<05:15,  1.03it/s, lr=2.37e-5, step_loss=0.079]
Steps:  68%|██████▊   | 676/1000 [11:23<05:15,  1.03it/s, lr=2.37e-5, step_loss=0.00924]
Steps:  68%|██████▊   | 676/1000 [11:24<05:15,  1.03it/s, lr=2.37e-5, step_loss=0.0333] 
Steps:  68%|██████▊   | 677/1000 [11:24<05:14,  1.03it/s, lr=2.37e-5, step_loss=0.0333]
Steps:  68%|██████▊   | 677/1000 [11:24<05:14,  1.03it/s, lr=2.36e-5, step_loss=0.0673]
Steps:  68%|██████▊   | 677/1000 [11:24<05:14,  1.03it/s, lr=2.36e-5, step_loss=0.00286]
Steps:  68%|██████▊   | 677/1000 [11:24<05:14,  1.03it/s, lr=2.36e-5, step_loss=0.0277] 
Steps:  68%|██████▊   | 677/1000 [11:25<05:14,  1.03it/s, lr=2.36e-5, step_loss=0.111] 
Steps:  68%|██████▊   | 678/1000 [11:25<05:13,  1.03it/s, lr=2.36e-5, step_loss=0.111]
Steps:  68%|██████▊   | 678/1000 [11:25<05:13,  1.03it/s, lr=2.35e-5, step_loss=0.0128]
Steps:  68%|██████▊   | 678/1000 [11:25<05:13,  1.03it/s, lr=2.35e-5, step_loss=0.0745]
Steps:  68%|██████▊   | 678/1000 [11:25<05:13,  1.03it/s, lr=2.35e-5, step_loss=0.168] 
Steps:  68%|██████▊   | 678/1000 [11:26<05:13,  1.03it/s, lr=2.35e-5, step_loss=0.0422]
Steps:  68%|██████▊   | 679/1000 [11:26<05:12,  1.03it/s, lr=2.35e-5, step_loss=0.0422]
Steps:  68%|██████▊   | 679/1000 [11:26<05:12,  1.03it/s, lr=2.33e-5, step_loss=0.0114]
Steps:  68%|██████▊   | 679/1000 [11:26<05:12,  1.03it/s, lr=2.33e-5, step_loss=0.01]  
Steps:  68%|██████▊   | 679/1000 [11:26<05:12,  1.03it/s, lr=2.33e-5, step_loss=0.00679]
Steps:  68%|██████▊   | 679/1000 [11:27<05:12,  1.03it/s, lr=2.33e-5, step_loss=0.146]  
Steps:  68%|██████▊   | 680/1000 [11:27<05:12,  1.03it/s, lr=2.33e-5, step_loss=0.146]
Steps:  68%|██████▊   | 680/1000 [11:27<05:12,  1.03it/s, lr=2.32e-5, step_loss=0.0206]
Steps:  68%|██████▊   | 680/1000 [11:27<05:12,  1.03it/s, lr=2.32e-5, step_loss=0.00838]
Steps:  68%|██████▊   | 680/1000 [11:27<05:12,  1.03it/s, lr=2.32e-5, step_loss=0.0397] 
Steps:  68%|██████▊   | 680/1000 [11:28<05:12,  1.03it/s, lr=2.32e-5, step_loss=0.205] 
Steps:  68%|██████▊   | 681/1000 [11:28<05:11,  1.03it/s, lr=2.32e-5, step_loss=0.205]
Steps:  68%|██████▊   | 681/1000 [11:28<05:11,  1.03it/s, lr=2.31e-5, step_loss=0.0541]
Steps:  68%|██████▊   | 681/1000 [11:28<05:11,  1.03it/s, lr=2.31e-5, step_loss=0.111] 
Steps:  68%|██████▊   | 681/1000 [11:28<05:11,  1.03it/s, lr=2.31e-5, step_loss=0.0101]
Steps:  68%|██████▊   | 681/1000 [11:29<05:11,  1.03it/s, lr=2.31e-5, step_loss=0.0689]
Steps:  68%|██████▊   | 682/1000 [11:29<05:10,  1.02it/s, lr=2.31e-5, step_loss=0.0689]
Steps:  68%|██████▊   | 682/1000 [11:29<05:10,  1.02it/s, lr=2.29e-5, step_loss=0.531] 
Steps:  68%|██████▊   | 682/1000 [11:29<05:10,  1.02it/s, lr=2.29e-5, step_loss=0.0367]
Steps:  68%|██████▊   | 682/1000 [11:29<05:10,  1.02it/s, lr=2.29e-5, step_loss=0.00645]
Steps:  68%|██████▊   | 682/1000 [11:30<05:10,  1.02it/s, lr=2.29e-5, step_loss=0.0577] 
Steps:  68%|██████▊   | 683/1000 [11:30<05:09,  1.02it/s, lr=2.29e-5, step_loss=0.0577]
Steps:  68%|██████▊   | 683/1000 [11:30<05:09,  1.02it/s, lr=2.28e-5, step_loss=0.177] 
Steps:  68%|██████▊   | 683/1000 [11:30<05:09,  1.02it/s, lr=2.28e-5, step_loss=0.0525]
Steps:  68%|██████▊   | 683/1000 [11:30<05:09,  1.02it/s, lr=2.28e-5, step_loss=0.0624]
Steps:  68%|██████▊   | 683/1000 [11:31<05:09,  1.02it/s, lr=2.28e-5, step_loss=0.00518]
Steps:  68%|██████▊   | 684/1000 [11:31<05:08,  1.02it/s, lr=2.28e-5, step_loss=0.00518]
Steps:  68%|██████▊   | 684/1000 [11:31<05:08,  1.02it/s, lr=2.27e-5, step_loss=0.0152] 
Steps:  68%|██████▊   | 684/1000 [11:31<05:08,  1.02it/s, lr=2.27e-5, step_loss=0.0223]
Steps:  68%|██████▊   | 684/1000 [11:31<05:08,  1.02it/s, lr=2.27e-5, step_loss=0.108] 
Steps:  68%|██████▊   | 684/1000 [11:32<05:08,  1.02it/s, lr=2.27e-5, step_loss=0.168]
Steps:  68%|██████▊   | 685/1000 [11:32<05:07,  1.02it/s, lr=2.27e-5, step_loss=0.168]
Steps:  68%|██████▊   | 685/1000 [11:32<05:07,  1.02it/s, lr=2.25e-5, step_loss=0.00607]
Steps:  68%|██████▊   | 685/1000 [11:32<05:07,  1.02it/s, lr=2.25e-5, step_loss=0.536]  
Steps:  68%|██████▊   | 685/1000 [11:32<05:07,  1.02it/s, lr=2.25e-5, step_loss=0.077]
Steps:  68%|██████▊   | 685/1000 [11:33<05:07,  1.02it/s, lr=2.25e-5, step_loss=0.00499]
Steps:  69%|██████▊   | 686/1000 [11:33<05:06,  1.02it/s, lr=2.25e-5, step_loss=0.00499]
Steps:  69%|██████▊   | 686/1000 [11:33<05:06,  1.02it/s, lr=2.24e-5, step_loss=0.261]  
Steps:  69%|██████▊   | 686/1000 [11:33<05:06,  1.02it/s, lr=2.24e-5, step_loss=0.00381]
Steps:  69%|██████▊   | 686/1000 [11:33<05:06,  1.02it/s, lr=2.24e-5, step_loss=0.0642] 
Steps:  69%|██████▊   | 686/1000 [11:33<05:06,  1.02it/s, lr=2.24e-5, step_loss=0.00413]
Steps:  69%|██████▊   | 687/1000 [11:34<05:05,  1.02it/s, lr=2.24e-5, step_loss=0.00413]
Steps:  69%|██████▊   | 687/1000 [11:34<05:05,  1.02it/s, lr=2.23e-5, step_loss=0.0662] 
Steps:  69%|██████▊   | 687/1000 [11:34<05:05,  1.02it/s, lr=2.23e-5, step_loss=0.00192]
Steps:  69%|██████▊   | 687/1000 [11:34<05:05,  1.02it/s, lr=2.23e-5, step_loss=0.103]  
Steps:  69%|██████▊   | 687/1000 [11:34<05:05,  1.02it/s, lr=2.23e-5, step_loss=0.00794]
Steps:  69%|██████▉   | 688/1000 [11:35<05:04,  1.02it/s, lr=2.23e-5, step_loss=0.00794]
Steps:  69%|██████▉   | 688/1000 [11:35<05:04,  1.02it/s, lr=2.22e-5, step_loss=0.137]  
Steps:  69%|██████▉   | 688/1000 [11:35<05:04,  1.02it/s, lr=2.22e-5, step_loss=0.051]
Steps:  69%|██████▉   | 688/1000 [11:35<05:04,  1.02it/s, lr=2.22e-5, step_loss=0.0763]
Steps:  69%|██████▉   | 688/1000 [11:35<05:04,  1.02it/s, lr=2.22e-5, step_loss=0.104] 
Steps:  69%|██████▉   | 689/1000 [11:36<05:03,  1.02it/s, lr=2.22e-5, step_loss=0.104]
Steps:  69%|██████▉   | 689/1000 [11:36<05:03,  1.02it/s, lr=2.2e-5, step_loss=0.00744]
Steps:  69%|██████▉   | 689/1000 [11:36<05:03,  1.02it/s, lr=2.2e-5, step_loss=0.00674]
Steps:  69%|██████▉   | 689/1000 [11:36<05:03,  1.02it/s, lr=2.2e-5, step_loss=0.0694] 
Steps:  69%|██████▉   | 689/1000 [11:36<05:03,  1.02it/s, lr=2.2e-5, step_loss=0.1]   
Steps:  69%|██████▉   | 690/1000 [11:37<05:02,  1.02it/s, lr=2.2e-5, step_loss=0.1]
Steps:  69%|██████▉   | 690/1000 [11:37<05:02,  1.02it/s, lr=2.19e-5, step_loss=0.142]
Steps:  69%|██████▉   | 690/1000 [11:37<05:02,  1.02it/s, lr=2.19e-5, step_loss=0.00809]
Steps:  69%|██████▉   | 690/1000 [11:37<05:02,  1.02it/s, lr=2.19e-5, step_loss=0.0268] 
Steps:  69%|██████▉   | 690/1000 [11:37<05:02,  1.02it/s, lr=2.19e-5, step_loss=0.0244]
Steps:  69%|██████▉   | 691/1000 [11:38<05:02,  1.02it/s, lr=2.19e-5, step_loss=0.0244]
Steps:  69%|██████▉   | 691/1000 [11:38<05:02,  1.02it/s, lr=2.18e-5, step_loss=0.105] 
Steps:  69%|██████▉   | 691/1000 [11:38<05:02,  1.02it/s, lr=2.18e-5, step_loss=0.156]
Steps:  69%|██████▉   | 691/1000 [11:38<05:02,  1.02it/s, lr=2.18e-5, step_loss=0.0254]
Steps:  69%|██████▉   | 691/1000 [11:38<05:02,  1.02it/s, lr=2.18e-5, step_loss=0.00257]
Steps:  69%|██████▉   | 692/1000 [11:39<05:01,  1.02it/s, lr=2.18e-5, step_loss=0.00257]
Steps:  69%|██████▉   | 692/1000 [11:39<05:01,  1.02it/s, lr=2.16e-5, step_loss=0.0377] 
Steps:  69%|██████▉   | 692/1000 [11:39<05:01,  1.02it/s, lr=2.16e-5, step_loss=0.04]  
Steps:  69%|██████▉   | 692/1000 [11:39<05:01,  1.02it/s, lr=2.16e-5, step_loss=0.00351]
Steps:  69%|██████▉   | 692/1000 [11:39<05:01,  1.02it/s, lr=2.16e-5, step_loss=0.0172] 
Steps:  69%|██████▉   | 693/1000 [11:40<05:00,  1.02it/s, lr=2.16e-5, step_loss=0.0172]
Steps:  69%|██████▉   | 693/1000 [11:40<05:00,  1.02it/s, lr=2.15e-5, step_loss=0.105] 
Steps:  69%|██████▉   | 693/1000 [11:40<05:00,  1.02it/s, lr=2.15e-5, step_loss=0.00943]
Steps:  69%|██████▉   | 693/1000 [11:40<05:00,  1.02it/s, lr=2.15e-5, step_loss=0.0634] 
Steps:  69%|██████▉   | 693/1000 [11:40<05:00,  1.02it/s, lr=2.15e-5, step_loss=0.0467]
Steps:  69%|██████▉   | 694/1000 [11:41<04:59,  1.02it/s, lr=2.15e-5, step_loss=0.0467]
Steps:  69%|██████▉   | 694/1000 [11:41<04:59,  1.02it/s, lr=2.14e-5, step_loss=0.0038]
Steps:  69%|██████▉   | 694/1000 [11:41<04:59,  1.02it/s, lr=2.14e-5, step_loss=0.0457]
Steps:  69%|██████▉   | 694/1000 [11:41<04:59,  1.02it/s, lr=2.14e-5, step_loss=0.0302]
Steps:  69%|██████▉   | 694/1000 [11:41<04:59,  1.02it/s, lr=2.14e-5, step_loss=0.0329]
Steps:  70%|██████▉   | 695/1000 [11:42<04:58,  1.02it/s, lr=2.14e-5, step_loss=0.0329]
Steps:  70%|██████▉   | 695/1000 [11:42<04:58,  1.02it/s, lr=2.12e-5, step_loss=0.00233]
Steps:  70%|██████▉   | 695/1000 [11:42<04:58,  1.02it/s, lr=2.12e-5, step_loss=0.00715]
Steps:  70%|██████▉   | 695/1000 [11:42<04:58,  1.02it/s, lr=2.12e-5, step_loss=0.0547] 
Steps:  70%|██████▉   | 695/1000 [11:42<04:58,  1.02it/s, lr=2.12e-5, step_loss=0.0104]
Steps:  70%|██████▉   | 696/1000 [11:42<04:57,  1.02it/s, lr=2.12e-5, step_loss=0.0104]
Steps:  70%|██████▉   | 696/1000 [11:43<04:57,  1.02it/s, lr=2.11e-5, step_loss=0.00434]
Steps:  70%|██████▉   | 696/1000 [11:43<04:57,  1.02it/s, lr=2.11e-5, step_loss=0.00289]
Steps:  70%|██████▉   | 696/1000 [11:43<04:57,  1.02it/s, lr=2.11e-5, step_loss=0.0186] 
Steps:  70%|██████▉   | 696/1000 [11:43<04:57,  1.02it/s, lr=2.11e-5, step_loss=0.174] 
Steps:  70%|██████▉   | 697/1000 [11:43<04:56,  1.02it/s, lr=2.11e-5, step_loss=0.174]
Steps:  70%|██████▉   | 697/1000 [11:44<04:56,  1.02it/s, lr=2.1e-5, step_loss=0.00747]
Steps:  70%|██████▉   | 697/1000 [11:44<04:56,  1.02it/s, lr=2.1e-5, step_loss=0.0344] 
Steps:  70%|██████▉   | 697/1000 [11:44<04:56,  1.02it/s, lr=2.1e-5, step_loss=0.0213]
Steps:  70%|██████▉   | 697/1000 [11:44<04:56,  1.02it/s, lr=2.1e-5, step_loss=0.066] 
Steps:  70%|██████▉   | 698/1000 [11:44<04:55,  1.02it/s, lr=2.1e-5, step_loss=0.066]
Steps:  70%|██████▉   | 698/1000 [11:44<04:55,  1.02it/s, lr=2.09e-5, step_loss=0.0412]
Steps:  70%|██████▉   | 698/1000 [11:45<04:55,  1.02it/s, lr=2.09e-5, step_loss=0.139] 
Steps:  70%|██████▉   | 698/1000 [11:45<04:55,  1.02it/s, lr=2.09e-5, step_loss=0.0934]
Steps:  70%|██████▉   | 698/1000 [11:45<04:55,  1.02it/s, lr=2.09e-5, step_loss=0.0374]
Steps:  70%|██████▉   | 699/1000 [11:45<04:54,  1.02it/s, lr=2.09e-5, step_loss=0.0374]
Steps:  70%|██████▉   | 699/1000 [11:45<04:54,  1.02it/s, lr=2.07e-5, step_loss=0.00755]
Steps:  70%|██████▉   | 699/1000 [11:46<04:54,  1.02it/s, lr=2.07e-5, step_loss=0.0441] 
Steps:  70%|██████▉   | 699/1000 [11:46<04:54,  1.02it/s, lr=2.07e-5, step_loss=0.126] 
Steps:  70%|██████▉   | 699/1000 [11:46<04:54,  1.02it/s, lr=2.07e-5, step_loss=0.169]
Steps:  70%|███████   | 700/1000 [11:46<04:53,  1.02it/s, lr=2.07e-5, step_loss=0.169]
Steps:  70%|███████   | 700/1000 [11:46<04:53,  1.02it/s, lr=2.06e-5, step_loss=0.0623]
Steps:  70%|███████   | 700/1000 [11:47<04:53,  1.02it/s, lr=2.06e-5, step_loss=0.0509]
Steps:  70%|███████   | 700/1000 [11:47<04:53,  1.02it/s, lr=2.06e-5, step_loss=0.142] 
Steps:  70%|███████   | 700/1000 [11:47<04:53,  1.02it/s, lr=2.06e-5, step_loss=0.0404]
Steps:  70%|███████   | 701/1000 [11:47<04:52,  1.02it/s, lr=2.06e-5, step_loss=0.0404]
Steps:  70%|███████   | 701/1000 [11:47<04:52,  1.02it/s, lr=2.05e-5, step_loss=0.633] 
Steps:  70%|███████   | 701/1000 [11:48<04:52,  1.02it/s, lr=2.05e-5, step_loss=0.126]
Steps:  70%|███████   | 701/1000 [11:48<04:52,  1.02it/s, lr=2.05e-5, step_loss=0.036]
Steps:  70%|███████   | 701/1000 [11:48<04:52,  1.02it/s, lr=2.05e-5, step_loss=0.069]
Steps:  70%|███████   | 702/1000 [11:48<04:51,  1.02it/s, lr=2.05e-5, step_loss=0.069]
Steps:  70%|███████   | 702/1000 [11:48<04:51,  1.02it/s, lr=2.04e-5, step_loss=0.00348]
Steps:  70%|███████   | 702/1000 [11:49<04:51,  1.02it/s, lr=2.04e-5, step_loss=0.167]  
Steps:  70%|███████   | 702/1000 [11:49<04:51,  1.02it/s, lr=2.04e-5, step_loss=0.0335]
Steps:  70%|███████   | 702/1000 [11:49<04:51,  1.02it/s, lr=2.04e-5, step_loss=0.00306]
Steps:  70%|███████   | 703/1000 [11:49<04:49,  1.02it/s, lr=2.04e-5, step_loss=0.00306]
Steps:  70%|███████   | 703/1000 [11:49<04:49,  1.02it/s, lr=2.02e-5, step_loss=0.00486]
Steps:  70%|███████   | 703/1000 [11:50<04:49,  1.02it/s, lr=2.02e-5, step_loss=0.0175] 
Steps:  70%|███████   | 703/1000 [11:50<04:49,  1.02it/s, lr=2.02e-5, step_loss=0.266] 
Steps:  70%|███████   | 703/1000 [11:50<04:49,  1.02it/s, lr=2.02e-5, step_loss=0.0885]
Steps:  70%|███████   | 704/1000 [11:50<04:48,  1.02it/s, lr=2.02e-5, step_loss=0.0885]
Steps:  70%|███████   | 704/1000 [11:50<04:48,  1.02it/s, lr=2.01e-5, step_loss=0.00979]
Steps:  70%|███████   | 704/1000 [11:51<04:48,  1.02it/s, lr=2.01e-5, step_loss=0.177]  
Steps:  70%|███████   | 704/1000 [11:51<04:48,  1.02it/s, lr=2.01e-5, step_loss=0.00555]
Steps:  70%|███████   | 704/1000 [11:51<04:48,  1.02it/s, lr=2.01e-5, step_loss=0.0127] 
Steps:  70%|███████   | 705/1000 [11:51<04:48,  1.02it/s, lr=2.01e-5, step_loss=0.0127]
Steps:  70%|███████   | 705/1000 [11:51<04:48,  1.02it/s, lr=2e-5, step_loss=0.00713]  
Steps:  70%|███████   | 705/1000 [11:52<04:48,  1.02it/s, lr=2e-5, step_loss=0.00497]
Steps:  70%|███████   | 705/1000 [11:52<04:48,  1.02it/s, lr=2e-5, step_loss=0.0167] 
Steps:  70%|███████   | 705/1000 [11:52<04:48,  1.02it/s, lr=2e-5, step_loss=0.00296]
Steps:  71%|███████   | 706/1000 [11:52<04:47,  1.02it/s, lr=2e-5, step_loss=0.00296]
Steps:  71%|███████   | 706/1000 [11:52<04:47,  1.02it/s, lr=1.99e-5, step_loss=0.153]
Steps:  71%|███████   | 706/1000 [11:53<04:47,  1.02it/s, lr=1.99e-5, step_loss=0.249]
Steps:  71%|███████   | 706/1000 [11:53<04:47,  1.02it/s, lr=1.99e-5, step_loss=0.15] 
Steps:  71%|███████   | 706/1000 [11:53<04:47,  1.02it/s, lr=1.99e-5, step_loss=0.0468]
Steps:  71%|███████   | 707/1000 [11:53<04:46,  1.02it/s, lr=1.99e-5, step_loss=0.0468]
Steps:  71%|███████   | 707/1000 [11:53<04:46,  1.02it/s, lr=1.97e-5, step_loss=0.0363]
Steps:  71%|███████   | 707/1000 [11:54<04:46,  1.02it/s, lr=1.97e-5, step_loss=0.0329]
Steps:  71%|███████   | 707/1000 [11:54<04:46,  1.02it/s, lr=1.97e-5, step_loss=0.013] 
Steps:  71%|███████   | 707/1000 [11:54<04:46,  1.02it/s, lr=1.97e-5, step_loss=0.0154]
Steps:  71%|███████   | 708/1000 [11:54<04:45,  1.02it/s, lr=1.97e-5, step_loss=0.0154]
Steps:  71%|███████   | 708/1000 [11:54<04:45,  1.02it/s, lr=1.96e-5, step_loss=0.0179]
Steps:  71%|███████   | 708/1000 [11:54<04:45,  1.02it/s, lr=1.96e-5, step_loss=0.118] 
Steps:  71%|███████   | 708/1000 [11:55<04:45,  1.02it/s, lr=1.96e-5, step_loss=0.0531]
Steps:  71%|███████   | 708/1000 [11:55<04:45,  1.02it/s, lr=1.96e-5, step_loss=0.0643]
Steps:  71%|███████   | 709/1000 [11:55<04:44,  1.02it/s, lr=1.96e-5, step_loss=0.0643]
Steps:  71%|███████   | 709/1000 [11:55<04:44,  1.02it/s, lr=1.95e-5, step_loss=0.0912]
Steps:  71%|███████   | 709/1000 [11:55<04:44,  1.02it/s, lr=1.95e-5, step_loss=0.00331]
Steps:  71%|███████   | 709/1000 [11:56<04:44,  1.02it/s, lr=1.95e-5, step_loss=0.0273] 
Steps:  71%|███████   | 709/1000 [11:56<04:44,  1.02it/s, lr=1.95e-5, step_loss=0.0185]
Steps:  71%|███████   | 710/1000 [11:56<04:43,  1.02it/s, lr=1.95e-5, step_loss=0.0185]
Steps:  71%|███████   | 710/1000 [11:56<04:43,  1.02it/s, lr=1.94e-5, step_loss=0.00435]
Steps:  71%|███████   | 710/1000 [11:56<04:43,  1.02it/s, lr=1.94e-5, step_loss=0.305]  
Steps:  71%|███████   | 710/1000 [11:57<04:43,  1.02it/s, lr=1.94e-5, step_loss=0.0128]
Steps:  71%|███████   | 710/1000 [11:57<04:43,  1.02it/s, lr=1.94e-5, step_loss=0.0911]
Steps:  71%|███████   | 711/1000 [11:57<04:42,  1.02it/s, lr=1.94e-5, step_loss=0.0911]
Steps:  71%|███████   | 711/1000 [11:57<04:42,  1.02it/s, lr=1.92e-5, step_loss=0.0862]
Steps:  71%|███████   | 711/1000 [11:57<04:42,  1.02it/s, lr=1.92e-5, step_loss=0.0792]
Steps:  71%|███████   | 711/1000 [11:58<04:42,  1.02it/s, lr=1.92e-5, step_loss=0.167] 
Steps:  71%|███████   | 711/1000 [11:58<04:42,  1.02it/s, lr=1.92e-5, step_loss=0.0924]
Steps:  71%|███████   | 712/1000 [11:58<04:41,  1.02it/s, lr=1.92e-5, step_loss=0.0924]
Steps:  71%|███████   | 712/1000 [11:58<04:41,  1.02it/s, lr=1.91e-5, step_loss=0.103] 
Steps:  71%|███████   | 712/1000 [11:58<04:41,  1.02it/s, lr=1.91e-5, step_loss=0.00928]
Steps:  71%|███████   | 712/1000 [11:59<04:41,  1.02it/s, lr=1.91e-5, step_loss=0.023]  
Steps:  71%|███████   | 712/1000 [11:59<04:41,  1.02it/s, lr=1.91e-5, step_loss=0.027]
Steps:  71%|███████▏  | 713/1000 [11:59<04:40,  1.02it/s, lr=1.91e-5, step_loss=0.027]
Steps:  71%|███████▏  | 713/1000 [11:59<04:40,  1.02it/s, lr=1.9e-5, step_loss=0.209] 
Steps:  71%|███████▏  | 713/1000 [11:59<04:40,  1.02it/s, lr=1.9e-5, step_loss=0.046]
Steps:  71%|███████▏  | 713/1000 [12:00<04:40,  1.02it/s, lr=1.9e-5, step_loss=0.0235]
Steps:  71%|███████▏  | 713/1000 [12:00<04:40,  1.02it/s, lr=1.9e-5, step_loss=0.028] 
Steps:  71%|███████▏  | 714/1000 [12:00<04:39,  1.02it/s, lr=1.9e-5, step_loss=0.028]
Steps:  71%|███████▏  | 714/1000 [12:00<04:39,  1.02it/s, lr=1.89e-5, step_loss=0.0305]
Steps:  71%|███████▏  | 714/1000 [12:00<04:39,  1.02it/s, lr=1.89e-5, step_loss=0.0429]
Steps:  71%|███████▏  | 714/1000 [12:01<04:39,  1.02it/s, lr=1.89e-5, step_loss=0.0932]
Steps:  71%|███████▏  | 714/1000 [12:01<04:39,  1.02it/s, lr=1.89e-5, step_loss=0.0475]
Steps:  72%|███████▏  | 715/1000 [12:01<04:38,  1.02it/s, lr=1.89e-5, step_loss=0.0475]
Steps:  72%|███████▏  | 715/1000 [12:01<04:38,  1.02it/s, lr=1.87e-5, step_loss=0.00258]
Steps:  72%|███████▏  | 715/1000 [12:01<04:38,  1.02it/s, lr=1.87e-5, step_loss=0.034]  
Steps:  72%|███████▏  | 715/1000 [12:02<04:38,  1.02it/s, lr=1.87e-5, step_loss=0.12] 
Steps:  72%|███████▏  | 715/1000 [12:02<04:38,  1.02it/s, lr=1.87e-5, step_loss=0.013]
Steps:  72%|███████▏  | 716/1000 [12:02<04:37,  1.02it/s, lr=1.87e-5, step_loss=0.013]
Steps:  72%|███████▏  | 716/1000 [12:02<04:37,  1.02it/s, lr=1.86e-5, step_loss=0.00666]
Steps:  72%|███████▏  | 716/1000 [12:02<04:37,  1.02it/s, lr=1.86e-5, step_loss=0.0152] 
Steps:  72%|███████▏  | 716/1000 [12:03<04:37,  1.02it/s, lr=1.86e-5, step_loss=0.0358]
Steps:  72%|███████▏  | 716/1000 [12:03<04:37,  1.02it/s, lr=1.86e-5, step_loss=0.0361]
Steps:  72%|███████▏  | 717/1000 [12:03<04:36,  1.02it/s, lr=1.86e-5, step_loss=0.0361]
Steps:  72%|███████▏  | 717/1000 [12:03<04:36,  1.02it/s, lr=1.85e-5, step_loss=0.0103]
Steps:  72%|███████▏  | 717/1000 [12:03<04:36,  1.02it/s, lr=1.85e-5, step_loss=0.0378]
Steps:  72%|███████▏  | 717/1000 [12:04<04:36,  1.02it/s, lr=1.85e-5, step_loss=0.0826]
Steps:  72%|███████▏  | 717/1000 [12:04<04:36,  1.02it/s, lr=1.85e-5, step_loss=0.0698]
Steps:  72%|███████▏  | 718/1000 [12:04<04:35,  1.02it/s, lr=1.85e-5, step_loss=0.0698]
Steps:  72%|███████▏  | 718/1000 [12:04<04:35,  1.02it/s, lr=1.84e-5, step_loss=0.311] 
Steps:  72%|███████▏  | 718/1000 [12:04<04:35,  1.02it/s, lr=1.84e-5, step_loss=0.087]
Steps:  72%|███████▏  | 718/1000 [12:05<04:35,  1.02it/s, lr=1.84e-5, step_loss=0.122]
Steps:  72%|███████▏  | 718/1000 [12:05<04:35,  1.02it/s, lr=1.84e-5, step_loss=0.00729]
Steps:  72%|███████▏  | 719/1000 [12:05<04:34,  1.02it/s, lr=1.84e-5, step_loss=0.00729]
Steps:  72%|███████▏  | 719/1000 [12:05<04:34,  1.02it/s, lr=1.82e-5, step_loss=0.0409] 
Steps:  72%|███████▏  | 719/1000 [12:05<04:34,  1.02it/s, lr=1.82e-5, step_loss=0.266] 
Steps:  72%|███████▏  | 719/1000 [12:05<04:34,  1.02it/s, lr=1.82e-5, step_loss=0.0935]
Steps:  72%|███████▏  | 719/1000 [12:06<04:34,  1.02it/s, lr=1.82e-5, step_loss=0.0584]
Steps:  72%|███████▏  | 720/1000 [12:06<04:33,  1.02it/s, lr=1.82e-5, step_loss=0.0584]
Steps:  72%|███████▏  | 720/1000 [12:06<04:33,  1.02it/s, lr=1.81e-5, step_loss=0.096] 
Steps:  72%|███████▏  | 720/1000 [12:06<04:33,  1.02it/s, lr=1.81e-5, step_loss=0.0305]
Steps:  72%|███████▏  | 720/1000 [12:06<04:33,  1.02it/s, lr=1.81e-5, step_loss=0.247] 
Steps:  72%|███████▏  | 720/1000 [12:07<04:33,  1.02it/s, lr=1.81e-5, step_loss=0.0635]
Steps:  72%|███████▏  | 721/1000 [12:07<04:32,  1.02it/s, lr=1.81e-5, step_loss=0.0635]
Steps:  72%|███████▏  | 721/1000 [12:07<04:32,  1.02it/s, lr=1.8e-5, step_loss=0.206]  
Steps:  72%|███████▏  | 721/1000 [12:07<04:32,  1.02it/s, lr=1.8e-5, step_loss=0.298]
Steps:  72%|███████▏  | 721/1000 [12:07<04:32,  1.02it/s, lr=1.8e-5, step_loss=0.0557]
Steps:  72%|███████▏  | 721/1000 [12:08<04:32,  1.02it/s, lr=1.8e-5, step_loss=0.00361]
Steps:  72%|███████▏  | 722/1000 [12:08<04:31,  1.02it/s, lr=1.8e-5, step_loss=0.00361]
Steps:  72%|███████▏  | 722/1000 [12:08<04:31,  1.02it/s, lr=1.79e-5, step_loss=0.0139]
Steps:  72%|███████▏  | 722/1000 [12:08<04:31,  1.02it/s, lr=1.79e-5, step_loss=0.0217]
Steps:  72%|███████▏  | 722/1000 [12:08<04:31,  1.02it/s, lr=1.79e-5, step_loss=0.133] 
Steps:  72%|███████▏  | 722/1000 [12:09<04:31,  1.02it/s, lr=1.79e-5, step_loss=0.224]
Steps:  72%|███████▏  | 723/1000 [12:09<04:30,  1.02it/s, lr=1.79e-5, step_loss=0.224]
Steps:  72%|███████▏  | 723/1000 [12:09<04:30,  1.02it/s, lr=1.78e-5, step_loss=0.0486]
Steps:  72%|███████▏  | 723/1000 [12:09<04:30,  1.02it/s, lr=1.78e-5, step_loss=0.791] 
Steps:  72%|███████▏  | 723/1000 [12:09<04:30,  1.02it/s, lr=1.78e-5, step_loss=0.0535]
Steps:  72%|███████▏  | 723/1000 [12:10<04:30,  1.02it/s, lr=1.78e-5, step_loss=0.018] 
Steps:  72%|███████▏  | 724/1000 [12:10<04:29,  1.02it/s, lr=1.78e-5, step_loss=0.018]
Steps:  72%|███████▏  | 724/1000 [12:10<04:29,  1.02it/s, lr=1.76e-5, step_loss=0.204]
Steps:  72%|███████▏  | 724/1000 [12:10<04:29,  1.02it/s, lr=1.76e-5, step_loss=0.0298]
Steps:  72%|███████▏  | 724/1000 [12:10<04:29,  1.02it/s, lr=1.76e-5, step_loss=0.0802]
Steps:  72%|███████▏  | 724/1000 [12:11<04:29,  1.02it/s, lr=1.76e-5, step_loss=0.0323]
Steps:  72%|███████▎  | 725/1000 [12:11<04:28,  1.02it/s, lr=1.76e-5, step_loss=0.0323]
Steps:  72%|███████▎  | 725/1000 [12:11<04:28,  1.02it/s, lr=1.75e-5, step_loss=0.087] 
Steps:  72%|███████▎  | 725/1000 [12:11<04:28,  1.02it/s, lr=1.75e-5, step_loss=0.208]
Steps:  72%|███████▎  | 725/1000 [12:11<04:28,  1.02it/s, lr=1.75e-5, step_loss=0.00472]
Steps:  72%|███████▎  | 725/1000 [12:12<04:28,  1.02it/s, lr=1.75e-5, step_loss=0.097]  
Steps:  73%|███████▎  | 726/1000 [12:12<04:27,  1.02it/s, lr=1.75e-5, step_loss=0.097]
Steps:  73%|███████▎  | 726/1000 [12:12<04:27,  1.02it/s, lr=1.74e-5, step_loss=0.0255]
Steps:  73%|███████▎  | 726/1000 [12:12<04:27,  1.02it/s, lr=1.74e-5, step_loss=0.146] 
Steps:  73%|███████▎  | 726/1000 [12:12<04:27,  1.02it/s, lr=1.74e-5, step_loss=0.161]
Steps:  73%|███████▎  | 726/1000 [12:13<04:27,  1.02it/s, lr=1.74e-5, step_loss=0.0161]
Steps:  73%|███████▎  | 727/1000 [12:13<04:26,  1.02it/s, lr=1.74e-5, step_loss=0.0161]
Steps:  73%|███████▎  | 727/1000 [12:13<04:26,  1.02it/s, lr=1.73e-5, step_loss=0.0402]
Steps:  73%|███████▎  | 727/1000 [12:13<04:26,  1.02it/s, lr=1.73e-5, step_loss=0.0101]
Steps:  73%|███████▎  | 727/1000 [12:13<04:26,  1.02it/s, lr=1.73e-5, step_loss=0.129] 
Steps:  73%|███████▎  | 727/1000 [12:14<04:26,  1.02it/s, lr=1.73e-5, step_loss=0.145]
Steps:  73%|███████▎  | 728/1000 [12:14<04:25,  1.02it/s, lr=1.73e-5, step_loss=0.145]
Steps:  73%|███████▎  | 728/1000 [12:14<04:25,  1.02it/s, lr=1.72e-5, step_loss=0.0267]
Steps:  73%|███████▎  | 728/1000 [12:14<04:25,  1.02it/s, lr=1.72e-5, step_loss=0.08]  
Steps:  73%|███████▎  | 728/1000 [12:14<04:25,  1.02it/s, lr=1.72e-5, step_loss=0.176]
Steps:  73%|███████▎  | 728/1000 [12:15<04:25,  1.02it/s, lr=1.72e-5, step_loss=0.14] 
Steps:  73%|███████▎  | 729/1000 [12:15<04:24,  1.02it/s, lr=1.72e-5, step_loss=0.14]
Steps:  73%|███████▎  | 729/1000 [12:15<04:24,  1.02it/s, lr=1.71e-5, step_loss=0.0798]
Steps:  73%|███████▎  | 729/1000 [12:15<04:24,  1.02it/s, lr=1.71e-5, step_loss=0.254] 
Steps:  73%|███████▎  | 729/1000 [12:15<04:24,  1.02it/s, lr=1.71e-5, step_loss=0.0354]
Steps:  73%|███████▎  | 729/1000 [12:15<04:24,  1.02it/s, lr=1.71e-5, step_loss=0.0195]
Steps:  73%|███████▎  | 730/1000 [12:16<04:23,  1.02it/s, lr=1.71e-5, step_loss=0.0195]
Steps:  73%|███████▎  | 730/1000 [12:16<04:23,  1.02it/s, lr=1.69e-5, step_loss=0.0188]
Steps:  73%|███████▎  | 730/1000 [12:16<04:23,  1.02it/s, lr=1.69e-5, step_loss=0.118] 
Steps:  73%|███████▎  | 730/1000 [12:16<04:23,  1.02it/s, lr=1.69e-5, step_loss=0.0948]
Steps:  73%|███████▎  | 730/1000 [12:16<04:23,  1.02it/s, lr=1.69e-5, step_loss=0.151] 
Steps:  73%|███████▎  | 731/1000 [12:17<04:22,  1.02it/s, lr=1.69e-5, step_loss=0.151]
Steps:  73%|███████▎  | 731/1000 [12:17<04:22,  1.02it/s, lr=1.68e-5, step_loss=0.147]
Steps:  73%|███████▎  | 731/1000 [12:17<04:22,  1.02it/s, lr=1.68e-5, step_loss=0.107]
Steps:  73%|███████▎  | 731/1000 [12:17<04:22,  1.02it/s, lr=1.68e-5, step_loss=0.335]
Steps:  73%|███████▎  | 731/1000 [12:17<04:22,  1.02it/s, lr=1.68e-5, step_loss=0.136]
Steps:  73%|███████▎  | 732/1000 [12:18<04:21,  1.02it/s, lr=1.68e-5, step_loss=0.136]
Steps:  73%|███████▎  | 732/1000 [12:18<04:21,  1.02it/s, lr=1.67e-5, step_loss=0.0065]
Steps:  73%|███████▎  | 732/1000 [12:18<04:21,  1.02it/s, lr=1.67e-5, step_loss=0.119] 
Steps:  73%|███████▎  | 732/1000 [12:18<04:21,  1.02it/s, lr=1.67e-5, step_loss=0.099]
Steps:  73%|███████▎  | 732/1000 [12:18<04:21,  1.02it/s, lr=1.67e-5, step_loss=0.0483]
Steps:  73%|███████▎  | 733/1000 [12:19<04:20,  1.02it/s, lr=1.67e-5, step_loss=0.0483]
Steps:  73%|███████▎  | 733/1000 [12:19<04:20,  1.02it/s, lr=1.66e-5, step_loss=0.15]  
Steps:  73%|███████▎  | 733/1000 [12:19<04:20,  1.02it/s, lr=1.66e-5, step_loss=0.00595]
Steps:  73%|███████▎  | 733/1000 [12:19<04:20,  1.02it/s, lr=1.66e-5, step_loss=0.0682] 
Steps:  73%|███████▎  | 733/1000 [12:19<04:20,  1.02it/s, lr=1.66e-5, step_loss=0.0654]
Steps:  73%|███████▎  | 734/1000 [12:20<04:19,  1.02it/s, lr=1.66e-5, step_loss=0.0654]
Steps:  73%|███████▎  | 734/1000 [12:20<04:19,  1.02it/s, lr=1.65e-5, step_loss=0.113] 
Steps:  73%|███████▎  | 734/1000 [12:20<04:19,  1.02it/s, lr=1.65e-5, step_loss=0.00362]
Steps:  73%|███████▎  | 734/1000 [12:20<04:19,  1.02it/s, lr=1.65e-5, step_loss=0.166]  
Steps:  73%|███████▎  | 734/1000 [12:20<04:19,  1.02it/s, lr=1.65e-5, step_loss=0.0145]
Steps:  74%|███████▎  | 735/1000 [12:21<04:18,  1.02it/s, lr=1.65e-5, step_loss=0.0145]
Steps:  74%|███████▎  | 735/1000 [12:21<04:18,  1.02it/s, lr=1.63e-5, step_loss=0.0428]
Steps:  74%|███████▎  | 735/1000 [12:21<04:18,  1.02it/s, lr=1.63e-5, step_loss=0.00365]
Steps:  74%|███████▎  | 735/1000 [12:21<04:18,  1.02it/s, lr=1.63e-5, step_loss=0.00855]
Steps:  74%|███████▎  | 735/1000 [12:21<04:18,  1.02it/s, lr=1.63e-5, step_loss=0.202]  
Steps:  74%|███████▎  | 736/1000 [12:22<04:17,  1.02it/s, lr=1.63e-5, step_loss=0.202]
Steps:  74%|███████▎  | 736/1000 [12:22<04:17,  1.02it/s, lr=1.62e-5, step_loss=0.0041]
Steps:  74%|███████▎  | 736/1000 [12:22<04:17,  1.02it/s, lr=1.62e-5, step_loss=0.0122]
Steps:  74%|███████▎  | 736/1000 [12:22<04:17,  1.02it/s, lr=1.62e-5, step_loss=0.109] 
Steps:  74%|███████▎  | 736/1000 [12:22<04:17,  1.02it/s, lr=1.62e-5, step_loss=0.0437]
Steps:  74%|███████▎  | 737/1000 [12:23<04:16,  1.02it/s, lr=1.62e-5, step_loss=0.0437]
Steps:  74%|███████▎  | 737/1000 [12:23<04:16,  1.02it/s, lr=1.61e-5, step_loss=0.0105]
Steps:  74%|███████▎  | 737/1000 [12:23<04:16,  1.02it/s, lr=1.61e-5, step_loss=0.272] 
Steps:  74%|███████▎  | 737/1000 [12:23<04:16,  1.02it/s, lr=1.61e-5, step_loss=0.0159]
Steps:  74%|███████▎  | 737/1000 [12:23<04:16,  1.02it/s, lr=1.61e-5, step_loss=0.00499]
Steps:  74%|███████▍  | 738/1000 [12:24<04:15,  1.02it/s, lr=1.61e-5, step_loss=0.00499]
Steps:  74%|███████▍  | 738/1000 [12:24<04:15,  1.02it/s, lr=1.6e-5, step_loss=0.0302]  
Steps:  74%|███████▍  | 738/1000 [12:24<04:15,  1.02it/s, lr=1.6e-5, step_loss=0.0292]
Steps:  74%|███████▍  | 738/1000 [12:24<04:15,  1.02it/s, lr=1.6e-5, step_loss=0.0265]
Steps:  74%|███████▍  | 738/1000 [12:24<04:15,  1.02it/s, lr=1.6e-5, step_loss=0.00228]
Steps:  74%|███████▍  | 739/1000 [12:24<04:14,  1.02it/s, lr=1.6e-5, step_loss=0.00228]
Steps:  74%|███████▍  | 739/1000 [12:25<04:14,  1.02it/s, lr=1.59e-5, step_loss=0.0663]
Steps:  74%|███████▍  | 739/1000 [12:25<04:14,  1.02it/s, lr=1.59e-5, step_loss=0.0522]
Steps:  74%|███████▍  | 739/1000 [12:25<04:14,  1.02it/s, lr=1.59e-5, step_loss=0.0928]
Steps:  74%|███████▍  | 739/1000 [12:25<04:14,  1.02it/s, lr=1.59e-5, step_loss=0.342] 
Steps:  74%|███████▍  | 740/1000 [12:25<04:13,  1.02it/s, lr=1.59e-5, step_loss=0.342]
Steps:  74%|███████▍  | 740/1000 [12:25<04:13,  1.02it/s, lr=1.58e-5, step_loss=0.125]
Steps:  74%|███████▍  | 740/1000 [12:26<04:13,  1.02it/s, lr=1.58e-5, step_loss=0.0378]
Steps:  74%|███████▍  | 740/1000 [12:26<04:13,  1.02it/s, lr=1.58e-5, step_loss=0.00439]
Steps:  74%|███████▍  | 740/1000 [12:26<04:13,  1.02it/s, lr=1.58e-5, step_loss=0.0441] 
Steps:  74%|███████▍  | 741/1000 [12:26<04:12,  1.02it/s, lr=1.58e-5, step_loss=0.0441]
Steps:  74%|███████▍  | 741/1000 [12:26<04:12,  1.02it/s, lr=1.57e-5, step_loss=0.00711]
Steps:  74%|███████▍  | 741/1000 [12:27<04:12,  1.02it/s, lr=1.57e-5, step_loss=0.0118] 
Steps:  74%|███████▍  | 741/1000 [12:27<04:12,  1.02it/s, lr=1.57e-5, step_loss=0.182] 
Steps:  74%|███████▍  | 741/1000 [12:27<04:12,  1.02it/s, lr=1.57e-5, step_loss=0.0612]
Steps:  74%|███████▍  | 742/1000 [12:27<04:11,  1.02it/s, lr=1.57e-5, step_loss=0.0612]
Steps:  74%|███████▍  | 742/1000 [12:27<04:11,  1.02it/s, lr=1.55e-5, step_loss=0.148] 
Steps:  74%|███████▍  | 742/1000 [12:28<04:11,  1.02it/s, lr=1.55e-5, step_loss=0.0563]
Steps:  74%|███████▍  | 742/1000 [12:28<04:11,  1.02it/s, lr=1.55e-5, step_loss=0.00833]
Steps:  74%|███████▍  | 742/1000 [12:28<04:11,  1.02it/s, lr=1.55e-5, step_loss=0.11]   
Steps:  74%|███████▍  | 743/1000 [12:28<04:10,  1.02it/s, lr=1.55e-5, step_loss=0.11]
Steps:  74%|███████▍  | 743/1000 [12:28<04:10,  1.02it/s, lr=1.54e-5, step_loss=0.0619]
Steps:  74%|███████▍  | 743/1000 [12:29<04:10,  1.02it/s, lr=1.54e-5, step_loss=0.113] 
Steps:  74%|███████▍  | 743/1000 [12:29<04:10,  1.02it/s, lr=1.54e-5, step_loss=0.00805]
Steps:  74%|███████▍  | 743/1000 [12:29<04:10,  1.02it/s, lr=1.54e-5, step_loss=0.0336] 
Steps:  74%|███████▍  | 744/1000 [12:29<04:09,  1.02it/s, lr=1.54e-5, step_loss=0.0336]
Steps:  74%|███████▍  | 744/1000 [12:29<04:09,  1.02it/s, lr=1.53e-5, step_loss=0.0064]
Steps:  74%|███████▍  | 744/1000 [12:30<04:09,  1.02it/s, lr=1.53e-5, step_loss=0.554] 
Steps:  74%|███████▍  | 744/1000 [12:30<04:09,  1.02it/s, lr=1.53e-5, step_loss=0.152]
Steps:  74%|███████▍  | 744/1000 [12:30<04:09,  1.02it/s, lr=1.53e-5, step_loss=0.00514]
Steps:  74%|███████▍  | 745/1000 [12:30<04:08,  1.02it/s, lr=1.53e-5, step_loss=0.00514]
Steps:  74%|███████▍  | 745/1000 [12:30<04:08,  1.02it/s, lr=1.52e-5, step_loss=0.00646]
Steps:  74%|███████▍  | 745/1000 [12:31<04:08,  1.02it/s, lr=1.52e-5, step_loss=0.00234]
Steps:  74%|███████▍  | 745/1000 [12:31<04:08,  1.02it/s, lr=1.52e-5, step_loss=0.0346] 
Steps:  74%|███████▍  | 745/1000 [12:31<04:08,  1.02it/s, lr=1.52e-5, step_loss=0.292] 
Steps:  75%|███████▍  | 746/1000 [12:31<04:08,  1.02it/s, lr=1.52e-5, step_loss=0.292]
Steps:  75%|███████▍  | 746/1000 [12:31<04:08,  1.02it/s, lr=1.51e-5, step_loss=0.0642]
Steps:  75%|███████▍  | 746/1000 [12:32<04:08,  1.02it/s, lr=1.51e-5, step_loss=0.00522]
Steps:  75%|███████▍  | 746/1000 [12:32<04:08,  1.02it/s, lr=1.51e-5, step_loss=0.0337] 
Steps:  75%|███████▍  | 746/1000 [12:32<04:08,  1.02it/s, lr=1.51e-5, step_loss=0.0922]
Steps:  75%|███████▍  | 747/1000 [12:32<04:06,  1.02it/s, lr=1.51e-5, step_loss=0.0922]
Steps:  75%|███████▍  | 747/1000 [12:32<04:06,  1.02it/s, lr=1.5e-5, step_loss=0.123]  
Steps:  75%|███████▍  | 747/1000 [12:33<04:06,  1.02it/s, lr=1.5e-5, step_loss=0.324]
Steps:  75%|███████▍  | 747/1000 [12:33<04:06,  1.02it/s, lr=1.5e-5, step_loss=0.498]
Steps:  75%|███████▍  | 747/1000 [12:33<04:06,  1.02it/s, lr=1.5e-5, step_loss=0.0186]
Steps:  75%|███████▍  | 748/1000 [12:33<04:06,  1.02it/s, lr=1.5e-5, step_loss=0.0186]
Steps:  75%|███████▍  | 748/1000 [12:33<04:06,  1.02it/s, lr=1.49e-5, step_loss=0.0307]
Steps:  75%|███████▍  | 748/1000 [12:34<04:06,  1.02it/s, lr=1.49e-5, step_loss=0.00343]
Steps:  75%|███████▍  | 748/1000 [12:34<04:06,  1.02it/s, lr=1.49e-5, step_loss=0.135]  
Steps:  75%|███████▍  | 748/1000 [12:34<04:06,  1.02it/s, lr=1.49e-5, step_loss=0.103]
Steps:  75%|███████▍  | 749/1000 [12:34<04:05,  1.02it/s, lr=1.49e-5, step_loss=0.103]
Steps:  75%|███████▍  | 749/1000 [12:34<04:05,  1.02it/s, lr=1.48e-5, step_loss=0.00959]
Steps:  75%|███████▍  | 749/1000 [12:35<04:05,  1.02it/s, lr=1.48e-5, step_loss=0.0079] 
Steps:  75%|███████▍  | 749/1000 [12:35<04:05,  1.02it/s, lr=1.48e-5, step_loss=0.0776]
Steps:  75%|███████▍  | 749/1000 [12:35<04:05,  1.02it/s, lr=1.48e-5, step_loss=0.00844]
Steps:  75%|███████▌  | 750/1000 [12:35<04:04,  1.02it/s, lr=1.48e-5, step_loss=0.00844]
Steps:  75%|███████▌  | 750/1000 [12:35<04:04,  1.02it/s, lr=1.46e-5, step_loss=0.189]  
Steps:  75%|███████▌  | 750/1000 [12:36<04:04,  1.02it/s, lr=1.46e-5, step_loss=0.132]
Steps:  75%|███████▌  | 750/1000 [12:36<04:04,  1.02it/s, lr=1.46e-5, step_loss=0.00306]
Steps:  75%|███████▌  | 750/1000 [12:36<04:04,  1.02it/s, lr=1.46e-5, step_loss=0.00349]
Steps:  75%|███████▌  | 751/1000 [12:36<04:03,  1.02it/s, lr=1.46e-5, step_loss=0.00349]
Steps:  75%|███████▌  | 751/1000 [12:36<04:03,  1.02it/s, lr=1.45e-5, step_loss=0.652]  
Steps:  75%|███████▌  | 751/1000 [12:36<04:03,  1.02it/s, lr=1.45e-5, step_loss=0.135]
Steps:  75%|███████▌  | 751/1000 [12:37<04:03,  1.02it/s, lr=1.45e-5, step_loss=0.0304]
Steps:  75%|███████▌  | 751/1000 [12:37<04:03,  1.02it/s, lr=1.45e-5, step_loss=0.047] 
Steps:  75%|███████▌  | 752/1000 [12:37<04:02,  1.02it/s, lr=1.45e-5, step_loss=0.047]
Steps:  75%|███████▌  | 752/1000 [12:37<04:02,  1.02it/s, lr=1.44e-5, step_loss=0.142]
Steps:  75%|███████▌  | 752/1000 [12:37<04:02,  1.02it/s, lr=1.44e-5, step_loss=0.0524]
Steps:  75%|███████▌  | 752/1000 [12:38<04:02,  1.02it/s, lr=1.44e-5, step_loss=0.0464]
Steps:  75%|███████▌  | 752/1000 [12:38<04:02,  1.02it/s, lr=1.44e-5, step_loss=0.056] 
Steps:  75%|███████▌  | 753/1000 [12:38<04:01,  1.02it/s, lr=1.44e-5, step_loss=0.056]
Steps:  75%|███████▌  | 753/1000 [12:38<04:01,  1.02it/s, lr=1.43e-5, step_loss=0.0585]
Steps:  75%|███████▌  | 753/1000 [12:38<04:01,  1.02it/s, lr=1.43e-5, step_loss=0.256] 
Steps:  75%|███████▌  | 753/1000 [12:39<04:01,  1.02it/s, lr=1.43e-5, step_loss=0.0543]
Steps:  75%|███████▌  | 753/1000 [12:39<04:01,  1.02it/s, lr=1.43e-5, step_loss=0.202] 
Steps:  75%|███████▌  | 754/1000 [12:39<04:00,  1.02it/s, lr=1.43e-5, step_loss=0.202]
Steps:  75%|███████▌  | 754/1000 [12:39<04:00,  1.02it/s, lr=1.42e-5, step_loss=0.469]
Steps:  75%|███████▌  | 754/1000 [12:39<04:00,  1.02it/s, lr=1.42e-5, step_loss=0.0891]
Steps:  75%|███████▌  | 754/1000 [12:40<04:00,  1.02it/s, lr=1.42e-5, step_loss=0.0536]
Steps:  75%|███████▌  | 754/1000 [12:40<04:00,  1.02it/s, lr=1.42e-5, step_loss=0.0706]
Steps:  76%|███████▌  | 755/1000 [12:40<03:59,  1.02it/s, lr=1.42e-5, step_loss=0.0706]
Steps:  76%|███████▌  | 755/1000 [12:40<03:59,  1.02it/s, lr=1.41e-5, step_loss=0.015] 
Steps:  76%|███████▌  | 755/1000 [12:40<03:59,  1.02it/s, lr=1.41e-5, step_loss=0.0687]
Steps:  76%|███████▌  | 755/1000 [12:41<03:59,  1.02it/s, lr=1.41e-5, step_loss=0.00316]
Steps:  76%|███████▌  | 755/1000 [12:41<03:59,  1.02it/s, lr=1.41e-5, step_loss=0.0643] 
Steps:  76%|███████▌  | 756/1000 [12:41<03:58,  1.02it/s, lr=1.41e-5, step_loss=0.0643]
Steps:  76%|███████▌  | 756/1000 [12:41<03:58,  1.02it/s, lr=1.4e-5, step_loss=0.138]  
Steps:  76%|███████▌  | 756/1000 [12:41<03:58,  1.02it/s, lr=1.4e-5, step_loss=0.0749]
Steps:  76%|███████▌  | 756/1000 [12:42<03:58,  1.02it/s, lr=1.4e-5, step_loss=0.153] 
Steps:  76%|███████▌  | 756/1000 [12:42<03:58,  1.02it/s, lr=1.4e-5, step_loss=0.274]
Steps:  76%|███████▌  | 757/1000 [12:42<03:57,  1.02it/s, lr=1.4e-5, step_loss=0.274]
Steps:  76%|███████▌  | 757/1000 [12:42<03:57,  1.02it/s, lr=1.39e-5, step_loss=0.195]
Steps:  76%|███████▌  | 757/1000 [12:42<03:57,  1.02it/s, lr=1.39e-5, step_loss=0.0139]
Steps:  76%|███████▌  | 757/1000 [12:43<03:57,  1.02it/s, lr=1.39e-5, step_loss=0.00801]
Steps:  76%|███████▌  | 757/1000 [12:43<03:57,  1.02it/s, lr=1.39e-5, step_loss=0.0166] 
Steps:  76%|███████▌  | 758/1000 [12:43<03:56,  1.02it/s, lr=1.39e-5, step_loss=0.0166]
Steps:  76%|███████▌  | 758/1000 [12:43<03:56,  1.02it/s, lr=1.38e-5, step_loss=0.0738]
Steps:  76%|███████▌  | 758/1000 [12:43<03:56,  1.02it/s, lr=1.38e-5, step_loss=0.00265]
Steps:  76%|███████▌  | 758/1000 [12:44<03:56,  1.02it/s, lr=1.38e-5, step_loss=0.0823] 
Steps:  76%|███████▌  | 758/1000 [12:44<03:56,  1.02it/s, lr=1.38e-5, step_loss=0.0248]
Steps:  76%|███████▌  | 759/1000 [12:44<03:55,  1.02it/s, lr=1.38e-5, step_loss=0.0248]
Steps:  76%|███████▌  | 759/1000 [12:44<03:55,  1.02it/s, lr=1.37e-5, step_loss=0.107] 
Steps:  76%|███████▌  | 759/1000 [12:44<03:55,  1.02it/s, lr=1.37e-5, step_loss=0.0327]
Steps:  76%|███████▌  | 759/1000 [12:45<03:55,  1.02it/s, lr=1.37e-5, step_loss=0.418] 
Steps:  76%|███████▌  | 759/1000 [12:45<03:55,  1.02it/s, lr=1.37e-5, step_loss=0.0652]
Steps:  76%|███████▌  | 760/1000 [12:45<03:54,  1.02it/s, lr=1.37e-5, step_loss=0.0652]
Steps:  76%|███████▌  | 760/1000 [12:45<03:54,  1.02it/s, lr=1.36e-5, step_loss=0.00962]
Steps:  76%|███████▌  | 760/1000 [12:45<03:54,  1.02it/s, lr=1.36e-5, step_loss=0.303]  
Steps:  76%|███████▌  | 760/1000 [12:46<03:54,  1.02it/s, lr=1.36e-5, step_loss=0.205]
Steps:  76%|███████▌  | 760/1000 [12:46<03:54,  1.02it/s, lr=1.36e-5, step_loss=0.311]
Steps:  76%|███████▌  | 761/1000 [12:46<03:53,  1.02it/s, lr=1.36e-5, step_loss=0.311]
Steps:  76%|███████▌  | 761/1000 [12:46<03:53,  1.02it/s, lr=1.34e-5, step_loss=0.0615]
Steps:  76%|███████▌  | 761/1000 [12:46<03:53,  1.02it/s, lr=1.34e-5, step_loss=0.16]  
Steps:  76%|███████▌  | 761/1000 [12:47<03:53,  1.02it/s, lr=1.34e-5, step_loss=0.257]
Steps:  76%|███████▌  | 761/1000 [12:47<03:53,  1.02it/s, lr=1.34e-5, step_loss=0.161]
Steps:  76%|███████▌  | 762/1000 [12:47<03:52,  1.02it/s, lr=1.34e-5, step_loss=0.161]
Steps:  76%|███████▌  | 762/1000 [12:47<03:52,  1.02it/s, lr=1.33e-5, step_loss=0.0131]
Steps:  76%|███████▌  | 762/1000 [12:47<03:52,  1.02it/s, lr=1.33e-5, step_loss=0.08]  
Steps:  76%|███████▌  | 762/1000 [12:47<03:52,  1.02it/s, lr=1.33e-5, step_loss=0.306]
Steps:  76%|███████▌  | 762/1000 [12:48<03:52,  1.02it/s, lr=1.33e-5, step_loss=0.286]
Steps:  76%|███████▋  | 763/1000 [12:48<03:51,  1.02it/s, lr=1.33e-5, step_loss=0.286]
Steps:  76%|███████▋  | 763/1000 [12:48<03:51,  1.02it/s, lr=1.32e-5, step_loss=0.0021]
Steps:  76%|███████▋  | 763/1000 [12:48<03:51,  1.02it/s, lr=1.32e-5, step_loss=0.0498]
Steps:  76%|███████▋  | 763/1000 [12:48<03:51,  1.02it/s, lr=1.32e-5, step_loss=0.00611]
Steps:  76%|███████▋  | 763/1000 [12:49<03:51,  1.02it/s, lr=1.32e-5, step_loss=0.0285] 
Steps:  76%|███████▋  | 764/1000 [12:49<03:51,  1.02it/s, lr=1.32e-5, step_loss=0.0285]
Steps:  76%|███████▋  | 764/1000 [12:49<03:51,  1.02it/s, lr=1.31e-5, step_loss=0.396] 
Steps:  76%|███████▋  | 764/1000 [12:49<03:51,  1.02it/s, lr=1.31e-5, step_loss=0.00624]
Steps:  76%|███████▋  | 764/1000 [12:49<03:51,  1.02it/s, lr=1.31e-5, step_loss=0.0264] 
Steps:  76%|███████▋  | 764/1000 [12:50<03:51,  1.02it/s, lr=1.31e-5, step_loss=0.0313]
Steps:  76%|███████▋  | 765/1000 [12:50<03:49,  1.02it/s, lr=1.31e-5, step_loss=0.0313]
Steps:  76%|███████▋  | 765/1000 [12:50<03:49,  1.02it/s, lr=1.3e-5, step_loss=0.106]  
Steps:  76%|███████▋  | 765/1000 [12:50<03:49,  1.02it/s, lr=1.3e-5, step_loss=0.0774]
Steps:  76%|███████▋  | 765/1000 [12:50<03:49,  1.02it/s, lr=1.3e-5, step_loss=0.249] 
Steps:  76%|███████▋  | 765/1000 [12:51<03:49,  1.02it/s, lr=1.3e-5, step_loss=0.0561]
Steps:  77%|███████▋  | 766/1000 [12:51<03:48,  1.02it/s, lr=1.3e-5, step_loss=0.0561]
Steps:  77%|███████▋  | 766/1000 [12:51<03:48,  1.02it/s, lr=1.29e-5, step_loss=0.134]
Steps:  77%|███████▋  | 766/1000 [12:51<03:48,  1.02it/s, lr=1.29e-5, step_loss=0.00798]
Steps:  77%|███████▋  | 766/1000 [12:51<03:48,  1.02it/s, lr=1.29e-5, step_loss=0.18]   
Steps:  77%|███████▋  | 766/1000 [12:52<03:48,  1.02it/s, lr=1.29e-5, step_loss=0.116]
Steps:  77%|███████▋  | 767/1000 [12:52<03:48,  1.02it/s, lr=1.29e-5, step_loss=0.116]
Steps:  77%|███████▋  | 767/1000 [12:52<03:48,  1.02it/s, lr=1.28e-5, step_loss=0.0559]
Steps:  77%|███████▋  | 767/1000 [12:52<03:48,  1.02it/s, lr=1.28e-5, step_loss=0.0234]
Steps:  77%|███████▋  | 767/1000 [12:52<03:48,  1.02it/s, lr=1.28e-5, step_loss=0.0665]
Steps:  77%|███████▋  | 767/1000 [12:53<03:48,  1.02it/s, lr=1.28e-5, step_loss=0.00441]
Steps:  77%|███████▋  | 768/1000 [12:53<03:47,  1.02it/s, lr=1.28e-5, step_loss=0.00441]
Steps:  77%|███████▋  | 768/1000 [12:53<03:47,  1.02it/s, lr=1.27e-5, step_loss=0.0105] 
Steps:  77%|███████▋  | 768/1000 [12:53<03:47,  1.02it/s, lr=1.27e-5, step_loss=0.00253]
Steps:  77%|███████▋  | 768/1000 [12:53<03:47,  1.02it/s, lr=1.27e-5, step_loss=0.258]  
Steps:  77%|███████▋  | 768/1000 [12:54<03:47,  1.02it/s, lr=1.27e-5, step_loss=0.137]
Steps:  77%|███████▋  | 769/1000 [12:54<03:45,  1.02it/s, lr=1.27e-5, step_loss=0.137]
Steps:  77%|███████▋  | 769/1000 [12:54<03:45,  1.02it/s, lr=1.26e-5, step_loss=0.0365]
Steps:  77%|███████▋  | 769/1000 [12:54<03:45,  1.02it/s, lr=1.26e-5, step_loss=0.042] 
Steps:  77%|███████▋  | 769/1000 [12:54<03:45,  1.02it/s, lr=1.26e-5, step_loss=0.00918]
Steps:  77%|███████▋  | 769/1000 [12:55<03:45,  1.02it/s, lr=1.26e-5, step_loss=0.011]  
Steps:  77%|███████▋  | 770/1000 [12:55<03:44,  1.02it/s, lr=1.26e-5, step_loss=0.011]
Steps:  77%|███████▋  | 770/1000 [12:55<03:44,  1.02it/s, lr=1.25e-5, step_loss=0.0205]
Steps:  77%|███████▋  | 770/1000 [12:55<03:44,  1.02it/s, lr=1.25e-5, step_loss=0.0963]
Steps:  77%|███████▋  | 770/1000 [12:55<03:44,  1.02it/s, lr=1.25e-5, step_loss=0.144] 
Steps:  77%|███████▋  | 770/1000 [12:56<03:44,  1.02it/s, lr=1.25e-5, step_loss=0.0331]
Steps:  77%|███████▋  | 771/1000 [12:56<03:43,  1.02it/s, lr=1.25e-5, step_loss=0.0331]
Steps:  77%|███████▋  | 771/1000 [12:56<03:43,  1.02it/s, lr=1.24e-5, step_loss=0.0546]
Steps:  77%|███████▋  | 771/1000 [12:56<03:43,  1.02it/s, lr=1.24e-5, step_loss=0.296] 
Steps:  77%|███████▋  | 771/1000 [12:56<03:43,  1.02it/s, lr=1.24e-5, step_loss=0.0595]
Steps:  77%|███████▋  | 771/1000 [12:57<03:43,  1.02it/s, lr=1.24e-5, step_loss=0.345] 
Steps:  77%|███████▋  | 772/1000 [12:57<03:42,  1.02it/s, lr=1.24e-5, step_loss=0.345]
Steps:  77%|███████▋  | 772/1000 [12:57<03:42,  1.02it/s, lr=1.23e-5, step_loss=0.0149]
Steps:  77%|███████▋  | 772/1000 [12:57<03:42,  1.02it/s, lr=1.23e-5, step_loss=0.00716]
Steps:  77%|███████▋  | 772/1000 [12:57<03:42,  1.02it/s, lr=1.23e-5, step_loss=0.0259] 
Steps:  77%|███████▋  | 772/1000 [12:58<03:42,  1.02it/s, lr=1.23e-5, step_loss=0.0174]
Steps:  77%|███████▋  | 773/1000 [12:58<03:41,  1.02it/s, lr=1.23e-5, step_loss=0.0174]
Steps:  77%|███████▋  | 773/1000 [12:58<03:41,  1.02it/s, lr=1.22e-5, step_loss=0.00192]
Steps:  77%|███████▋  | 773/1000 [12:58<03:41,  1.02it/s, lr=1.22e-5, step_loss=0.0855] 
Steps:  77%|███████▋  | 773/1000 [12:58<03:41,  1.02it/s, lr=1.22e-5, step_loss=0.019] 
Steps:  77%|███████▋  | 773/1000 [12:58<03:41,  1.02it/s, lr=1.22e-5, step_loss=0.00725]
Steps:  77%|███████▋  | 774/1000 [12:59<03:40,  1.02it/s, lr=1.22e-5, step_loss=0.00725]
Steps:  77%|███████▋  | 774/1000 [12:59<03:40,  1.02it/s, lr=1.21e-5, step_loss=0.164]  
Steps:  77%|███████▋  | 774/1000 [12:59<03:40,  1.02it/s, lr=1.21e-5, step_loss=0.0901]
Steps:  77%|███████▋  | 774/1000 [12:59<03:40,  1.02it/s, lr=1.21e-5, step_loss=0.0426]
Steps:  77%|███████▋  | 774/1000 [12:59<03:40,  1.02it/s, lr=1.21e-5, step_loss=0.0547]
Steps:  78%|███████▊  | 775/1000 [13:00<03:39,  1.02it/s, lr=1.21e-5, step_loss=0.0547]
Steps:  78%|███████▊  | 775/1000 [13:00<03:39,  1.02it/s, lr=1.2e-5, step_loss=0.033]  
Steps:  78%|███████▊  | 775/1000 [13:00<03:39,  1.02it/s, lr=1.2e-5, step_loss=0.00378]
Steps:  78%|███████▊  | 775/1000 [13:00<03:39,  1.02it/s, lr=1.2e-5, step_loss=0.0339] 
Steps:  78%|███████▊  | 775/1000 [13:00<03:39,  1.02it/s, lr=1.2e-5, step_loss=0.00238]
Steps:  78%|███████▊  | 776/1000 [13:01<03:38,  1.02it/s, lr=1.2e-5, step_loss=0.00238]
Steps:  78%|███████▊  | 776/1000 [13:01<03:38,  1.02it/s, lr=1.19e-5, step_loss=0.0863]
Steps:  78%|███████▊  | 776/1000 [13:01<03:38,  1.02it/s, lr=1.19e-5, step_loss=0.134] 
Steps:  78%|███████▊  | 776/1000 [13:01<03:38,  1.02it/s, lr=1.19e-5, step_loss=0.0303]
Steps:  78%|███████▊  | 776/1000 [13:01<03:38,  1.02it/s, lr=1.19e-5, step_loss=0.0186]
Steps:  78%|███████▊  | 777/1000 [13:02<03:37,  1.02it/s, lr=1.19e-5, step_loss=0.0186]
Steps:  78%|███████▊  | 777/1000 [13:02<03:37,  1.02it/s, lr=1.18e-5, step_loss=0.143] 
Steps:  78%|███████▊  | 777/1000 [13:02<03:37,  1.02it/s, lr=1.18e-5, step_loss=0.0951]
Steps:  78%|███████▊  | 777/1000 [13:02<03:37,  1.02it/s, lr=1.18e-5, step_loss=0.0185]
Steps:  78%|███████▊  | 777/1000 [13:02<03:37,  1.02it/s, lr=1.18e-5, step_loss=0.067] 
Steps:  78%|███████▊  | 778/1000 [13:03<03:36,  1.02it/s, lr=1.18e-5, step_loss=0.067]
Steps:  78%|███████▊  | 778/1000 [13:03<03:36,  1.02it/s, lr=1.17e-5, step_loss=0.00191]
Steps:  78%|███████▊  | 778/1000 [13:03<03:36,  1.02it/s, lr=1.17e-5, step_loss=0.0338] 
Steps:  78%|███████▊  | 778/1000 [13:03<03:36,  1.02it/s, lr=1.17e-5, step_loss=0.417] 
Steps:  78%|███████▊  | 778/1000 [13:03<03:36,  1.02it/s, lr=1.17e-5, step_loss=0.0739]
Steps:  78%|███████▊  | 779/1000 [13:04<03:35,  1.03it/s, lr=1.17e-5, step_loss=0.0739]
Steps:  78%|███████▊  | 779/1000 [13:04<03:35,  1.03it/s, lr=1.16e-5, step_loss=0.00432]
Steps:  78%|███████▊  | 779/1000 [13:04<03:35,  1.03it/s, lr=1.16e-5, step_loss=0.0379] 
Steps:  78%|███████▊  | 779/1000 [13:04<03:35,  1.03it/s, lr=1.16e-5, step_loss=0.0892]
Steps:  78%|███████▊  | 779/1000 [13:04<03:35,  1.03it/s, lr=1.16e-5, step_loss=0.0895]
Steps:  78%|███████▊  | 780/1000 [13:05<03:34,  1.03it/s, lr=1.16e-5, step_loss=0.0895]
Steps:  78%|███████▊  | 780/1000 [13:05<03:34,  1.03it/s, lr=1.15e-5, step_loss=0.00351]
Steps:  78%|███████▊  | 780/1000 [13:05<03:34,  1.03it/s, lr=1.15e-5, step_loss=0.0348] 
Steps:  78%|███████▊  | 780/1000 [13:05<03:34,  1.03it/s, lr=1.15e-5, step_loss=0.0865]
Steps:  78%|███████▊  | 780/1000 [13:05<03:34,  1.03it/s, lr=1.15e-5, step_loss=0.0525]
Steps:  78%|███████▊  | 781/1000 [13:06<03:33,  1.02it/s, lr=1.15e-5, step_loss=0.0525]
Steps:  78%|███████▊  | 781/1000 [13:06<03:33,  1.02it/s, lr=1.14e-5, step_loss=0.0443]
Steps:  78%|███████▊  | 781/1000 [13:06<03:33,  1.02it/s, lr=1.14e-5, step_loss=0.00716]
Steps:  78%|███████▊  | 781/1000 [13:06<03:33,  1.02it/s, lr=1.14e-5, step_loss=0.0262] 
Steps:  78%|███████▊  | 781/1000 [13:06<03:33,  1.02it/s, lr=1.14e-5, step_loss=0.0408]
Steps:  78%|███████▊  | 782/1000 [13:07<03:32,  1.02it/s, lr=1.14e-5, step_loss=0.0408]
Steps:  78%|███████▊  | 782/1000 [13:07<03:32,  1.02it/s, lr=1.13e-5, step_loss=0.0133]
Steps:  78%|███████▊  | 782/1000 [13:07<03:32,  1.02it/s, lr=1.13e-5, step_loss=0.00237]
Steps:  78%|███████▊  | 782/1000 [13:07<03:32,  1.02it/s, lr=1.13e-5, step_loss=0.156]  
Steps:  78%|███████▊  | 782/1000 [13:07<03:32,  1.02it/s, lr=1.13e-5, step_loss=0.0138]
Steps:  78%|███████▊  | 783/1000 [13:07<03:31,  1.02it/s, lr=1.13e-5, step_loss=0.0138]
Steps:  78%|███████▊  | 783/1000 [13:08<03:31,  1.02it/s, lr=1.12e-5, step_loss=0.299] 
Steps:  78%|███████▊  | 783/1000 [13:08<03:31,  1.02it/s, lr=1.12e-5, step_loss=0.208]
Steps:  78%|███████▊  | 783/1000 [13:08<03:31,  1.02it/s, lr=1.12e-5, step_loss=0.0923]
Steps:  78%|███████▊  | 783/1000 [13:08<03:31,  1.02it/s, lr=1.12e-5, step_loss=0.00256]
Steps:  78%|███████▊  | 784/1000 [13:08<03:30,  1.02it/s, lr=1.12e-5, step_loss=0.00256]
Steps:  78%|███████▊  | 784/1000 [13:08<03:30,  1.02it/s, lr=1.11e-5, step_loss=0.00326]
Steps:  78%|███████▊  | 784/1000 [13:09<03:30,  1.02it/s, lr=1.11e-5, step_loss=0.00854]
Steps:  78%|███████▊  | 784/1000 [13:09<03:30,  1.02it/s, lr=1.11e-5, step_loss=0.161]  
Steps:  78%|███████▊  | 784/1000 [13:09<03:30,  1.02it/s, lr=1.11e-5, step_loss=0.00277]
Steps:  78%|███████▊  | 785/1000 [13:09<03:29,  1.02it/s, lr=1.11e-5, step_loss=0.00277]
Steps:  78%|███████▊  | 785/1000 [13:09<03:29,  1.02it/s, lr=1.1e-5, step_loss=0.0704]  
Steps:  78%|███████▊  | 785/1000 [13:10<03:29,  1.02it/s, lr=1.1e-5, step_loss=0.339] 
Steps:  78%|███████▊  | 785/1000 [13:10<03:29,  1.02it/s, lr=1.1e-5, step_loss=0.0369]
Steps:  78%|███████▊  | 785/1000 [13:10<03:29,  1.02it/s, lr=1.1e-5, step_loss=0.0651]
Steps:  79%|███████▊  | 786/1000 [13:10<03:28,  1.02it/s, lr=1.1e-5, step_loss=0.0651]
Steps:  79%|███████▊  | 786/1000 [13:10<03:28,  1.02it/s, lr=1.09e-5, step_loss=0.00264]
Steps:  79%|███████▊  | 786/1000 [13:11<03:28,  1.02it/s, lr=1.09e-5, step_loss=0.00273]
Steps:  79%|███████▊  | 786/1000 [13:11<03:28,  1.02it/s, lr=1.09e-5, step_loss=0.0302] 
Steps:  79%|███████▊  | 786/1000 [13:11<03:28,  1.02it/s, lr=1.09e-5, step_loss=0.0822]
Steps:  79%|███████▊  | 787/1000 [13:11<03:28,  1.02it/s, lr=1.09e-5, step_loss=0.0822]
Steps:  79%|███████▊  | 787/1000 [13:11<03:28,  1.02it/s, lr=1.08e-5, step_loss=0.163] 
Steps:  79%|███████▊  | 787/1000 [13:12<03:28,  1.02it/s, lr=1.08e-5, step_loss=0.015]
Steps:  79%|███████▊  | 787/1000 [13:12<03:28,  1.02it/s, lr=1.08e-5, step_loss=0.0617]
Steps:  79%|███████▊  | 787/1000 [13:12<03:28,  1.02it/s, lr=1.08e-5, step_loss=0.0024]
Steps:  79%|███████▉  | 788/1000 [13:12<03:26,  1.02it/s, lr=1.08e-5, step_loss=0.0024]
Steps:  79%|███████▉  | 788/1000 [13:12<03:26,  1.02it/s, lr=1.07e-5, step_loss=0.0361]
Steps:  79%|███████▉  | 788/1000 [13:13<03:26,  1.02it/s, lr=1.07e-5, step_loss=0.0717]
Steps:  79%|███████▉  | 788/1000 [13:13<03:26,  1.02it/s, lr=1.07e-5, step_loss=0.0821]
Steps:  79%|███████▉  | 788/1000 [13:13<03:26,  1.02it/s, lr=1.07e-5, step_loss=0.0646]
Steps:  79%|███████▉  | 789/1000 [13:13<03:25,  1.02it/s, lr=1.07e-5, step_loss=0.0646]
Steps:  79%|███████▉  | 789/1000 [13:13<03:25,  1.02it/s, lr=1.06e-5, step_loss=0.127] 
Steps:  79%|███████▉  | 789/1000 [13:14<03:25,  1.02it/s, lr=1.06e-5, step_loss=0.0186]
Steps:  79%|███████▉  | 789/1000 [13:14<03:25,  1.02it/s, lr=1.06e-5, step_loss=0.0184]
Steps:  79%|███████▉  | 789/1000 [13:14<03:25,  1.02it/s, lr=1.06e-5, step_loss=0.0667]
Steps:  79%|███████▉  | 790/1000 [13:14<03:24,  1.03it/s, lr=1.06e-5, step_loss=0.0667]
Steps:  79%|███████▉  | 790/1000 [13:14<03:24,  1.03it/s, lr=1.05e-5, step_loss=0.272] 
Steps:  79%|███████▉  | 790/1000 [13:15<03:24,  1.03it/s, lr=1.05e-5, step_loss=0.04] 
Steps:  79%|███████▉  | 790/1000 [13:15<03:24,  1.03it/s, lr=1.05e-5, step_loss=0.00445]
Steps:  79%|███████▉  | 790/1000 [13:15<03:24,  1.03it/s, lr=1.05e-5, step_loss=0.0277] 
Steps:  79%|███████▉  | 791/1000 [13:15<03:23,  1.03it/s, lr=1.05e-5, step_loss=0.0277]
Steps:  79%|███████▉  | 791/1000 [13:15<03:23,  1.03it/s, lr=1.04e-5, step_loss=0.0877]
Steps:  79%|███████▉  | 791/1000 [13:16<03:23,  1.03it/s, lr=1.04e-5, step_loss=0.296] 
Steps:  79%|███████▉  | 791/1000 [13:16<03:23,  1.03it/s, lr=1.04e-5, step_loss=0.0888]
Steps:  79%|███████▉  | 791/1000 [13:16<03:23,  1.03it/s, lr=1.04e-5, step_loss=0.124] 
Steps:  79%|███████▉  | 792/1000 [13:16<03:22,  1.03it/s, lr=1.04e-5, step_loss=0.124]
Steps:  79%|███████▉  | 792/1000 [13:16<03:22,  1.03it/s, lr=1.03e-5, step_loss=0.0859]
Steps:  79%|███████▉  | 792/1000 [13:17<03:22,  1.03it/s, lr=1.03e-5, step_loss=0.0279]
Steps:  79%|███████▉  | 792/1000 [13:17<03:22,  1.03it/s, lr=1.03e-5, step_loss=0.308] 
Steps:  79%|███████▉  | 792/1000 [13:17<03:22,  1.03it/s, lr=1.03e-5, step_loss=0.0573]
Steps:  79%|███████▉  | 793/1000 [13:17<03:21,  1.03it/s, lr=1.03e-5, step_loss=0.0573]
Steps:  79%|███████▉  | 793/1000 [13:17<03:21,  1.03it/s, lr=1.02e-5, step_loss=0.117] 
Steps:  79%|███████▉  | 793/1000 [13:18<03:21,  1.03it/s, lr=1.02e-5, step_loss=0.166]
Steps:  79%|███████▉  | 793/1000 [13:18<03:21,  1.03it/s, lr=1.02e-5, step_loss=0.029]
Steps:  79%|███████▉  | 793/1000 [13:18<03:21,  1.03it/s, lr=1.02e-5, step_loss=0.0179]
Steps:  79%|███████▉  | 794/1000 [13:18<03:20,  1.03it/s, lr=1.02e-5, step_loss=0.0179]
Steps:  79%|███████▉  | 794/1000 [13:18<03:20,  1.03it/s, lr=1.01e-5, step_loss=0.00396]
Steps:  79%|███████▉  | 794/1000 [13:18<03:20,  1.03it/s, lr=1.01e-5, step_loss=0.124]  
Steps:  79%|███████▉  | 794/1000 [13:19<03:20,  1.03it/s, lr=1.01e-5, step_loss=0.117]
Steps:  79%|███████▉  | 794/1000 [13:19<03:20,  1.03it/s, lr=1.01e-5, step_loss=0.0692]
Steps:  80%|███████▉  | 795/1000 [13:19<03:19,  1.03it/s, lr=1.01e-5, step_loss=0.0692]
Steps:  80%|███████▉  | 795/1000 [13:19<03:19,  1.03it/s, lr=1e-5, step_loss=0.0019]   
Steps:  80%|███████▉  | 795/1000 [13:19<03:19,  1.03it/s, lr=1e-5, step_loss=0.171] 
Steps:  80%|███████▉  | 795/1000 [13:20<03:19,  1.03it/s, lr=1e-5, step_loss=0.0302]
Steps:  80%|███████▉  | 795/1000 [13:20<03:19,  1.03it/s, lr=1e-5, step_loss=0.163] 
Steps:  80%|███████▉  | 796/1000 [13:20<03:18,  1.03it/s, lr=1e-5, step_loss=0.163]
Steps:  80%|███████▉  | 796/1000 [13:20<03:18,  1.03it/s, lr=9.92e-6, step_loss=0.0886]
Steps:  80%|███████▉  | 796/1000 [13:20<03:18,  1.03it/s, lr=9.92e-6, step_loss=0.0236]
Steps:  80%|███████▉  | 796/1000 [13:21<03:18,  1.03it/s, lr=9.92e-6, step_loss=0.063] 
Steps:  80%|███████▉  | 796/1000 [13:21<03:18,  1.03it/s, lr=9.92e-6, step_loss=0.0175]
Steps:  80%|███████▉  | 797/1000 [13:21<03:17,  1.03it/s, lr=9.92e-6, step_loss=0.0175]
Steps:  80%|███████▉  | 797/1000 [13:21<03:17,  1.03it/s, lr=9.83e-6, step_loss=0.00749]
Steps:  80%|███████▉  | 797/1000 [13:21<03:17,  1.03it/s, lr=9.83e-6, step_loss=0.0235] 
Steps:  80%|███████▉  | 797/1000 [13:22<03:17,  1.03it/s, lr=9.83e-6, step_loss=0.111] 
Steps:  80%|███████▉  | 797/1000 [13:22<03:17,  1.03it/s, lr=9.83e-6, step_loss=0.00283]
Steps:  80%|███████▉  | 798/1000 [13:22<03:16,  1.03it/s, lr=9.83e-6, step_loss=0.00283]
Steps:  80%|███████▉  | 798/1000 [13:22<03:16,  1.03it/s, lr=9.73e-6, step_loss=0.0635] 
Steps:  80%|███████▉  | 798/1000 [13:22<03:16,  1.03it/s, lr=9.73e-6, step_loss=0.0292]
Steps:  80%|███████▉  | 798/1000 [13:23<03:16,  1.03it/s, lr=9.73e-6, step_loss=0.0325]
Steps:  80%|███████▉  | 798/1000 [13:23<03:16,  1.03it/s, lr=9.73e-6, step_loss=0.0715]
Steps:  80%|███████▉  | 799/1000 [13:23<03:15,  1.03it/s, lr=9.73e-6, step_loss=0.0715]
Steps:  80%|███████▉  | 799/1000 [13:23<03:15,  1.03it/s, lr=9.64e-6, step_loss=0.0833]
Steps:  80%|███████▉  | 799/1000 [13:23<03:15,  1.03it/s, lr=9.64e-6, step_loss=0.00614]
Steps:  80%|███████▉  | 799/1000 [13:24<03:15,  1.03it/s, lr=9.64e-6, step_loss=0.0604] 
Steps:  80%|███████▉  | 799/1000 [13:24<03:15,  1.03it/s, lr=9.64e-6, step_loss=0.00883]
Steps:  80%|████████  | 800/1000 [13:24<03:14,  1.03it/s, lr=9.64e-6, step_loss=0.00883]
Steps:  80%|████████  | 800/1000 [13:24<03:14,  1.03it/s, lr=9.55e-6, step_loss=0.104]  
Steps:  80%|████████  | 800/1000 [13:24<03:14,  1.03it/s, lr=9.55e-6, step_loss=0.121]
Steps:  80%|████████  | 800/1000 [13:25<03:14,  1.03it/s, lr=9.55e-6, step_loss=0.141]
Steps:  80%|████████  | 800/1000 [13:25<03:14,  1.03it/s, lr=9.55e-6, step_loss=0.0768]
Steps:  80%|████████  | 801/1000 [13:25<03:13,  1.03it/s, lr=9.55e-6, step_loss=0.0768]
Steps:  80%|████████  | 801/1000 [13:25<03:13,  1.03it/s, lr=9.46e-6, step_loss=0.178] 
Steps:  80%|████████  | 801/1000 [13:25<03:13,  1.03it/s, lr=9.46e-6, step_loss=0.0344]
Steps:  80%|████████  | 801/1000 [13:26<03:13,  1.03it/s, lr=9.46e-6, step_loss=0.0743]
Steps:  80%|████████  | 801/1000 [13:26<03:13,  1.03it/s, lr=9.46e-6, step_loss=0.0808]
Steps:  80%|████████  | 802/1000 [13:26<03:12,  1.03it/s, lr=9.46e-6, step_loss=0.0808]
Steps:  80%|████████  | 802/1000 [13:26<03:12,  1.03it/s, lr=9.37e-6, step_loss=0.0046]
Steps:  80%|████████  | 802/1000 [13:26<03:12,  1.03it/s, lr=9.37e-6, step_loss=0.0721]
Steps:  80%|████████  | 802/1000 [13:27<03:12,  1.03it/s, lr=9.37e-6, step_loss=0.0419]
Steps:  80%|████████  | 802/1000 [13:27<03:12,  1.03it/s, lr=9.37e-6, step_loss=0.0121]
Steps:  80%|████████  | 803/1000 [13:27<03:11,  1.03it/s, lr=9.37e-6, step_loss=0.0121]
Steps:  80%|████████  | 803/1000 [13:27<03:11,  1.03it/s, lr=9.27e-6, step_loss=0.0954]
Steps:  80%|████████  | 803/1000 [13:27<03:11,  1.03it/s, lr=9.27e-6, step_loss=0.053] 
Steps:  80%|████████  | 803/1000 [13:27<03:11,  1.03it/s, lr=9.27e-6, step_loss=0.00522]
Steps:  80%|████████  | 803/1000 [13:28<03:11,  1.03it/s, lr=9.27e-6, step_loss=0.367]  
Steps:  80%|████████  | 804/1000 [13:28<03:11,  1.03it/s, lr=9.27e-6, step_loss=0.367]
Steps:  80%|████████  | 804/1000 [13:28<03:11,  1.03it/s, lr=9.18e-6, step_loss=0.0569]
Steps:  80%|████████  | 804/1000 [13:28<03:11,  1.03it/s, lr=9.18e-6, step_loss=0.0643]
Steps:  80%|████████  | 804/1000 [13:28<03:11,  1.03it/s, lr=9.18e-6, step_loss=0.00739]
Steps:  80%|████████  | 804/1000 [13:29<03:11,  1.03it/s, lr=9.18e-6, step_loss=0.125]  
Steps:  80%|████████  | 805/1000 [13:29<03:10,  1.02it/s, lr=9.18e-6, step_loss=0.125]
Steps:  80%|████████  | 805/1000 [13:29<03:10,  1.02it/s, lr=9.09e-6, step_loss=0.00197]
Steps:  80%|████████  | 805/1000 [13:29<03:10,  1.02it/s, lr=9.09e-6, step_loss=0.0109] 
Steps:  80%|████████  | 805/1000 [13:29<03:10,  1.02it/s, lr=9.09e-6, step_loss=0.0402]
Steps:  80%|████████  | 805/1000 [13:30<03:10,  1.02it/s, lr=9.09e-6, step_loss=0.0117]
Steps:  81%|████████  | 806/1000 [13:30<03:09,  1.02it/s, lr=9.09e-6, step_loss=0.0117]
Steps:  81%|████████  | 806/1000 [13:30<03:09,  1.02it/s, lr=9e-6, step_loss=0.00358]  
Steps:  81%|████████  | 806/1000 [13:30<03:09,  1.02it/s, lr=9e-6, step_loss=0.0864] 
Steps:  81%|████████  | 806/1000 [13:30<03:09,  1.02it/s, lr=9e-6, step_loss=0.0221]
Steps:  81%|████████  | 806/1000 [13:31<03:09,  1.02it/s, lr=9e-6, step_loss=0.0652]
Steps:  81%|████████  | 807/1000 [13:31<03:08,  1.02it/s, lr=9e-6, step_loss=0.0652]
Steps:  81%|████████  | 807/1000 [13:31<03:08,  1.02it/s, lr=8.91e-6, step_loss=0.0232]
Steps:  81%|████████  | 807/1000 [13:31<03:08,  1.02it/s, lr=8.91e-6, step_loss=0.0345]
Steps:  81%|████████  | 807/1000 [13:31<03:08,  1.02it/s, lr=8.91e-6, step_loss=0.191] 
Steps:  81%|████████  | 807/1000 [13:32<03:08,  1.02it/s, lr=8.91e-6, step_loss=0.0503]
Steps:  81%|████████  | 808/1000 [13:32<03:07,  1.02it/s, lr=8.91e-6, step_loss=0.0503]
Steps:  81%|████████  | 808/1000 [13:32<03:07,  1.02it/s, lr=8.82e-6, step_loss=0.00444]
Steps:  81%|████████  | 808/1000 [13:32<03:07,  1.02it/s, lr=8.82e-6, step_loss=0.0995] 
Steps:  81%|████████  | 808/1000 [13:32<03:07,  1.02it/s, lr=8.82e-6, step_loss=0.0473]
Steps:  81%|████████  | 808/1000 [13:33<03:07,  1.02it/s, lr=8.82e-6, step_loss=0.02]  
Steps:  81%|████████  | 809/1000 [13:33<03:06,  1.03it/s, lr=8.82e-6, step_loss=0.02]
Steps:  81%|████████  | 809/1000 [13:33<03:06,  1.03it/s, lr=8.73e-6, step_loss=0.122]
Steps:  81%|████████  | 809/1000 [13:33<03:06,  1.03it/s, lr=8.73e-6, step_loss=0.0687]
Steps:  81%|████████  | 809/1000 [13:33<03:06,  1.03it/s, lr=8.73e-6, step_loss=0.00637]
Steps:  81%|████████  | 809/1000 [13:34<03:06,  1.03it/s, lr=8.73e-6, step_loss=0.0377] 
Steps:  81%|████████  | 810/1000 [13:34<03:05,  1.03it/s, lr=8.73e-6, step_loss=0.0377]
Steps:  81%|████████  | 810/1000 [13:34<03:05,  1.03it/s, lr=8.65e-6, step_loss=0.228] 
Steps:  81%|████████  | 810/1000 [13:34<03:05,  1.03it/s, lr=8.65e-6, step_loss=0.00386]
Steps:  81%|████████  | 810/1000 [13:34<03:05,  1.03it/s, lr=8.65e-6, step_loss=0.0298] 
Steps:  81%|████████  | 810/1000 [13:35<03:05,  1.03it/s, lr=8.65e-6, step_loss=0.0754]
Steps:  81%|████████  | 811/1000 [13:35<03:04,  1.03it/s, lr=8.65e-6, step_loss=0.0754]
Steps:  81%|████████  | 811/1000 [13:35<03:04,  1.03it/s, lr=8.56e-6, step_loss=0.0796]
Steps:  81%|████████  | 811/1000 [13:35<03:04,  1.03it/s, lr=8.56e-6, step_loss=0.0309]
Steps:  81%|████████  | 811/1000 [13:35<03:04,  1.03it/s, lr=8.56e-6, step_loss=0.014] 
Steps:  81%|████████  | 811/1000 [13:36<03:04,  1.03it/s, lr=8.56e-6, step_loss=0.0104]
Steps:  81%|████████  | 812/1000 [13:36<03:03,  1.03it/s, lr=8.56e-6, step_loss=0.0104]
Steps:  81%|████████  | 812/1000 [13:36<03:03,  1.03it/s, lr=8.47e-6, step_loss=0.056] 
Steps:  81%|████████  | 812/1000 [13:36<03:03,  1.03it/s, lr=8.47e-6, step_loss=0.00322]
Steps:  81%|████████  | 812/1000 [13:36<03:03,  1.03it/s, lr=8.47e-6, step_loss=0.0322] 
Steps:  81%|████████  | 812/1000 [13:37<03:03,  1.03it/s, lr=8.47e-6, step_loss=0.12]  
Steps:  81%|████████▏ | 813/1000 [13:37<03:02,  1.03it/s, lr=8.47e-6, step_loss=0.12]
Steps:  81%|████████▏ | 813/1000 [13:37<03:02,  1.03it/s, lr=8.38e-6, step_loss=0.00345]
Steps:  81%|████████▏ | 813/1000 [13:37<03:02,  1.03it/s, lr=8.38e-6, step_loss=0.0141] 
Steps:  81%|████████▏ | 813/1000 [13:37<03:02,  1.03it/s, lr=8.38e-6, step_loss=0.202] 
Steps:  81%|████████▏ | 813/1000 [13:37<03:02,  1.03it/s, lr=8.38e-6, step_loss=0.0539]
Steps:  81%|████████▏ | 814/1000 [13:38<03:01,  1.03it/s, lr=8.38e-6, step_loss=0.0539]
Steps:  81%|████████▏ | 814/1000 [13:38<03:01,  1.03it/s, lr=8.3e-6, step_loss=0.0567] 
Steps:  81%|████████▏ | 814/1000 [13:38<03:01,  1.03it/s, lr=8.3e-6, step_loss=0.061] 
Steps:  81%|████████▏ | 814/1000 [13:38<03:01,  1.03it/s, lr=8.3e-6, step_loss=0.0197]
Steps:  81%|████████▏ | 814/1000 [13:38<03:01,  1.03it/s, lr=8.3e-6, step_loss=0.15]  
Steps:  82%|████████▏ | 815/1000 [13:39<03:00,  1.03it/s, lr=8.3e-6, step_loss=0.15]
Steps:  82%|████████▏ | 815/1000 [13:39<03:00,  1.03it/s, lr=8.21e-6, step_loss=0.0888]
Steps:  82%|████████▏ | 815/1000 [13:39<03:00,  1.03it/s, lr=8.21e-6, step_loss=0.131] 
Steps:  82%|████████▏ | 815/1000 [13:39<03:00,  1.03it/s, lr=8.21e-6, step_loss=0.0871]
Steps:  82%|████████▏ | 815/1000 [13:39<03:00,  1.03it/s, lr=8.21e-6, step_loss=0.0972]
Steps:  82%|████████▏ | 816/1000 [13:40<02:59,  1.03it/s, lr=8.21e-6, step_loss=0.0972]
Steps:  82%|████████▏ | 816/1000 [13:40<02:59,  1.03it/s, lr=8.12e-6, step_loss=0.0319]
Steps:  82%|████████▏ | 816/1000 [13:40<02:59,  1.03it/s, lr=8.12e-6, step_loss=0.0323]
Steps:  82%|████████▏ | 816/1000 [13:40<02:59,  1.03it/s, lr=8.12e-6, step_loss=0.00531]
Steps:  82%|████████▏ | 816/1000 [13:40<02:59,  1.03it/s, lr=8.12e-6, step_loss=0.86]   
Steps:  82%|████████▏ | 817/1000 [13:41<02:58,  1.03it/s, lr=8.12e-6, step_loss=0.86]
Steps:  82%|████████▏ | 817/1000 [13:41<02:58,  1.03it/s, lr=8.04e-6, step_loss=0.00249]
Steps:  82%|████████▏ | 817/1000 [13:41<02:58,  1.03it/s, lr=8.04e-6, step_loss=0.236]  
Steps:  82%|████████▏ | 817/1000 [13:41<02:58,  1.03it/s, lr=8.04e-6, step_loss=0.0399]
Steps:  82%|████████▏ | 817/1000 [13:41<02:58,  1.03it/s, lr=8.04e-6, step_loss=0.238] 
Steps:  82%|████████▏ | 818/1000 [13:42<02:57,  1.03it/s, lr=8.04e-6, step_loss=0.238]
Steps:  82%|████████▏ | 818/1000 [13:42<02:57,  1.03it/s, lr=7.95e-6, step_loss=0.0828]
Steps:  82%|████████▏ | 818/1000 [13:42<02:57,  1.03it/s, lr=7.95e-6, step_loss=0.0191]
Steps:  82%|████████▏ | 818/1000 [13:42<02:57,  1.03it/s, lr=7.95e-6, step_loss=0.0741]
Steps:  82%|████████▏ | 818/1000 [13:42<02:57,  1.03it/s, lr=7.95e-6, step_loss=0.0636]
Steps:  82%|████████▏ | 819/1000 [13:43<02:56,  1.02it/s, lr=7.95e-6, step_loss=0.0636]
Steps:  82%|████████▏ | 819/1000 [13:43<02:56,  1.02it/s, lr=7.87e-6, step_loss=0.342] 
Steps:  82%|████████▏ | 819/1000 [13:43<02:56,  1.02it/s, lr=7.87e-6, step_loss=0.0901]
Steps:  82%|████████▏ | 819/1000 [13:43<02:56,  1.02it/s, lr=7.87e-6, step_loss=0.0179]
Steps:  82%|████████▏ | 819/1000 [13:43<02:56,  1.02it/s, lr=7.87e-6, step_loss=0.00424]
Steps:  82%|████████▏ | 820/1000 [13:44<02:55,  1.02it/s, lr=7.87e-6, step_loss=0.00424]
Steps:  82%|████████▏ | 820/1000 [13:44<02:55,  1.02it/s, lr=7.78e-6, step_loss=0.0262] 
Steps:  82%|████████▏ | 820/1000 [13:44<02:55,  1.02it/s, lr=7.78e-6, step_loss=0.039] 
Steps:  82%|████████▏ | 820/1000 [13:44<02:55,  1.02it/s, lr=7.78e-6, step_loss=0.0402]
Steps:  82%|████████▏ | 820/1000 [13:44<02:55,  1.02it/s, lr=7.78e-6, step_loss=0.114] 
Steps:  82%|████████▏ | 821/1000 [13:45<02:54,  1.02it/s, lr=7.78e-6, step_loss=0.114]
Steps:  82%|████████▏ | 821/1000 [13:45<02:54,  1.02it/s, lr=7.7e-6, step_loss=0.00754]
Steps:  82%|████████▏ | 821/1000 [13:45<02:54,  1.02it/s, lr=7.7e-6, step_loss=0.0941] 
Steps:  82%|████████▏ | 821/1000 [13:45<02:54,  1.02it/s, lr=7.7e-6, step_loss=0.0117]
Steps:  82%|████████▏ | 821/1000 [13:45<02:54,  1.02it/s, lr=7.7e-6, step_loss=0.0588]
Steps:  82%|████████▏ | 822/1000 [13:46<02:53,  1.02it/s, lr=7.7e-6, step_loss=0.0588]
Steps:  82%|████████▏ | 822/1000 [13:46<02:53,  1.02it/s, lr=7.62e-6, step_loss=0.116]
Steps:  82%|████████▏ | 822/1000 [13:46<02:53,  1.02it/s, lr=7.62e-6, step_loss=0.00404]
Steps:  82%|████████▏ | 822/1000 [13:46<02:53,  1.02it/s, lr=7.62e-6, step_loss=0.00609]
Steps:  82%|████████▏ | 822/1000 [13:46<02:53,  1.02it/s, lr=7.62e-6, step_loss=0.075]  
Steps:  82%|████████▏ | 823/1000 [13:46<02:52,  1.02it/s, lr=7.62e-6, step_loss=0.075]
Steps:  82%|████████▏ | 823/1000 [13:47<02:52,  1.02it/s, lr=7.53e-6, step_loss=0.137]
Steps:  82%|████████▏ | 823/1000 [13:47<02:52,  1.02it/s, lr=7.53e-6, step_loss=0.0334]
Steps:  82%|████████▏ | 823/1000 [13:47<02:52,  1.02it/s, lr=7.53e-6, step_loss=0.163] 
Steps:  82%|████████▏ | 823/1000 [13:47<02:52,  1.02it/s, lr=7.53e-6, step_loss=0.182]
Steps:  82%|████████▏ | 824/1000 [13:47<02:51,  1.02it/s, lr=7.53e-6, step_loss=0.182]
Steps:  82%|████████▏ | 824/1000 [13:47<02:51,  1.02it/s, lr=7.45e-6, step_loss=0.16] 
Steps:  82%|████████▏ | 824/1000 [13:48<02:51,  1.02it/s, lr=7.45e-6, step_loss=0.0329]
Steps:  82%|████████▏ | 824/1000 [13:48<02:51,  1.02it/s, lr=7.45e-6, step_loss=0.0286]
Steps:  82%|████████▏ | 824/1000 [13:48<02:51,  1.02it/s, lr=7.45e-6, step_loss=0.019] 
Steps:  82%|████████▎ | 825/1000 [13:48<02:50,  1.03it/s, lr=7.45e-6, step_loss=0.019]
Steps:  82%|████████▎ | 825/1000 [13:48<02:50,  1.03it/s, lr=7.37e-6, step_loss=0.053]
Steps:  82%|████████▎ | 825/1000 [13:49<02:50,  1.03it/s, lr=7.37e-6, step_loss=0.066]
Steps:  82%|████████▎ | 825/1000 [13:49<02:50,  1.03it/s, lr=7.37e-6, step_loss=0.0212]
Steps:  82%|████████▎ | 825/1000 [13:49<02:50,  1.03it/s, lr=7.37e-6, step_loss=0.157] 
Steps:  83%|████████▎ | 826/1000 [13:49<02:49,  1.03it/s, lr=7.37e-6, step_loss=0.157]
Steps:  83%|████████▎ | 826/1000 [13:49<02:49,  1.03it/s, lr=7.29e-6, step_loss=0.0761]
Steps:  83%|████████▎ | 826/1000 [13:50<02:49,  1.03it/s, lr=7.29e-6, step_loss=0.00219]
Steps:  83%|████████▎ | 826/1000 [13:50<02:49,  1.03it/s, lr=7.29e-6, step_loss=0.00803]
Steps:  83%|████████▎ | 826/1000 [13:50<02:49,  1.03it/s, lr=7.29e-6, step_loss=0.245]  
Steps:  83%|████████▎ | 827/1000 [13:50<02:48,  1.03it/s, lr=7.29e-6, step_loss=0.245]
Steps:  83%|████████▎ | 827/1000 [13:50<02:48,  1.03it/s, lr=7.2e-6, step_loss=0.0405]
Steps:  83%|████████▎ | 827/1000 [13:51<02:48,  1.03it/s, lr=7.2e-6, step_loss=0.00776]
Steps:  83%|████████▎ | 827/1000 [13:51<02:48,  1.03it/s, lr=7.2e-6, step_loss=0.00533]
Steps:  83%|████████▎ | 827/1000 [13:51<02:48,  1.03it/s, lr=7.2e-6, step_loss=0.0104] 
Steps:  83%|████████▎ | 828/1000 [13:51<02:47,  1.03it/s, lr=7.2e-6, step_loss=0.0104]
Steps:  83%|████████▎ | 828/1000 [13:51<02:47,  1.03it/s, lr=7.12e-6, step_loss=0.0513]
Steps:  83%|████████▎ | 828/1000 [13:52<02:47,  1.03it/s, lr=7.12e-6, step_loss=0.00442]
Steps:  83%|████████▎ | 828/1000 [13:52<02:47,  1.03it/s, lr=7.12e-6, step_loss=0.0554] 
Steps:  83%|████████▎ | 828/1000 [13:52<02:47,  1.03it/s, lr=7.12e-6, step_loss=0.192] 
Steps:  83%|████████▎ | 829/1000 [13:52<02:46,  1.03it/s, lr=7.12e-6, step_loss=0.192]
Steps:  83%|████████▎ | 829/1000 [13:52<02:46,  1.03it/s, lr=7.04e-6, step_loss=0.144]
Steps:  83%|████████▎ | 829/1000 [13:53<02:46,  1.03it/s, lr=7.04e-6, step_loss=0.0396]
Steps:  83%|████████▎ | 829/1000 [13:53<02:46,  1.03it/s, lr=7.04e-6, step_loss=0.0879]
Steps:  83%|████████▎ | 829/1000 [13:53<02:46,  1.03it/s, lr=7.04e-6, step_loss=0.0407]
Steps:  83%|████████▎ | 830/1000 [13:53<02:45,  1.03it/s, lr=7.04e-6, step_loss=0.0407]
Steps:  83%|████████▎ | 830/1000 [13:53<02:45,  1.03it/s, lr=6.96e-6, step_loss=0.122] 
Steps:  83%|████████▎ | 830/1000 [13:54<02:45,  1.03it/s, lr=6.96e-6, step_loss=0.00926]
Steps:  83%|████████▎ | 830/1000 [13:54<02:45,  1.03it/s, lr=6.96e-6, step_loss=0.298]  
Steps:  83%|████████▎ | 830/1000 [13:54<02:45,  1.03it/s, lr=6.96e-6, step_loss=0.127]
Steps:  83%|████████▎ | 831/1000 [13:54<02:44,  1.03it/s, lr=6.96e-6, step_loss=0.127]
Steps:  83%|████████▎ | 831/1000 [13:54<02:44,  1.03it/s, lr=6.88e-6, step_loss=0.00591]
Steps:  83%|████████▎ | 831/1000 [13:55<02:44,  1.03it/s, lr=6.88e-6, step_loss=0.717]  
Steps:  83%|████████▎ | 831/1000 [13:55<02:44,  1.03it/s, lr=6.88e-6, step_loss=0.0153]
Steps:  83%|████████▎ | 831/1000 [13:55<02:44,  1.03it/s, lr=6.88e-6, step_loss=0.381] 
Steps:  83%|████████▎ | 832/1000 [13:55<02:43,  1.03it/s, lr=6.88e-6, step_loss=0.381]
Steps:  83%|████████▎ | 832/1000 [13:55<02:43,  1.03it/s, lr=6.8e-6, step_loss=0.0364]
Steps:  83%|████████▎ | 832/1000 [13:56<02:43,  1.03it/s, lr=6.8e-6, step_loss=0.0341]
Steps:  83%|████████▎ | 832/1000 [13:56<02:43,  1.03it/s, lr=6.8e-6, step_loss=0.103] 
Steps:  83%|████████▎ | 832/1000 [13:56<02:43,  1.03it/s, lr=6.8e-6, step_loss=0.00253]
Steps:  83%|████████▎ | 833/1000 [13:56<02:42,  1.03it/s, lr=6.8e-6, step_loss=0.00253]
Steps:  83%|████████▎ | 833/1000 [13:56<02:42,  1.03it/s, lr=6.72e-6, step_loss=0.129] 
Steps:  83%|████████▎ | 833/1000 [13:57<02:42,  1.03it/s, lr=6.72e-6, step_loss=0.00842]
Steps:  83%|████████▎ | 833/1000 [13:57<02:42,  1.03it/s, lr=6.72e-6, step_loss=0.00553]
Steps:  83%|████████▎ | 833/1000 [13:57<02:42,  1.03it/s, lr=6.72e-6, step_loss=0.00465]
Steps:  83%|████████▎ | 834/1000 [13:57<02:41,  1.03it/s, lr=6.72e-6, step_loss=0.00465]
Steps:  83%|████████▎ | 834/1000 [13:57<02:41,  1.03it/s, lr=6.65e-6, step_loss=0.0477] 
Steps:  83%|████████▎ | 834/1000 [13:57<02:41,  1.03it/s, lr=6.65e-6, step_loss=0.0879]
Steps:  83%|████████▎ | 834/1000 [13:58<02:41,  1.03it/s, lr=6.65e-6, step_loss=0.0308]
Steps:  83%|████████▎ | 834/1000 [13:58<02:41,  1.03it/s, lr=6.65e-6, step_loss=0.0733]
Steps:  84%|████████▎ | 835/1000 [13:58<02:40,  1.02it/s, lr=6.65e-6, step_loss=0.0733]
Steps:  84%|████████▎ | 835/1000 [13:58<02:40,  1.02it/s, lr=6.57e-6, step_loss=0.173] 
Steps:  84%|████████▎ | 835/1000 [13:58<02:40,  1.02it/s, lr=6.57e-6, step_loss=0.0304]
Steps:  84%|████████▎ | 835/1000 [13:59<02:40,  1.02it/s, lr=6.57e-6, step_loss=0.059] 
Steps:  84%|████████▎ | 835/1000 [13:59<02:40,  1.02it/s, lr=6.57e-6, step_loss=0.0534]
Steps:  84%|████████▎ | 836/1000 [13:59<02:39,  1.03it/s, lr=6.57e-6, step_loss=0.0534]
Steps:  84%|████████▎ | 836/1000 [13:59<02:39,  1.03it/s, lr=6.49e-6, step_loss=0.0326]
Steps:  84%|████████▎ | 836/1000 [13:59<02:39,  1.03it/s, lr=6.49e-6, step_loss=0.0402]
Steps:  84%|████████▎ | 836/1000 [14:00<02:39,  1.03it/s, lr=6.49e-6, step_loss=0.143] 
Steps:  84%|████████▎ | 836/1000 [14:00<02:39,  1.03it/s, lr=6.49e-6, step_loss=0.0232]
Steps:  84%|████████▎ | 837/1000 [14:00<02:39,  1.02it/s, lr=6.49e-6, step_loss=0.0232]
Steps:  84%|████████▎ | 837/1000 [14:00<02:39,  1.02it/s, lr=6.41e-6, step_loss=0.0601]
Steps:  84%|████████▎ | 837/1000 [14:00<02:39,  1.02it/s, lr=6.41e-6, step_loss=0.177] 
Steps:  84%|████████▎ | 837/1000 [14:01<02:39,  1.02it/s, lr=6.41e-6, step_loss=0.0638]
Steps:  84%|████████▎ | 837/1000 [14:01<02:39,  1.02it/s, lr=6.41e-6, step_loss=0.0212]
Steps:  84%|████████▍ | 838/1000 [14:01<02:38,  1.02it/s, lr=6.41e-6, step_loss=0.0212]
Steps:  84%|████████▍ | 838/1000 [14:01<02:38,  1.02it/s, lr=6.34e-6, step_loss=0.00896]
Steps:  84%|████████▍ | 838/1000 [14:01<02:38,  1.02it/s, lr=6.34e-6, step_loss=0.113]  
Steps:  84%|████████▍ | 838/1000 [14:02<02:38,  1.02it/s, lr=6.34e-6, step_loss=0.0414]
Steps:  84%|████████▍ | 838/1000 [14:02<02:38,  1.02it/s, lr=6.34e-6, step_loss=0.00883]
Steps:  84%|████████▍ | 839/1000 [14:02<02:37,  1.02it/s, lr=6.34e-6, step_loss=0.00883]
Steps:  84%|████████▍ | 839/1000 [14:02<02:37,  1.02it/s, lr=6.26e-6, step_loss=0.167]  
Steps:  84%|████████▍ | 839/1000 [14:02<02:37,  1.02it/s, lr=6.26e-6, step_loss=0.167]
Steps:  84%|████████▍ | 839/1000 [14:03<02:37,  1.02it/s, lr=6.26e-6, step_loss=0.204]
Steps:  84%|████████▍ | 839/1000 [14:03<02:37,  1.02it/s, lr=6.26e-6, step_loss=0.0128]
Steps:  84%|████████▍ | 840/1000 [14:03<02:36,  1.03it/s, lr=6.26e-6, step_loss=0.0128]
Steps:  84%|████████▍ | 840/1000 [14:03<02:36,  1.03it/s, lr=6.18e-6, step_loss=0.017] 
Steps:  84%|████████▍ | 840/1000 [14:03<02:36,  1.03it/s, lr=6.18e-6, step_loss=0.142]
Steps:  84%|████████▍ | 840/1000 [14:04<02:36,  1.03it/s, lr=6.18e-6, step_loss=0.176]
Steps:  84%|████████▍ | 840/1000 [14:04<02:36,  1.03it/s, lr=6.18e-6, step_loss=0.0105]
Steps:  84%|████████▍ | 841/1000 [14:04<02:35,  1.03it/s, lr=6.18e-6, step_loss=0.0105]
Steps:  84%|████████▍ | 841/1000 [14:04<02:35,  1.03it/s, lr=6.11e-6, step_loss=0.00722]
Steps:  84%|████████▍ | 841/1000 [14:04<02:35,  1.03it/s, lr=6.11e-6, step_loss=0.0427] 
Steps:  84%|████████▍ | 841/1000 [14:05<02:35,  1.03it/s, lr=6.11e-6, step_loss=0.0548]
Steps:  84%|████████▍ | 841/1000 [14:05<02:35,  1.03it/s, lr=6.11e-6, step_loss=0.00385]
Steps:  84%|████████▍ | 842/1000 [14:05<02:33,  1.03it/s, lr=6.11e-6, step_loss=0.00385]
Steps:  84%|████████▍ | 842/1000 [14:05<02:33,  1.03it/s, lr=6.03e-6, step_loss=0.0388] 
Steps:  84%|████████▍ | 842/1000 [14:05<02:33,  1.03it/s, lr=6.03e-6, step_loss=0.0696]
Steps:  84%|████████▍ | 842/1000 [14:06<02:33,  1.03it/s, lr=6.03e-6, step_loss=0.192] 
Steps:  84%|████████▍ | 842/1000 [14:06<02:33,  1.03it/s, lr=6.03e-6, step_loss=0.0343]
Steps:  84%|████████▍ | 843/1000 [14:06<02:32,  1.03it/s, lr=6.03e-6, step_loss=0.0343]
Steps:  84%|████████▍ | 843/1000 [14:06<02:32,  1.03it/s, lr=5.96e-6, step_loss=0.0041]
Steps:  84%|████████▍ | 843/1000 [14:06<02:32,  1.03it/s, lr=5.96e-6, step_loss=0.0824]
Steps:  84%|████████▍ | 843/1000 [14:07<02:32,  1.03it/s, lr=5.96e-6, step_loss=0.0368]
Steps:  84%|████████▍ | 843/1000 [14:07<02:32,  1.03it/s, lr=5.96e-6, step_loss=0.081] 
Steps:  84%|████████▍ | 844/1000 [14:07<02:32,  1.03it/s, lr=5.96e-6, step_loss=0.081]
Steps:  84%|████████▍ | 844/1000 [14:07<02:32,  1.03it/s, lr=5.89e-6, step_loss=0.0114]
Steps:  84%|████████▍ | 844/1000 [14:07<02:32,  1.03it/s, lr=5.89e-6, step_loss=0.0204]
Steps:  84%|████████▍ | 844/1000 [14:07<02:32,  1.03it/s, lr=5.89e-6, step_loss=0.384] 
Steps:  84%|████████▍ | 844/1000 [14:08<02:32,  1.03it/s, lr=5.89e-6, step_loss=0.00226]
Steps:  84%|████████▍ | 845/1000 [14:08<02:31,  1.03it/s, lr=5.89e-6, step_loss=0.00226]
Steps:  84%|████████▍ | 845/1000 [14:08<02:31,  1.03it/s, lr=5.81e-6, step_loss=0.0529] 
Steps:  84%|████████▍ | 845/1000 [14:08<02:31,  1.03it/s, lr=5.81e-6, step_loss=0.232] 
Steps:  84%|████████▍ | 845/1000 [14:08<02:31,  1.03it/s, lr=5.81e-6, step_loss=0.003]
Steps:  84%|████████▍ | 845/1000 [14:09<02:31,  1.03it/s, lr=5.81e-6, step_loss=0.0442]
Steps:  85%|████████▍ | 846/1000 [14:09<02:30,  1.03it/s, lr=5.81e-6, step_loss=0.0442]
Steps:  85%|████████▍ | 846/1000 [14:09<02:30,  1.03it/s, lr=5.74e-6, step_loss=0.00666]
Steps:  85%|████████▍ | 846/1000 [14:09<02:30,  1.03it/s, lr=5.74e-6, step_loss=0.00703]
Steps:  85%|████████▍ | 846/1000 [14:09<02:30,  1.03it/s, lr=5.74e-6, step_loss=0.0647] 
Steps:  85%|████████▍ | 846/1000 [14:10<02:30,  1.03it/s, lr=5.74e-6, step_loss=0.0476]
Steps:  85%|████████▍ | 847/1000 [14:10<02:29,  1.03it/s, lr=5.74e-6, step_loss=0.0476]
Steps:  85%|████████▍ | 847/1000 [14:10<02:29,  1.03it/s, lr=5.67e-6, step_loss=0.06]  
Steps:  85%|████████▍ | 847/1000 [14:10<02:29,  1.03it/s, lr=5.67e-6, step_loss=0.0765]
Steps:  85%|████████▍ | 847/1000 [14:10<02:29,  1.03it/s, lr=5.67e-6, step_loss=0.00227]
Steps:  85%|████████▍ | 847/1000 [14:11<02:29,  1.03it/s, lr=5.67e-6, step_loss=0.0859] 
Steps:  85%|████████▍ | 848/1000 [14:11<02:28,  1.03it/s, lr=5.67e-6, step_loss=0.0859]
Steps:  85%|████████▍ | 848/1000 [14:11<02:28,  1.03it/s, lr=5.59e-6, step_loss=0.0886]
Steps:  85%|████████▍ | 848/1000 [14:11<02:28,  1.03it/s, lr=5.59e-6, step_loss=0.223] 
Steps:  85%|████████▍ | 848/1000 [14:11<02:28,  1.03it/s, lr=5.59e-6, step_loss=0.221]
Steps:  85%|████████▍ | 848/1000 [14:12<02:28,  1.03it/s, lr=5.59e-6, step_loss=0.0527]
Steps:  85%|████████▍ | 849/1000 [14:12<02:27,  1.03it/s, lr=5.59e-6, step_loss=0.0527]
Steps:  85%|████████▍ | 849/1000 [14:12<02:27,  1.03it/s, lr=5.52e-6, step_loss=0.0142]
Steps:  85%|████████▍ | 849/1000 [14:12<02:27,  1.03it/s, lr=5.52e-6, step_loss=0.0807]
Steps:  85%|████████▍ | 849/1000 [14:12<02:27,  1.03it/s, lr=5.52e-6, step_loss=0.0257]
Steps:  85%|████████▍ | 849/1000 [14:13<02:27,  1.03it/s, lr=5.52e-6, step_loss=0.0803]
Steps:  85%|████████▌ | 850/1000 [14:13<02:26,  1.03it/s, lr=5.52e-6, step_loss=0.0803]
Steps:  85%|████████▌ | 850/1000 [14:13<02:26,  1.03it/s, lr=5.45e-6, step_loss=0.0256]
Steps:  85%|████████▌ | 850/1000 [14:13<02:26,  1.03it/s, lr=5.45e-6, step_loss=0.117] 
Steps:  85%|████████▌ | 850/1000 [14:13<02:26,  1.03it/s, lr=5.45e-6, step_loss=0.0312]
Steps:  85%|████████▌ | 850/1000 [14:14<02:26,  1.03it/s, lr=5.45e-6, step_loss=0.152] 
Steps:  85%|████████▌ | 851/1000 [14:14<02:25,  1.03it/s, lr=5.45e-6, step_loss=0.152]
Steps:  85%|████████▌ | 851/1000 [14:14<02:25,  1.03it/s, lr=5.38e-6, step_loss=0.00977]
Steps:  85%|████████▌ | 851/1000 [14:14<02:25,  1.03it/s, lr=5.38e-6, step_loss=0.00362]
Steps:  85%|████████▌ | 851/1000 [14:14<02:25,  1.03it/s, lr=5.38e-6, step_loss=0.0303] 
Steps:  85%|████████▌ | 851/1000 [14:15<02:25,  1.03it/s, lr=5.38e-6, step_loss=0.0477]
Steps:  85%|████████▌ | 852/1000 [14:15<02:24,  1.03it/s, lr=5.38e-6, step_loss=0.0477]
Steps:  85%|████████▌ | 852/1000 [14:15<02:24,  1.03it/s, lr=5.31e-6, step_loss=0.104] 
Steps:  85%|████████▌ | 852/1000 [14:15<02:24,  1.03it/s, lr=5.31e-6, step_loss=0.0423]
Steps:  85%|████████▌ | 852/1000 [14:15<02:24,  1.03it/s, lr=5.31e-6, step_loss=0.014] 
Steps:  85%|████████▌ | 852/1000 [14:16<02:24,  1.03it/s, lr=5.31e-6, step_loss=0.0735]
Steps:  85%|████████▌ | 853/1000 [14:16<02:23,  1.03it/s, lr=5.31e-6, step_loss=0.0735]
Steps:  85%|████████▌ | 853/1000 [14:16<02:23,  1.03it/s, lr=5.24e-6, step_loss=0.0988]
Steps:  85%|████████▌ | 853/1000 [14:16<02:23,  1.03it/s, lr=5.24e-6, step_loss=0.119] 
Steps:  85%|████████▌ | 853/1000 [14:16<02:23,  1.03it/s, lr=5.24e-6, step_loss=0.0783]
Steps:  85%|████████▌ | 853/1000 [14:16<02:23,  1.03it/s, lr=5.24e-6, step_loss=0.0317]
Steps:  85%|████████▌ | 854/1000 [14:17<02:22,  1.03it/s, lr=5.24e-6, step_loss=0.0317]
Steps:  85%|████████▌ | 854/1000 [14:17<02:22,  1.03it/s, lr=5.17e-6, step_loss=0.0516]
Steps:  85%|████████▌ | 854/1000 [14:17<02:22,  1.03it/s, lr=5.17e-6, step_loss=0.0092]
Steps:  85%|████████▌ | 854/1000 [14:17<02:22,  1.03it/s, lr=5.17e-6, step_loss=0.309] 
Steps:  85%|████████▌ | 854/1000 [14:17<02:22,  1.03it/s, lr=5.17e-6, step_loss=0.0112]
Steps:  86%|████████▌ | 855/1000 [14:18<02:21,  1.03it/s, lr=5.17e-6, step_loss=0.0112]
Steps:  86%|████████▌ | 855/1000 [14:18<02:21,  1.03it/s, lr=5.1e-6, step_loss=0.0951] 
Steps:  86%|████████▌ | 855/1000 [14:18<02:21,  1.03it/s, lr=5.1e-6, step_loss=0.0355]
Steps:  86%|████████▌ | 855/1000 [14:18<02:21,  1.03it/s, lr=5.1e-6, step_loss=0.0539]
Steps:  86%|████████▌ | 855/1000 [14:18<02:21,  1.03it/s, lr=5.1e-6, step_loss=0.016] 
Steps:  86%|████████▌ | 856/1000 [14:19<02:20,  1.03it/s, lr=5.1e-6, step_loss=0.016]
Steps:  86%|████████▌ | 856/1000 [14:19<02:20,  1.03it/s, lr=5.03e-6, step_loss=0.0025]
Steps:  86%|████████▌ | 856/1000 [14:19<02:20,  1.03it/s, lr=5.03e-6, step_loss=0.0103]
Steps:  86%|████████▌ | 856/1000 [14:19<02:20,  1.03it/s, lr=5.03e-6, step_loss=0.0349]
Steps:  86%|████████▌ | 856/1000 [14:19<02:20,  1.03it/s, lr=5.03e-6, step_loss=0.0388]
Steps:  86%|████████▌ | 857/1000 [14:20<02:19,  1.03it/s, lr=5.03e-6, step_loss=0.0388]
Steps:  86%|████████▌ | 857/1000 [14:20<02:19,  1.03it/s, lr=4.96e-6, step_loss=0.0327]
Steps:  86%|████████▌ | 857/1000 [14:20<02:19,  1.03it/s, lr=4.96e-6, step_loss=0.0154]
Steps:  86%|████████▌ | 857/1000 [14:20<02:19,  1.03it/s, lr=4.96e-6, step_loss=0.151] 
Steps:  86%|████████▌ | 857/1000 [14:20<02:19,  1.03it/s, lr=4.96e-6, step_loss=0.00562]
Steps:  86%|████████▌ | 858/1000 [14:21<02:18,  1.03it/s, lr=4.96e-6, step_loss=0.00562]
Steps:  86%|████████▌ | 858/1000 [14:21<02:18,  1.03it/s, lr=4.89e-6, step_loss=0.0565] 
Steps:  86%|████████▌ | 858/1000 [14:21<02:18,  1.03it/s, lr=4.89e-6, step_loss=0.0189]
Steps:  86%|████████▌ | 858/1000 [14:21<02:18,  1.03it/s, lr=4.89e-6, step_loss=0.0344]
Steps:  86%|████████▌ | 858/1000 [14:21<02:18,  1.03it/s, lr=4.89e-6, step_loss=0.0047]
Steps:  86%|████████▌ | 859/1000 [14:22<02:17,  1.03it/s, lr=4.89e-6, step_loss=0.0047]
Steps:  86%|████████▌ | 859/1000 [14:22<02:17,  1.03it/s, lr=4.83e-6, step_loss=0.00949]
Steps:  86%|████████▌ | 859/1000 [14:22<02:17,  1.03it/s, lr=4.83e-6, step_loss=0.0155] 
Steps:  86%|████████▌ | 859/1000 [14:22<02:17,  1.03it/s, lr=4.83e-6, step_loss=0.209] 
Steps:  86%|████████▌ | 859/1000 [14:22<02:17,  1.03it/s, lr=4.83e-6, step_loss=0.0253]
Steps:  86%|████████▌ | 860/1000 [14:23<02:16,  1.03it/s, lr=4.83e-6, step_loss=0.0253]
Steps:  86%|████████▌ | 860/1000 [14:23<02:16,  1.03it/s, lr=4.76e-6, step_loss=0.113] 
Steps:  86%|████████▌ | 860/1000 [14:23<02:16,  1.03it/s, lr=4.76e-6, step_loss=0.0186]
Steps:  86%|████████▌ | 860/1000 [14:23<02:16,  1.03it/s, lr=4.76e-6, step_loss=0.0982]
Steps:  86%|████████▌ | 860/1000 [14:23<02:16,  1.03it/s, lr=4.76e-6, step_loss=0.0796]
Steps:  86%|████████▌ | 861/1000 [14:24<02:15,  1.03it/s, lr=4.76e-6, step_loss=0.0796]
Steps:  86%|████████▌ | 861/1000 [14:24<02:15,  1.03it/s, lr=4.69e-6, step_loss=0.083] 
Steps:  86%|████████▌ | 861/1000 [14:24<02:15,  1.03it/s, lr=4.69e-6, step_loss=0.0149]
Steps:  86%|████████▌ | 861/1000 [14:24<02:15,  1.03it/s, lr=4.69e-6, step_loss=0.00383]
Steps:  86%|████████▌ | 861/1000 [14:24<02:15,  1.03it/s, lr=4.69e-6, step_loss=0.0915] 
Steps:  86%|████████▌ | 862/1000 [14:24<02:14,  1.03it/s, lr=4.69e-6, step_loss=0.0915]
Steps:  86%|████████▌ | 862/1000 [14:25<02:14,  1.03it/s, lr=4.63e-6, step_loss=0.0728]
Steps:  86%|████████▌ | 862/1000 [14:25<02:14,  1.03it/s, lr=4.63e-6, step_loss=0.00273]
Steps:  86%|████████▌ | 862/1000 [14:25<02:14,  1.03it/s, lr=4.63e-6, step_loss=0.0297] 
Steps:  86%|████████▌ | 862/1000 [14:25<02:14,  1.03it/s, lr=4.63e-6, step_loss=0.119] 
Steps:  86%|████████▋ | 863/1000 [14:25<02:13,  1.03it/s, lr=4.63e-6, step_loss=0.119]
Steps:  86%|████████▋ | 863/1000 [14:26<02:13,  1.03it/s, lr=4.56e-6, step_loss=0.016]
Steps:  86%|████████▋ | 863/1000 [14:26<02:13,  1.03it/s, lr=4.56e-6, step_loss=0.0234]
Steps:  86%|████████▋ | 863/1000 [14:26<02:13,  1.03it/s, lr=4.56e-6, step_loss=0.153] 
Steps:  86%|████████▋ | 863/1000 [14:26<02:13,  1.03it/s, lr=4.56e-6, step_loss=0.0297]
Steps:  86%|████████▋ | 864/1000 [14:26<02:12,  1.03it/s, lr=4.56e-6, step_loss=0.0297]
Steps:  86%|████████▋ | 864/1000 [14:26<02:12,  1.03it/s, lr=4.49e-6, step_loss=0.146] 
Steps:  86%|████████▋ | 864/1000 [14:27<02:12,  1.03it/s, lr=4.49e-6, step_loss=0.235]
Steps:  86%|████████▋ | 864/1000 [14:27<02:12,  1.03it/s, lr=4.49e-6, step_loss=0.0045]
Steps:  86%|████████▋ | 864/1000 [14:27<02:12,  1.03it/s, lr=4.49e-6, step_loss=0.0207]
Steps:  86%|████████▋ | 865/1000 [14:27<02:11,  1.03it/s, lr=4.49e-6, step_loss=0.0207]
Steps:  86%|████████▋ | 865/1000 [14:27<02:11,  1.03it/s, lr=4.43e-6, step_loss=0.0466]
Steps:  86%|████████▋ | 865/1000 [14:28<02:11,  1.03it/s, lr=4.43e-6, step_loss=0.0496]
Steps:  86%|████████▋ | 865/1000 [14:28<02:11,  1.03it/s, lr=4.43e-6, step_loss=0.0293]
Steps:  86%|████████▋ | 865/1000 [14:28<02:11,  1.03it/s, lr=4.43e-6, step_loss=0.0674]
Steps:  87%|████████▋ | 866/1000 [14:28<02:10,  1.03it/s, lr=4.43e-6, step_loss=0.0674]
Steps:  87%|████████▋ | 866/1000 [14:28<02:10,  1.03it/s, lr=4.37e-6, step_loss=0.00469]
Steps:  87%|████████▋ | 866/1000 [14:29<02:10,  1.03it/s, lr=4.37e-6, step_loss=0.072]  
Steps:  87%|████████▋ | 866/1000 [14:29<02:10,  1.03it/s, lr=4.37e-6, step_loss=0.109]
Steps:  87%|████████▋ | 866/1000 [14:29<02:10,  1.03it/s, lr=4.37e-6, step_loss=0.0138]
Steps:  87%|████████▋ | 867/1000 [14:29<02:10,  1.02it/s, lr=4.37e-6, step_loss=0.0138]
Steps:  87%|████████▋ | 867/1000 [14:29<02:10,  1.02it/s, lr=4.3e-6, step_loss=0.238]  
Steps:  87%|████████▋ | 867/1000 [14:30<02:10,  1.02it/s, lr=4.3e-6, step_loss=0.022]
Steps:  87%|████████▋ | 867/1000 [14:30<02:10,  1.02it/s, lr=4.3e-6, step_loss=0.0127]
Steps:  87%|████████▋ | 867/1000 [14:30<02:10,  1.02it/s, lr=4.3e-6, step_loss=0.00929]
Steps:  87%|████████▋ | 868/1000 [14:30<02:09,  1.02it/s, lr=4.3e-6, step_loss=0.00929]
Steps:  87%|████████▋ | 868/1000 [14:30<02:09,  1.02it/s, lr=4.24e-6, step_loss=0.0402]
Steps:  87%|████████▋ | 868/1000 [14:31<02:09,  1.02it/s, lr=4.24e-6, step_loss=0.067] 
Steps:  87%|████████▋ | 868/1000 [14:31<02:09,  1.02it/s, lr=4.24e-6, step_loss=0.00845]
Steps:  87%|████████▋ | 868/1000 [14:31<02:09,  1.02it/s, lr=4.24e-6, step_loss=0.182]  
Steps:  87%|████████▋ | 869/1000 [14:31<02:08,  1.02it/s, lr=4.24e-6, step_loss=0.182]
Steps:  87%|████████▋ | 869/1000 [14:31<02:08,  1.02it/s, lr=4.17e-6, step_loss=0.0839]
Steps:  87%|████████▋ | 869/1000 [14:32<02:08,  1.02it/s, lr=4.17e-6, step_loss=0.00275]
Steps:  87%|████████▋ | 869/1000 [14:32<02:08,  1.02it/s, lr=4.17e-6, step_loss=0.0219] 
Steps:  87%|████████▋ | 869/1000 [14:32<02:08,  1.02it/s, lr=4.17e-6, step_loss=0.0172]
Steps:  87%|████████▋ | 870/1000 [14:32<02:07,  1.02it/s, lr=4.17e-6, step_loss=0.0172]
Steps:  87%|████████▋ | 870/1000 [14:32<02:07,  1.02it/s, lr=4.11e-6, step_loss=0.0322]
Steps:  87%|████████▋ | 870/1000 [14:33<02:07,  1.02it/s, lr=4.11e-6, step_loss=0.00342]
Steps:  87%|████████▋ | 870/1000 [14:33<02:07,  1.02it/s, lr=4.11e-6, step_loss=0.0256] 
Steps:  87%|████████▋ | 870/1000 [14:33<02:07,  1.02it/s, lr=4.11e-6, step_loss=0.016] 
Steps:  87%|████████▋ | 871/1000 [14:33<02:06,  1.02it/s, lr=4.11e-6, step_loss=0.016]
Steps:  87%|████████▋ | 871/1000 [14:33<02:06,  1.02it/s, lr=4.05e-6, step_loss=0.541]
Steps:  87%|████████▋ | 871/1000 [14:34<02:06,  1.02it/s, lr=4.05e-6, step_loss=0.0793]
Steps:  87%|████████▋ | 871/1000 [14:34<02:06,  1.02it/s, lr=4.05e-6, step_loss=0.0803]
Steps:  87%|████████▋ | 871/1000 [14:34<02:06,  1.02it/s, lr=4.05e-6, step_loss=0.127] 
Steps:  87%|████████▋ | 872/1000 [14:34<02:05,  1.02it/s, lr=4.05e-6, step_loss=0.127]
Steps:  87%|████████▋ | 872/1000 [14:34<02:05,  1.02it/s, lr=3.99e-6, step_loss=0.00429]
Steps:  87%|████████▋ | 872/1000 [14:35<02:05,  1.02it/s, lr=3.99e-6, step_loss=0.18]   
Steps:  87%|████████▋ | 872/1000 [14:35<02:05,  1.02it/s, lr=3.99e-6, step_loss=0.105]
Steps:  87%|████████▋ | 872/1000 [14:35<02:05,  1.02it/s, lr=3.99e-6, step_loss=0.0132]
Steps:  87%|████████▋ | 873/1000 [14:35<02:04,  1.02it/s, lr=3.99e-6, step_loss=0.0132]
Steps:  87%|████████▋ | 873/1000 [14:35<02:04,  1.02it/s, lr=3.93e-6, step_loss=0.00707]
Steps:  87%|████████▋ | 873/1000 [14:36<02:04,  1.02it/s, lr=3.93e-6, step_loss=0.0854] 
Steps:  87%|████████▋ | 873/1000 [14:36<02:04,  1.02it/s, lr=3.93e-6, step_loss=0.0244]
Steps:  87%|████████▋ | 873/1000 [14:36<02:04,  1.02it/s, lr=3.93e-6, step_loss=0.0186]
Steps:  87%|████████▋ | 874/1000 [14:36<02:03,  1.02it/s, lr=3.93e-6, step_loss=0.0186]
Steps:  87%|████████▋ | 874/1000 [14:36<02:03,  1.02it/s, lr=3.87e-6, step_loss=0.0241]
Steps:  87%|████████▋ | 874/1000 [14:37<02:03,  1.02it/s, lr=3.87e-6, step_loss=0.525] 
Steps:  87%|████████▋ | 874/1000 [14:37<02:03,  1.02it/s, lr=3.87e-6, step_loss=0.0855]
Steps:  87%|████████▋ | 874/1000 [14:37<02:03,  1.02it/s, lr=3.87e-6, step_loss=0.29]  
Steps:  88%|████████▊ | 875/1000 [14:37<02:02,  1.02it/s, lr=3.87e-6, step_loss=0.29]
Steps:  88%|████████▊ | 875/1000 [14:37<02:02,  1.02it/s, lr=3.81e-6, step_loss=0.0201]
Steps:  88%|████████▊ | 875/1000 [14:37<02:02,  1.02it/s, lr=3.81e-6, step_loss=0.00572]
Steps:  88%|████████▊ | 875/1000 [14:38<02:02,  1.02it/s, lr=3.81e-6, step_loss=0.0119] 
Steps:  88%|████████▊ | 875/1000 [14:38<02:02,  1.02it/s, lr=3.81e-6, step_loss=0.0431]
Steps:  88%|████████▊ | 876/1000 [14:38<02:01,  1.02it/s, lr=3.81e-6, step_loss=0.0431]
Steps:  88%|████████▊ | 876/1000 [14:38<02:01,  1.02it/s, lr=3.75e-6, step_loss=0.0458]
Steps:  88%|████████▊ | 876/1000 [14:38<02:01,  1.02it/s, lr=3.75e-6, step_loss=0.164] 
Steps:  88%|████████▊ | 876/1000 [14:39<02:01,  1.02it/s, lr=3.75e-6, step_loss=0.139]
Steps:  88%|████████▊ | 876/1000 [14:39<02:01,  1.02it/s, lr=3.75e-6, step_loss=0.112]
Steps:  88%|████████▊ | 877/1000 [14:39<02:00,  1.02it/s, lr=3.75e-6, step_loss=0.112]
Steps:  88%|████████▊ | 877/1000 [14:39<02:00,  1.02it/s, lr=3.69e-6, step_loss=0.0436]
Steps:  88%|████████▊ | 877/1000 [14:39<02:00,  1.02it/s, lr=3.69e-6, step_loss=0.0356]
Steps:  88%|████████▊ | 877/1000 [14:40<02:00,  1.02it/s, lr=3.69e-6, step_loss=0.0203]
Steps:  88%|████████▊ | 877/1000 [14:40<02:00,  1.02it/s, lr=3.69e-6, step_loss=0.061] 
Steps:  88%|████████▊ | 878/1000 [14:40<01:59,  1.02it/s, lr=3.69e-6, step_loss=0.061]
Steps:  88%|████████▊ | 878/1000 [14:40<01:59,  1.02it/s, lr=3.63e-6, step_loss=0.0667]
Steps:  88%|████████▊ | 878/1000 [14:40<01:59,  1.02it/s, lr=3.63e-6, step_loss=0.0855]
Steps:  88%|████████▊ | 878/1000 [14:41<01:59,  1.02it/s, lr=3.63e-6, step_loss=0.121] 
Steps:  88%|████████▊ | 878/1000 [14:41<01:59,  1.02it/s, lr=3.63e-6, step_loss=0.0499]
Steps:  88%|████████▊ | 879/1000 [14:41<01:58,  1.02it/s, lr=3.63e-6, step_loss=0.0499]
Steps:  88%|████████▊ | 879/1000 [14:41<01:58,  1.02it/s, lr=3.57e-6, step_loss=0.296] 
Steps:  88%|████████▊ | 879/1000 [14:41<01:58,  1.02it/s, lr=3.57e-6, step_loss=0.0468]
Steps:  88%|████████▊ | 879/1000 [14:42<01:58,  1.02it/s, lr=3.57e-6, step_loss=0.0102]
Steps:  88%|████████▊ | 879/1000 [14:42<01:58,  1.02it/s, lr=3.57e-6, step_loss=0.128] 
Steps:  88%|████████▊ | 880/1000 [14:42<01:57,  1.02it/s, lr=3.57e-6, step_loss=0.128]
Steps:  88%|████████▊ | 880/1000 [14:42<01:57,  1.02it/s, lr=3.51e-6, step_loss=0.0487]
Steps:  88%|████████▊ | 880/1000 [14:42<01:57,  1.02it/s, lr=3.51e-6, step_loss=0.0951]
Steps:  88%|████████▊ | 880/1000 [14:43<01:57,  1.02it/s, lr=3.51e-6, step_loss=0.00699]
Steps:  88%|████████▊ | 880/1000 [14:43<01:57,  1.02it/s, lr=3.51e-6, step_loss=0.0498] 
Steps:  88%|████████▊ | 881/1000 [14:43<01:56,  1.02it/s, lr=3.51e-6, step_loss=0.0498]
Steps:  88%|████████▊ | 881/1000 [14:43<01:56,  1.02it/s, lr=3.45e-6, step_loss=0.089] 
Steps:  88%|████████▊ | 881/1000 [14:43<01:56,  1.02it/s, lr=3.45e-6, step_loss=0.114]
Steps:  88%|████████▊ | 881/1000 [14:44<01:56,  1.02it/s, lr=3.45e-6, step_loss=0.471]
Steps:  88%|████████▊ | 881/1000 [14:44<01:56,  1.02it/s, lr=3.45e-6, step_loss=0.0668]
Steps:  88%|████████▊ | 882/1000 [14:44<01:55,  1.02it/s, lr=3.45e-6, step_loss=0.0668]
Steps:  88%|████████▊ | 882/1000 [14:44<01:55,  1.02it/s, lr=3.4e-6, step_loss=0.826]  
Steps:  88%|████████▊ | 882/1000 [14:44<01:55,  1.02it/s, lr=3.4e-6, step_loss=0.112]
Steps:  88%|████████▊ | 882/1000 [14:45<01:55,  1.02it/s, lr=3.4e-6, step_loss=0.118]
Steps:  88%|████████▊ | 882/1000 [14:45<01:55,  1.02it/s, lr=3.4e-6, step_loss=0.128]
Steps:  88%|████████▊ | 883/1000 [14:45<01:54,  1.02it/s, lr=3.4e-6, step_loss=0.128]
Steps:  88%|████████▊ | 883/1000 [14:45<01:54,  1.02it/s, lr=3.34e-6, step_loss=0.0296]
Steps:  88%|████████▊ | 883/1000 [14:45<01:54,  1.02it/s, lr=3.34e-6, step_loss=0.0372]
Steps:  88%|████████▊ | 883/1000 [14:46<01:54,  1.02it/s, lr=3.34e-6, step_loss=0.00275]
Steps:  88%|████████▊ | 883/1000 [14:46<01:54,  1.02it/s, lr=3.34e-6, step_loss=0.0296] 
Steps:  88%|████████▊ | 884/1000 [14:46<01:53,  1.02it/s, lr=3.34e-6, step_loss=0.0296]
Steps:  88%|████████▊ | 884/1000 [14:46<01:53,  1.02it/s, lr=3.28e-6, step_loss=0.0129]
Steps:  88%|████████▊ | 884/1000 [14:46<01:53,  1.02it/s, lr=3.28e-6, step_loss=0.0231]
Steps:  88%|████████▊ | 884/1000 [14:47<01:53,  1.02it/s, lr=3.28e-6, step_loss=0.108] 
Steps:  88%|████████▊ | 884/1000 [14:47<01:53,  1.02it/s, lr=3.28e-6, step_loss=0.051]
Steps:  88%|████████▊ | 885/1000 [14:47<01:52,  1.02it/s, lr=3.28e-6, step_loss=0.051]
Steps:  88%|████████▊ | 885/1000 [14:47<01:52,  1.02it/s, lr=3.23e-6, step_loss=0.157]
Steps:  88%|████████▊ | 885/1000 [14:47<01:52,  1.02it/s, lr=3.23e-6, step_loss=0.0213]
Steps:  88%|████████▊ | 885/1000 [14:48<01:52,  1.02it/s, lr=3.23e-6, step_loss=0.0552]
Steps:  88%|████████▊ | 885/1000 [14:48<01:52,  1.02it/s, lr=3.23e-6, step_loss=0.12]  
Steps:  89%|████████▊ | 886/1000 [14:48<01:51,  1.02it/s, lr=3.23e-6, step_loss=0.12]
Steps:  89%|████████▊ | 886/1000 [14:48<01:51,  1.02it/s, lr=3.17e-6, step_loss=0.00292]
Steps:  89%|████████▊ | 886/1000 [14:48<01:51,  1.02it/s, lr=3.17e-6, step_loss=0.566]  
Steps:  89%|████████▊ | 886/1000 [14:48<01:51,  1.02it/s, lr=3.17e-6, step_loss=0.0262]
Steps:  89%|████████▊ | 886/1000 [14:49<01:51,  1.02it/s, lr=3.17e-6, step_loss=0.2]   
Steps:  89%|████████▊ | 887/1000 [14:49<01:50,  1.03it/s, lr=3.17e-6, step_loss=0.2]
Steps:  89%|████████▊ | 887/1000 [14:49<01:50,  1.03it/s, lr=3.12e-6, step_loss=0.0172]
Steps:  89%|████████▊ | 887/1000 [14:49<01:50,  1.03it/s, lr=3.12e-6, step_loss=0.171] 
Steps:  89%|████████▊ | 887/1000 [14:49<01:50,  1.03it/s, lr=3.12e-6, step_loss=0.0207]
Steps:  89%|████████▊ | 887/1000 [14:50<01:50,  1.03it/s, lr=3.12e-6, step_loss=0.205] 
Steps:  89%|████████▉ | 888/1000 [14:50<01:49,  1.02it/s, lr=3.12e-6, step_loss=0.205]
Steps:  89%|████████▉ | 888/1000 [14:50<01:49,  1.02it/s, lr=3.06e-6, step_loss=0.0153]
Steps:  89%|████████▉ | 888/1000 [14:50<01:49,  1.02it/s, lr=3.06e-6, step_loss=0.598] 
Steps:  89%|████████▉ | 888/1000 [14:50<01:49,  1.02it/s, lr=3.06e-6, step_loss=0.00706]
Steps:  89%|████████▉ | 888/1000 [14:51<01:49,  1.02it/s, lr=3.06e-6, step_loss=0.184]  
Steps:  89%|████████▉ | 889/1000 [14:51<01:48,  1.03it/s, lr=3.06e-6, step_loss=0.184]
Steps:  89%|████████▉ | 889/1000 [14:51<01:48,  1.03it/s, lr=3.01e-6, step_loss=0.00271]
Steps:  89%|████████▉ | 889/1000 [14:51<01:48,  1.03it/s, lr=3.01e-6, step_loss=0.00562]
Steps:  89%|████████▉ | 889/1000 [14:51<01:48,  1.03it/s, lr=3.01e-6, step_loss=0.0844] 
Steps:  89%|████████▉ | 889/1000 [14:52<01:48,  1.03it/s, lr=3.01e-6, step_loss=0.398] 
Steps:  89%|████████▉ | 890/1000 [14:52<01:47,  1.03it/s, lr=3.01e-6, step_loss=0.398]
Steps:  89%|████████▉ | 890/1000 [14:52<01:47,  1.03it/s, lr=2.96e-6, step_loss=0.0891]
Steps:  89%|████████▉ | 890/1000 [14:52<01:47,  1.03it/s, lr=2.96e-6, step_loss=0.0369]
Steps:  89%|████████▉ | 890/1000 [14:52<01:47,  1.03it/s, lr=2.96e-6, step_loss=0.0381]
Steps:  89%|████████▉ | 890/1000 [14:53<01:47,  1.03it/s, lr=2.96e-6, step_loss=0.00353]
Steps:  89%|████████▉ | 891/1000 [14:53<01:46,  1.03it/s, lr=2.96e-6, step_loss=0.00353]
Steps:  89%|████████▉ | 891/1000 [14:53<01:46,  1.03it/s, lr=2.9e-6, step_loss=0.0179]  
Steps:  89%|████████▉ | 891/1000 [14:53<01:46,  1.03it/s, lr=2.9e-6, step_loss=0.0304]
Steps:  89%|████████▉ | 891/1000 [14:53<01:46,  1.03it/s, lr=2.9e-6, step_loss=0.211] 
Steps:  89%|████████▉ | 891/1000 [14:54<01:46,  1.03it/s, lr=2.9e-6, step_loss=0.0705]
Steps:  89%|████████▉ | 892/1000 [14:54<01:45,  1.03it/s, lr=2.9e-6, step_loss=0.0705]
Steps:  89%|████████▉ | 892/1000 [14:54<01:45,  1.03it/s, lr=2.85e-6, step_loss=0.00676]
Steps:  89%|████████▉ | 892/1000 [14:54<01:45,  1.03it/s, lr=2.85e-6, step_loss=0.00737]
Steps:  89%|████████▉ | 892/1000 [14:54<01:45,  1.03it/s, lr=2.85e-6, step_loss=0.277]  
Steps:  89%|████████▉ | 892/1000 [14:55<01:45,  1.03it/s, lr=2.85e-6, step_loss=0.00983]
Steps:  89%|████████▉ | 893/1000 [14:55<01:44,  1.03it/s, lr=2.85e-6, step_loss=0.00983]
Steps:  89%|████████▉ | 893/1000 [14:55<01:44,  1.03it/s, lr=2.8e-6, step_loss=0.493]   
Steps:  89%|████████▉ | 893/1000 [14:55<01:44,  1.03it/s, lr=2.8e-6, step_loss=0.161]
Steps:  89%|████████▉ | 893/1000 [14:55<01:44,  1.03it/s, lr=2.8e-6, step_loss=0.11] 
Steps:  89%|████████▉ | 893/1000 [14:56<01:44,  1.03it/s, lr=2.8e-6, step_loss=0.0104]
Steps:  89%|████████▉ | 894/1000 [14:56<01:43,  1.03it/s, lr=2.8e-6, step_loss=0.0104]
Steps:  89%|████████▉ | 894/1000 [14:56<01:43,  1.03it/s, lr=2.75e-6, step_loss=0.106]
Steps:  89%|████████▉ | 894/1000 [14:56<01:43,  1.03it/s, lr=2.75e-6, step_loss=0.157]
Steps:  89%|████████▉ | 894/1000 [14:56<01:43,  1.03it/s, lr=2.75e-6, step_loss=0.0349]
Steps:  89%|████████▉ | 894/1000 [14:57<01:43,  1.03it/s, lr=2.75e-6, step_loss=0.005] 
Steps:  90%|████████▉ | 895/1000 [14:57<01:42,  1.03it/s, lr=2.75e-6, step_loss=0.005]
Steps:  90%|████████▉ | 895/1000 [14:57<01:42,  1.03it/s, lr=2.7e-6, step_loss=0.0269]
Steps:  90%|████████▉ | 895/1000 [14:57<01:42,  1.03it/s, lr=2.7e-6, step_loss=0.0408]
Steps:  90%|████████▉ | 895/1000 [14:57<01:42,  1.03it/s, lr=2.7e-6, step_loss=0.233] 
Steps:  90%|████████▉ | 895/1000 [14:57<01:42,  1.03it/s, lr=2.7e-6, step_loss=0.0106]
Steps:  90%|████████▉ | 896/1000 [14:58<01:41,  1.03it/s, lr=2.7e-6, step_loss=0.0106]
Steps:  90%|████████▉ | 896/1000 [14:58<01:41,  1.03it/s, lr=2.65e-6, step_loss=0.14] 
Steps:  90%|████████▉ | 896/1000 [14:58<01:41,  1.03it/s, lr=2.65e-6, step_loss=0.0185]
Steps:  90%|████████▉ | 896/1000 [14:58<01:41,  1.03it/s, lr=2.65e-6, step_loss=0.548] 
Steps:  90%|████████▉ | 896/1000 [14:58<01:41,  1.03it/s, lr=2.65e-6, step_loss=0.0803]
Steps:  90%|████████▉ | 897/1000 [14:59<01:40,  1.03it/s, lr=2.65e-6, step_loss=0.0803]
Steps:  90%|████████▉ | 897/1000 [14:59<01:40,  1.03it/s, lr=2.59e-6, step_loss=0.161] 
Steps:  90%|████████▉ | 897/1000 [14:59<01:40,  1.03it/s, lr=2.59e-6, step_loss=0.097]
Steps:  90%|████████▉ | 897/1000 [14:59<01:40,  1.03it/s, lr=2.59e-6, step_loss=0.13] 
Steps:  90%|████████▉ | 897/1000 [14:59<01:40,  1.03it/s, lr=2.59e-6, step_loss=0.00913]
Steps:  90%|████████▉ | 898/1000 [15:00<01:39,  1.03it/s, lr=2.59e-6, step_loss=0.00913]
Steps:  90%|████████▉ | 898/1000 [15:00<01:39,  1.03it/s, lr=2.55e-6, step_loss=0.339]  
Steps:  90%|████████▉ | 898/1000 [15:00<01:39,  1.03it/s, lr=2.55e-6, step_loss=0.0945]
Steps:  90%|████████▉ | 898/1000 [15:00<01:39,  1.03it/s, lr=2.55e-6, step_loss=0.101] 
Steps:  90%|████████▉ | 898/1000 [15:00<01:39,  1.03it/s, lr=2.55e-6, step_loss=0.00874]
Steps:  90%|████████▉ | 899/1000 [15:01<01:38,  1.03it/s, lr=2.55e-6, step_loss=0.00874]
Steps:  90%|████████▉ | 899/1000 [15:01<01:38,  1.03it/s, lr=2.5e-6, step_loss=0.177]   
Steps:  90%|████████▉ | 899/1000 [15:01<01:38,  1.03it/s, lr=2.5e-6, step_loss=0.00943]
Steps:  90%|████████▉ | 899/1000 [15:01<01:38,  1.03it/s, lr=2.5e-6, step_loss=0.61]   
Steps:  90%|████████▉ | 899/1000 [15:01<01:38,  1.03it/s, lr=2.5e-6, step_loss=0.246]
Steps:  90%|█████████ | 900/1000 [15:02<01:37,  1.03it/s, lr=2.5e-6, step_loss=0.246]
Steps:  90%|█████████ | 900/1000 [15:02<01:37,  1.03it/s, lr=2.45e-6, step_loss=0.0232]
Steps:  90%|█████████ | 900/1000 [15:02<01:37,  1.03it/s, lr=2.45e-6, step_loss=0.0183]
Steps:  90%|█████████ | 900/1000 [15:02<01:37,  1.03it/s, lr=2.45e-6, step_loss=0.0889]
Steps:  90%|█████████ | 900/1000 [15:02<01:37,  1.03it/s, lr=2.45e-6, step_loss=0.0243]
Steps:  90%|█████████ | 901/1000 [15:03<01:36,  1.03it/s, lr=2.45e-6, step_loss=0.0243]
Steps:  90%|█████████ | 901/1000 [15:03<01:36,  1.03it/s, lr=2.4e-6, step_loss=0.0403] 
Steps:  90%|█████████ | 901/1000 [15:03<01:36,  1.03it/s, lr=2.4e-6, step_loss=0.0126]
Steps:  90%|█████████ | 901/1000 [15:03<01:36,  1.03it/s, lr=2.4e-6, step_loss=0.0411]
Steps:  90%|█████████ | 901/1000 [15:03<01:36,  1.03it/s, lr=2.4e-6, step_loss=0.152] 
Steps:  90%|█████████ | 902/1000 [15:04<01:35,  1.03it/s, lr=2.4e-6, step_loss=0.152]
Steps:  90%|█████████ | 902/1000 [15:04<01:35,  1.03it/s, lr=2.35e-6, step_loss=0.0747]
Steps:  90%|█████████ | 902/1000 [15:04<01:35,  1.03it/s, lr=2.35e-6, step_loss=0.0412]
Steps:  90%|█████████ | 902/1000 [15:04<01:35,  1.03it/s, lr=2.35e-6, step_loss=0.00801]
Steps:  90%|█████████ | 902/1000 [15:04<01:35,  1.03it/s, lr=2.35e-6, step_loss=0.0429] 
Steps:  90%|█████████ | 903/1000 [15:05<01:34,  1.03it/s, lr=2.35e-6, step_loss=0.0429]
Steps:  90%|█████████ | 903/1000 [15:05<01:34,  1.03it/s, lr=2.3e-6, step_loss=0.0263] 
Steps:  90%|█████████ | 903/1000 [15:05<01:34,  1.03it/s, lr=2.3e-6, step_loss=0.153] 
Steps:  90%|█████████ | 903/1000 [15:05<01:34,  1.03it/s, lr=2.3e-6, step_loss=0.0912]
Steps:  90%|█████████ | 903/1000 [15:05<01:34,  1.03it/s, lr=2.3e-6, step_loss=0.431] 
Steps:  90%|█████████ | 904/1000 [15:05<01:33,  1.03it/s, lr=2.3e-6, step_loss=0.431]
Steps:  90%|█████████ | 904/1000 [15:06<01:33,  1.03it/s, lr=2.26e-6, step_loss=0.0246]
Steps:  90%|█████████ | 904/1000 [15:06<01:33,  1.03it/s, lr=2.26e-6, step_loss=0.243] 
Steps:  90%|█████████ | 904/1000 [15:06<01:33,  1.03it/s, lr=2.26e-6, step_loss=0.0015]
Steps:  90%|█████████ | 904/1000 [15:06<01:33,  1.03it/s, lr=2.26e-6, step_loss=0.00282]
Steps:  90%|█████████ | 905/1000 [15:06<01:32,  1.03it/s, lr=2.26e-6, step_loss=0.00282]
Steps:  90%|█████████ | 905/1000 [15:07<01:32,  1.03it/s, lr=2.21e-6, step_loss=0.0186] 
Steps:  90%|█████████ | 905/1000 [15:07<01:32,  1.03it/s, lr=2.21e-6, step_loss=0.0927]
Steps:  90%|█████████ | 905/1000 [15:07<01:32,  1.03it/s, lr=2.21e-6, step_loss=0.0279]
Steps:  90%|█████████ | 905/1000 [15:07<01:32,  1.03it/s, lr=2.21e-6, step_loss=0.00247]
Steps:  91%|█████████ | 906/1000 [15:07<01:31,  1.03it/s, lr=2.21e-6, step_loss=0.00247]
Steps:  91%|█████████ | 906/1000 [15:07<01:31,  1.03it/s, lr=2.16e-6, step_loss=0.00659]
Steps:  91%|█████████ | 906/1000 [15:08<01:31,  1.03it/s, lr=2.16e-6, step_loss=0.0822] 
Steps:  91%|█████████ | 906/1000 [15:08<01:31,  1.03it/s, lr=2.16e-6, step_loss=0.0171]
Steps:  91%|█████████ | 906/1000 [15:08<01:31,  1.03it/s, lr=2.16e-6, step_loss=0.11]  
Steps:  91%|█████████ | 907/1000 [15:08<01:30,  1.03it/s, lr=2.16e-6, step_loss=0.11]
Steps:  91%|█████████ | 907/1000 [15:08<01:30,  1.03it/s, lr=2.12e-6, step_loss=0.0829]
Steps:  91%|█████████ | 907/1000 [15:09<01:30,  1.03it/s, lr=2.12e-6, step_loss=0.157] 
Steps:  91%|█████████ | 907/1000 [15:09<01:30,  1.03it/s, lr=2.12e-6, step_loss=0.107]
Steps:  91%|█████████ | 907/1000 [15:09<01:30,  1.03it/s, lr=2.12e-6, step_loss=0.00741]
Steps:  91%|█████████ | 908/1000 [15:09<01:29,  1.03it/s, lr=2.12e-6, step_loss=0.00741]
Steps:  91%|█████████ | 908/1000 [15:09<01:29,  1.03it/s, lr=2.07e-6, step_loss=0.0723] 
Steps:  91%|█████████ | 908/1000 [15:10<01:29,  1.03it/s, lr=2.07e-6, step_loss=0.0503]
Steps:  91%|█████████ | 908/1000 [15:10<01:29,  1.03it/s, lr=2.07e-6, step_loss=0.0307]
Steps:  91%|█████████ | 908/1000 [15:10<01:29,  1.03it/s, lr=2.07e-6, step_loss=0.134] 
Steps:  91%|█████████ | 909/1000 [15:10<01:28,  1.03it/s, lr=2.07e-6, step_loss=0.134]
Steps:  91%|█████████ | 909/1000 [15:10<01:28,  1.03it/s, lr=2.03e-6, step_loss=0.0119]
Steps:  91%|█████████ | 909/1000 [15:11<01:28,  1.03it/s, lr=2.03e-6, step_loss=0.025] 
Steps:  91%|█████████ | 909/1000 [15:11<01:28,  1.03it/s, lr=2.03e-6, step_loss=0.00767]
Steps:  91%|█████████ | 909/1000 [15:11<01:28,  1.03it/s, lr=2.03e-6, step_loss=0.0928] 
Steps:  91%|█████████ | 910/1000 [15:11<01:27,  1.03it/s, lr=2.03e-6, step_loss=0.0928]
Steps:  91%|█████████ | 910/1000 [15:11<01:27,  1.03it/s, lr=1.99e-6, step_loss=0.0411]
Steps:  91%|█████████ | 910/1000 [15:12<01:27,  1.03it/s, lr=1.99e-6, step_loss=0.0046]
Steps:  91%|█████████ | 910/1000 [15:12<01:27,  1.03it/s, lr=1.99e-6, step_loss=0.041] 
Steps:  91%|█████████ | 910/1000 [15:12<01:27,  1.03it/s, lr=1.99e-6, step_loss=0.0613]
Steps:  91%|█████████ | 911/1000 [15:12<01:26,  1.03it/s, lr=1.99e-6, step_loss=0.0613]
Steps:  91%|█████████ | 911/1000 [15:12<01:26,  1.03it/s, lr=1.94e-6, step_loss=0.0574]
Steps:  91%|█████████ | 911/1000 [15:13<01:26,  1.03it/s, lr=1.94e-6, step_loss=0.117] 
Steps:  91%|█████████ | 911/1000 [15:13<01:26,  1.03it/s, lr=1.94e-6, step_loss=0.00995]
Steps:  91%|█████████ | 911/1000 [15:13<01:26,  1.03it/s, lr=1.94e-6, step_loss=0.0427] 
Steps:  91%|█████████ | 912/1000 [15:13<01:25,  1.03it/s, lr=1.94e-6, step_loss=0.0427]
Steps:  91%|█████████ | 912/1000 [15:13<01:25,  1.03it/s, lr=1.9e-6, step_loss=0.0519] 
Steps:  91%|█████████ | 912/1000 [15:14<01:25,  1.03it/s, lr=1.9e-6, step_loss=0.0294]
Steps:  91%|█████████ | 912/1000 [15:14<01:25,  1.03it/s, lr=1.9e-6, step_loss=0.287] 
Steps:  91%|█████████ | 912/1000 [15:14<01:25,  1.03it/s, lr=1.9e-6, step_loss=0.191]
Steps:  91%|█████████▏| 913/1000 [15:14<01:24,  1.03it/s, lr=1.9e-6, step_loss=0.191]
Steps:  91%|█████████▏| 913/1000 [15:14<01:24,  1.03it/s, lr=1.86e-6, step_loss=0.142]
Steps:  91%|█████████▏| 913/1000 [15:15<01:24,  1.03it/s, lr=1.86e-6, step_loss=0.256]
Steps:  91%|█████████▏| 913/1000 [15:15<01:24,  1.03it/s, lr=1.86e-6, step_loss=0.0031]
Steps:  91%|█████████▏| 913/1000 [15:15<01:24,  1.03it/s, lr=1.86e-6, step_loss=0.12]  
Steps:  91%|█████████▏| 914/1000 [15:15<01:23,  1.03it/s, lr=1.86e-6, step_loss=0.12]
Steps:  91%|█████████▏| 914/1000 [15:15<01:23,  1.03it/s, lr=1.81e-6, step_loss=0.00258]
Steps:  91%|█████████▏| 914/1000 [15:16<01:23,  1.03it/s, lr=1.81e-6, step_loss=0.0597] 
Steps:  91%|█████████▏| 914/1000 [15:16<01:23,  1.03it/s, lr=1.81e-6, step_loss=0.106] 
Steps:  91%|█████████▏| 914/1000 [15:16<01:23,  1.03it/s, lr=1.81e-6, step_loss=0.0245]
Steps:  92%|█████████▏| 915/1000 [15:16<01:22,  1.03it/s, lr=1.81e-6, step_loss=0.0245]
Steps:  92%|█████████▏| 915/1000 [15:16<01:22,  1.03it/s, lr=1.77e-6, step_loss=0.00631]
Steps:  92%|█████████▏| 915/1000 [15:17<01:22,  1.03it/s, lr=1.77e-6, step_loss=0.12]   
Steps:  92%|█████████▏| 915/1000 [15:17<01:22,  1.03it/s, lr=1.77e-6, step_loss=0.337]
Steps:  92%|█████████▏| 915/1000 [15:17<01:22,  1.03it/s, lr=1.77e-6, step_loss=0.00753]
Steps:  92%|█████████▏| 916/1000 [15:17<01:21,  1.03it/s, lr=1.77e-6, step_loss=0.00753]
Steps:  92%|█████████▏| 916/1000 [15:17<01:21,  1.03it/s, lr=1.73e-6, step_loss=0.037]  
Steps:  92%|█████████▏| 916/1000 [15:17<01:21,  1.03it/s, lr=1.73e-6, step_loss=0.0178]
Steps:  92%|█████████▏| 916/1000 [15:18<01:21,  1.03it/s, lr=1.73e-6, step_loss=0.0451]
Steps:  92%|█████████▏| 916/1000 [15:18<01:21,  1.03it/s, lr=1.73e-6, step_loss=0.165] 
Steps:  92%|█████████▏| 917/1000 [15:18<01:20,  1.03it/s, lr=1.73e-6, step_loss=0.165]
Steps:  92%|█████████▏| 917/1000 [15:18<01:20,  1.03it/s, lr=1.69e-6, step_loss=0.00342]
Steps:  92%|█████████▏| 918/1000 [15:18<01:03,  1.30it/s, lr=1.69e-6, step_loss=0.00342]
Steps:  92%|█████████▏| 918/1000 [15:19<01:03,  1.30it/s, lr=1.65e-6, step_loss=0.23]
{'image_encoder', 'requires_safety_checker'} was not found in config. Values will be initialized to default values.
Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
{'timestep_spacing', 'prediction_type'} was not found in config. Values will be initialized to default values.
Loaded scheduler as PNDMScheduler from `scheduler` subfolder of runwayml/stable-diffusion-v1-5.
Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  43%|████▎     | 3/7 [00:00<00:00, 18.29it/s]
Loaded feature_extractor as CLIPImageProcessor from `feature_extractor` subfolder of runwayml/stable-diffusion-v1-5.
{'force_upcast', 'scaling_factor', 'use_post_quant_conv', 'use_quant_conv', 'latents_mean', 'latents_std', 'shift_factor'} was not found in config. Values will be initialized to default values.
Loaded vae as AutoencoderKL from `vae` subfolder of runwayml/stable-diffusion-v1-5.
Loaded safety_checker as StableDiffusionSafetyChecker from `safety_checker` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  86%|████████▌ | 6/7 [00:00<00:00, 14.00it/s]
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 15.21it/s]
07/28/2024 20:51:21 - INFO - __main__ - Running validation...
Generating 4 images with prompt: A naruto with blue eyes..
Steps:  92%|█████████▏| 918/1000 [15:28<01:03,  1.30it/s, lr=1.65e-6, step_loss=0.0482]
Steps:  92%|█████████▏| 918/1000 [15:29<01:03,  1.30it/s, lr=1.65e-6, step_loss=0.448] 
Steps:  92%|█████████▏| 918/1000 [15:29<01:03,  1.30it/s, lr=1.65e-6, step_loss=0.153]
Steps:  92%|█████████▏| 919/1000 [15:29<04:58,  3.69s/it, lr=1.65e-6, step_loss=0.153]
Steps:  92%|█████████▏| 919/1000 [15:29<04:58,  3.69s/it, lr=1.61e-6, step_loss=0.254]
Steps:  92%|█████████▏| 919/1000 [15:29<04:58,  3.69s/it, lr=1.61e-6, step_loss=0.0909]
Steps:  92%|█████████▏| 919/1000 [15:29<04:58,  3.69s/it, lr=1.61e-6, step_loss=0.188] 
Steps:  92%|█████████▏| 919/1000 [15:30<04:58,  3.69s/it, lr=1.61e-6, step_loss=0.0723]
Steps:  92%|█████████▏| 920/1000 [15:30<03:49,  2.87s/it, lr=1.61e-6, step_loss=0.0723]
Steps:  92%|█████████▏| 920/1000 [15:30<03:49,  2.87s/it, lr=1.57e-6, step_loss=0.0747]
Steps:  92%|█████████▏| 920/1000 [15:30<03:49,  2.87s/it, lr=1.57e-6, step_loss=0.079] 
Steps:  92%|█████████▏| 920/1000 [15:30<03:49,  2.87s/it, lr=1.57e-6, step_loss=0.0153]
Steps:  92%|█████████▏| 920/1000 [15:31<03:49,  2.87s/it, lr=1.57e-6, step_loss=0.0363]
Steps:  92%|█████████▏| 921/1000 [15:31<03:02,  2.30s/it, lr=1.57e-6, step_loss=0.0363]
Steps:  92%|█████████▏| 921/1000 [15:31<03:02,  2.30s/it, lr=1.53e-6, step_loss=0.0541]
Steps:  92%|█████████▏| 921/1000 [15:31<03:02,  2.30s/it, lr=1.53e-6, step_loss=0.015] 
Steps:  92%|█████████▏| 921/1000 [15:31<03:02,  2.30s/it, lr=1.53e-6, step_loss=0.163]
Steps:  92%|█████████▏| 921/1000 [15:32<03:02,  2.30s/it, lr=1.53e-6, step_loss=0.235]
Steps:  92%|█████████▏| 922/1000 [15:32<02:28,  1.90s/it, lr=1.53e-6, step_loss=0.235]
Steps:  92%|█████████▏| 922/1000 [15:32<02:28,  1.90s/it, lr=1.49e-6, step_loss=0.139]
Steps:  92%|█████████▏| 922/1000 [15:32<02:28,  1.90s/it, lr=1.49e-6, step_loss=0.0119]
Steps:  92%|█████████▏| 922/1000 [15:32<02:28,  1.90s/it, lr=1.49e-6, step_loss=0.137] 
Steps:  92%|█████████▏| 922/1000 [15:33<02:28,  1.90s/it, lr=1.49e-6, step_loss=0.181]
Steps:  92%|█████████▏| 923/1000 [15:33<02:05,  1.63s/it, lr=1.49e-6, step_loss=0.181]
Steps:  92%|█████████▏| 923/1000 [15:33<02:05,  1.63s/it, lr=1.46e-6, step_loss=0.0143]
Steps:  92%|█████████▏| 923/1000 [15:33<02:05,  1.63s/it, lr=1.46e-6, step_loss=0.00289]
Steps:  92%|█████████▏| 923/1000 [15:33<02:05,  1.63s/it, lr=1.46e-6, step_loss=0.155]  
Steps:  92%|█████████▏| 923/1000 [15:34<02:05,  1.63s/it, lr=1.46e-6, step_loss=0.221]
Steps:  92%|█████████▏| 924/1000 [15:34<01:48,  1.43s/it, lr=1.46e-6, step_loss=0.221]
Steps:  92%|█████████▏| 924/1000 [15:34<01:48,  1.43s/it, lr=1.42e-6, step_loss=0.0131]
Steps:  92%|█████████▏| 924/1000 [15:34<01:48,  1.43s/it, lr=1.42e-6, step_loss=0.0666]
Steps:  92%|█████████▏| 924/1000 [15:34<01:48,  1.43s/it, lr=1.42e-6, step_loss=0.00315]
Steps:  92%|█████████▏| 924/1000 [15:35<01:48,  1.43s/it, lr=1.42e-6, step_loss=0.0111] 
Steps:  92%|█████████▎| 925/1000 [15:35<01:37,  1.29s/it, lr=1.42e-6, step_loss=0.0111]
Steps:  92%|█████████▎| 925/1000 [15:35<01:37,  1.29s/it, lr=1.38e-6, step_loss=0.104] 
Steps:  92%|█████████▎| 925/1000 [15:35<01:37,  1.29s/it, lr=1.38e-6, step_loss=0.109]
Steps:  92%|█████████▎| 925/1000 [15:35<01:37,  1.29s/it, lr=1.38e-6, step_loss=0.0262]
Steps:  92%|█████████▎| 925/1000 [15:36<01:37,  1.29s/it, lr=1.38e-6, step_loss=0.0348]
Steps:  93%|█████████▎| 926/1000 [15:36<01:28,  1.20s/it, lr=1.38e-6, step_loss=0.0348]
Steps:  93%|█████████▎| 926/1000 [15:36<01:28,  1.20s/it, lr=1.35e-6, step_loss=0.0114]
Steps:  93%|█████████▎| 926/1000 [15:36<01:28,  1.20s/it, lr=1.35e-6, step_loss=0.162] 
Steps:  93%|█████████▎| 926/1000 [15:36<01:28,  1.20s/it, lr=1.35e-6, step_loss=0.0138]
Steps:  93%|█████████▎| 926/1000 [15:37<01:28,  1.20s/it, lr=1.35e-6, step_loss=0.073] 
Steps:  93%|█████████▎| 927/1000 [15:37<01:22,  1.13s/it, lr=1.35e-6, step_loss=0.073]
Steps:  93%|█████████▎| 927/1000 [15:37<01:22,  1.13s/it, lr=1.31e-6, step_loss=0.0324]
Steps:  93%|█████████▎| 927/1000 [15:37<01:22,  1.13s/it, lr=1.31e-6, step_loss=0.0427]
Steps:  93%|█████████▎| 927/1000 [15:37<01:22,  1.13s/it, lr=1.31e-6, step_loss=0.0974]
Steps:  93%|█████████▎| 927/1000 [15:38<01:22,  1.13s/it, lr=1.31e-6, step_loss=0.00728]
Steps:  93%|█████████▎| 928/1000 [15:38<01:18,  1.08s/it, lr=1.31e-6, step_loss=0.00728]
Steps:  93%|█████████▎| 928/1000 [15:38<01:18,  1.08s/it, lr=1.27e-6, step_loss=0.0391] 
Steps:  93%|█████████▎| 928/1000 [15:38<01:18,  1.08s/it, lr=1.27e-6, step_loss=0.104] 
Steps:  93%|█████████▎| 928/1000 [15:38<01:18,  1.08s/it, lr=1.27e-6, step_loss=0.0363]
Steps:  93%|█████████▎| 928/1000 [15:38<01:18,  1.08s/it, lr=1.27e-6, step_loss=0.0232]
Steps:  93%|█████████▎| 929/1000 [15:39<01:14,  1.05s/it, lr=1.27e-6, step_loss=0.0232]
Steps:  93%|█████████▎| 929/1000 [15:39<01:14,  1.05s/it, lr=1.24e-6, step_loss=0.0119]
Steps:  93%|█████████▎| 929/1000 [15:39<01:14,  1.05s/it, lr=1.24e-6, step_loss=0.113] 
Steps:  93%|█████████▎| 929/1000 [15:39<01:14,  1.05s/it, lr=1.24e-6, step_loss=0.095]
Steps:  93%|█████████▎| 929/1000 [15:39<01:14,  1.05s/it, lr=1.24e-6, step_loss=0.0692]
Steps:  93%|█████████▎| 930/1000 [15:40<01:11,  1.03s/it, lr=1.24e-6, step_loss=0.0692]
Steps:  93%|█████████▎| 930/1000 [15:40<01:11,  1.03s/it, lr=1.2e-6, step_loss=0.0266] 
Steps:  93%|█████████▎| 930/1000 [15:40<01:11,  1.03s/it, lr=1.2e-6, step_loss=0.16]  
Steps:  93%|█████████▎| 930/1000 [15:40<01:11,  1.03s/it, lr=1.2e-6, step_loss=0.0481]
Steps:  93%|█████████▎| 930/1000 [15:40<01:11,  1.03s/it, lr=1.2e-6, step_loss=0.00416]
Steps:  93%|█████████▎| 931/1000 [15:41<01:09,  1.01s/it, lr=1.2e-6, step_loss=0.00416]
Steps:  93%|█████████▎| 931/1000 [15:41<01:09,  1.01s/it, lr=1.17e-6, step_loss=0.0331]
Steps:  93%|█████████▎| 931/1000 [15:41<01:09,  1.01s/it, lr=1.17e-6, step_loss=0.0473]
Steps:  93%|█████████▎| 931/1000 [15:41<01:09,  1.01s/it, lr=1.17e-6, step_loss=0.187] 
Steps:  93%|█████████▎| 931/1000 [15:41<01:09,  1.01s/it, lr=1.17e-6, step_loss=0.115]
Steps:  93%|█████████▎| 932/1000 [15:42<01:08,  1.00s/it, lr=1.17e-6, step_loss=0.115]
Steps:  93%|█████████▎| 932/1000 [15:42<01:08,  1.00s/it, lr=1.14e-6, step_loss=0.043]
Steps:  93%|█████████▎| 932/1000 [15:42<01:08,  1.00s/it, lr=1.14e-6, step_loss=0.00448]
Steps:  93%|█████████▎| 932/1000 [15:42<01:08,  1.00s/it, lr=1.14e-6, step_loss=0.00285]
Steps:  93%|█████████▎| 932/1000 [15:42<01:08,  1.00s/it, lr=1.14e-6, step_loss=0.11]   
Steps:  93%|█████████▎| 933/1000 [15:43<01:06,  1.01it/s, lr=1.14e-6, step_loss=0.11]
Steps:  93%|█████████▎| 933/1000 [15:43<01:06,  1.01it/s, lr=1.1e-6, step_loss=0.131]
Steps:  93%|█████████▎| 933/1000 [15:43<01:06,  1.01it/s, lr=1.1e-6, step_loss=0.294]
Steps:  93%|█████████▎| 933/1000 [15:43<01:06,  1.01it/s, lr=1.1e-6, step_loss=0.00844]
Steps:  93%|█████████▎| 933/1000 [15:43<01:06,  1.01it/s, lr=1.1e-6, step_loss=0.0968] 
Steps:  93%|█████████▎| 934/1000 [15:44<01:05,  1.01it/s, lr=1.1e-6, step_loss=0.0968]
Steps:  93%|█████████▎| 934/1000 [15:44<01:05,  1.01it/s, lr=1.07e-6, step_loss=0.0564]
Steps:  93%|█████████▎| 934/1000 [15:44<01:05,  1.01it/s, lr=1.07e-6, step_loss=0.0123]
Steps:  93%|█████████▎| 934/1000 [15:44<01:05,  1.01it/s, lr=1.07e-6, step_loss=0.113] 
Steps:  93%|█████████▎| 934/1000 [15:44<01:05,  1.01it/s, lr=1.07e-6, step_loss=0.0164]
Steps:  94%|█████████▎| 935/1000 [15:45<01:03,  1.02it/s, lr=1.07e-6, step_loss=0.0164]
Steps:  94%|█████████▎| 935/1000 [15:45<01:03,  1.02it/s, lr=1.04e-6, step_loss=0.022] 
Steps:  94%|█████████▎| 935/1000 [15:45<01:03,  1.02it/s, lr=1.04e-6, step_loss=0.00335]
Steps:  94%|█████████▎| 935/1000 [15:45<01:03,  1.02it/s, lr=1.04e-6, step_loss=0.101]  
Steps:  94%|█████████▎| 935/1000 [15:45<01:03,  1.02it/s, lr=1.04e-6, step_loss=0.147]
Steps:  94%|█████████▎| 936/1000 [15:46<01:02,  1.02it/s, lr=1.04e-6, step_loss=0.147]
Steps:  94%|█████████▎| 936/1000 [15:46<01:02,  1.02it/s, lr=1.01e-6, step_loss=0.0714]
Steps:  94%|█████████▎| 936/1000 [15:46<01:02,  1.02it/s, lr=1.01e-6, step_loss=0.00294]
Steps:  94%|█████████▎| 936/1000 [15:46<01:02,  1.02it/s, lr=1.01e-6, step_loss=0.0197] 
Steps:  94%|█████████▎| 936/1000 [15:46<01:02,  1.02it/s, lr=1.01e-6, step_loss=0.0629]
Steps:  94%|█████████▎| 937/1000 [15:46<01:01,  1.02it/s, lr=1.01e-6, step_loss=0.0629]
Steps:  94%|█████████▎| 937/1000 [15:47<01:01,  1.02it/s, lr=9.76e-7, step_loss=0.0693]
Steps:  94%|█████████▎| 937/1000 [15:47<01:01,  1.02it/s, lr=9.76e-7, step_loss=0.113] 
Steps:  94%|█████████▎| 937/1000 [15:47<01:01,  1.02it/s, lr=9.76e-7, step_loss=0.0839]
Steps:  94%|█████████▎| 937/1000 [15:47<01:01,  1.02it/s, lr=9.76e-7, step_loss=0.0388]
Steps:  94%|█████████▍| 938/1000 [15:47<01:00,  1.02it/s, lr=9.76e-7, step_loss=0.0388]
Steps:  94%|█████████▍| 938/1000 [15:48<01:00,  1.02it/s, lr=9.45e-7, step_loss=0.265] 
Steps:  94%|█████████▍| 938/1000 [15:48<01:00,  1.02it/s, lr=9.45e-7, step_loss=0.0976]
Steps:  94%|█████████▍| 938/1000 [15:48<01:00,  1.02it/s, lr=9.45e-7, step_loss=0.0368]
Steps:  94%|█████████▍| 938/1000 [15:48<01:00,  1.02it/s, lr=9.45e-7, step_loss=0.0045]
Steps:  94%|█████████▍| 939/1000 [15:48<00:59,  1.02it/s, lr=9.45e-7, step_loss=0.0045]
Steps:  94%|█████████▍| 939/1000 [15:48<00:59,  1.02it/s, lr=9.15e-7, step_loss=0.0438]
Steps:  94%|█████████▍| 939/1000 [15:49<00:59,  1.02it/s, lr=9.15e-7, step_loss=0.0625]
Steps:  94%|█████████▍| 939/1000 [15:49<00:59,  1.02it/s, lr=9.15e-7, step_loss=0.0135]
Steps:  94%|█████████▍| 939/1000 [15:49<00:59,  1.02it/s, lr=9.15e-7, step_loss=0.00353]
Steps:  94%|█████████▍| 940/1000 [15:49<00:58,  1.02it/s, lr=9.15e-7, step_loss=0.00353]
Steps:  94%|█████████▍| 940/1000 [15:49<00:58,  1.02it/s, lr=8.86e-7, step_loss=0.055]  
Steps:  94%|█████████▍| 940/1000 [15:50<00:58,  1.02it/s, lr=8.86e-7, step_loss=0.0566]
Steps:  94%|█████████▍| 940/1000 [15:50<00:58,  1.02it/s, lr=8.86e-7, step_loss=0.0625]
Steps:  94%|█████████▍| 940/1000 [15:50<00:58,  1.02it/s, lr=8.86e-7, step_loss=0.229] 
Steps:  94%|█████████▍| 941/1000 [15:50<00:57,  1.03it/s, lr=8.86e-7, step_loss=0.229]
Steps:  94%|█████████▍| 941/1000 [15:50<00:57,  1.03it/s, lr=8.56e-7, step_loss=0.0598]
Steps:  94%|█████████▍| 941/1000 [15:51<00:57,  1.03it/s, lr=8.56e-7, step_loss=0.00774]
Steps:  94%|█████████▍| 941/1000 [15:51<00:57,  1.03it/s, lr=8.56e-7, step_loss=0.121]  
Steps:  94%|█████████▍| 941/1000 [15:51<00:57,  1.03it/s, lr=8.56e-7, step_loss=0.0822]
Steps:  94%|█████████▍| 942/1000 [15:51<00:56,  1.03it/s, lr=8.56e-7, step_loss=0.0822]
Steps:  94%|█████████▍| 942/1000 [15:51<00:56,  1.03it/s, lr=8.28e-7, step_loss=0.00344]
Steps:  94%|█████████▍| 942/1000 [15:52<00:56,  1.03it/s, lr=8.28e-7, step_loss=0.00296]
Steps:  94%|█████████▍| 942/1000 [15:52<00:56,  1.03it/s, lr=8.28e-7, step_loss=0.0764] 
Steps:  94%|█████████▍| 942/1000 [15:52<00:56,  1.03it/s, lr=8.28e-7, step_loss=0.0231]
Steps:  94%|█████████▍| 943/1000 [15:52<00:55,  1.03it/s, lr=8.28e-7, step_loss=0.0231]
Steps:  94%|█████████▍| 943/1000 [15:52<00:55,  1.03it/s, lr=8e-7, step_loss=0.00464]  
Steps:  94%|█████████▍| 943/1000 [15:53<00:55,  1.03it/s, lr=8e-7, step_loss=0.0655] 
Steps:  94%|█████████▍| 943/1000 [15:53<00:55,  1.03it/s, lr=8e-7, step_loss=0.00783]
Steps:  94%|█████████▍| 943/1000 [15:53<00:55,  1.03it/s, lr=8e-7, step_loss=0.01]   
Steps:  94%|█████████▍| 944/1000 [15:53<00:54,  1.03it/s, lr=8e-7, step_loss=0.01]
Steps:  94%|█████████▍| 944/1000 [15:53<00:54,  1.03it/s, lr=7.72e-7, step_loss=0.214]
Steps:  94%|█████████▍| 944/1000 [15:54<00:54,  1.03it/s, lr=7.72e-7, step_loss=0.0241]
Steps:  94%|█████████▍| 944/1000 [15:54<00:54,  1.03it/s, lr=7.72e-7, step_loss=0.0145]
Steps:  94%|█████████▍| 944/1000 [15:54<00:54,  1.03it/s, lr=7.72e-7, step_loss=0.178] 
Steps:  94%|█████████▍| 945/1000 [15:54<00:53,  1.03it/s, lr=7.72e-7, step_loss=0.178]
Steps:  94%|█████████▍| 945/1000 [15:54<00:53,  1.03it/s, lr=7.45e-7, step_loss=0.0551]
Steps:  94%|█████████▍| 945/1000 [15:55<00:53,  1.03it/s, lr=7.45e-7, step_loss=0.00296]
Steps:  94%|█████████▍| 945/1000 [15:55<00:53,  1.03it/s, lr=7.45e-7, step_loss=0.00753]
Steps:  94%|█████████▍| 945/1000 [15:55<00:53,  1.03it/s, lr=7.45e-7, step_loss=0.00764]
Steps:  95%|█████████▍| 946/1000 [15:55<00:52,  1.03it/s, lr=7.45e-7, step_loss=0.00764]
Steps:  95%|█████████▍| 946/1000 [15:55<00:52,  1.03it/s, lr=7.18e-7, step_loss=0.127]  
Steps:  95%|█████████▍| 946/1000 [15:56<00:52,  1.03it/s, lr=7.18e-7, step_loss=0.00415]
Steps:  95%|█████████▍| 946/1000 [15:56<00:52,  1.03it/s, lr=7.18e-7, step_loss=0.0508] 
Steps:  95%|█████████▍| 946/1000 [15:56<00:52,  1.03it/s, lr=7.18e-7, step_loss=0.0116]
Steps:  95%|█████████▍| 947/1000 [15:56<00:51,  1.03it/s, lr=7.18e-7, step_loss=0.0116]
Steps:  95%|█████████▍| 947/1000 [15:56<00:51,  1.03it/s, lr=6.91e-7, step_loss=0.0545]
Steps:  95%|█████████▍| 947/1000 [15:57<00:51,  1.03it/s, lr=6.91e-7, step_loss=0.11]  
Steps:  95%|█████████▍| 947/1000 [15:57<00:51,  1.03it/s, lr=6.91e-7, step_loss=0.393]
Steps:  95%|█████████▍| 947/1000 [15:57<00:51,  1.03it/s, lr=6.91e-7, step_loss=0.0104]
Steps:  95%|█████████▍| 948/1000 [15:57<00:50,  1.03it/s, lr=6.91e-7, step_loss=0.0104]
Steps:  95%|█████████▍| 948/1000 [15:57<00:50,  1.03it/s, lr=6.66e-7, step_loss=0.0554]
Steps:  95%|█████████▍| 948/1000 [15:57<00:50,  1.03it/s, lr=6.66e-7, step_loss=0.0872]
Steps:  95%|█████████▍| 948/1000 [15:58<00:50,  1.03it/s, lr=6.66e-7, step_loss=0.144] 
Steps:  95%|█████████▍| 948/1000 [15:58<00:50,  1.03it/s, lr=6.66e-7, step_loss=0.0145]
Steps:  95%|█████████▍| 949/1000 [15:58<00:49,  1.03it/s, lr=6.66e-7, step_loss=0.0145]
Steps:  95%|█████████▍| 949/1000 [15:58<00:49,  1.03it/s, lr=6.4e-7, step_loss=0.0515] 
Steps:  95%|█████████▍| 949/1000 [15:58<00:49,  1.03it/s, lr=6.4e-7, step_loss=0.0582]
Steps:  95%|█████████▍| 949/1000 [15:59<00:49,  1.03it/s, lr=6.4e-7, step_loss=0.0079]
Steps:  95%|█████████▍| 949/1000 [15:59<00:49,  1.03it/s, lr=6.4e-7, step_loss=0.0229]
Steps:  95%|█████████▌| 950/1000 [15:59<00:48,  1.03it/s, lr=6.4e-7, step_loss=0.0229]
Steps:  95%|█████████▌| 950/1000 [15:59<00:48,  1.03it/s, lr=6.16e-7, step_loss=0.0075]
Steps:  95%|█████████▌| 950/1000 [15:59<00:48,  1.03it/s, lr=6.16e-7, step_loss=0.0557]
Steps:  95%|█████████▌| 950/1000 [16:00<00:48,  1.03it/s, lr=6.16e-7, step_loss=0.078] 
Steps:  95%|█████████▌| 950/1000 [16:00<00:48,  1.03it/s, lr=6.16e-7, step_loss=0.021]
Steps:  95%|█████████▌| 951/1000 [16:00<00:47,  1.03it/s, lr=6.16e-7, step_loss=0.021]
Steps:  95%|█████████▌| 951/1000 [16:00<00:47,  1.03it/s, lr=5.91e-7, step_loss=0.00678]
Steps:  95%|█████████▌| 951/1000 [16:00<00:47,  1.03it/s, lr=5.91e-7, step_loss=0.214]  
Steps:  95%|█████████▌| 951/1000 [16:01<00:47,  1.03it/s, lr=5.91e-7, step_loss=0.165]
Steps:  95%|█████████▌| 951/1000 [16:01<00:47,  1.03it/s, lr=5.91e-7, step_loss=0.00596]
Steps:  95%|█████████▌| 952/1000 [16:01<00:46,  1.03it/s, lr=5.91e-7, step_loss=0.00596]
Steps:  95%|█████████▌| 952/1000 [16:01<00:46,  1.03it/s, lr=5.67e-7, step_loss=0.235]  
Steps:  95%|█████████▌| 952/1000 [16:01<00:46,  1.03it/s, lr=5.67e-7, step_loss=0.00473]
Steps:  95%|█████████▌| 952/1000 [16:02<00:46,  1.03it/s, lr=5.67e-7, step_loss=0.00702]
Steps:  95%|█████████▌| 952/1000 [16:02<00:46,  1.03it/s, lr=5.67e-7, step_loss=0.0761] 
Steps:  95%|█████████▌| 953/1000 [16:02<00:45,  1.03it/s, lr=5.67e-7, step_loss=0.0761]
Steps:  95%|█████████▌| 953/1000 [16:02<00:45,  1.03it/s, lr=5.44e-7, step_loss=0.0238]
Steps:  95%|█████████▌| 953/1000 [16:02<00:45,  1.03it/s, lr=5.44e-7, step_loss=0.0101]
Steps:  95%|█████████▌| 953/1000 [16:03<00:45,  1.03it/s, lr=5.44e-7, step_loss=0.00545]
Steps:  95%|█████████▌| 953/1000 [16:03<00:45,  1.03it/s, lr=5.44e-7, step_loss=0.631]  
Steps:  95%|█████████▌| 954/1000 [16:03<00:44,  1.03it/s, lr=5.44e-7, step_loss=0.631]
Steps:  95%|█████████▌| 954/1000 [16:03<00:44,  1.03it/s, lr=5.21e-7, step_loss=0.0663]
Steps:  95%|█████████▌| 954/1000 [16:03<00:44,  1.03it/s, lr=5.21e-7, step_loss=0.0607]
Steps:  95%|█████████▌| 954/1000 [16:04<00:44,  1.03it/s, lr=5.21e-7, step_loss=0.325] 
Steps:  95%|█████████▌| 954/1000 [16:04<00:44,  1.03it/s, lr=5.21e-7, step_loss=0.0112]
Steps:  96%|█████████▌| 955/1000 [16:04<00:43,  1.03it/s, lr=5.21e-7, step_loss=0.0112]
Steps:  96%|█████████▌| 955/1000 [16:04<00:43,  1.03it/s, lr=4.99e-7, step_loss=0.0202]
Steps:  96%|█████████▌| 955/1000 [16:04<00:43,  1.03it/s, lr=4.99e-7, step_loss=0.00695]
Steps:  96%|█████████▌| 955/1000 [16:05<00:43,  1.03it/s, lr=4.99e-7, step_loss=0.09]   
Steps:  96%|█████████▌| 955/1000 [16:05<00:43,  1.03it/s, lr=4.99e-7, step_loss=0.048]
Steps:  96%|█████████▌| 956/1000 [16:05<00:42,  1.03it/s, lr=4.99e-7, step_loss=0.048]
Steps:  96%|█████████▌| 956/1000 [16:05<00:42,  1.03it/s, lr=4.77e-7, step_loss=0.169]
Steps:  96%|█████████▌| 956/1000 [16:05<00:42,  1.03it/s, lr=4.77e-7, step_loss=0.00485]
Steps:  96%|█████████▌| 956/1000 [16:06<00:42,  1.03it/s, lr=4.77e-7, step_loss=0.00571]
Steps:  96%|█████████▌| 956/1000 [16:06<00:42,  1.03it/s, lr=4.77e-7, step_loss=0.00655]
Steps:  96%|█████████▌| 957/1000 [16:06<00:41,  1.03it/s, lr=4.77e-7, step_loss=0.00655]
Steps:  96%|█████████▌| 957/1000 [16:06<00:41,  1.03it/s, lr=4.56e-7, step_loss=0.155]  
Steps:  96%|█████████▌| 957/1000 [16:06<00:41,  1.03it/s, lr=4.56e-7, step_loss=0.34] 
Steps:  96%|█████████▌| 957/1000 [16:07<00:41,  1.03it/s, lr=4.56e-7, step_loss=0.00282]
Steps:  96%|█████████▌| 957/1000 [16:07<00:41,  1.03it/s, lr=4.56e-7, step_loss=0.0182] 
Steps:  96%|█████████▌| 958/1000 [16:07<00:40,  1.03it/s, lr=4.56e-7, step_loss=0.0182]
Steps:  96%|█████████▌| 958/1000 [16:07<00:40,  1.03it/s, lr=4.35e-7, step_loss=0.0486]
Steps:  96%|█████████▌| 958/1000 [16:07<00:40,  1.03it/s, lr=4.35e-7, step_loss=0.0989]
Steps:  96%|█████████▌| 958/1000 [16:07<00:40,  1.03it/s, lr=4.35e-7, step_loss=0.125] 
Steps:  96%|█████████▌| 958/1000 [16:08<00:40,  1.03it/s, lr=4.35e-7, step_loss=0.00695]
Steps:  96%|█████████▌| 959/1000 [16:08<00:39,  1.03it/s, lr=4.35e-7, step_loss=0.00695]
Steps:  96%|█████████▌| 959/1000 [16:08<00:39,  1.03it/s, lr=4.14e-7, step_loss=0.0348] 
Steps:  96%|█████████▌| 959/1000 [16:08<00:39,  1.03it/s, lr=4.14e-7, step_loss=0.119] 
Steps:  96%|█████████▌| 959/1000 [16:08<00:39,  1.03it/s, lr=4.14e-7, step_loss=0.0125]
Steps:  96%|█████████▌| 959/1000 [16:09<00:39,  1.03it/s, lr=4.14e-7, step_loss=0.00566]
Steps:  96%|█████████▌| 960/1000 [16:09<00:38,  1.03it/s, lr=4.14e-7, step_loss=0.00566]
Steps:  96%|█████████▌| 960/1000 [16:09<00:38,  1.03it/s, lr=3.94e-7, step_loss=0.0379] 
Steps:  96%|█████████▌| 960/1000 [16:09<00:38,  1.03it/s, lr=3.94e-7, step_loss=0.161] 
Steps:  96%|█████████▌| 960/1000 [16:09<00:38,  1.03it/s, lr=3.94e-7, step_loss=0.0896]
Steps:  96%|█████████▌| 960/1000 [16:10<00:38,  1.03it/s, lr=3.94e-7, step_loss=0.0737]
Steps:  96%|█████████▌| 961/1000 [16:10<00:38,  1.02it/s, lr=3.94e-7, step_loss=0.0737]
Steps:  96%|█████████▌| 961/1000 [16:10<00:38,  1.02it/s, lr=3.75e-7, step_loss=0.435] 
Steps:  96%|█████████▌| 961/1000 [16:10<00:38,  1.02it/s, lr=3.75e-7, step_loss=0.0204]
Steps:  96%|█████████▌| 961/1000 [16:10<00:38,  1.02it/s, lr=3.75e-7, step_loss=0.108] 
Steps:  96%|█████████▌| 961/1000 [16:11<00:38,  1.02it/s, lr=3.75e-7, step_loss=0.0453]
Steps:  96%|█████████▌| 962/1000 [16:11<00:37,  1.02it/s, lr=3.75e-7, step_loss=0.0453]
Steps:  96%|█████████▌| 962/1000 [16:11<00:37,  1.02it/s, lr=3.56e-7, step_loss=0.00891]
Steps:  96%|█████████▌| 962/1000 [16:11<00:37,  1.02it/s, lr=3.56e-7, step_loss=0.0108] 
Steps:  96%|█████████▌| 962/1000 [16:11<00:37,  1.02it/s, lr=3.56e-7, step_loss=0.0555]
Steps:  96%|█████████▌| 962/1000 [16:12<00:37,  1.02it/s, lr=3.56e-7, step_loss=0.0413]
Steps:  96%|█████████▋| 963/1000 [16:12<00:36,  1.03it/s, lr=3.56e-7, step_loss=0.0413]
Steps:  96%|█████████▋| 963/1000 [16:12<00:36,  1.03it/s, lr=3.37e-7, step_loss=0.103] 
Steps:  96%|█████████▋| 963/1000 [16:12<00:36,  1.03it/s, lr=3.37e-7, step_loss=0.0486]
Steps:  96%|█████████▋| 963/1000 [16:12<00:36,  1.03it/s, lr=3.37e-7, step_loss=0.107] 
Steps:  96%|█████████▋| 963/1000 [16:13<00:36,  1.03it/s, lr=3.37e-7, step_loss=0.0142]
Steps:  96%|█████████▋| 964/1000 [16:13<00:35,  1.03it/s, lr=3.37e-7, step_loss=0.0142]
Steps:  96%|█████████▋| 964/1000 [16:13<00:35,  1.03it/s, lr=3.19e-7, step_loss=0.0277]
Steps:  96%|█████████▋| 964/1000 [16:13<00:35,  1.03it/s, lr=3.19e-7, step_loss=0.199] 
Steps:  96%|█████████▋| 964/1000 [16:13<00:35,  1.03it/s, lr=3.19e-7, step_loss=0.0128]
Steps:  96%|█████████▋| 964/1000 [16:14<00:35,  1.03it/s, lr=3.19e-7, step_loss=0.0262]
Steps:  96%|█████████▋| 965/1000 [16:14<00:34,  1.03it/s, lr=3.19e-7, step_loss=0.0262]
Steps:  96%|█████████▋| 965/1000 [16:14<00:34,  1.03it/s, lr=3.02e-7, step_loss=0.0715]
Steps:  96%|█████████▋| 965/1000 [16:14<00:34,  1.03it/s, lr=3.02e-7, step_loss=0.0208]
Steps:  96%|█████████▋| 965/1000 [16:14<00:34,  1.03it/s, lr=3.02e-7, step_loss=0.143] 
Steps:  96%|█████████▋| 965/1000 [16:15<00:34,  1.03it/s, lr=3.02e-7, step_loss=0.108]
Steps:  97%|█████████▋| 966/1000 [16:15<00:33,  1.03it/s, lr=3.02e-7, step_loss=0.108]
Steps:  97%|█████████▋| 966/1000 [16:15<00:33,  1.03it/s, lr=2.85e-7, step_loss=0.0541]
Steps:  97%|█████████▋| 966/1000 [16:15<00:33,  1.03it/s, lr=2.85e-7, step_loss=0.0127]
Steps:  97%|█████████▋| 966/1000 [16:15<00:33,  1.03it/s, lr=2.85e-7, step_loss=0.00937]
Steps:  97%|█████████▋| 966/1000 [16:16<00:33,  1.03it/s, lr=2.85e-7, step_loss=0.0136] 
Steps:  97%|█████████▋| 967/1000 [16:16<00:32,  1.03it/s, lr=2.85e-7, step_loss=0.0136]
Steps:  97%|█████████▋| 967/1000 [16:16<00:32,  1.03it/s, lr=2.68e-7, step_loss=0.0121]
Steps:  97%|█████████▋| 967/1000 [16:16<00:32,  1.03it/s, lr=2.68e-7, step_loss=0.0522]
Steps:  97%|█████████▋| 967/1000 [16:16<00:32,  1.03it/s, lr=2.68e-7, step_loss=0.0135]
Steps:  97%|█████████▋| 967/1000 [16:16<00:32,  1.03it/s, lr=2.68e-7, step_loss=0.126] 
Steps:  97%|█████████▋| 968/1000 [16:17<00:31,  1.03it/s, lr=2.68e-7, step_loss=0.126]
Steps:  97%|█████████▋| 968/1000 [16:17<00:31,  1.03it/s, lr=2.52e-7, step_loss=0.134]
Steps:  97%|█████████▋| 968/1000 [16:17<00:31,  1.03it/s, lr=2.52e-7, step_loss=0.0162]
Steps:  97%|█████████▋| 968/1000 [16:17<00:31,  1.03it/s, lr=2.52e-7, step_loss=0.00638]
Steps:  97%|█████████▋| 968/1000 [16:17<00:31,  1.03it/s, lr=2.52e-7, step_loss=0.0698] 
Steps:  97%|█████████▋| 969/1000 [16:18<00:30,  1.03it/s, lr=2.52e-7, step_loss=0.0698]
Steps:  97%|█████████▋| 969/1000 [16:18<00:30,  1.03it/s, lr=2.37e-7, step_loss=0.00916]
Steps:  97%|█████████▋| 969/1000 [16:18<00:30,  1.03it/s, lr=2.37e-7, step_loss=0.0108] 
Steps:  97%|█████████▋| 969/1000 [16:18<00:30,  1.03it/s, lr=2.37e-7, step_loss=0.0111]
Steps:  97%|█████████▋| 969/1000 [16:18<00:30,  1.03it/s, lr=2.37e-7, step_loss=0.0677]
Steps:  97%|█████████▋| 970/1000 [16:19<00:29,  1.03it/s, lr=2.37e-7, step_loss=0.0677]
Steps:  97%|█████████▋| 970/1000 [16:19<00:29,  1.03it/s, lr=2.22e-7, step_loss=0.183] 
Steps:  97%|█████████▋| 970/1000 [16:19<00:29,  1.03it/s, lr=2.22e-7, step_loss=0.00502]
Steps:  97%|█████████▋| 970/1000 [16:19<00:29,  1.03it/s, lr=2.22e-7, step_loss=0.38]   
Steps:  97%|█████████▋| 970/1000 [16:19<00:29,  1.03it/s, lr=2.22e-7, step_loss=0.133]
Steps:  97%|█████████▋| 971/1000 [16:20<00:28,  1.03it/s, lr=2.22e-7, step_loss=0.133]
Steps:  97%|█████████▋| 971/1000 [16:20<00:28,  1.03it/s, lr=2.07e-7, step_loss=0.0592]
Steps:  97%|█████████▋| 971/1000 [16:20<00:28,  1.03it/s, lr=2.07e-7, step_loss=0.107] 
Steps:  97%|█████████▋| 971/1000 [16:20<00:28,  1.03it/s, lr=2.07e-7, step_loss=0.195]
Steps:  97%|█████████▋| 971/1000 [16:20<00:28,  1.03it/s, lr=2.07e-7, step_loss=0.0214]
Steps:  97%|█████████▋| 972/1000 [16:21<00:27,  1.03it/s, lr=2.07e-7, step_loss=0.0214]
Steps:  97%|█████████▋| 972/1000 [16:21<00:27,  1.03it/s, lr=1.93e-7, step_loss=0.00396]
Steps:  97%|█████████▋| 972/1000 [16:21<00:27,  1.03it/s, lr=1.93e-7, step_loss=0.0324] 
Steps:  97%|█████████▋| 972/1000 [16:21<00:27,  1.03it/s, lr=1.93e-7, step_loss=0.036] 
Steps:  97%|█████████▋| 972/1000 [16:21<00:27,  1.03it/s, lr=1.93e-7, step_loss=0.0029]
Steps:  97%|█████████▋| 973/1000 [16:22<00:26,  1.03it/s, lr=1.93e-7, step_loss=0.0029]
Steps:  97%|█████████▋| 973/1000 [16:22<00:26,  1.03it/s, lr=1.8e-7, step_loss=0.0403] 
Steps:  97%|█████████▋| 973/1000 [16:22<00:26,  1.03it/s, lr=1.8e-7, step_loss=0.0671]
Steps:  97%|█████████▋| 973/1000 [16:22<00:26,  1.03it/s, lr=1.8e-7, step_loss=0.059] 
Steps:  97%|█████████▋| 973/1000 [16:22<00:26,  1.03it/s, lr=1.8e-7, step_loss=0.238]
Steps:  97%|█████████▋| 974/1000 [16:23<00:25,  1.03it/s, lr=1.8e-7, step_loss=0.238]
Steps:  97%|█████████▋| 974/1000 [16:23<00:25,  1.03it/s, lr=1.67e-7, step_loss=0.0154]
Steps:  97%|█████████▋| 974/1000 [16:23<00:25,  1.03it/s, lr=1.67e-7, step_loss=0.0439]
Steps:  97%|█████████▋| 974/1000 [16:23<00:25,  1.03it/s, lr=1.67e-7, step_loss=0.00863]
Steps:  97%|█████████▋| 974/1000 [16:23<00:25,  1.03it/s, lr=1.67e-7, step_loss=0.112]  
Steps:  98%|█████████▊| 975/1000 [16:24<00:24,  1.03it/s, lr=1.67e-7, step_loss=0.112]
Steps:  98%|█████████▊| 975/1000 [16:24<00:24,  1.03it/s, lr=1.54e-7, step_loss=0.0191]
Steps:  98%|█████████▊| 975/1000 [16:24<00:24,  1.03it/s, lr=1.54e-7, step_loss=0.0107]
Steps:  98%|█████████▊| 975/1000 [16:24<00:24,  1.03it/s, lr=1.54e-7, step_loss=0.0445]
Steps:  98%|█████████▊| 975/1000 [16:24<00:24,  1.03it/s, lr=1.54e-7, step_loss=0.0681]
Steps:  98%|█████████▊| 976/1000 [16:24<00:23,  1.03it/s, lr=1.54e-7, step_loss=0.0681]
Steps:  98%|█████████▊| 976/1000 [16:25<00:23,  1.03it/s, lr=1.42e-7, step_loss=0.126] 
Steps:  98%|█████████▊| 976/1000 [16:25<00:23,  1.03it/s, lr=1.42e-7, step_loss=0.00516]
Steps:  98%|█████████▊| 976/1000 [16:25<00:23,  1.03it/s, lr=1.42e-7, step_loss=0.371]  
Steps:  98%|█████████▊| 976/1000 [16:25<00:23,  1.03it/s, lr=1.42e-7, step_loss=0.0557]
Steps:  98%|█████████▊| 977/1000 [16:25<00:22,  1.03it/s, lr=1.42e-7, step_loss=0.0557]
Steps:  98%|█████████▊| 977/1000 [16:25<00:22,  1.03it/s, lr=1.3e-7, step_loss=0.176]  
Steps:  98%|█████████▊| 977/1000 [16:26<00:22,  1.03it/s, lr=1.3e-7, step_loss=0.119]
Steps:  98%|█████████▊| 977/1000 [16:26<00:22,  1.03it/s, lr=1.3e-7, step_loss=0.139]
Steps:  98%|█████████▊| 977/1000 [16:26<00:22,  1.03it/s, lr=1.3e-7, step_loss=0.00299]
Steps:  98%|█████████▊| 978/1000 [16:26<00:21,  1.03it/s, lr=1.3e-7, step_loss=0.00299]
Steps:  98%|█████████▊| 978/1000 [16:26<00:21,  1.03it/s, lr=1.19e-7, step_loss=0.0929]
Steps:  98%|█████████▊| 978/1000 [16:27<00:21,  1.03it/s, lr=1.19e-7, step_loss=0.00217]
Steps:  98%|█████████▊| 978/1000 [16:27<00:21,  1.03it/s, lr=1.19e-7, step_loss=0.0471] 
Steps:  98%|█████████▊| 978/1000 [16:27<00:21,  1.03it/s, lr=1.19e-7, step_loss=0.071] 
Steps:  98%|█████████▊| 979/1000 [16:27<00:20,  1.03it/s, lr=1.19e-7, step_loss=0.071]
Steps:  98%|█████████▊| 979/1000 [16:27<00:20,  1.03it/s, lr=1.09e-7, step_loss=0.184]
Steps:  98%|█████████▊| 979/1000 [16:28<00:20,  1.03it/s, lr=1.09e-7, step_loss=0.0684]
Steps:  98%|█████████▊| 979/1000 [16:28<00:20,  1.03it/s, lr=1.09e-7, step_loss=0.00927]
Steps:  98%|█████████▊| 979/1000 [16:28<00:20,  1.03it/s, lr=1.09e-7, step_loss=0.0552] 
Steps:  98%|█████████▊| 980/1000 [16:28<00:19,  1.03it/s, lr=1.09e-7, step_loss=0.0552]
Steps:  98%|█████████▊| 980/1000 [16:28<00:19,  1.03it/s, lr=9.87e-8, step_loss=0.0533]
Steps:  98%|█████████▊| 980/1000 [16:29<00:19,  1.03it/s, lr=9.87e-8, step_loss=0.0957]
Steps:  98%|█████████▊| 980/1000 [16:29<00:19,  1.03it/s, lr=9.87e-8, step_loss=0.0231]
Steps:  98%|█████████▊| 980/1000 [16:29<00:19,  1.03it/s, lr=9.87e-8, step_loss=0.00899]
Steps:  98%|█████████▊| 981/1000 [16:29<00:18,  1.03it/s, lr=9.87e-8, step_loss=0.00899]
Steps:  98%|█████████▊| 981/1000 [16:29<00:18,  1.03it/s, lr=8.9e-8, step_loss=0.0509]  
Steps:  98%|█████████▊| 981/1000 [16:30<00:18,  1.03it/s, lr=8.9e-8, step_loss=0.618] 
Steps:  98%|█████████▊| 981/1000 [16:30<00:18,  1.03it/s, lr=8.9e-8, step_loss=0.533]
Steps:  98%|█████████▊| 981/1000 [16:30<00:18,  1.03it/s, lr=8.9e-8, step_loss=0.18] 
Steps:  98%|█████████▊| 982/1000 [16:30<00:17,  1.03it/s, lr=8.9e-8, step_loss=0.18]
Steps:  98%|█████████▊| 982/1000 [16:30<00:17,  1.03it/s, lr=7.99e-8, step_loss=0.0381]
Steps:  98%|█████████▊| 982/1000 [16:31<00:17,  1.03it/s, lr=7.99e-8, step_loss=0.00538]
Steps:  98%|█████████▊| 982/1000 [16:31<00:17,  1.03it/s, lr=7.99e-8, step_loss=0.00211]
Steps:  98%|█████████▊| 982/1000 [16:31<00:17,  1.03it/s, lr=7.99e-8, step_loss=0.106]  
Steps:  98%|█████████▊| 983/1000 [16:31<00:16,  1.03it/s, lr=7.99e-8, step_loss=0.106]
Steps:  98%|█████████▊| 983/1000 [16:31<00:16,  1.03it/s, lr=7.13e-8, step_loss=0.00811]
Steps:  98%|█████████▊| 983/1000 [16:32<00:16,  1.03it/s, lr=7.13e-8, step_loss=0.0755] 
Steps:  98%|█████████▊| 983/1000 [16:32<00:16,  1.03it/s, lr=7.13e-8, step_loss=0.0545]
Steps:  98%|█████████▊| 983/1000 [16:32<00:16,  1.03it/s, lr=7.13e-8, step_loss=0.354] 
Steps:  98%|█████████▊| 984/1000 [16:32<00:15,  1.03it/s, lr=7.13e-8, step_loss=0.354]
Steps:  98%|█████████▊| 984/1000 [16:32<00:15,  1.03it/s, lr=6.32e-8, step_loss=0.0414]
Steps:  98%|█████████▊| 984/1000 [16:33<00:15,  1.03it/s, lr=6.32e-8, step_loss=0.0262]
Steps:  98%|█████████▊| 984/1000 [16:33<00:15,  1.03it/s, lr=6.32e-8, step_loss=0.118] 
Steps:  98%|█████████▊| 984/1000 [16:33<00:15,  1.03it/s, lr=6.32e-8, step_loss=0.0093]
Steps:  98%|█████████▊| 985/1000 [16:33<00:14,  1.03it/s, lr=6.32e-8, step_loss=0.0093]
Steps:  98%|█████████▊| 985/1000 [16:33<00:14,  1.03it/s, lr=5.55e-8, step_loss=0.0521]
Steps:  98%|█████████▊| 985/1000 [16:34<00:14,  1.03it/s, lr=5.55e-8, step_loss=0.0949]
Steps:  98%|█████████▊| 985/1000 [16:34<00:14,  1.03it/s, lr=5.55e-8, step_loss=0.769] 
Steps:  98%|█████████▊| 985/1000 [16:34<00:14,  1.03it/s, lr=5.55e-8, step_loss=0.0525]
Steps:  99%|█████████▊| 986/1000 [16:34<00:13,  1.03it/s, lr=5.55e-8, step_loss=0.0525]
Steps:  99%|█████████▊| 986/1000 [16:34<00:13,  1.03it/s, lr=4.84e-8, step_loss=0.0669]
Steps:  99%|█████████▊| 986/1000 [16:34<00:13,  1.03it/s, lr=4.84e-8, step_loss=0.0745]
Steps:  99%|█████████▊| 986/1000 [16:35<00:13,  1.03it/s, lr=4.84e-8, step_loss=0.324] 
Steps:  99%|█████████▊| 986/1000 [16:35<00:13,  1.03it/s, lr=4.84e-8, step_loss=0.042]
Steps:  99%|█████████▊| 987/1000 [16:35<00:12,  1.03it/s, lr=4.84e-8, step_loss=0.042]
Steps:  99%|█████████▊| 987/1000 [16:35<00:12,  1.03it/s, lr=4.17e-8, step_loss=0.0518]
Steps:  99%|█████████▊| 987/1000 [16:35<00:12,  1.03it/s, lr=4.17e-8, step_loss=0.0544]
Steps:  99%|█████████▊| 987/1000 [16:36<00:12,  1.03it/s, lr=4.17e-8, step_loss=0.0251]
Steps:  99%|█████████▊| 987/1000 [16:36<00:12,  1.03it/s, lr=4.17e-8, step_loss=0.396] 
Steps:  99%|█████████▉| 988/1000 [16:36<00:11,  1.03it/s, lr=4.17e-8, step_loss=0.396]
Steps:  99%|█████████▉| 988/1000 [16:36<00:11,  1.03it/s, lr=3.55e-8, step_loss=0.00794]
Steps:  99%|█████████▉| 988/1000 [16:36<00:11,  1.03it/s, lr=3.55e-8, step_loss=0.0364] 
Steps:  99%|█████████▉| 988/1000 [16:37<00:11,  1.03it/s, lr=3.55e-8, step_loss=0.00406]
Steps:  99%|█████████▉| 988/1000 [16:37<00:11,  1.03it/s, lr=3.55e-8, step_loss=0.129]  
Steps:  99%|█████████▉| 989/1000 [16:37<00:10,  1.03it/s, lr=3.55e-8, step_loss=0.129]
Steps:  99%|█████████▉| 989/1000 [16:37<00:10,  1.03it/s, lr=2.99e-8, step_loss=0.0122]
Steps:  99%|█████████▉| 989/1000 [16:37<00:10,  1.03it/s, lr=2.99e-8, step_loss=0.0152]
Steps:  99%|█████████▉| 989/1000 [16:38<00:10,  1.03it/s, lr=2.99e-8, step_loss=0.0109]
Steps:  99%|█████████▉| 989/1000 [16:38<00:10,  1.03it/s, lr=2.99e-8, step_loss=0.161] 
Steps:  99%|█████████▉| 990/1000 [16:38<00:09,  1.03it/s, lr=2.99e-8, step_loss=0.161]
Steps:  99%|█████████▉| 990/1000 [16:38<00:09,  1.03it/s, lr=2.47e-8, step_loss=0.216]
Steps:  99%|█████████▉| 990/1000 [16:38<00:09,  1.03it/s, lr=2.47e-8, step_loss=0.177]
Steps:  99%|█████████▉| 990/1000 [16:39<00:09,  1.03it/s, lr=2.47e-8, step_loss=0.319]
Steps:  99%|█████████▉| 990/1000 [16:39<00:09,  1.03it/s, lr=2.47e-8, step_loss=0.0181]
Steps:  99%|█████████▉| 991/1000 [16:39<00:08,  1.03it/s, lr=2.47e-8, step_loss=0.0181]
Steps:  99%|█████████▉| 991/1000 [16:39<00:08,  1.03it/s, lr=2e-8, step_loss=0.398]    
Steps:  99%|█████████▉| 991/1000 [16:39<00:08,  1.03it/s, lr=2e-8, step_loss=0.00724]
Steps:  99%|█████████▉| 991/1000 [16:40<00:08,  1.03it/s, lr=2e-8, step_loss=0.0849] 
Steps:  99%|█████████▉| 991/1000 [16:40<00:08,  1.03it/s, lr=2e-8, step_loss=0.117] 
Steps:  99%|█████████▉| 992/1000 [16:40<00:07,  1.03it/s, lr=2e-8, step_loss=0.117]
Steps:  99%|█████████▉| 992/1000 [16:40<00:07,  1.03it/s, lr=1.58e-8, step_loss=0.00399]
Steps:  99%|█████████▉| 992/1000 [16:40<00:07,  1.03it/s, lr=1.58e-8, step_loss=0.018]  
Steps:  99%|█████████▉| 992/1000 [16:41<00:07,  1.03it/s, lr=1.58e-8, step_loss=0.0505]
Steps:  99%|█████████▉| 992/1000 [16:41<00:07,  1.03it/s, lr=1.58e-8, step_loss=0.0109]
Steps:  99%|█████████▉| 993/1000 [16:41<00:06,  1.03it/s, lr=1.58e-8, step_loss=0.0109]
Steps:  99%|█████████▉| 993/1000 [16:41<00:06,  1.03it/s, lr=1.21e-8, step_loss=0.0295]
Steps:  99%|█████████▉| 993/1000 [16:41<00:06,  1.03it/s, lr=1.21e-8, step_loss=0.115] 
Steps:  99%|█████████▉| 993/1000 [16:42<00:06,  1.03it/s, lr=1.21e-8, step_loss=0.0539]
Steps:  99%|█████████▉| 993/1000 [16:42<00:06,  1.03it/s, lr=1.21e-8, step_loss=0.00345]
Steps:  99%|█████████▉| 994/1000 [16:42<00:05,  1.03it/s, lr=1.21e-8, step_loss=0.00345]
Steps:  99%|█████████▉| 994/1000 [16:42<00:05,  1.03it/s, lr=8.88e-9, step_loss=0.0115] 
Steps:  99%|█████████▉| 994/1000 [16:42<00:05,  1.03it/s, lr=8.88e-9, step_loss=0.115] 
Steps:  99%|█████████▉| 994/1000 [16:43<00:05,  1.03it/s, lr=8.88e-9, step_loss=0.0306]
Steps:  99%|█████████▉| 994/1000 [16:43<00:05,  1.03it/s, lr=8.88e-9, step_loss=0.0234]
Steps: 100%|█████████▉| 995/1000 [16:43<00:04,  1.03it/s, lr=8.88e-9, step_loss=0.0234]
Steps: 100%|█████████▉| 995/1000 [16:43<00:04,  1.03it/s, lr=6.17e-9, step_loss=0.356] 
Steps: 100%|█████████▉| 995/1000 [16:43<00:04,  1.03it/s, lr=6.17e-9, step_loss=0.00621]
Steps: 100%|█████████▉| 995/1000 [16:43<00:04,  1.03it/s, lr=6.17e-9, step_loss=0.00765]
Steps: 100%|█████████▉| 995/1000 [16:44<00:04,  1.03it/s, lr=6.17e-9, step_loss=0.00333]
Steps: 100%|█████████▉| 996/1000 [16:44<00:03,  1.03it/s, lr=6.17e-9, step_loss=0.00333]
Steps: 100%|█████████▉| 996/1000 [16:44<00:03,  1.03it/s, lr=3.95e-9, step_loss=0.285]  
Steps: 100%|█████████▉| 996/1000 [16:44<00:03,  1.03it/s, lr=3.95e-9, step_loss=0.0616]
Steps: 100%|█████████▉| 996/1000 [16:44<00:03,  1.03it/s, lr=3.95e-9, step_loss=0.0666]
Steps: 100%|█████████▉| 996/1000 [16:45<00:03,  1.03it/s, lr=3.95e-9, step_loss=0.0903]
Steps: 100%|█████████▉| 997/1000 [16:45<00:02,  1.03it/s, lr=3.95e-9, step_loss=0.0903]
Steps: 100%|█████████▉| 997/1000 [16:45<00:02,  1.03it/s, lr=2.22e-9, step_loss=0.147] 
Steps: 100%|█████████▉| 997/1000 [16:45<00:02,  1.03it/s, lr=2.22e-9, step_loss=0.251]
Steps: 100%|█████████▉| 997/1000 [16:45<00:02,  1.03it/s, lr=2.22e-9, step_loss=0.141]
Steps: 100%|█████████▉| 997/1000 [16:46<00:02,  1.03it/s, lr=2.22e-9, step_loss=0.214]
Steps: 100%|█████████▉| 998/1000 [16:46<00:01,  1.03it/s, lr=2.22e-9, step_loss=0.214]
Steps: 100%|█████████▉| 998/1000 [16:46<00:01,  1.03it/s, lr=9.87e-10, step_loss=0.0135]
Steps: 100%|█████████▉| 998/1000 [16:46<00:01,  1.03it/s, lr=9.87e-10, step_loss=0.0968]
Steps: 100%|█████████▉| 998/1000 [16:46<00:01,  1.03it/s, lr=9.87e-10, step_loss=0.0224]
Steps: 100%|█████████▉| 998/1000 [16:47<00:01,  1.03it/s, lr=9.87e-10, step_loss=0.00573]
Steps: 100%|█████████▉| 999/1000 [16:47<00:00,  1.03it/s, lr=9.87e-10, step_loss=0.00573]
Steps: 100%|█████████▉| 999/1000 [16:47<00:00,  1.03it/s, lr=2.47e-10, step_loss=0.0665] 
Steps: 100%|█████████▉| 999/1000 [16:47<00:00,  1.03it/s, lr=2.47e-10, step_loss=0.206] 
Steps: 100%|█████████▉| 999/1000 [16:47<00:00,  1.03it/s, lr=2.47e-10, step_loss=0.0507]
Steps: 100%|█████████▉| 999/1000 [16:48<00:00,  1.03it/s, lr=2.47e-10, step_loss=0.0978]
Steps: 100%|██████████| 1000/1000 [16:48<00:00,  1.03it/s, lr=2.47e-10, step_loss=0.0978]
Steps: 100%|██████████| 1000/1000 [16:48<00:00,  1.03it/s, lr=0, step_loss=0.0862]
{'image_encoder', 'requires_safety_checker'} was not found in config. Values will be initialized to default values.
Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
{'timestep_spacing', 'prediction_type'} was not found in config. Values will be initialized to default values.
Loaded scheduler as PNDMScheduler from `scheduler` subfolder of runwayml/stable-diffusion-v1-5.
Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  43%|████▎     | 3/7 [00:00<00:00, 17.89it/s]
Loaded feature_extractor as CLIPImageProcessor from `feature_extractor` subfolder of runwayml/stable-diffusion-v1-5.
{'force_upcast', 'scaling_factor', 'use_post_quant_conv', 'use_quant_conv', 'latents_mean', 'latents_std', 'shift_factor'} was not found in config. Values will be initialized to default values.
Loaded vae as AutoencoderKL from `vae` subfolder of runwayml/stable-diffusion-v1-5.
Loaded safety_checker as StableDiffusionSafetyChecker from `safety_checker` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  86%|████████▌ | 6/7 [00:00<00:00, 13.86it/s]
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 15.05it/s]
07/28/2024 20:52:50 - INFO - __main__ - Running validation...
Generating 4 images with prompt: A naruto with blue eyes..
Model weights saved in /tmp/train-t2i-lora/pytorch_lora_weights.safetensors
{'image_encoder', 'requires_safety_checker'} was not found in config. Values will be initialized to default values.
Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
{'encoder_hid_dim', 'conv_out_kernel', 'upcast_attention', 'num_attention_heads', 'mid_block_only_cross_attention', 'cross_attention_norm', 'conv_in_kernel', 'addition_embed_type', 'addition_time_embed_dim', 'num_class_embeds', 'mid_block_type', 'time_embedding_type', 'encoder_hid_dim_type', 'only_cross_attention', 'transformer_layers_per_block', 'dual_cross_attention', 'reverse_transformer_layers_per_block', 'attention_type', 'time_embedding_act_fn', 'class_embed_type', 'resnet_time_scale_shift', 'projection_class_embeddings_input_dim', 'time_cond_proj_dim', 'timestep_post_act', 'class_embeddings_concat', 'resnet_out_scale_factor', 'resnet_skip_time_act', 'use_linear_projection', 'addition_embed_type_num_heads', 'time_embedding_dim', 'dropout'} was not found in config. Values will be initialized to default values.
Loaded unet as UNet2DConditionModel from `unet` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  14%|█▍        | 1/7 [00:00<00:00,  7.28it/s]
{'timestep_spacing', 'prediction_type'} was not found in config. Values will be initialized to default values.
Loaded scheduler as PNDMScheduler from `scheduler` subfolder of runwayml/stable-diffusion-v1-5.
Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  43%|████▎     | 3/7 [00:00<00:00, 10.62it/s]
Loaded feature_extractor as CLIPImageProcessor from `feature_extractor` subfolder of runwayml/stable-diffusion-v1-5.
{'force_upcast', 'scaling_factor', 'use_post_quant_conv', 'use_quant_conv', 'latents_mean', 'latents_std', 'shift_factor'} was not found in config. Values will be initialized to default values.
Loaded vae as AutoencoderKL from `vae` subfolder of runwayml/stable-diffusion-v1-5.
Loaded safety_checker as StableDiffusionSafetyChecker from `safety_checker` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...:  86%|████████▌ | 6/7 [00:00<00:00, 11.19it/s]
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 11.70it/s]
Loading unet.
07/28/2024 20:52:59 - INFO - __main__ - Running validation...
Generating 4 images with prompt: A naruto with blue eyes..
Steps: 100%|██████████| 1000/1000 [17:06<00:00,  1.03s/it, lr=0, step_loss=0.0862]
./
./pytorch_lora_weights.safetensors
tar: ./lora.tar: file is the archive; not dumped
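
Note on the logged learning rates: the lr values in the final progress lines above (1.21e-8 at step 993 down to 0 at step 1000) are consistent with a single half-cosine decay from the configured learning_rate of 0.0001 over max_train_steps of 1000, with no warmup. The sketch below reproduces those numbers under that assumption; the function name cosine_lr is hypothetical and this is not the training script's own code.

```python
import math


def cosine_lr(step, total_steps=1000, base_lr=1e-4):
    """Half-cosine decay from base_lr to 0 (assumes no warmup steps)."""
    progress = step / total_steps
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Compare against the lr values printed in the last few log lines.
for step in (993, 994, 995, 996, 997, 998, 999, 1000):
    print(f"step {step}: lr={cosine_lr(step):.3g}")
```

Because the cosine curve flattens toward zero at the end, the last few optimizer steps change the LoRA weights only marginally.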
Version Details
Version ID
aa769732ccfc922fb77ec36c1bd3680208394f1770402db52681fc8556e9356a
Version Created
July 28, 2024