Description
Hello ExplainingAI,
First of all, thank you so much for the videos and code you shared. I have learned a lot from you.
I tried to generate new images (unconditional) using a mammography image dataset (3,100 images). I trained the VQVAE for 100 epochs, and the reconstructions were good, as you can see below.

Then I trained the diffusion part. I started with 100 epochs, but when I ran sample_ddpm_vqvae.py the output image looked like this:

I thought maybe the number of epochs in train_ddpm_vqvae.py was not enough and the model was still learning, so I increased it to 500, but I got the same result, which makes me think there is another problem. Below is the generated (denoised) image after 500 epochs:

As you can see, nothing has changed. I would like to share my configuration below and would greatly appreciate it if you could take a look and comment on the issue I'm facing. Do you think increasing the learning rate might help?
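For reference, here is what the linear noise schedule implied by my `beta_start`/`beta_end`/`num_timesteps` values looks like (a minimal numpy sketch, assuming the repo's default linear schedule; this is not the repo's code). The point being: at t=999 the latent is essentially pure noise, so if the sampler's schedule settings differ at all from the ones used in training, the output will stay noise no matter how long the model trains.

```python
import numpy as np

# Linear beta schedule with my config values
# (beta_start=0.00085, beta_end=0.012, num_timesteps=1000).
betas = np.linspace(0.00085, 0.012, 1000)
alpha_bars = np.cumprod(1.0 - betas)

# alpha_bar ~ 1 at t=0 (latent nearly untouched),
# alpha_bar ~ 0 at t=999 (latent is essentially pure noise).
print(f"alpha_bar at t=0:   {alpha_bars[0]:.6f}")
print(f"alpha_bar at t=999: {alpha_bars[-1]:.6f}")
```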
```yaml
dataset_config:
  im_path: '/cta/users/undergrad2/StableDiffusion-PyTorch/data/Breast/Breast-img'
  im_channels: 3
  im_size: 256
  name: 'Breast'

diffusion_params:
  num_timesteps: 1000
  beta_start: 0.00085
  beta_end: 0.012

ldm_params:
  down_channels: [256, 384, 512, 768]
  mid_channels: [768, 512]
  down_sample: [True, True, True]
  attn_down: [True, True, True]
  time_emb_dim: 512
  norm_channels: 32
  num_heads: 16
  conv_out_channels: 128
  num_down_layers: 2
  num_mid_layers: 2
  num_up_layers: 2
  condition_config:
    condition_types: None  # ['classes']
    class_condition_config:
      cond_drop_prob: 0.1
      class_condition_l: 11
    text_condition_config:
      text_embed_model: 'clip'
      train_text_embed_model: False
      text_embed_dim: 512
      cond_drop_prob: 0.1
    image_condition_config:
      image_condition_input_channels: 18
      image_condition_output_channels: 3
      image_condition_h: 512
      image_condition_w: 512
      cond_drop_prob: 0.1

autoencoder_params:
  z_channels: 3
  codebook_size: 8192
  down_channels: [64, 128, 256, 256]
  mid_channels: [256, 256]
  down_sample: [True, True, True]
  attn_down: [False, False, False]
  norm_channels: 32
  num_heads: 4
  num_down_layers: 2
  num_mid_layers: 2
  num_up_layers: 2
  num_headblocks: 1

train_params:
  seed: 1111
  task_name: '/cta/users/undergrad2/StableDiffusion-PyTorch/Breast_training'
  ldm_batch_size: 16
  autoencoder_batch_size: 8
  disc_start: 10
  disc_weight: 0.5
  codebook_weight: 1
  commitment_beta: 0.2
  perceptual_weight: 1
  kl_weight: 0.000005
  ldm_epochs: 500  # was 100 at first, but that didn't work
  autoencoder_epochs: 100
  num_samples: 1
  num_grid_rows: 1
  ldm_lr: 0.000005
  autoencoder_lr: 0.00001
  autoencoder_acc_steps: 4
  autoencoder_img_save_steps: 256
  use_latents: True
  save_latents: False  # if set to True during the VQVAE run, per-batch latents are saved to vqvae_latents, and you can generate from them
  cf_guidance_scale: 1.0
  load_ckpt: True
  vqvae_latent_dir_name: '/cta/users/undergrad2/StableDiffusion-PyTorch/Breast_training/vqvae_latents'
  vae_latent_dir_name: '/cta/users/undergrad2/StableDiffusion-PyTorch/Breast_training/vae_latents'
  vqvae_autoencoder_ckpt_name: '/cta/users/undergrad2/StableDiffusion-PyTorch/Breast_training/vqvae_autoencoder_ckpt.pth'
  vae_autoencoder_ckpt_name: '/cta/users/undergrad2/StableDiffusion-PyTorch/Breast_training/vae_autoencoder_ckpt.pth'
  vqvae_discriminator_ckpt_name: '/cta/users/undergrad2/StableDiffusion-PyTorch/Breast_training/vqvae_discriminator_ckpt.pth'
  vae_discriminator_ckpt_name: '/cta/users/undergrad2/StableDiffusion-PyTorch/Breast_training/vae_discriminator_ckpt.pth'
  ldm_ckpt_name: '/cta/users/undergrad2/StableDiffusion-PyTorch/Breast_training/ddpm_ckpt_epoch_495.pth'
```
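For context on the autoencoder side of the config: the VQVAE step the LDM depends on is just a nearest-neighbour lookup into the codebook. The toy numpy sketch below illustrates that lookup with the `codebook_size: 8192` and `z_channels: 3` values above (illustrative only; the shapes and random data are made up, and this is not the repo's implementation).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy codebook: 8192 entries of dimension 3, matching
# codebook_size and z_channels in autoencoder_params above.
codebook = rng.normal(size=(8192, 3))

# 16 made-up encoder outputs (flattened spatial positions).
latents = rng.normal(size=(16, 3))

# Vector quantization: replace each latent vector with its
# closest codebook entry (squared Euclidean distance).
dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
indices = dists.argmin(axis=1)
quantized = codebook[indices]
print(quantized.shape)  # (16, 3)
```

The LDM is then trained on these quantized latents, which is why a good-looking VQVAE reconstruction does not by itself guarantee the diffusion stage sees well-scaled inputs.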
Thanks a million!