diffengine.models.editors.lcm.lcm_xl

Module Contents

Classes

LatentConsistencyModelsXL

Stable Diffusion XL Latent Consistency Models.

class diffengine.models.editors.lcm.lcm_xl.LatentConsistencyModelsXL(*args, timesteps_generator=None, num_ddim_timesteps=50, w_min=3.0, w_max=15.0, ema_type='ExponentialMovingAverage', ema_momentum=0.05, **kwargs)[source]

Bases: diffengine.models.editors.stable_diffusion_xl.StableDiffusionXL

Stable Diffusion XL Latent Consistency Models.

Args:

timesteps_generator (dict, optional): The timesteps generator config. Defaults to dict(type='DDIMTimeSteps').

num_ddim_timesteps (int): Number of DDIM timesteps. Defaults to 50.

w_min (float): Minimum guidance scale. Defaults to 3.0.

w_max (float): Maximum guidance scale. Defaults to 15.0.

ema_type (str): The type of EMA. Defaults to ‘ExponentialMovingAverage’.

ema_momentum (float): The EMA momentum. Defaults to 0.05.
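As a rough illustration of how the constructor arguments fit together, the sketch below writes them out as a registry-style config dict. The `type` key and the idea of building the model from a dict are assumptions, suggested by the `dict(type='DDIMTimeSteps')` default; arguments required by the `StableDiffusionXL` base class (passed through `*args` / `**kwargs`) are not shown.

```python
# Hypothetical config sketch: keys mirror the constructor signature above.
# Base-class arguments are intentionally omitted.
lcm_xl_cfg = dict(
    type="LatentConsistencyModelsXL",  # assumed registry name (the class name)
    timesteps_generator=dict(type="DDIMTimeSteps"),
    num_ddim_timesteps=50,
    w_min=3.0,
    w_max=15.0,
    ema_type="ExponentialMovingAverage",
    ema_momentum=0.05,
)

# Basic sanity check: the guidance-scale range must be non-empty.
assert lcm_xl_cfg["w_min"] <= lcm_xl_cfg["w_max"]
```

Every value shown is the documented default, so a config that omits these keys should behave the same way.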

prepare_model()[source]

Prepare model for training.

Disable gradient for some models.

Return type:

None

set_xformers()[source]

Set xformers for model.

Return type:

None

infer(prompt, height=None, width=None, num_inference_steps=4, guidance_scale=1.0, output_type='pil', **kwargs)[source]

Inference function.

Args:
prompt (List[str]):

The prompt or prompts to guide the image generation.

negative_prompt (Optional[str]):

The prompt or prompts not to guide the image generation. Defaults to None.

height (int, optional):

The height in pixels of the generated image. Defaults to None.

width (int, optional):

The width in pixels of the generated image. Defaults to None.

num_inference_steps (int): Number of inference steps. Defaults to 4.

guidance_scale (float): The guidance scale. Defaults to 1.0.

output_type (str): The output format of the generated image. Choose between ‘pil’ and ‘latent’. Defaults to ‘pil’.

**kwargs: Other arguments.

Parameters:
  • prompt (list[str]) –

  • height (int | None) –

  • width (int | None) –

  • num_inference_steps (int) –

  • guidance_scale (float) –

  • output_type (str) –

Return type:

list[numpy.ndarray]
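The call pattern for `infer` can be sketched without loading any weights. The example below is a minimal sketch, not the library's implementation: `StubLCM` and `generate` are hypothetical names, and the stub only copies the documented signature and return type so the snippet runs on its own. With a real model instance, the same `generate` call would apply.

```python
import numpy as np


class StubLCM:
    """Hypothetical stand-in exposing the `infer` signature documented above."""

    def infer(self, prompt, height=None, width=None, num_inference_steps=4,
              guidance_scale=1.0, output_type="pil", **kwargs):
        h, w = height or 64, width or 64
        # One blank RGB array per prompt, matching list[numpy.ndarray].
        return [np.zeros((h, w, 3), dtype=np.uint8) for _ in prompt]


def generate(model, prompts, height=None, width=None):
    """Run few-step inference using the documented default settings."""
    return model.infer(
        prompts,
        height=height,
        width=width,
        num_inference_steps=4,  # documented default
        guidance_scale=1.0,     # documented default
        output_type="pil",
    )


images = generate(
    StubLCM(),
    ["a photo of a cat", "a watercolor landscape"],
    height=32,
    width=48,
)
print(len(images), images[0].shape)
```

Note that `prompt` is a list, and the result holds one array per prompt.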

loss(model_pred, gt, timesteps, weight=None)[source]

Calculate loss.

Parameters:
  • model_pred (torch.Tensor) –

  • gt (torch.Tensor) –

  • timesteps (torch.Tensor) –

  • weight (torch.Tensor | None) –

Return type:

dict[str, torch.Tensor]

forward(inputs, data_samples=None, mode='loss')[source]

Forward function.

Args:

inputs (dict): The input dict.

data_samples (Optional[list], optional): The data samples. Defaults to None.

mode (str, optional): The mode. Defaults to “loss”.

Returns:

dict: The loss dict.

Parameters:
  • inputs (dict) –

  • data_samples (Optional[list]) –

  • mode (str) –

Return type:

dict

_predicted_origin(model_output, timesteps, sample)[source]

Predict the original sample (x_0) from the model output, the timesteps, and the noisy sample.

Args:

model_output (torch.Tensor): The model output.

timesteps (torch.Tensor): The timesteps.

sample (torch.Tensor): The sample.

Parameters:
  • model_output (torch.Tensor) –

  • timesteps (torch.Tensor) –

  • sample (torch.Tensor) –

Return type:

torch.Tensor
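For intuition, the usual way to recover an estimate of the clean sample under epsilon-prediction is to invert the forward relation x_t = alpha_t * x_0 + sigma_t * eps. The sketch below is written with NumPy rather than torch so it runs standalone, and it assumes epsilon-prediction; the extra `alphas_cumprod` argument and the toy schedule are hypothetical and are not part of the documented signature.

```python
import numpy as np


def predicted_origin(model_output, timesteps, sample, alphas_cumprod):
    """Estimate x_0 from a noisy sample, assuming epsilon-prediction.

    `alphas_cumprod` is a hypothetical 1-D array of cumulative alpha
    products indexed by timestep.
    """
    a_bar = alphas_cumprod[timesteps]
    # Reshape per-sample scalars to broadcast over (batch, C, H, W).
    alpha_t = np.sqrt(a_bar).reshape(-1, 1, 1, 1)
    sigma_t = np.sqrt(1.0 - a_bar).reshape(-1, 1, 1, 1)
    # Invert x_t = alpha_t * x_0 + sigma_t * eps for x_0.
    return (sample - sigma_t * model_output) / alpha_t


# Round-trip check on synthetic data with a toy (not real) schedule.
rng = np.random.default_rng(0)
alphas_cumprod = np.linspace(0.999, 0.01, 1000)
x0 = rng.standard_normal((2, 4, 8, 8))
eps = rng.standard_normal((2, 4, 8, 8))
t = np.array([19, 499])
a = np.sqrt(alphas_cumprod[t]).reshape(-1, 1, 1, 1)
s = np.sqrt(1.0 - alphas_cumprod[t]).reshape(-1, 1, 1, 1)
x_t = a * x0 + s * eps
recovered = predicted_origin(eps, t, x_t, alphas_cumprod)
```

When the model output equals the true noise, `recovered` matches `x0` up to floating-point error.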

Parameters:
  • timesteps_generator (dict | None) –

  • num_ddim_timesteps (int) –

  • w_min (float) –

  • w_max (float) –

  • ema_type (str) –

  • ema_momentum (float) –
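To give a feel for what `ema_momentum=0.05` means, here is a minimal sketch of an exponential moving average over parameters. It assumes the convention in which the momentum weights the incoming values, so a small momentum makes the averaged copy change slowly; `ema_update` is a hypothetical helper, not a function of this module, and plain Python floats stand in for tensors.

```python
def ema_update(averaged, source, momentum=0.05):
    """One EMA step: move each averaged value a fraction `momentum` toward source."""
    return [(1.0 - momentum) * a + momentum * s for a, s in zip(averaged, source)]


ema_params = [0.0, 1.0]      # slowly updated copy
online_params = [1.0, 1.0]   # parameters currently being trained

for _ in range(3):
    ema_params = ema_update(ema_params, online_params)
```

After three steps the first averaged value has covered 1 - 0.95**3 of the gap to the online value, roughly 14%.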