diffengine.models.editors.ip_adapter.pipeline

Module Contents

Classes

StableDiffusionXLPipelineCustomIPAdapter

Custom IP Adapter for the StableDiffusionXLPipeline class.

StableDiffusionXLPipelineTimmIPAdapter

Timm IP Adapter for the StableDiffusionXLPipeline class.

class diffengine.models.editors.ip_adapter.pipeline.StableDiffusionXLPipelineCustomIPAdapter(vae, text_encoder, text_encoder_2, tokenizer, tokenizer_2, unet, scheduler, image_encoder=None, feature_extractor=None, force_zeros_for_empty_prompt=True, add_watermarker=None, hidden_states_idx=-2)

Bases: diffusers.StableDiffusionXLPipeline

Custom IP Adapter for the StableDiffusionXLPipeline class.

Unlike the original StableDiffusionXLPipeline, this class encodes the image using the hidden states taken from the image encoder layer selected by hidden_states_idx.

Parameters:
  • *args – Variable length argument list.

  • hidden_states_idx (int) – Index of the hidden states to be used. Defaults to -2.

  • **kwargs – Arbitrary keyword arguments.
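As a rough illustration of what hidden_states_idx selects, the sketch below uses a stand-in encoder that returns one output per layer. ToyEncoder and select_hidden_state are hypothetical names invented for this example; they are not part of diffengine or diffusers, and the real image encoder works on tensors rather than lists. With the default of -2, the second-to-last entry is picked.

```python
# Hypothetical stand-in for an image encoder; not part of diffengine.
class ToyEncoder:
    def __init__(self, num_layers):
        self.num_layers = num_layers

    def hidden_states(self, image):
        # Entry 0 is the input embedding, followed by one entry per layer.
        states = [list(image)]
        for layer in range(1, self.num_layers + 1):
            # Toy "layer": shift every value by the layer number.
            states.append([x + layer for x in states[0]])
        return states


def select_hidden_state(encoder, image, hidden_states_idx=-2):
    """Return the hidden state stored at hidden_states_idx."""
    return encoder.hidden_states(image)[hidden_states_idx]


encoder = ToyEncoder(num_layers=4)
# Default index -2 selects the output of the penultimate layer (layer 3).
penultimate = select_hidden_state(encoder, [0.0, 1.0])
# Index -1 selects the output of the final layer (layer 4).
last = select_hidden_state(encoder, [0.0, 1.0], hidden_states_idx=-1)
```

Negative indices count from the end of the list of hidden states, so the same value of hidden_states_idx works for encoders of different depth.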

encode_image(image, device, num_images_per_prompt, output_hidden_states=None)

Encodes the image.

Parameters:
  • image – The input image to be encoded.

  • device – The device to be used for encoding.

  • num_images_per_prompt – The number of images per prompt.

  • output_hidden_states – Whether to output hidden states. Defaults to None.

Returns:

A tuple (image_enc_hidden_states, uncond_image_enc_hidden_states): the encoded hidden states of the image and the encoded hidden states of the unconditional image.

Return type:

tuple
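The two return values can be pictured with a minimal list-based sketch. It assumes, as the upstream diffusers pipeline does, that the unconditional branch encodes an all-zero image and that both results are repeated num_images_per_prompt times along the batch dimension; whether this class follows exactly the same steps is an assumption. toy_encode, repeat_interleave, and encode_image_sketch are hypothetical helpers written for this example only.

```python
def repeat_interleave(batch, repeats):
    """Repeat each batch item `repeats` times, keeping copies adjacent."""
    return [item for item in batch for _ in range(repeats)]


def toy_encode(image):
    # Hypothetical placeholder for the real image encoder.
    return [2.0 * x for x in image]


def encode_image_sketch(images, num_images_per_prompt):
    # Conditional branch: encode the actual images.
    cond = [toy_encode(img) for img in images]
    # Assumption: the unconditional branch encodes an all-zero image.
    uncond = [toy_encode([0.0] * len(img)) for img in images]
    return (
        repeat_interleave(cond, num_images_per_prompt),
        repeat_interleave(uncond, num_images_per_prompt),
    )


cond, uncond = encode_image_sketch(
    [[1.0, 2.0], [3.0, 4.0]], num_images_per_prompt=2
)
```

Both outputs end up with batch size len(images) * num_images_per_prompt, which is what lets them be paired for classifier-free guidance.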

class diffengine.models.editors.ip_adapter.pipeline.StableDiffusionXLPipelineTimmIPAdapter(vae, text_encoder, text_encoder_2, tokenizer, tokenizer_2, unet, scheduler, image_encoder=None, feature_extractor=None, force_zeros_for_empty_prompt=True, add_watermarker=None)

Bases: diffusers.StableDiffusionXLPipeline

Timm IP Adapter for the StableDiffusionXLPipeline class.

Unlike the original StableDiffusionXLPipeline, this class uses an image encoder from the timm library.

Parameters:
  • *args – Variable length argument list.

  • **kwargs – Arbitrary keyword arguments.

property _execution_device

Returns the device on which the pipeline’s models will be executed. After calling DiffusionPipeline.enable_sequential_cpu_offload(), the execution device can only be inferred from Accelerate’s module hooks.

encode_image(image, device, num_images_per_prompt, output_hidden_states=None)

Encodes the image.

Parameters:
  • image – The input image to be encoded.

  • device – The device to be used for encoding.

  • num_images_per_prompt – The number of images per prompt.

  • output_hidden_states – Whether to output hidden states. Defaults to None.

Returns:

A tuple (image_enc_hidden_states, uncond_image_enc_hidden_states): the encoded hidden states of the image and the encoded hidden states of the unconditional image.

Return type:

tuple