Enhancement on VRAM (and maybe RAM?) handling between samplers on LTX 2 comfy native workflow

### Custom Node Testing

- [ ] I have tried disabling custom nodes and the issue persists (see [how to disable custom nodes](https://docs.comfy.org/troubleshooting/custom-node-issues#step-1%3A-test-with-all-custom-nodes-disabled) if you need help)

### Expected Behavior

Video generation at 1080p, 25 fps, and more than 200 frames should complete without OOM across both samplers.

### Actual Behavior

At 1080p, 25 fps, and more than 200 frames, VRAM usage spikes and causes an OOM when execution moves from the first sampling pass to the second one (the 2x upscale step).

### Steps to Reproduce

I'm using the native ComfyUI I2V LTX 2 distilled workflow from the templates browser, with nothing changed.

The command I use to launch ComfyUI is: `comfy launch -- --use-sage-attention --listen --novram --cache-none --disable-smart-memory --preview-method taesd`

Machine specs: 16GB VRAM, 32GB RAM, 64GB swap file, swappiness set to 6. I'm running Linux headless (no desktop environment, operated remotely over SSH only).

The versions are as follows (I'm on the nightly channel):

<img width="658" height="789" alt="Image" src="https://github.com/user-attachments/assets/c31e5ffe-9318-4f67-b66c-e553a73dc442" />


At first I thought ComfyUI didn't have room to move data from VRAM to RAM, so I also tried adding the `--disable-pinned-memory` flag. It drastically reduces RAM usage during generation, but I still get the OOM (VRAM spikes when the second sampling pass starts).

Then I experimented with splitting the run between the two samplers:
1. Run the first sampling pass and save all the latents to file. There are three latents: audio, video before the `LTXVCropGuides` node, and video after the `LTXVImgToVideoInplace` node.
2. After all latents are saved, invert the bypassed nodes and plug in the saved latents. It works! The video was generated successfully without OOM, with VRAM usage hovering around 50%.
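
The two steps above amount to checkpointing the intermediate latents to disk so that the second pass starts from a clean memory state. A minimal sketch of that pattern follows. It uses plain Python lists as stand-ins for the real latent tensors, and the key names and helper functions are hypothetical labels, not ComfyUI APIs:

```python
import pickle
import tempfile
from pathlib import Path

# The three latents named in the steps above; the keys are hypothetical labels.
LATENT_NAMES = ("audio", "video_before_crop_guides", "video_after_img_to_video")


def save_latents(latents: dict, directory: Path) -> None:
    """Write each latent to its own file so the first pass can end and release memory."""
    directory.mkdir(parents=True, exist_ok=True)
    for name in LATENT_NAMES:
        with open(directory / f"{name}.pkl", "wb") as fh:
            pickle.dump(latents[name], fh)


def load_latents(directory: Path) -> dict:
    """Read the latents back at the start of the second pass."""
    loaded = {}
    for name in LATENT_NAMES:
        with open(directory / f"{name}.pkl", "rb") as fh:
            loaded[name] = pickle.load(fh)
    return loaded


def two_pass_roundtrip(latents: dict) -> dict:
    """Save after pass one, then reload as pass two would."""
    with tempfile.TemporaryDirectory() as tmp:
        save_latents(latents, Path(tmp))
        return load_latents(Path(tmp))
```

In the actual workflow the files would be written and read by whatever latent save and load nodes the graph uses; the point of the sketch is only that nothing from the first pass has to stay resident while the second pass runs.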

Based on that experiment, I think there is a possibility for improvement in VRAM handling, or maybe not. I'll leave that to the contributors who know the inner workings of VRAM handling for LTX 2 in ComfyUI.

This issue only surfaces when I try to generate 1080p video with more than 200 frames at 25 fps. 1080p with 200 frames at the same fps works fine without the workaround.
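
A rough way to see why the second pass is the one that tips over is to count latent tokens per pass. The sketch below assumes compression factors that are not stated in this issue (32x per spatial axis, 8x along time, with the first frame encoded on its own), and it assumes the first pass runs at half the output resolution because the second pass is a 2x upscale. Treat the numbers as an estimate, not a measurement:

```python
import math

# Assumed compression factors (not stated in the issue).
SPATIAL_FACTOR = 32
TEMPORAL_FACTOR = 8


def latent_tokens(width: int, height: int, frames: int) -> int:
    """Estimate the number of video latent tokens for one sampling pass."""
    cols = math.ceil(width / SPATIAL_FACTOR)
    rows = math.ceil(height / SPATIAL_FACTOR)
    latent_frames = (frames - 1) // TEMPORAL_FACTOR + 1
    return cols * rows * latent_frames


def stage_report(width: int, height: int, frames: int) -> dict:
    """Compare the half-resolution first pass with the full-resolution second pass."""
    first = latent_tokens(width // 2, height // 2, frames)
    second = latent_tokens(width, height, frames)
    return {"first_pass": first, "second_pass": second, "ratio": second / first}


print(stage_report(1920, 1080, 201))
# {'first_pass': 13260, 'second_pass': 53040, 'ratio': 4.0}
```

Under these assumptions the second pass handles about four times as many tokens as the first, so any activation memory that grows with token count grows by at least that factor at the upscale step.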

### Debug Logs

```powershell
Requested to load LTXAV
loaded partially; 0.00 MB usable, 0.00 MB loaded, 20541.27 MB offloaded, 448.07 MB buffer reserved, lowvram patches: 0
  0%|                                                                                                                                           | 0/3 [00:02<?, ?it/s]
!!! Exception during processing !!! Allocation on device
Traceback (most recent call last):
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/execution.py", line 518, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/execution.py", line 329, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/execution.py", line 303, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/execution.py", line 291, in process_inputs
    result = f(**inputs)
             ^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy_api/internal/__init__.py", line 149, in wrapped_func
    return method(locked_class, **inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy_api/latest/_io.py", line 1570, in EXECUTE_NORMALIZED
    to_return = cls.execute(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy_extras/nodes_custom_sampler.py", line 950, in execute
    samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 1050, in sample
    output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 994, in outer_sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 980, in inner_sample
    samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 752, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/k_diffusion/sampling.py", line 1465, in sample_gradient_estimation
    denoised = model(x, sigmas[i] * s_in, **extra_args)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 401, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 953, in __call__
    return self.outer_predict_noise(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 960, in outer_predict_noise
    ).execute(x, timestep, model_options, seed)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 963, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 381, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 206, in calc_cond_batch
    return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 214, in _calc_cond_batch_outer
    return executor.execute(model, conds, x_in, timestep, model_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/samplers.py", line 326, in _calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/model_base.py", line 163, in apply_model
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/model_base.py", line 205, in _apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/ldm/lightricks/av_model.py", line 828, in forward
    return super().forward(
           ^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/ldm/lightricks/model.py", line 745, in forward
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/ldm/lightricks/model.py", line 792, in _forward
    x = self._process_transformer_blocks(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/ldm/lightricks/av_model.py", line 745, in _process_transformer_blocks
    vx, ax = block(
             ^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yusrmuttaqien/Documents/ComfyUI/py-3.12/bin/comfy/ldm/lightricks/av_model.py", line 257, in forward
    self.audio_to_video_attn(
torch.OutOfMemoryError: Allocation on device

Memory summary: |===========================================================================|
|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|===========================================================================|
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      |  13507 MiB |  14926 MiB |      0 B   |      0 B   |
|       from large pool |      0 MiB |      0 MiB |      0 B   |      0 B   |
|       from small pool |      0 MiB |      0 MiB |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Active memory         |  13507 MiB |  14926 MiB |      0 B   |      0 B   |
|       from large pool |      0 MiB |      0 MiB |      0 B   |      0 B   |
|       from small pool |      0 MiB |      0 MiB |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Requested memory      |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| GPU reserved memory   |  15712 MiB |  15712 MiB |      0 B   |      0 B   |
|       from large pool |      0 MiB |      0 MiB |      0 B   |      0 B   |
|       from small pool |      0 MiB |      0 MiB |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Non-releasable memory |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Allocations           |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Active allocs         |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| GPU reserved segments |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Non-releasable allocs |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Oversize allocations  |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Oversize GPU segments |       0    |       0    |       0    |       0    |
|===========================================================================|

Got an OOM, unloading all loaded models.
```
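
For readers skimming the memory summary, the arithmetic below puts its three non-zero rows next to the free-memory figure that the `VRAMdebug` line reports in the second log of this issue. That figure comes from a different run on the same machine, so the comparison is only indicative. The function and key names are made up for this sketch:

```python
MIB = 1024 * 1024


def summarize(allocated_mib: float, peak_mib: float, reserved_mib: float,
              device_free_bytes: int) -> dict:
    """Derive a few gaps from the PyTorch memory summary rows (all values in MiB)."""
    device_free_mib = device_free_bytes / MIB
    return {
        # Reserved by the caching allocator but not backing a live tensor.
        "cached_but_unallocated_mib": reserved_mib - allocated_mib,
        # Distance between the peak allocation and what was reserved.
        "peak_to_reserved_gap_mib": reserved_mib - peak_mib,
        # What the device reported as free (idle) minus what PyTorch reserved.
        "unreserved_headroom_mib": device_free_mib - reserved_mib,
    }


report = summarize(13507, 14926, 15712, 16_481_845_248)
print(report)
# {'cached_but_unallocated_mib': 2205, 'peak_to_reserved_gap_mib': 786,
#  'unreserved_headroom_mib': 6.3125}
```

Read this way, the allocator had already reserved nearly everything the idle device reports as free, which is consistent with the failing allocation having nowhere left to grow.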

### Other

Additionally, I'm not sure what's wrong with sage attention when running the LTX 2 workflow. I got this log:

```bash
got prompt
Found quantization metadata version 1
Detected mixed precision quantization
Using mixed precision operations
model weight dtype torch.bfloat16, manual cast: torch.bfloat16
model_type FLUX
unet unexpected: ['audio_embeddings_connector.learnable_registers', ...<and many more>..., 'video_embeddings_connector.transformer_1d_blocks.1.ff.net.2.weight']
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
no CLIP/text encoder weights in checkpoint, the text encoder model will not be loaded.
Requested to load VideoVAE
loaded partially; 0.00 MB usable, 0.00 MB loaded, 2378.23 MB offloaded, 648.02 MB buffer reserved, lowvram patches: 0
clip missing: ['gemma3_12b.logit_scale', ...<and many more>... , 'gemma3_12b.transformer.vision_model.post_layernorm.bias']
Requested to load LTXAVTEModel_
loaded completely; 95367431640625005117571072.00 MB usable, 25965.49 MB loaded, full load: True
CLIP/text encoder model load device: cpu, offload device: cpu, current: cpu, dtype: torch.float16
Error running sage attention: list indices must be integers or slices, not NoneType, using pytorch attention instead.
Error running sage attention: list indices must be integers or slices, not NoneType, using pytorch attention instead.
Error running sage attention: list indices must be integers or slices, not NoneType, using pytorch attention instead.
Error running sage attention: list indices must be integers or slices, not NoneType, using pytorch attention instead.
Warning: TAESD previews enabled, but could not find models/vae_approx/None
Requested to load LTXAV
loaded partially; 0.00 MB usable, 0.00 MB loaded, 20541.27 MB offloaded, 448.07 MB buffer reserved, lowvram patches: 0
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [02:40<00:00, 20.09s/it]
Requested to load VideoVAE
loaded partially; 0.00 MB usable, 0.00 MB loaded, 2378.23 MB offloaded, 648.02 MB buffer reserved, lowvram patches: 0
VRAMdebug: free memory before:  16,481,845,248
VRAMdebug: free memory after:  16,481,845,248
VRAMdebug: freed memory:  0
Warning: TAESD previews enabled, but could not find models/vae_approx/None
```

I'm not sure whether it refuses to use sage attention for the text encoder (Gemma) or for the model itself (distilled fp8 LTX 2 19B).
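
For context on the message itself: `list indices must be integers or slices, not NoneType` is the standard Python `TypeError` raised when a list is indexed with `None`. The sketch below reproduces that message and the try-then-fall-back behaviour the log implies. All function names are hypothetical stand-ins, and it does not claim to show where the error originates inside sage attention or ComfyUI:

```python
def attention_with_fallback(primary, fallback, *args):
    """Try the primary implementation; on any error, log it and use the fallback."""
    try:
        return primary(*args)
    except Exception as exc:
        print(f"Error running primary attention: {exc}, using fallback instead.")
        return fallback(*args)


def broken_primary(shape, dim_index):
    # Hypothetical stand-in: indexing a list with None raises the TypeError from the log.
    return shape[dim_index]


def safe_fallback(shape, dim_index):
    # Hypothetical stand-in for the path that does not need the missing index.
    return shape[-1]


result = attention_with_fallback(broken_primary, safe_fallback, [1, 2, 3], None)
```

If the real cause is similar, some index or option passed to the sage attention path is `None` for this model, and the run silently continues on the PyTorch attention path each time.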

Enhancement on VRAM (and maybe RAM?) handling between samplers on LTX 2 comfy native workflow #11726