Performance degradation in workflows with many plugins after upgrading from v1.4.3 to v1.11.2 #30673

@noobtimize

Self Checks

  • I have read the Contributing Guide and Language Policy.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report, otherwise it will be closed.
  • Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

I'm currently using Dify 1.4.3 and planning to upgrade to v1.11.2. During performance testing of our workflow, we observed a clear degradation in v1.11.2 compared to 1.4.3.

Please note that these tests were conducted using a mock LLM server, so LLM latency variations between requests are minimal.

  • This is the benchmark summary in v1.4.3:

--- Percentile Breakdown ---
Elapsed Time P50: 1.40 sec
Time to First Token P50: 1.11 sec
Elapsed Time P95: 1.92 sec
Time to First Token P95: 1.62 sec
Elapsed Time P99: 2.35 sec
Time to First Token P99: 1.79 sec
--- Summary for 20ccu_in_1m ---
Total Requests: 109
Successful Responses: 109
Failures: 0
Avg Latency: 1.46 sec
Avg Time to First Token: 1.18 sec
  • This is the benchmark summary in v1.11.2:
--- Percentile Breakdown ---
Elapsed Time P50: 3.23 sec
Time to First Token P50: 2.62 sec
Elapsed Time P95: 5.77 sec
Time to First Token P95: 4.62 sec
Elapsed Time P99: 6.28 sec
Time to First Token P99: 5.22 sec
--- Summary for 20ccu_in_1m ---
Total Requests: 100
Successful Responses: 100
Failures: 0
Avg Latency: 3.56 sec
Avg Time to First Token: 2.89 sec

For this benchmark setup (20 CCU across two API instances), CPU usage also rose by roughly 35 percentage points, from ~60% in v1.4.3 to ~95% in v1.11.2.

Please note that all the benchmark outputs are from the second run after I restarted the API instance to reach a stable warmed state.

2. Additional context or comments

After investigating the log, I see that the API opens a new TCP connection for every request it makes to the plugin daemon.

The API log: dify-1.11.2-debug.log

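In my workflow (which has many tools), it opens about 24 TCP connections per run, each taking about 30-40 ms. If those connections are made sequentially, that alone adds roughly 0.7-1.0 s per run.
I made a change to reuse the client here: noobtimize@3fb5c70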
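To illustrate why reuse helps, here is a minimal, self-contained sketch. It is not the code from the commit above: it uses only the Python standard library (`http.client` and a local `http.server`) as a stand-in for httpx and the plugin daemon, and the names `CountingHandler` and `count_connections` are made up for this example. It counts how many TCP connections the server accepts when each request opens its own connection versus when one keep-alive connection is shared. With httpx, the analogous approach is to keep a single long-lived `httpx.Client` instead of creating a client per request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the plugin daemon; counts accepted connections."""

    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections alive by default
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # One handler instance is created per accepted TCP connection.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def count_connections(reuse: bool, requests: int = 24) -> int:
    """Send `requests` GETs and return how many TCP connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        shared = http.client.HTTPConnection(host, port) if reuse else None
        for _ in range(requests):
            conn = shared if shared is not None else http.client.HTTPConnection(host, port)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body so the socket can be reused
            if shared is None:
                conn.close()
        if shared is not None:
            shared.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    print("new connection per request:", count_connections(reuse=False))
    print("shared keep-alive connection:", count_connections(reuse=True))
```

In this sketch the first call should report one connection per request and the second a single connection, so the per-connection setup cost is paid once instead of 24 times.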

This is the benchmark output using a reused httpx client:

--- Percentile Breakdown ---
Elapsed Time P50: 1.62 sec
Time to First Token P50: 1.35 sec
Elapsed Time P95: 2.41 sec
Time to First Token P95: 2.05 sec
Elapsed Time P99: 2.61 sec
Time to First Token P99: 2.18 sec
--- Summary for 20ccu_in_1m ---
Total Requests: 117
Successful Responses: 117
Failures: 0
Avg Latency: 1.69 sec
Avg Time to First Token: 1.39 sec

I also noticed that the sleep in the queue.Empty branch adds another ~0.1-0.2 s of latency (for my workflow). It seems unnecessary, since the loop already blocks for up to 0.1 s in self._event_queue.get(timeout=0.1):

try:
    event = self._event_queue.get(timeout=0.1)
    self._event_handler.dispatch(event)
    self._event_queue.task_done()
    self._process_commands(event)
except queue.Empty:
    time.sleep(0.1)
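The point about the redundant sleep can be shown with a small standalone sketch. This is not Dify's dispatcher: `run_dispatch_loop` and `idle_wait` are hypothetical names, and the handler is just a callable. It shows that `queue.Queue.get(timeout=...)` already blocks while the queue is empty, so the loop does not spin without the extra `time.sleep`, and that the extra sleep roughly doubles the time spent in one idle iteration.

```python
import queue
import threading
import time


def run_dispatch_loop(events, handle, stop, timeout=0.1):
    """Dispatch events until `stop` is set, without an extra sleep on Empty."""
    while not stop.is_set():
        try:
            event = events.get(timeout=timeout)
        except queue.Empty:
            continue  # get() already waited up to `timeout`; no sleep needed
        handle(event)
        events.task_done()


def idle_wait(events, timeout=0.1, extra_sleep=0.0):
    """Return the seconds spent in one loop iteration on an empty queue."""
    start = time.monotonic()
    try:
        events.get(timeout=timeout)
    except queue.Empty:
        if extra_sleep:
            time.sleep(extra_sleep)  # the sleep questioned in this issue
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"idle iteration, get only:    {idle_wait(queue.Queue()):.2f} s")
    print(f"idle iteration, get + sleep: {idle_wait(queue.Queue(), extra_sleep=0.1):.2f} s")
```

An event that arrives while the loop is inside `time.sleep(0.1)` has to wait for the sleep to finish, whereas an event that arrives while the loop is blocked in `get()` is picked up immediately.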

The output with the sleep commented out + reuse httpx client:

--- Percentile Breakdown ---
Elapsed Time P50: 1.34 sec
Time to First Token P50: 1.09 sec
Elapsed Time P95: 1.77 sec
Time to First Token P95: 1.44 sec
Elapsed Time P99: 1.97 sec
Time to First Token P99: 1.55 sec
--- Summary for 20ccu_in_1m ---
Total Requests: 120
Successful Responses: 120
Failures: 0
Avg Latency: 1.40 sec
Avg Time to First Token: 1.14 sec

3. Can you help us with this feature?

  • I am interested in contributing to this feature.

Labels: 🐞 bug
