@waleedlatif1
Collaborator

Summary

Added atomic claims to prevent duplicate processing of long-running workflows. Some Gmail processes took more than 60s, and as a result the same emails were being processed twice. Removed the memory fallback, since Redis and the database are always available.

Type of Change

  • Bug fix

Testing

Tested manually.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel bot commented Sep 18, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
sim Error Error Sep 18, 2025 0:19am
1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
docs Skipped Skipped Sep 18, 2025 0:19am

Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR introduces atomic claiming to the idempotency service to prevent duplicate processing of long-running workflows. The changes address a race condition in which Gmail webhook processing that took over 60 seconds was being run multiple times by concurrent instances.

The key architectural changes include:

  • Atomic claiming mechanism: Replaces the previous check-then-process pattern with atomic operations using Redis SET NX and database INSERT ON CONFLICT DO NOTHING to ensure only one instance can claim a processing key at a time
  • Status-based coordination: Introduces a three-state system ('in-progress', 'completed', 'failed') allowing concurrent processes to coordinate properly
  • Waiting mechanism: Implements waitForResult() with a 5-minute timeout and 1-second polling intervals for processes that fail to claim a key, allowing them to wait for the claiming process to complete
  • Infrastructure simplification: Removes the memory cache fallback and enableDatabaseFallback configuration, assuming Redis and database are always available in production
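
The atomic claim described in the first two bullets can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ClaimStore`, `ClaimRecord`, and `setIfAbsent` are hypothetical names, and an in-memory `Map` stands in for Redis `SET ... NX` or a database `INSERT ... ON CONFLICT DO NOTHING` purely so the example runs on its own. The `atomicallyClaim` signature here is also an assumption.

```typescript
// Hypothetical sketch. The three states come from the review summary;
// every other name is illustrative.
type ClaimStatus = "in-progress" | "completed" | "failed";

interface ClaimRecord {
  status: ClaimStatus;
  result?: unknown;
  error?: string;
}

// In-memory stand-in for Redis or the database (assumption: used only so
// the example is self-contained).
class ClaimStore {
  private records = new Map<string, ClaimRecord>();

  // Mirrors the semantics of `SET key value NX`: write only if the key is
  // absent, and report whether this caller performed the write. There is no
  // await between the check and the write, so no other caller can interleave.
  setIfAbsent(key: string, record: ClaimRecord): boolean {
    if (this.records.has(key)) return false;
    this.records.set(key, record);
    return true;
  }

  get(key: string): ClaimRecord | undefined {
    return this.records.get(key);
  }
}

type ClaimOutcome =
  | { claimed: true }
  | { claimed: false; existing: ClaimRecord };

// A single conditional write replaces the check-then-process pattern:
// exactly one caller per key gets `claimed: true`.
function atomicallyClaim(store: ClaimStore, key: string): ClaimOutcome {
  if (store.setIfAbsent(key, { status: "in-progress" })) {
    return { claimed: true };
  }
  // Someone else holds the claim; hand back whatever state they recorded.
  const existing = store.get(key) ?? { status: "in-progress" };
  return { claimed: false, existing };
}
```

With a real Redis or SQL backend the conditional write happens server-side, which is what makes the claim safe across separate instances. A real claim would likely also carry an expiry so that a crashed worker does not hold the key forever; that is omitted here.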

The changes integrate with the existing Gmail polling service which was experiencing the duplicate processing issue. The atomicallyClaim() method ensures that when multiple instances attempt to process the same Gmail webhook simultaneously, only one succeeds in claiming the processing rights while others either wait for completion or receive cached results. This is particularly important for Gmail operations that can exceed 60 seconds due to API rate limits and large email volumes.
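
The flow in the paragraph above (claim, then either run the work or wait for the claimer's result) might look something like the sketch below. It is an assumption-laden illustration rather than the merged implementation: the `Store` interface, `InMemoryStore`, `WaitOptions`, and the function signatures are hypothetical, and only the 5-minute timeout and 1-second polling interval are taken from the review summary.

```typescript
type Status = "in-progress" | "completed" | "failed";
interface Entry { status: Status; result?: unknown; error?: string }

// Hypothetical storage interface; a real backend would be Redis or the database.
interface Store {
  setIfAbsent(key: string, entry: Entry): Promise<boolean>;
  get(key: string): Promise<Entry | undefined>;
  set(key: string, entry: Entry): Promise<void>;
}

// In-memory stand-in so the example runs without external services.
class InMemoryStore implements Store {
  private data = new Map<string, Entry>();
  async setIfAbsent(key: string, entry: Entry): Promise<boolean> {
    if (this.data.has(key)) return false;
    this.data.set(key, entry);
    return true;
  }
  async get(key: string): Promise<Entry | undefined> { return this.data.get(key); }
  async set(key: string, entry: Entry): Promise<void> { this.data.set(key, entry); }
}

interface WaitOptions { timeoutMs?: number; pollMs?: number }

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the claiming process records a terminal state or the timeout elapses.
async function waitForResult(store: Store, key: string, opts: WaitOptions = {}): Promise<unknown> {
  const { timeoutMs = 5 * 60 * 1000, pollMs = 1000 } = opts;
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const entry = await store.get(key);
    if (entry?.status === "completed") return entry.result;
    if (entry?.status === "failed") throw new Error(entry.error ?? "operation failed");
    await sleep(pollMs);
  }
  throw new Error(`timed out waiting for ${key}`);
}

async function executeWithIdempotency<T>(
  store: Store,
  key: string,
  operation: () => Promise<T>,
  opts: WaitOptions = {},
): Promise<T> {
  const claimed = await store.setIfAbsent(key, { status: "in-progress" });
  if (!claimed) {
    // Lost the race: wait for the winner's outcome instead of re-running the work.
    return (await waitForResult(store, key, opts)) as T;
  }
  try {
    const result = await operation();
    await store.set(key, { status: "completed", result });
    return result;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    await store.set(key, { status: "failed", error: message });
    throw err;
  }
}
```

In this sketch a waiter rethrows when the claimer records `failed`; whether the real service retries, rethrows, or returns the failure as a cached result is not stated in the summary.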

Confidence score: 3/5

  • This PR addresses a legitimate concurrency issue but introduces complex distributed coordination logic that could have edge cases
  • Score reflects the sophisticated atomic claiming implementation that should work well but may have timing-related edge cases in high-concurrency scenarios
  • Pay close attention to the atomic claiming logic and potential duplicate condition checking in the executeWithIdempotency method

1 file reviewed, 1 comment

vercel bot temporarily deployed to Preview – docs September 18, 2025 00:11 Inactive
waleedlatif1 merged commit 658cf11 into staging Sep 18, 2025
5 of 6 checks passed
waleedlatif1 deleted the fix/idempotency branch September 18, 2025 00:17
Acumen-Desktop pushed a commit to Acumen-Desktop/sim that referenced this pull request Sep 20, 2025
…ocessing for long-running workflows (simstudioai#1366)

* improvement(idempotency): added atomic claims to prevent duplicate processing for long-running workflows

* ack PR comments
