
Conversation

@HsnSaboor
Contributor

This PR fixes an issue where coding tools (Claude Code, OpenCode) would stop after a single turn because the stop_reason was not being correctly mapped from the upstream provider.

  • Gemini: correctly mapped 'MAX_TOKENS' to 'max_tokens' and 'STOP'/'FINISH_REASON_UNSPECIFIED'/'UNKNOWN' to 'end_turn'.
  • Gemini CLI: applied the same mapping as the standard Gemini translator.
  • Codex: propagated the 'stop_reason' directly from the upstream response when available, falling back to 'tool_use' or 'end_turn'.
  • Added a SanitizeFunctionName utility to ensure function names comply with Gemini/Vertex AI naming requirements.
  • Fixed a tool ID parsing bug where function name extraction used the wrong index.
  • Updated the schema placeholder logic to stop adding unnecessary placeholders to top-level schemas.
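
As a rough illustration of the Gemini mapping described above, the logic reduces to a small lookup. This is a hedged sketch, not the PR's actual code; the function name and the hasToolCalls parameter are assumptions:

// mapGeminiFinishReason translates a Gemini finishReason into a Claude
// stop_reason so clients like Claude Code keep the turn loop going.
// Sketch only; the real helper in the PR may differ in name and scope.
func mapGeminiFinishReason(finishReason string, hasToolCalls bool) string {
    if hasToolCalls {
        return "tool_use" // pending tool calls take precedence
    }
    switch finishReason {
    case "MAX_TOKENS":
        return "max_tokens"
    case "STOP", "FINISH_REASON_UNSPECIFIED", "UNKNOWN":
        return "end_turn"
    default:
        return "end_turn"
    }
}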

em4go and others added 30 commits November 27, 2025 20:14
Add complete GitHub Copilot support including:
- Device flow OAuth authentication via GitHub's official client ID
- Token management with automatic caching (25 min TTL)
- OpenAI-compatible API executor for api.githubcopilot.com
- 16 model definitions (GPT-5 variants, Claude variants, Gemini, Grok, Raptor)
- CLI login command via -github-copilot-login flag
- SDK authenticator and refresh registry integration

Enables users to authenticate with their GitHub Copilot subscription and
use it as a backend provider alongside existing providers.
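
A minimal sketch of the 25-minute token cache described above, under hypothetical names (the real types in the Copilot auth package differ):

import (
    "sync"
    "time"
)

type cachedToken struct {
    token     string
    expiresAt time.Time
}

var (
    copilotToken   cachedToken
    copilotTokenMu sync.Mutex
)

// getCopilotToken returns a cached Copilot API token, re-exchanging the
// GitHub OAuth token via fetch only once the 25-minute TTL has elapsed.
func getCopilotToken(fetch func() (string, error)) (string, error) {
    copilotTokenMu.Lock()
    defer copilotTokenMu.Unlock()
    if copilotToken.token != "" && time.Now().Before(copilotToken.expiresAt) {
        return copilotToken.token, nil // still fresh
    }
    tok, err := fetch()
    if err != nil {
        return "", err
    }
    copilotToken = cachedToken{token: tok, expiresAt: time.Now().Add(25 * time.Minute)}
    return tok, nil
}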
- Simplify error type checking in oauth.go using errors.As directly
- Remove redundant errors.As call in GetUserFriendlyMessage
- Remove unused CachedAPIToken and TokenManager types (dead code)
- Add Unwrap() to AuthenticationError for proper error chain handling with errors.Is/As
- Extract hardcoded header values to constants for maintainability
- Replace verbose status code checks with isHTTPSuccess() helper
- Remove unused ExtractBearerToken() and BuildModelsURL() functions
- Make buildChatCompletionURL() private (only used internally)
- Remove unused 'strings' import
…pilot executor

- Replace concurrent-unsafe metadata caching with thread-safe sync.RWMutex-protected map
- Extract magic numbers and hardcoded header values to named constants
- Replace verbose status code checks with isHTTPSuccess() helper
- Simplify normalizeModel() to no-op with explanatory comment (models already canonical)
- Remove redundant metadata manipulation in token caching
- Improve code clarity and performance with proper cache management
…t-auth

feat(auth): add GitHub Copilot authentication and API integration
…and Goreleaser

- Updated release and Docker workflows to ensure the `-plus` suffix is added to build versions when missing.
- Adjusted Goreleaser configuration to include `-plus` suffix in `main.Version` during build process.
…ant

- Updated Docker and release workflows to use `cli-proxy-api-plus` as the Docker repository name and adjusted tag generation logic.
- Renamed artifacts in Goreleaser configuration to align with the new 'plus' variant naming convention.
- Updated README files to reflect the new 'Plus' variant with detailed third-party provider support information.
- Adjusted Dockerfile, `docker-compose.yml`, and build configurations to align with the `CLIProxyAPIPlus` naming convention.
- Added information about GitHub Copilot OAuth integration contributed by the community.
- Store models with provider:modelID keys for independent tracking per provider
- Add ModelIDNormalizer for incoming model ID normalization
- Add ExtractProviderFromPrefixedID() for provider detection from prefixed IDs
- Add alphabetical sorting for consistent model list ordering
- Consolidate provider display names into single source of truth
- Add SetShowProviderPrefixes() for toggling visual prefixes

This allows the same model (e.g. gemini-2.5-flash) to be registered
from multiple providers with separate quota tracking and availability.

Note: Visual prefixes are fully optional and controlled via the
show_provider_prefixes config flag.
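
The key scheme is the core of the change; a minimal sketch under an assumed helper name:

// registryKey builds the provider:modelID key used to track the same model
// independently per provider (hypothetical helper name).
func registryKey(provider, modelID string) string {
    return provider + ":" + modelID
}

// Example: "gemini-cli:gemini-2.5-flash" and "antigravity:gemini-2.5-flash"
// are distinct entries, each with its own quota tracking and availability.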
- Add Intermediate Representation (IR) as provider-agnostic API format
- Add to_ir/ parsers: OpenAI, Claude, Gemini, Ollama -> IR
- Add from_ir/ emitters: IR -> OpenAI, Claude, Gemini, Gemini CLI, Ollama
- Add translator_wrapper.go for executor integration
- Add ir/util.go with shared helpers

The IR layer acts as an intermediate API between different provider formats.
Instead of NxM direct translations, each format only needs to_ir and from_ir
converters (hub-and-spoke architecture).

Key benefits:
- Unified handling of reasoning/thinking content
- Consistent tool call ID generation
- Centralized finish reason mapping
- Easier addition of new providers

Enabled via use_canonical_translator config flag.
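
To make the hub-and-spoke idea concrete, here is a rough sketch of the converter shape; the type and function names are illustrative, not the actual internal/translator_new API:

// ChatRequest stands in for the provider-agnostic IR request type.
type ChatRequest struct {
    Model    string
    Messages []Message
}

// Message carries role plus content; the real IR also models tool calls,
// thinking blocks, and usage.
type Message struct {
    Role    string
    Content string
}

// With N formats, each needs only one parser and one emitter against the IR
// (2N converters) instead of N×M direct translations.
func OpenAIToIR(raw []byte) (*ChatRequest, error) { return nil, nil } // to_ir/openai.go
func IRToGemini(req *ChatRequest) ([]byte, error) { return nil, nil } // from_ir/gemini.go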
- Update gemini_cli_executor.go with TranslateToGeminiCLI() and response translation
- Update gemini_executor.go with TranslateToGemini() and response translation
- Update aistudio_executor.go with new translator integration
- Update antigravity_executor.go with streaming state management
- Update claude_executor.go with TranslateClaudeResponse*() functions
- Update gemini_vertex_executor.go with new translator support
- Add GeminiCLIStreamState and ClaudeStreamState for streaming conversions

All executors now support proper reasoning_tokens tracking and consistent
thinking block handling when use_canonical_translator is enabled.
- Add internal/auth/cline/ with JWT token refresh mechanism
- Add cline_executor.go with OpenAI-compatible chat completions
- Support two free models: MiniMax M2 and Grok Code Fast 1

Cline provides access to free models through OpenAI-compatible API:
- minimax/minimax-m2 (MiniMax M2)
- x-ai/grok-code-fast-1 (Grok Code Fast 1)

IMPORTANT: Obtaining the refresh token requires modification of the Cline
VSCode extension source code to export the refresh token. The standard
export command only provides short-lived access tokens (~10 minutes).

Authentication uses JWT tokens with 'workos:' prefix for API requests.
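
For illustration, request authentication then looks roughly like the following; the exact header shape is an assumption based on the note above:

// buildClineAuthHeader prefixes the JWT with "workos:" as described above.
// Header name and Bearer scheme are assumptions, not confirmed API details.
func buildClineAuthHeader(jwt string) (key, value string) {
    return "Authorization", "Bearer workos:" + jwt
}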
- Add internal/auth/kiro/ with OAuth token refresh for social and IAM auth
- Add kiro_executor.go with AWS Event Stream protocol support
- Support Claude models available via Amazon Q

Kiro provides access to Claude models through Amazon Q infrastructure:
- claude-sonnet-4-5, claude-sonnet-4-20250514, claude-3-7-sonnet-20250219
- claude-opus-4-20250514, claude-opus-4-5-20251101
- claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022

The executor handles Kiro's binary AWS Event Stream format and converts
responses to OpenAI chat completion format via IR translator.
- Add sdk/api/handlers/ollama/ with Ollama API implementation
- Implement /api/tags, /api/chat, /api/generate, /api/show, /api/version
- Convert Ollama requests to OpenAI format via IR translator
- Convert OpenAI responses back to Ollama format for client compatibility

This exposes the proxy as Ollama-compatible server on port 11434.
Clients that only support Ollama protocol can access any model
available through the proxy.

Request flow: Ollama client -> IR -> OpenAI -> Provider -> IR -> Ollama response
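
Since the server uses gin elsewhere, the endpoint wiring plausibly looks like this sketch (handler wiring assumed, not the actual routes.go code):

import "github.com/gin-gonic/gin"

// registerOllamaRoutes exposes the Ollama-compatible surface listed above.
func registerOllamaRoutes(r *gin.Engine, tags, chat, generate, show, version gin.HandlerFunc) {
    r.GET("/api/tags", tags)          // list models
    r.GET("/api/version", version)    // server version
    r.POST("/api/chat", chat)         // Ollama chat -> IR -> OpenAI -> provider
    r.POST("/api/generate", generate) // Ollama completion endpoint
    r.POST("/api/show", show)         // model details
}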
- Add GetClineModels() with MiniMax M2 and Grok Code Fast 1
- Add GetKiroModels() with Claude models via Amazon Q
- Add GetIFlowModels() with Chinese LLM models (Qwen, Kimi, DeepSeek, GLM)
- Update GetClaudeModels() with Claude 4.5 and thinking variants
- Update GetGeminiModels() with Gemini 3 Pro preview models
- Update GetOpenAIModels() with GPT-5 and Codex model families
- Add ThinkingSupport metadata for reasoning-capable models
- Add cline_login.go and kiro_login.go commands for authentication
- Update auth_manager.go with Cline and Kiro provider registration
- Update server.go and routes.go with Ollama endpoint routing
- Update config.go with use_canonical_translator and show_provider_prefixes flags
- Update constant.go with Cline, Kiro, Ollama provider constants
- Add sdk/auth/cline.go and sdk/auth/kiro.go refresh handlers
- Update watcher.go with Cline and Kiro credential watching
- Update openai_handlers.go and handlers.go for Ollama compatibility
- Update provider.go and gemini_thinking.go utilities
- Update legacy translators with minor fixes
- Add docs/cline-integration.md with setup instructions
…kingMetadata, improve Ollama docs, remove Cline tests
…of new translator_new structure on main branch)
…ation (improvement: Cursor compatibility)

BREAKING: Tools schema handling for Gemini API

Changes:
- Fix $schema field order in antigravity_executor: delete AFTER CleanJsonSchemaForClaude
  (CleanJsonSchemaForClaude adds $schema, so deleting before was ineffective)
- Add $schema and additionalProperties removal to CleanJsonSchema for Gemini compatibility
- Apply CleanJsonSchema when converting tools in from_ir/gemini.go

Tool call normalization improvements:
- Add ToolSchemaContext support for multiple formats (OpenAI, Gemini, direct Gemini)
- Store parameter types (not just names) for smarter array-to-string conversion
- Add configurable ParameterSynonyms map for semantic parameter matching
- Add ToolDefaults for commonly missing required parameters (e.g., is_background)
- Handle tool_choice object format ({"type":"function",...}) as "required"

MALFORMED_FUNCTION_CALL recovery:
- Parse Gemini's text-based function calls: "call:default_api:func{key:value}"
- Use hujson library to handle unquoted keys and malformed JSON
- Map MALFORMED_FUNCTION_CALL finish reason to tool_calls for proper handling
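
A hedged sketch of that recovery path, assuming a regex capture plus hujson normalization (the actual parser may be more tolerant):

import (
    "encoding/json"
    "regexp"

    "github.com/tailscale/hujson"
)

// textFuncCall matches Gemini's text-based call format, e.g.
// call:default_api:my_func{key: "value"}. The pattern is illustrative.
var textFuncCall = regexp.MustCompile(`call:(\w+):(\w+)\s*(\{.*\})`)

// recoverFunctionCall extracts a function name and arguments from text that
// Gemini emitted instead of a structured functionCall.
func recoverFunctionCall(text string) (name string, args map[string]interface{}, ok bool) {
    m := textFuncCall.FindStringSubmatch(text)
    if m == nil {
        return "", nil, false
    }
    name = m[2]
    // Normalize the not-quite-JSON argument blob (e.g. trailing commas)
    // before unmarshalling; the commit uses hujson for this step.
    std, err := hujson.Standardize([]byte(m[3]))
    if err != nil {
        return "", nil, false
    }
    if err := json.Unmarshal(std, &args); err != nil {
        return "", nil, false
    }
    return name, args, true
}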

Gemini toolConfig:
- Set functionCallingConfig.mode based on ToolChoice (none/required/auto -> NONE/ANY/AUTO)
- Add mode setting in both old translator and canonical IR translator
HALDRO and others added 21 commits December 8, 2025 05:27
…uest

- Enhance the `RequestAntigravityToken` function to fetch and log the project ID using the provided access token.
- Update metadata to include the project ID if available, improving the context for Antigravity services.
- Introduce `FetchAntigravityProjectID` function to facilitate project ID discovery for external callers.
…uth overhaul

## New Features

### GitHub Copilot Integration
- Add complete GitHub Copilot provider with OAuth Device Flow authentication
- Implement Copilot API executor with token caching and SSE streaming support
- Add CLI login command and SDK integration for Copilot

### Kiro Provider Overhaul
- Implement multiple authentication methods:
  - AWS SSO OIDC (Builder ID) device code flow
  - Social authentication (Google/GitHub) via AuthServiceClient
  - Native OAuth with PKCE support
  - Custom kiro:// protocol handler for OAuth callbacks
- Add AWS SigV4 request signing for CodeWhisperer API
- Significantly enhance Kiro executor with improved streaming and error handling
Note: you will need to re-login or migrate old accounts.

## Infrastructure
- Extend browser automation capabilities for OAuth flows
- Enhance file watcher with improved monitoring
- Update model definitions registry
- Update CI/CD workflows, Docker configuration, and documentation
…ng improvements + sync new translators + update README
Each tool call now gets a unique index (0, 1, 2, ...) instead of all sharing index 0.

Also ensures ID/Name are sent only in first chunk per tool call.

Fixes Cursor losing tool arguments when multiple tools called in one response.
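
A minimal sketch of the per-call indexing, with illustrative types (the real translator tracks this inside its stream state):

// toolCallDelta mirrors the OpenAI streaming tool_call delta shape.
type toolCallDelta struct {
    Index int    `json:"index"`
    ID    string `json:"id,omitempty"`
    Name  string `json:"name,omitempty"`
    Args  string `json:"arguments,omitempty"`
}

type toolCallState struct {
    nextIndex int
    seen      map[string]int // call ID -> assigned index
}

// delta assigns a fresh index (0, 1, 2, ...) the first time a call ID is
// seen and emits ID/Name only in that first chunk.
func (s *toolCallState) delta(callID, name, argsFragment string) toolCallDelta {
    idx, started := s.seen[callID]
    if !started {
        if s.seen == nil {
            s.seen = map[string]int{}
        }
        idx = s.nextIndex
        s.nextIndex++
        s.seen[callID] = idx
    }
    d := toolCallDelta{Index: idx, Args: argsFragment}
    if !started {
        d.ID, d.Name = callID, name
    }
    return d
}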
- Centralized thinking budget handling in executors (not translators)
- Added default thinking support for gemini-3-pro-preview
- Replaced Fatalf with Errorf for graceful degradation
- Adapted new canonical translator for thinking changes
- Updated scanner buffer size to 20MB in handleEventStreamResponse and processStream methods to accommodate large AWS EventStream frames.
…ization

Changes from official repository:
- Unified thinking/reasoning system with ResolveThinkingConfigFromMetadata
- HasContent flag to prevent empty final SSE events
- Atomic counters for tool call ID uniqueness
- Image support in antigravity claude request
- OAuth constant naming (geminiOAuthClientID)
- New codex instructions and model definitions

Fork-specific adaptations:
- Standardized scanner buffer size to 50MB across all executors
- Adapted HasContent logic in new translators (from_ir/claude.go)
- Preserved Kiro IDE token watching functionality
- Preserved rate limit retry settings for Gemini CLI
- Preserved specifiedProvider logic in handlers
…ility

- Add TranslateToClaude() function to use new canonical translators
- Update claude_executor.go to use TranslateToClaude() instead of old sdktranslator
- Add stream parameter to Claude API requests when streaming is enabled
- Filter context-1m beta header for OAuth authentication (incompatible with OAuth)
- Add isOAuthAuthentication() helper to detect auth type
- Keep context-1m for API Key authentication (supports 1M context window)
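
The header filtering plausibly reduces to something like this sketch (helper name and header handling assumed):

import "strings"

// filterBetaHeaders drops context-1m beta values for OAuth credentials,
// which reject them, while keeping them for API-key authentication.
func filterBetaHeaders(betas []string, isOAuth bool) []string {
    if !isOAuth {
        return betas // API keys support the 1M context window
    }
    out := make([]string, 0, len(betas))
    for _, b := range betas {
        if strings.HasPrefix(b, "context-1m") {
            continue
        }
        out = append(out, b)
    }
    return out
}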
- Merged latest changes from origin/main

- Canonical IR translator now enabled by default (use-canonical-translator: true)

- Provider prefixes now enabled by default (show-provider-prefixes: true)

- Updated README: added Why This Fork section, updated Claude executor status to tested

- Updated config.example.yaml with new defaults
… IR translators; added support for GPT 5.2 and Gemini 3 Flash
… and extracting common response conversion functions

Changes:
- Improved Gemini handling
- Updated translator functions to streamline the use of canonical translator checks.
- Introduced shared helper functions for converting IR events to chunks and for non-stream responses, reducing code duplication.
- Enhanced handling of finish events and content tracking in streaming responses.
- Add new Cline models: GLM 4.6, Devstral with context/token limits
- Update existing Cline models (Grok Code Fast 1, MiniMax M2) with full metadata
- Add static model info enrichment in GetModelInfo() for dynamic models
- Support display prefix stripping (e.g., "[Antigravity] model" -> "model")
- Add forced translator functions for Cline executor (no fallback to old translator)
- Parse "reasoning" field in addition to "reasoning_content" for Cline/OpenRouter
- Normalize model names in request payloads before sending upstream
- Fix Ollama handlers to use Ollama handler type instead of OpenAI
- Improve ToOllamaShowResponse with proper architecture detection and capabilities
- Simplify findModelInfoByName to delegate to registry.GetModelInfo
…ure caching support

- Fix double "data: data:" SSE prefix in OpenAI streaming responses
  The ToOpenAIChunk function was adding "data:" prefix, but the handler
  also adds it, resulting in invalid SSE format that clients couldn't parse
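
  The fix, sketched: the chunk builder returns bare JSON and only the
  handler adds the single SSE prefix (names illustrative):

  import (
      "fmt"
      "io"
  )

  // writeSSE adds the one and only "data: " prefix. Before the fix both the
  // chunk builder and the handler prefixed, yielding "data: data: {...}".
  func writeSSE(w io.Writer, chunkJSON []byte) error {
      _, err := fmt.Fprintf(w, "data: %s\n\n", chunkJSON)
      return err
  }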

- Add thoughtSignature caching for Antigravity API compatibility
  - Derive session ID from first user message hash for conversation tracking
  - Cache signatures when emitting signature_delta events in Claude responses
  - Validate and retrieve cached signatures when building Gemini requests
  - Drop unsigned thinking blocks that Antigravity API would reject
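
  A sketch of the session-key derivation, assuming a SHA-256 hash (the
  exact scheme is not shown in this PR view):

  import (
      "crypto/sha256"
      "encoding/hex"
  )

  // sessionIDFromFirstUserMessage derives a stable conversation key: the
  // same first user message yields the same ID, so thoughtSignatures cached
  // on one turn can be validated and reattached on later turns.
  func sessionIDFromFirstUserMessage(firstUserMsg string) string {
      sum := sha256.Sum256([]byte(firstUserMsg))
      return hex.EncodeToString(sum[:8])
  }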

- Add interleaved thinking hint injection for Claude thinking models
  When both tools and thinking are active, inject hint into system instruction
  to guide model behavior between tool calls

- Parse cachedContentTokenCount from Gemini usage metadata

- Extract ThoughtSignature from Claude request thinking blocks
  Handle both simple string and wrapped object formats for thinking content

- Merge remote-tracking branch 'origin/main'
- Add .cursor/* to .gitignore for Cursor IDE metadata
- Extract helper functions in Claude provider (ensureClaudeUser, applyThinkingConfig, buildMessages, buildTools)
- Decompose Gemini provider's applyGenerationConfig into focused functions (applyThinkingConfig, applyGemini3ThinkingLevel, applyFunctionCallingConfig)
- Split Gemini message handling into applySystemMessage, applyUserMessage, applyAssistantMessage, applyToolResponses
- Refactor Kiro provider with filterSystemMessages, mergeConsecutiveMessages, alternateRoles, buildHistory
- Extract Ollama helpers (buildOllamaTools, applyOllamaFormat, extractPromptsAndImages, addOllamaUsage)
- Decompose OpenAI provider (buildOpenAITools, buildResponsesTools, applyResponsesThinking, buildToolCallDelta, buildImageDelta, buildChunkUsage)
- Simplify IR message_builder by replacing custom tolerant JSON parser with hujson library
- Remove redundant comments and reorganize struct field ordering in IR types
- Clean up util.go with better function grouping and simplified logic
…lity

- Enable Canonical Translator via SDK config (`UseCanonicalTranslator`).
- Add robust `grep` argument sanitization to fix common model hallucinations (conflict resolution for -A/-B/-C flags).
- Improve Codex/Responses API support:
  - Enforce `store: false` and `reasoning.summary: auto`.
  - Normalize tool names to 64-char limit.
  - Fix `web_search` tool mapping for compatibility.
- Implement unified streaming logic in OpenAI handlers to support both raw SSE events and data-only chunks (passthrough support).
- Refactor `TranslateToGeminiCLI` and `TranslateToOpenAI` to utilize shared IR conversion logic.
- Fix stateful streaming issues: correct mapping between internal `item_id` and client `call_id` for tool calls.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Copilot AI review requested due to automatic review settings January 1, 2026 06:49
Contributor

Copilot AI left a comment


Pull request overview

This PR fixes a critical issue where AI coding tools (Claude Code, OpenCode) would stop after a single turn due to incorrect stop_reason mapping from upstream providers. The changes ensure proper translation of finish reasons across different API formats, enabling multi-turn conversations to work correctly.

Key Changes:

  • Fixed stop_reason mapping in Gemini and Gemini CLI translators to correctly translate 'MAX_TOKENS' to 'max_tokens' and 'STOP'/'FINISH_REASON_UNSPECIFIED'/'UNKNOWN' to 'end_turn'
  • Added comprehensive translator infrastructure for new providers (Ollama, Kiro, iFlow, Qwen)
  • Implemented Codex response propagation with proper stop_reason handling
  • Added utility functions for schema cleaning and function name sanitization

Reviewed changes

Copilot reviewed 90 out of 120 changed files in this pull request and generated no comments.

Summary per file:

  internal/translator/gemini/claude/gemini_claude_response.go - Fixed stop_reason mapping from Gemini finish reasons to Claude format
  internal/translator/gemini-cli/claude/gemini-cli_claude_response.go - Applied the same stop_reason fix to the Gemini CLI translator
  internal/translator_new/to_ir/ollama.go - New Ollama API parser supporting both /api/chat and /api/generate endpoints
  internal/translator_new/to_ir/kiro.go - New Kiro (Amazon Q) response parser with embedded tool call extraction
  internal/translator_new/to_ir/gemini.go - New Gemini response parser with schema context support
  internal/translator_new/to_ir/claude.go - New Claude request parser with tool and thinking config support
  internal/translator_new/ir/util.go - Core utility functions for ID generation, text sanitization, and finish reason mapping
  internal/translator_new/ir/types.go - Type definitions for unified chat request/response structures
  internal/translator_new/ir/tool_schema.go - Tool schema context for parameter normalization
  internal/translator_new/ir/tool_config.go - Configuration for tool parameter synonyms and defaults
  internal/translator_new/from_ir/gemini.go - Gemini request generator with thinking config and tool support
  internal/translator_new/from_ir/claude.go - Claude request generator and SSE streaming support
  internal/translator_new/from_ir/openai.go - OpenAI format converter supporting Chat Completions and Responses API
  internal/translator/kiro/* - New Kiro translator for OpenAI and Claude format conversion
  internal/runtime/executor/qwen_executor.go - Migrated to the canonical IR translation pipeline
  internal/runtime/executor/proxy_helpers.go - Added HTTP client caching for connection reuse


@gemini-code-assist
Contributor

Summary of Changes

Hello @HsnSaboor, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the proxy's translation capabilities and expands its supported AI providers. The core change is the introduction of a Canonical Intermediate Representation (IR) architecture, which streamlines request and response handling across various AI models, addressing previous stop_reason mapping issues. This refactoring not only improves compatibility and reduces codebase complexity but also enables seamless integration of new providers like Kiro, GitHub Copilot, Cline, and Ollama, offering users a wider range of AI coding tools.

Highlights

  • Translator Fixes: Correctly maps stop_reason from upstream providers for Gemini, Gemini CLI, and Codex, resolving issues where coding tools would stop prematurely.
  • Canonical IR Translation Architecture: Introduces a new, unified Intermediate Representation (IR) translation architecture, aiming for better client compatibility, simplified codebase, and easier extension with new providers. This new architecture is enabled by default.
  • New Provider Support: Adds support for several new AI providers including Kiro (Amazon Q), GitHub Copilot, Cline, and Ollama, expanding the proxy's capabilities.
  • Codebase Optimization: The new Canonical IR architecture results in significant codebase reduction (62%) and unification of Google providers (86%), improving maintainability.
  • Enhanced Authentication: Implements new authentication methods for providers like GitHub Copilot (OAuth Device Flow) and Kiro (AWS Builder ID, Social Auth, token import), along with incognito browser support for multi-account logins.
  • Tooling Improvements: Includes a SanitizeFunctionName utility, fixes a tool ID parsing bug, and updates schema placeholder logic to prevent unnecessary additions to top-level schemas.


Ignored Files
  • Ignored by pattern: .github/workflows/** (2)
    • .github/workflows/docker-image.yml
    • .github/workflows/release.yaml

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist bot left a comment


Code Review

This is a substantial pull request that goes far beyond the described bug fix for stop_reason mapping. It introduces a major architectural refactoring with the new Canonical IR translator, which significantly simplifies the codebase and improves maintainability by unifying provider logic. The addition of new providers like Kiro, GitHub Copilot, Cline, and a full Ollama-compatible API is a massive feature enhancement. I've also noted several key improvements, including critical thread-safety fixes, performance optimizations through client caching, and security enhancements to prevent XSS. The code quality is high, and the new architecture is a commendable step forward for the project.

Comment on lines +14 to +16
var (
    codexCacheMap   = map[string]codexCache{}
    codexCacheMutex sync.RWMutex
)

critical

Great job adding a mutex to protect codexCacheMap. Accessing a map concurrently from multiple goroutines without synchronization is a race condition that can lead to panics. This is a critical fix for the stability of the application.
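
Typical accessors over the variables quoted above would look like this sketch (helper names assumed):

func getCodexCache(key string) (codexCache, bool) {
    codexCacheMutex.RLock() // shared lock: many concurrent readers allowed
    defer codexCacheMutex.RUnlock()
    v, ok := codexCacheMap[key]
    return v, ok
}

func setCodexCache(key string, v codexCache) {
    codexCacheMutex.Lock() // exclusive lock for writes
    defer codexCacheMutex.Unlock()
    codexCacheMap[key] = v
}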

Comment on lines +1042 to +1053
// conditionalAuthMiddleware returns middleware that checks the disable-auth config flag.
// If disable-auth is true, all requests are allowed without authentication.
// Otherwise, standard authentication is applied.
func (s *Server) conditionalAuthMiddleware() gin.HandlerFunc {
    return func(c *gin.Context) {
        if s.cfg != nil && s.cfg.DisableAuth {
            c.Next()
            return
        }
        AuthMiddleware(s.accessManager)(c)
    }
}
high

The introduction of conditionalAuthMiddleware and allowing requests with ErrNoCredentials to pass through the main AuthMiddleware is a significant change to the server's security model. While this is necessary for Ollama compatibility, it's important to be aware that when disable-auth: true is set, the API becomes completely open. The implementation correctly gates this behind the flag, but this feature should be used with caution, especially in non-local deployments.

Comment on lines +245 to +248
// Validate platformURL to prevent XSS - only allow http/https URLs
if !isValidURL(platformURL) {
    platformURL = "https://console.anthropic.com/"
}
high

Adding the isValidURL validation for platformURL is a good security measure to prevent potential Cross-Site Scripting (XSS) vulnerabilities. By ensuring the URL starts with http:// or https://, you prevent malicious schemes like javascript: from being injected into the success page HTML. This is a solid defensive programming practice.
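
A plausible shape for the validator (sketch; the PR's actual isValidURL may differ):

import "net/url"

// isValidURL accepts only http/https URLs, rejecting javascript:, data:,
// and other schemes that could be injected into the success page.
func isValidURL(raw string) bool {
    u, err := url.Parse(raw)
    if err != nil {
        return false
    }
    return u.Scheme == "http" || u.Scheme == "https"
}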

Comment on lines +18 to +22
// httpClientCache caches HTTP clients by proxy URL to enable connection reuse
var (
    httpClientCache      = make(map[string]*http.Client)
    httpClientCacheMutex sync.RWMutex
)
medium

Caching http.Client instances based on the proxy URL is an excellent performance optimization. This allows reusing TCP and TLS connections for subsequent requests to the same proxy, reducing latency and resource usage. Well done!
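
A get-or-create path over that cache typically uses the double-checked RLock/Lock pattern; a sketch with an assumed helper name:

import "net/http"

func clientForProxy(proxyURL string, build func() *http.Client) *http.Client {
    httpClientCacheMutex.RLock()
    c, ok := httpClientCache[proxyURL]
    httpClientCacheMutex.RUnlock()
    if ok {
        return c // reuse pooled TCP/TLS connections
    }
    httpClientCacheMutex.Lock()
    defer httpClientCacheMutex.Unlock()
    if c, ok := httpClientCache[proxyURL]; ok {
        return c // another goroutine won the race
    }
    c = build()
    httpClientCache[proxyURL] = c
    return c
}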

Comment on lines +1257 to 1348
// geminiToAntigravity converts Gemini CLI format to Antigravity format.
// Optimized: single json.Unmarshal → in-memory modifications → single json.Marshal.
// The projectID parameter should be the real GCP project ID from auth metadata.
// If empty, a random project ID will be generated (legacy fallback).
func geminiToAntigravity(modelName string, payload []byte, projectID string) []byte {
    var root map[string]interface{}
    if err := json.Unmarshal(payload, &root); err != nil {
        return payload
    }

    root["model"] = modelName
    root["userAgent"] = "antigravity"
    // Use real project ID from auth if available, otherwise generate random (legacy fallback).
    if projectID != "" {
        root["project"] = projectID
    } else {
        root["project"] = generateProjectID()
    }
    root["requestId"] = generateRequestID()

    request, _ := root["request"].(map[string]interface{})
    if request == nil {
        request = make(map[string]interface{})
        root["request"] = request
    }
    request["sessionId"] = generateStableSessionID(payload)
    delete(request, "safetySettings")

    toolConfig, _ := request["toolConfig"].(map[string]interface{})
    if toolConfig == nil {
        toolConfig = make(map[string]interface{})
        request["toolConfig"] = toolConfig
    }
    fcc, _ := toolConfig["functionCallingConfig"].(map[string]interface{})
    if fcc == nil {
        fcc = make(map[string]interface{})
        toolConfig["functionCallingConfig"] = fcc
    }
    fcc["mode"] = "VALIDATED"

    if genConfig, ok := request["generationConfig"].(map[string]interface{}); ok {
        delete(genConfig, "maxOutputTokens")

        // TODO: Fix GPT-OSS thinking mode - model gets stuck in infinite planning loops.
        // GPT-OSS models have issues with thinking mode - they repeatedly generate
        // the same plan without executing actions. Temporarily disable thinking.
        // See README_Fork.md "Antigravity Provider — UI Client Testing" for details.
        if strings.HasPrefix(modelName, "gpt-oss") {
            delete(genConfig, "thinkingConfig")
        } else if !strings.HasPrefix(modelName, "gemini-3-") {
            if tc, ok := genConfig["thinkingConfig"].(map[string]interface{}); ok {
                if _, has := tc["thinkingLevel"]; has {
                    delete(tc, "thinkingLevel")
                    tc["thinkingBudget"] = -1
                }
            }
        }
    }

    // Clean tools for Claude models.
    if strings.Contains(modelName, "claude") {
        if tools, ok := request["tools"].([]interface{}); ok {
            for _, tool := range tools {
                tm, ok := tool.(map[string]interface{})
                if !ok {
                    continue
                }
                fds, ok := tm["functionDeclarations"].([]interface{})
                if !ok {
                    continue
                }
                for _, fd := range fds {
                    fdm, ok := fd.(map[string]interface{})
                    if !ok {
                        continue
                    }
                    var schema map[string]interface{}
                    if s, ok := fdm["parametersJsonSchema"].(map[string]interface{}); ok {
                        schema = s
                    } else if s, ok := fdm["parameters"].(map[string]interface{}); ok {
                        schema = s
                    }
                    if schema != nil {
                        ir.CleanJsonSchemaForClaude(schema)
                        delete(schema, "$schema") // Must be after CleanJsonSchemaForClaude, which adds $schema
                        fdm["parameters"] = schema
                        delete(fdm, "parametersJsonSchema")
                    }
                }
            }
        }
    }

    if result, err := json.Marshal(root); err == nil {
        return result
    }
    return payload
}

func generateRequestID() string {
    return "agent-" + uuid.NewString()
}

func generateSessionID() string {
    // Use uuid for thread-safe random generation instead of math/rand.
    // Format: negative number string (mimics original behavior).
    uuidStr := uuid.NewString()
    // Concatenate the first 16 hex chars (skipping dashes) into an int64-like string.
    return "-" + uuidStr[:8] + uuidStr[9:13] + uuidStr[14:18]
}

medium

The refactoring of geminiToAntigravity to use map[string]interface{} instead of repeated sjson calls is a great improvement. The new implementation is much cleaner, more readable, and likely more performant. Also, switching from math/rand with a mutex to uuid for generateSessionID is a good move for thread safety and simplicity.

// Track whether tools are being used in this response chunk
usedTool := false
var sb strings.Builder

medium

Using strings.Builder (sb) for building the output string instead of repeated string concatenation is a good performance optimization. It avoids creating a new string object for each concatenation, which is much more efficient, especially in loops or functions that build large strings.

HsnSaboor and others added 4 commits January 1, 2026 11:57
…laceholder logic

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…nction

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…and fix tool ID parsing

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…n details

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>