A sophisticated Java-based research agent that leverages multiple search providers and AI to conduct in-depth research on topics, answer questions, and synthesize information from diverse sources.
The Deep Research Agent is designed to automate the process of researching topics by:
- Intelligently generating search queries
- Gathering information from multiple search providers
- Analyzing and synthesizing the collected information
- Providing comprehensive, structured research reports
The agent can determine when research is needed, conduct targeted searches, and combine results from multiple search engines for thorough analysis.
- Multiple Search Providers: Supports both DuckDuckGo scraping and Google Custom Search API
- Self-Assessment: Determines if a question requires external research or can be answered with existing knowledge
- Query Generation: Creates optimized search queries tailored to the research topic
- Iterative Research: Conducts multiple rounds of research with follow-up questions
- Comprehensive Synthesis: Combines information from all sources into coherent analyses
- Memory: Maintains research history for context and follow-up
- Flexible Configuration: Easily switch between or combine search providers
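Combining providers raises the question of how overlapping results are merged. The following is a minimal sketch, not the project's actual merging logic: it interleaves the per-provider lists so that no provider dominates, and keeps only the first hit seen for each URL. The `ResultMerger` and `Hit` names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ResultMerger {
    /** Minimal stand-in for the project's SearchResult; the fields are assumed. */
    static final class Hit {
        final String title;
        final String url;

        Hit(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    /** Interleaves per-provider result lists and keeps the first hit seen for each URL. */
    static List<Hit> merge(List<List<Hit>> perProvider) {
        // LinkedHashMap preserves insertion order, so the merged ranking is stable.
        Map<String, Hit> byUrl = new LinkedHashMap<>();
        int longest = 0;
        for (List<Hit> hits : perProvider) {
            longest = Math.max(longest, hits.size());
        }
        // Walk rank by rank so every provider contributes its top results first.
        for (int rank = 0; rank < longest; rank++) {
            for (List<Hit> hits : perProvider) {
                if (rank < hits.size()) {
                    Hit hit = hits.get(rank);
                    byUrl.putIfAbsent(hit.url, hit);
                }
            }
        }
        return new ArrayList<>(byUrl.values());
    }
}
```

Deduplicating on the raw URL string is the simplest choice; a fuller version might normalize trailing slashes or tracking parameters first.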
The project follows a modular, interface-based design:
- SearchProvider: Interface for different search engines
  - DuckDuckGoSearchProvider: Web scraping implementation
  - GoogleSearchProvider: API-based implementation
- ResearchAgent: Core class that orchestrates the research process
- SearchResult: Model class for search results
- ResearchResponse: Comprehensive response with metadata
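To make the interface-based design concrete, here is a rough sketch of what the contract might look like. The two `SearchProvider` methods match the custom-provider example in this README; the `SearchResult` fields (title, URL, snippet) and the `InMemorySearchProvider` class are assumptions made for illustration and may not match the real classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Provider contract: run a query and report a display name. */
interface SearchProvider {
    List<SearchResult> search(String query);

    String getName();
}

/** Assumed shape of the result model; the real class may carry different fields. */
final class SearchResult {
    private final String title;
    private final String url;
    private final String snippet;

    SearchResult(String title, String url, String snippet) {
        this.title = title;
        this.url = url;
        this.snippet = snippet;
    }

    String getTitle() { return title; }

    String getUrl() { return url; }

    String getSnippet() { return snippet; }
}

/** Hypothetical provider backed by a fixed list, useful for tests without network access. */
final class InMemorySearchProvider implements SearchProvider {
    private final List<SearchResult> corpus;

    InMemorySearchProvider(List<SearchResult> corpus) {
        this.corpus = List.copyOf(corpus);
    }

    @Override
    public List<SearchResult> search(String query) {
        // Case-insensitive substring match against title and snippet.
        String needle = query.toLowerCase(Locale.ROOT);
        List<SearchResult> matches = new ArrayList<>();
        for (SearchResult result : corpus) {
            if (result.getTitle().toLowerCase(Locale.ROOT).contains(needle)
                    || result.getSnippet().toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(result);
            }
        }
        return matches;
    }

    @Override
    public String getName() {
        return "InMemory";
    }
}
```

Because the agent only depends on the interface, a stub like this can stand in for DuckDuckGo or Google when exercising the rest of the pipeline offline.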
- Java 11 or higher
- Maven or Gradle for dependency management
- OpenAI API key
- Google API key and Custom Search Engine ID (for Google search)
<dependencies>
<!-- OpenAI API Client -->
<dependency>
<groupId>com.openai</groupId>
<artifactId>openai-java-client</artifactId>
<version>0.8.1</version>
</dependency>
<!-- HTML fetching and parsing (jsoup) -->
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.15.3</version>
</dependency>
<!-- JSON Processing -->
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20231013</version>
</dependency>
</dependencies>
- OpenAI API Key:
  - Sign up at https://platform.openai.com
  - Generate an API key under your account settings
- Google Custom Search:
  - Create a project on Google Cloud Console
  - Enable the Custom Search JSON API
  - Create API credentials
  - Set up a programmable search engine at https://programmablesearchengine.google.com
  - Get your Search Engine ID
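To show where the two Google values end up, the sketch below builds a Custom Search JSON API request URI: the API key goes in the `key` parameter, the Search Engine ID in `cx`, and the search terms in `q`. Whether `GoogleSearchProvider` builds its requests this way is an assumption; the `CustomSearchRequest` class and the environment variable names mentioned afterwards are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class CustomSearchRequest {
    private static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";

    /** Builds the request URI from the API key, the Search Engine ID (cx), and the query. */
    static URI build(String apiKey, String searchEngineId, String query) {
        return URI.create(ENDPOINT
                + "?key=" + encode(apiKey)
                + "&cx=" + encode(searchEngineId)
                + "&q=" + encode(query));
    }

    /** Form-encodes a query parameter value as UTF-8. */
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Reads a required setting from the environment and fails fast when it is missing. */
    static String requireEnv(String name) {
        String value = System.getenv(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }
}
```

Reading the credentials with something like `CustomSearchRequest.requireEnv("GOOGLE_API_KEY")` and `CustomSearchRequest.requireEnv("GOOGLE_CSE_ID")` keeps them out of source control; the same approach works for the OpenAI key.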
// Initialize providers
SearchProvider duckDuckGo = new DuckDuckGoSearchProvider();
SearchProvider google = new GoogleSearchProvider(googleApiKey, searchEngineId);
// Create a research agent with multiple providers
List<SearchProvider> providers = List.of(google, duckDuckGo);
ResearchAgent agent = new ResearchAgent(openAiApiKey, providers);
// Simple research
String result = agent.research("Quantum computing applications in medicine");
System.out.println(result);
// Deep dive with multiple iterations
String deepResult = agent.deepDive("Climate change mitigation technologies", 3);
System.out.println(deepResult);
// Question answering with self-assessment
ResearchResponse response = agent.answerQuestion("What were the latest developments in quantum computing as of 2025?");
System.out.println(response);
// Use only DuckDuckGo
ResearchAgent ddgAgent = new ResearchAgent(openAiApiKey, List.of(new DuckDuckGoSearchProvider()));
// Use only Google
ResearchAgent googleAgent = new ResearchAgent(openAiApiKey, List.of(
new GoogleSearchProvider(googleApiKey, searchEngineId)
));
// Use multiple search providers with equal priority
ResearchAgent multiAgent = new ResearchAgent(openAiApiKey,
List.of(new DuckDuckGoSearchProvider(), new GoogleSearchProvider(googleApiKey, searchEngineId))
);
The agent can determine if it already has sufficient knowledge to answer a question:
ResearchResponse response = agent.answerQuestion("What is the capital of France?");
if (!response.getNeededResearch()) {
System.out.println("Answered without research: " + response.getAnswer());
} else {
System.out.println("Research was required.");
}
For comprehensive exploration of a topic with follow-up questions:
String deepResult = agent.deepDive("Renewable energy storage solutions", 3);
This will:
- Research the initial topic
- Generate a follow-up question based on findings
- Research that question
- Repeat for the specified number of iterations
- Synthesize all findings into a comprehensive report
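The steps above can be sketched as a simple loop. This is an illustration under stated assumptions, not the agent's real implementation: the three function parameters are hypothetical stand-ins for the search-and-summarize step, the follow-up generator, and the final synthesis, and `iterations` is interpreted here as the number of follow-up rounds after the initial pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

class DeepDiveSketch {
    /**
     * research:   question -> findings
     * followUp:   (question, latest findings) -> next question
     * synthesize: all findings -> final report
     */
    static String deepDive(String topic,
                           int iterations,
                           Function<String, String> research,
                           BiFunction<String, String, String> followUp,
                           Function<List<String>, String> synthesize) {
        List<String> findings = new ArrayList<>();
        String question = topic;

        // Step 1: research the initial topic.
        findings.add(research.apply(question));

        // Steps 2-4: derive a follow-up from the latest findings and research it.
        for (int i = 0; i < iterations; i++) {
            String latest = findings.get(findings.size() - 1);
            question = followUp.apply(question, latest);
            findings.add(research.apply(question));
        }

        // Step 5: combine everything gathered into one report.
        return synthesize.apply(findings);
    }
}
```

Passing the steps in as functions keeps the loop testable without network or model calls; in the real agent these would presumably be backed by the search providers and the OpenAI API.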
Implement the SearchProvider interface to add new search engines:
class BingSearchProvider implements SearchProvider {
@Override
public List<SearchResult> search(String query) {
// Placeholder: call the search backend and map each hit to a SearchResult
return List.of();
}
@Override
public String getName() {
return "Bing";
}
}
Extend the project to extract full content from search results:
private String extractContentFromUrl(String url) {
try {
Document doc = Jsoup.connect(url)
.userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36")
.get();
// Extract main content
Elements content = doc.select("article, .content, main");
if (!content.isEmpty()) {
return content.first().text();
}
return doc.body().text();
} catch (Exception e) {
return "Failed to extract content: " + e.getMessage();
}
}
- Rate Limits: Both search providers and AI APIs have rate limits
- Search Quality: DuckDuckGo scraping may be blocked or return inconsistent results
- Cost: Google Custom Search and OpenAI APIs have associated costs
- Timeliness: Search results and the model's own knowledge may lag behind recent events
- Complexity: Very nuanced or specialized topics may require human expertise
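One common way to cope with rate limits is to retry a failed call after a growing delay. The sketch below shows exponential backoff in its simplest form; the `Backoff` class is hypothetical and not part of this project. It retries on any exception, whereas real code would likely retry only on rate-limit responses (for example HTTP 429).

```java
import java.util.concurrent.Callable;

class Backoff {
    /** Runs the call, retrying on failure with a delay that doubles after each attempt. */
    static <T> T retry(Callable<T> call, int maxAttempts, long initialDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                // Give up and surface the last failure once the budget is spent.
                if (attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

A provider call could then be wrapped as `Backoff.retry(() -> provider.search(query), 4, 500)`, which waits 500 ms, 1 s, and 2 s between the four attempts.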
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for the GPT API
- Google for the Custom Search API
- The Jsoup library for HTML parsing