Detect ChatGPT and AI-Related User Agents in PHP
User-Agent detection is a common technique for identifying web crawlers, bots, or AI-driven agents accessing your web application. This guide shows how to write a PHP function that detects ChatGPT and other AI-related User-Agent strings.
Why Detect ChatGPT and AI-Related User Agents?
Identifying ChatGPT and AI-related User Agents can help you:
- Customize responses for automated systems.
- Implement bot-specific optimizations.
- Monitor traffic from AI bots and differentiate it from human visitors.
PHP Code for ChatGPT User-Agent Detection
The function checks the User-Agent string against a predefined list of substrings such as "chatgpt", "openai", and "gptbot", plus a few regular expressions that catch spacing variants like "chat gpt". Matching is case-insensitive, so detection does not depend on how the User-Agent is capitalized.
Here is the complete PHP code for detecting ChatGPT and AI-related User-Agent patterns:
<?php
/**
 * Check if the provided User-Agent string corresponds to a ChatGPT-related bot.
 *
 * @param string|null $userAgent User-Agent string to check. Defaults to the server HTTP_USER_AGENT.
 * @return bool True if the User-Agent matches a known ChatGPT-related pattern, false otherwise.
 */
function isChatGPTUserAgent(?string $userAgent = null): bool {
    // Use the server User-Agent if none is provided
    $userAgent = $userAgent ?? ($_SERVER['HTTP_USER_AGENT'] ?? '');

    // Early exit for empty input (the ?string type already rules out non-strings)
    if ($userAgent === '') {
        return false;
    }

    // Limit string length so oversized headers are not scanned in full
    $userAgent = substr($userAgent, 0, 500);

    // Known AI-related substrings (all lowercase; matched against the lowercased User-Agent)
    $patterns = [
        // AI Assistants
        'chatgpt',
        'anthropic',
        'claude',
        'openai',
        'gpt-',
        'chatbot',
        'ai assistant',
        'copilot', // GitHub Copilot
        'assistant.ai', // Generic AI assistants
        'gptbot', // Specific bot identifiers
        'bing ai', // Bing AI (uses OpenAI models)
        'bard', // Google's Bard
        // Specific AI crawler/bot identifiers
        'amazonalexa-skill',
        'chatgpt-user',
        'anthropic-ai',
        // Known AI model user agent fragments
        'gpt-3',
        'gpt-4',
        'claude-',
        // Additional AI-related patterns
        'generative ai',
        'large language model',
        'llm crawler'
    ];

    // Regex patterns for spacing variants and whole-word matches
    $regexPatterns = [
        '/chat\s*gpt/i',
        '/\bai\s*bot\b/i',
        '/language\s*model/i',
        '/\bllm\b/i'
    ];

    // Lowercase once so the substring checks are case-insensitive
    $lcUserAgent = strtolower($userAgent);

    // Check for direct string matches
    foreach ($patterns as $pattern) {
        if (strpos($lcUserAgent, $pattern) !== false) {
            return true;
        }
    }

    // Check for regex pattern matches (the /i flag handles case)
    foreach ($regexPatterns as $regex) {
        if (preg_match($regex, $userAgent)) {
            return true;
        }
    }

    return false;
}
How to Use This Function
You can use the isChatGPTUserAgent function in your PHP applications to detect ChatGPT or similar AI-based User Agents. Here is an example:
// Use the server User-Agent
if (isChatGPTUserAgent()) {
    echo "ChatGPT User-Agent detected.";
} else {
    echo "User-Agent does not match.";
}
PHP Example with Given User-Agent
// Sample User-Agent string
$sampleUserAgent = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot";

// Call the function with the sample User-Agent
if (isChatGPTUserAgent($sampleUserAgent)) {
    echo "AI-related User-Agent detected: $sampleUserAgent";
} else {
    echo "No AI-related User-Agent detected: $sampleUserAgent";
}
Features of the PHP Function to Detect AI User Agents
- Customizable Input: Accepts a User-Agent string as an input or defaults to the server User-Agent.
- Input Validation: Handles invalid, non-string, or empty User-Agent values gracefully.
- Performance Optimization: Limits the User-Agent string to 500 characters to prevent unnecessary processing.
- Predefined Patterns: Matches against a list of known AI-related User-Agent patterns such as chatgpt, openai, gptbot, and more.
- Case-Insensitive Matching: Ensures reliable detection regardless of the User-Agent string's case.
- Early Exit: Returns immediately for empty input and stops scanning as soon as the first pattern matches.
- Scalable Design: Easily extendable to include additional AI-related patterns in the future.
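- Defensive Handling: Only the truncated string is ever matched, so oversized or malformed headers cannot trigger expensive processing. Because clients can send any User-Agent they like, a match is a hint rather than proof of identity.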
Detecting ChatGPT and AI User Agents Using JavaScript
JavaScript can run the same check directly in the browser. The navigator.userAgent property exposes the User-Agent string of the browser executing the script, which can then be matched against known patterns for AI-driven tools such as ChatGPT, OpenAI, Bard, and others. This approach only sees agents that actually execute JavaScript; crawlers that fetch raw HTML are visible only to server-side checks.
const isChatGPTUserAgent = (userAgent = null) => {
  // Use the browser's User-Agent if none is provided
  userAgent = userAgent || navigator.userAgent;

  // Return false for empty strings or non-string inputs
  if (typeof userAgent !== 'string' || userAgent.trim() === '') {
    return false;
  }

  // Limit string length to 500 characters for safety, then lowercase once
  const lcUserAgent = userAgent.substring(0, 500).toLowerCase();

  // Known ChatGPT and AI-related User-Agent patterns
  const patterns = [
    'chatgpt',
    'anthropic',
    'claude',
    'openai',
    'gpt-',
    'chatbot',
    'ai assistant',
    'copilot',
    'assistant.ai',
    'gptbot',
    'bing ai',
    'bard',
  ];

  // Check if any pattern is present in the User-Agent string
  return patterns.some(pattern => lcUserAgent.includes(pattern));
};
This JavaScript solution is particularly useful for client-side applications where User-Agent-based detection is required for customization, analytics, or bot-specific optimizations. It offers a lightweight, efficient method for identifying AI-related traffic in real-time.
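The same check can also run server-side in Node.js, where the User-Agent arrives as a request header instead of through navigator.userAgent. The sketch below is only an illustration: it assumes a headers object shaped like the one Node's built-in http module provides (header names in lowercase), and the helper name isAiUserAgentFromHeaders and the shortened pattern list are hypothetical, not part of the code above.

```javascript
// Hypothetical, shortened pattern list for illustration only
const AI_PATTERNS = ['chatgpt', 'openai', 'gptbot', 'claude', 'anthropic'];

// Hypothetical helper: checks the User-Agent header of an incoming request.
// Assumes `headers` looks like Node's req.headers (lowercase header names).
function isAiUserAgentFromHeaders(headers) {
  const raw = headers ? headers['user-agent'] : undefined;

  // Missing, empty, or non-string header: nothing to match
  if (typeof raw !== 'string' || raw.trim() === '') {
    return false;
  }

  // Same 500-character cap as the browser version, lowercased once
  const userAgent = raw.substring(0, 500).toLowerCase();
  return AI_PATTERNS.some(pattern => userAgent.includes(pattern));
}

// Possible wiring with Node's built-in http module (left commented out):
// const http = require('http');
// http.createServer((req, res) => {
//   res.end(isAiUserAgentFromHeaders(req.headers) ? 'AI agent' : 'Other');
// }).listen(8080);
```

Keeping the header lookup separate from the matching logic makes the helper easy to exercise with plain objects, without starting a server.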
Example usage
// Use the browser's User-Agent
if (isChatGPTUserAgent()) {
  console.log("ChatGPT or AI-related User-Agent detected.");
} else {
  console.log("No AI-related User-Agent detected.");
}
JavaScript Example with Given User-Agent
// Sample User-Agent string (same sample as in the PHP example)
const sampleUserAgent = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot";

// Call the function with the sample User-Agent
if (isChatGPTUserAgent(sampleUserAgent)) {
  console.log(`AI-related User-Agent detected: ${sampleUserAgent}`);
} else {
  console.log(`No AI-related User-Agent detected: ${sampleUserAgent}`);
}
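The pattern list in the JavaScript function is fixed, and unlike the PHP version it has no regular expressions. One possible way to make both configurable is a small factory that builds a detector from caller-supplied patterns. This is a sketch under assumptions: createUserAgentDetector, its option names, and the sample patterns are hypothetical and not part of the original code.

```javascript
// Hypothetical factory: builds a detector from caller-supplied patterns.
// `substrings` are matched case-insensitively; `regexes` should carry their
// own flags (avoid the `g` flag, which makes RegExp.prototype.test stateful).
function createUserAgentDetector({ substrings = [], regexes = [], maxLength = 500 } = {}) {
  // Lowercase the substrings once, up front
  const needles = substrings.map(s => s.toLowerCase());

  return function detect(userAgent) {
    if (typeof userAgent !== 'string' || userAgent.trim() === '') {
      return false;
    }
    // Cap the input length before any matching
    const truncated = userAgent.substring(0, maxLength);
    const lower = truncated.toLowerCase();

    return needles.some(needle => lower.includes(needle)) ||
           regexes.some(re => re.test(truncated));
  };
}

const detectAi = createUserAgentDetector({
  substrings: ['chatgpt', 'gptbot', 'claude'],
  regexes: [/chat\s*gpt/i, /\bllm\b/i],
});

console.log(detectAi('Mozilla/5.0 (compatible; GPTBot/1.1)'));      // true
console.log(detectAi('Mozilla/5.0 (Windows NT 10.0; Win64; x64)')); // false
```

New agents can then be supported by passing a longer list rather than editing the function body.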
AI Bot User Agent Parser
The AIUserAgentParser is a PHP class that reads user agent strings from AI assistants, chatbots, and crawlers. It matches the string against case-insensitive regular expressions and reports three details about the agent: its name, the platform behind it, and the type of AI system it represents.
<?php
/**
 * Class AIUserAgentParser
 * Parses user agent strings to identify AI-related information.
 */
class AIUserAgentParser {
    // AI-specific user agent patterns (checked in order; the first match wins)
    private const AI_PATTERNS = [
        'ChatGPT' => ['platform' => 'OpenAI', 'type' => 'AI Assistant', 'regex' => '/oai-searchbot|chatgpt|gpt-/i'],
        'Claude' => ['platform' => 'Anthropic', 'type' => 'AI Assistant', 'regex' => '/claude|anthropic/i'],
        'Copilot' => ['platform' => 'GitHub', 'type' => 'AI Assistant', 'regex' => '/copilot/i'],
        'Bard' => ['platform' => 'Google', 'type' => 'AI Assistant', 'regex' => '/bard/i'],
        'Alexa' => ['platform' => 'Amazon', 'type' => 'AI Assistant', 'regex' => '/amazonalexa-skill/i'],
        'Bing AI' => ['platform' => 'Microsoft', 'type' => 'AI Assistant', 'regex' => '/bing ai/i'],
        // 'chatgpt-user' and 'anthropic-ai' are already caught by the ChatGPT and Claude entries above
        'GPTBot' => ['platform' => 'OpenAI', 'type' => 'AI Crawler', 'regex' => '/gptbot/i'],
    ];

    // Generic patterns for AI-related user agents
    private const GENERIC_AI_PATTERNS = [
        '/chatbot/i',
        '/ai assistant/i',
        '/assistant\.ai/i',
        '/generative ai/i',
        '/large language model/i',
        '/llm crawler/i',
    ];

    // Version-specific AI model patterns
    private const MODEL_PATTERNS = [
        '/gpt-3/i' => ['name' => 'GPT-3', 'platform' => 'OpenAI', 'type' => 'AI Model'],
        '/gpt-4/i' => ['name' => 'GPT-4', 'platform' => 'OpenAI', 'type' => 'AI Model'],
        '/claude-/i' => ['name' => 'Claude', 'platform' => 'Anthropic', 'type' => 'AI Model'],
    ];

    // Parses a user agent string and identifies AI-related information.
    public function parse(?string $userAgentString = null): array {
        // Use the server User-Agent if none is provided
        $userAgentString = $userAgentString ?? ($_SERVER['HTTP_USER_AGENT'] ?? '');

        // Early exit for empty input
        if ($userAgentString === '') {
            return $this->formatResult('Unknown', 'Unknown', 'Unknown');
        }

        // Check version-specific patterns first: the broader 'gpt-' and 'claude'
        // regexes in AI_PATTERNS would otherwise always match before them
        foreach (self::MODEL_PATTERNS as $pattern => $info) {
            if (preg_match($pattern, $userAgentString)) {
                return $this->formatResult($info['name'], $info['platform'], $info['type']);
            }
        }

        // Check against predefined patterns (every regex carries the /i flag)
        foreach (self::AI_PATTERNS as $name => $info) {
            if (preg_match($info['regex'], $userAgentString)) {
                return $this->formatResult($name, $info['platform'], $info['type']);
            }
        }

        // Check against generic AI patterns
        foreach (self::GENERIC_AI_PATTERNS as $pattern) {
            if (preg_match($pattern, $userAgentString)) {
                return $this->formatResult('Generic AI', 'Unknown', 'AI Assistant/Crawler');
            }
        }

        // Default result if no match
        return $this->formatResult('Unknown', 'Unknown', 'Unknown');
    }

    // Formats the result for consistent output.
    private function formatResult(string $name, string $platform, string $type): array {
        return [
            'name' => $name,
            'platform' => $platform,
            'type' => $type,
        ];
    }
}
Example Usage of the AI User Agent Parser
// Example usage
$parser = new AIUserAgentParser();
$userAgent = 'Mozilla/5.0 (compatible; ChatGPT)';
$result = $parser->parse($userAgent);
print_r($result);

// output
/*
Array
(
    [name] => ChatGPT
    [platform] => OpenAI
    [type] => AI Assistant
)
*/
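For projects that use the JavaScript detector, the parser idea can be ported in a few lines. The following is a minimal sketch, not a full port of the PHP class: the function name parseAiUserAgent and the shortened rule table are assumptions made for illustration, and the rules are checked in order so that the first match wins.

```javascript
// Illustrative rule table (a shortened assumption, not the full PHP list).
// Rules are checked in order, so more specific entries come first.
const AI_RULES = [
  { name: 'GPTBot', platform: 'OpenAI', type: 'AI Crawler', regex: /gptbot/i },
  { name: 'ChatGPT', platform: 'OpenAI', type: 'AI Assistant', regex: /chatgpt|gpt-/i },
  { name: 'Claude', platform: 'Anthropic', type: 'AI Assistant', regex: /claude|anthropic/i },
];

// Hypothetical function: returns { name, platform, type } for a User-Agent string.
function parseAiUserAgent(userAgent) {
  const unknown = { name: 'Unknown', platform: 'Unknown', type: 'Unknown' };

  if (typeof userAgent !== 'string' || userAgent.trim() === '') {
    return unknown;
  }

  // Array.prototype.find returns the first rule whose regex matches
  const rule = AI_RULES.find(r => r.regex.test(userAgent));
  return rule
    ? { name: rule.name, platform: rule.platform, type: rule.type }
    : unknown;
}

console.log(parseAiUserAgent('Mozilla/5.0 (compatible; ChatGPT)'));
// { name: 'ChatGPT', platform: 'OpenAI', type: 'AI Assistant' }
```

Storing the rules in an ordered array rather than an object keeps the match priority explicit.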