Forget IPs: using cryptography to verify bot and agent traffic
With the rise of traffic from AI agents, what’s considered a bot is no longer clear-cut. There are some clearly malicious bots, like ones that DoS your site or do credential stuffing, and ones that most site owners do want to interact with their site, like the bot that indexes your site for a search engine, or ones that fetch RSS feeds.
Historically, Cloudflare has relied on two main signals to verify legitimate web crawlers from other types of automated traffic: user agent headers and IP addresses. The User-Agent
header allows bot developers to identify themselves, i.e. MyBotCrawler/1.1
. However, user agent headers alone are easily spoofed and are therefore insufficient for reliable identification. To address this, user agent checks are often supplemented with IP address validation, the inspection of published IP address ranges to confirm a crawler's authenticity. However, the logic around IP address ranges representing a product or group of users is brittle – connections from the crawling service might be shared by multiple users, such as in the case of privacy proxies and VPNs, and these ranges, often maintained by cloud providers, change over time.
Cloudflare will always try to block malicious bots, but we think our role here is to also provide an affirmative mechanism to authenticate desirable bot traffic. By using well-established cryptography techniques, we’re proposing a better mechanism for legitimate agents and bots to declare who they are, and provide a clearer signal for site owners to decide what traffic to permit.
Today, we’re introducing two proposals – HTTP message signatures and request mTLS – for friendly bots to authenticate themselves, and for customer origins to identify them. In this blog post, we’ll share how these authentication mechanisms work, how we implemented them, and how you can participate in our closed beta.
Existing bot verification mechanisms are broken
</a>
</div>
<p>Historically, if you’ve worked on ChatGPT, Claude, Gemini, or any other agent, you’ve had several options to identify your HTTP traffic to other services: </p><ol><li><p>You define a <a href="https://www.rfc-editor.org/rfc/rfc9110#name-user-agent"><u>user agent</u></a>, an HTTP header described in <a href="https://www.rfc-editor.org/rfc/rfc9110.html#name-user-agent"><u>RFC 9110</u></a>. The problem here is that this header is easily spoofable and there’s not a clear way for agents to identify themselves as semi-automated browsers — agents often use the Chrome user agent for this very reason, which is discouraged. The RFC <a href="https://www.rfc-editor.org/rfc/rfc9110.html#section-10.1.5-9"><u>states</u></a>:
“If a user agent masquerades as a different user agent, recipients can assume that the user intentionally desires to see responses tailored for that identified user agent, even if they might not work as well for the actual user agent being used.”
You publish your IP address range(s). This has limitations because the same IP address might be shared by multiple users or multiple services within the same company, or even by multiple companies when hosting infrastructure is shared (like Cloudflare Workers, for example). In addition, IP addresses are prone to change as underlying infrastructure changes, leading services to use ad-hoc sharing mechanisms like CIDR lists.
You go to every website and share a secret, like a Bearer token. This is impractical at scale because it requires developers to maintain separate tokens for each website their bot will visit.
We can do better! Instead of these arduous methods, we’re proposing that developers of bots and agents cryptographically sign requests originating from their service. When protecting origins, reverse proxies such as Cloudflare can then validate those signatures to confidently identify the request source on behalf of site owners, allowing them to take action as they see fit.

A typical system has three actors:
User: the entity that wants to perform some actions on the web. This may be a human, an automated program, or anything taking action to retrieve information from the web.
Agent: an orchestrated browser or software program. For example, Chrome on your computer, or OpenAI’s Operator with ChatGPT. Agents can interact with the web according to web standards (HTML rendering, JavaScript, subrequests, etc.).
Origin: the website hosting a resource. The user wants to access it through the browser. This is Cloudflare when your website is using our services, and it’s your own server(s) when exposed directly to the Internet.
In the next section, we’ll dive into HTTP Message Signatures and request mTLS, two mechanisms a browser agent may implement to sign outgoing requests, with different levels of ease for an origin to adopt.
Introducing HTTP Message Signatures
</a>
</div>
<p><a href="https://www.rfc-editor.org/rfc/rfc9421.html"><u>HTTP Message Signatures</u></a> is a standard that defines the cryptographic authentication of a request sender. It’s essentially a cryptographically sound way to say, “hey, it’s me!”. It’s not the only way that developers can sign requests from their infrastructure — for example, AWS has used <a href="https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html"><u>Signature v4</u></a>, and Stripe has a framework for <a href="https://docs.stripe.com/webhooks#verify-webhook-signatures-with-official-libraries"><u>authenticating webhooks</u></a> — but Message Signatures is a published standard, and the cleanest, most developer-friendly way to sign requests. </p><p>We’re working closely with the wider industry to support these standards-based approaches. For example, OpenAI has started to sign their requests. In their own words: </p><blockquote><p><i>"Ensuring the authenticity of Operator traffic is paramount. With HTTP Message Signatures (</i><a href="https://www.rfc-editor.org/rfc/rfc9421.html"><i><u>RFC 9421</u></i></a><i>), OpenAI signs all Operator requests so site owners can verify they genuinely originate from Operator and haven’t been tampered with” </i>– Eugenio, Engineer, OpenAI</p></blockquote><p>Without further delay, let’s dive in how HTTP Messages Signatures work to identify bot traffic.</p>
<div>
<h3>Scoping standards to bot authentication</h3>
<a href="#scoping-standards-to-bot-authentication">
</a>
</div>
<p>Generating a message signature works like this: before sending a request, the agent signs the target origin with a public key. When fetching <code>https://example.com/path/to/resource</code>, it signs <code>example.com</code>. This public key is known to the origin, either because the agent is well known, because it has previously registered, or any other method. Then, the agent writes a <b>Signature-Input</b> header with the following parameters:</p><ol><li><p>A validity window (<code>created</code> and <code>expires</code> timestamps)</p></li><li><p>A Key ID that uniquely identifies the key used in the signature. This is a <a href="https://www.rfc-editor.org/rfc/rfc7638.html"><u>JSON Web Key Thumbprint</u></a>. </p></li><li><p>A tag that shows websites the signature’s purpose and validation method, i.e. <code>web-bot-auth</code> for bot authentication.</p></li></ol><p>In addition, the <code>Signature-Agent</code> header indicates where the origin can find the public keys the agent used when signing the request, such as in a directory hosted by <code>signer.example.com</code>. This header is part of the signed content as well.</p><p>Here’s an example:</p>
<pre><code>GET /path/to/resource HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 Chrome/113.0.0 MyBotCrawler/1.1
Signature-Agent: signer.example.com
Signature-Input: sig=("@authority" “signature-agent”);
created=1700000000;
expires=1700011111;
keyid=“ba3e64==”;
tag=“web-bot-auth”
Signature: sig=abc==
For those building bots, we propose signing the authority of the target URI, i.e. crawler.search.google.com for Google Search, operator.openai.com for OpenAI Operator, workers.dev for Cloudflare Workers, and a way to retrieve the bot public key in the form of signature-agent, if present.
The User-Agent
from the example above indicates that the software making the request is Chrome, because it is an agent that uses an orchestrated Chrome to browse the web. You should note that MyBotCrawler/1.1
is still present. The User-Agent
header can actually contain multiple products, in decreasing order of importance. If our agent is making requests via Chrome, that’s the most important product and therefore comes first.
At Internet-level scale, these signatures may add a notable amount of overhead to request processing. However, with the right cryptographic suite, and compared to the cost of existing bot mitigation, both technical and social, this seems to be a straightforward tradeoff. This is a metric we will monitor closely, and report on as adoption grows.
Generating request signatures
</a>
</div>
<p>We’re making several examples for generating Message Signatures for bots and agents <a href="https://github.com/cloudflareresearch/web-bot-auth/"><u>available on Github</u></a> (though we encourage other implementations!), all of which are standards-compliant, to maximize interoperability. </p><p>Imagine you’re building an agent using a managed Chromium browser, and want to sign all outgoing requests. To achieve this, the <a href="https://github.com/w3c/webextensions"><u>webextensions standard</u></a> provides <a href="https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onBeforeSendHeaders"><u>chrome.webRequest.onBeforeSendHeaders</u></a>, where you can modify HTTP headers before they are sent by the browser. The event is <a href="https://developer.chrome.com/docs/extensions/reference/api/webRequest#life_cycle_of_requests"><u>triggered</u></a> before sending any HTTP data, and when headers are available.</p><p>Here’s what that code would look like: </p>
<pre><code>chrome.webRequest.onBeforeSendHeaders.addListener(
function (details) { // Signature and header assignment logic goes here // <CODE> }, { urls: ["<all_urls>"] }, [“blocking”, “requestHeaders”] // requires “installation_mode”: “force_installed” );
Cloudflare provides a web-bot-auth helper package on npm that helps you generate request signatures with the correct parameters. onBeforeSendHeaders
is a Chrome extension hook that needs to be implemented synchronously. To do so, we import {signatureHeadersSync} from “web-bot-auth”
. Once the signature completes, both Signature
and Signature-Input
headers are assigned. The request flow can then continue.
const request = new URL(details.url);
const created = new Date();
const expired = new Date(created.getTime() + 300_000)
// Perform request signature
const headers = signatureHeadersSync(
request,
new Ed25519Signer(jwk),
{ created, expires }
);
// headers
object now contains Signature
and Signature-Input
headers that can be used
This extension code is available on GitHub, alongside a debugging server, deployed at https://http-message-signatures-example.research.cloudflare.com.
Validating request signatures
</a>
</div>
<p>Using our <a href="https://http-message-signatures-example.research.cloudflare.com"><u>debug server</u></a>, we can now inspect and validate our request signatures from the perspective of the website we’d be visiting. We should now see the Signature and Signature-Input headers: </p>
<figure>
<img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/18P5OyGxu2fU0Dpyv70Gjz/d82d62355524ad1914deb41b601bcad2/image3.png" />
</figure><p><sup><i>In this example, the homepage of the debugging server validates the signature from the RFC 9421 Ed25519 verifying key, which the extension uses for signing.</i></sup></p><p>The above demo and code walkthrough has been fully written in TypeScript: the verification website is on Cloudflare Workers, and the client is a Chrome browser extension. We are cognisant that this does not suit all clients and servers on the web. To demonstrate the proposal works in more environments, we have also implemented bot signature validation in Go with a <a href="https://github.com/cloudflareresearch/web-bot-auth/tree/main/examples/caddy-plugin"><u>plugin</u></a> for <a href="https://caddyserver.com/"><u>Caddy server</u></a>.</p>
<div>
<h2>Experimentation with request mTLS</h2>
<a href="#experimentation-with-request-mtls">
</a>
</div>
<p>HTTP is not the only way to convey signatures. For instance, one mechanism that has been used in the past to authenticate automated traffic against secured endpoints is <a href="https://www.cloudflare.com/learning/access-management/what-is-mutual-tls/"><u>mTLS</u></a>, the “mutual” presentation of TLS certificates. As described in our <a href="https://www.cloudflare.com/learning/access-management/what-is-mutual-tls/"><u>knowledge base</u></a>:</p><blockquote><p><i>Mutual TLS, or mTLS for short, is a method for</i><a href="https://www.cloudflare.com/learning/access-management/what-is-mutual-authentication/"><i> </i><i><u>mutual authentication</u></i></a><i>. mTLS ensures that the parties at each end of a network connection are who they claim to be by verifying that they both have the correct private</i><a href="https://www.cloudflare.com/learning/ssl/what-is-a-cryptographic-key/"><i> </i><i><u>key</u></i></a><i>. The information within their respective</i><a href="https://www.cloudflare.com/learning/ssl/what-is-an-ssl-certificate/"><i> </i><i><u>TLS certificates</u></i></a><i> provides additional verification.</i></p></blockquote><p>While mTLS seems like a good fit for bot authentication on the web, it has limitations. If a user is asked for authentication via the mTLS protocol but does not have a certificate to provide, they would get an inscrutable and unskippable error. Origin sites need a way to conditionally signal to clients that they accept or require mTLS authentication, so that only mTLS-enabled clients use it.</p>
<div>
<h3>A TLS flag for bot authentication</h3>
<a href="#a-tls-flag-for-bot-authentication">
</a>
</div>
<p>TLS flags are an efficient way to describe whether a feature, like mTLS, is supported by origin sites. Within the IETF, we have proposed a new TLS flag called <a href="https://datatracker.ietf.org/doc/draft-jhoyla-req-mtls-flag/"><code><u>req mTLS</u></code></a> to be sent by the client during the establishment of a connection that signals support for authentication via a client certificate. </p><p>This proposal leverages the <a href="https://www.ietf.org/archive/id/draft-ietf-tls-tlsflags-14.html"><u>tls-flags</u></a> proposal under discussion in the IETF. The TLS Flags draft allows clients and servers to send an array of one bit flags to each other, rather than creating a new extension (with its associated overhead) for each piece of information they want to share. This is one of the first uses of this extension, and we hope that by using it here we can help drive adoption.</p><p>When a client sends the <a href="https://datatracker.ietf.org/doc/draft-jhoyla-req-mtls-flag/"><code><u>req mTLS</u></code></a> flag to the server, they signal to the server that they are able to respond with a certificate if requested. The server can then safely request a certificate without risk of blocking ordinary user traffic, because ordinary users will never set this flag. </p><p>Let’s take a look at what an example of such a req mTLS would look like in <a href="https://www.wireshark.org/"><u>Wireshark</u></a>, a network protocol analyser. You can follow along in the packet capture <a href="https://github.com/cloudflareresearch/req-mtls/tree/main/assets/demonstration-capture.pcapng"><u>here</u></a>.</p>
<pre><code>Extension: req mTLS (len=12)
Type: req mTLS (65025)
Length: 12
Data: 0b0000000000000000000001</code></pre>
<p>The extension number is 65025, or 0xfe01. This corresponds to an unassigned block of <a href="https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml#tls-extensiontype-values-1"><u>TLS extensions</u></a> that can be used to experiment with TLS Flags. Once the standard is adopted and published by the IETF, the number would be fixed. To use the <code>req mTLS</code> flag the client needs to set the 80<sup>th</sup> bit to true, so with our block length of 12 bytes, it should contain the data 0b0000000000000000000001, which is the case here. The server then responds with a certificate request, and the request follows its course.</p>
<div>
<h3>Request mTLS in action</h3>
<a href="#request-mtls-in-action">
</a>
</div>
<p><i>Code for this section is available in GitHub under </i><a href="https://github.com/cloudflareresearch/req-mtls"><i><u>cloudflareresearch/req-mtls</u></i></a></p><p>Because mutual TLS is widely supported in TLS libraries already, the parts we need to introduce to the client and server are:</p><ol><li><p>Sending/parsing of TLS-flags</p></li><li><p>Specific support for the <code>req mTLS</code> flag</p></li></ol><p>To the best of our knowledge, there is no public implementation of either scheme. Using it for bot authentication may provide a motivation to do so.</p><p>Using <a href="https://github.com/cloudflare/go"><u>our experimental fork of Go</u></a>, a TLS client could support req mTLS as follows:</p>
<pre><code>config := &tls.Config{
TLSFlagsSupported: []tls.TLSFlag{0x50},
RootCAs: rootPool,
Certificates: certs,
NextProtos: []string{"h2"},
} trans := http.Transport{TLSClientConfig: config, ForceAttemptHTTP2: true}
This example library allows you to configure Go to send req mTLS 0xfe01
bytes in the TLS Flags
extension. If you’d like to test your implementation out, you can prompt your client for certificates against req-mtls.research.cloudflare.com using the Cloudflare Research client cloudflareresearch/req-mtls. For clients, once they set the TLS Flags associated with req mTLS
, they are done. The code section taking care of normal mTLS will take over at that point, with no need to implement something new.