Internet Engineering Task Force                            F. Canel, Ed.
Internet-Draft                                               K. Madhavan
Updates: 9309 (if approved)                        Microsoft Corporation
Intended status: Informational                           21 October 2024
Expires: 24 April 2025
Robots Exclusion Protocol Extension to manage AI content use
draft-canel-robots-ai-control-00
Abstract
This document extends RFC9309 by specifying additional rules for
controlling usage of the content in the field of Artificial
Intelligence (AI).
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 24 April 2025.
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Canel & Madhavan Expires 24 April 2025 [Page 1]
Internet-Draft Robots Exclusion Protocol Extension to m October 2024
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Requirements Language . . . . . . . . . . . . . . . . . . . . 2
3. Specification . . . . . . . . . . . . . . . . . . . . . . . . 2
3.1. Robots Control Rules . . . . . . . . . . . . . . . . . . 2
3.2. Application Layer Response Header . . . . . . . . . . . . 3
3.3. HTML Meta Element . . . . . . . . . . . . . . . . . . . . 3
4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 3
1. Introduction
While the Robots Exclusion Protocol [RFC9309] enables service owners
to control how, if at all, automated clients known as crawlers may
access the URIs on their services as defined by [RFC8288], the
protocol does not provide controls on how the data returned by their
services may be used in training generative AI foundation models.
Application developers are requested to honor the rules defined in
this document. The rules are not, however, a form of access
authorization.
2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Specification
3.1. Robots Control Rules
The following rules complement the existing Allow and Disallow
rules:
DisallowAITraining - instructs the parser not to use the data for
training AI language models.
AllowAITraining - instructs the parser that the data can be used
for training AI language models.
The values are case insensitive and honor the same matching logic as
Allow and Disallow rules. Whereas Allow and Disallow rules define
whether the content can be downloaded, AllowAITraining and
DisallowAITraining rules apply only to the use of the content for AI
training.
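
As a non-normative illustration, the matching described above can be sketched in Python. The sketch assumes the new rules are written as "name: path" lines like the Allow and Disallow lines of RFC 9309, ignores user-agent groups and wildcards, and assumes that a path with no matching rule may be used for training. The function names parse_ai_rules and may_train_on are hypothetical and not part of this specification.

```python
from typing import List, Tuple

# Rule names from Section 3.1. The "name: path" line syntax is an
# assumption borrowed from RFC 9309's allow/disallow lines.
AI_RULES = {"allowaitraining": True, "disallowaitraining": False}


def parse_ai_rules(robots_txt: str) -> List[Tuple[str, bool]]:
    """Collect (path_prefix, allowed) pairs from AI training rule lines.

    Simplification: user-agent groups are ignored, so every rule is
    treated as if it applied to all crawlers.
    """
    rules = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        name, _, value = line.partition(":")
        key = name.strip().lower()  # rule names are case insensitive
        path = value.strip()
        if key in AI_RULES and path:
            rules.append((path, AI_RULES[key]))
    return rules


def may_train_on(path: str, rules: List[Tuple[str, bool]]) -> bool:
    """Longest matching prefix wins; on a tie the allow rule wins."""
    best_len, allowed = -1, True  # assumption: no matching rule means allowed
    for prefix, rule_allows in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and rule_allows
        if longer or tie_for_allow:
            best_len, allowed = len(prefix), rule_allows
    return allowed
```

With the rules "DisallowAITraining: /" and "AllowAITraining: /public/", this sketch permits training on /public/page.html and refuses it for /private/page.html, while ordinary Allow and Disallow lines are left untouched because they govern download only.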
3.2. Application Layer Response Header
The same rules can also be set in an application-layer response
header:
DisallowAITraining - instructs the parser not to use the data for
training AI language models.
AllowAITraining - instructs the parser that the data can be used
for training AI language models.
The values are case insensitive and honor the same matching logic as
Allow and Disallow rules.
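
As a non-normative illustration, a client could read such a value from a response as sketched below. This document does not name the header field, so "Example-AI-Control" is a hypothetical placeholder, as is the function name ai_training_from_headers. The sketch also assumes that several values may share one comma-separated field, and it returns the first recognized value because the handling of conflicting values is not specified here.

```python
from typing import Mapping, Optional

# Hypothetical placeholder: the draft does not name the header field.
EXAMPLE_FIELD = "Example-AI-Control"


def ai_training_from_headers(
    headers: Mapping[str, str], field_name: str = EXAMPLE_FIELD
) -> Optional[bool]:
    """Return True or False for an explicit rule, None when none is set."""
    wanted = field_name.lower()
    for name, value in headers.items():
        if name.lower() != wanted:  # HTTP field names are case insensitive
            continue
        # Assumption: several values may be comma-separated in one field.
        for token in value.split(","):
            rule = token.strip().lower()  # rule values are case insensitive
            if rule == "disallowaitraining":
                return False
            if rule == "allowaitraining":
                return True
    return None
```

Returning None rather than a default keeps "no rule present" distinct from an explicit AllowAITraining, so the caller can decide what an absent header means.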
3.3. HTML Meta Element
The same rules can also be set via an HTML meta element.
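
As a non-normative illustration, the sketch below extracts such a rule from a page using Python's standard html.parser module. This document does not give the meta element's syntax, so the form assumed here, a meta element whose name attribute is "robots" and whose content attribute carries the rule value, is only a guess. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class AITrainingMetaParser(HTMLParser):
    """Records the first AI training rule found in a <meta> element."""

    def __init__(self, meta_name: str = "robots") -> None:
        super().__init__()
        # Assumed meta name; the draft does not specify one.
        self.meta_name = meta_name.lower()
        self.allowed: Optional[bool] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.allowed is not None:
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() != self.meta_name:
            return
        # Assumption: the content attribute may hold a comma-separated list.
        for token in attr.get("content", "").split(","):
            rule = token.strip().lower()  # values are case insensitive
            if rule == "disallowaitraining":
                self.allowed = False
                return
            if rule == "allowaitraining":
                self.allowed = True
                return


def ai_training_from_html(html: str) -> Optional[bool]:
    """Return True or False for an explicit rule, None when none is found."""
    parser = AITrainingMetaParser()
    parser.feed(html)
    return parser.allowed
```

Under these assumptions, a page containing a meta element named "robots" with content "DisallowAITraining" yields False, and a page without any such element yields None.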
4. IANA Considerations
TODO: https://www.rfc-editor.org/rfc/rfc9110.html#name-field-name-registry