Internet-Draft | Matroska Tags | November 2024 |
Lhomme, et al. | Expires 16 May 2025 | [Page] |
This document defines the Matroska tags, namely the tag names and their respective semantic meaning.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 16 May 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Matroska is a multimedia container format defined in [RFC9559]. It can store timestamped multimedia data as well as chapters and tags. Tag elements add important metadata to identify and classify the information found in a Matroska Segment. They can tag a whole Segment, separate Tracks elements, individual Chapter elements, or Attachments elements.¶
Some details about tagging are already present in Section 24 of [RFC9559].¶
While the Matroska tagging framework allows anyone to create their own custom tags, it's important to have a common set of values for interoperability. This document intends to define a set of common tag names used in Matroska.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
When a SimpleTag is nested within another SimpleTag, the nested SimpleTag becomes an attribute of its parent SimpleTag. For instance, to store the date on which a singer became the lead performer of a band, the SimpleTag tree would look something like this:¶
* Targets
  * TagTrackUID = {track UID of tagged content}
* ARTIST = "Pet Shop Boys"
  * LEAD_PERFORMER = "Neil Tennant"
    * DATE_STARTED = "1981-08"
This corresponds to this layout of EBML elements:¶
<Tags>
  <Tag>
    <Targets>
      <TagTrackUID>{track UID of tagged content}</TagTrackUID>
    </Targets>
    <SimpleTag>
      <TagName>ARTIST</TagName>
      <TagString>Pet Shop Boys</TagString>
      <SimpleTag>
        <TagName>LEAD_PERFORMER</TagName>
        <TagString>Neil Tennant</TagString>
        <SimpleTag>
          <TagName>DATE_STARTED</TagName>
          <TagString>1981-08</TagString>
        </SimpleTag>
      </SimpleTag>
    </SimpleTag>
  </Tag>
</Tags>
In this way, any SimpleTag can be stored as an attribute of another SimpleTag.¶
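As an illustration only, the nesting described above can be modeled in Python. The `SimpleTag` dataclass and its `to_xml` helper are hypothetical names introduced for this sketch, and the XML output mirrors the illustrative element layout used in this document, not the binary EBML encoding.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class SimpleTag:
    """Hypothetical in-memory model of a SimpleTag (not a real library type)."""
    name: str
    value: str
    children: list["SimpleTag"] = field(default_factory=list)

    def to_xml(self) -> ET.Element:
        """Build a <SimpleTag> element, nesting each child inside its parent."""
        elem = ET.Element("SimpleTag")
        ET.SubElement(elem, "TagName").text = self.name
        ET.SubElement(elem, "TagString").text = self.value
        for child in self.children:
            elem.append(child.to_xml())
        return elem


# The nested example from the text: each inner tag qualifies its parent.
artist = SimpleTag("ARTIST", "Pet Shop Boys", [
    SimpleTag("LEAD_PERFORMER", "Neil Tennant", [
        SimpleTag("DATE_STARTED", "1981-08"),
    ]),
])

xml_text = ET.tostring(artist.to_xml(), encoding="unicode")
```

Keeping the children on the parent object makes the attribute relationship explicit: removing the "LEAD_PERFORMER" tag also removes the date that qualifies it.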
Multiple items SHOULD NOT be stored as a list in a single TagString. If more than one value with the same tag name needs to be stored, then more than one SimpleTag SHOULD be used.¶
The TagName SHOULD consist of UTF-8 capital letters, numbers and the underscore character '_'.¶
The TagName SHOULD NOT contain any space.¶
TagNames starting with the underscore character '_' are not official tags; see Section 3.1.¶
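The naming rules above could be checked with a small helper like the following. This is a sketch under assumptions: the function names are hypothetical, "capital letters" is read as any Unicode uppercase letter, and "numbers" is read as the ASCII digits 0 to 9. The text does not settle either reading.

```python
def is_wellformed_tag_name(name: str) -> bool:
    """Return True if every character is an uppercase letter, an ASCII digit, or '_'."""
    if not name:
        return False
    return all(
        ch == "_" or "0" <= ch <= "9" or (ch.isalpha() and ch.isupper())
        for ch in name
    )


def is_official_tag_name(name: str) -> bool:
    """Names starting with '_' are well formed but are not official tags."""
    return is_wellformed_tag_name(name) and not name.startswith("_")
```

Since the rules are expressed with SHOULD, a reader would more plausibly use such a check to warn about unusual names than to reject files.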
The TargetTypeValue element allows tagging of different parts that are inside or outside a given file. For example, an audio file containing a single song could carry information about the album the song comes from, and even about the CD set, even though neither is contained in the file.¶
For applications to know which level a kind of information (like "TITLE") relates to (CD title or track title), a set of official TargetTypeValue values and TargetType names is also needed. This also means that the same tag name can have different meanings depending on where it is; otherwise there would be 7 different "TITLE_" tag names.¶
For human readability, a TargetType string can be added next to the corresponding TargetTypeValue. Audio and video have different TargetType values. The following table summarizes the TargetType values found in Section 5.1.8.1.1.2 of [RFC9559]:¶
TargetTypeValue | Audio TargetType | Video TargetType | Comment |
---|---|---|---|
70 | COLLECTION | COLLECTION | the high hierarchy consisting of many different lower items |
60 | EDITION / ISSUE / VOLUME / OPUS | SEASON / SEQUEL / VOLUME | a list of lower levels grouped together |
50 | ALBUM / OPERA / CONCERT | MOVIE / EPISODE / CONCERT | the most common grouping level of music and video (e.g., an episode for TV series) |
40 | PART / SESSION | PART / SESSION | when an album or episode has different logical parts |
30 | TRACK / SONG | CHAPTER | the common parts of an album or a movie |
20 | SUBTRACK / PART / MOVEMENT | SCENE | corresponds to parts of a track for audio (like a movement) |
10 | - | SHOT | the lowest hierarchy found in music or movies |
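For readers who want the table in machine-readable form, it can be transcribed into a lookup structure. The sketch below is not part of the specification; the names `TARGET_TYPES` and `values_for_name` are hypothetical.

```python
# TargetTypeValue -> (audio TargetType names, video TargetType names),
# transcribed from the table above.
TARGET_TYPES = {
    70: (("COLLECTION",), ("COLLECTION",)),
    60: (("EDITION", "ISSUE", "VOLUME", "OPUS"), ("SEASON", "SEQUEL", "VOLUME")),
    50: (("ALBUM", "OPERA", "CONCERT"), ("MOVIE", "EPISODE", "CONCERT")),
    40: (("PART", "SESSION"), ("PART", "SESSION")),
    30: (("TRACK", "SONG"), ("CHAPTER",)),
    20: (("SUBTRACK", "PART", "MOVEMENT"), ("SCENE",)),
    10: ((), ("SHOT",)),
}


def values_for_name(name: str, video: bool = False) -> list[int]:
    """Return every TargetTypeValue whose name list contains `name`."""
    column = 1 if video else 0
    return [value for value, names in TARGET_TYPES.items() if name in names[column]]


audio_part_levels = values_for_name("PART")
video_shot_levels = values_for_name("SHOT", video=True)
```

In the audio column, "PART" is listed at both 40 and 20, so a reverse lookup by name can return more than one level. This is one reason to treat the numeric TargetTypeValue, not the TargetType string, as the key.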
Tags from a TargetTypeValue apply to all lower TargetTypeValues. This means that if a CD has the same artist for all tracks, the "ARTIST" tag only needs to be set at TargetTypeValue 50 (ALBUM) and not at each TargetTypeValue 30 (TRACK), although the value can also be repeated for each track. If some tracks of that CD have no known "ARTIST", the value for those tracks MUST be set to an empty string ("") as detailed in Section 24.2 of [RFC9559], so that the album "ARTIST" doesn't apply.¶
If a tag with a given TagName is found at a TargetTypeValue, only values of that TagName are valid at that TargetTypeValue level. In other words, the TagName values from upper TargetTypeValue levels don't apply at that level.¶
Multiple SimpleTag elements with the same TagName can be used at a given TargetTypeValue level when each SimpleTag contains a TagString. For example, this makes it possible to find a single "ARTIST" even when several artists are credited in a collaboration. The concatenation of each TagString represents the value for the TagName at this level. The presentation, for instance with a separator, is up to the application.¶
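The inheritance and override rules above can be sketched as a lookup that walks upward from the requested level. This is a simplified model under stated assumptions: the `resolve` function is hypothetical, and `chain` holds only the tags that already target one item, keyed by TargetTypeValue. Matching Targets UIDs is left out. "Artist A" and "Artist B" are placeholder names.

```python
def resolve(chain: dict[int, dict[str, list[str]]], level: int, name: str) -> list[str]:
    """Return the values of `name` that apply at `level`."""
    # Walk from the requested level upward; the nearest level defining the
    # name wins, and values from higher levels no longer apply.
    for candidate in sorted(lv for lv in chain if lv >= level):
        values = chain[candidate].get(name)
        if values is not None:
            # An empty string marks the value as unknown and stops inheritance.
            return [v for v in values if v != ""]
    return []


album_and_track = {
    50: {"ARTIST": ["Daft Punk"], "TITLE": ["Da Funk"]},
    30: {"TITLE": ["Rollin' & Scratchin'"]},
}
inherited_artist = resolve(album_and_track, 30, "ARTIST")  # taken from level 50
track_title = resolve(album_and_track, 30, "TITLE")        # level 30 overrides level 50

unknown_artist = {50: {"ARTIST": ["Daft Punk"]}, 30: {"ARTIST": [""]}}
blocked = resolve(unknown_artist, 30, "ARTIST")            # empty string blocks the album value

collaboration = {30: {"ARTIST": ["Artist A", "Artist B"]}}
display = " & ".join(resolve(collaboration, 30, "ARTIST"))  # separator is the application's choice
```

Returning a list instead of a pre-joined string keeps each collaborator searchable on its own and leaves the separator to the presentation layer.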
There are three organizational tags defined in Section 4.2:¶
* "TOTAL_PARTS"
* "PART_NUMBER"
* "PART_OFFSET"
These tags allow specifying the ordering of some tags within another group of tags.¶
For example, consider an album with 10 tracks where the second track is to be tagged. "TOTAL_PARTS" is set to "10" at TargetTypeValue 50 (ALBUM), which means the "ALBUM" contains 10 lower parts. The lower part in question is the first lower TargetTypeValue that is specified in the file. So, if it's TargetTypeValue = 30 (TRACK), the album contains 10 tracks. If it's TargetTypeValue = 20 (MOVEMENT), the album contains 10 movements, etc. And since it's the second track within the album, the "PART_NUMBER" at TargetTypeValue 30 (TRACK) is set to "2".¶
If the parts are split into multiple logical entities, "PART_OFFSET" can also be used. For example, when tagging the third track of the second CD of a double CD album with a total of 10 tracks (like The Orb's Adventures Beyond The Ultraworld [OrbUltraworld]), the "TOTAL_PARTS" at TargetTypeValue 50 (ALBUM) is "10", the "PART_NUMBER" at TargetTypeValue 30 (TRACK) is "3", and the "PART_OFFSET" at TargetTypeValue 30 (TRACK) is "5", which is the number of tracks on the first CD.¶
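The arithmetic behind the double CD example can be written out as follows. This is a sketch, and it rests on an inference: the text gives "PART_OFFSET" as the number of tracks on the first CD, so the position within the whole album is assumed to be the sum of "PART_OFFSET" and "PART_NUMBER". The helper name `overall_position` is hypothetical.

```python
def overall_position(part_number: str, part_offset: str = "0") -> int:
    """Combine the PART_NUMBER and PART_OFFSET tag strings into one position."""
    return int(part_offset) + int(part_number)


total_parts = int("10")                 # "TOTAL_PARTS" at TargetTypeValue 50 (ALBUM)
position = overall_position("3", "5")   # "PART_NUMBER" and "PART_OFFSET" at 30 (TRACK)
is_consistent = 1 <= position <= total_parts
```

Under that assumption, the third track of the second CD is the eighth of the album's ten tracks.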
When a TargetTypeValue level doesn't exist, it MUST NOT be specified in the files, so that the "TOTAL_PARTS" and "PART_NUMBER" elements match the same levels.¶
Here is an example of an audio record with 2 tracks in a single file. There is one Tag element for the record, and one Tag element per track on the record, each track being identified by a chapter.¶
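The Tag for the record:¶
* Targets
  * TargetTypeValue = 50
* ARTIST = "Daft Punk"
* TITLE = "Da Funk"
* TOTAL_PARTS = "2"
The Tag for the first track:¶
* Targets
  * TargetTypeValue = 30
  * TagChapterUID = 12345
* TITLE = "Da Funk"
* PART_NUMBER = "1"
The Tag for the second track:¶
* Targets
  * TargetTypeValue = 30
  * TagChapterUID = 67890
* TITLE = "Rollin' & Scratchin'"
* PART_NUMBER = "2"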
This corresponds to this layout of EBML elements:¶
<Tags>
  <!-- description of the whole file/record -->
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <SimpleTag>
      <TagName>ARTIST</TagName>
      <TagString>Daft Punk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Da Funk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>TOTAL_PARTS</TagName>
      <TagString>2</TagString>
    </SimpleTag>
  </Tag>
  <!-- description of the first track/chapter -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <TagChapterUID>12345</TagChapterUID>
    </Targets>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Da Funk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>PART_NUMBER</TagName>
      <TagString>1</TagString>
    </SimpleTag>
  </Tag>
  <!-- description of the second track/chapter -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <TagChapterUID>67890</TagChapterUID>
    </Targets>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Rollin' &amp; Scratchin'</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>PART_NUMBER</TagName>
      <TagString>2</TagString>
    </SimpleTag>
  </Tag>
</Tags>
This document inherits security considerations from the EBML [RFC8794] and Matroska [RFC9559] documents.¶
Tag values can be either TagString or TagBinary blobs. In both cases, issues can arise if parsing of the data fails.¶
Most of the time strings are kept as-is and don't pose a security issue, apart from invalid UTF-8 values.¶
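One way a reader might guard against invalid UTF-8 is to decode strictly and treat a failure as a missing value. The sketch below assumes the raw TagString payload is already available as bytes; `decode_tag_string` is a hypothetical helper name.

```python
from typing import Optional


def decode_tag_string(raw: bytes) -> Optional[str]:
    """Return the decoded text, or None when the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None
```

Whether to drop the tag, substitute replacement characters, or reject the file is left to the application; this sketch only makes the failure explicit.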
String tags that are parsed, like "REPLAYGAIN_GAIN" or "REPLAYGAIN_PEAK" defined in Section 4.10, string tags following the rules from Section 3.2.2, and string tags following other strict formats like URLs may cause issues when the string is bogus or in an unexpected format.¶
Binary tags that need to be parsed, like "MCDI" defined in Section 4.11, may cause issues when the data is bogus or incomplete.¶
Due to the nested nature of SimpleTag, it is possible to exhaust the memory of the host app by using very deep nesting. A host app MAY limit the depth of nesting it accepts to avoid such issues.¶
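A nesting limit of this kind might look like the sketch below. It works on the illustrative XML layout used in this document, not on EBML. The limit of 8 and the names `MAX_DEPTH`, `NestingTooDeep` and `check_nesting` are assumptions made for the example. A real demuxer would more likely apply the same counter while reading elements, so that an overly deep tree is never built in the first place.

```python
import xml.etree.ElementTree as ET

MAX_DEPTH = 8  # hypothetical limit; this document does not mandate a value


class NestingTooDeep(ValueError):
    """Raised when SimpleTag elements are nested beyond the accepted depth."""


def check_nesting(simple_tag: ET.Element, max_depth: int = MAX_DEPTH) -> int:
    """Return the deepest SimpleTag nesting level, or raise NestingTooDeep."""
    deepest = 0
    # An explicit stack avoids recursion, so the check itself cannot overflow
    # the call stack on hostile input.
    stack = [(simple_tag, 1)]
    while stack:
        elem, depth = stack.pop()
        if depth > max_depth:
            raise NestingTooDeep(f"SimpleTag nested deeper than {max_depth} levels")
        deepest = max(deepest, depth)
        stack.extend((child, depth + 1) for child in elem.findall("SimpleTag"))
    return deepest
```

Applied to the three-level "ARTIST" example from earlier in this document, the function returns 3; calling it with `max_depth=2` raises `NestingTooDeep`.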