Document setup

Overview

What document setup covers

Document setup is the set of file level facts that tell reader software what your document is and how to treat it. Four facts matter most: the title, the setting that makes a reader show that title, the flag that marks the file as tagged, and the identifier that declares the file conforms to the accessibility profile.

The title is a short, descriptive name for the document. It is stored inside the file, separate from the file name on your computer. "Introduction to Biology, Fall 2026" is a title. "syllabus_final_v3.pdf" is a file name, not a title.

A viewer preference called DisplayDocTitle tells the reader software to show the title in the window, rather than the file name. When it is turned on, a reader sees and hears the real title. When it is off, the reader sees the file name instead.

A tagged PDF carries a flag that marks the file as tagged: the Marked entry in the MarkInfo dictionary, set to true. Assistive technology checks this flag. If the flag is not set, the software treats the file as untagged, even if tags are present.

The accessibility profile for PDF is PDF/UA. A conforming file carries a PDF/UA identifier in its metadata that declares that conformance. Without it, the file does not announce that it was built to the standard.

What a reader loses when document setup is wrong

A screen reader user often opens a document and hears its title first, before anything else. The title is how they know which file they have open, especially when several are open at once. If there is no title, or the title is just the file name, the reader hears something like "syllabus underscore final underscore v3" or simply nothing useful. They cannot tell one document from another.

If the file is not marked as tagged, the situation is worse. Assistive technology treats the whole document as if it has no structure at all. The headings, tables, and reading order you built are ignored. The reader gets a flat stream of text in whatever order the software guesses.

In depth

A real, descriptive title with DisplayDocTitle turned on

This is the working case. The document has a title stored in its metadata that names the document clearly, and DisplayDocTitle is turned on so reader software shows that title rather than the file name.

A library publishes a reading list. Inside the file, the title is set to "First Year Politics Reading List, Semester One." DisplayDocTitle is turned on. A screen reader user opens it and immediately hears "First Year Politics Reading List, Semester One." They know exactly what they have, even with four other tabs open.

A machine can confirm all of this. Software can read the metadata and see that a title exists and is not empty. It can read the viewer preferences and confirm DisplayDocTitle is set. Where the machine stops is judging whether the title is meaningful. A title of "Untitled" or "Document1" passes the existence check and still tells the reader nothing.
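The machine side of this case can be sketched in a few lines. This is a minimal illustration, not a real checker: the function name check_title_setup, the placeholder list, and the plain dictionary standing in for the viewer preferences are all assumptions made for the example. A real tool would read these values out of the PDF with a PDF library.

```python
from typing import List, Optional

# Titles that exist but identify nothing. The first two come from the text
# above; treating them as a fixed list is an assumption for this sketch.
PLACEHOLDER_TITLES = {"untitled", "document1"}


def check_title_setup(title: Optional[str], viewer_prefs: dict) -> List[str]:
    """Return what a machine can report about the title and DisplayDocTitle."""
    findings = []
    cleaned = (title or "").strip()
    if not cleaned:
        findings.append("fail: title is missing or empty")
    elif cleaned.lower() in PLACEHOLDER_TITLES:
        findings.append("review: title looks like a placeholder")
    else:
        # Existence is all the machine can confirm; meaning is for a person.
        findings.append("pass: title exists")
    if viewer_prefs.get("DisplayDocTitle") is True:
        findings.append("pass: DisplayDocTitle is on")
    else:
        findings.append("fail: DisplayDocTitle is off or absent")
    return findings


for line in check_title_setup(
    "First Year Politics Reading List, Semester One",
    {"DisplayDocTitle": True},
):
    print(line)
```

A "pass" here only means the title exists. Whether it is meaningful still needs a person.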

No title, or a title that is just the file name

Here the document either has no title in its metadata, or the title field has been filled with the file name. To a screen reader user, both produce the same result: nothing meaningful to identify the document.

A student receives a scanned course handout. The author never set a title, so the authoring software copied the file name into the title field. The title reads "scan0007." The student opens it and hears "scan zero zero zero seven." They have three such files and cannot tell which is which without opening and reading each one from the top.

Before: title is "scan0007," which is the file name carried over. After: title is set to "Constitutional Law: Week 3 Handout," which names the document.

A machine can flag the obvious version of this, where the title is empty. It can sometimes flag a title that exactly matches the file name. It cannot judge a title like "Final Version" or "Copy of report," which exists, is not the file name, and still fails to identify the document. That judgement needs a person.
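The file name comparison could look something like the sketch below. The helper name title_matches_file_name and the normalisation rule (lowercase, letters and digits only) are assumptions chosen for illustration; a real checker may compare more loosely or more strictly.

```python
import re
from pathlib import PurePath


def _normalize(text: str) -> str:
    """Lowercase the text and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def title_matches_file_name(title: str, file_name: str) -> bool:
    """True when the title is the file name, with or without its extension."""
    path = PurePath(file_name)
    candidates = {_normalize(path.name), _normalize(path.stem)}
    return _normalize(title) in candidates


print(title_matches_file_name("scan0007", "scan0007.pdf"))  # True
print(title_matches_file_name("Constitutional Law: Week 3 Handout", "scan0007.pdf"))  # False
print(title_matches_file_name("Final Version", "report.pdf"))  # False
```

The last call shows the limit described above: "Final Version" is not the file name, so the check lets it through, even though it fails to identify the document.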

A file not marked as tagged at all

In this case the file may even contain a structure tree and tags, but the MarkInfo flag that marks the file as tagged is not set. Assistive technology reads that flag first. When it is missing, the software concludes the file is untagged and ignores the structure entirely.

A department exports a report from its layout software. The export produced tags, but a setting left the file unmarked. veraPDF and a screen reader both treat it as untagged. The careful heading structure inside the file is never used. A reader who wants to jump from section to section cannot, because as far as the screen reader is concerned, there are no sections.

A machine can confirm whether the file is marked as tagged. This is a clear yes or no. The flag is either set or it is not. What a machine cannot do is repair the underlying intent: if the flag is missing because the tags themselves are unreliable, marking the file as tagged alone will not make it accessible. The tags still have to be correct, which is a separate matter covered in the topics on headings, tables, and the other content types.
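As a rough sketch, the yes or no check and the "tags present but file unmarked" case can be told apart like this. The document catalog is modelled as a plain dictionary, which is an assumption for the example, and the function name tagging_state is hypothetical. The key names MarkInfo, Marked, and StructTreeRoot are the ones the PDF format uses.

```python
def tagging_state(catalog: dict) -> str:
    """Classify a document catalog by its tagged flag and structure tree."""
    mark_info = catalog.get("MarkInfo")
    # The flag counts only when Marked is explicitly true.
    marked = isinstance(mark_info, dict) and mark_info.get("Marked") is True
    has_tree = "StructTreeRoot" in catalog
    if marked and has_tree:
        return "marked as tagged"
    if marked:
        return "marked as tagged, but no structure tree"
    if has_tree:
        return "structure tree present, but not marked as tagged"
    return "untagged"


# The department report from the example: tags were exported, the flag was not.
print(tagging_state({"StructTreeRoot": {"K": []}}))
```

None of these states says anything about whether the tags are correct. That remains a separate check.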

A file missing the PDF/UA identifier

Here the document may be well built, but it does not carry the PDF/UA identifier in its metadata that declares conformance to the accessibility profile. Nothing in the file announces that it was made to the standard.

A publisher remediates a textbook chapter properly: real title, tags, alternative text, declared language. But the final file never had the PDF/UA identifier added. A conformance checker reports that the file does not declare PDF/UA conformance, even though the content may meet the requirements. A buyer who filters for PDF/UA conforming files would pass this one over, because it never made the claim.

A machine can confirm that the identifier is present. The identifier is the part of document setup that records the conformance claim, and the machine cannot judge whether that claim is true. The identifier says "this file was built to PDF/UA." Whether the rest of the file actually meets PDF/UA, including the parts only a person can check, is decided by the full set of checks, not by the identifier on its own. For where the line between machine and human checking sits across all of these, see the topic on what automated checking can and cannot find.
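A presence check for the identifier might look like the following sketch. It assumes the XMP metadata has already been extracted from the PDF and decoded to a string, and the function name pdfua_part and the sample packet are made up for illustration. The pdfuaid namespace and its part property are the ones PDF/UA-1 uses for the identifier.

```python
import xml.etree.ElementTree as ET
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFUAID_NS = "http://www.aiim.org/pdfua/ns/id/"


def pdfua_part(xmp_packet: str) -> Optional[str]:
    """Return the declared PDF/UA part (for example "1"), or None if absent."""
    root = ET.fromstring(xmp_packet)
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        # The property may be written as a child element ...
        element = description.find(f"{{{PDFUAID_NS}}}part")
        if element is not None and element.text and element.text.strip():
            return element.text.strip()
        # ... or as an attribute on rdf:Description.
        attribute = description.get(f"{{{PDFUAID_NS}}}part")
        if attribute and attribute.strip():
            return attribute.strip()
    return None


SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
      <pdfuaid:part>1</pdfuaid:part>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(pdfua_part(SAMPLE))
```

A returned value only confirms that the claim was made. It says nothing about whether the file lives up to it.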

Where machine checking stops for document setup

For document setup, the machine can go a long way. It can confirm a title exists, that DisplayDocTitle is set, that the file is marked as tagged, and that the PDF/UA identifier is present. Those are four clear yes or no checks.

Only a person can confirm whether the title clearly identifies the document. A nonempty title satisfies the machine. Whether a reader can actually use that title to tell this document apart from others is a human judgement.

Reference detail

Standards mapping

Standard	Reference	What it requires
WCAG 2.4.2 Page Titled	Level A	The document has a title that describes its topic or purpose
Matterhorn Protocol 1.1	Checkpoint 06 Metadata	The XMP metadata is present and contains the document title and the PDF/UA identifier
Matterhorn Protocol 1.1	Checkpoint 07 Dictionary	The document dictionary requirements, including the DisplayDocTitle viewer preference
PDF/UA-1	ISO 14289-1	File level setup: title in metadata, the file marked as tagged, the structure tree root, and the PDF/UA identifier

WCAG 2.4.2 Page Titled is a Level A success criterion, the minimum level. It states the outcome: a reader must be able to tell what the document is. The Matterhorn Protocol turns the related PDF/UA requirements into checkpoints a person or software can test. For how these standards relate to each other, see the topics on PDF/UA and the Matterhorn Protocol and on WCAG and the four POUR principles.

File level pieces of document setup

Piece	Where it lives	What it does
Title	XMP metadata (dc:title)	Names the document, separate from the file name
DisplayDocTitle	ViewerPreferences dictionary in the document catalog	Tells reader software to show the title, not the file name
MarkInfo flag	MarkInfo dictionary in the document catalog (Marked entry)	Marks the file as tagged so assistive technology uses the structure
Structure tree root	StructTreeRoot entry in the document catalog	The root of the tag tree the screen reader follows
PDF/UA identifier	XMP metadata (pdfuaid:part)	Declares the file conforms to PDF/UA

Common mistakes

Mistake	Effect on the reader
No title, or a title that is just the file name	A screen reader user hears nothing meaningful and cannot tell the document apart from others
DisplayDocTitle turned off	Reader software shows the file name instead of the real title
File not marked as tagged	Assistive technology treats the file as untagged and ignores its structure
PDF/UA identifier missing	The file does not declare conformance, even if its content meets the standard

The fix

Set a clear title that names the document, turn on DisplayDocTitle so reader software shows that title, mark the file as tagged, and include the PDF/UA identifier. The title is the one piece a person should check by eye: confirm it would tell a reader, who has several documents open, which one this is.
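Two of these four pieces, the title and the PDF/UA identifier, live in the XMP metadata. The sketch below shows roughly what that metadata contains once both are set. It is an illustration under assumptions: the function name build_xmp is hypothetical, the xpacket wrapper that normally surrounds XMP in a file is left out, and writing the result into a PDF needs a PDF library that is not shown here. DisplayDocTitle and the tagged flag are set in the document catalog, not in XMP.

```python
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pdfuaid": "http://www.aiim.org/pdfua/ns/id/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Use the conventional prefixes when the tree is serialised.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix: str, local: str) -> str:
    """Build a namespace-qualified name in ElementTree's {uri}local form."""
    return f"{{{NS[prefix]}}}{local}"


def build_xmp(title: str) -> str:
    """Return XMP metadata carrying a title and a PDF/UA-1 identifier."""
    if not title.strip():
        raise ValueError("the title must not be empty")
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    # dc:title is a language alternative: rdf:Alt holding one rdf:li per language.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    item = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"})
    item.text = title.strip()
    # pdfuaid:part = 1 is the identifier for PDF/UA-1.
    ET.SubElement(desc, q("pdfuaid", "part")).text = "1"
    return ET.tostring(xmpmeta, encoding="unicode")


print(build_xmp("Constitutional Law: Week 3 Handout"))
```

The code refuses an empty title, but it cannot refuse a meaningless one. Checking the title by eye is still the step that matters.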

Authoritative sources

W3C, "Understanding Success Criterion 2.4.2: Page Titled," https://www.w3.org/WAI/WCAG21/Understanding/, 2024.
PDF Association, "The Matterhorn Protocol 1.1," https://pdfa.org/resource/the-matterhorn-protocol/, 2021.
International Organization for Standardization, ISO 14289-1:2014 (PDF/UA-1), 2014.
WebAIM, "PDF Accessibility," https://webaim.org/, 2024.