Lists
Overview
What a list is in a tagged PDF
A list is a set of items that belong together as a group: the steps in a recipe, the points in a summary, the entries in a reading list. On the page you see them as bullets or numbers running down the margin. In a tagged PDF, that grouping has to be written into the structure tree, the invisible layer a screen reader follows, not just drawn on the visible page.
A correctly tagged list uses four kinds of tag. The list as a whole is an L tag. Each item inside it is an LI tag. Inside each item, the bullet or number is an Lbl tag, short for label, and the words of the item are an LBody tag, short for list body. With these in place, a screen reader can announce "list, five items," read each item in turn, and tell the reader when the list ends.
What a reader loses when a list is not tagged
When a list is only drawn, not tagged, the grouping disappears for a screen reader user. The bullets you see are just characters sitting in front of ordinary paragraphs. The screen reader does not say "list, five items." It reads five separate paragraphs that each happen to start with a dot, with nothing to say they are one set and nothing to say how many there are.
That count and that grouping carry real meaning. A reader who hears "list, five items" knows to expect five things and can decide to skip the whole list or move through it item by item. A reader who hears five loose paragraphs has to work out, from the wording alone, that these belong together, and never learns how many there are until the last one has gone by. The information is on the page for a sighted reader and missing for a screen reader user.
A machine can confirm that a list you have tagged is built correctly: that the L, LI, Lbl, and LBody tags are present, in the right relationship, and nested properly. What a machine cannot do is look at five bulleted paragraphs and know whether they are truly a list. Whether something that looks like a list really is one is a judgement only a person can make.
In depth
A real list, tagged correctly
A genuine list that has been tagged properly is the case everything else is measured against.
Take a five step set of instructions. The whole set is wrapped in one L tag. Each step is an LI tag. Inside each LI, the number is an Lbl tag and the instruction text is an LBody tag. Laid out as structure, one step looks like this:
- L (the list)
  - LI (item)
    - Lbl: "1."
    - LBody: "Open the document in your reader."
When a screen reader reaches this structure, it announces the list and its item count, reads each label and body in order, and signals the end of the list. The reader hears that this is a group of five steps, hears each step, and knows when the steps are finished. The grouping and the count are spoken, not left to be guessed.
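The announcement described above can be sketched in a few lines of Python. This is a toy model, not how any particular screen reader or PDF library works: the `Node` class and the `announce` function are hypothetical names invented for illustration, the spoken wording is an assumption, and the second step is made up to give the list more than one item.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element in a simplified, hypothetical model of a structure tree."""
    tag: str                      # "L", "LI", "Lbl", "LBody", "P", ...
    text: str = ""                # text held directly by this element
    children: list["Node"] = field(default_factory=list)


def announce(lst: Node) -> list[str]:
    """Return the lines a toy screen reader might speak for one L element."""
    items = [c for c in lst.children if c.tag == "LI"]
    noun = "item" if len(items) == 1 else "items"
    spoken = [f"list, {len(items)} {noun}"]
    for item in items:
        # Read the label and the body in the order they appear in the item.
        parts = [c.text for c in item.children if c.tag in ("Lbl", "LBody")]
        spoken.append(" ".join(parts))
    spoken.append("end of list")
    return spoken


steps = Node("L", children=[
    Node("LI", children=[
        Node("Lbl", "1."),
        Node("LBody", "Open the document in your reader."),
    ]),
    # Illustrative second step, not taken from a real document.
    Node("LI", children=[
        Node("Lbl", "2."),
        Node("LBody", "Turn on your screen reader."),
    ]),
])

for line in announce(steps):
    print(line)
```

The count comes from the number of LI elements under the L, which is why the count is only available when the items are actually tagged as items.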
Bulleted paragraphs that were never tagged as a list
This is the most common failure for this element, and it can pass a quick visual look because the page still shows bullets.
Here the bullet characters were typed at the start of each paragraph, or drawn by the layout, but no L, LI, Lbl, or LBody tags were ever applied. In the structure tree there is no list at all. There are only ordinary paragraph tags, each one starting with a dot or a number that is just text.
Before, in the structure tree:
- P: "• Bring two forms of identification."
- P: "• Arrive fifteen minutes early."
- P: "• Silence your phone."
A screen reader reads three separate paragraphs. It never says "list, three items." The reader does not hear that these belong together, and never learns there are three. The dots are read as stray punctuation or skipped, depending on the tool.
After, tagged as a list:
- L
  - LI: Lbl "•", LBody "Bring two forms of identification."
  - LI: Lbl "•", LBody "Arrive fifteen minutes early."
  - LI: Lbl "•", LBody "Silence your phone."
Now the screen reader announces "list, three items," reads each item with its bullet, and marks the end. The same three lines that looked like a list to a sighted reader now behave like a list for a screen reader user.
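As a rough illustration of the before and after, the change can be modelled as a small transformation on a simplified tree. In practice this work is done in an authoring tool or a PDF editor, after a person has decided the paragraphs really are a list; the `Node` class, the `paragraphs_to_list` function, and the set of bullet characters below are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

BULLETS = ("•", "-", "*")  # assumed bullet characters; adjust as needed


@dataclass
class Node:
    """One element in a simplified, hypothetical model of a structure tree."""
    tag: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def paragraphs_to_list(paragraphs: list[Node]) -> Node:
    """Wrap bulleted P elements in L > LI > (Lbl, LBody)."""
    lst = Node("L")
    for p in paragraphs:
        # Split the typed bullet from the words that follow it.
        label, _, body = p.text.partition(" ")
        if label not in BULLETS or not body:
            raise ValueError(f"not a bulleted paragraph: {p.text!r}")
        lst.children.append(
            Node("LI", children=[Node("Lbl", label), Node("LBody", body.strip())])
        )
    return lst


before = [
    Node("P", "• Bring two forms of identification."),
    Node("P", "• Arrive fifteen minutes early."),
    Node("P", "• Silence your phone."),
]
after = paragraphs_to_list(before)
print(after.tag, len(after.children))
```

The bullet moves out of the paragraph text and into its own Lbl element, so the item body no longer starts with a stray character.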
A nested list
Lists often contain other lists: a main point with sub points beneath it. The structure has to carry that nesting, not flatten it.
A nested list is built by placing a whole L tag inside the LBody of the item it belongs to. The sub list is part of its parent item, not a sibling sitting next to it. As structure, a checklist with one item that has two sub items looks like this:
- L
  - LI
    - Lbl: "1."
    - LBody: "Documents to bring"
      - L (the nested list, inside the LBody above)
        - LI: Lbl "a.", LBody "Identification"
        - LI: Lbl "b.", LBody "Proof of address"
A reader hearing this knows that "Identification" and "Proof of address" sit underneath "Documents to bring," not at the same level as it. If the nesting is broken, for example if the sub list is placed beside the parent item instead of inside it, the levels collapse. The reader hears a flat run of items and loses the relationship between the main point and its sub points. The two sub items appear to be top level entries, which changes what the list means.
Where machine checking stops for lists
A machine can do a great deal with lists. It can confirm that an L tag contains LI tags and not loose content, that items carry their Lbl and LBody parts, and that a nested list is placed inside an LBody rather than dropped in at the wrong level. Broken structure and broken nesting are things software can find and report.
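A minimal sketch of that kind of structural check follows, applying the rules exactly as this topic states them. It works on a simplified in-memory tree, not on a real PDF file; `Node`, `check_list`, and `item` are hypothetical names, and a real checker covers many more conditions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One element in a simplified, hypothetical model of a structure tree."""
    tag: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def check_list(node: Node) -> list[str]:
    """Collect structural problems in this element and everything below it."""
    problems: list[str] = []
    child_tags = [c.tag for c in node.children]
    if node.tag == "L":
        # An L should hold only LI elements, not loose content or a bare L.
        for tag in child_tags:
            if tag != "LI":
                problems.append(f"L contains {tag}, expected LI")
    elif node.tag == "LI":
        for required in ("Lbl", "LBody"):
            if required not in child_tags:
                problems.append(f"LI is missing {required}")
        if "L" in child_tags:
            problems.append("nested L is beside LBody, not inside it")
    for child in node.children:
        problems.extend(check_list(child))
    return problems


def item(label: str, body: str, nested: Optional[Node] = None) -> Node:
    """Build one LI; an optional nested L goes inside the LBody."""
    body_node = Node("LBody", body, [nested] if nested is not None else [])
    return Node("LI", children=[Node("Lbl", label), body_node])


sub = Node("L", children=[item("a.", "Identification"), item("b.", "Proof of address")])
good = Node("L", children=[item("1.", "Documents to bring", nested=sub)])
broken = Node("L", children=[item("1.", "Documents to bring"), sub])  # sub list beside its parent

print(check_list(good))
print(check_list(broken))
```

The check only ever runs on elements that are already tagged L or LI, so it has nothing to say about content that was never tagged as a list.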
What a machine cannot do is decide whether something that looks like a list is actually a list, or whether something that is a list was left untagged. Five bulleted paragraphs with no list tags are, to the software, just five paragraphs. The machine has no failure to report, because no list was claimed. Only a person, reading the content, can see that those five paragraphs are one set of items that should have been grouped and counted. This is the same boundary that runs through PDF accessibility generally; for the full picture, see the topic on what automated checking can and cannot find.
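The most software can do on this side of the boundary is point a person at places worth looking. The sketch below, which assumes paragraph text is available as plain strings, flags runs of consecutive paragraphs that start with a bullet-like character. The function name, the bullet characters, and the sample paragraphs are all illustrative, and the result is only a list of candidates for human review: a paragraph that opens with a dash for some other reason would be flagged too.

```python
BULLET_STARTS = ("•", "-", "*")  # assumed bullet-like characters


def candidate_lists(paragraphs: list[str], min_run: int = 2) -> list[tuple[int, int]]:
    """Return (first, last) index pairs for runs of bullet-like paragraphs."""
    runs: list[tuple[int, int]] = []
    start = None
    # The empty string at the end is a sentinel that closes a final run.
    for i, text in enumerate(paragraphs + [""]):
        looks_bulleted = text.lstrip().startswith(BULLET_STARTS)
        if looks_bulleted and start is None:
            start = i
        elif not looks_bulleted and start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    return runs


page = [
    "Before you arrive:",
    "• Bring two forms of identification.",
    "• Arrive fifteen minutes early.",
    "• Silence your phone.",
    "Thank you.",
]
print(candidate_lists(page))
```

A flagged run is a prompt to open the document and read it, not a reported failure.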
Reference detail
Standards mapping
| Standard | Identifier | What it requires for lists |
|---|---|---|
| WCAG | 1.3.1 Info and Relationships, Level A | The grouping a reader sees, such as a list, must be conveyed in the structure, not by appearance alone |
| Matterhorn Protocol 1.1 | Checkpoint 16, Lists | Lists are tagged with the correct list structure |
| Matterhorn Protocol 1.1 | Checkpoint 09, Appropriate tags | Covers list nesting and the soundness of the tag structure that holds the list |
WCAG 1.3.1 is a Level A success criterion, the minimum level. Level AA, the level most laws and policies require, includes everything in Level A, so 1.3.1 applies under an AA requirement as well.
PDF tags used for lists
| Tag | Name | Role |
|---|---|---|
| L | List | The list as a whole |
| LI | List item | One item within the list |
| Lbl | Label | The bullet or number for an item |
| LBody | List body | The content of the item; a nested list is placed inside here |
Common mistakes
| Mistake | Effect on a screen reader user |
|---|---|
| Bulleted or numbered paragraphs not tagged as a list | No "list, N items" announcement; the grouping and count are lost |
| List items not wrapped in LI | The structure is not a valid list; items are not announced as items |
| Broken nesting | Sub items collapse to the top level; the relationship between a point and its sub points disappears |
The fix in every case is to tag the list with L, LI, Lbl, and LBody, so the grouping and the count are announced and any nesting is preserved.
Authoritative sources
- W3C, "Understanding Success Criterion 1.3.1: Info and Relationships," https://www.w3.org/WAI/WCAG21/Understanding/info-and-relationships.html, 2024.
- PDF Association, "The Matterhorn Protocol 1.1," https://pdfa.org/resource/the-matterhorn-protocol/, 2021.
- WebAIM, "PDF Accessibility," https://webaim.org/techniques/acrobat/, 2024.