EquitableDocs Document Accessibility Guide

Tables

Overview

What a table does in a document

A table arranges information in a grid so that each piece of data sits where a row and a column meet. The row and the column together give a single cell its meaning. A number in a table is not just a number. It is the figure for one particular thing in one particular category, and you only know which because of the row and column it belongs to.

A sighted reader gets this for free. They glance up to the column heading and across to the row heading, and the cell makes sense at once. A screen reader user cannot glance. They move through the table one cell at a time, and they depend on the file telling the assistive technology which cells are headers and which are data. When that information is in the file, the screen reader can announce, for each cell, the headers it belongs under. The reader hears "Region: South, Sales: 4,200" instead of a bare "4,200" with no idea what it counts.

What a reader loses when a table is wrong

When a table is built without proper structure, a screen reader user hears a stream of values with no headings attached. They reach a cell that says "4,200" and have no way to learn that it is the South region's sales figure. They can count cells and try to hold the grid in their head, but a table of any size becomes impossible to follow.

A second, quieter problem is the layout table. Some documents use a table only to position things on the page, not to present data. If that arrangement is tagged as a real data table, the screen reader announces rows, columns, and cell coordinates for content that has no grid meaning at all. The reader is told they are in row three, column two of something that is really just a heading next to a logo. The structure actively misleads them.

The rule that covers tables is WCAG 1.3.1, Info and Relationships, at Level A. It says the relationships conveyed by the visual layout, here the link between a cell and its headers, must also be present in a form the assistive technology can read.

In depth

Three kinds of table, and why the difference matters

Not every grid on a page is the same kind of thing, and the right way to tag one depends on which kind it is. There are three cases. A simple data table has one level of headers, a single header row or a single header column. A complex data table has more than one level of headers, or merged cells, or both. A layout table is not really a table at all: it is a grid used only to place content on the page. Each case is handled differently, and the most damaging mistakes come from treating one kind as another.

A tagged PDF carries its table in the structure tree, the tag tree a screen reader follows, using a set of tags: a Table tag wraps the whole thing, TR tags mark each table row, TH tags mark header cells, and TD tags mark data cells. There are optional grouping tags, THead, TBody, and TFoot, for the head, body, and foot of the table, and a Caption tag for a table caption. These tags are the same across all three cases. What changes between cases is how the header cells are marked and how each data cell is tied to the headers that explain it.

A simple data table

A simple data table has one row of headers across the top, or one column of headers down the side, and nothing more complicated. This is the most common case and the easiest to get right.

Take a table of three regions and their sales. The top row reads "Region" and "Sales." Below it, three rows give "North, 3,100," "South, 4,200," and "East, 2,750." The top row is the header row. In the file, each cell in that top row should be a TH, a header cell, and every cell below should be a TD, a data cell.
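
The tagging described above can be pictured as a small tree. The sketch below is only an illustration in Python: the nested-tuple model and the outline function are hypothetical and do not correspond to any real PDF library. It shows the sales table with TH cells in the header row and TD cells beneath.

```python
# Hypothetical in-memory picture of a tagged table: (tag, children or text).
# This is an illustration only, not the API of any PDF tool.
table = ("Table", [
    ("TR", [("TH", "Region"), ("TH", "Sales")]),
    ("TR", [("TD", "North"), ("TD", "3,100")]),
    ("TR", [("TD", "South"), ("TD", "4,200")]),
    ("TR", [("TD", "East"), ("TD", "2,750")]),
])


def outline(node, depth=0):
    """Return the tag tree as indented lines, one line per tag."""
    tag, content = node
    indent = "  " * depth
    if isinstance(content, str):
        # Leaf cell: show the tag and the text it holds.
        return [f"{indent}{tag}: {content}"]
    lines = [f"{indent}{tag}"]
    for child in content:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(table)))
```

Printing the outline gives one Table line, four TR lines, and eight cell lines, which is the shape a person would expect to see when inspecting the tag tree of this table.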

For the headers to actually do their job, each TH carries a Scope attribute. Scope tells the assistive technology which cells a header applies to. A header at the top of a column has Scope set to Column, meaning it governs every cell beneath it. A header at the start of a row has Scope set to Row, meaning it governs every cell along that row. In the sales table, "Region" and "Sales" are column headers, so each gets Scope set to Column. Now when the reader lands on the cell "4,200," the screen reader can announce "Sales, 4,200," because it knows that the "Sales" header, with Scope Column, governs that cell.
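
How Scope leads to an announcement can be sketched in a few lines of Python. The Cell class and the headers_for and announce functions are hypothetical names invented for this illustration, and the sketch assumes a simple grid with no merged cells. Real assistive technology is more involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    tag: str                     # "TH" or "TD"
    text: str
    scope: Optional[str] = None  # "Column" or "Row"; only meaningful on a TH


# The sales table from the text, with Scope set on both column headers.
rows = [
    [Cell("TH", "Region", "Column"), Cell("TH", "Sales", "Column")],
    [Cell("TD", "North"), Cell("TD", "3,100")],
    [Cell("TD", "South"), Cell("TD", "4,200")],
    [Cell("TD", "East"), Cell("TD", "2,750")],
]


def headers_for(rows: List[List[Cell]], r: int, c: int) -> List[str]:
    """Collect the headers whose Scope covers the cell at row r, column c."""
    found = []
    # A TH with Scope Column governs every cell beneath it in its column.
    for above in range(r):
        cell = rows[above][c]
        if cell.tag == "TH" and cell.scope == "Column":
            found.append(cell.text)
    # A TH with Scope Row governs every cell after it along its row.
    for left in range(c):
        cell = rows[r][left]
        if cell.tag == "TH" and cell.scope == "Row":
            found.append(cell.text)
    return found


def announce(rows: List[List[Cell]], r: int, c: int) -> str:
    """Join the governing headers and the cell text, as a reader might hear them."""
    return ", ".join(headers_for(rows, r, c) + [rows[r][c].text])


if __name__ == "__main__":
    print(announce(rows, 2, 1))  # Sales, 4,200
```

If the Scope values are removed from the header cells, headers_for returns an empty list and the announcement falls back to the bare figure.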

A machine can confirm that a TH exists and that it has a Scope value. It cannot confirm that you marked the right row as headers. If your real header row was left as plain data cells and the first row of figures was marked as headers by mistake, the file is internally consistent and the machine sees nothing wrong. Only a person looking at the table can see that the headers are on the wrong cells.
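
The limit described above shows up clearly in a toy checker. The sketch below is a hypothetical Python check, not the logic of any real validation tool. It assumes cells are given as (tag, text, scope) tuples and that Row and Column are the accepted Scope values, as named in this topic.

```python
# Scope values named in this topic. Extend the set if your tagging format
# accepts other values (an assumption to verify against your own tooling).
VALID_SCOPES = {"Row", "Column"}


def scope_problems(rows):
    """Return (row, column, text) for every TH that lacks a valid Scope."""
    problems = []
    for r, row in enumerate(rows):
        for c, (tag, text, scope) in enumerate(row):
            if tag == "TH" and scope not in VALID_SCOPES:
                problems.append((r, c, text))
    return problems


# Correctly tagged: the real header row is TH with Scope Column.
correct = [
    [("TH", "Region", "Column"), ("TH", "Sales", "Column")],
    [("TD", "North", None), ("TD", "3,100", None)],
]

# Wrongly tagged: the real headers are plain TD cells and the first row of
# figures is marked as headers. The file is still internally consistent.
wrong_row = [
    [("TD", "Region", None), ("TD", "Sales", None)],
    [("TH", "North", "Column"), ("TH", "3,100", "Column")],
]

# A header cell with no Scope at all: this one the checker does catch.
missing_scope = [
    [("TH", "Region", "Column"), ("TH", "Sales", None)],
    [("TD", "North", None), ("TD", "3,100", None)],
]

if __name__ == "__main__":
    print(scope_problems(correct))        # []
    print(scope_problems(wrong_row))      # []
    print(scope_problems(missing_scope))  # [(0, 1, 'Sales')]
```

The second result is the important one: the checker reports nothing for the table whose headers are on the wrong row, which is exactly the error only a person can see.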

A complex data table

A complex data table has more than one level of headers, or cells that span more than one row or column, or both. Scope alone is no longer enough, because a single cell may sit under two or more headers at different levels, and Scope cannot express that.

Picture a sales table broken down by quarter. One header row has "Q1" and "Q2." Above both of those sits a single cell, "2026," that spans the two columns. Down the side, "North" and "South" each head a band of rows. A figure in the body of this table belongs under three headers at once: a year, a quarter, and a region. To carry this, two things are needed.

First, the cells that span more than one column or row must say so. A cell that covers two columns carries a ColSpan attribute set to 2. A cell that covers two rows carries a RowSpan attribute set to 2. The "2026" cell spanning Q1 and Q2 has ColSpan set to 2. Without this, the file claims the grid is a plain rectangle when it is not, and the cells below "2026" line up against the wrong headers.
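
The effect of a missing ColSpan can be seen by laying the cells onto a grid. The Python sketch below is hypothetical: the expand function and the (text, colspan, rowspan) tuples are invented for this illustration, the table is simplified to one row per region, and every figure other than "4,200" is a made-up placeholder.

```python
def expand(rows):
    """Map each (row, column) slot to the text of the cell that occupies it."""
    grid = {}
    for r, row in enumerate(rows):
        c = 0
        for text, colspan, rowspan in row:
            # Skip slots already taken by a cell spanning down from above.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    return grid


# Each cell is (text, colspan, rowspan). The empty corner cell spans two rows.
with_span = [
    [("", 1, 2), ("2026", 2, 1)],
    [("Q1", 1, 1), ("Q2", 1, 1)],
    [("North", 1, 1), ("4,200", 1, 1), ("3,900", 1, 1)],  # 3,900 is a placeholder
    [("South", 1, 1), ("3,100", 1, 1), ("2,750", 1, 1)],  # placeholders
]

# The same table, but the "2026" cell does not declare that it spans two columns.
without_span = [[("", 1, 2), ("2026", 1, 1)]] + with_span[1:]

if __name__ == "__main__":
    good = expand(with_span)
    bad = expand(without_span)
    print(good[(0, 1)], good[(0, 2)])  # 2026 2026
    print((0, 2) in bad)               # False
```

With ColSpan declared, both quarter columns sit under "2026." Without it, the Q2 column has no year above it at all, so any figure in that column loses one of its three headers.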

Second, because Scope cannot reach across multiple header levels, each data cell is tied to its headers explicitly. Every header cell is given an ID, a unique label. Each data cell then carries a Headers attribute that lists the IDs of the headers it belongs under. The "4,200" cell would list the IDs of "2026," "Q1," and "North." Now the screen reader can announce all three headers for that one figure, and the reader hears exactly what it counts.
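
The ID and Headers mechanism amounts to a lookup. The sketch below is a minimal Python illustration, assuming hypothetical ID strings such as "h-2026" and a plain dictionary in place of a real tag tree. It is not how any particular screen reader is implemented.

```python
# Every header cell carries a unique ID. These ID strings are made up.
headers = {
    "h-2026": "2026",
    "h-q1": "Q1",
    "h-q2": "Q2",
    "h-north": "North",
    "h-south": "South",
}

# The data cell lists the IDs of the headers it belongs under.
cell = {"text": "4,200", "headers": ["h-2026", "h-q1", "h-north"]}


def announce(cell, headers):
    """Look up each listed header ID and read the headers before the value."""
    names = [headers[header_id] for header_id in cell["headers"]]
    return ", ".join(names + [cell["text"]])


if __name__ == "__main__":
    print(announce(cell, headers))  # 2026, Q1, North, 4,200
```

The order of the IDs in the Headers list is the order in which the headers are read in this sketch, so listing them from the outermost header inward gives the most natural announcement.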

A machine can confirm that ColSpan and RowSpan are present where the grid is irregular, and that the Headers attributes point to IDs that exist. It cannot confirm that the associations are the right ones. If a data cell lists the ID for "Q1" when it visually sits under "Q2," the file is valid and the IDs all resolve, but the reader is told the wrong quarter. Only a person comparing the visual table to the tags can catch that.
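
A toy version of the resolution check makes the gap concrete. This Python sketch uses hypothetical names and made-up ID strings. It confirms only that every listed ID exists, which is all a machine can know.

```python
def unresolved_headers(header_ids, data_cells):
    """Return (cell text, ID) pairs where a Headers entry points to no known ID."""
    return [
        (cell["text"], ref)
        for cell in data_cells
        for ref in cell["headers"]
        if ref not in header_ids
    ]


header_ids = {"h-2026", "h-q1", "h-q2", "h-north"}

# A cell that visually sits under Q2 but lists the ID for Q1.
# Every ID resolves, so the check passes even though the quarter is wrong.
wrong_quarter = [{"text": "3,900", "headers": ["h-2026", "h-q1", "h-north"]}]

# A cell that lists an ID no header carries. This the check does catch.
dangling = [{"text": "3,900", "headers": ["h-2026", "h-q3", "h-north"]}]

if __name__ == "__main__":
    print(unresolved_headers(header_ids, wrong_quarter))  # []
    print(unresolved_headers(header_ids, dangling))       # [('3,900', 'h-q3')]
```

The empty result for the first case is the point: nothing in the file records where the cell sits visually, so the wrong association is invisible to the check.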

A layout table

A layout table uses the grid only to position content on the page. There is no data being cross-referenced, no headers, no cells that mean anything by their row and column. A common example is a header band where a logo sits on the left and a document title sits on the right, arranged with an invisible two-cell table so they line up.

A layout table should never be tagged as a data table. There is nothing for headers to govern, so announcing rows, columns, and cell positions only confuses the reader. There are two correct ways to handle it. The first is to mark the table structure as an artifact, which tells the assistive technology to skip the grid and read the content inside it as ordinary flowing content. The second, often better, is to rebuild the content without a table at all. The logo and title in the example can be a figure with alternative text followed by a heading, read in that order, with no grid involved.

This is the case where a machine is least able to help. A machine can see that something is tagged as a Table and that its structure is regular. It cannot tell whether the thing is a real data table or a layout device, because both look like grids in the file. Deciding that a given table is really just positioning, and should be marked as an artifact or rebuilt without a table, is a human judgement every time.

Where machine checking stops for tables

For tables, a machine can confirm the plumbing: that a TH carries a Scope value, that spanning cells declare ColSpan or RowSpan, that the rows are regular and the cell counts line up, that Headers attributes resolve to real IDs. The Matterhorn Protocol, which turns the PDF/UA rules into testable conditions, covers tables under its checkpoint 15, Tables, and the soundness of table nesting under checkpoint 09, Appropriate tags.
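
One piece of that plumbing, the check that cell counts line up, can be sketched as follows. This is a hypothetical Python illustration, not the Matterhorn Protocol's own test procedure. It assumes each row is given as a list of (colspan, rowspan) pairs.

```python
def row_widths(rows):
    """Effective width of each row once ColSpan and RowSpan are counted."""
    widths = [0] * len(rows)
    for r, row in enumerate(rows):
        for colspan, rowspan in row:
            # A cell adds its width to its own row and to each row it spans into.
            for dr in range(rowspan):
                if r + dr < len(rows):
                    widths[r + dr] += colspan
    return widths


def is_regular(rows):
    """A table is regular when every row has the same effective width."""
    return len(set(row_widths(rows))) <= 1


# Corner cell spanning two rows, then a year cell spanning two columns.
declared = [
    [(1, 2), (2, 1)],
    [(1, 1), (1, 1)],
    [(1, 1), (1, 1), (1, 1)],
    [(1, 1), (1, 1), (1, 1)],
]

# The same table with the ColSpan missing from the year cell.
undeclared = [[(1, 2), (1, 1)]] + declared[1:]

if __name__ == "__main__":
    print(row_widths(declared), is_regular(declared))      # [3, 3, 3, 3] True
    print(row_widths(undeclared), is_regular(undeclared))  # [2, 3, 3, 3] False
```

A check like this can flag that a span is missing somewhere in the first row. It cannot say which headers the affected cells were meant to sit under.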

What a machine cannot do is judge meaning. It cannot confirm that the cells you marked as headers are the real headers, that the header-to-cell associations are the ones a reader would expect, or that a given grid is a genuine data table rather than a layout device that should not be a table at all. Those are the three judgements that decide whether a table actually works, and each of them needs a person. For more on this division of labour, see the topic on what automated checking can and cannot find. For how WCAG and PDF/UA fit together around it, see the topic on PDF/UA and the Matterhorn Protocol.

Reference detail

Standards mapping

Item                           | Detail
WCAG success criterion         | 1.3.1 Info and Relationships
WCAG level                     | A
WCAG principle                 | Perceivable
Matterhorn checkpoint, tables  | Checkpoint 15, Tables
Matterhorn checkpoint, nesting | Checkpoint 09, Appropriate tags

PDF tags used in a table

Tag     | Role
Table   | Wraps the whole table
TR      | Marks one table row
TH      | Marks a header cell
TD      | Marks a data cell
THead   | Optional grouping for the header rows
TBody   | Optional grouping for the body rows
TFoot   | Optional grouping for the footer rows
Caption | Marks the table caption

Attributes that tie cells to headers

Attribute       | What it does
Scope           | On a TH, set to Row or Column, says which cells the header governs; used for simple tables
ColSpan         | On a cell, the number of columns it covers; needed for merged cells
RowSpan         | On a cell, the number of rows it covers; needed for merged cells
Headers and IDs | Each header carries an ID; each data cell lists the header IDs it belongs under; used for complex tables

Common mistakes

Mistake                                 | What goes wrong
A layout table tagged as a data table   | The reader is told rows, columns, and cell positions for content that has no grid meaning
Header cells with no Scope              | The screen reader cannot tie a data cell to its header, so values are announced with no heading
Merged cells without ColSpan or RowSpan | The file claims a regular grid, so cells line up against the wrong headers
No header row marked                    | Every cell is a data cell, so nothing is announced as a heading and the grid loses its meaning

How to fix each case

For a simple data table, mark the header row, or header column, as a header in the authoring tool. Depending on the tool, this writes the TH tags and may set Scope for you; confirm both in the tag tree.

For a complex data table, use the tool's real merge-cells feature rather than spacing or blank cells, so that ColSpan and RowSpan are written correctly. Complex multi-level tables then need the Headers and IDs associations checked by a person.

For a layout table, used only to position content, mark the table structure as an artifact so the assistive technology skips the grid, or rebuild the content without a table, for example as a figure and a heading, or as a list.

Authoritative sources


1. W3C, "Understanding Success Criterion 1.3.1: Info and Relationships," 2024. https://www.w3.org/WAI/WCAG21/Understanding/info-and-relationships.html
2. PDF Association, "The Matterhorn Protocol 1.1," 2021. https://pdfa.org/resource/the-matterhorn-protocol/
3. WebAIM, "Creating Accessible Tables," 2024. https://webaim.org/techniques/tables/
4. International Organization for Standardization, ISO 14289-1:2014 (PDF/UA-1), 2014.