The task is to add ‘Alt’ text for images in a PDF document, such that a screen reader will read out the text.
Currently, my PDF documents don’t have a structure tree defined.
Can such text be added to an image WITHOUT defining an entire structure-element hierarchy for the whole document? I want to add the Alt text with minimal changes to the PDF document. The tool I’m using to generate the PDF is not very good at generating structure elements, so I would like to avoid defining structure at all.
What I’m looking for is HTML-like behavior, where the Alt text is added locally to an image tag, without requiring changes elsewhere.
The PDF 1.6 spec states that, beginning with PDF 1.5, Alt text can be added for:
(PDF 1.5) A marked-content sequence (see Section 10.5, “Marked
Content”), through an Alt entry in a property list attached to the
marked-content sequence with a Span tag.
Can such a Span be added WITHOUT also adding any structure elements?
My tests indicate “no”, but my tests may not be robust. The tests generate this:
ET
/Span <</Alt(This is alternate text.)>> BDC
q 180 0 0 15.84 36 747 cm /img0 Do Q
EMC
BT
in a PDF 1.4 doc. The doc has no structure tree defined:
16 0 obj<</Type/Catalog/Pages 14 0 R>>
Then I edit the first line by hand to change the PDF version from 1.4 to 1.5. The end result is that Adobe Reader 10 does not read the Alt text.
To get working Alt text, you need to define a structure tree; the PDF specification requires it. You can define the Alt text the way you wrote above, without the structure tree, but that is nonstandard and may or may not work.
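For reference, here is a minimal sketch of what the smallest such structure tree might look like for the single image in the question. It is an illustration, not a tested recipe. The object numbers 17, 18 and 19 are made up, and the page object is assumed to be 10 0 obj; only the catalog (16 0 obj), the Pages reference (14 0 R) and the image operators come from the question. The cross-reference table would also have to be regenerated for the new objects, and whether Adobe Reader 10 then reads the text aloud has not been verified here.

```
% Catalog: the original object plus MarkInfo and StructTreeRoot
16 0 obj
<< /Type /Catalog /Pages 14 0 R
   /MarkInfo << /Marked true >>
   /StructTreeRoot 17 0 R >>
endobj

% Structure tree root with a single child element (hypothetical object number)
17 0 obj
<< /Type /StructTreeRoot /K 18 0 R /ParentTree 19 0 R >>
endobj

% Figure element carrying the Alt text; /K 0 refers to MCID 0 on the page
% given by /Pg (assumed here to be 10 0 obj)
18 0 obj
<< /Type /StructElem /S /Figure /P 17 0 R /Pg 10 0 R /K 0
   /Alt (This is alternate text.) >>
endobj

% Parent tree: key 0 matches the page's /StructParents value;
% the array is indexed by MCID
19 0 obj
<< /Nums [ 0 [ 18 0 R ] ] >>
endobj

% In the page dictionary (assumed to be 10 0 obj), add:  /StructParents 0

% In the page content stream, tag the image with an MCID:
/Figure << /MCID 0 >> BDC
q 180 0 0 15.84 36 747 cm /img0 Do Q
EMC
```

In this sketch the Alt text sits on the Figure structure element rather than in the marked-content property list, and the MCID ties the image's marked-content sequence to that element. That amounts to three small extra objects plus one entry each in the catalog and the page dictionary, which is about as close to a "local" change as a structure tree allows.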