Folger Digital Texts uses eXtensible Markup Language (XML) to encode our master files. XML is a semantic encoding language that allows encoders to include information "behind the scenes" of the visible text, which can then be used for special searching and analysis, visualizations, and other applications. The special information in our encoded texts includes details about which characters are entering or exiting a scene, which character is delivering a speech, and even when each character dies.
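As a rough illustration of the kind of analysis this encoding makes possible, the following Python sketch counts speeches per character. It assumes each speech is an sp element in the standard TEI namespace whose who attribute holds one or more "#"-prefixed character references. The sample markup and character identifiers are invented for the example and are not copied from a Folger file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard TEI namespace; the prefix "tei" is only a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample for illustration; real files are far richer.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <sp who="#Barnardo"><speaker>BARNARDO</speaker><ab>Who's there?</ab></sp>
    <sp who="#Francisco"><speaker>FRANCISCO</speaker><ab>Nay, answer me.</ab></sp>
    <sp who="#Barnardo"><speaker>BARNARDO</speaker><ab>Long live the King!</ab></sp>
  </body></text>
</TEI>"""


def count_speeches(xml_text):
    """Return a Counter mapping character reference -> number of speeches."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for sp in root.iterfind(".//tei:sp", TEI_NS):
        # A who attribute may name several characters, separated by whitespace.
        for ref in sp.get("who", "").split():
            counts[ref.lstrip("#")] += 1
    return counts


speech_counts = count_speeches(SAMPLE)
print(speech_counts.most_common())
```

To try this on a downloaded play, read the file's contents and pass them to `count_speeches`; the identifiers you get back depend on how that file names its characters.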
Folger Digital Texts follows the guidelines of the Text Encoding Initiative (TEI), which have become the standard for encoding literary texts. If you're new to XML or the TEI guidelines and want to learn more, two helpful online resources to get you started are W3Schools' XML Tutorial and TEI By Example.
Using the encoded texts as a starting point is a significant time-saver in creating mobile apps and other digital projects, or conducting research. We are delighted to share our encoded texts at no cost for noncommercial use (please review our Terms of Use for more details). To get started, select a play from the Download table below.
For more specific information about how Folger Digital Texts uses TEI tags in the corpus, please refer to the Tag Guide.
Please select your content from the list. Your browser will download a compressed folder that contains the XML file(s) and supporting images and processing files needed to run your selected title(s) in a browser, edit using a text editor, or import into a pre-existing XML-based project. Remember that in order for the file to render properly in a browser, all of the files that are included in the download must be kept in the same place (i.e., in the same folder) on your system.
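If you prefer to unpack the download with a script, a minimal Python sketch such as the one below extracts every file into a single folder so that the XML and its supporting files stay together. The archive name and member names here are hypothetical stand-ins; the actual contents of a download may differ.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_download(zip_path, dest_dir):
    """Extract all archive members into dest_dir and list the extracted files."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall keeps the archive's internal layout, so files that
        # belong together remain in the same folder.
        archive.extractall(dest)
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )


# Demonstration with a stand-in archive built on the fly (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as z:
        z.writestr("Ham.xml", "<TEI/>")
        z.writestr("style.css", "/* placeholder */")
    extracted = extract_download(demo_zip, Path(tmp) / "Ham")
    print(extracted)
```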
For more information about Folger Digital Texts's XML methodology, visit our Tag Guide, or download our Documentation PDF.
For your convenience, we no longer require you to register before downloading files. If you would like to join our mailing list or help us develop better offerings and services by giving us feedback on how we're doing so far, please visit our feedback page.
Title | Last Updated | Download Format |
Folger Digital Texts - Complete Set | May 31, 2016 | XML |
All's Well That Ends Well | July 31, 2015 | XML |
Antony and Cleopatra | July 31, 2015 | XML |
As You Like It | July 31, 2015 | XML |
The Comedy of Errors | July 31, 2015 | XML |
Coriolanus | July 31, 2015 | XML |
Cymbeline | July 31, 2015 | XML |
Hamlet | July 31, 2015 | XML |
Henry IV, Part 1 | July 31, 2015 | XML |
Henry IV, Part 2 | July 31, 2015 | XML |
Henry V | July 31, 2015 | XML |
Henry VI, Part 1 | July 31, 2015 | XML |
Henry VI, Part 2 | July 31, 2015 | XML |
Henry VI, Part 3 | July 31, 2015 | XML |
Henry VIII | July 31, 2015 | XML |
Julius Caesar | July 31, 2015 | XML |
King John | July 31, 2015 | XML |
King Lear | April 21, 2016 | XML |
Love's Labor's Lost | July 31, 2015 | XML |
Lucrece | July 31, 2015 | XML |
Macbeth | July 31, 2015 | XML |
Measure for Measure | July 31, 2015 | XML |
The Merchant of Venice | July 31, 2015 | XML |
The Merry Wives of Windsor | July 31, 2015 | XML |
A Midsummer Night's Dream | July 31, 2015 | XML |
Much Ado About Nothing | July 31, 2015 | XML |
Othello | May 31, 2016 | XML |
Pericles | July 31, 2015 | XML |
The Phoenix and Turtle | July 31, 2015 | XML |
Richard II | July 31, 2015 | XML |
Richard III | July 31, 2015 | XML |
Romeo and Juliet | July 31, 2015 | XML |
Shakespeare's Sonnets | July 31, 2015 | XML |
The Taming of the Shrew | July 31, 2015 | XML |
The Tempest | July 31, 2015 | XML |
Timon of Athens | July 31, 2015 | XML |
Titus Andronicus | July 31, 2015 | XML |
Troilus and Cressida | July 31, 2015 | XML |
Twelfth Night | July 31, 2015 | XML |
The Two Gentlemen of Verona | July 31, 2015 | XML |
The Two Noble Kinsmen | July 31, 2015 | XML |
Venus and Adonis | July 31, 2015 | XML |
The Winter's Tale | April 21, 2016 | XML |
The following is a list of the TEI standard elements that we use in the Folger Digital Texts XML source code, along with specifications on how each tag is used in the project. To review the TEI documentation for each element, click on any highlighted tag.
Element | Description |
<pb> | Marks the beginning of a page in the print edition. The n attribute gives the page number. The spanTo attribute gives the xml:id of a milestone element marking the end of the page. |
<milestone> | Milestones are used in several instances. When the unit attribute has the value "page", it marks the end of a page in the print edition (see also the entry for <pb>). The n attribute gives the page number. When the unit attribute has the value "ftln", it describes a line of text, and the n attribute gives the line number. The corresp attribute notes the corresponding w, c, pc, or anchor elements. The ana attribute has the value "verse", "prose", or "short". The prev and next attributes provide the means for reconstructing split verse lines. |
<fw> | Provides the act/scene header for the page, as given in the print edition. The n attribute gives the page number. The type attribute has the value "header". |
<lb> | Marks a line break in the print edition. |
<div1> | Marks an act (or induction, prologue, epilogue). The type attribute gives the division type. The n attribute gives the canonical act number, where appropriate. |
<div2> | Marks a scene (or prologue, epilogue, chorus). The type attribute gives the division type. The n attribute gives the canonical scene number, where appropriate. |
<head> | Provides the act/scene header, as given in the print edition. |
<stage> | Marks stage directions. The n attribute gives the stage direction line number. The type attribute identifies the type of stage direction. |
<sound> | Marks musical and other sound cues. The type attribute categorizes the type of cue. |
<sp> | Marks a speech within a text. The who attribute identifies the characters associated with that speech. |
<speaker> | Provides the speech prefix, as given in the print edition. |
<ab> | Within sp tags, contains the text of the speech. |
<w> | Marks a word in a speech, stage direction, speech prefix, or header. The n attribute gives the line number, where appropriate. |
<c> | Marks a space character in a speech, stage direction, speech prefix, or header. The n attribute gives the line number, where appropriate. |
<pc> | Marks a punctuation character in a speech, stage direction, speech prefix, or header. The n attribute gives the line number, where appropriate. |
<gap> | Marks editorial placeholders where words are missing or unclear in the primary text. |
<join> | No longer used. In previous beta versions, was used to join w, c, pc, and anchor elements into a typographic line. The n attribute gave the line number. The type attribute had the value "verse", "prose", or "short". The prev and next attributes provided the means for reconstructing split verse lines. |
<ptr> | Creates a pointer for one or more w, c, pc, and anchor elements, used to link them to analytical interpretations such as textual notes or stanza identification. |
<seg> | Often contains a song, poem, or letter, identified by its type attribute. Can also identify a word segment that may be quoted or emended. |
<label> | Marks the header to a song or dumbshow. |
<floatingText> | No longer used. In previous beta versions, used sparingly to contain a song or poem that seems distinct from the surrounding text and may not be attributed to a specific speaker. |
<q> | Contains quoted sections of text. |
<foreign> | Marks non-English words. The xml:lang attribute identifies the foreign language, where appropriate. |
<name> | Marks a proper name that may be quoted or italicized in the text. |
<title> | Marks a title that may be quoted or italicized in the text. |
<hi> | Marks sections of text that are otherwise highlighted (generally italicized). |
<anchor> | Marks areas where content in a prior source text is not present in the current reading. |
<app> | Critical apparatus containing variant readings from prior publications. |
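To show how the milestone, w, c, and pc elements described above could work together, here is a hedged Python sketch that rebuilds the text of each line from a milestone's corresp attribute. It assumes that corresp holds whitespace-separated "#"-prefixed references to the xml:id values of the line's elements. The sample markup, the identifiers, and the "1.1.1" line-number format are invented for the example.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample for illustration; ids and line numbers are hypothetical.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <sp who="#Barnardo">
      <speaker>BARNARDO</speaker>
      <ab>
        <milestone unit="ftln" n="1.1.1" ana="short" corresp="#w1 #c1 #w2 #p1"/>
        <w xml:id="w1" n="1.1.1">Who's</w><c xml:id="c1" n="1.1.1"> </c><w xml:id="w2" n="1.1.1">there</w><pc xml:id="p1" n="1.1.1">?</pc>
      </ab>
    </sp>
  </body></text>
</TEI>"""


def lines_from_milestones(xml_text):
    """Map line number -> (line type, reconstructed text)."""
    root = ET.fromstring(xml_text)
    # Index every element that carries an xml:id so references can be resolved.
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    lines = {}
    for ms in root.iter(TEI + "milestone"):
        if ms.get("unit") != "ftln":
            continue  # skip page milestones
        refs = [r.lstrip("#") for r in ms.get("corresp", "").split()]
        # Elements without text (such as anchors) contribute an empty string.
        text = "".join((by_id[r].text or "") for r in refs if r in by_id)
        lines[ms.get("n")] = (ms.get("ana"), text)
    return lines


lines = lines_from_milestones(SAMPLE)
print(lines)
```

Resolving references through an id index, rather than relying on document order, keeps the sketch independent of where the referenced elements sit relative to the milestone.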