The Craft of the Archive

Prototyping an Electronic Edition of William Blake’s Manuscript of Vala, or the Four Zoas: A Progress Report

Authors: Morris Eaves (University of Rochester), Eric Loy (University of Rochester), Hardeep Sidhu (University of Rochester), Laura Whitebell (University of Rochester)


This article describes the recent experiments undertaken by the William Blake Archive in preparing a digital edition of Vala, or the Four Zoas, one of Blake’s most materially challenging works. The aim is to provide users with multiple tools for analysing Blake’s messy and mysterious manuscript: a high-quality image of the manuscript page paired with a dynamic transcription that allows readers to choose the level of detail they want for understanding what it is, precisely, they are looking at. However, the existing encoding standards and display were revealed to be totally inadequate by the complexity of The Four Zoas, and early models were almost impossible to read and understand. The article traces the journey towards the latest prototype as the team try to strike a balance between detail and accessibility, description and interpretation, readability and reliability, and established conventions and new approaches.

How to Cite:

Eaves, M., Loy, E., Sidhu, H. & Whitebell, L., (2015) “Prototyping an Electronic Edition of William Blake’s Manuscript of Vala, or the Four Zoas: A Progress Report”, 19: Interdisciplinary Studies in the Long Nineteenth Century 2015(21). doi:



Published on
09 Dec 2015
Peer Reviewed

Prefatory note

The account of editorial experiment that follows rarely strays into theories. Though of course editing is unavoidably the product of theories implicit and explicit, it is, as we describe it here, a ruthlessly demanding practical craft whose products will be consumed by reader-viewers whose needs and desires we serve along with our own. It is clear from our account that we are highly conscious of producing surrogates, in new media, of William Blake’s works. If he etched, engraved, or wrote words, we reproduce them in digital images and digital ASCII code marked up, as we say, in XML. If he etched, engraved, painted, watercoloured, or otherwise produced pictures on paper, canvas, wood, or copper, we provide them to the user in digital images — technically, lossless TIFFs converted into lossy JPEGs. The Blake Archive exploits the universal machine of computing technology to translate Blake’s wide range of materials, methods, and products into images on a screen of pixels. Though the computing is complex, and Blake’s methods and materials are complex, they are complex in manifestly different ways. It may be worth pondering the fact that, despite those profound differences, at our end of the chain of transformations, we employ the same body parts and cognitive processes to consume this ‘Blake’ that Blake’s immediate audience employed. We depend upon our eyes, plus the native and learned capabilities of our brains, to access — to see, read, and comprehend — the material spectacle of Blake’s works as they have been passed down to us. We may or may not exercise the option to retranslate the indispensable optics into audio. Beyond such elementary observations, the theory must wait.

Morris Eaves

Into the void

Since its inception more than twenty years ago, the William Blake Archive has brought high-quality, electronic facsimile editions of Blake’s artistic and literary work to the Internet. The Archive’s continued significance as a scholarly resource and hive of digital editing testifies to the durability of its founding editorial rationale and archival plan.1 With any editorial project, however, and certainly with any digital archive project, originary visions are under constant pressure from new technological developments, changes in widely acknowledged best practices, and the expansion of the archive itself — among other factors. A mature project like the Blake Archive is an exemplary case study in this respect, having endured several shifts in editorial practice and constant technological evolution. For example, although always envisioned as a comprehensive Blake resource, the Archive’s early structure was built to accommodate the artistic, editorial, and technological demands of Blake’s illuminated books in illuminated printing — as he first labelled his odd new medium of (usually) watercoloured relief etching, the most widely studied and influential body of his work. But the full range of his seventy-year lifetime of literary and pictorial production is in fact highly diverse and often experimental. A framework for incorporating illuminated books into the Archive does not necessarily serve works in other media with equal scholarly efficiency or efficacy. Accordingly, when manuscripts and, later, typographic editions were incorporated into the Archive, editorial rationales and technical procedures were revisited and revised.2 All periods of growth of the Archive might be characterized by this pattern of feedback: works in progress are placed within the Archive’s existing framework, and the framework is modified to accommodate the new material.

This metanarrative of the Archive’s growth is relevant here as the latest episode in the storied life of Blake’s formidable manuscript Vala, or the Four Zoas and its numerous editions: a memorable — and meaningful — tale of extreme imaginative expression and correspondingly extreme editing. The story begins in 1796 or 1797 in the wake of a failed venture to design and engrave illustrations for a fancy multivolume edition of Edward Young’s Night Thoughts, for which Blake produced over five hundred watercolour designs. On materials left over from that project, he launched a work of staggering breadth, ultimately a monomyth in nine ‘nights’ (an organizing principle taken from Night Thoughts), that both synthesized and exceeded the episodic narratives of his earlier illuminated books. Blake continued working on the monumental project for years, drafting and revising to suit his evolving artistic and spiritual visions, sometimes in conversations with other very ambitious (and ultimately completed) projects such as the illuminated books Milton, a Poem (in fifty dense plates) and Jerusalem, the Emanation of the Giant Albion (in one hundred even denser plates). Both were begun around 1804, before he left off work on the Zoas project. Despite this apparent centrality to Blake’s mythopoetical corpus, the Zoas manuscript was never finished in any usual sense. Though its nine nights bring it to an explosive apocalyptic conclusion, its two unsettled seventh nights (so-called 7a and 7b) are only the most obvious of many loose ends.

Late in life Blake gave his Four Zoas materials to his friend and patron, artist John Linnell. After the family sold the manuscript at auction to dealer John Pearson in 1918, it was anonymously donated to the British Museum Department of Manuscripts. The 146 fragile, tentatively sequenced pages then moved with the collection of the British Library when it separated from the museum in 1997. Even before the museum acquired it, the manuscript had become an object of intense editorial scrutiny. In 1889, E. J. Ellis and W. B. Yeats attempted to sequence and transcribe it, and included passages in their published collection of Blake’s works (1893) — the first attempt to make editorial sense of the pile of papers that Blake had left to posterity. Several other editors over the next century would take up the challenge of sequencing the pages, and of untangling and deciphering layer upon layer of revision. Some editions have attempted to separate the earlier strata of narrative from later ones. The last significant print edition, co-edited in 1987 by Cettina Magno and David V. Erdman, offered no transcription, but made a concerted effort to interpret the images situated among Blake’s words on many of the pages.3

For all editors, Blake’s long manuscript presents a catalogue of profound challenges of transcription and representation. Much of the text was written on reused proof pages from the Night Thoughts project and/or revised heavily over many years of composition. As a complex physical document with a murky textual history, it is hard to read, hard to understand, and hard to edit. Such a rare combination of literary significance and editorial challenges perhaps explains the repeated attempts to assemble a coherent version for a public whose appetite for Blake — even this hardest of Blakes — increased sharply in the decades after Alexander Gilchrist’s landmark Life of Blake, ‘Pictor Ignotus’ (1863, 2nd edn 1880). In any attempt, editors must strike a balance between readability of the transcribed text and reliability of the transcription to represent accurately — in some sense that, under the circumstances, has to be precisely defined — the original document.

These editions have always been printed, physical documents themselves, either sacrificing the original manuscript’s complexity for gains in comprehensibility, or succumbing to the incomprehensibility of complex typographic schemes that remain faithful to the manuscript but alienate most readers by eliminating familiar conventions that orient and anchor reading comprehension. Even facsimile editions based on printed photographic representations of manuscript pages are often problematic: costs are high, access low, and the accompanying apparatus is either too simple or too complex. Of course, as Rachel Lee explains in her chronicle of the Archive’s earlier work with the Four Zoas manuscript,

electronic textual editing is no less prone to serious limitations, but it holds promise, particularly because web-based scholarly editing tends to be, as Morris Eaves has argued, experimental, collaborative, and provisional — precisely the working conditions that make editing Four Zoas seem even remotely possible.4

In other words, electronic editing offers not only advantages of digital publication and distribution, but also, we think, an intellectual environment that has the potential to better accommodate Blake’s messy and mysterious pages.

In terms familiar to an academic audience used to hearing that every representation is an interpretation, Justin Van Kleeck argues that every edition of The Four Zoas is an interpretation, as of course it is. With this understanding, he sorts the many editions of Blake’s text into two interpretive camps. In the first camp, there are ‘documentary’ editions that treat the manuscript as a physical object instead of a literary one. These kinds of editions focus on ‘“interpreting” what physical-textual evidence remains and the possible insights it can offer into the artifact’s growth’.5 The opposing ‘literary’ editions treat The Four Zoas more as a poem to be read than as a manuscript to be examined.6 The goal of this kind of edition is ‘recovering — or discovering — the order amidst the “chaos”’ to offer a reading version of the poem. Having surveyed the long editorial history of The Four Zoas, Van Kleeck explains that ‘we find many editors adopting seriously self-contradictory methods […] intentional and unintentional’ in trying to balance the different goals of documentary and literary editing. ‘They try to divide editing from interpretation’, he explains, ‘through commentary (in prefaces and notes) and materially/structurally (in the editions)’ (para. 8 of 83). As recognized earlier, the material constraints of print editions account for some of the difficulty in balancing two sometimes mutually exclusive editorial principles. An electronic edition, we think, will make possible a more effectively integrated presentation of the manuscript, which seems necessary in order to simulate the complex physical structure of the manuscript while making its important literary and artistic content more legible.7

Our electronic edition of The Four Zoas will be the first to support full textual markup and custom display, perhaps for good reason: the British Library maintains severe restrictions on access to the fragile manuscript. In 2005, however, the Blake Archive secured new high-resolution digital photography of the entire manuscript. In 2015 and 2016, the Archive will release a preliminary colour-corrected facsimile of The Four Zoas with essential metadata (bibliographic, editorial, etc.).8 Efforts to create a full electronic edition comparable to other scholarly editions in the Archive will continue for some time. As we will show, our aim is to provide users with multiple tools for analysing Blake’s multidimensional manuscript: high-quality images of the manuscript pages paired with dynamic transcriptions that allow readers to choose the level of detail they want for understanding precisely what they are looking at.9 We want, of course, to bring Blake’s formidable work — the physical challenges of which match its artistic ambitions — into a new digital existence capable of a previously unachievable dynamic balance between detail and accessibility, documentation and interpretation, readability and reliability, established conventions and novel approaches. The following account traces our work to date.

New texts, old methods

The textual data in the Blake Archive is encoded using XML, a markup language designed to describe data and carry information. We use a specific set of tags that has been significantly expanded and developed as our work on Blake manuscripts and typographic works has advanced.10 Our tagset is inspired by the Text Encoding Initiative (TEI), an XML schema originally designed for use in the digital humanities. XML is a hierarchical language whose organization, in very general terms, parallels the traditional organization of texts. For example, an anthology of verse is made up of poems, which may be made up of stanzas, which are made up of lines, which are made up of words, and so on. This is useful when it comes to the encoding of documents, as our documentation affirms:

XML does not have to be merely descriptive; unlike HTML [the markup language of the Web], XML allows us to identify and encode the structure of documents. A title or heading (to once again take a very simple example) can be tagged and described as such rather than being simply rendered in a large font or in boldface, etc., as HTML would encourage. (‘Technical Summary’, Blake Archive, emphasis in original.)

From the beginning we have encoded Archive editions using lines and groups of lines as the main hierarchical building blocks of a transcription.
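The nesting of lines inside line groups can be illustrated with a minimal sketch. It is not the Archive's own encoding or tooling: it assumes only the `<lg>` and `<l>` elements named in this article, uses Python's standard XML parser, and simplifies the punctuation of the transcription. The function name `lines_in_group` is invented for the example.

```python
import xml.etree.ElementTree as ET

# Illustrative markup only: <lg> and <l> are the elements named in the
# article; the transcription's punctuation is simplified for the example.
SAMPLE = """
<lg>
  <l>Tyger Tyger, burning bright,</l>
  <l>In the forests of the night;</l>
</lg>
"""


def lines_in_group(xml_text):
    """Return the text of every <l> nested directly inside the <lg> root."""
    line_group = ET.fromstring(xml_text.strip())
    return [line.text for line in line_group.findall("l")]


verse_lines = lines_in_group(SAMPLE)
for number, text in enumerate(verse_lines, start=1):
    print(number, text)
```

Because the structure is explicit, a program can ask for "every line in this group" without guessing from layout or typography, which is the advantage over purely presentational markup that the documentation quoted above describes.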

Here (Fig. 2), it is clear that line groups (<lg>) and lines (<l>) are appropriate elements to use when encoding Blake’s etched and printed verse, in this case ‘The Tyger’. The team of project assistants working at the University of Rochester realized that the transcription standards developed with illuminated books in mind could not handle the more complex structure of handwritten documents that contain a great number of authorial revisions and emendations. For simple manuscripts, new tags helped to describe acts of composition, such as ‘deletions’, ‘additions’, and ‘substitutions’ (which group ‘deletions’ and ‘additions’ together). The first attempt at encoding and transcribing the Four Zoas manuscript for the Blake Archive relied upon this tried and true set of XML tags — developed for illuminated books, but then enlarged in 2010 to accommodate manuscripts — and a colour-coded transcription display (Fig. 3) that would allow users to understand the different acts of writing that had been described in the XML markup.

Fig. 1 

An opening page from Blake’s unfinished manuscript Vala, or the Four Zoas. All examples from our subsequent discussion are drawn from this page.

Fig. 2 

Manuscript page juxtaposed with Blake Archive XML encoding for ‘The Tyger’ from Songs of Innocence and of Experience.

Fig. 3 

Browser windows open to Blake Archive manuscript image enlargement, editor’s transcription, and key to colour-coded display, from An Island in the Moon, object 1.

Relying heavily on Van Kleeck’s unpublished, experimental electronic edition of the manuscript, Rachel Lee re-encoded and transcribed the first few objects of The Four Zoas using this 2010 schema; it quickly became clear that ‘the protocols we had developed for encoding and transcribing manuscripts were woefully inadequate’. The following example shows how the transcription turned out, and it is easy to see how it ‘violates the basic purpose of transcriptions in the Blake Archive, which is to remediate the Blakean object for readers in useful ways’ (Lee, ‘Editing in Technicolor’). In other words, if one of the purposes of transcription is to represent Blake’s hand and process of composition in a readable format, this display fails (Fig. 4).

Fig. 4 

Early efforts with The Four Zoas using old transcription colour-coded display generated from old encoding schema.

Due to the hierarchical nature of our encoding schema (discussed further below), the colour-coded transcription can represent multiple revisions only in a linear sequence. As Lee explains,

the problem is that forcing Blake’s revisions into a strictly linear presentation distorts the text to the point of chaos. In our current transcription display, we are unable to show text in layers occupying the same physical space in the manuscript; we are unable to represent text written over illegible erasures; we can only show these changes alongside one another. (‘Editing in Technicolor’, emphasis in original.)

All such revisions, which at the various times of composition take spatial forms in writing on a surface, must again be translated by editors into new spatial forms to serve the needs of reader-viewers. The main problem we had to tackle in our next attempt was readability, which we felt was related to the linear sequencing (in space) of revisions (that had occurred in time) that our encoding schema dictated. To put it simply, the horizontal processions of visual language were interfering with the vertical stacks of revision. The combination was, in terms of visual communication, far too noisy.
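A small sketch may make the linearity problem concrete. It assumes only the `<del>`, `<add>`, and `<subst>` elements mentioned earlier. The words are placeholders rather than Blake's text, and the bracket markers are invented stand-ins for the colours of the display.

```python
import xml.etree.ElementTree as ET

# Placeholder words, not Blake's text. The bracket markers are hypothetical
# substitutes for the colour coding of the display.
LINE = "<l>alpha <subst><del>beta</del><add>gamma</add></subst> delta</l>"

MARKERS = {"del": ("[-", "-]"), "add": ("[+", "+]")}


def linear_display(element):
    """Flatten an encoded line into a single left-to-right string."""
    opening, closing = MARKERS.get(element.tag, ("", ""))
    parts = [opening, element.text or ""]
    for child in element:
        parts.append(linear_display(child))
        parts.append(child.tail or "")  # text that follows the child element
    parts.append(closing)
    return "".join(parts)


flattened = linear_display(ET.fromstring(LINE))
print(flattened)  # alpha [-beta-][+gamma+] delta
```

The deleted and added readings share one position on the page, yet the flattened output can only set them side by side. With several nested substitutions in a single line, the sequence lengthens quickly, which is the noise described above.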

Our attention turned to experimental displays: new ways to present our XML-based understanding of the manuscript. The most promising development involved a dynamic transcription display based on a layered model rather than the unsatisfactory linear one. The new display worked by presenting, at first, a clean reading layer that would subsequently expand upon a user’s mouse click to reveal our reading of the manuscript’s textual composition. To us, this kind of dual-focused transcription display seemed a promising digital answer to the compromise between readability and reliability that had so often burdened print editions. Gabi Kirilloff, a former Blake Archive project assistant, created an early mock-up (Fig. 5) of the first few lines of the manuscript. Her mock-up, hard coded in HTML and not developed from the existing XML, served as a proof of concept.
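The idea of a clean reading layer with expandable detail can be sketched in a few lines, with the caveat that the mock-up itself was hand-built HTML and was not generated this way. The sketch assumes a deliberately crude two-layer model, placeholder words, and a hypothetical `render` helper that omits chosen elements.

```python
import xml.etree.ElementTree as ET

# Placeholder words, not Blake's text.
LINE = "<l>alpha <subst><del>beta</del><add>gamma</add></subst> delta</l>"


def render(element, skip):
    """Return the element's text, omitting any child whose tag is in `skip`."""
    parts = [element.text or ""]
    for child in element:
        if child.tag not in skip:
            parts.append(render(child, skip))
        parts.append(child.tail or "")  # text after the child is always kept
    return "".join(parts)


line = ET.fromstring(LINE)
reading_layer = render(line, skip={"del"})  # what the reader sees first
lower_layer = render(line, skip={"add"})    # revealed when the line is expanded

print(reading_layer)
print(lower_layer)
```

Both strings derive from the same encoded line, so the reader chooses the level of detail while the editorial record stays unchanged. A real display would need more than two layers and a way of handling illegible erasures, neither of which this sketch attempts.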

Fig. 5 

A line from the conceptual mock-up for the new Four Zoas display.

Compared to the confusing disorder of a transcription using the old encoding scheme, the clickable model seemed logical, useful, and promising. However, while the new display certainly made our transcription easier to read, we were concerned about a number of issues that arose in early, informal testing phases. Our test audience was positive about the dynamic interface that allowed them to choose between a clean reading layer and the more detailed representation of the revision layers, but they seemed unsure about how to decipher larger units.

First, we noticed that the interface encouraged unintended connections between separate moments of revision in different lines. For example, when testers clicked and expanded more than one line, they assumed that a revision that appeared in the top layer happened at the same time as emendations in the top layers of other lines. The interface (Fig. 6) seems to suggest that ‘The Song of the Aged Mother which’ in line 1 was inserted as part of the same revision campaign as ‘Marshalld in order for the day of Intellectual Battle’ in line 3. (As a term of convenience in our case, ‘revision campaign’ indicates the strong narrative tendencies of genetic editing.) In just this way, of course, all editorial methods encourage — require — interpretation by readers. The play of interpretation is constant on both sides — ours and theirs — of the text. But The Four Zoas is notoriously resistant to easy conclusions about the sequences of Blake’s revisions, which are stratified and interlaced in multiple orders that no one has narrated, or is ever likely to narrate, successfully in any but very limited ways — a point adamantly emphasized by Van Kleeck. We wanted to avoid adding new confusion while attempting to offer new perspectives.

Fig. 6 

Comparing levels of revision between lines, as reflected in transcription mock-up.

We wanted our transcription display to reflect only what can be seen in the manuscript, not to make implicit claims about Blake’s composition process. Our goal was to parse and display the sets of revisions that had gone into the creation of the text, line by line: a concept more spatial than temporal, more diagrammatic than narrative. While it seems clear that ‘The Song of the Aged Mother which’ was written after the deleted text that it replaces, we wanted our display to reflect only what was written on top of that erasure. It quickly became clear that the display unintentionally misled users to think that we were using our transcriptions to register implicit claims about the chronology of composition.

The second problem we discovered was that our way of displaying the sequence of revisions provided a counter-intuitive level of detail. Here, the affordances of the display require the small, word-level emendations of line 6 to be painstakingly separated from larger-scale revisions in unintentionally confusing ways. When the separate layers are revealed (Fig. 7), it is difficult to see how the revisions (in this case strikethroughs and added words and letters) are related to the layer underneath (the words or letters they are replacing). Moreover, our display suggests that Blake wrote out the entire line as it appears in the bottommost layer, and then made all the subsequent emendations simultaneously, which is the kind of intentionalist reading we try to avoid at the Blake Archive. For lines that have multiple small-scale revisions, the revealed layers become redundant at best and baffling at worst, whereas the reading layer on its own provides the user with all necessary information. In fact, the original colour-coded display that was broken by most of The Four Zoas actually does a better job of representing these smaller-scale moments of revision, even if it disrupts the organization of the words on the page.

Fig. 7 

Attempting to represent Blake’s small-scale emendations along with more extensive revisions yields a confusing display.

As a result of these findings, we started to think in the direction of a hybrid display capable of fusing the dynamic new capabilities of the layered display with the straightforward visual cues of the original colour code. Our main objective was to create a legible display that accurately represented the various layers of inscription on each object and yet avoided unnecessary (always a tricky word) editorial encroachment. However, by focusing solely on the way the transcription would ultimately look, we missed an important step in the process: without adequate editorial guidance (supplied by XML markup adequate to our display), the user is left to make assumptions. It is important to remember that, even though most users of electronic works encounter them only through the display, an electronic edition is much more. For example, the data in Archive XML files records the editorial decisions that have been made about a work, creating a machine-readable text from which a display — or, for that matter, many different displays — can be generated. In our Four Zoas experiments, we had, on one hand, an unruly transcription based on our old manuscript encoding standards, a record of editorial decisions, but not one that lent itself to a readable transcription. On the other hand, we now had an idea of how a readable and reliable transcription might look. In the loops of feedback that characterize a great deal of archival work, we had reached a new point of potential growth for the Archive. Our manuscript encoding schema, newly devised in 2010, had met its match in The Four Zoas, and a revision of underlying editorial rationales became necessary. With a new sense of direction, we went back to the drawing board, this time to begin by encoding (and thus describing) the document itself.

Setting the stage, losing the line

What we had gained in consistency and detail with our densely layered mock-up, we had lost in readability. In the end, we realized that the ordering of layers was a blunt instrument, a single method for capturing and conveying many different kinds of information about the manuscript: acts of composition, revisions, deletions, additions, substitutions, and fixations alike. This approach could not differentiate relatively minor acts of writing or revision from major ones; each authorial decision was given nearly equal weight. The effect was that small alterations to the text became prominent in ways that were unfaithful to the look of the manuscript page. It bears repeating, moreover, that our display led readers to infer connections between revisions both within and across lines that we never intended. And we still lacked adequate techniques for describing and displaying the often highly unconventional spatial layout of Blake’s pages (Fig. 1), with their many marginal (and often vertical) annotations and additions.

For inspiration we surveyed a number of digital projects that produce editions of heavily revised documents. The editions of Walt Whitman’s documents in the Whitman Archive provided us with examples of how to display heavily revised manuscripts on the scale of some page-objects of The Four Zoas, and the project’s editorial policy ‘to establish documentary texts rather than to reconstruct what are assumed to be authorially intended texts’ resonated strongly with our own.11 However, as we had learned through early attempts to encode and display the work using the Blake Archive colour code, a single display option of this kind would render our transcription of The Four Zoas unreadable. The Melville Electronic Library and the fluid-text edition of Typee, on the other hand, are both committed to showcasing the different iterations of a text that exists, for most readers, as a single, definitive version.12 By identifying sites of revision (that can be anything from a mark of punctuation, to words and phrases, to a sentence or paragraph), users can compare and track changes throughout a number of different manuscript and print witnesses. We admired the way that these projects offered multiple display options, including the ability to toggle between a readable base transcription and a representation of the various acts of composition and revision.

The goals of these projects differ from ours, however, by making overt connections between revisions in order to support genetic textual analyses.13 The project that helped us to imagine a useful compromise between reliability and readability was the Shelley–Godwin Archive.14 Using a draft TEI module designed specifically for genetic editions without committing their project to the traditional goal of genetic editing — to reconstruct as fully as possible the history of composition, in effect to tell the story of the creation of the work — editors can ‘represent the evolution of a literary work by examining the writing process as it is physically manifested on the page’.15 This ethos matches our own attempts to find a compromise between transcribing (and encoding) what we see, and representing the layers of revision in a user-friendly, dynamic display. Moreover, the commitment to the source document itself complements one of the core principles of the Blake Archive: ‘we emphasize the physical object — the plate, page, or canvas — over the logical textual unit — the poem or other work abstracted from its physical medium’.16 The draft TEI module even provides us with a framework for talking about ‘differing levels of interpretation’:

At the same time, there is an obvious difference between the interpretation that some trace of ink is indeed a specific letter and the assumption that a change in one line of a manuscript must have been made at the same time as a change in another line because their effects are textually related.17

This passage exactly describes the tension that we had been trying to manage in our previous attempts at transcription. While of course we act as interpreters when we separate the layers of revision simply to make the manuscript legible, we do not aspire to a full ‘revision narrative’ of the text.18 In other words, the principles and practices of a genetic approach would help us to describe and encode the complex structure of the manuscript where many layers of text often occupy the same physical space. At the same time, we would combine this approach with our own commitment to ‘transcribe what you see’ — a Blake Archive mantra — stopping short of the kinds of interpretations that genetic editors often supply through their narratives. And we can see at this juncture how, as we might put it, the work is reading the medium and, with equal plausibility, that the medium is reading the work. At best, the feedback loop between the two is tight and intense, though the gap is always large enough that the work being edited and the edition are easily distinguished.

Now that we had better tools in the TEI genetic editing module and a helpful example in the approach of the Shelley–Godwin Archive, our task was to decide how to organize the transcription. The TEI module provides a tagset based on the concept of ‘revision campaigns’ or ‘stages’, which allows the documentation of ‘a particular stage in the genesis of a text’.19 Using this logic we would be able to assign each separate act of composition to a ‘stage’, much as we had done visually in our layered display. Furthermore, the genetic editing module includes the option of not specifying an order when defining the ‘stages’, which, in terms of the XML markup, would eliminate the necessity of creating the revision narrative that we were trying to avoid. Other tags, such as <zone> or <metamark>, would also allow us to move beyond the line as a unit of physical description of the document. Many pages of the Four Zoas manuscript are made up of a central section of text with additional text written in the surrounding margins. The element <zone>, which describes a rectangular area on the object, would allow us to divide the object into different areas just as we do when preparing an illustration description that locates picture elements on a simple four-part grid used by Image Search in the Blake Archive. The reminder that <line> is a physical descriptor as well as a semantic one refocused our attention on the physical attributes of the manuscript that we could see. We could incorporate a new hierarchy, including <zone>, <stage>, and perhaps <metamark> into our long-established system of lines and line groups.
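How zones and unordered stages might sit alongside lines can be sketched as follows. Everything beyond the element names `<zone>`, `<stage>`, and `<l>` is an assumption made for illustration: the `<object>` wrapper, the `type` and `id` attributes, the nesting of `<stage>` inside `<l>`, and the placeholder text do not reproduce the Archive's schema or the TEI module.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the wrapper element, attribute names, and nesting
# are invented for this sketch, and the text is placeholder.
PAGE = """
<object>
  <zone type="body">
    <l><stage id="s1">placeholder line in the central text</stage></l>
  </zone>
  <zone type="left-margin">
    <l><stage id="s2">placeholder line written in the margin</stage></l>
  </zone>
</object>
"""


def text_by_zone(page):
    """Group the text of each line under the zone that contains it."""
    zones = {}
    for zone in page.findall("zone"):
        zones[zone.get("type")] = [
            "".join(line.itertext()) for line in zone.findall("l")
        ]
    return zones


def stage_ids(page):
    """Collect stage identifiers as a set: membership only, no sequence."""
    return frozenset(stage.get("id") for stage in page.iter("stage"))


page = ET.fromstring(PAGE.strip())
zones = text_by_zone(page)
stages = stage_ids(page)
print(zones)
print(sorted(stages))  # sorted for printing only; no chronology is implied
```

Holding the stages in a set rather than a list mirrors the option of leaving their order unspecified: the encoding records that two distinct acts of writing exist without asserting which came first.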

In other words, the decisively limiting factor of the old manuscript encoding schema was the line itself. Though adequate to describe a unit containing, say, the etched and printed verse of ‘The Tyger’, a basic <line> unit was inadequate to the unfinished, heavily marked, revised, and annotated pages of the Four Zoas manuscript. (In Archive terms, a line is less a literary unit — the line of a sonnet — than a visual unit with a beginning and an ending on the same horizontal plane; a plate number, ‘37’, or a single catchword is as much a line as a conventional line of verse. Further, a line is a physical inscription or mark created with tools and materials.) To retain the original code, we would have had to lean heavily on explanatory editors’ notes whenever our approach failed to capture vital information about the manuscript page. Such elaborate editorial explanation would have put the Blake Archive edition squarely in the company of ‘self-contradictory’ print editions that struggle to find a coherent compromise among editorial approaches (Van Kleeck, para. 8). Not that we were not struggling. But using the genetic editing module for inspiration, we could supplement our existing conception of the line to allow for greater levels of complexity. A new method of describing the manuscript was a first step towards more accurate and useful yet still economical representation.

We drafted, tested, and revamped a set of definitions for three rudimentary terms (‘stage’, ‘emendation’, and ‘zone’) that convey more specific information about two fundamental dimensions of the manuscript, temporal and spatial. Using physical-textual evidence, not literary interpretation, we set out to answer two questions about each moment of writing on the manuscript page. First, when did any given moment of the writing or revision take place relative to other text on the page? And second, where is this text relative to other text on the page? We now encoded our answers (and non-answers) to these questions using the following definitions:

<Stage>: A moment of composition or alteration of the text (deletion, addition, substitution, or fixation) at the level of phrase, line, line group, or zone.

‘Emendation’: An alteration of the text (deletion, addition, substitution, or fixation) at the level of letter/numeral or word. ‘Emendation’ is not itself an element; it is an umbrella term that covers the existing Blake Archive elements for deletions, additions, and substitutions (<del>, <add>, and <subst>).

<Zone>: A spatial description of the location of a letter/numeral, word, line, or line group on the page.

These terms, or, if you will, conceptual metaphors, are neither quite firmly temporal nor spatial, but some of each, due, no doubt, to the intersection of the history of the language with efforts to understand time in spatial terms and space in temporal terms. Adopting them has implications for the space and time of writing (and, for that matter, picture-making). But in practical terms — editorial terms, technological terms — they allow a more nuanced description of the manuscript. With them, we could now more precisely capture the relative location and extent of any given moment of writing or revision. These categorizations, we want to stress, do not interpret the poem’s content, formal qualities, or narrative structure. Rather, our definitions are based in physical-textual evidence of writing or revision: legible or illegible writing, washes, strikethroughs, overwriting, insertions indicated by carets or arrows, changes in medium, and so on. The border between interpretation and description is never sealed, but we police it as rigorously as we can to avoid at least some of the confusions that mar the history of attempts to edit The Four Zoas.

Let’s consider a specific example. Fig. 8 shows an excerpt from the manuscript, along with our initial XML transcription encoded with the existing Blake Archive tagset:

Fig. 8 

A line from the manuscript juxtaposed with its old XML encoding.

The revisions here are necessarily nested, because Blake modified the same passage multiple times. The XML does capture the sequence, though it is hard to parse. Fig. 9 shows the same line marked up with our new tags:

Fig. 9 

XML encoding of the same line as Fig. 8, with new schema.

Here the inclusion of <stage> works to describe larger changes to the manuscript, while we handle emendations within any given stage with conventional Archive tags such as <del>, <add>, and <subst>. The dividing line between ‘emendation’ (changes to words or smaller units) and ‘stage’ (changes to phrases or larger units) can seem arbitrary. But making the distinction is pragmatic: it produces a clearer description of a complicated revision sequence; it allows us to avoid implicit connections between different acts of revision in close proximity; and it helps mimic the look of the manuscript page in the transcription.
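The division of labour between ‘stage’ and ‘emendation’ can be illustrated with a minimal Python sketch. It does not reproduce the encoding shown in Fig. 9: the sample words, the `label` attribute on `<stage>`, and the helper name `describe_line` are assumptions introduced purely for illustration. Only the element names `<l>`, `<stage>`, `<del>`, `<add>`, and `<subst>` come from the discussion above.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the words and the `label` attribute are invented for
# this sketch and are not taken from the Archive's schema or from Fig. 9.
SAMPLE = """
<l n="1">
  <stage label="base">the <subst><del>old</del><add>new</add></subst> phrase</stage>
  <stage label="later"><add>a marginal afterthought</add></stage>
</l>
"""

# Word-level changes are 'emendations'; larger changes are wrapped in <stage>.
EMENDATIONS = {"del", "add", "subst"}


def describe_line(xml_text):
    """Return one record per <stage>, listing the emendations found inside it."""
    line = ET.fromstring(xml_text)
    records = []
    for stage in line.findall("stage"):
        records.append({
            "label": stage.get("label"),
            "emendations": [el.tag for el in stage.iter() if el.tag in EMENDATIONS],
            "deleted": [el.text for el in stage.iter("del")],
            "added": [el.text for el in stage.iter("add")],
        })
    return records


for record in describe_line(SAMPLE):
    print(record)
```

Each record keeps the emendations of one stage apart from those of any other, which is the sense in which the markup avoids implying connections between neighbouring acts of revision.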

With <stage> we are able to describe when any given moment of writing took place relative to earlier moments of writing on the page. All ‘stages’ proceed from the page’s base layer: the earliest visible moment of writing, whether still legible or not. Subsequent ‘stages’ refer to additional layers of composition, usually in the form of revisionary marks such as deletions, substitutions, additions, and so on. Most importantly, <stage> does not necessarily describe the relationship of one moment of writing to text elsewhere on the page, even if that text is tagged with the same <stage> label. This qualification approximates the role of an editorial explanatory note. While an excess of such notes plagued many previous editions of The Four Zoas, we believe that the benefits of our solitary stipulation outweigh any potential costs. In any event, we shall address this issue with improved display options that avoid implicit connections between distinct moments of writing of the sort that our previous display prototypes sometimes suggested.

The other new addition is the spatial designation: all the text in this line is contained in the ‘body’ <zone> of the page. In our old markup, text in the left margin, body, and right margin had to be included in a single <line>. We define <zone>, like <stage>, as a relative description: it records the spatial arrangement of any writing in relation to the main text on the page. Each page can be broken down into a maximum of five zones: body, head, foot, left, and right. There are no set physical dimensions for any given zone; each object’s unique layout determines the kinds and sizes of the marginal zones. Just as <stage> and ‘emendation’ provide more tools to describe temporal sequence, the addition of <zone> to the units of line (<l>) and line group (<lg>) allows us to describe (and ultimately represent) the manuscript more accurately, and users will in turn be able to target their searches with more precision.
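A short Python sketch may help to show how zone-level markup could support more targeted searching. The five zone names come from the description above; the `type` attribute, the `<page>` wrapper, the sample text, and the function names are assumptions made for this sketch rather than features of the Archive’s schema.

```python
import xml.etree.ElementTree as ET

# The five zone names are given in the text; everything else is hypothetical.
ALLOWED_ZONES = ("head", "left", "body", "right", "foot")

PAGE = """
<page>
  <zone type="body"><l>central text, first line</l><l>central text, second line</l></zone>
  <zone type="left"><l>a note in the left margin</l></zone>
  <zone type="foot"><l>a catchword</l></zone>
</page>
"""


def lines_by_zone(xml_text):
    """Group line texts under their zone name, rejecting unknown zone names."""
    page = ET.fromstring(xml_text)
    zones = {}
    for zone in page.findall("zone"):
        name = zone.get("type")
        if name not in ALLOWED_ZONES:
            raise ValueError(f"unknown zone: {name!r}")
        texts = ["".join(line.itertext()) for line in zone.findall("l")]
        zones.setdefault(name, []).extend(texts)
    return zones


def search(zones, term, only=None):
    """Return (zone, text) hits for a term, optionally restricted to one zone."""
    hits = []
    for name, texts in zones.items():
        if only is not None and name != only:
            continue
        for text in texts:
            if term.lower() in text.lower():
                hits.append((name, text))
    return hits


zones = lines_by_zone(PAGE)
print(search(zones, "central"))               # every hit on the page
print(search(zones, "note", only="left"))     # hits in the left margin only
```

Because no zone has fixed dimensions, the sketch records only a name for each zone and makes no attempt to model coordinates.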

Transcribing such a difficult manuscript — especially one by a writer who is also an artist — often brings us up against a maddening question: ‘Is it writing or drawing?’. In a sense, the changes to our encoding strategy, and the evolution of our treatment of manuscripts, return to the Blake Archive’s image-based roots. Because of the profound complexities presented by The Four Zoas, we have found it helpful to borrow from our principles of systematic illustration description that underlie the Archive’s unique image-search tools.20 By using the spatial structures of pictorial composition in our descriptions of the manuscript’s textual information, we regard inscriptions on the page as both writing and drawing. In general, our new XML tags will ultimately make it easier for our edition to remain faithful to the look of the manuscript page, because they make it possible to describe and represent more significant acts of composition or revision by incorporating rudimentary terms of visual analysis into our repertory of markup and display. If our XML captures both spatial and temporal data more precisely, a richer display can present that data more fully and accurately.

Visualizing Blake: transforming The Four Zoas

Of course, the primary application of a richly encoded XML document is some sort of electronic display. In the Blake Archive, electronic facsimiles (digital scholarly editions) of exceptional quality serve as the primary objects, but the XML files we create for each Archive edition serve a variety of purposes in support of these images. Some of these purposes a user does not actually see: searchability, some metadata, site architecture, etc. But seen or unseen, most contribute to the support of important features, such as textual transcriptions.21 For a manuscript as notoriously hard as The Four Zoas, a clear, dynamic, and robust transcription display becomes essential in providing a significant measure of legibility and documenting its tangled compositional features.

Although previous encoding attempts were genuinely helpful in understanding the manuscript and in shaping our group’s most recent approach, the electronic displays generated from these earlier XML iterations offered only glimmers of what digital technology might provide. At worst, these XML files, which were encoded with the Archive’s traditional tagset and transformed into a display accordingly, spat out an incomprehensible mess. Our new XML schema was better able to conceptualize and describe the structure of the Four Zoas manuscript, but to take full advantage of it, we would have to develop a new transcription display capable of visualizing our compositional description of ‘stages’ and ‘zones’ and of helping users read the text more easily.

So, beginning once again, we created a new XML file for the opening page of the poem (see Fig. 1). No simple task, this drafting process took several months to complete, but through the experience we proved, at least to our own satisfaction, the effectiveness and flexibility of our encoding schema. We felt confident that we had developed a clearer and more consistent approach to documentation, and the proof of progress beyond earlier efforts was in our cleaner XML.

Next, with this draft of our markup, we approached Joshua Romphf, Humanities Programmer in the Digital Humanities Center of River Campus Libraries at the University of Rochester. After several meetings and numerous email exchanges, we were able to clarify a wish list of display features that would, first, generate a readable transcription from our new XML markup and, second, equip users with a variety of tools that would enable an improved sense of access to, and play between, facsimile and transcription. In general, the new display would rely on a level of interactivity and manipulation previously unsupported in the Archive. Users would be able to toggle between different editorial perspectives in the transcription, expand and collapse compositional layers embedded in the manuscript, and isolate textual content in respective ‘zones’. These new features would join standard features of Blake Archive transcription such as line numbers, annotations, and colour-coded symbols for emendations by Blake’s hand to create a sturdy but fluid electronic edition.

Our collaborative XML design and Romphf’s Web display programming result in a split-screen prototype (Fig. 10) with transcription text on the left, an image viewer for the manuscript facsimile on the right, and a floating toolbar. The image viewer is based on an open-source JQuery plug-in that Romphf customized for the prototype.22 Romphf coded the toolbar, buttons, and various scripts for the Web prototype from scratch. The split-screen layout is a departure from the Blake Archive’s current design, which prioritizes the facsimile images, bringing up editorial transcriptions and notes in a separate window after a user selects options from a drop-down menu.23 Because of the unique difficulty of the Four Zoas manuscript and the intended interplay between the new XML transformation and our editorial conceptualization of the manuscript’s structure through <stage>, <zone>, etc., we feel that a tighter visual relationship between transcription and facsimile is necessary, since each is used to make sense of the other. The split screen also eliminates the bother of fiddling with multiple browser windows.

Fig. 10 

Basic browser layout of most recent display for the Four Zoas manuscript.

Now, with text and image juxtaposed, the default transformation for our edition displays a diplomatic transcription (Fig. 11) featuring a ‘reading layer’ using the Archive’s traditional typography for representing text and revisions such as additions, substitutions, deletions, etc. As with previous Archive transcriptions, our work here balances ‘transcribe what you see’ with a degree of editorial restraint, staying as close to the decipherable text of the manuscript as possible. In any event, the purpose of the transcription is to deliver the content of the manuscript, extracting and simulating linguistic inscriptions while improving legibility as an aid to the user. Representing such an editorial approach in an edition of The Four Zoas offers a crucial resource for literary scholars curious about Blake’s poem as he left it.

Fig. 11 

Manuscript display prototype with ‘diplomatic’ transcription option, which provides a surface-level reading text.

While our display defaults to a familiar diplomatic transcription, the click of a button on the toolbar can switch the text to our experimental genetic transcription. As a visualization of the XML-encoded ‘stages’, the genetic option features a multilayered transcription that illustrates — and, in a significant sense, animates as it sequences — the compositional structure of the manuscript (Fig. 12). As in previous Blake Archive experiments with electronic transcriptions of this manuscript, we have adopted an interactive expand/collapse feature that allows users, first, to view (in an admittedly limited way) the layered, sometimes impossibly messy, compositional structure of the Four Zoas manuscript and, second, to unpack these layers in order to see moments of composition in isolation. Single lines expand upon a mouse click, and users may expand or contract as many lines as they like.24

Fig. 12 

Manuscript display prototype with ‘genetic’ transcription option, which provides a dynamic illustration of the manuscript’s layered composition.

However, there will be no option to ‘expand all lines’; this omission reinforces the ‘stage’ definition that restricts our layered encoding to single lines. In other words, because our conception of stages cannot be applied across an entire digital object, or work, the display similarly restricts interaction with the visual simulation. This technical decision is our attempted editorial response to an abiding, fundamental editorial question: where to stop. Finally, the simulation of the manuscript’s layers, and the illustration of our editorial markup of that structure, are conceptually linked through animated movement between display options (Fig. 13).

Fig. 13

Animation of an expandable line in the ‘genetic’ transcription.
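The behaviour of the two transcription options can be modelled in a few lines of Python. This is a sketch of the interaction logic only, not the prototype’s actual code: the class names, field names, and output format are all hypothetical. It assumes that each line carries a surface reading text plus a list of layers, one per encoded ‘stage’, with the base layer first.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    number: int
    reading: str                                 # surface-level reading text
    layers: list = field(default_factory=list)   # one string per stage, base layer first


@dataclass
class TranscriptionView:
    lines: list
    mode: str = "diplomatic"                     # the default display
    expanded: set = field(default_factory=set)   # numbers of currently expanded lines

    def toggle_mode(self):
        """Switch between the diplomatic and genetic transcriptions."""
        self.mode = "genetic" if self.mode == "diplomatic" else "diplomatic"

    def toggle_line(self, number):
        """Expand or collapse one line. There is deliberately no expand-all."""
        self.expanded ^= {number}

    def render(self):
        """Return the display as a list of text rows."""
        rows = []
        for line in self.lines:
            rows.append(f"{line.number:>3}  {line.reading}")
            if self.mode == "genetic" and line.number in self.expanded:
                for depth, layer in enumerate(line.layers):
                    rows.append(f"       [{depth}] {layer}")
        return rows


view = TranscriptionView([
    Line(1, "the new phrase", ["the old phrase", "the new phrase"]),
    Line(2, "an unrevised line", ["an unrevised line"]),
])
view.toggle_mode()      # switch to the genetic transcription
view.toggle_line(1)     # unpack the layers of line 1 only
print("\n".join(view.render()))
```

Layers appear only in the genetic mode and only for lines that a user has expanded individually, which mirrors the restriction of ‘stage’ to single lines.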

As in more traditional genetic editing projects, our transcription focuses on physical acts of inscription — how a manuscript page has been marked and manipulated. Such an editorial approach offers an invaluable resource to scholars interested in Blake’s compositional practices. But our scepticism about the knowability of his broader sequences of revision — after all, this is a manuscript that Blake laboured over for the better part of a decade — restrains our impulse to supply the user with more and more editorial apparatus.

In terms of underlying technology, both transcriptions rely on XSLT — Extensible Stylesheet Language Transformations, more simply XSL Transformations — which is a language that translates XML markup into Web display, turning code into image. Specifically, XSLT gives our XML elements ‘classes’ that can be operated on with CSS (Cascading Style Sheets) and JavaScript, basic styling and programming languages of the Web. While it may seem to add an unnecessary extra step, this separation of descriptive markup (what is it?) and display (how do we want it to look?) is one of the primary, stabilizing benefits of XML for archival editing; display transformations can be updated or redesigned while the information encoded in the XML files remains available to, but isolated from, another layer of code that creates changes in visualization. The Blake Archive currently takes advantage of this remarkably stable system for maintaining and displaying electronic editions, as many other electronic archive projects do. (The Archive and its contemporaries, such as the Whitman and Rossetti archives, originally employed SGML — Standard Generalized Markup Language, the predecessor of XML.) Of course, the transformation language will have to be updated to accommodate our new XML tags.
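The separation of description from display can be sketched without an XSLT processor. The Python below is a stand-in for the transformation step, not XSLT itself and not the Archive’s stylesheet: it uses plain recursion over the element tree, since Python’s standard library includes no XSLT engine. The class names (`tr-line`, `tr-del`, and so on) and the sample markup are invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from descriptive tags to CSS class names.
CLASS_FOR_TAG = {
    "l": "tr-line",
    "stage": "tr-stage",
    "subst": "tr-subst",
    "del": "tr-del",
    "add": "tr-add",
}


def to_html(element):
    """Recursively turn an XML element into a <span> carrying a CSS class."""
    css_class = CLASS_FOR_TAG.get(element.tag, "tr-unknown")
    parts = [f'<span class="{css_class}">', escape(element.text or "")]
    for child in element:
        parts.append(to_html(child))
        parts.append(escape(child.tail or ""))   # text following the child element
    parts.append("</span>")
    return "".join(parts)


markup = "<l><stage>the <subst><del>old</del><add>new</add></subst> phrase</stage></l>"
print(to_html(ET.fromstring(markup)))
```

A stylesheet could then, for example, strike through everything with the class `tr-del`, and that rule could later be changed without touching the XML that records the deletion.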

Making use of our breakdown of surface areas with the XML element <zone>, the toolbar also allows the display of these areas to be toggled on and off (Fig. 14). Along with traditional editorial aids such as line numbers, ‘zone’ can assist scholarly citation, and toggling can help make the manuscript more readable. Finally, font manipulation such as transparency levels and internal spacing contributes to the transcription’s basic mimicry of manuscript characteristics.

Fig. 14

Animation of selection and isolation of transcription ‘zones’, or physical areas of the manuscript page.
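The toggling and isolation of zones amounts to keeping track of a small set of visible areas. The following Python sketch models that state; the class name `ZoneToolbar`, its methods, and the sample transcription are hypothetical and say nothing about how the prototype’s JavaScript is actually organized.

```python
ZONES = ("head", "left", "body", "right", "foot")


class ZoneToolbar:
    """Tracks which zones are currently shown; all are visible at the start."""

    def __init__(self):
        self.visible = set(ZONES)

    def _check(self, zone):
        if zone not in ZONES:
            raise ValueError(f"unknown zone: {zone!r}")

    def toggle(self, zone):
        """Switch a single zone on or off."""
        self._check(zone)
        self.visible ^= {zone}

    def isolate(self, zone):
        """Show one zone and hide all the others."""
        self._check(zone)
        self.visible = {zone}

    def filter(self, transcription):
        """Keep only the (zone, text) pairs whose zone is switched on."""
        return [(zone, text) for zone, text in transcription if zone in self.visible]


transcription = [
    ("body", "central text"),
    ("left", "a marginal note"),
    ("foot", "a catchword"),
]
toolbar = ZoneToolbar()
toolbar.toggle("left")                 # hide the left margin
print(toolbar.filter(transcription))
toolbar.isolate("foot")                # show the foot of the page only
print(toolbar.filter(transcription))
```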

To beta and beyond

The latest step in the long history of coping with the Four Zoas manuscript has produced the (tentative and temporary) proof of concept that we have described. It is only the most recent stage in the electronic prototyping of Blake’s manuscript, and certainly not the last. First, the display is immediately beneficial as a source of feedback on our system of encoding the manuscript, and can perhaps be seen as a microcosm of the larger archival process. In addition to creating a visualization of our ‘stage’ layering, a functional prototype display allows us to see and test revisions to our XML. Our encoded description of the manuscript’s properties is acted upon by the XSLT, and if something fails, misleads, or just looks odd, we can return to the original encoding to correct mistakes or revise editorial decisions. Second, and conversely, building more robust XML necessitates modifications to the functions of the display. In this sense, it is difficult to predict at the outset how the edition will ultimately look and how the transformation display will function. What can seem like unlimited options do not so much alleviate difficulty as raise the stakes of editorial decisions.

And decisions must be made. For example, in the TEI module for ‘Representation of Primary Sources’, <metamark> accounts for manuscript markup in the following way:

marks such as numbers, arrows, crosses, or other symbols introduced by the writer into a document expressly for the purpose of indicating how the text is to be read. Such marks thus constitute a kind of markup of the document, rather than forming part of the text.25

Few characterizations better describe Blake’s marks on the Four Zoas manuscript. Therefore, <metamark> seems to be a useful tag to include in the encoding schema, but how might one use this information? How might an edition communicate it visually? Using the colour code to differentiate between ‘text’ and ‘metamarks’ seems logical, but suddenly we are dealing with an excess of colour on the screen. Furthermore, because Blake’s own markup often links sections or moments of composition, is there a way we can account for textual relationships between discrete physical inscriptions? Is this problem an opportunity to make <zone> more meaningful, by linking <metamark> across our encoded textual spaces? Is linking to a <zone> specific enough to be useful? Do we need to dip back into relevant TEI and adopt <surface>, which can contain exact coordinates of an inscription? One question, and then dozens more.26

Thankfully, some are easier to answer than others. As Van Kleeck observed in his article on editing The Four Zoas, the simple but fundamental issue of which-page-goes-where is ultimately an editorial decision based on implicit or explicit arguments, with contrary opinions in no short supply. A digital environment makes it possible to shuffle manuscript pages into many configurations. We could also offer toggling options between page sequences set by different institutions and scholars. At this point, we are also unsure what to do with the visual placement of Blake Archive line numbers and editorial notes, but some sort of toggling or mouseover option also seems a simple and effective way to keep the screen clear of clutter. While we believe editorial annotation to be useful for our edition, a Blake ‘behind barbed wire’ is useful to no one.27

So far, we have tackled only a small fraction of the Four Zoas manuscript, which runs to 146 pages of varying complexity. Our latest prototype has held up well to the challenges presented by the first several page-objects, which contain some of the heaviest revisions. But future objects will present new tests: text that spans separate zones, text that is written horizontally at the beginning of a line but vertically when Blake reaches the margin, a single zone that contains vertical text with up/down and down/up orientations, and on and on — and almost always complicated by those palimpsestic layers of revision. Future works will present new challenges, too: we anticipate that ‘zone’ will be useful for transcribing Blake’s marginalia, and that ‘stage’ will allow us to better describe Blake’s densely revised Notebook. We have also begun to explore the forensic possibilities of digital image manipulation, using software like Adobe Photoshop. Early experimentation suggests that such tools can be useful in the effort to clarify or recover faded or deleted text in difficult manuscripts like The Four Zoas. Manipulated photos might further be useful as an apparatus for readers to see more clearly the rationale behind editorial decisions or as an access point to enhanced or alternate views of facsimiles.

As the Blake Archive continues to expand, so does the set of editorial tools needed to deliver its new publications. As we stated at the outset, this is an ongoing process: we establish an editorial framework, find its limits as we add more challenging manuscripts, and adjust our approaches accordingly. For a manuscript as challenging as The Four Zoas, our methods have required far more forethought, more trial and error, and a more radical overhaul than usual. Our larger aim, of course, is not simply to modify our approach for this particular manuscript but also to keep our eye on future challenges. One work, and then many more.


1. ‘Plan of the Archive’, in The William Blake Archive, ed. by Morris Eaves, Robert N. Essick, and Joseph Viscomi <> [accessed 30 September 2015].

2. The evolution of the Archive has entailed a ‘difficult transition from a specialized to a generalized framework’ (‘Plan of the Archive’) from the publication of illuminated works (e.g., Songs of Innocence and of Experience <>) and various pictorial media to the incorporation of manuscripts (e.g., An Island in the Moon <>) and, most recently, typographic works (e.g., Letter to William Hayley, 23 October 1804 <>) [all accessed 30 September 2015].

3. For an overview of the history of Vala, or the Four Zoas and its various editions, see Justin Van Kleeck, ‘Editioning William Blake’s VALA/The Four Zoas’, in Editing and Reading Blake, ed. by Wayne C. Ripley and Justin Van Kleeck, Romantic Circles (September 2010) <> [accessed 30 September 2015]; and Rachel Lee, ‘Editing in Technicolor: The Blake Archive’s Edition of the Vala, or the Four Zoas Manuscript’, Huntington Library Quarterly (forthcoming). For the broader editorial context, see Morris Eaves, ‘Crafting Editorial Settlements’, RoN: Romanticism on the Net [now RaVon], 41–42 (2006) <> [accessed 30 September 2015].

4. Lee, ‘Editing in Technicolor’. Lee is a special consultant to and former project coordinator of the Blake Archive team at the University of Rochester.

5. Van Kleeck, ‘Editioning William Blake’s VALA/The Four Zoas’. Van Kleeck is a consultant to and former project manager of the Blake Archive.

6. In the hopes of reaching a wider audience, we are restricting ourselves to this admittedly simplistic portrait of Blake editions painted against an even simpler backdrop of scholarly editing. For a primer of the field and a survey of various approaches to historical textual editing, see G. Thomas Tanselle, ‘The Varieties of Scholarly Editing’, in Scholarly Editing: A Guide To Research, ed. by D. C. Greetham (New York: MLA, 1995), pp. 9–32. Van Kleeck sketches a useful overview of ‘documentary’, ‘genetic’, and ‘intentionalist’ editorial approaches in ‘Editioning William Blake’s VALA/The Four Zoas’ (see especially notes 1 and 2).

7. In addition to these practical goal-oriented benefits, the process of remediation, from complex physical objects into digital surrogates, can be a remarkably fruitful site of knowledge production. For a case study, see Aaron Mauro, ‘Versioning Loss: Jonathan Safran Foer’s Tree of Codes and the Materiality of Digital Publishing’, Digital Humanities Quarterly, 8.4 (2014) <> [accessed 30 September 2015].

8. For more details on the scanning and colour-correction process, see the Blake Archive’s ‘Technical Summary’ <> [accessed 30 September 2015].

9. We realize, too, that such experimental digital work includes an implicit argument on the design and role of interfaces and their re-presentation of primary objects. See Alan Galey and Stan Ruecker, ‘How a Prototype Argues’, Literary and Linguistic Computing, 25 (2010), 405–24.

10. For a detailed account of the development of these encoding standards, see Rachel Lee and J. Alexandra McGhee, ‘“The productions of time”: Visions of Blake in the Digital Age’, in Editing and Reading Blake, ed. by Ripley and Van Kleeck <> [accessed 30 September 2015].

11. ‘Editorial Policy Statement and Procedures’, in The Walt Whitman Archive <> [accessed 30 September 2015].

12. The Melville Electronic Library <> [accessed 30 September 2015].

13. While the documentary approach to editing focuses on the final physical object, genetic editing attempts to capture the process of change across time. Of course, Blake’s Four Zoas manuscript represents myriad revision campaigns and would lend itself well to a purely genetic project. However, as we will continue to explain, our strategy is to borrow certain genetic editing concepts to help us describe how those revisionary marks contribute to the manuscript’s final, complex textual structure. For a thoughtful consideration of genetic editing in a specifically digital context, see Elena Pierazzo, ‘Digital Genetic Editions: The Encoding of Time in Manuscript Transcription’, in Text Editing, Print and the Digital World, ed. by Marilyn Deegan and Kathryn Sutherland (Farnham: Ashgate, 2009), pp. 169–86.

14. Eric Loy, ‘Transparency is Collaboration’, The Cynic Sang: The (Un)Official Blog of the Blake Archive <> [accessed 30 September 2015].

15. ‘Technological Infrastructure’, The Shelley–Godwin Archive <> [accessed 30 September 2015].

16. ‘Editorial Principles: Methodology and Standards in the Blake Archive’, The William Blake Archive <> [accessed 30 September 2015], emphasis in original.

17. TEI Workgroup on Genetic Editions, An Encoding Model for Genetic Editions <> [accessed 30 September 2015], emphasis in original.

18. ‘This Site: An Elementary Guide’, in Herman Melville’s ‘Typee’: A Fluid-Text Edition, ed. by John Bryant <> [accessed 30 September 2015].

19. TEI Workgroup on Genetic Editions, An Encoding Model for Genetic Editions. N.B. This concept has been adopted in the most recent version of the TEI guidelines (P5) as <change>. However, we felt that the term ‘stage’ was a better match for the characteristic revisions in the Four Zoas manuscript. See TEI Consortium, ‘11.7 Changes’, TEI P5: Guidelines for Electronic Text Encoding and Interchange <> [accessed 30 September 2015].

20. See Morris Eaves, ‘Picture Problems: X-editing Images 1992–2010’, Digital Humanities Quarterly, 3.3 (2009) <> [accessed 30 September 2015].

21. See the Blake Archive’s ‘Technical Summary’ for a thorough description of these basic technologies, including imaging standards, file formats, server structure, etc.

22. ‘JQuery.iviewer’, distributed via GitHub under the MIT open-source licence <> [accessed 30 September 2015].

23. It should be noted that the Blake Archive is currently undergoing a site-wide redesign, which, of course, will change certain functions of the site.

24. In this early prototype, efficient coding required the placement of a small button next to the line. Future versions will incorporate the click functionality into the line itself. In the prototype, vertical spacing between lines is static, with enough room for click expansion. Future versions will use dynamic spacing so that, when collapsed, the transcription resembles the manuscript page more closely and, when expanded, lines move vertically to accommodate the revealed layers.

25. TEI Consortium, ‘Metamark’, TEI P5: Guidelines for Electronic Text Encoding and Interchange <> [accessed 30 September 2015].

26. Blogging has become a useful tool in thinking through some of these issues. See Eric Loy, ‘Blake’s Track Changes’, The Cynic Sang <> [accessed 30 September 2015].

27. Lewis Mumford, ‘Emerson Behind Barbed Wire’, New York Review of Books, 18 January 1968, pp. 3–5.