After the last instalment’s detailed discussion of beam grouping and sloping, today’s development update is going to focus on some of the issues around laying out a page of music, in particular working in the horizontal dimension, including casting off and justification.

From galley view towards page view

The first kind of view onto the musical score that we implemented in our new application was an infinitely-wide view that lays the music out in a single system. In other applications, this is called things like scroll view, continuous view, or even Panorama; we call it galley view. This is an allusion to the use of the term in print publishing, when the typesetter would produce an initial proof of a book, having laid out the metal type into galleys, the trays that hold the individual pieces of type.

A galley full of type. Image courtesy of the Greyweather Press

Although galley proofs for books are still typically laid out on pages, those pages are normally not bound together, and indeed are often handed to the proof-reader in signatures (large sheets of paper with multiple pages printed on them, which will eventually be cut and folded to produce the correct sequence of pages when the book is imposed). The analogy is imperfect, since in our application galley view produces no pages at all, but it is the same in spirit: a proof-reader looking at a galley proof is normally looking for typesetting errors, rather than examining the final layout in detail (that later stage of proof-reading uses page proofs), and the purpose of galley view in our application is to allow you to focus on the musical content, rather than on the final system and page layout.¹

A galley proof of a poem by Ralph Hodgson. Image courtesy of the special collections at Bryn Mawr College Library.

Galley view has been implemented since the very first temporary drawing of the music that we achieved two summers ago. We later put in a very simple version of what we at the time rather optimistically called page view, where the music would be drawn on a single page of infinite height, with the music split into systems of four bars each, regardless of how wide or narrow each system ends up. The main purpose of this was to expose the sorts of issues that arise with notations that start on one system, and end on the next: ties, slurs, beams, octave lines, and so on, and so on. Each kind of notation needs special handling to ensure that it is laid out correctly on either side of the system break.

We also implemented a debug option that would force a new system at intervals of an arbitrary number of quarter note (crotchet) beats, even if that position did not correspond to a barline, i.e. in the middle of a bar: this helped to expose less common sorts of issues, such as what happens if you need a system break to occur in the middle of a tuplet, and so on. Because our application has built-in support for open time signatures, where bars can be of any length, it’s crucial that it can place a system break at any rhythmic position, so these kinds of simplistic options are important in forcing us to think about how to handle the different ways in which notations must be represented on either side of a system break.
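As a rough illustration of that debug option, the sketch below works out where forced breaks would fall when a new system starts every fixed number of quarter-note beats. The function name `forced_breaks` and the representation of bars as a list of lengths in quarter notes are assumptions made for this example, not a description of our actual code.

```python
from fractions import Fraction

def forced_breaks(bar_lengths, interval):
    """Return (bar_index, offset_in_bar) for a break every `interval` quarter notes.

    An offset of 0 means the break coincides with a barline; any other
    offset means the break falls in the middle of that bar.
    """
    lengths = [Fraction(length) for length in bar_lengths]
    interval = Fraction(interval)
    total = sum(lengths)
    breaks = []
    bar_index, bar_start = 0, Fraction(0)
    position = interval
    while position < total:
        # Advance to the bar that contains this rhythmic position.
        while bar_start + lengths[bar_index] <= position:
            bar_start += lengths[bar_index]
            bar_index += 1
        breaks.append((bar_index, position - bar_start))
        position += interval
    return breaks

# Three bars of 4/4 with a break forced every five quarter notes.
print(forced_breaks([4, 4, 4], 5))
```

With three bars of 4/4 and an interval of five beats, both breaks land in the middle of a bar, which is exactly the kind of case the option was designed to provoke.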

Over the past few months, however, we’ve started to work incrementally on taking that very basic page view and moving it closer to an actual page-based view of the music. We have to date focused on the horizontal aspects of page layout: casting off music into systems, calculating the rhythmic space required by the music on each system, and then justifying the music into the available width, processes which are somewhat interdependent and which I will discuss in more detail below.

There are also, of course, the vertical aspects of page layout: setting the appropriate distances between staves within a system, determining how many systems can fit onto a page of a given height, and then justifying the music into the available height. Work in this area is just beginning in earnest.

Breaking the process of laying out a page of music down into smaller, more discrete steps is in the spirit of the architecture of our application: we always break complex problems down into smaller, (hopefully) simpler ones, writing individual engines (which we call processors) to solve each problem in isolation, before chaining them together to tackle the larger issue.

Rhythmic spacing

I’ve discussed our general approach to rhythmic spacing before, including the ability to kern adjacent columns: additional space, above and beyond the rhythmic space that should be allocated to each column, is introduced only when necessary. For example, a group of accidentals on a chord will only require additional space to be allocated if they would otherwise collide with the previous note or chord; if they can tuck above or below the previous note or chord, then they will. In addition to the crucial benefit of ensuring that the rhythmic spacing is as consistent as possible, another important benefit of this approach is that our application can determine exactly the minimum amount of space needed to space the music with correct rhythmic proportions and no collisions between items.
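To make the kerning idea concrete, here is a deliberately simplified sketch. It assumes each column can be reduced to a notehead "body" and an optional left-hand "overhang" (the accidentals), each with a vertical extent. The `Column` type, the `extra_space` function, and the `min_gap` value are all hypothetical. A real implementation would also have to consider stems, dots, ledger lines, and so on.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Span = Tuple[float, float]  # (bottom, top) vertical extent, in stave positions

@dataclass
class Column:
    rhythmic_width: float                 # space allotted by duration, in spaces
    body_width: float                     # width of the noteheads
    body_span: Span                       # vertical extent of the noteheads
    overhang_width: float = 0.0           # width of accidentals left of the noteheads
    overhang_span: Optional[Span] = None  # vertical extent of those accidentals

def spans_overlap(a: Span, b: Span) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]

def extra_space(prev: Column, cur: Column, min_gap: float = 0.2) -> float:
    """Extra space needed between prev and cur, beyond prev's rhythmic width."""
    if cur.overhang_span is None:
        return 0.0
    if not spans_overlap(prev.body_span, cur.overhang_span):
        # The accidentals can tuck above or below the previous note.
        return 0.0
    # Empty space already available to the right of the previous noteheads.
    free = prev.rhythmic_width - prev.body_width
    return max(0.0, cur.overhang_width + min_gap - free)

prev = Column(rhythmic_width=2.0, body_width=1.2, body_span=(0.0, 1.0))
colliding = Column(2.0, 1.2, (0.0, 1.0), overhang_width=1.0, overhang_span=(0.0, 1.0))
tucked = Column(2.0, 1.2, (-6.0, -5.0), overhang_width=1.0, overhang_span=(-6.5, -4.5))
print(round(extra_space(prev, colliding), 3), extra_space(prev, tucked))
```

The point of the sketch is that the extra space is zero unless a collision would actually occur, so the sum of the rhythmic widths plus these extras gives the minimum width the span requires.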

Casting off

The knowledge of exactly how much space a given span of music requires is an input into the processor that determines the casting off of each system. Casting off is another term borrowed from print publishing, where it refers to the process of estimating the number of signatures required to typeset a given text. In music publishing, casting off is the process of determining how bars should be distributed within systems, and how systems should be distributed within pages, to produce a pleasing layout that uses the appropriate (not necessarily minimal) number of pages. Just as in print publishing, where complex considerations come into play concerning widows (one or two lines at the start of a paragraph left at the end of one page) and orphans (a few words at the end of a paragraph spilling onto a new page), starting new chapters or sections on right-hand pages, ensuring that no two successive lines end with a hyphenated word, and so on, the process of casting off music is something that requires a good deal of skill.

Before engraving became computerised, casting off was normally the first step in preparing a piece: the engraver would mark up the manuscript or whatever source he was working from, and plan out how much music would appear on each system. He would do this even before working out the punctuation of the music.² Casting off is definitely more of an art than a science: engraver George McGuire, a recipient of many Paul Revere Awards (given out by the Music Publishers’ Association of the USA in recognition of excellence in music engraving), wrote:

“This is not an easy thing to do. Nor is it easy to describe how it is done. No one can tell you how to ride a bicycle, you learn by doing it – and falling off.”

Casting off, therefore, is one of the most difficult processes to ask a computer to perform, since it requires a high degree of judgement and discretion to do it well. However, there are aspects of casting off that a computer can perform much more efficiently than a human, and can arrive at a good result by way of a more deterministic method.

A human engraver, for example, must calculate the punctuation of the music by hand. He won’t have the time or inclination to do each system more than once, if at all possible, so based on his prior experience, he will make a number of quick decisions at the start of the process that will eliminate many possible solutions from the outset. He will make a judgement about the page size, the page margins, and the rastral (stave) size. These fundamental dimensions have an enormous impact on the punctuation; changing any one of these variables will dramatically alter the final layout of the music. To give a very simple example, using a larger rastral size will reduce the amount of rhythmic space (often measured in notehead widths, which are normally just a little more than one space wide, the space being the distance between two adjacent stave lines), so less music will fit onto the system.

A human engraver must also very carefully consider the vertical dimension at the outset: he must determine where the distance between staves must be increased to accommodate very high or low notes, or extra playing techniques, and so on. Once the staves are marked on the plate, or stenciled onto the page, it is very costly to go back and change his mind.

The computer, on the other hand, can perform the calculations required to punctuate the music in the blink of an eye. It can cope with changes to any of the vital statistics – page size, margins, stave size – at any point in the process. And, of course, the cost of making a change to the casting off is practically zero, because everything can be redone in an instant if the engraver changes his mind, if more music is added, or if the existing music is edited.

The computer can, to some degree, compensate for its lack of human judgement and discretion through its ability to evaluate combinations of possibilities thousands of times more quickly than a person can. When the computer performs casting off, unlike the human engraver, it starts with the punctuation: this gives an objective measure of the amount of space required to typeset a span of music while producing appropriately proportioned spacing for the rhythm without any items colliding graphically. The computer also knows the width into which the music must be flowed: this will typically be the page width, minus the page margins, minus the width occupied by the staff labels, brackets or braces, and initial clef and, if appropriate, key and time signatures at the start of the system.
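The available width described above is simple subtraction, which can be sketched as follows. The function name and all of the dimensions (in millimetres) are invented for illustration and do not correspond to any real defaults.

```python
def available_system_width(page_width, left_margin, right_margin, preamble_widths):
    """Width left for the music once margins and the system preamble are removed."""
    return page_width - left_margin - right_margin - sum(preamble_widths)

# Hypothetical dimensions in millimetres.
preamble = {
    "staff labels": 12,
    "bracket": 3,
    "clef": 6,
    "key signature": 4,
    "time signature": 5,
}
width = available_system_width(210, 15, 15, preamble.values())
print(width)  # 150
```

Because the preamble usually differs between the first system and later ones (the time signature normally appears only once, for instance), the available width has to be computed per system rather than once per page.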

Most scoring programs seem to use a variant of a “greedy” approach to casting off: each system is simply filled up as much as possible, working from the start of the score to the end, which means that they might be left with a final system that is almost empty, and they deal with this remainder in different ways. Product A,³ for example, appears to distribute the remainder over the last two systems. Product B chooses not to deal with the remainder at all, and will happily allow the last system to contain only one bar (though, in its defence, it provides very efficient tools to move bars between systems that help the user to exercise his or her judgement over casting off very quickly).
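A greedy cast-off of this kind can be sketched in a few lines. This is a generic illustration of the technique, not the actual algorithm of any of the products mentioned: the function names are hypothetical, and `rebalance_last_two` is only one guess at what "distributing the remainder over the last two systems" might mean.

```python
def cast_off_greedy(bar_widths, available_width):
    """Fill each system with as many bars as fit; returns lists of bar indices."""
    systems, current, used = [], [], 0.0
    for index, width in enumerate(bar_widths):
        if current and used + width > available_width:
            systems.append(current)
            current, used = [], 0.0
        current.append(index)  # an over-wide bar still gets a system to itself
        used += width
    if current:
        systems.append(current)
    return systems

def rebalance_last_two(systems, bar_widths, available_width):
    """Share the bars of the last two systems as evenly as possible, if they fit."""
    if len(systems) < 2:
        return systems
    merged = systems[-2] + systems[-1]
    split = (len(merged) + 1) // 2
    head, tail = merged[:split], merged[split:]
    for part in (head, tail):
        if sum(bar_widths[i] for i in part) > available_width:
            return systems  # keep the plain greedy result
    return systems[:-2] + [head, tail]

widths = [30.0] * 10  # ten equally wide bars, in arbitrary units
greedy = cast_off_greedy(widths, 100.0)
print(greedy)
print(rebalance_last_two(greedy, widths, 100.0))
```

With ten equal bars and room for three per system, the plain greedy pass leaves a single bar stranded on the last system, while the rebalanced version ends with two systems of two bars each.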

Product C, according to its documentation, tries to produce an optimal result for casting off all systems. Since real-time performance is not a requirement for an application that produces output compiled from a text-based input format, the trade-off between requiring more computation to calculate the result and the speed of arriving at that result is a good one. For real-time, interactive applications, however, a compromise is needed in order to balance the quality of the result against the time taken to compute it. One design goal for our application in particular is to always do only the minimal amount of recalculation to respond to an edit, to make the performance of the application as high as possible.
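For contrast with the greedy approach, the sketch below finds a globally optimal cast-off using dynamic programming, in the general style of optimal line-breaking for text. It is not a description of how Product C works: the cost function (the squared unused width of each system) and the function name are assumptions chosen to keep the example short.

```python
def cast_off_optimal(bar_widths, available_width):
    """Choose system breaks minimising the sum of squared unused width per system."""
    n = len(bar_widths)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost of laying out bars[:i]
    prev = [0] * (n + 1)             # prev[i]: index where the last system of bars[:i] starts
    best[0] = 0.0
    for end in range(1, n + 1):
        used = 0.0
        for start in range(end - 1, -1, -1):
            used += bar_widths[start]
            if used > available_width and start < end - 1:
                break  # adding more bars only makes this system wider
            slack = max(available_width - used, 0.0)
            cost = best[start] + slack ** 2
            if cost < best[end]:
                best[end], prev[end] = cost, start
    # Walk back through the recorded break points.
    systems, end = [], n
    while end > 0:
        start = prev[end]
        systems.append(list(range(start, end)))
        end = start
    return systems[::-1]

layout = cast_off_optimal([30.0] * 10, 100.0)
print([len(system) for system in layout])
```

For the same ten equal bars, this avoids the stranded single bar by using two systems of three bars and two of two. The price is that every possible last system is considered for every prefix of the score, which is the kind of extra computation that has to be weighed against responsiveness in an interactive application.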

Because there’s no one perfect solution to casting off any given score, the various scoring applications have all taken different approaches to solving the problem. Each approach has benefits and compromises, and it’s instructive to compare the results of each against our application, and against the results achieved by an expert human engraver.

Let’s look at an example: we’ll start with the Edition Peters engraving of Fugue 16 from Book 1 of Bach’s Well-Tempered Clavier,⁴ the G minor BWV 861. The engraver has successfully cast off the 34 bars of music onto two pages, with a very consistent layout: of 11 systems, all but one have three bars, with the fourth system having four. The engraving becomes tighter as the rhythmic complexity grows with the introduction of the third and fourth voices, but it is always very clear.

Here is a comparison of the default results produced by Product A, Product B, Product C, and our own application:⁵

A comparison of the casting off of BWV 861 by four scoring applications, using the Edition Peters engraving (1937) as the point of comparison.

All of the applications produce 12 systems by default, rather than 11, with the result that there are a number of systems that only contain two bars, rather than three. None of the applications produce a result as consistent as the original engraving by default.

The result from our own application is based on using its default rhythmic spacing, which is wider than the Edition Peters engraving, and comparable to the default rhythmic spacing of the other applications. However, because of the sophisticated rhythmic spacing algorithm we have implemented, it’s possible to make the spacing narrower by changing a single value, which produces spacing much closer to the original engraving, and also produces a solution with 11 systems rather than 12 (though our application bites off four bars for the first system as well as the fourth, so it still produces one system of two bars on the second page). Here is an image showing bars 29–31, with the original engraving above and our application’s result below:

Bars 29–31 of BWV 861; above, an extract from the Edition Peters engraving, 1937; below, an extract from our in-development application.

(There has been no editing of the appearance or positioning of any of the elements in the image from our application, by the way: these are the default results when opening a MusicXML file with no positioning information included.)

Justification

The final element of the horizontal aspect of page layout is justification, another term borrowed from the related world of text typography and publishing. Justified text is where the spaces between words (and, sometimes, the spaces between individual letters) are increased to align both the left- and right-hand ends of the text with the surrounding lines, the page margins, or the gutter between columns.

Music is almost always justified: the stave starts at the left-hand page margin, and ends at the right-hand page margin. The process of spreading out the music to close the gap between the required width (providing adequate rhythmic space while avoiding collisions) and the available width (i.e. the width of the stave, minus the width of the items that appear on every system) is what we call justification.

In the days of hand engraving, the application of justification was another area where human judgement had to be carefully applied. Imagine that your available space, once you have subtracted the page margins and the musical preamble at the beginning of the system, is 120 spaces. Having completed the casting off, the required space for the bars to be included in this system is 111.5 spaces. Now, across the width of the whole system, a surplus 8.5 spaces must be distributed in a way that does not produce distortion.

The strategy that the engraver would employ to distribute the surplus space varies according to the number and duration of notes on the system. One strategy, for example, would be to identify the longest notes, and to allocate the surplus space equally only to the longest notes; if this doesn’t work out nicely, you might allocate it in smaller parcels to all of the notes that have the most frequently used duration; and so on. The one thing that a human engraver would find troublesome to achieve would be to distribute the surplus space absolutely evenly between all of the columns in the system, because this would typically mean adding a tiny amount of space very precisely to every column.

Of course, for a computer, applying a tiny fraction of a space to each column in a system is a simple bit of arithmetic, and will give overall more consistent results, particularly in the case of complex cross-rhythms and tuplets. This is an example of where a computer’s precision actually improves upon the more rough and ready approach taken by a human engraver. The computer can also more easily introduce small adjustments designed to produce a more optically pleasing result (for example, to ensure that the graphical distance between successive notes with opposing stem directions appears more equal to the eye).
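The two strategies can be compared directly using the figures from the example above (120 spaces available, 111.5 required, 8.5 surplus). The column widths and durations below are invented so that they add up to 111.5, and both function names are hypothetical. This is a sketch of the arithmetic rather than of our justification processor.

```python
def justify_evenly(column_widths, available_width):
    """Add an equal share of the surplus to every column."""
    surplus = available_width - sum(column_widths)
    if surplus <= 0 or not column_widths:
        return list(column_widths)
    share = surplus / len(column_widths)
    return [width + share for width in column_widths]

def justify_longest(column_widths, durations, available_width):
    """Give the whole surplus, in equal parcels, to the columns with the longest duration."""
    surplus = available_width - sum(column_widths)
    if surplus <= 0 or not column_widths:
        return list(column_widths)
    longest = max(durations)
    targets = [i for i, duration in enumerate(durations) if duration == longest]
    share = surplus / len(targets)
    return [width + share if i in targets else width
            for i, width in enumerate(column_widths)]

# Made-up column widths (in spaces) totalling 111.5, with durations in quarter notes.
widths = [16.0, 8.0, 8.0, 12.0, 8.0, 8.0, 16.0, 11.5, 12.0, 12.0]
durations = [2.0, 0.5, 0.5, 1.0, 0.5, 0.5, 2.0, 1.0, 1.0, 1.0]

even = justify_evenly(widths, 120.0)
by_longest = justify_longest(widths, durations, 120.0)
print(round(sum(even), 6), round(even[0] - widths[0], 6))
print(sum(by_longest), by_longest[0] - widths[0])
```

In the even case every one of the ten columns grows by 0.85 spaces, whereas the engraver's shortcut adds 4.25 spaces to each of the two longest columns and leaves the rest untouched. Both fill the system exactly, but the first changes each gap by the same small amount.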

By way of illustration, here are the same three bars of BWV 861 shown above, first drawn with justification disabled, and then drawn with justification enabled:

Bars 29–31 of BWV 861; above, justification disabled; below, justification enabled.

Notice how, even in the unjustified case, no elements are colliding: look, for example, at the left-hand F#2 on the third beat of bar 31, and notice how it is tucked underneath the preceding E3. There is no distortion of the rhythmic spacing caused by the presence of the accidental on the F#2, as is the case with other scoring applications. Even in the justified case, the accidental still tucks nicely underneath the preceding note, though the justification adds a little extra space between these two columns, so the accidental is not too tight on the previous note’s stem.

Vertical spacing

Of course, spacing the music according to its rhythmic durations, casting it off into systems, and then justifying the leftover width only addresses the horizontal dimension of page layout. In order to produce a complete page of music, the vertical dimension must be considered too: determining not only the optimal spacing between staves within a system, but also distributing the systems within the available height of the page as well.

The other important factors are the stave size, and the page size. Our application makes it very simple to change the stave size: a default stave size is defined for each layout, and the staves belonging to an individual player can be set to a different size expressed as a percentage of the default size – though the use of more than two different stave sizes within the same system is very rare; for example, ossia staves are typically three-quarters the size of the default, and small staves used to cue in a soloist’s part for the pianist in, say, a violin sonata are similarly sized. In some works for large ensembles, the number of staves may change from one movement to the next, or even sometimes within the same movement. To ensure an optimal layout throughout the whole score, it may therefore be necessary to change the default stave size, which can be done easily at any system break.

There is a great deal that could be written about the vertical aspects of page layout, but further discussion will have to wait for another instalment of this diary.

That’s all, folks

Of course a great deal of other work is going on. Our colleagues in Hamburg are continuing to work on extracting the audio engine from Cubase so that it can be integrated with our application; we continue to improve the input and editing workflows; we continue to iterate on the visual design of the application; we are starting to build the infrastructure that will make it easy to change the appearance of any notated element in your project; we are using the metadata defined in Bravura for fine adjustments to things like how stems attach to noteheads; we have beautiful slurs; and more besides. But all of this will have to wait for another time.

In the meantime, from all of us in London, thank you for taking an interest in the development of our application. If you have any questions or comments, please don’t be shy: leave a comment using the form below, or feel free to contact me directly.


  1. “Galley view” also happens to be the name given to this kind of view in MOTU’s venerable Mac OS scoring program, Composer’s Mosaic. 
  2. Punctuation is described in part 9 of this development diary, but briefly: it’s the process of determining what the columns to be spaced are, by creating a new column for each note onset across all of the staves in the system, and assigning the rhythmic value of the shortest sounding note across all staves to that column, to determine the allocation of rhythmic space to columns. 
  3. Product A and Product B are the two leading commercial, proprietary scoring applications; Product C is an open-source music engraving package that uses a text-based input format. 
  4. See pages 17 and 18 in this PDF, available from IMSLP (#69275). 
  5. A note on the methodology used in this comparison. A MusicXML file, containing no page and system formatting data, was opened in Product A and Product B, and the page size, page margins, stave size, and stave spacing parameters were changed to try to achieve the same density of systems per page and available horizontal space as the Edition Peters engraving. No changes were made to the default rhythmic spacing options. For Product C, the same MusicXML file was converted to the appropriate text input file using a tool supplied with Product C, and the resulting input file was lightly modified to set the page size, stave size, and page margins to match the Edition Peters engraving; neither the default stave spacing nor the default spacing was modified. For our own application, the same MusicXML file was opened, and the only adjustment made was to the stave size (at present, the page width is hardcoded, so it is instead necessary to adjust the stave size such that it is in proportion with the hardcoded width).