All of a sudden, over the past week I have received lots of requests for an update on our progress, so here we go. My last update was at the start of November, on the occasion of the first anniversary of our team joining Steinberg. This new update comes more or less on the occasion of the first anniversary of us starting to write our new scoring application. We’ve made a lot of progress in a year, but we still have a long way to go before the product will be in your hands.

In my last update I described the high-level design of the musical brain at the heart of our application: lots of individual engines performing single tasks, some dependent on one another, others running independently. This design approach allows us to break down the immensely complex problem of how to correctly notate and lay out music into smaller, more manageable chunks, and also provides the opportunity for these little engines to be designed and implemented independently, and eventually to run in parallel (unless an engine is dependent on the output of another engine, of course).
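To make the idea of dependent and independent engines a little more concrete, here is a minimal Python sketch of how such engines could be grouped into stages that are safe to run in parallel. The engine names and the dependencies between them are illustrative assumptions (only the link from stem direction to staff position is taken from the description in this post), and this is not a picture of how the application itself schedules its work.

```python
from graphlib import TopologicalSorter

# Hypothetical engines, each mapped to the engines whose output it needs.
ENGINE_DEPENDENCIES = {
    "staff_position": set(),
    "note_grouping": set(),
    "stem_direction": {"staff_position"},
    "rest_positioning": {"note_grouping", "staff_position"},  # assumed
}


def parallel_stages(dependencies):
    """Group engines into stages; engines in the same stage are independent."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage as finished
    return stages


stages = parallel_stages(ENGINE_DEPENDENCIES)
# stages == [["note_grouping", "staff_position"],
#            ["rest_positioning", "stem_direction"]]
```

Each inner list could be handed to a pool of worker threads, with the next stage starting only once the previous one has finished.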

I described in broad terms the first couple of such engines that we had implemented already: determining the staff position of a note (taking into account clef, transposition and octave shift), and determining the stem direction of a note (taking into account staff position and musical context). We were also starting to embark on engines to handle note and rest grouping, and to position rests in multiple voices correctly. Over the next couple of posts, I’ll talk about these engines, together with a few other new ones we have developed since the last update.
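As a rough illustration of what a staff position engine has to compute, the following Python sketch counts diatonic steps from the middle staff line. The function and parameter names are hypothetical, transposition is simplified to a whole number of diatonic steps, and accidentals are ignored entirely, so treat it as a sketch under those assumptions only.

```python
LETTERS = "CDEFGAB"

# The note that sits on the middle staff line for each clef.
MIDDLE_LINE = {
    "treble": ("B", 4),
    "alto": ("C", 4),
    "bass": ("D", 3),
}


def diatonic_index(letter, octave):
    """Number of diatonic steps above C0."""
    return octave * 7 + LETTERS.index(letter)


def staff_position(letter, octave, clef, transpose_steps=0, octave_shift=0):
    """Return steps above (+) or below (-) the middle staff line.

    transpose_steps: diatonic steps from sounding to written pitch
        (assumed simplification, e.g. 1 for a B-flat clarinet).
    octave_shift: octaves by which an 8va-style line raises the sound,
        so the note is drawn that many octaves lower.
    """
    written = diatonic_index(letter, octave) + transpose_steps - 7 * octave_shift
    return written - diatonic_index(*MIDDLE_LINE[clef])


print(staff_position("C", 5, "treble"))                  # 1: third space
print(staff_position("C", 4, "bass"))                    # 6: leger line above
print(staff_position("C", 6, "treble", octave_shift=1))  # 1: drawn as C5
```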

Note and rest grouping

How notes and rests are rhythmically grouped, based on the prevailing metrical grouping or time signature, has a fundamental impact on how music is apprehended at sight. As with many of the rules of music notation, an experienced musician will instantly recognise if it is wrong, but may not always be able to articulate exactly what is wrong.

Consider the very simple case of a note on the final eighth note subdivision of an otherwise empty bar of 3/4, and a bar of 6/8. You can instantly see which is which:

3-4-vs-6-8-rests

Now consider the same bar, but with a note of five eighths in duration at the start of the bar. Again, it’s obvious which bar is in 3/4 and which is in 6/8 by the way the long note is divided into smaller parts:

3-4-vs-6-8-notes
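The difference between the two bars can be sketched in a few lines of Python. Durations are counted in eighth notes, and the only metrical knowledge is a list of boundaries that a note may not cross: none for this note in 3/4, and the half-bar point in 6/8. Both the boundary lists and the function names are simplifying assumptions for this one example, far short of a general grouping algorithm.

```python
# Durations a single notehead can show, in eighths, longest first:
# whole, dotted half, half, dotted quarter, quarter, eighth.
SINGLE_NOTE_VALUES = [8, 6, 4, 3, 2, 1]


def greedy_split(duration):
    """Split a duration into single note values, longest first."""
    pieces = []
    while duration > 0:
        value = next(v for v in SINGLE_NOTE_VALUES if v <= duration)
        pieces.append(value)
        duration -= value
    return pieces


def split_note(start, duration, boundaries):
    """Cut a note at each boundary it crosses, then split each segment."""
    end = start + duration
    cuts = [start] + [b for b in sorted(boundaries) if start < b < end] + [end]
    pieces = []
    for left, right in zip(cuts, cuts[1:]):
        pieces.extend(greedy_split(right - left))
    return pieces


three_four = split_note(0, 5, boundaries=[])  # [4, 1]: half tied to eighth
six_eight = split_note(0, 5, boundaries=[3])  # [3, 2]: dotted quarter tied to quarter
```

With no boundaries at all, the function reduces to a plain longest-first split, which ignores the beat structure entirely.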

Most scoring programs take quite a simplistic approach to note and rest grouping. Typically, the user inputs the note durations exactly as they want them to appear, and the software only has to worry about creating rests to pad out the bar. In the example above, neither of the two most mature commercial scorewriters provides a means for the user to enter a note of five eighths in duration using the mouse or keyboard, so users have to choose for themselves how to split it. The exception to this, of course, is when the user plays notes using a MIDI instrument in real time, or imports a MIDI file, but you might be surprised at some of the choices that the established scoring programs make with relatively simple input.

It seems that, on the whole, existing scoring programs have a simple algorithm for note and rest grouping according to the time signature, with a few common edge cases given special handling.[1]

Our goal is to handle note and rest grouping completely automatically, while providing a small number of global options to account for some variation in conventions, and of course allowing editing on a case-by-case basis. For example, it’s normal modern practice to use dotted rests in compound time signatures, but if you are reproducing an earlier edition, you might prefer not to use dotted rests; you might not ever want to see double dotted rests; and so on.

To this end, we have devised an algorithmic approach to note and rest grouping that is fully general, so it works satisfactorily in any time signature or beat grouping. This has taken time to get right, but it will be used far more heavily in our application than in other scoring programs, because the notation in our application is considerably more dynamic.

Positioning rests

When two or more voices (independent streams of notes) are rendered on the same staff, it’s important that the rests in each voice are vertically positioned in such a way as to remove ambiguities in the rhythm of the music. Elaine Gould’s book Behind Bars has some great examples of the kinds of ambiguities that can occur, like this doozy:[2]

gould-rest-book

Obviously this example is somewhat contrived, designed to make a point: Gould states that when the notation gets this complicated, each voice should really be given its own staff. But if you’re going to put it on a single staff, you need to ensure not only that rests do not collide with notes (notes obviously can’t move vertically to avoid rests, so rests have to move vertically to avoid notes), but also that the rests are placed generally in line with the voice to which they apply. Look, for example, at the 32nd rest at the end of the second bar: this rest is in the lower voice (whose notes are the C5 quarter note and the D6 double-dotted eighth). It could physically be drawn in its normal position, i.e. centred on the middle staff line, as it wouldn’t risk overprinting a note there. However, it would then look as if it were following the stem-up eighth note in the upper voice, which is ambiguous, if you’re generous, or just plain wrong, if you’re not.

How do the main scoring programs handle this example by default? Here’s Product A, one of the two most mature commercial scorewriters:

gould-rest-sibelius-default

Ignoring the issues with stem-up and stem-down notes colliding in the first bar, you can see that a simplistic approach is taken, simply offsetting the rests in each voice outwards, away from their normal position in the middle of the staff, by a space. This isn’t enough to avoid the notes, and only one rest (the 16th rest in the lower voice in the final bar of the example) is positioned satisfactorily.[3]

Product B, the other of the two main commercial products, fares somewhat better:

gould-rest-finale-default

This takes a similarly simplistic approach to Product A, offsetting the vertical position of the rests in each voice by a fixed amount, but the decision to offset by three spaces rather than Product A’s one space definitely produces a more legible result.
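The fixed-offset approach is simple enough to sketch in a few lines of Python, which also shows why it cannot guarantee a clean result: the rest is moved without ever looking at the notes around it. The function names, the position scale (staff steps from the middle line) and the clearance value are all assumptions made for illustration, and the sketch says nothing about how either product is actually implemented.

```python
STEPS_PER_SPACE = 2  # one staff space spans two staff positions


def fixed_offset_rest(voice, offset_spaces):
    """Position of a rest's centre, in steps from the middle staff line.

    Rests in the "upper" voice move up; any other voice moves down.
    """
    direction = 1 if voice == "upper" else -1
    return direction * offset_spaces * STEPS_PER_SPACE


def collides(rest_position, note_positions, clearance=2):
    """True if any note lies within `clearance` steps of the rest (assumed)."""
    return any(abs(rest_position - note) < clearance for note in note_positions)


# A lower-voice rest against an upper-voice note one step below the middle line.
other_voice_notes = [-1]
one_space = fixed_offset_rest("lower", 1)     # -2
three_spaces = fixed_offset_rest("lower", 3)  # -6
print(collides(one_space, other_voice_notes))     # True
print(collides(three_spaces, other_voice_notes))  # False
```

A larger fixed offset clears more notes, but neither value adapts to where the notes actually are.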

The open-source music engraving program, which we’ll call Product C, fares better still:

gould-rest-lilypond-default

Here all of the differences from Gould’s preferred rendering really come down to a matter of taste: it would be neater to align the eighth rests in the lower voice in the first bar, and it would be clearer to place the lower-voice bar rest in the fourth bar and the upper-voice half rest in the fifth bar on leger lines further offset by one more space, but it certainly provides a legible result by default.

Finally, here’s how the rest positioning algorithm in our application handles this case:

gould-rest-steinberg-default

In the interest of full disclosure, the above example shows how the rests are positioned by our algorithm, but is not a direct screengrab from our application, because the prototype score rendering technology in our test harness doesn’t produce such a beautiful result just yet.

While it’s relatively quick to adjust the vertical position of rests in both of the main graphics-based scoring programs (and can be done in Product C by specifying a “pitch” for the rest), in our application it’s our goal that you simply won’t have to do so – even under extreme circumstances such as this contrived example.

More to come

In the next instalment in this series (coming soon!), I’ll describe a couple more of the engines we’ve been working on recently, focused in particular on accidentals.

  1. It must be said that the two most mature commercial scoring products fare much better in this regard than, say, the leading open-source music engraving program, which takes the most simplistic approach possible when converting MIDI data into notation. It simply bites off as much of a long note as can be represented by a single note value (with augmentation dot if necessary), and then handles any remaining duration in the same way, so every long note is always split up using durations that proceed from longest to shortest, completely ignoring the beat divisions implied by the prevailing time signature. Unfortunately for users of text input-based engraving programs, they probably also have the hardest job of fixing up the result, given that they have to manually edit the resulting text input file. In fairness, it is presumably not often that users of these kinds of programs choose to begin a project by importing a MIDI file. However, even the major graphics-based scoring programs offer relatively little control over the rules to be used.
  2. This example is found at the bottom of page 37.
  3. Improving rest positioning in Product A is, at the time of writing, its third highest-voted feature request.