Challenges in Intermodal Translation

Effective use of visual space

Music and Motion

Music is full of implied motion. The rising and falling of arpeggios, propulsive rhythms, the quiver of ornamental trills, the momentum of accelerandi, the trajectories of a fountain of saxophone notes, the soaring of a guitar note riding a thermal of feedback... The implied motion in music draws our bodies into motion: the unconscious tapping of the foot, the conscious movements of dance. Motion can be expressed through virtually all the features of music. This myriad of motions defies formulaic graphical representation. Fischinger often used motion to present the gestures of short melodic figures. The Music Animation Machine is somewhat trapped in its horizontal time-line, but this gives clarity to longer melodic trajectories. The colour organs expressed the slower transitions of keys and their related emotional spaces. The motion of the conductor's baton is fundamentally different from the motion of the fingers across the fretboard, or the polyrhythmic banging of drums.

The Use of Space

These motions want space in which to express themselves. It is clear that a comprehensive representation of music must use multiple strategies for the use of the available space. The simple mapping of time to the horizontal and pitch to the vertical is far too inefficient a use of space for the task. The bulk of the space in this form of representation is used up for memory and foresight, with just a narrow band in the center to contain all the dynamism of the present. It seems to me that space must have a hybrid set of meanings. One promising approach is to reflect the hierarchical structure of the visual cortex by expressing different kinds of motion at different scales of representation, from local detail through middle-sized structure up to global features. Features at each scale can be designed to express themselves in the preferred vocabulary of the layer of the cortex that processes features of that scale. For example, edge orientation, local contrast, spatial frequency and stereoscopic depth at a local level would most likely express the features that fluctuate most rapidly, since the part of the cortex dedicated to local features has the lowest temporal latency. Shape and position would seem to suit a middle level of detail, with larger patterns of global coherence expressed at the largest scale. The sense of passing time may still have spatial expressions, to assist the memory in holding onto the passing structure at each level of detail, without demanding the majority of the available space.
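To put a rough number on how little of a time-line display holds the present, here is a minimal sketch. The function name and both parameters are hypothetical, chosen only for illustration: it assumes a display whose full width spans a fixed window of time, and a short span within that window that we agree to call "now".

```python
def present_fraction(window_seconds: float, present_seconds: float) -> float:
    """Fraction of a time-line's width that shows the 'present'.

    Both arguments are hypothetical parameters chosen for illustration:
    window_seconds is the total span of time visible across the display,
    present_seconds is the span of time treated as 'now'.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # The present cannot be negative or wider than the whole window.
    clamped = max(0.0, min(present_seconds, window_seconds))
    return clamped / window_seconds


# Assumed figures: a 20-second window with a half-second 'present' band.
fraction = present_fraction(20.0, 0.5)
print(f"present: {fraction:.1%}, memory and foresight: {1 - fraction:.1%}")
```

With these assumed figures, only a fortieth of the width is given to the present; the rest is memory and foresight. The exact proportions depend entirely on the window chosen.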

In the small example of visual rhythm presented earlier (and here again), the patterns of small circles descending to the lower left show time expressed across a spatial dimension locally and recurrently, rather than globally as a fixed dimension of the display. One can choose to focus on this, or ignore it and allow the overall play of rhythms to dominate.

(click on the image to stop or start.)
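The local expression of time described above might be sketched as follows. This is not the code behind the animation; it is a small illustration under stated assumptions: that older events sit further along the trail, that spacing is uniform, and that screen coordinates are used with y increasing downward. The name trail_positions and its parameters are hypothetical.

```python
def trail_positions(head, count, step):
    """Positions of a trail of circles receding toward the lower left.

    head  -- (x, y) position of the newest event
    count -- how many events to show, including the newest
    step  -- spacing in pixels between successive events

    Screen coordinates are assumed, with y increasing downward, and
    older events are assumed to lie further along the trail.
    """
    x, y = head
    # The event of age n sits n steps to the left of and below the head.
    return [(x - n * step, y + n * step) for n in range(count)]


# Four events trailing away from a head at (100, 50).
print(trail_positions((100, 50), 4, 10))
```

Because the trail is anchored to its own head rather than to the frame, many such trails can coexist, each carrying its own short history, without any one axis of the display being reserved for time.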

Use of the Figure / Ground relationship

Visual information is also perceptually organized into figure and ground. Figure / Ground relationships within a section of the image can be used to present relationships between immediate features and longer-term feature trends, or between local phenomena and global features that have expressive relationships.

Depth of Field

It would be useful to be able to use depth of field as well, so that the eye can not only wander the plane of the image but also prioritize information within a section of it, bringing one layer of visual features into focus and blurring another into a kind of ambient presence. In the absence of such a display, we can think of giving the user control over this sort of depth of field: shifting figure / ground relationships to change focus while retaining the overall relationship between the constituent parts.
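One way to imagine user-controlled depth of field is to give each layer of visual features a nominal depth and blur it in proportion to its distance from the depth the user has chosen to focus on. The sketch below assumes exactly that; the layer names, the depth values, the linear blur rule and the strength parameter are all hypothetical.

```python
def blur_radii(layer_depths, focus_depth, strength=2.0):
    """Blur radius for each layer, given the depth currently in focus.

    layer_depths -- mapping of layer name to a nominal depth (hypothetical)
    focus_depth  -- the depth the user has chosen to bring into focus
    strength     -- blur added per unit of distance from the focal depth

    A linear rule is assumed purely for simplicity.
    """
    return {
        name: strength * abs(depth - focus_depth)
        for name, depth in layer_depths.items()
    }


# Three hypothetical layers of a translation, front to back.
layers = {"rhythm": 0.0, "melody": 1.0, "harmony": 2.0}

print(blur_radii(layers, focus_depth=1.0))  # melody sharp, the others ambient
print(blur_radii(layers, focus_depth=0.0))  # rhythm sharp, harmony most blurred
```

Shifting focus_depth changes which layer reads as figure and which as ground, while the layers themselves, and their relationships to one another, stay as they were.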

Use of the Periphery

Peripheral vision in some ways seems like a completely different sense from foveal vision. A visual presentation of music could present some kinds of information on the periphery and others at the center of the image. My sense is that these two paths involve different kinds of attention, and are unlikely to conflict with or obscure each other. The periphery is largely responsive to visual change, and does not have much colour and brightness resolution. The edges of the visual field might be used to present dimensions of temporal inflection meant to modulate the material being digested more slowly and in more detail in the fovea and surrounding area. I think it would be preferable to keep the assignments open, so that the translation can have different states which bring one set of features to the focal point and others to the edges.
