Now you can
See with sound: see with your ears!
Free soundscape synthesizer and sequencer! This fully interactive page allows you to draw your own 64 × 64, 16 grey-tone image and immediately hear the corresponding 64-voice polyphonic soundscape being synthesized on the fly! See and hear how The vOICe mapping works for your input. The 64-channel sound synthesis here maps the image into an exponentially distributed frequency interval for a one second soundscape. Furthermore, you can view sound waves, sonify existing images, train for audiovisual synesthesia and sensory substitution, perform on-line composing, make soundscape animations and create spectrograms (sonograms, sonagrams). Have a touch screen? You can draw with your finger instead of the mouse!
The vOICe mapping: vertical positions of points in a soundscape are represented by pitch,
while horizontal positions are represented by left-to-right scanning and corresponding
stereo panning. Brightness is represented by loudness. In this manner, pixels become... voicels!
Scroll down for more.
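The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the applet's actual code: the function names row_frequencies and sonify, the 11025 Hz sample rate, the simple linear panning law and the [500 Hz, 5 kHz] frequency interval (the range quoted elsewhere on this page for arti2.c) are all choices made for this sketch.

```python
import math

def row_frequencies(n_rows=64, f_low=500.0, f_high=5000.0):
    """Exponentially distributed frequencies; index 0 is the bottom row (lowest pitch)."""
    ratio = f_high / f_low
    return [f_low * ratio ** (i / (n_rows - 1)) for i in range(n_rows)]

def sonify(image, duration=1.0, sample_rate=11025,
           f_low=500.0, f_high=5000.0, levels=16):
    """Turn a grey-tone image into a list of (left, right) samples.

    image[r][c] holds a grey level 0..levels-1, with r = 0 the top row.
    Columns are scanned left to right over `duration` seconds.
    """
    n_rows, n_cols = len(image), len(image[0])
    freqs = row_frequencies(n_rows, f_low, f_high)
    n_samples = int(duration * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        col = min(n * n_cols // n_samples, n_cols - 1)  # column being scanned now
        s = 0.0
        for r in range(n_rows):
            grey = image[r][col]
            if grey:
                f = freqs[n_rows - 1 - r]               # top row gets the highest pitch
                s += (grey / (levels - 1)) * math.sin(2.0 * math.pi * f * t)
        pan = col / (n_cols - 1) if n_cols > 1 else 0.5  # 0 = far left, 1 = far right
        samples.append(((1.0 - pan) * s, pan * s))
    return samples
```

A single bright point in the top left corner then gives a short high-pitched tone at the start of the soundscape, in the left channel only. A real implementation would also normalize the output level and smooth the column transitions, which this sketch leaves out.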
Click somewhere within the parked car image: The example image will disappear and your first point will appear, together with the soundscape of this single point on the black drawing canvas. You can at any time pick one of the 16 grey tones from the palette on the right to draw whatever you like, and even add shading. The vOICe applet will incrementally update the soundscape for the changes you make. The right mouse button acts like an <Undo> button, removing point by point what you have drawn with the left mouse button. With the <Cursor> checkbox you can activate a sound cursor, indicating the position of the graphical cursor in your sound design.
Animation: When the applet is set up for multiple soundscape frames (NFRAME>1), the left and right arrow keys (left/right cursor keys) can be used to cycle through each of the image frames, such that you can draw each of them in turn, while the <Animate> checkbox starts and stops the image and soundscape animation. Note that this will cycle through snapshots of the image frames as stored when they were last drawn. Therefore, they need not reflect the actual current state of the applet, unless you first manually cycle through each frame with the left or right arrow keys to redraw them before starting the animation through the checkbox. By following this same procedure in the wave display mode, you can also animate the wave display as a sequence of frames to view frame-by-frame changes in the acoustic waveform.
Some parameter settings are not yet functional: this is indicated by ``N.A.'' (Not Available).
You can sonify an existing image by typing the http:
address where the image is stored, and
then pressing the <Return> key or the <Load from> button.
However, due to Java security restrictions, the applet may normally only load files from the
site from which the applet itself was loaded! That is, this site. If you wish to sonify
your own local images, you need to install the applet on your own
local machine first, or use the upload feature.
See also the text box on the right.
With the <OIC> checkbox you can specify that auditory image enhancement should be tried while loading an image. This activates an experimental algorithm that is likely to change in future versions. It detects edges and aims to boost perceptually relevant features at the expense of the less relevant ones.
With the <NEG> checkbox you can specify that the negative of an image should be loaded. Sonifying (auralizing) an entire image at once is a major computational task, so this may take a while! You can track the progress in the status bar. Also note that an imported image is mapped into a square area, which will therefore distort the aspect ratio for non-square images. However, this does not matter for the soundscape representation, because there simply is no particular ``correct'' aspect ratio: it all depends on our preferences for frequency range and conversion time.
arti1.gif
arti2.gif
arti3.gif
arti_gui.gif
artiarch.gif
artitalk.gif
arti_abc.gif
arti_car.gif
artiface.gif
artigirl.gif
The two isolated bright pixels in artigirl.gif help to perceive the chin and nose: the audible repetition of tones at two specific pitches implies that the chin and nose can indeed be heard, where without the extra pixels your untrained brain may have difficulty focusing attention on such details in a complicated soundscape. In general, you will find that your brain receives much more information than it is normally aware of, and little tricks like the added pixels help to prove that. The important conclusion is that the cochlea and peripheral hearing system are often not the bottleneck in untrained soundscape perception. Learning to properly listen is, so... practise!
Just for fun: You might try sonifying the image artitalk.gif to get something The vOICe was really not meant for, but still... rudimentary voice synthesis!
George Dillon from the University of Washington has created a beautiful collection of phonetic samples, putting The vOICe sonification within the realm of speech synthesis and analysis. Definitely worth a visit!
Other images that you could try include arti1.gif, arti2.gif, arti3.gif, arti_gui.gif, artigirl.gif, arti_abc.gif, arti_car.gif, artiface.gif and artiarch.gif. Clicking on one of the images on the right in Netscape 3.0 or MSIE 3.0 will automatically instruct The vOICe applet to load and sonify the corresponding image.
Steve Kinzler from Indiana University has installed The vOICe applet to allow sonification of his picons (personal icons) archive.
The vOICe applet has been installed at various other sites.
Soundscape animations: Examples of The vOICe image and soundscape
animations that cycle through multiple video frames are given on the
animation page (fast Java engine required).
These examples include situations relating to orientation and mobility for the
blind. Wavetable synthesis is used for the applet soundscape frame sequencing.
Soundgraph: You can create an elementary multimodal ``soundgraph'' or auditory function plot
of your own mathematical function via the form below. Note that Java and JavaScript,
unlike C and C++, require a "Math." prefix in front of mathematical functions like sin,
cos, sqrt and pow (for example, Math.sin(x)). White noise can be added using the Math.random() function.
The <Fit> button optionally calculates a new y range for a given x range, such that the soundgraph will best fit the available plot area and frequency range. The <Plot> button generates the soundgraph.
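What fitting and plotting amount to can be sketched as follows. This is a rough Python illustration, not the applet's own code: the names column_x, fit_y_range and plot_soundgraph are made up for this sketch, and the 64 x 64 grid with grey level 15 as the brightest tone is assumed from the drawing canvas described at the top of this page.

```python
import math

def column_x(c, x_min, x_max, size=64):
    """x value sampled for plot column c (0 = leftmost column)."""
    return x_min + (x_max - x_min) * c / (size - 1)

def fit_y_range(f, x_min, x_max, size=64):
    """Smallest y range that contains f(x) at every sampled column."""
    ys = [f(column_x(c, x_min, x_max, size)) for c in range(size)]
    return min(ys), max(ys)

def plot_soundgraph(f, x_min, x_max, y_min, y_max, size=64, grey=15):
    """Return a size x size image (row 0 = top) with one bright point per column."""
    image = [[0] * size for _ in range(size)]
    if y_max <= y_min:
        return image                      # degenerate range: nothing to plot
    for c in range(size):
        y = f(column_x(c, x_min, x_max, size))
        if y_min <= y <= y_max:
            row_from_bottom = round((y - y_min) / (y_max - y_min) * (size - 1))
            image[size - 1 - row_from_bottom][c] = grey
    return image

# Example: one period of a sine, fitted to the full plot area.
lo, hi = fit_y_range(math.sin, 0.0, 2.0 * math.pi)
sine_image = plot_soundgraph(math.sin, 0.0, 2.0 * math.pi, lo, hi)
```

Because higher rows sound at higher pitch, the resulting soundscape of the sine rises, falls and rises again in pitch while scanning from left to right.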
You can also set, get or clear individual voicels via the following form to create a simple auditory data plot without accessing the applet's graphical user interface.
Synthetic vision training: You can train yourself in understanding
soundscapes by creating randomly placed objects via the form below, listening
to the soundscapes and trying to imagine what the corresponding image should
have been. Finally check your interpretation (mental image) by scrolling back
to the applet to look at the actual image. Don't get frustrated by this elementary
synthetic vision training program! Note that it is also hard to visually
remember and manually redraw a random collection of more than just a
few geometric shapes, because our brain doesn't like meaningless data.
Camera support:
If you are running Microsoft Windows, you can apply
The vOICe for Windows, an advanced and fully
integrated synthetic mental vision program, optimized for use with a USB webcam
or USB camera glasses,
netbook PC or Windows tablet, and stereo headphones.
The vOICe for Windows lets you hear your own changing visual environment
for the most realistic interactive feedback and immersive experience.
Imagine the potential of a wearable computer for the blind if it
turned out that people can indeed learn to understand arbitrary soundscapes! You can also try The vOICe for Android. All of this is aimed at artificial synesthesia through mental imagery.
As a Java alternative for the PC camera support, a local installation of The vOICe applet can be used together with third-party frame grabbing software. For the applet, the only (Java security) requirement is that the updated image files are written on the same site or computer where The vOICe applet is installed. With suitable access to the provider's file system, the automatically refreshed soundscapes could even be put on the web with The vOICe applet running in the visitor's browser page.
Classic auditory effects: The vOICe applet, owing to its generality, can also
be used to demonstrate many classic auditory effects like beats, combination tones,
critical band masking, forward and backward temporal masking, informational masking,
auditory streaming (auditory fusion versus segregation or fission), comodulation masking
release (CMR), modulation detection interference (MDI), auditory profile analysis,
the role of onset asynchrony, spectral change detection, etc.
To demonstrate beats, you can simply set FL=1000 and FH=1005 (via the <Reset> button to get to the menu, and the <Apply> button to apply your new settings) and then draw a bright horizontal line at the top of the drawing area and another one at the bottom (as in beat.gif), resulting in a 1005 - 1000 = 5 Hz audible beat.
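The arithmetic behind the beat follows from a standard trigonometric identity: sin(a) + sin(b) = 2 cos((a - b)/2) sin((a + b)/2). Two equally loud tones at 1000 Hz and 1005 Hz therefore sound like a single 1002.5 Hz tone whose loudness swells and fades five times per second. The short Python sketch below checks this numerically; the function names are made up for this illustration.

```python
import math

def two_tone(f1, f2, t):
    """Sum of two equally loud pure tones at time t (in seconds)."""
    return math.sin(2.0 * math.pi * f1 * t) + math.sin(2.0 * math.pi * f2 * t)

def beat_envelope(f1, f2, t):
    """Slowly varying amplitude of the sum: |2 cos(pi (f2 - f1) t)|."""
    return abs(2.0 * math.cos(math.pi * (f2 - f1) * t))

def beat_frequency(f1, f2):
    """Number of loudness maxima per second."""
    return abs(f2 - f1)

print(beat_frequency(1000.0, 1005.0))        # beats per second
print(beat_envelope(1000.0, 1005.0, 0.1))    # almost silent halfway between two maxima
```

With FL=1000 and FH=1005 the loudness maxima are 0.2 s apart, so a one second soundscape contains five beats.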
missfund.gif
shepard.gif
tritone.gif
If the partials in a complex tone form a harmonic series of some fundamental frequency, one may omit this fundamental, or even several harmonics of it, and still perceive the ``missing fundamental'' although there is no spectral energy associated with it. This is demonstrated when clicking on the missfund.gif image shown on the right, which generates several complex tones in quick succession with partials at multiples of 100 Hz, but some of these complex tones lack the 100 Hz fundamental, and yet you can in all cases still hear this low-frequency component. Moreover, when you switch to the wave display mode, you will see that even for the tones where the 100 Hz fundamental is missing, there is still a clearly observable periodicity with major peaks at 10 ms intervals - again corresponding to 1 / (10 ms) = 100 Hz, the missing fundamental!
The 100 Hz subharmonic periodicity will always occur when at least two harmonics of 100 Hz make up the complex tone, including for instance the particular case of random-starting-phase partials at 700, 800, 900 and 1000 Hz corresponding to the rightmost part of missfund.gif. Because the 100 Hz phenomenon is quite obvious in the time domain, this is an indication for the existence of ``periodicity pitch'' detection in the brain: the ability to derive pitch from repetition rate, being distinct from the cochlear spectral analysis and associated tonotopic mappings in the brain. Recently, more direct neurophysiological evidence for this kind of time domain processing has been found (e.g., Langner et al.).
A classic auditory paradox arises from the so-called Shepard tones (Shepard scale, original form published by R. Shepard in 1964), giving the auditory illusion of ever-ascending pitch through cosinusoidal loudness filtering of the partials in a complex tone (musical chord) being swept in musical semitone steps. This is an auditory equivalent of M.C. Escher's visual illusion of endless stairs in his famous 1960 Ascending-Descending lithograph. The vOICe applet creates a Shepard scale and synthesizes the Shepard tones for you when you click on the shepard.gif image.
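Why the Shepard scale can keep ascending without ever getting anywhere is easiest to see from the partials themselves. The Python sketch below is one possible construction, not necessarily the one used for shepard.gif: the partials are spaced one octave apart, the loudness of each partial follows a raised cosine over the logarithmic frequency range, and the 50 Hz lower bound, the eight octaves and the function name shepard_partials are assumptions made for this illustration.

```python
import math

def shepard_partials(step, f_min=50.0, n_octaves=8, steps_per_octave=12):
    """Return (frequency, amplitude) pairs for one tone of a Shepard scale.

    `step` counts semitone steps; after steps_per_octave steps the
    partials are back where they started.
    """
    frac = (step % steps_per_octave) / steps_per_octave
    partials = []
    for octave in range(n_octaves):
        position = (octave + frac) / n_octaves          # 0..1 over the log-frequency range
        frequency = f_min * 2.0 ** (octave + frac)
        amplitude = 0.5 * (1.0 - math.cos(2.0 * math.pi * position))
        partials.append((frequency, amplitude))
    return partials

# Each step raises every partial by one semitone ...
first, second = shepard_partials(0), shepard_partials(1)
# ... yet twelve steps later the tone is identical to the first one.
same_again = shepard_partials(12) == shepard_partials(0)
```

The partials fade in inaudibly at the low end and fade out again at the high end, so there is never a noticeable jump back down.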
Another auditory paradox clickable on the right is Diana Deutsch's tritone paradox, where two subsequent complex tones are by some people perceived as showing ascending pitch, while other people consider them to have descending pitch. The two complex tones, here D and G#, differ by half an octave in the frequencies of their harmonic partials, but the amplitude distribution of the partials leaves an ambiguity w.r.t. the octave to which each complex tone as a whole belongs. With some effort you can hear both variants (ascending and descending pitch), but the ``preferred'' choice appears to depend even on the world region where you grew up.
Both paradoxes simply exploit the fact that perceived pitch is for pure tones a monotonically increasing function of frequency, but for complex tones the brain has to decide which of the partials contribute (nonlinearly) to what extent to form a single perceived pitch. That pitch may even lie outside the frequency range spanned by the partials, as in the case of the missing fundamental.
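The time domain periodicity of the missing fundamental can be checked with a small Python sketch. It builds a complex tone from random-starting-phase partials at 700, 800, 900 and 1000 Hz and then searches for the shortest shift after which the waveform repeats itself. The 8000 Hz sample rate, the 0.1 s length, the 0.999 threshold and the function names are assumptions chosen to keep the example small; this is a crude autocorrelation test, not a model of pitch perception.

```python
import math
import random

def complex_tone(partials_hz, sample_rate=8000, n_samples=800, seed=0):
    """Sum of equal-amplitude sine partials with random starting phases."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in partials_hz]
    return [sum(math.sin(2.0 * math.pi * f * n / sample_rate + p)
                for f, p in zip(partials_hz, phases))
            for n in range(n_samples)]

def repetition_period(signal, threshold=0.999):
    """Smallest lag (in samples) at which the circular autocorrelation
    comes back to (almost) its zero-lag value, or None if there is none."""
    n = len(signal)
    r0 = sum(x * x for x in signal)
    for lag in range(1, n):
        r = sum(signal[i] * signal[(i + lag) % n] for i in range(n))
        if r >= threshold * r0:
            return lag
    return None

tone = complex_tone([700.0, 800.0, 900.0, 1000.0])
lag = repetition_period(tone)
print(lag, "samples, i.e.", 8000 / lag, "Hz repetition rate")
```

The waveform repeats every 10 ms, which corresponds to 100 Hz, even though the lowest partial actually present is at 700 Hz.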
On-line composing: Be your own musical composer, by first selecting a musical scale as well as a desired sound duration T and number of notes N, and then drawing notes at appropriate positions.
Music player and generative art: You can try The vOICe
bach.gif - Requires fast Java engine. Music by Johann S. Bach (1685-1750). |
sierpinski.gif |
Game of Life? You can also play with sonification of John Conway's Game of Life using The vOICe.
You can create a spectrogram of an existing sound by typing the http:
address where the sound is stored, and then pressing the <Return> key or the
<Load from> button. In other words, you now specify the URL of a .wav or .au
sound file instead of a .gif or .jpg image file. Note that calculating a spectrogram (sonogram, sonagram)
for a nonlinear frequency distribution is a very CPU intensive task, and the spectrum analyzer may therefore
take some time to complete. Also, when loading across the Internet, the applet may seem to freeze, while it is
actually waiting for the sound file to arrive, so be patient: when nothing seems to happen, it may just be
a slow connection. See also the Java security issues as discussed above in the context of loading image files:
the same restrictions apply to loading sound files that do not reside on this site.
With the <NEG> checkbox you can specify that the negative spectrogram
(white is silence) should be generated.
As an example, you could try creating a spectrogram for the sound file arti2.wav
and observe that the spectral analysis result indeed resembles the arti2.gif image, even though the default applet frequency interval is now [500 Hz, 4 kHz] instead of the [500 Hz, 5 kHz] range used in generating the sound file with the simple arti2.c ANSI C program!
You can also create a spectrogram of a human speech sample, like philres.wav.
When clicking the above link, the temporal resolution is increased at the expense of frequency resolution by using a large number of columns N, leading to short time windows. Thus one can now observe the low-frequency periodicity of vowel fundamentals in the spectrogram too, where this would otherwise only be apparent in the wave display mode.
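The trade-off can be put into numbers with a rule of thumb: a time window of length W cannot distinguish frequencies that lie much closer together than about 1/W. The Python sketch below assumes that the sound of duration T is simply cut into N equal, non-overlapping windows, which need not be exactly what the applet does; the function names and the example values (T = 1 s, a 100 Hz vowel fundamental) are illustrative only.

```python
def spectrogram_resolution(duration_s, n_columns):
    """Return (window length in s, approximate frequency resolution in Hz)."""
    window_s = duration_s / n_columns
    return window_s, 1.0 / window_s

def shows_periodicity(duration_s, n_columns, fundamental_hz):
    """True if a window is shorter than one period of the fundamental,
    so that successive columns can follow the individual periods."""
    window_s, _ = spectrogram_resolution(duration_s, n_columns)
    return window_s < 1.0 / fundamental_hz

print(spectrogram_resolution(1.0, 64))     # long windows, fine frequency detail
print(spectrogram_resolution(1.0, 256))    # short windows, coarse frequency detail
print(shows_periodicity(1.0, 256, 100.0))
```

Under these assumptions, 64 columns for one second of sound give windows longer than the 10 ms period of a 100 Hz fundamental, whereas 256 columns give windows short enough to follow it.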
The .wav files can be mono or stereo, using either 8 or 16 bits per sample per channel. Other, more exotic, .wav variants are not supported and will be rejected with a message on the Java console. The .au files may use the common Java/Web µ-law (companded 8-bit 8kHz mono) format, but µ-law stereo and 8-bit and 16-bit linear mono and stereo Sun formats are also supported by The vOICe applet.
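If you want to check beforehand whether one of your own .wav files falls within these limits, the header can be inspected with the wave module from the Python standard library. The sketch below only mirrors the rule stated above (mono or stereo, 8 or 16 bits per sample); the helper names is_supported_wav and make_wav are made up for this illustration, and the applet itself may apply further checks.

```python
import io
import wave

def is_supported_wav(fileobj):
    """True for uncompressed mono/stereo .wav data with 8 or 16 bits per sample."""
    try:
        with wave.open(fileobj, "rb") as w:
            return w.getnchannels() in (1, 2) and w.getsampwidth() in (1, 2)
    except (wave.Error, EOFError):
        return False          # not a readable PCM .wav file

def make_wav(n_channels, sample_width, frame_rate=8000, n_frames=8):
    """Build a tiny silent .wav file in memory, for trying out the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(n_channels)
        w.setsampwidth(sample_width)      # bytes per sample: 1 = 8 bits, 2 = 16 bits
        w.setframerate(frame_rate)
        w.writeframes(bytes(n_channels * sample_width * n_frames))
    buf.seek(0)
    return buf

print(is_supported_wav(make_wav(2, 2)))   # 16-bit stereo
print(is_supported_wav(make_wav(1, 3)))   # 24-bit mono
```

For a file on disk, pass a file object opened in binary mode, for example open("mysound.wav", "rb"), where the file name is of course just a placeholder.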
Note that even the classic spectrogram can itself be viewed as an attempt to design a useful artificial cross-modal mapping (from the auditory to the visual domain), albeit mostly for research purposes.
Browser/Java problems: If you experience problems with your browser or with running the Java applet/application, visit the browser limitations and/or Java limitations page for a discussion and possible solutions.
Bug reports? If you experience problems running the applet under Netscape 3.0 or MSIE 3.0 (or later versions), or running the stand-alone application, please report this with a note describing the problem, the computer/processor, the operating system and browser version you use, as well as the version number shown in the applet/application window. The problem will then hopefully soon be fixed (if it can be reproduced). Of course, you may also drop a note just to say that things work fine!
Site license: Web site developers can obtain a free license and instructions for installing a copy of The vOICe applet on their own site for non-commercial use.
Source code: The vOICe Java source code is not available. However, the hificode.c ANSI C source code for generating .wav sound files (generally much better than companded µ-law .au files) from N × N pixel images is now available at the software examples, including 44.1 kHz 16-bit full stereo hifi sound generation (CD-quality), accounting for interaural time delays and head-masked loudness levels. These examples lack the easy user-interface of the Java applet, but The vOICe Java application lets you write the appropriate image data to the console, which may in turn be redirected to a file for later use in other programs.