MP3 Tutorial

Index

What is MP3?
The advantages of MP3
The disadvantages of MP3
What can MP3 be used for?
How does MP3 work?
History
Digital audio basics
MP3 compression
The compression rate
ID3 tags

What is MP3?

MP3 is currently the most important standard for compressing audio files. Normal audio files in PCM or WAV format are usually very large; as MP3 files they are greatly reduced in size and pared down to their essentials. Alongside MP3 there are other compression procedures, such as Real Audio, MSAudio or the MP3 predecessors MPEG Layer 1 and 2. All of these procedures - MP3 included - involve a loss of information. What sets MP3 apart, however, is that these losses are kept within limits: they affect mainly the volume of data and much less the sound quality. The overall sound of the compressed music is "practically" indistinguishable from the original.

The advantages of MP3

MP3 files are so small that they can be transferred over the Internet without long download times, even when the network is congested.
The capacity of storage media is multiplied several times over with MP3. A CD filled with MP3 files can hold up to 10 hours of music.
MP3 files can be played on any computer with an MP3 decoder and soundcard or on a portable MP3 player.
Along with audio, MP3 files allow the coding of other types of information, such as information on the artist, the songs on the CD or on the style of music. This makes them optimal for archiving music.
The sound quality is so good that it hardly differs from that of conventional audio CDs.

The disadvantages of MP3

There are differing schools of thought on whether the sound quality of MP3 files "basically does not" differ from that of conventional audio CDs or whether the differences are clearly audible. Depending on the listener's demands, imagination and stereo system, real or imagined details and nuances can be noticed, so that the original on the CD may indeed sound better than the MP3 compression. For joggers with portable MP3 players, such differences will probably be as unimportant as they are for surfers who rummage through the Internet looking for new songs and sounds.

What can MP3 be used for?

MP3 turns the computer into a song archive. A normal audio CD contains a mountain of data - over 650 megabytes. When a digital sound archive is compiled from audio CDs, even large hard drives fill up quickly. Using MP3, the mountain of data shrinks to about a twelfth of its original size: each minute of audio produces only about 1 megabyte of data. Complete CD collections can now be copied onto a hard drive and saved in the MP3 format with great savings in storage space.
MP3 turns the computer into a sound carrier and playback station. Instead of time-consuming juggling of CDs, playlists can be created, changed, stored and played on screen. Unwanted songs can simply be struck from the playlist, whereas CDs force the listener to skip past these songs using the remote control or to listen to everything. Records and cassettes have to be turned over, CDs have to be changed - MP3 songs can be played for as long and as often as desired, limited only by how many MP3 files fit on the hard drive.
MP3 turns the Internet into a music market. Tens of thousands of musicians without a recording contract make their music available on the net in the MP3 format. Tens of thousands of commercially produced songs or CDs can be found and downloaded from the Internet for a fee. Hundreds of thousands of titles are put up on the net illegally by fans. Conventional audio files are too large for the Internet and are simply unsuitable for online data transfer. MP3 allows anyone to retrieve songs from the net, send them by e-mail or save them on a home computer. Anyone with a CD writer can make individual audio CDs at home using MP3 songs, much to the displeasure of the recording industry.
MP3 turns the Internet into a forum for novice and experienced artists. Anyone with access to an MP3 recorder can put self-produced pieces in the MP3 format up on the Internet for review. Such publications require only an individual homepage or a small amount of storage space in one of the large Internet MP3 song archives. MP3 thus undermines the traditional significance of record companies and radio stations. A completely new market arises, one which allows direct access to anyone with a modem.

How does MP3 work?

History

     MP3 (or MPEG Layer 3) is a further development of the older compression procedures MPEG Layer 1 and 2. MPEG stands for "Moving Picture Experts Group" and refers to a working group that operates under the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) to develop standards for video and audio encoding.
Development of MP3 was begun jointly in 1987 by the Fraunhofer Institute (IIS) and the University of Erlangen. Psychoacoustic methods, rather than physical ones, played a major role in its development. A number of specially trained test listeners heard alternating original and compressed versions and, based on their responses, the algorithms were improved so that the difference between original and compressed versions steadily diminished while the compression rate increased. The decisive factor is thus the listener's direct impression, the human ear. In this manner the MP3 format came about step by step, allowing a data reduction of 1:10 to 1:12 as compared to uncompressed audio files. With MP3, normal audio files are "encoded". Thus, an MP3 encoder is needed to create MP3 files. Conversely, when MP3 files are played, they must be "decoded" again into audio data. This is done by the decoder.

Digital audio basics

     Each digital audio recording transforms analog audio signals - sounds, music, singing, and speech - into digital files, which can be stored and processed in a computer. The device which is used to digitalise the audio signals is already built into most sound cards and is aptly called an analog-digital converter, often abbreviated to A-to-D or A/D. In order to reconvert the digital signal into analog sounds, the sound card contains a digital-analog converter, or D/A converter, which is connected by cable to an amplifier or to active speakers. This is similar to the converter found in every CD player, which must also convert digitalised audio files on the CD back into audible, analog sounds. In order to record sounds, the A/D converter takes samples of the sound to be digitalised at fixed intervals by measuring the voltage level of the signal.
The frequency of the sampling is called the sample rate and typically lies in the kHz range, that is, several thousand samples per second. The higher the sample rate, the more samples are recorded by the A/D converter, and the closer the digital conversion comes to the original.
The precision with which the A/D converter measures the voltage level of the analog signal is determined by the sample resolution. The same principle applies here: The finer the resolution, the better and more natural the digital conversion. A "normal" digitalised audio signal without data reduction often consists of 16-bit samples with a sample rate of 44.1 kHz. This is the form in which they are stored on audio CDs. Some sound and recording cards offer even higher resolution (18-24 bits) - this enables an even better digital sound reproduction.
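
The interplay of sample rate and sample resolution described above can be illustrated with a small sketch. It is not how a sound card works internally; it merely assumes a 1 kHz sine tone as a stand-in for the analog voltage, and the function name sample_and_quantise is made up for this example.

```python
import math

def sample_and_quantise(freq_hz, sample_rate_hz, bits, duration_s):
    """Measure a sine tone at fixed intervals and round each value to `bits` of resolution."""
    # Largest positive value of a signed sample, e.g. 32767 for 16 bits.
    max_level = 2 ** (bits - 1) - 1
    count = int(sample_rate_hz * duration_s)
    samples = []
    for n in range(count):
        t = n / sample_rate_hz                          # time of the n-th measurement
        voltage = math.sin(2 * math.pi * freq_hz * t)   # assumed analog level, -1.0 to 1.0
        samples.append(round(voltage * max_level))      # rounding is the quantisation step
    return samples

# One millisecond of a 1 kHz tone at CD-like settings and at a coarser resolution.
cd_like = sample_and_quantise(1000, 44100, 16, 0.001)
coarse = sample_and_quantise(1000, 44100, 8, 0.001)

print("samples per millisecond:", len(cd_like))
print("16-bit peak:", max(cd_like), " 8-bit peak:", max(coarse))
```

With 8 bits only 255 distinct levels are available instead of 65,535, so each rounded value can lie much further from the true voltage; this rounding error is what is heard as noise at low resolutions.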

MP3 compression

     Uncompressed audio material in stereo and CD quality generates a great volume of data because of the high resolution and sample rate - approx. 10 MB per minute. In order to save memory or to reduce the data rate for transfers, it is possible simply to reduce the overall resolution and sample rate, for example from 16 bits down to 8 bits or from 44.1 kHz to 22 kHz. The sound losses in such a procedure, however, are clearly audible. The idea behind the MPEG compression procedure is that not all frequency ranges are reduced in resolution and sample rate, but only those ranges which carry "inaudible" audio information, thus leaving other, relevant ranges unchanged.
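
The figure of roughly 10 MB per minute follows directly from the sample rate, the resolution and the two stereo channels. The sketch below does the arithmetic; the function name is invented for this example, and 22 kHz is taken to mean 22,050 Hz (half of 44.1 kHz), which is an assumption.

```python
def pcm_bytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Bytes produced by one minute of uncompressed PCM audio."""
    bits_per_second = sample_rate_hz * bits_per_sample * channels
    return bits_per_second * 60 // 8

# CD quality: 44.1 kHz, 16 bits, stereo.
cd_quality = pcm_bytes_per_minute(44100, 16, 2)

# The crude reduction mentioned above: 8 bits at an assumed 22,050 Hz, stereo.
reduced = pcm_bytes_per_minute(22050, 8, 2)

print("CD quality:", cd_quality, "bytes per minute")
print("reduced:   ", reduced, "bytes per minute")
print("saving factor:", cd_quality / reduced)
```

Halving both the resolution and the sample rate saves only a factor of four and is clearly audible, whereas MP3 reaches a factor of ten to twelve with far less audible damage.
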
The MPEG encoders thus remove "superfluous" inaudible sounds from the audio material. For example, next to a loud sound of a certain frequency, quieter sounds at neighbouring frequencies are no longer perceptible. They are "drowned out". This is known as the "masking effect".
In addition, there is redundant information, such as identical samples which carry the same information and are thus unnecessary. Such data can be removed without any audible losses.
In order to take advantage of the masking effect, the entire frequency spectrum of the audio material is first divided into various partial ranges using a bank of filters. With MPEG Layer 1 and 2, 32 partial ranges are separated; with MP3, 576 partial ranges are separated from each other. The individual partial ranges are then processed and reduced separately. The original 16-bit resolution is reduced - depending on the frequency range - by 2 to 15 bits. The psychoacoustic model determines which frequency ranges are to be reduced and to what extent. This is derived directly from the sound impressions of the test listeners. In order to eliminate redundancies, an additional Huffman encoding is employed, which detects and removes the redundant bits in the data stream.
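
The principle behind Huffman encoding can be shown with a short sketch: values that occur often receive short bit patterns, rare values receive longer ones. This is only the general technique, not the MP3 procedure itself; real MP3 encoders work with predefined code tables, whereas this example builds its table from the data. The function name and the sample values are made up.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get the shortest bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: code so far}).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two rarest groups; one gets a leading 0, the other a leading 1.
        c1, _, left = heapq.heappop(heap)
        c2, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical reduced sample values: after the resolution has been lowered,
# small values such as 0 tend to dominate.
values = [0, 0, 0, 0, 0, 0, 1, 1, -1, 2]
code = huffman_code(values)
encoded = "".join(code[v] for v in values)

print(code)
print(len(encoded), "bits instead of", 2 * len(values), "with a fixed 2-bit code")
```

No code word is the beginning of another, so the bit stream can be decoded again without separators.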

The compression rate

     Encoders reduce and encode the original audio signal, a WAV file, for example. Decoders decode the compressed MPEG file, so that a WAV file can be created again at the end of the process. This WAV file is not identical with the original WAV file - data is lost during compression; the process is not free of loss.

There are various MP3 encoders, which can "trim" the same files in somewhat different ways and, above all, at differing speeds. The MP3 encoders undergo continuous further development and their algorithms are being improved. The most important aspect of compression is the compression rate: the stronger the compression, the greater the audible change to the original audio material. MP3 encoders usually offer a choice of bit rates from 8 to 320 kBit/s. An MP3 compression at 128 to 112 kBit/s compresses the original material in a ratio of about 1:10 to 1:12 and produces very good results.
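
The relationship between bit rate and compression ratio can be checked with simple arithmetic. The sketch below assumes CD-quality stereo as the original (44.1 kHz, 16 bits, 2 channels) and takes 1 kBit to mean 1000 bits; the function names are invented for this example.

```python
# Data rate of the uncompressed original in kBit/s (assumed: CD-quality stereo).
CD_KBIT_PER_S = 44100 * 16 * 2 / 1000

def compression_ratio(mp3_kbit_per_s):
    """How many times smaller the MP3 stream is than the CD-quality original."""
    return CD_KBIT_PER_S / mp3_kbit_per_s

def mp3_bytes_per_minute(mp3_kbit_per_s):
    """Bytes needed for one minute of MP3 audio at a constant bit rate."""
    return mp3_kbit_per_s * 1000 * 60 // 8

for rate in (320, 128, 112, 8):
    print(rate, "kBit/s -> ratio 1:%.1f," % compression_ratio(rate),
          mp3_bytes_per_minute(rate), "bytes per minute")
```

At 128 kBit/s this gives a little under 1 megabyte per minute, which matches the rule of thumb used earlier in this tutorial.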

ID3 tags

     MP3 files transport not only audio information, but also additional information about the encoded piece of music. This purpose is served by the so-called "ID3 tags". These are file attachments in which an encoder can enter standardised information. The ID3 tags are recognised by the decoders and displayed by the MP3 players as music information. An ID3 tag of this kind (ID3 version 1) is always attached at the end of an MP3 file and is 128 bytes long. The tag begins with the identifier TAG (length 3 bytes, offset 0), followed by the song title (length 30 bytes, offset 3), the artist's name (length 30 bytes, offset 33), the title of the CD (length 30 bytes, offset 63), the year of publication (length 4 bytes, offset 93), an optional comment (length 30 bytes, offset 97) and the genre identifier (length 1 byte, offset 127).
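
The byte layout listed above translates almost directly into code. The following is a minimal sketch of a reader, not a complete tag library. It assumes that unused bytes in the text fields are filled with zero bytes or spaces and that the text is Latin-1 encoded; the function names and the sample values are made up for this example.

```python
import struct

# Field lengths exactly as listed above: 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes.
ID3_LAYOUT = struct.Struct("=3s30s30s30s4s30sB")

def _text(raw):
    """Cut a fixed-width field at the first zero byte and strip padding spaces (assumed padding)."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

def parse_id3_tag(tag_bytes):
    """Return the tag fields as a dict, or None if the bytes are not a 128-byte TAG block."""
    if len(tag_bytes) != ID3_LAYOUT.size:
        return None
    ident, title, artist, album, year, comment, genre = ID3_LAYOUT.unpack(tag_bytes)
    if ident != b"TAG":
        return None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre": genre,          # a one-byte numeric code
    }

def read_id3_tag(path):
    """Read the last 128 bytes of a file and try to parse them as a tag."""
    with open(path, "rb") as f:
        f.seek(0, 2)             # jump to the end to learn the file size
        if f.tell() < ID3_LAYOUT.size:
            return None
        f.seek(-ID3_LAYOUT.size, 2)
        return parse_id3_tag(f.read(ID3_LAYOUT.size))

# Build a tag in memory to show the layout without needing a real MP3 file.
example_tag = ID3_LAYOUT.pack(b"TAG", b"Example Song", b"Example Artist",
                              b"Example Album", b"1999", b"made-up comment", 17)
print(parse_id3_tag(example_tag))
```

For a real file, read_id3_tag("song.mp3") would be called with the path of an existing MP3 file; the file name here is only a placeholder.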