What is MP3?

     MP3 is currently the most important standard for compressing audio files. Uncompressed audio files in PCM or WAV format are usually very large; as MP3 files they shrink considerably and are reduced to their essentials. Alongside MP3 there are other compression methods, such as Real Audio, MSAudio or the MP3 predecessors MPEG Layer 1 and Layer 2. All of these methods - MP3 included - involve a loss of information. What distinguishes MP3, however, is that these losses are kept within limits: they affect mainly the volume of data and much less the perceived sound quality. The overall sound of the compressed music is "practically" indistinguishable from the original.

The advantages of MP3

The disadvantages

     There are differing schools of thought on whether the sound quality of MP3 files "basically does not" differ from that of conventional audio CDs or whether the differences are clearly audible. Depending on the listener's demands, degree of imagination and stereo system, real or imagined details and nuances can be noticed, so that the original on the CD may indeed sound better than the MP3-compressed version. For joggers with portable MP3 players, such differences will probably be as unimportant as they are for surfers who rummage through the Internet looking for new songs and sounds.

What can MP3 be used for?

How does MP3 work?

History

     MP3 (or MPEG Layer 3) is a further development of the older compression methods MPEG Layer 1 and Layer 2. MPEG stands for "Moving Picture Experts Group" and refers to a working group that cooperates with the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) to develop standards for video and audio encoding.
Development of MP3 was begun jointly in 1987 by the Fraunhofer Institute (IIS) and the University of Erlangen. Psychoacoustic methods, rather than purely physical ones, played a major role in its development. A number of specially trained test listeners compared alternating original and compressed versions and, based on their responses, the algorithms were refined so that the audible difference between original and compressed versions steadily shrank while the compression rate increased. The decisive factor is thus the listener's direct impression, the human ear. In this manner the MP3 format came about step by step, allowing a data reduction of 1:10 to 1:12 compared with uncompressed audio files. With MP3, normal audio files are "encoded". Thus, an MP3 encoder is needed to create MP3 files. Conversely, to listen to MP3 files, they must be "decoded" back into audio data. This is done by the decoder.

Digital audio basics

     Every digital audio recording transforms analog audio signals - sounds, music, singing, and speech - into digital data, which can be stored and processed in a computer. The device used to digitise the audio signals is already built into most sound cards and is aptly called an analog-digital converter, often abbreviated as A-to-D or A/D. To convert the digital signal back into analog sound, the sound card contains a digital-analog converter, or D/A converter, which is connected by cable to an amplifier or to active speakers. This is similar to the converter found in every CD player, which must also convert the digitised audio data on the CD back into audible, analog sound. To record sound, the A/D converter takes samples of the signal at fixed intervals by measuring its voltage level.
The frequency of the sampling is called the sample rate and lies in the kHz range, i.e. samples are taken several thousand times per second. The higher the sample rate, the more samples the A/D converter records per second and the higher the frequencies the recording can capture, bringing the digital version closer to the original.
The precision with which the A/D converter measures the voltage level of the analog signal is determined by the sample resolution. The same principle applies here: the finer the resolution, the better and more natural the digital conversion. A "normal" digitised audio signal without data reduction often consists of 16-bit samples at a sample rate of 44.1 kHz. This is the form in which audio is stored on CDs. Some sound and recording cards offer even higher resolutions (18-24 bits), which enables even better digital sound reproduction.
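
The interplay of sample rate and sample resolution can be illustrated with a small sketch. It is not how a sound card works internally; it merely simulates an idealised A/D converter on a pure sine tone. The function names `sample_sine` and `worst_error` and the 440 Hz test tone are assumptions chosen for illustration.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples, bits):
    """Sample a full-scale sine tone and round each sample to a signed
    integer with the given resolution (e.g. -32767..32767 for 16 bits)."""
    max_level = 2 ** (bits - 1) - 1
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # time of the n-th sample
        value = math.sin(2 * math.pi * freq_hz * t)  # ideal analog value, -1..1
        samples.append(round(value * max_level))     # quantised measurement
    return samples

def worst_error(freq_hz, sample_rate_hz, n_samples, bits):
    """Largest difference between the ideal value and its quantised
    version, expressed on the -1..1 scale."""
    max_level = 2 ** (bits - 1) - 1
    quantised = sample_sine(freq_hz, sample_rate_hz, n_samples, bits)
    worst = 0.0
    for n, q in enumerate(quantised):
        exact = math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        worst = max(worst, abs(exact - q / max_level))
    return worst

# 10 ms of a 440 Hz tone at the CD sample rate, at two resolutions
err_8 = worst_error(440, 44100, 441, 8)
err_16 = worst_error(440, 44100, 441, 16)
print(err_8, err_16)
```

The 16-bit rounding error comes out far smaller than the 8-bit one, which is the sense in which a finer resolution gives a more faithful conversion.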

MP3 compression

     Uncompressed stereo audio in CD quality generates a large volume of data because of its high resolution and sample rate - approx. 10 MB per minute. To save storage space or to reduce the data rate for transfers, it is possible simply to reduce the overall resolution and sample rate, for example from 16 bits down to 8 bits or from 44.1 kHz to 22.05 kHz. The loss of sound quality with this approach, however, is clearly audible. The idea behind MPEG compression is not to reduce the resolution uniformly across all frequency ranges, but only in those ranges that carry "inaudible" audio information, leaving the other, relevant ranges unchanged.
The MPEG encoders thus remove "superfluous", inaudible sounds from the audio material. For example, next to a loud sound at a certain frequency, quieter sounds at neighbouring frequencies are no longer perceptible. They are "drowned out". This is known as the "masking effect".
In addition, there is redundant information, such as identical samples, which carries the same information and is thus unnecessary. Such data can be removed from the overall data without any audible loss.
To take advantage of the masking effect, the entire frequency spectrum of the audio material is first divided into partial ranges using a filter bank. With MPEG Layer 1 and Layer 2, 32 partial ranges are separated; with MP3, 576 partial ranges are separated from each other. The individual partial ranges are then processed and reduced separately. The original 16-bit resolution is reduced - depending on the frequency range - by 2 to 15 bits. The psychoacoustic model determines which frequency ranges are reduced and to what extent. This model is derived directly from the sound impressions of the test listeners. To eliminate redundancies, Huffman encoding is additionally employed, which represents frequently occurring values with fewer bits in the data stream.
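
The figure of roughly 10 MB per minute follows directly from the sample rate, the resolution and the number of channels. The short calculation below is a sketch; the function name `pcm_bytes_per_minute` is made up for illustration.

```python
def pcm_bytes_per_minute(sample_rate_hz=44100, bits_per_sample=16, channels=2):
    """Size of one minute of uncompressed PCM audio in bytes."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return bytes_per_second * 60

cd_quality = pcm_bytes_per_minute()             # 16 bits, 44.1 kHz, stereo
reduced = pcm_bytes_per_minute(22050, 8, 2)     # 8 bits, 22.05 kHz, stereo

print(cd_quality)   # 10584000 bytes, i.e. about 10 MB
print(reduced)      # 2646000 bytes, a quarter of the CD-quality size
```

Halving both the resolution and the sample rate therefore saves only a factor of four, and at a clearly audible cost.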
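
The principle of Huffman encoding can be shown with a minimal sketch: values that occur often receive short bit patterns, rare values receive longer ones. The code below builds a code table from the data itself purely to demonstrate the idea; it makes no claim to reproduce the tables or bitstream layout of a real MP3 encoder, and the names `huffman_code` and `encode` as well as the sample values are assumptions.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a dict mapping each distinct symbol to a bit string."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, low = heapq.heappop(heap)    # the two least frequent groups
        w2, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(symbols, code):
    """Concatenate the bit strings of all symbols."""
    return "".join(code[s] for s in symbols)

# Hypothetical quantised values: small values dominate
values = [0] * 8 + [1] * 4 + [-1] * 2 + [2, -2]
code = huffman_code(values)
bits = encode(values, code)
print(len(bits))   # 30 bits instead of 48 with a fixed 3 bits per value
```

The saving is lossless: because no code word is a prefix of another, the original sequence of values can be recovered exactly from the bit string.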

The compression rate

     Encoders reduce and encode the original audio signal, a WAV file, for example. Decoders decode the compressed MPEG file, so that a WAV file can be created again at the end of the decoding procedure. This WAV file is not identical to the original WAV file - data is lost during compression; the process is lossy.

There are various MP3 encoders, which can "trim" the same files in somewhat different ways and, above all, at different speeds. MP3 encoders undergo continuous development and their algorithms are being improved. The most important aspect of compression is the compression rate: the stronger the compression, the greater the audible change to the original audio material. MP3 encoders usually offer a choice of bit rates from 8 to 320 kbit/s. An MP3 compression at 128 to 112 kbit/s compresses the original material in a ratio of about 1:10 to 1:12 and produces very good results.
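
The relation between bit rate and compression ratio can be checked with a few lines of arithmetic. The sketch assumes CD-quality stereo as the uncompressed reference and 1 kbit = 1000 bits; the function names are made up for illustration.

```python
# Bit rate of uncompressed CD audio: 44.1 kHz x 16 bits x 2 channels
CD_KBIT_PER_S = 44100 * 16 * 2 / 1000   # 1411.2 kbit/s

def compression_ratio(mp3_kbit_per_s):
    """How many times smaller the MP3 stream is than CD-quality PCM."""
    return CD_KBIT_PER_S / mp3_kbit_per_s

def mp3_size_bytes(mp3_kbit_per_s, duration_s):
    """Approximate size of the audio data at a constant bit rate."""
    return mp3_kbit_per_s * 1000 * duration_s // 8

print(round(compression_ratio(128), 1))   # 11.0
print(round(compression_ratio(112), 1))   # 12.6
print(mp3_size_bytes(128, 60))            # 960000 bytes per minute
```

At 128 kbit/s one minute of music thus occupies a little under 1 MB, compared with about 10 MB uncompressed.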

ID3 tags

     MP3 files carry not only audio information, but also additional information about the encoded piece of music. This purpose is served by the so-called "ID3 tags". These are file attachments in which an encoder can enter standardised information. The ID3 tags are recognised by decoders and displayed by MP3 players as music information. An ID3 tag of this original kind (ID3v1) is always attached at the end of an MP3 file and is 128 bytes long. The tag begins with the identifier TAG (length 3 bytes, offset 0), followed by the song title (length 30 bytes, offset 3), the artist's name (length 30 bytes, offset 33), the title of the CD (length 30 bytes, offset 63), the year of publication (length 4 bytes, offset 93), an optional comment (length 30 bytes, offset 97) and the genre identifier (length 1 byte, offset 127).
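
The fixed layout makes such a tag easy to read. The following is a minimal sketch using the offsets and lengths given above; the function name `parse_id3v1` is made up, and two details are assumptions not stated in the text: that text fields are Latin-1 encoded, and that unused bytes are padded with zero bytes or spaces.

```python
def parse_id3v1(data):
    """Read the 128-byte tag at the end of an MP3 file's bytes.
    Returns a dict of fields, or None if no tag is present."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[0:3] != b"TAG":
        return None

    def text(offset, length):
        raw = tag[offset:offset + length]
        # Assumption: fields are NUL- or space-padded Latin-1 text
        return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

    return {
        "title": text(3, 30),
        "artist": text(33, 30),
        "album": text(63, 30),
        "year": text(93, 4),
        "comment": text(97, 30),
        "genre": tag[127],          # numeric genre identifier
    }

def read_id3v1(path):
    """Convenience wrapper: parse the tag of a file on disk."""
    with open(path, "rb") as f:
        return parse_id3v1(f.read())
```

Calling `read_id3v1` with the path of an MP3 file would return a dictionary such as `{"title": ..., "artist": ..., ...}`, or `None` when the file does not end with a tag.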