ISO/IEC 11172-3:1993 - MP3编码标准翻译 (2)

Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 3: Audio

Section 1: General

1.1 Scope

This part of ISO/IEC 11172 specifies the coded representation of high quality audio for storage media and the method for decoding of high quality audio signals. The input of the encoder and the output of the decoder are compatible with existing PCM standards such as standard Compact Disc and Digital Audio Tape.

ISO/IEC 11172的本部分规定了高质量音频信号的编码表示方法，并制定了高质量音频信号的解码方法。编码器的输入和解码器的输出均与现有PCM标准（如标准激光唱片和数字音频磁带）兼容。

This part of ISO/IEC 11172 is intended for application to digital storage media providing a total continuous transfer rate of about 1,5 Mbit/s for both audio and video bitstreams, such as CD, DAT and magnetic hard disc. The storage media may either be connected directly to the decoder, or via other means such as communication lines and the ISO/IEC 11172 multiplexed stream defined in ISO/IEC 11172-1. This part of ISO/IEC 11172 is intended for sampling rates of 32 kHz, 44,1 kHz, and 48 kHz.

ISO/IEC 11172的本部分适用于数字存储介质，其音频和视频比特流的总连续传输速率约为1.5 Mbit/s（如CD、DAT和磁硬盘）。存储介质可直接连接解码器，或通过其他方式（如通信线路及ISO/IEC 11172-1定义的复用流）接入。本部分支持的采样率为32 kHz、44.1 kHz和48 kHz。

1.2 Normative references

The following International Standards contain provisions which, through reference in this text, constitute provisions of this part of ISO/IEC 11172. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this part of ISO/IEC 11172 are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below. Members of IEC and ISO maintain registers of currently valid International Standards.

以下国际标准包含的条款，通过在本文本中的引用，构成本部分ISO/IEC 11172的条款。在发布时，所标注的版本为有效版本。所有标准均可能修订，基于本部分达成协议的各方应核查是否可采用以下列出的最新版本标准。IEC和ISO成员机构维护当前有效国际标准的登记册。

ISO/IEC 11172-1:1993 Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 1: Systems.

ISO/IEC 11172-2:1993 Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 2: Video.

CCIR Recommendation 601-2 Encoding parameters of digital television for studios.

CCIR Report 624-4 Characteristics of systems for monochrome and colour television.

CCIR Recommendation 648 Recording of audio signals.

CCIR Report 955-2 Sound broadcasting by satellite for portable and mobile receivers, including Annex IV Summary description of Advanced Digital System II.

CCITT Recommendation J.17 Pre-emphasis used on Sound-Programme Circuits.

IEEE Draft Standard P1180/D2 1990 Specification for the implementation of 8x8 inverse discrete cosine transform.

IEC publication 908:1987 CD Digital Audio System.

Section 2: Technical elements

2.1 Definitions

For the purposes of ISO/IEC 11172, the following definitions apply. If specific to a part, this is noted in square brackets.

为符合ISO/IEC 11172的要求，以下定义适用。若特定于某一部分，将用方括号注明。

1 ~ 50

2.1.1 ac coefficient [video]: Any DCT coefficient for which the frequency in one or both dimensions is non-zero.

2.1.2 access unit [system]: In the case of compressed audio an access unit is an audio access unit. In the case of compressed video an access unit is the coded representation of a picture.

2.1.3 adaptive segmentation [audio]: A subdivision of the digital representation of an audio signal in variable segments of time.

2.1.4 adaptive bit allocation [audio]: The assignment of bits to subbands in a time and frequency varying fashion according to a psychoacoustic model.

2.1.5 adaptive noise allocation [audio]: The assignment of coding noise to frequency bands in a time and frequency varying fashion according to a psychoacoustic model.

2.1.1 ac coefficient[video]：指离散余弦变换（DCT）中任一频率分量在某一维度或两个维度上非零的系数。

2.1.2 access unit[system]：对于压缩音频，访问单元指一个音频访问单元；对于压缩视频，访问单元指一幅图像的编码表示。

2.1.3 adaptive segmentation[audio]：将音频信号的数字化表示划分为时间长度可变的片段。

2.1.4 adaptive bit allocation[audio]：根据心理声学模型，以时间和频率动态变化的方式向子带分配比特资源。

2.1.5 adaptive noise allocation[audio]：根据心理声学模型，以时间和频率动态变化的方式向频带分配编码噪声。

2.1.6 alias [audio]: Mirrored signal component resulting from sub-Nyquist sampling.

2.1.7 analysis filterbank [audio]: Filterbank in the encoder that transforms a broadband PCM audio signal into a set of subsampled subband samples.

2.1.8 audio access unit [audio]: For Layers I and II an audio access unit is defined as the smallest part of the encoded bitstream which can be decoded by itself, where decoded means "fully reconstructed sound". For Layer III an audio access unit is part of the bitstream that is decodable with the use of previously acquired main information.

2.1.9 audio buffer [audio]: A buffer in the system target decoder for storage of compressed audio data.

2.1.10 audio sequence [audio]: A non-interrupted series of audio frames in which the following parameters are not changed:

ID

Layer

Sampling Frequency

For Layer I and II: Bitrate index

2.1.6 alias [audio]：由次奈奎斯特采样引起的镜像信号分量。

2.1.7 analysis filterbank [audio]：编码器中用于将宽带PCM音频信号转换为一组子采样子带样本的滤波器组。

2.1.8 audio access unit [audio]：对于Layer I和II，音频访问单元定义为编码比特流中可独立解码的最小部分，其中“解码”指“完全重构声音”；对于Layer III，音频访问单元是比特流中可利用先前获取的主信息解码的部分。

2.1.9 audio buffer [audio]：系统目标解码器中用于存储压缩音频数据的缓冲区。

2.1.10 audio sequence [audio]：一系列未中断的音频帧，其中以下参数保持不变：ID,层级,采样频率,对于Layer I和II：比特率索引
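The alias definition in 2.1.6 can be illustrated with a small calculation. The sketch below is not part of the standard: it assumes an ideal sampler with no anti-aliasing filter, and the function name `alias_frequency` is hypothetical. It folds a tone at `freq_hz` back into the band from 0 to half the sampling frequency.

```python
def alias_frequency(freq_hz: float, sampling_hz: float) -> float:
    """Return the apparent frequency of a tone after ideal sampling.

    Assumes no anti-aliasing filter. A tone above sampling_hz / 2 is
    mirrored back into the range 0 .. sampling_hz / 2.
    """
    if sampling_hz <= 0:
        raise ValueError("sampling_hz must be positive")
    # Distance to the nearest integer multiple of the sampling frequency.
    nearest_multiple = round(freq_hz / sampling_hz) * sampling_hz
    return abs(freq_hz - nearest_multiple)


if __name__ == "__main__":
    for tone in (10_000, 30_000, 50_000):
        print(tone, "Hz sampled at 48 kHz appears at",
              alias_frequency(tone, 48_000), "Hz")
```

A tone below half the sampling frequency is returned unchanged, which corresponds to the Nyquist sampling condition of 2.1.97.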

2.1.11 backward motion vector [video]: A motion vector that is used for motion compensation from a reference picture at a later time in display order.

2.1.12 Bark [audio]: Unit of critical band rate. The Bark scale is a non-linear mapping of the frequency scale over the audio range closely corresponding with the frequency selectivity of the human ear across the band.

2.1.13 bidirectionally predictive-coded picture; B-picture [video]: A picture that is coded using motion compensated prediction from a past and/or future reference picture.

2.1.14 bitrate: The rate at which the compressed bitstream is delivered from the storage medium to the input of a decoder.

2.1.15 block companding [audio]: Normalizing of the digital representation of an audio signal within a certain time period.

2.1.11 backward motion vector [video]：用于从显示顺序中较晚出现的参考帧进行运动补偿的运动矢量。

2.1.12 Bark [audio]：临界频带率的单位。Bark刻度是一种频率轴的非线性映射，与人耳在整个频段内的频率选择性密切相关。

2.1.13 bidirectionally predictive-coded picture; B-picture [video]：通过从过去和或未来参考帧进行运动补偿预测编码的图像。

2.1.14 bitrate：压缩比特流从存储介质传输至解码器输入的速率。

2.1.15 block companding [audio]：在特定时间段内对音频信号的数字化表示进行归一化处理。
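As an illustration of the Bark scale in 2.1.12, the sketch below uses the Zwicker and Terhardt closed-form approximation. This formula is an assumption brought in for illustration only; it is not quoted from this part of ISO/IEC 11172, whose psychoacoustic models use their own tables. The function name `hz_to_bark` is hypothetical.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Approximate critical band rate (in Bark) for a frequency in Hz.

    Uses the Zwicker-Terhardt approximation (an assumption, not the
    standard's tables): z = 13*atan(0.00076 f) + 3.5*atan((f/7500)^2).
    """
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


if __name__ == "__main__":
    # The mapping is non-linear: equal steps in Hz give shrinking steps in Bark.
    for f in (100, 500, 1000, 4000, 16000):
        print(f"{f:>6} Hz -> {hz_to_bark(f):.2f} Bark")
```

The mapping is roughly linear at low frequencies and compresses towards the top of the audio range, which is the non-linearity the definition refers to.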

2.1.16 block [video]: An 8-row by 8-column orthogonal block of pels.

2.1.17 bound [audio]: The lowest subband in which intensity stereo coding is used.

2.1.18 byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8-bits from the first bit in the stream.

2.1.19 byte: Sequence of 8-bits.

2.1.20 channel: A digital medium that stores or transports an ISO/IEC 11172 stream.

2.1.16 block [video]：由8行×8列正交像元（pels）构成的块。

2.1.17 bound [audio]：启用强度立体声编码的最低子带。

2.1.18 byte aligned：若编码比特流中某一位的位置是从比特流中的第一个比特开始的8位整数倍，则该位为字节对齐。

2.1.19 byte：由8比特组成的序列。

2.1.20 channel：用于存储或传输ISO/IEC 11172比特流的数字媒介。
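The byte-alignment rule in 2.1.18 reduces to a modulo test on the bit position. The following is a minimal sketch; the helper names are hypothetical, and bit positions are assumed to be counted from 0 at the first bit of the stream.

```python
def is_byte_aligned(bit_position: int) -> bool:
    """True if the bit sits a multiple of 8 bits from the first bit (position 0)."""
    return bit_position % 8 == 0


def bits_to_next_byte_boundary(bit_position: int) -> int:
    """Number of bits to skip to reach the next byte-aligned position."""
    return (-bit_position) % 8


if __name__ == "__main__":
    for pos in (0, 5, 8, 13, 16):
        print(pos, is_byte_aligned(pos), bits_to_next_byte_boundary(pos))
```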

2.1.21 channel [audio]: The left and right channels of a stereo signal

2.1.22 chrominance (component) [video]: A matrix, block or single pel representing one of the two colour difference signals related to the primary colours in the manner defined in CCIR Rec 601. The symbols used for the colour difference signals are Cr and Cb.

2.1.23 coded audio bitstream [audio]: A coded representation of an audio signal as specified in this part of ISO/IEC 11172.

2.1.24 coded video bitstream [video]: A coded representation of a series of one or more pictures as specified in ISO/IEC 11172-2.

2.1.25 coded order [video]: The order in which the pictures are stored and decoded. This order is not necessarily the same as the display order.

2.1.21 channel [audio]：立体声信号的左声道与右声道。

2.1.22 chrominance (component) [video]：表示与CCIR Rec 601标准定义的原色相关的两种色差信号之一的矩阵、块或单个像元（pel）。色差信号符号为Cr和Cb。

2.1.23 coded audio bitstream [audio]：根据ISO/IEC 11172本部分规定编码的音频信号表示形式。

2.1.24 coded video bitstream [video]：根据ISO/IEC 11172-2规定编码的一系列一帧或多帧图像的表示形式。

2.1.25 coded order [video]：图像存储与解码的顺序，该顺序未必与显示顺序一致。

2.1.26 coded representation: A data element as represented in its encoded form.

2.1.27 coding parameters [video]: The set of user-definable parameters that characterize a coded video bitstream. Bitstreams are characterised by coding parameters. Decoders are characterised by the bitstreams that they are capable of decoding.

2.1.28 component [video]: A matrix, block or single pel from one of the three matrices (luminance and two chrominance) that make up a picture.

2.1.29 compression: Reduction in the number of bits used to represent an item of data.

2.1.30 constant bitrate coded video [video]: A compressed video bitstream with a constant average bitrate.

2.1.26 coded representation：以编码形式表示的数据元素。

2.1.27 coding parameters [video]：用于表征编码视频比特流的用户可定义参数集合。比特流的特征由编码参数决定，解码器的能力则由其支持的比特流类型定义。

2.1.28 component [video]：构成图像的三个矩阵（亮度矩阵和两个色度矩阵）中的矩阵、块或单个像元（pel）。

2.1.29 compression：减少表示数据项所需的比特数。

2.1.30 constant bitrate coded video [video]：具有恒定平均比特率的压缩视频比特流。

2.1.31 constant bitrate: Operation where the bitrate is constant from start to finish of the compressed bitstream.

2.1.32 constrained parameters [video]: The values of the set of coding parameters defined in 2.4.3.2 of ISO/IEC 11172-2.

2.1.33 constrained system parameter stream (CSPS) [system]: An ISO/IEC 11172 multiplexed stream for which the constraints defined in 2.4.6 of ISO/IEC 11172-1 apply.

2.1.34 CRC: Cyclic redundancy code.

2.1.35 critical band rate [audio]: Psychoacoustic function of frequency. At a given audible frequency it is proportional to the number of critical bands below that frequency. The units of the critical band rate scale are Barks.

2.1.31 constant bitrate：从压缩比特流的起始到结束，比特率始终保持恒定。

2.1.32 constrained parameters [video]：ISO/IEC 11172-2标准2.4.3.2节定义的编码参数集合的取值。

2.1.33 constrained system parameter stream (CSPS) [system]：符合ISO/IEC 11172-1标准2.4.6节约束条件的ISO/IEC 11172复用流。

2.1.34 CRC：循环冗余校验（Cyclic Redundancy Code）。

2.1.35 critical band rate [audio]：频率的心理声学函数。在给定可听频率下，其值与该频率以下的临界频带数量成正比，单位为Bark。

2.1.36 critical band [audio]: Psychoacoustic measure in the spectral domain which corresponds to the frequency selectivity of the human ear. This selectivity is expressed in Bark.

2.1.37 data element: An item of data as represented before encoding and after decoding.

2.1.38 dc-coefficient [video]: The DCT coefficient for which the frequency is zero in both dimensions.

2.1.39 dc-coded picture; D-picture [video]: A picture that is coded using only information from itself. Of the DCT coefficients in the coded representation, only the dc-coefficients are present.

2.1.40 DCT coefficient: The amplitude of a specific cosine basis function.

2.1.36 critical band [audio]：心理声学频谱域测量单位，对应人耳的频率选择性，单位为Bark。

2.1.37 data element：编码前和解码后均以相同形式表示的数据项。

2.1.38 dc-coefficient [video]：离散余弦变换（DCT）中两个维度频率均为零的系数（即直流分量）。

2.1.39 dc-coded picture; D-picture [video]：仅利用自身信息编码的图像，其编码表示中仅包含DC系数。

2.1.40 DCT coefficient：特定余弦基函数的幅值。

2.1.41 decoded stream: The decoded reconstruction of a compressed bitstream.

2.1.42 decoder input buffer [video]: The first-in first-out (FIFO) buffer specified in the video buffering verifier.

2.1.43 decoder input rate [video]: The data rate specified in the video buffering verifier and encoded in the coded video bitstream.

2.1.44 decoder: An embodiment of a decoding process.

2.1.45 decoding (process): The process defined in ISO/IEC 11172 that reads an input coded bitstream and produces decoded pictures or audio samples.

2.1.41 decoded stream：压缩比特流的解码重构结果。

2.1.42 decoder input buffer [video]：视频缓冲验证器（Video Buffering Verifier）中定义的先进先出（FIFO）缓冲区。

2.1.43 decoder input rate [video]：视频缓冲验证器中规定并编码于编码视频比特流中的数据速率。

2.1.44 decoder：解码过程的实现实体。

2.1.45 decoding (process)：ISO/IEC 11172标准中定义的过程，读取输入的编码比特流并生成解码后的图像或音频样本。

2.1.46 decoding time-stamp; DTS [system]: A field that may be present in a packet header that indicates the time that an access unit is decoded in the system target decoder.

2.1.47 de-emphasis [audio]: Filtering applied to an audio signal after storage or transmission to undo a linear distortion due to emphasis.

2.1.48 dequantization [video]: The process of rescaling the quantized DCT coefficients after their representation in the bitstream has been decoded and before they are presented to the inverse DCT.

2.1.49 digital storage media; DSM: A digital storage or transmission device or system.

2.1.50 discrete cosine transform; DCT [video]: Either the forward discrete cosine transform or the inverse discrete cosine transform. The DCT is an invertible, discrete orthogonal transformation. The inverse DCT is defined in annex A of ISO/IEC 11172-2.

2.1.46 decoding time-stamp; DTS [system]：数据包头中可能存在的字段，表示系统目标解码器解码访问单元的时间。

2.1.47 de-emphasis [audio]：对存储或传输后的音频信号施加的滤波处理，用于消除因预加重（pre-emphasis）导致的线性失真。

2.1.48 dequantization [video]：在比特流中量化后的DCT系数被解码后、送入逆DCT处理前的重新缩放过程。

2.1.49 digital storage media; DSM：用于数字存储或传输的设备或系统。

2.1.50 discrete cosine transform; DCT [video]：正向离散余弦变换或逆离散余弦变换。DCT是一种可逆的离散正交变换，其逆变换定义于ISO/IEC 11172-2附录A。

51 ~ 100

2.1.51 display order [video]: The order in which the decoded pictures should be displayed. Normally this is the same order in which they were presented at the input of the encoder.

2.1.52 dual channel mode [audio]: A mode, where two audio channels with independent programme contents (e.g. bilingual) are encoded within one bitstream. The coding process is the same as for the stereo mode.

2.1.53 editing: The process by which one or more compressed bitstreams are manipulated to produce a new compressed bitstream. Conforming edited bitstreams must meet the requirements defined in this ISO/IEC 11172.

2.1.54 elementary stream [system]: A generic term for one of the coded video, coded audio or other coded bitstreams.

2.1.55 emphasis [audio]: Filtering applied to an audio signal before storage or transmission to improve the signal-to-noise ratio at high frequencies.

2.1.51 display order [video]：解码后的图像应按照的显示顺序。通常情况下，此顺序与编码器输入时的原始顺序一致。

2.1.52 dual channel mode [audio]：一种将两个独立节目内容（如双语）的音频声道编码至同一比特流的模式。其编码过程与立体声模式相同。

2.1.53 editing：通过处理一个或多个压缩比特流生成新压缩比特流的过程。经编辑的合规比特流必须符合本ISO/IEC 11172标准的要求。

2.1.54 elementary stream [system]：编码视频、编码音频或其他编码比特流的通用术语。

2.1.55 emphasis [audio]：在存储或传输前对音频信号施加的滤波处理，用于提升高频信噪比。

2.1.56 encoder: An embodiment of an encoding process.

2.1.57 encoding (process): A process, not specified in ISO/IEC 11172, that reads a stream of input pictures or audio samples and produces a valid coded bitstream as defined in ISO/IEC 11172.

2.1.58 entropy coding: Variable length lossless coding of the digital representation of a signal to reduce redundancy.

2.1.59 fast forward playback [video]: The process of displaying a sequence, or parts of a sequence, of pictures in display-order faster than real-time.

2.1.60 FFT: Fast Fourier Transformation. A fast algorithm for performing a discrete Fourier transform (an orthogonal transform).

2.1.56 encoder：编码过程的实现实体。

2.1.57 encoding (process)：一种未在ISO/IEC 11172中规定的过程，读取输入图像或音频样本流并生成符合ISO/IEC 11172标准的有效编码比特流。

2.1.58 entropy coding：对信号的数字化表示进行变长无损编码以减少冗余。

2.1.59 fast forward playback [video]：以快于实时的速度按显示顺序播放图像序列或部分序列的过程。

2.1.60 FFT：快速傅里叶变换。一种用于执行离散傅里叶变换（正交变换）的快速算法。

2.1.61 filterbank [audio]: A set of band-pass filters covering the entire audio frequency range.

2.1.62 fixed segmentation [audio]: A subdivision of the digital representation of an audio signal into fixed segments of time.

2.1.63 forbidden: The term "forbidden" when used in the clauses defining the coded bitstream indicates that the value shall never be used. This is usually to avoid emulation of start codes.

2.1.64 forced updating [video]: The process by which macroblocks are intra-coded from time-to-time to ensure that mismatch errors between the inverse DCT processes in encoders and decoders cannot build up excessively.

2.1.65 forward motion vector [video]: A motion vector that is used for motion compensation from a reference picture at an earlier time in display order.

2.1.61 filter bank [audio]：覆盖整个音频频率范围的一组带通滤波器。

2.1.62 fixed segmentation [audio]：将音频信号的数字表示按固定时间段进行划分。

2.1.63 forbidden：在定义编码比特流的条款中使用“禁止”一词时，表示该值绝不可使用。此要求通常用于避免出现与起始码（start code）相同的码型。

2.1.64 forced updating [video]：通过周期性对宏块进行帧内编码，确保编解码器间逆 DCT 过程中的不匹配误差不会过度累积的过程。

2.1.65 forward motion vector [video]：用于从显示顺序更早的参考图像中获取运动补偿的运动矢量。

2.1.66 frame [audio]: A part of the audio signal that corresponds to audio PCM samples from an Audio Access Unit.

2.1.67 free format [audio]: Any bitrate other than the defined bitrates that is less than the maximum valid bitrate for each layer.

2.1.68 future reference picture [video]: The future reference picture is the reference picture that occurs at a later time than the current picture in display order.

2.1.69 granules [Layer II] [audio]: The set of 3 consecutive subband samples from all 32 subbands that are considered together before quantization. They correspond to 96 PCM samples.

2.1.70 granules [Layer III] [audio]: 576 frequency lines that carry their own side information.

2.1.66 frame [audio]：对应于音频访问单元中的音频 PCM 样本的音频信号部分。

2.1.67 free format [audio]：非定义比特率且低于每层最大有效比特率的任何比特率。

2.1.68 future reference picture [video]：时间顺序上晚于当前图像的参考图像。

2.1.69 granules [Layer II] [audio]：量化前需共同处理的 3 个连续子带样本（来自全部 32 个子带），对应 96 个 PCM 样本。

2.1.70 granules [Layer III] [audio]：携带自身边信息的 576 个频率线。
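The Layer II granule arithmetic in 2.1.69 (3 samples from each of 32 subbands, hence 96 PCM samples) can be checked with a short sketch. The layout `subband_samples[subband][time]` and the choice of 36 samples per subband are assumptions made for illustration, and the function name is hypothetical.

```python
SUBBANDS = 32            # number of subbands, as stated in 2.1.69
SAMPLES_PER_TRIPLET = 3  # consecutive samples per subband in one granule


def split_into_granules(subband_samples):
    """Group subband samples into Layer II granules.

    subband_samples[sb][t] is an assumed layout: 32 lists, each holding
    a multiple of 3 time-consecutive samples for one subband.
    Returns a list of granules; each granule holds one triplet per subband.
    """
    if len(subband_samples) != SUBBANDS:
        raise ValueError("expected 32 subbands")
    length = len(subband_samples[0])
    if length % SAMPLES_PER_TRIPLET:
        raise ValueError("samples per subband must be a multiple of 3")
    return [
        [sb[start:start + SAMPLES_PER_TRIPLET] for sb in subband_samples]
        for start in range(0, length, SAMPLES_PER_TRIPLET)
    ]


if __name__ == "__main__":
    # 36 samples per subband is an illustrative choice (32 * 36 = 1152 PCM samples).
    samples = [list(range(36)) for _ in range(SUBBANDS)]
    granules = split_into_granules(samples)
    print("granules:", len(granules))
    print("PCM samples per granule:", SUBBANDS * SAMPLES_PER_TRIPLET)
```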

2.1.71 group of pictures [video]: A series of one or more coded pictures intended to assist random access. The group of pictures is one of the layers in the coding syntax defined in ISO/IEC 11172-2.

2.1.72 Hann window [audio]: A time function applied sample-by-sample to a block of audio samples before Fourier transformation.

2.1.73 Huffman coding: A specific method for entropy coding.

2.1.74 hybrid filterbank [audio]: A serial combination of subband filterbank and MDCT.

2.1.75 IMDCT [audio]: Inverse Modified Discrete Cosine Transform.

2.1.71 group of pictures [video]：一组用于辅助随机访问的一个或多个编码图像。图像组是 ISO/IEC 11172-2 标准编码语法中的一层。

2.1.72 Hann window [audio]：傅里叶变换前对音频样本块逐样本应用的时间函数窗。

2.1.73 Huffman coding：一种特定的熵编码方法。

2.1.74 hybrid filterbank [audio]：子带滤波器与 MDCT（改进离散余弦变换）的串行组合。

2.1.75 IMDCT [audio]：逆改进离散余弦变换（Inverse Modified Discrete Cosine Transform）。
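The Hann window of 2.1.72 can be sketched as follows. The periodic form w(i) = 0.5 (1 - cos(2πi/N)) is assumed here; any extra normalisation factor used by the standard's psychoacoustic models is deliberately left out, and the function names are hypothetical.

```python
import math


def hann_window(length: int):
    """Return a periodic Hann window: w[i] = 0.5 * (1 - cos(2*pi*i / length))."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / length))
            for i in range(length)]


def apply_window(block):
    """Multiply a block of audio samples by the window, sample by sample."""
    window = hann_window(len(block))
    return [sample * coeff for sample, coeff in zip(block, window)]


if __name__ == "__main__":
    block = [1.0] * 8
    print([round(v, 3) for v in apply_window(block)])
```

The window tapers the block to zero at its start, which reduces spectral leakage in the Fourier transform that follows.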

2.1.76 intensity stereo [audio]: A method of exploiting stereo irrelevance or redundancy in stereophonic audio programmes based on retaining at high frequencies only the energy envelope of the right and left channels.

2.1.77 interlace [video]: The property of conventional television pictures where alternating lines of the picture represent different instances in time.

2.1.78 intra coding [video]: Coding of a macroblock or picture that uses information only from that macroblock or picture.

2.1.79 intra-coded picture; I-picture [video]: A picture coded using information only from itself.

2.1.80 ISO/IEC 11172 (multiplexed) stream [system]: A bitstream composed of zero or more elementary streams combined in the manner defined in ISO/IEC 11172-1.

2.1.76 intensity stereo [audio]：一种利用立体声节目中的立体声无关性（irrelevance）或冗余、在高频部分仅保留左右声道能量包络的方法。

2.1.77 interlace [video]：传统电视画面的特性，画面中交替的扫描行代表不同时间点的图像实例。

2.1.78 intra coding [video]：仅使用当前宏块或图像自身信息进行编码的方式。

2.1.79 intra-coded picture; I-picture [video]：仅利用自身信息编码的图像，简称I帧。

2.1.80 ISO/IEC 11172 (multiplexed) stream [system]：由零个或多个基本流按 ISO/IEC 11172-1 标准组合而成的比特流。

2.1.81 joint stereo coding [audio]: Any method that exploits stereophonic irrelevance or stereophonic redundancy.

2.1.82 joint stereo mode [audio]: A mode of the audio coding algorithm using joint stereo coding.

2.1.83 layer [audio]: One of the levels in the coding hierarchy of the audio system defined in this part of ISO/IEC 11172.

2.1.84 layer [video and systems]: One of the levels in the data hierarchy of the video and system specifications defined in ISO/IEC 11172-1 and ISO/IEC 11172-2.

2.1.85 luminance (component) [video]: A matrix, block or single pel representing a monochrome representation of the signal and related to the primary colours in the manner defined in CCIR Rec 601. The symbol used for luminance is Y.

2.1.81 joint stereo coding [audio]：任何利用立体声无关性或立体声冗余的编码方法。

2.1.82 joint stereo mode [audio]：音频编码算法中使用联合立体声编码的模式。

2.1.83 layer [audio]：ISO/IEC 11172 本部分定义的音频系统编码层级之一。

2.1.84 layer [video and systems]：ISO/IEC 11172-1 和 ISO/IEC 11172-2 定义的视频及系统规范的数据层级之一。

2.1.85 luminance (component) [video]：表示信号单色分量的矩阵、块或单个像素，其与原色的关系遵循 CCIR Rec 601 标准。亮度符号为Y。

2.1.86 macroblock [video]: The four 8 by 8 blocks of luminance data and the two corresponding 8 by 8 blocks of chrominance data coming from a 16 by 16 section of the luminance component of the picture.

Macroblock is sometimes used to refer to the pel data and sometimes to the coded representation of the pel values and other data elements defined in the macroblock layer of the syntax defined in ISO/IEC 11172-2. The usage is clear from the context.

2.1.87 mapping [audio]: Conversion of an audio signal from time to frequency domain by subband filtering and/or by MDCT.

2.1.88 masking [audio]: A property of the human auditory system by which an audio signal cannot be perceived in the presence of another audio signal.

2.1.89 masking threshold [audio]: A function in frequency and time below which an audio signal cannot be perceived by the human auditory system.

2.1.90 MDCT [audio]: Modified Discrete Cosine Transform.

2.1.86 macroblock [video]：由图像亮度分量的 16×16 区域提取的 4 个 8×8 亮度数据块和2 个对应的 8×8 色度数据块组成。“宏块”有时指像素数据，有时指语法中定义的宏块层（如 ISO/IEC 11172-2）的编码表示（如量化参数、运动矢量等）。具体含义需结合上下文。

2.1.87 mapping [audio]：通过子带滤波和/或 MDCT 将音频信号从时域转换到频域的过程。

2.1.88 masking [audio]：人类听觉系统的特性，当存在另一个音频信号时，某些音频信号可能无法被感知。

2.1.89 masking threshold [audio]：频率和时间的函数，低于此阈值的音频信号无法被人耳感知。

2.1.90 MDCT [audio]：改进离散余弦变换（Modified Discrete Cosine Transform）。

2.1.91 motion compensation [video]: The use of motion vectors to improve the efficiency of the prediction of pel values. The prediction uses motion vectors to provide offsets into the past and/or future reference pictures containing previously decoded pel values that are used to form the prediction error signal.

2.1.92 motion estimation [video]: The process of estimating motion vectors during the encoding process.

2.1.93 motion vector [video]: A two-dimensional vector used for motion compensation that provides an offset from the coordinate position in the current picture to the coordinates in a reference picture.

2.1.94 MS stereo [audio]: A method of exploiting stereo irrelevance or redundancy in stereophonic audio programmes based on coding the sum and difference signal instead of the left and right channels.

2.1.95 non-intra coding [video]: Coding of a macroblock or picture that uses information both from itself and from macroblocks and pictures occurring at other times.

2.1.91 motion compensation [video]：通过运动矢量提高像素值预测效率的技术。预测过程利用运动矢量从过去和/或未来的参考图像中获取已解码像素值，生成预测误差信号。

2.1.92 motion estimation [video]：编码过程中估算运动矢量的方法。

2.1.93 motion vector [video]：用于运动补偿的二维向量，表示当前图像中像素位置与参考图像中对应位置的坐标偏移。

2.1.94 MS stereo [audio]：通过编码和信号与差信号（而非左右声道）来利用立体声节目中的立体声无关性或冗余的方法。

2.1.95 non-intra coding [video]：既使用当前图像自身信息，也使用其他时间点的图像和宏块信息的编码方式。
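The sum and difference coding described in 2.1.94 can be sketched as a reversible pair of transforms. The 1/√2 scaling is one common normalisation and is an assumption here, as are the function names; the definition itself only states that the sum and difference signals are coded.

```python
import math

_INV_SQRT2 = 1.0 / math.sqrt(2.0)  # assumed normalisation, keeps energy unchanged


def ms_encode(left, right):
    """Convert left/right sample lists into mid (sum) and side (difference)."""
    mid = [(l + r) * _INV_SQRT2 for l, r in zip(left, right)]
    side = [(l - r) * _INV_SQRT2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right sample lists from mid and side signals."""
    left = [(m + s) * _INV_SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) * _INV_SQRT2 for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    left, right = [0.5, 0.25, -0.1], [0.5, 0.20, -0.1]
    mid, side = ms_encode(left, right)
    print("side:", [round(s, 4) for s in side])
    print("round trip:", ms_decode(mid, side))
```

When the two channels are nearly identical, the side signal is close to zero and needs few bits, which is the redundancy the method exploits.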

2.1.96 non-tonal component [audio]: A noise-like component of an audio signal.

2.1.97 Nyquist sampling: Sampling at or above twice the maximum bandwidth of a signal.

2.1.98 pack [system]: A pack consists of a pack header followed by one or more packets. It is a layer in the system coding syntax described in ISO/IEC 11172-1.

2.1.99 packet data [system]: Contiguous bytes of data from an elementary stream present in a packet.

2.1.100 packet header [system]: The data structure used to convey information about the elementary stream data contained in the packet data.

2.1.96 non-tonal component [audio]：音频信号中类噪声的成分。

2.1.97 Nyquist sampling（奈奎斯特采样）：以至少两倍于信号最大带宽的速率进行采样。

2.1.98 pack [system]：由包头（pack header）和一个或多个包（packets）组成的数据单元，是 ISO/IEC 11172-1 中系统编码语法的一层。

2.1.99 packet data [system]：数据包中包含的来自某个基本流（elementary stream）的连续字节数据。

2.1.100 packet header [system]：用于传递数据包中包含的基本流数据信息的数据结构。

101 ~ 154

2.1.101 packet [system]: A packet consists of a header followed by a number of contiguous bytes from an elementary data stream. It is a layer in the system coding syntax described in ISO/IEC 11172-1.

2.1.102 padding [audio]: A method to adjust the average length in time of an audio frame to the duration of the corresponding PCM samples, by conditionally adding a slot to the audio frame.

2.1.103 past reference picture [video]: The past reference picture is the reference picture that occurs at an earlier time than the current picture in display order.

2.1.104 pel aspect ratio [video]: The ratio of the nominal vertical height of pel on the display to its nominal horizontal width.

2.1.105 pel [video]: Picture element.

2.1.101 packet [system]：由包头（header）和来自基本数据流的若干连续字节组成。它是 ISO/IEC 11172-1 中系统编码语法的一层。

2.1.102 padding [audio]：通过有条件地向音频帧添加时隙，调整音频帧平均时间长度以匹配对应 PCM 样本时长的方法。

2.1.103 past reference picture [video]：显示顺序早于当前图像的参考图像。

2.1.104 pel aspect ratio [video]：显示设备中像素标称垂直高度与其标称水平宽度的比例。

2.1.105 pel [video]：图像元素（picture element），即像素。

2.1.106 picture period [video]: The reciprocal of the picture rate.

2.1.107 picture rate [video]: The nominal rate at which pictures should be output from the decoding process.

2.1.108 picture [video]: Source, coded or reconstructed image data. A source or reconstructed picture consists of three rectangular matrices of 8-bit numbers representing the luminance and two chrominance signals. The Picture layer is one of the layers in the coding syntax defined in ISO/IEC 11172-2. Note that the term "picture" is always used in ISO/IEC 11172 in preference to the terms field or frame.

2.1.109 polyphase filterbank [audio]: A set of equal bandwidth filters with special phase interrelationships, allowing for an efficient implementation of the filterbank.

2.1.110 prediction [video]: The use of a predictor to provide an estimate of the pel value or data element currently being decoded.

2.1.106 picture period [video]：图像周期，即图像速率的倒数。

2.1.107 picture rate [video]：解码过程中图像输出的标称速率。

2.1.108 picture [video]：源图像、编码图像或重建图像数据。源图像或重建图像由三个 8 位数值的矩形矩阵组成，分别表示亮度信号和两个色度信号。图像层是 ISO/IEC 11172-2 定义的编码语法中的一层。注：ISO/IEC 11172 标准中始终使用“图像”（picture）而非“场”（field）或“帧”（frame）。

2.1.109 polyphase filterbank [audio]：一组带宽相等的滤波器，具有特殊的相位关系，可实现滤波器组的高效实现。

2.1.110 prediction [video]：使用预测器对当前解码的像素值（pel）或数据元素进行估计的过程。

2.1.111 predictive-coded picture; P-picture [video]: A picture that is coded using motion compensated prediction from the past reference picture.

2.1.112 prediction error [video]: The difference between the actual value of a pel or data element and its predictor.

2.1.113 predictor [video]: A linear combination of previously decoded pel values or data elements.

2.1.114 presentation time-stamp; PTS [system]: A field that may be present in a packet header that indicates the time that a presentation unit is presented in the system target decoder.

2.1.115 presentation unit; PU [system]: A decoded audio access unit or a decoded picture.

2.1.111 predictive-coded picture; P-picture [video]：使用来自过去参考图像的运动补偿预测进行编码的图像，简称P帧。

2.1.112 prediction error [video]：实际像素值（pel）或数据元素与其预测值之间的差异。

2.1.113 predictor [video]：基于先前已解码像素值或数据元素的线性组合进行预测的计算过程。

2.1.114 presentation time-stamp; PTS [system]：数据包头中可能存在的字段，表示系统目标解码器（System Target Decoder）中呈现单元（Presentation Unit）的呈现时间。

2.1.115 presentation unit; PU [system]：解码后的音频访问单元（Audio Access Unit）或解码后的图像（decoded picture）。

2.1.116 psychoacoustic model [audio]: A mathematical model of the masking behaviour of the human auditory system.

2.1.117 quantization matrix [video]: A set of sixty-four 8-bit values used by the dequantizer.

2.1.118 quantized DCT coefficients [video]: DCT coefficients before dequantization. A variable length coded representation of quantized DCT coefficients is stored as part of the compressed video bitstream.

2.1.119 quantizer scalefactor [video]: A data element represented in the bitstream and used by the decoding process to scale the dequantization.

2.1.120 random access: The process of beginning to read and decode the coded bitstream at an arbitrary point.

2.1.116 psychoacoustic model [audio]：人耳听觉系统掩蔽效应的数学模型。

2.1.117 quantization matrix [video]：反量化器（dequantizer）使用的一组 64 个 8 位值。

2.1.118 quantized DCT coefficients [video]：量化后的 DCT 系数。经量化的 DCT 系数以变长编码形式存储于压缩视频比特流中。

2.1.119 quantizer scalefactor [video]：比特流中表示的缩放因子，解码时用于调整反量化过程。

2.1.120 random access：从任意位置开始读取和解码编码比特流的过程。

2.1.121 reference picture [video]: Reference pictures are the nearest adjacent I- or P-pictures to the current picture in display order.

2.1.122 reorder buffer [video]: A buffer in the system target decoder for storage of a reconstructed I-picture or a reconstructed P-picture.

2.1.123 requantization [audio]: Decoding of coded subband samples in order to recover the original quantized values.

2.1.124 reserved: The term "reserved" when used in the clauses defining the coded bitstream indicates that the value may be used in the future for ISO/IEC defined extensions.

2.1.125 reverse playback [video]: The process of displaying the picture sequence in the reverse of display order.

2.1.121 reference picture [video]：参考图像是显示顺序中与当前图像相邻的最近 I 帧或 P 帧。

2.1.122 reorder buffer [video]：系统目标解码器中用于存储重建的 I 帧或 P 帧的缓冲区。

2.1.123 requantization [audio]：解码已编码的子带样本以恢复原始量化值的操作。

2.1.124 reserved：在定义编码比特流的条款中使用“保留”一词时，表示该值可能在未来供 ISO/IEC 定义的扩展使用。

2.1.125 reverse playback [video]：以显示顺序的逆序呈现图像序列的过程（即倒放）。

2.1.126 scalefactor band [audio]: A set of frequency lines in Layer III which are scaled by one scalefactor.

2.1.127 scalefactor index [audio]: A numerical code for a scalefactor.

2.1.128 scalefactor [audio]: Factor by which a set of values is scaled before quantization.

2.1.129 sequence header [video]: A block of data in the coded bitstream containing the coded representation of a number of data elements.

2.1.130 side information: Information in the bitstream necessary for controlling the decoder.

2.1.126 scalefactor band [audio]：Layer III 中的一组频率线，这些频率线按一个比例因子（scalefactor）进行缩放。

2.1.127 scalefactor index [audio]：比例因子的数值代码（用于标识具体缩放值）。

2.1.128 scalefactor [audio]：量化前对一组值进行缩放的因子。

2.1.129 sequence header [video]：编码比特流中的一个数据块，包含多个数据元素的编码表示（如帧率、分辨率等元信息）。

2.1.130 side information：比特流中用于控制解码器运行的辅助信息（如运动矢量、量化参数等）。

2.1.131 skipped macroblock [video]: A macroblock for which no data are stored.

2.1.132 slice [video]: A series of macroblocks. It is one of the layers of the coding syntax defined in ISO/IEC 11172-2.

2.1.133 slot [audio]: A slot is an elementary part in the bitstream. In Layer I a slot equals four bytes, in Layers II and III one byte.

2.1.134 source stream: A single non-multiplexed stream of samples before compression coding.

2.1.135 spreading function [audio]: A function that describes the frequency spread of masking.

2.1.131 skipped macroblock [video]：未存储数据的宏块（即解码时跳过的宏块）。

2.1.132 slice [video]：一系列宏块的集合，属于 ISO/IEC 11172-2 定义的编码语法层之一。

2.1.133 slot [audio]：比特流中的基本单元。Layer I 中 1 槽 = 4 字节，Layer II 和 III 中 1 槽 = 1 字节。

2.1.134 source stream：压缩编码前的单路非复用样本流。

2.1.135 spreading function [audio]：描述掩蔽效应频率扩散特性的函数。
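The slot sizes in 2.1.133 and the padding slot of 2.1.102 together determine the frame length in bytes. The sketch below assumes the slot-count relations given in the header semantics of this part (12 × bitrate / sampling frequency for Layer I, 144 × bitrate / sampling frequency for Layers II and III). Those relations are not quoted in this section, so treat them as an assumption to verify against the frame header clause. The function name is hypothetical.

```python
def frame_length_bytes(layer: int, bitrate_bps: int, sampling_hz: int,
                       padding: bool = False) -> int:
    """Estimate the length of one audio frame in bytes.

    Assumed slot counts (to be checked against the header clause):
      Layer I        : 12  * bitrate / sampling frequency, slot = 4 bytes
      Layers II, III : 144 * bitrate / sampling frequency, slot = 1 byte
    A set padding bit adds one extra slot.
    """
    if layer == 1:
        slots = (12 * bitrate_bps) // sampling_hz
        slot_bytes = 4
    elif layer in (2, 3):
        slots = (144 * bitrate_bps) // sampling_hz
        slot_bytes = 1
    else:
        raise ValueError("layer must be 1, 2 or 3")
    if padding:
        slots += 1
    return slots * slot_bytes


if __name__ == "__main__":
    print(frame_length_bytes(1, 384_000, 48_000))
    print(frame_length_bytes(3, 128_000, 44_100))
    print(frame_length_bytes(3, 128_000, 44_100, padding=True))
```

At 44,1 kHz the slot count is not an integer, so the frame length alternates between two values; this is the case the padding slot exists for.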

2.1.136 start codes [system and video]: 32-bit codes embedded in that coded bitstream that are unique. They are used for several purposes including identifying some of the layers in the coding syntax.

2.1.137 STD input buffer [system]: A first-in first-out buffer at the input of the system target decoder for storage of compressed data from elementary streams before decoding.

2.1.138 stereo mode [audio]: Mode, where two audio channels which form a stereo pair (left and right) are encoded within one bitstream. The coding process is the same as for the dual channel mode.

2.1.139 stuffing (bits); stuffing (bytes) : Code-words that may be inserted into the compressed bitstream that are discarded in the decoding process. Their purpose is to increase the bitrate of the stream.

2.1.140 subband [audio]: Subdivision of the audio frequency band.

2.1.136 start codes [system and video]：编码比特流中嵌入的32 位唯一标识码，用于标识编码语法中的某些层（如视频序列头或系统同步点）。

2.1.137 STD input buffer [system]：系统目标解码器输入端的先进先出缓冲区，用于存储来自基本流（elementary streams）的压缩数据，供解码前缓存。

2.1.138 stereo mode [audio]：将立体声声道对（左、右）编码为单一比特流的音频模式，编码过程与双声道模式一致。

2.1.139 stuffing (bits); stuffing (bytes)：可插入压缩比特流中的码字（填充位或填充字节），在解码过程中被丢弃，其作用是提高比特流的比特率。

2.1.140 subband [audio]：音频频率带的细分单元，用于子带滤波或频域分析。

2.1.141 subband filterbank [audio]: A set of band filters covering the entire audio frequency range. In this part of ISO/IEC 11172 the subband filterbank is a polyphase filterbank.

2.1.142 subband samples [audio]: The subband filterbank within the audio encoder creates a filtered and subsampled representation of the input audio stream. The filtered samples are called subband samples. From 384 time-consecutive input audio samples, 12 time-consecutive subband samples are generated within each of the 32 subbands.

2.1.143 syncword [audio]: A 12-bit code embedded in the audio bitstream that identifies the start of a frame.

2.1.144 synthesis filterbank [audio]: Filterbank in the decoder that reconstructs a PCM audio signal from subband samples.

2.1.145 system header [system]: The system header is a data structure defined in ISO/IEC 11172-1 that carries information summarising the system characteristics of the ISO/IEC 11172 multiplexed stream.

2.1.141 subband filterbank [audio]：覆盖整个音频频率范围的带通滤波器组。在 ISO/IEC 11172 标准中，子带滤波器组采用多相滤波器组实现。

2.1.142 subband samples [audio]：音频编码器的子带滤波器组生成的滤波和下采样后的音频流表示。每 384 个连续输入音频样本通过 32 个子带滤波后，每个子带生成 12 个连续子带样本。

2.1.143 syncword [audio]：嵌入音频比特流中的 12 位同步码，用于标识帧的起始位置。

2.1.144 synthesis filterbank [audio]：解码器中通过子带样本重建 PCM 音频信号的滤波器组。

2.1.145 system header [system]：ISO/IEC 11172-1 定义的系统头数据结构，携带 ISO/IEC 11172 复用流的系统特性摘要信息。
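A decoder can locate candidate frame starts by scanning for the 12-bit syncword of 2.1.143. The sketch below assumes the syncword value is twelve '1' bits (0xFFF), as given in the header definition of this part rather than in this section, and assumes that frames begin on byte boundaries. The function name is hypothetical. A match is only a candidate, because the same bit pattern can occur inside audio data.

```python
SYNCWORD = 0xFFF  # assumed value: twelve '1' bits, see the header clause


def find_sync_candidates(data: bytes):
    """Return byte offsets where the next 12 bits equal the syncword.

    Assumes frames start on byte boundaries. Each hit is only a candidate:
    a real decoder would also validate the remaining header fields.
    """
    offsets = []
    for i in range(len(data) - 1):
        # First byte supplies 8 bits, top nibble of the next supplies 4 more.
        first_12_bits = (data[i] << 4) | (data[i + 1] >> 4)
        if first_12_bits == SYNCWORD:
            offsets.append(i)
    return offsets


if __name__ == "__main__":
    stream = bytes([0x00, 0xFF, 0xFB, 0x90, 0x00, 0xFF, 0x0F])
    print(find_sync_candidates(stream))
```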

2.1.146 system target decoder; STD [system]: A hypothetical reference model of a decoding process used to describe the semantics of an ISO/IEC 11172 multiplexed bitstream.

2.1.147 time-stamp [system]: A term that indicates the time of an event.

2.1.148 triplet [audio]: A set of 3 consecutive subband samples from one subband. A triplet from each of the 32 subbands forms a granule.

2.1.149 tonal component [audio]: A sinusoid-like component of an audio signal.

2.1.150 variable bitrate: Operation where the bitrate varies with time during the decoding of a compressed bitstream.

2.1.146 system target decoder; STD [system]：用于描述 ISO/IEC 11172 复用比特流语义的假设性解码过程参考模型。

2.1.147 time-stamp [system]：表示事件发生时间的时间戳。

2.1.148 triplet [audio]：来自同一子带的3 个连续子带样本。32 个子带中每个子带的三连组（triplet）共同构成一个粒（granule）。

2.1.149 tonal component [audio]：音频信号中类似正弦波的音调成分。

2.1.150 variable bitrate：解码压缩比特流时比特率随时间动态变化的编码模式。

2.1.151 variable length coding; VLC: A reversible procedure for coding that assigns shorter code-words to frequent events and longer code-words to less frequent events.

2.1.152 video buffering verifier; VBV [video]: A hypothetical decoder that is conceptually connected to the output of the encoder. Its purpose is to provide a constraint on the variability of the data rate that an encoder or editing process may produce.

2.1.153 video sequence [video]: A series of one or more groups of pictures. It is one of the layers of the coding syntax defined in ISO/IEC 11172-2.

2.1.154 zig-zag scanning order [video]: A specific sequential ordering of the DCT coefficients from (approximately) the lowest spatial frequency to the highest.

2.1.151 variable length coding; VLC：一种可逆编码方法，通过为高频事件分配短码字、低频事件分配长码字实现数据压缩。

2.1.152 video buffering verifier; VBV [video]：一种理论解码器模型，与编码器输出逻辑连接，用于约束编码器或编辑过程中数据率的波动范围。

2.1.153 video sequence [video]：由一个或多个图像组（Group of Pictures）构成的序列，属于 ISO/IEC 11172-2 标准编码语法中的一层。

2.1.154 zig-zag scanning order [video]：DCT 系数的特定遍历顺序，从（近似）最低空间频率到最高空间频率依次排列。
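The zig-zag scanning order of 2.1.154 can be generated by walking the anti-diagonals of the 8 by 8 block in alternating directions. The sketch below produces the conventional pattern under that assumption; the normative table is the one in ISO/IEC 11172-2, and the function names here are hypothetical.

```python
def zigzag_order(size: int = 8):
    """Return (row, column) pairs in the conventional zig-zag order.

    Walks each anti-diagonal (row + column = constant) and alternates
    direction, so low spatial frequencies come first.
    """
    order = []
    for diagonal in range(2 * size - 1):
        rows = range(max(0, diagonal - size + 1), min(diagonal, size - 1) + 1)
        if diagonal % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left upward
        for row in rows:
            order.append((row, diagonal - row))
    return order


def zigzag_scan(block):
    """Flatten a square block of DCT coefficients into zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


if __name__ == "__main__":
    print(zigzag_order()[:10])
```

The first entry is the dc-coefficient at (0, 0), followed by the ac coefficients in roughly increasing spatial frequency.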
