Information Theory and Coding

Review notes for the Information Theory and Coding course.

Some abbreviations:

  • pdf: probability density function
  • pmf: probability mass function
  • SSS: a random process is strict-sense stationary (SSS) if, for every τ, f_{X(t1),···,X(tk)}(x1,···,xk) = f_{X(t1+τ),···,X(tk+τ)}(x1,···,xk).
  • i.i.d.: an independent and identically distributed (i.i.d.) process is a common example of an SSS process. "Independent" means the random variables at all sampling times are mutually independent; "identically distributed" means they all have the same pdf. That is, f_{X(t1),X(t2),···,X(tk)}(x1,···,xk) = fX(x1)fX(x2)···fX(xk), and likewise f_{X(t1+τ),X(t2+τ),···,X(tk+τ)}(x1,···,xk) = fX(x1)fX(x2)···fX(xk).
  • WSS: wide-sense stationarity, also called weak-sense stationarity. A WSS process must satisfy two conditions: its expectation at any sampling time is independent of time, i.e., μX(t) = μX; and the correlation/covariance between any two sampling times depends only on the time difference between them, not on their absolute positions.

Chapter 1

Design criteria:

  1. Make full use of communication resources such as channel bandwidth and signal transmission energy;
  2. Reduce distortion in analog communication;
  3. Minimize the error rate in digital communication;
  4. Reduce cost.

(a trade-off among bandwidth, energy, reliability, and complexity)

Source coding, i.e., data compression.

Goals: remove redundancy, provide a more efficient representation of the information, maximize bandwidth utilization, and increase data transmission speed.

  • lossless compression
    • Huffman coding
    • Arithmetic coding
    • Lempel-Ziv algorithm
  • lossy compression
    • JPEG

Channel coding

Goal: add redundant information to the source-encoded data so that the receiver can use it for forward error correction, guaranteeing reliable communication.

Principle: inserting controlled redundancy into the transmitted information stream.

  • Hamming codes: ECC memory
  • Reed-Solomon codes: DVD, xDSL, 3G
  • Convolutional codes: GSM, satellite transmission
  • Trellis coded modulation: modem
  • CRC codes: digital network and storage devices

Modulation

Goal: convert the channel encoder's output into an analog signal suitable for efficient transmission over a noisy channel.

  • amplitude-shift keying (ASK)
  • frequency-shift keying (FSK)
  • phase-shift keying (PSK)

Channel Models

Covariance: cov(X1, X2) = E[(X1 − E[X1])(X2 − E[X2])]


Chapter 2. Source Coding

  1. Without distortion, what is the minimum number of information bits per transmitted symbol?
  2. How to choose a compression algorithm?

Source models

Discrete sources

Memoryless: a discrete source {Xi} is said to be memoryless if {Xi} is an i.i.d. random sequence, i.e., for any m and n, the random vector (Xm+1, Xm+2, ···, Xm+n) is i.i.d. (independent and identically distributed).

Example: Markov sources. If the source's current output depends only on the previous output letter, it is called a Markov source.

Analog sources

  1. X(t) is WSS;
  2. the power spectral density SX(f) is band-limited to W. Sample at times t = n/2W.

Then the sampled result X'(t) is equivalent to X(t).

Entropy and Mutual Information

X ~ (x, p(x)); the entropy H(X) can also be written H(P0, P1, ···, PJ−1).

Self-information: I(xj) = −log P(xj), so the entropy H(X) equals the mean of the self-information: H(X) = E[I(X)].

In data compression, the entropy gives the ultimate compression rate in bits/letter.
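As a quick illustration (my own sketch, not from the slides), H(X) = E[I(X)] can be computed directly from a pmf:

```python
import math

def entropy(p):
    """Shannon entropy in bits: H = -sum p_i * log2(p_i), skipping zero terms."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

h_fair = entropy([0.5, 0.5])     # a fair coin: 1 bit per toss
h_biased = entropy([0.9, 0.1])   # a biased coin carries less than 1 bit
```

A deterministic source (entropy([1.0]) = 0) carries no information at all.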

// Why do the following statements hold?

Entropy, information, and uncertainty are all related. Consider the event X = xj.

  • Before the event X = xj occurs, I(xj) represents the amount of uncertainty about that event.
  • After the event X = xj occurs, its occurrence gives us the amount of information I(xj).
  • H(X) represents the average amount of information obtained per observation of X.

Joint entropy (P(X, Y) is the joint probability)

Conditional entropy

Chain rule: since P(xj, yi) = P(yi)P(xj|yi), we have −log P(xj, yi) = −log P(yi) − log P(xj|yi) = I(yi) + I(xj|yi). Taking expectations on both sides gives H(X, Y) = H(Y) + H(X|Y).

Mutual information

I(xj; yi) = I(xj) − I(xj|yi) = I(yi; xj) = I(yi) − I(yi|xj)

  • I(xj) is the amount of uncertainty about the event X = xj.
  • I(xj|yi) is the amount of uncertainty remaining about the event X = xj after we observe the outcome Y = yi.
  • The difference I(xj) − I(xj|yi) is the amount of information that the observation Y = yi provides about the event X = xj.

Average mutual information:

I(X; Y) = I(Y; X) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X, Y)
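A small numeric check of these identities, using a toy joint pmf of my own choosing:

```python
import math

def H(p):
    """Entropy in bits of a list of probabilities."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# Toy joint pmf p[x][y] (an assumption for illustration only).
pxy = [[0.25, 0.25],
       [0.0,  0.5]]
px = [sum(row) for row in pxy]             # marginal of X
py = [sum(col) for col in zip(*pxy)]       # marginal of Y

H_X = H(px)
H_Y = H(py)
H_XY = H([p for row in pxy for p in row])  # joint entropy H(X, Y)
I_XY = H_X + H_Y - H_XY                    # mutual information I(X; Y)
```

Here I_XY also equals H_X − H(X|Y) with H(X|Y) = H_XY − H_Y, matching the chain rule.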

Chapter 3. Coding for Discrete Sources

Lossless source coding: unambiguous (uniquely decodable).

Lossless memoryless coding: encode letter by letter, losslessly.

Instantaneous code: each codeword can be decoded as soon as its last bit is seen, without waiting for future input digits. This is the same thing as the prefix property.

Performance evaluation: let nj denote the length of the codeword C(xj); performance is measured by the average codeword length.

|C(x)| = the length of C(x). R̄ is the average rate of the output of C, i.e., the average codeword length.

The coding goal is therefore to minimize R̄.

Kraft-McMillan inequality: any uniquely decodable code C satisfies the Kraft inequality Σj 2^(−nj) ≤ 1.

(Proof not reviewed; see PPT slides 14-18.)

R̄ ≥ H(X1). // Proof not reviewed; see PPT slides 21-24. The average codeword length and the entropy differ by at most one bit.

Block memoryless codes: reduce the gap between the average codeword length R̄ and the entropy H(X1). Let the block length be n.

(PPT slides 26-28 not reviewed.)

Coding procedure:

  1. Assign codeword lengths to the letters according to their probabilities, following the prefix-code rule;
  2. Check the Kraft inequality; if it is satisfied, a prefix code exists;
  3. Encode with the prefix code.
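The Kraft check in step 2 is a one-liner; a minimal sketch:

```python
def kraft_sum(lengths, D=2):
    """Kraft sum over a D-ary code alphabet; a prefix code with
    these codeword lengths exists iff the sum is <= 1."""
    return sum(D ** (-n) for n in lengths)

ok = kraft_sum([1, 2, 3, 3])    # sums to exactly 1.0 -> feasible
bad = kraft_sum([1, 1, 2])      # sums to 1.25 -> no prefix code exists
```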

Huffman coding (build the tree starting from the lowest-probability nodes)
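A minimal sketch of the tree construction (my own illustration): repeatedly merge the two least probable nodes; every symbol under a merged node gets one bit deeper.

```python
import heapq

def huffman_lengths(probs):
    """Return the Huffman codeword length of each symbol."""
    # heap entries: (probability, tiebreak id, symbol indices under this node)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tiebreak = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # two least probable nodes
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:                 # all symbols under the merged node
            lengths[s] += 1               # move one level deeper in the tree
        heapq.heappush(heap, (p1 + p2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

lens = huffman_lengths([0.5, 0.25, 0.125, 0.125])
```

For this dyadic pmf the average length (1.75 bits) exactly equals the entropy.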

Chapter 4. Arithmetic Coding

Why arithmetic coding? Huffman's average codeword length is usually strictly greater than the entropy H(X); block coding improves the efficiency but also increases computational complexity. Huffman coding can therefore be inefficient → arithmetic coding.

Shannon-Fano-Elias codes. Central idea: partition the interval [0, 1) according to the probabilities; the length of the interval I(u1, u2, ···, un) equals the joint probability p(u1, u2, ···, un).

Suppose I(u1u2···un) = [a, b].
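The subdivision can be sketched as follows (hypothetical helper; symbols are ordered alphabetically, an assumption of mine):

```python
def seq_interval(seq, pmf):
    """Successively subdivide [0, 1): the interval of a sequence has
    length equal to its joint probability (i.i.d. source assumed)."""
    symbols = sorted(pmf)
    lo, width = 0.0, 1.0
    for u in seq:
        # cumulative probability of the symbols that precede u
        cum = sum(pmf[s] for s in symbols if s < u)
        lo += width * cum
        width *= pmf[u]
    return lo, lo + width

a, b = seq_interval("ab", {"a": 0.5, "b": 0.5})   # -> [0.25, 0.5)
```

The interval width equals the joint probability, so about −log2 p(u1···un) bits suffice to name a point inside it.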

Adaptive Arithmetic Coding:

Lempel-Ziv Algorithm:

LZ encoding procedure:

Each phrase is encoded as follows: m is the position of its longest previously seen prefix (m = 0 if that prefix never appeared before), and j is the letter appended to the prefix. The resulting codeword is the binary representation of m·J + j, with length log(iJ) for the i-th phrase (the alphabet size J is given).
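A toy LZ78-style parser along these lines (my own sketch; it drops a trailing incomplete phrase and returns the integer values m·J + j rather than their binary expansions):

```python
def lz78_parse(bits, J=2):
    """Parse a binary string into phrases; each new phrase is its longest
    previously parsed prefix (index m, 0 if empty) plus one new letter j."""
    phrases = {}                           # phrase -> 1-based index
    codewords = []
    cur = ""
    for b in bits:
        cur += b
        if cur not in phrases:             # first time we see this phrase
            m = phrases.get(cur[:-1], 0)   # index of its prefix, 0 if empty
            j = int(cur[-1])
            phrases[cur] = len(phrases) + 1
            codewords.append(m * J + j)
            cur = ""
    return codewords

cw = lz78_parse("1011010100010")
```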

Chapter 5. Lossy Source Coding

Chapter 6. JPEG Encoding

The entropy decoder decodes the zig-zag sequence of quantized DCT coefficients.

FDCT & IDCT?

This concludes the source coding material.


Chapter 7. Communication with AWGN Interference

Find the optimal decision rule that minimizes the average cost: Bayes decision theory, the maximum a posteriori probability (MAP) rule, the maximum likelihood (ML) decision rule, and the minimum distance rule.

If all messages mi are equally likely, the maximization of p(mi|x) is equivalent to the maximization of p(x|mi), and the MAP decision rule reduces to the maximum likelihood decision rule.
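For equally likely messages in AWGN this chain of reductions ends at a nearest-neighbor search; a sketch (the constellation points are my own example):

```python
def min_distance_decide(x, signals):
    """With equally likely messages over AWGN, the MAP rule reduces to ML,
    which reduces to picking the signal closest to the observation."""
    def d2(a, b):                          # squared Euclidean distance
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(signals)), key=lambda i: d2(x, signals[i]))

# Hypothetical 2-D constellation (e.g., four QPSK points).
sigs = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
m_hat = min_distance_decide((0.9, -1.2), sigs)   # closest point is (1, -1)
```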

Which rule should be used in which situation?

Markov chain: three random variables X, Y, Z are said to form a Markov chain (X → Y → Z) if the conditional pmf (or pdf) of Z given Y and X depends only on Y, i.e., p(z|y,x) = p(z|y) for any x, y, and z.

Chapter 8. Digital Modulation Methods

Modulator: with memory / memoryless.

memoryless modulation methods

  1. Pulse Amplitude Modulation (PAM)

Digital PAM is also called amplitude-shift keying (ASK).

  2. Digital phase modulation

Digital phase modulation is often called phase-shift keying (PSK). PSK with M signals is called M-PSK.

  3. Quadrature Amplitude Modulation (QAM)

The term cos 2πfct is usually called the in-phase carrier, and sin 2πfct the quadrature carrier. In QAM, the transmitted information is carried by both the amplitudes of the in-phase carrier and the amplitudes of the quadrature carrier.

  4. Frequency-shift keying (FSK): the transmitted information is carried by different frequencies. FSK with M frequencies is called M-FSK.
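For instance, M-PSK places M points uniformly on the unit circle; a sketch using complex baseband symbols (my own illustration):

```python
import cmath
import math

def mpsk_symbol(m, M):
    """M-PSK: the information is carried entirely in the phase 2*pi*m/M."""
    return cmath.exp(2j * math.pi * m / M)

# 4-PSK (QPSK): four equally spaced phases, all with unit amplitude.
qpsk = [mpsk_symbol(m, 4) for m in range(4)]
```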

Channel Capacity

For any fixed bit SNR Eb/N0, the block error probability goes to 1 as n approaches infinity. In order to transmit information reliably over the BSC (binary symmetric channel), one has to employ channel encoders and decoders.

In order to achieve essentially error-free transmission, Eb/N0 must be ≥ −1.6 dB (the Shannon limit).

Any transmission rate Rb < C is achievable; any rate Rb > C is not achievable (Shannon's noisy channel coding theorem).



Chapter 9. Linear Block Codes

GF(p) = {0, 1, 2, ···, p−1}, where p is a prime. Addition and multiplication are carried out modulo p.

Primitive polynomial: an irreducible polynomial f(x) of degree m over GF(q) is called primitive if the smallest n such that f(x) divides x^n − 1 is n = q^m − 1.

Linear Block Codes

Generator matrix G

G can be used to directly encode the k information bits into a codeword C.


Parity check matrix H, satisfying HG^T = 0.

Systematic Code:

Ci = xiG, where xi = (xi1, xi2, …, xik) is the vector of k information bits. In the codeword C, the first k positions are the information bits; the remaining n−k positions are linear combinations of the first k and are called parity check bits.
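A concrete sketch with the systematic (7,4) Hamming code; this particular G/H pair is my own choice of a standard form, G = [I | P], H = [P^T | I]:

```python
# Systematic generator matrix of the (7,4) Hamming code: G = [I_4 | P].
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
# Parity check matrix H = [P^T | I_3], so that H G^T = 0 over GF(2).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def encode(x):
    """c = xG over GF(2): the first k bits are the information bits."""
    return [sum(xi * gi for xi, gi in zip(x, col)) % 2 for col in zip(*G)]

c = encode([1, 0, 1, 1])
syndrome = [sum(hi * ci for hi, ci in zip(row, c)) % 2 for row in H]
```

HG^T = 0 guarantees that every valid codeword has an all-zero syndrome.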

The minimum distance d is the smallest number of linearly dependent columns of H: every set of d−1 columns of H is linearly independent, while some d columns are linearly dependent.

If C is an [n, k, d] code, then d ≤ n − k + 1 (the Singleton bound). If d = n − k + 1, then the corresponding [n, k, d] code C is called maximum distance separable (MDS).

Cyclic Codes

A code is cyclic if every cyclic shift of any codeword is again a codeword. Example: C = {000, 110, 101, 011}.

Every codeword polynomial is a multiple of the generator polynomial g(x): C(x) = f(x)g(x).

The generator polynomial divides x^n − 1: x^n − 1 = g(x)h(x), where h(x) is the parity check polynomial.
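The example code {000, 110, 101, 011} can be checked for the cyclic property directly (small sketch of my own):

```python
def cyclic_shifts(word):
    """All cyclic shifts of a codeword given as a bit string."""
    return {word[i:] + word[:i] for i in range(len(word))}

C = {"000", "110", "101", "011"}
# C is cyclic iff every shift of every codeword lands back in C.
is_cyclic = all(cyclic_shifts(c) <= C for c in C)
```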

Chapter 10. Decoding of Linear Block Codes

// Soft-decision vs. hard-decision decoding not fully understood.

Optimum Soft Decision Decoding of Linear Block Codes

The channel encoder maps the k information bits m0,···,mk-1 into a codeword C =(Cn-1,···,C0).

The optimal decision rule is the minimum distance decision rule.

Hard Decision Decoding of Linear Block Codes

To reduce the computation burden, we can quantize the analog signals, and the decoding is performed digitally.

Minimum Hamming distance decoding rule: choose the codeword with the smallest Hamming distance to the received word (XOR the two vectors and count the 1s).

Error Detection & Correction Capability

Error detection capability: let C be a block code with length n and minimum Hamming distance d. C is capable of detecting up to d − 1 errors.

Error correction capability: when C is used only for error correction, C can correct up to ⌊(d − 1)/2⌋ errors.

Syndrome Decoding

In the slides, the box on the left is the error vector: a 1 marks each position that is in error. The syndrome is the product of the parity check matrix H and the received word, and it depends only on the error vector. The unboxed middle part is the codeword, which is apparently given in the problem. Computing d: every d − 1 columns of H are linearly independent, while some d columns are linearly dependent.
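A sketch of single-error syndrome decoding for the (7,4) Hamming code (H in a systematic form of my own choosing):

```python
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
n = 7

def syndrome(r):
    """s = H r^T over GF(2)."""
    return tuple(sum(h * ri for h, ri in zip(row, r)) % 2 for row in H)

# Syndrome table: map each single-bit error pattern to its syndrome.
table = {}
for i in range(n):
    e = [0] * n
    e[i] = 1
    table[syndrome(e)] = e

def correct(r):
    """Flip the bit indicated by the syndrome (zero syndrome: no error)."""
    s = syndrome(r)
    if s == (0, 0, 0):
        return r
    return [(ri + ei) % 2 for ri, ei in zip(r, table[s])]

# Corrupt bit 2 of the codeword 1011010 and recover it.
r = [1, 0, 0, 1, 0, 1, 0]
c_hat = correct(r)
```

This works because for a single-bit error the syndrome equals the corresponding column of H, and all columns of H are distinct and nonzero.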

Chapter 11. Concatenated Block Codes

Most of the codes discussed so far are designed for correcting and detecting random errors, and are not suitable for correcting burst errors → concatenated block codes.

Convolutional Codes

rate = k/n: k information bits in, n output bits.

The total number L of shift-register (SR) cells is called the constraint length of the convolutional code.

Memory: M = L − k.

// Weight distribution and transfer function not reviewed.

Optimal Decoding of Convolutional Codes: Viterbi Algorithm
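A compact sketch of both the encoder and the Viterbi decoder for the standard rate-1/2, constraint-length-3 code (generators 7 and 5 octal; the state and bit-ordering conventions here are my own):

```python
def conv_encode(bits, gens=(0b111, 0b101)):
    """Rate-1/2, constraint-length-3 convolutional encoder;
    state holds the previous two input bits."""
    state, out = 0, []
    for b in bits:
        reg = (b << 2) | state             # [b(t), b(t-1), b(t-2)]
        for g in gens:                     # one output bit per generator
            out.append(bin(reg & g).count("1") % 2)
        state = reg >> 1
    return out

def viterbi_decode(received, gens=(0b111, 0b101), n_states=4):
    """Viterbi: for each trellis state keep only the
    minimum-Hamming-distance surviving path."""
    INF = float("inf")
    cost = [0.0] + [INF] * (n_states - 1)  # start in the all-zero state
    paths = [[] for _ in range(n_states)]
    for t in range(0, len(received), 2):
        r = received[t:t + 2]
        new_cost = [INF] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            if cost[s] == INF:
                continue
            for b in (0, 1):               # try both possible input bits
                reg = (b << 2) | s
                out = [bin(reg & g).count("1") % 2 for g in gens]
                ns = reg >> 1
                c = cost[s] + sum(o != ri for o, ri in zip(out, r))
                if c < new_cost[ns]:       # keep the cheaper path into ns
                    new_cost[ns] = c
                    new_paths[ns] = paths[s] + [b]
        cost, paths = new_cost, new_paths
    best = min(range(n_states), key=lambda s: cost[s])
    return paths[best]

msg = [1, 0, 1, 1, 0, 0]
enc = conv_encode(msg)
noisy = enc[:]
noisy[0] ^= 1                              # one channel bit flipped
decoded = viterbi_decode(noisy)            # recovers msg despite the error
```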

The material after the Viterbi algorithm was also not reviewed.

This concludes the channel coding material.


Chapter 12. Cryptography

Stream cipher model: the encryption system.

Feedback shift registers (FSR); an LFSR is a linear FSR.

Pay attention to the positions of the high-order and low-order bits.

si = (ai, ai+1, …, ai+n−1), i = 0, 1, …, is the i-th state of the FSR.

Periodicity: for an ultimately periodic sequence, u is the number of bits before the periodic part begins and r is the least positive period.

e.g., 00 011 011 011 … is ultimately periodic with u = 2, r = 3.

Properties:

r ≤ q^n for an n-stage FSR. A sequence generated by an LFSR over F = GF(q) with period q^n − 1 is called a maximal-length sequence or m-sequence.
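For example, x^3 + x + 1 is primitive over GF(2), so a 3-stage LFSR with any nonzero seed generates an m-sequence of period 2^3 − 1 = 7. A Fibonacci-style sketch (the tap convention is my own assumption):

```python
def lfsr(taps, state, n_out):
    """Fibonacci LFSR over GF(2): emit state[0], then shift in the
    XOR of the tapped state positions as the feedback bit."""
    state = list(state)
    out = []
    for _ in range(n_out):
        out.append(state[0])
        fb = 0
        for t in taps:
            fb ^= state[t]
        state = state[1:] + [fb]
    return out

# Recurrence a(i+3) = a(i) + a(i+1), characteristic poly x^3 + x + 1.
seq = lfsr(taps=[0, 1], state=[1, 0, 0], n_out=14)
```

The output repeats with period exactly 7, visiting every nonzero state once per period.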

An n-stage LFSR corresponds to a characteristic polynomial f(x) of degree n; the rightmost shift-register cell corresponds to x^n.

Minimal polynomial (MP): irreducible. The degree of a sequence's minimal polynomial is called the linear span of a.

Primitive polynomial: every nonzero sequence generated by f(x) is an m-sequence, i.e., has period q^n − 1. When computing the period, note that the given input sequence and the shift-register cells of the FSR are in reverse order!

Cross-correlation between two sequences:

Also: the autocorrelation function.

Chapter 13. AES

AES: Advanced Encryption Standard

Not reviewed.

This concludes the cryptography material.


Loss functions and performance metrics: 1-2 exam questions.

  1. Digital communication block diagram, Design criteria of a digital communication System, Channel models, Review of probability theories
  2. Source coding: discrete sources, analog sources, entropy, joint entropy, conditional entropy, mutual information, lossless source code, prefix code, Lossless source coding theorem, Block memoryless coding theorem, Huffman coding, Arithmetic coding, Lempel-Ziv coding, lossy source coding, rate distortion function, lossy source coding theorem, uniform and nonuniform quantization, Lloyd algorithm, LBG algorithm, JPEG
  3. Markov chain, Maximum a posteriori probability (MAP) rule, Maximum likelihood decision rule, Minimum distance decision rule
  4. Sufficient statistics, Vector communication: Decision region, Waveform communication: signal space
  5. Digital modulation methods: PAM, MPSK, QAM, MFSK
  6. Channel capacity, Shannon noisy channel coding theorem
  7. Linear block codes: generator matrix, parity check matrix, systematic code, hamming distance
  8. Cyclic code: generator polynomial
  9. Decoding of linear block code: soft decision, hard decision, error correction and detection capability, syndrome decoding
  10. Convolutional code: different representation, Viterbi decoding
  11. Stream cipher model, LFSR (periodic property, characteristic polynomial, primitive polynomial, randomness criteria), AES