The Kraft-McMillan inequality is a foundational result in information theory that establishes a crucial constraint on the lengths of codewords in uniquely decodable codes. Developed independently by Leon Kraft in 1949 and Brockway McMillan in 1956, the inequality is essential for understanding how information can be efficiently encoded while ensuring that no ambiguities arise during decoding.
In essence, the Kraft-McMillan inequality provides a mathematical condition that must be satisfied by any prefix-free or uniquely decodable code. A code is prefix-free if no codeword is a prefix of another, and it is uniquely decodable if encoded messages can be uniquely reconstructed without ambiguity. The inequality states that for a code over an alphabet of size $D$ with codeword lengths $\ell_1, \ell_2, \dots, \ell_n$ (where $\ell_i$ is the length of the $i$-th codeword), the sum $\sum_{i=1}^{n} D^{-\ell_i}$ must not exceed 1.
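As a quick worked example with a binary alphabet, the codeword lengths $\{1, 2, 3, 3\}$ satisfy the inequality with equality, while the lengths $\{1, 1, 2\}$ violate it:

$$2^{-1} + 2^{-2} + 2^{-3} + 2^{-3} = \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \tfrac{1}{8} = 1, \qquad 2^{-1} + 2^{-1} + 2^{-2} = \tfrac{5}{4} > 1,$$

so no uniquely decodable binary code can have lengths $\{1, 1, 2\}$.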
This result has profound implications for data compression, enabling the design of optimal coding schemes, such as Huffman codes, which achieve the minimum average codeword length for a given probability distribution. The Kraft-McMillan inequality also forms the theoretical foundation for more advanced topics like entropy, channel coding, and the efficiency of communication systems.
Understanding this inequality is not just critical for theoretical insights but also for practical applications in areas such as file compression, error correction, and telecommunications.
1. Introduction
Efficient encoding of information relies on the ability to assign codewords to messages in a manner that ensures unique decodability. Uniquely decodable codes are a broad class of codes in which no sequence of codewords can be confused with another. Within this class lie prefix codes, a subset in which no codeword is a prefix of another.
The Kraft-McMillan inequality provides:
- A necessary condition for the lengths of codewords in uniquely decodable codes.
- A constructive proof that any set of lengths satisfying the inequality can be realized by a prefix code.
2. The Kraft-McMillan Inequality
Theorem
Let $C$ be a binary code with $n$ codewords of lengths $\ell_1, \ell_2, \dots, \ell_n$. If $C$ is uniquely decodable, then:

$$\sum_{i=1}^{n} 2^{-\ell_i} \le 1.$$
Interpretation
- Necessary Condition: For any uniquely decodable code, the inequality must hold.
- Sufficient Condition: If the inequality is satisfied, a prefix code can always be constructed with the same codeword lengths.
3. Proof of Necessity
Key Idea
If $S = \sum_{i=1}^{n} 2^{-\ell_i} > 1$, the power $S^k$ grows exponentially with $k$. However, unique decodability forces $S^k$ to grow at most linearly in $k$, because sequences of $k$ codewords must map to distinct binary strings, which bounds how many combinations can share each total length in the encoding tree.
Proof
- Expression for $S^k$: Let $S = \sum_{i=1}^{n} 2^{-\ell_i}$ and consider

$$S^k = \left( \sum_{i=1}^{n} 2^{-\ell_i} \right)^k.$$

Expanding the $k$-fold product:

$$S^k = \sum_{i_1=1}^{n} \cdots \sum_{i_k=1}^{n} 2^{-(\ell_{i_1} + \ell_{i_2} + \cdots + \ell_{i_k})}.$$

Here, $m = \ell_{i_1} + \ell_{i_2} + \cdots + \ell_{i_k}$ represents the total length of a $k$-codeword sequence.

- Bounds on Total Lengths:
  - The smallest total length is $k$, achievable if every codeword length is 1.
  - The largest total length is $k\ell_{\max}$, where $\ell_{\max} = \max_{1 \le i \le n} \ell_i$.

Therefore, the sum can be expressed as:

$$S^k = \sum_{m=k}^{k\ell_{\max}} N_m \, 2^{-m},$$

where $N_m$ is the number of combinations of $k$ codewords whose total length equals $m$.

- Bound on $N_m$: The number of distinct binary sequences of length $m$ is $2^m$. If the code is uniquely decodable, no two sequences of codewords can map to the same binary sequence. Hence:

$$N_m \le 2^m.$$

- Simplification: Using $N_m \le 2^m$:

$$S^k \le \sum_{m=k}^{k\ell_{\max}} 2^m \cdot 2^{-m} = k\ell_{\max} - k + 1 \le k\ell_{\max}.$$

- Growth Contradiction: If $S > 1$, then $S^k$ grows exponentially with $k$, while the bound $k\ell_{\max}$ grows only linearly. For sufficiently large $k$, $S^k > k\ell_{\max}$, a contradiction.

Therefore, $\sum_{i=1}^{n} 2^{-\ell_i} \le 1$ for any uniquely decodable code.
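The growth contradiction can also be checked numerically. A minimal sketch (variable names are illustrative): with binary codeword lengths $\{1, 1, 2\}$, the Kraft sum is $S = 1.25 > 1$ and the maximum length is 2, so the exponential $S^k$ must eventually overtake the linear bound $k \cdot \ell_{\max}$.

```python
# Numeric illustration of the growth contradiction: if the Kraft sum S
# exceeds 1, then S**k eventually exceeds the linear bound k * l_max,
# so no uniquely decodable code can have these codeword lengths.
lengths = [1, 1, 2]                  # candidate lengths that violate the inequality
S = sum(2 ** -l for l in lengths)    # Kraft sum: 0.5 + 0.5 + 0.25 = 1.25
l_max = max(lengths)

# Find the first k at which the contradiction S**k > k * l_max appears.
k = 1
while S ** k <= k * l_max:
    k += 1

print(f"S = {S}, contradiction first appears at k = {k}")
```

The loop terminates for any $S > 1$ because an exponential eventually dominates any linear function; for $S \le 1$ it would never exit, matching the fact that no contradiction arises.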
4. Proof of Sufficiency
Theorem
Given a set of positive integers $\ell_1, \ell_2, \dots, \ell_n$ satisfying:

$$\sum_{i=1}^{n} 2^{-\ell_i} \le 1,$$

a prefix code can always be constructed with these lengths.
Proof by Construction
Assume Sorted Lengths: Without loss of generality, let $\ell_1 \le \ell_2 \le \cdots \le \ell_n$.

Construct Codewords: Define a sequence of integers $w_1, w_2, \dots, w_n$ as follows:

$$w_1 = 0, \qquad w_j = \sum_{i=1}^{j-1} 2^{\ell_j - \ell_i} \quad (j \ge 2).$$

The Kraft inequality guarantees $w_j < 2^{\ell_j}$, since $w_j \, 2^{-\ell_j} = \sum_{i=1}^{j-1} 2^{-\ell_i} < 1$, so each $w_j$ fits in $\ell_j$ bits.

Binary Representation:
- Convert each $w_j$ to its binary representation.
- Pad with leading zeros to ensure the length is exactly $\ell_j$; call the result $c_j$.

Prefix Property: To prove the prefix property, consider any $i < j$. The first $\ell_i$ bits of $c_j$ represent the integer $\lfloor w_j / 2^{\ell_j - \ell_i} \rfloor \ge \sum_{m=1}^{i} 2^{\ell_i - \ell_m} = w_i + 1 > w_i$, while $c_i$ represents $w_i$. Hence $c_j$ cannot start with the bits of $c_i$. Therefore, the code is a prefix code.
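To illustrate the construction, take the lengths $(\ell_1, \ell_2, \ell_3, \ell_4) = (1, 2, 3, 3)$, set $w_j = \sum_{i<j} 2^{\ell_j - \ell_i}$, and write each $w_j$ in exactly $\ell_j$ bits:

$$w_1 = 0 \mapsto 0, \quad w_2 = 2^{2-1} = 2 \mapsto 10, \quad w_3 = 2^{3-1} + 2^{3-2} = 6 \mapsto 110, \quad w_4 = 6 + 2^{3-3} = 7 \mapsto 111.$$

The resulting code $\{0, 10, 110, 111\}$ is prefix-free, as the theorem guarantees.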
5. Applications
5.1 Data Compression
The inequality underpins optimal compression techniques like Huffman coding, ensuring that the codewords generated are both efficient and decodable.
5.2 Uniquely Decodable Codes
The inequality provides a bridge between uniquely decodable codes and prefix codes: any set of codeword lengths achievable by a uniquely decodable code is also achievable by a prefix code, so restricting attention to prefix codes sacrifices nothing in the design space.
6. Python Implementation
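A minimal sketch is given below (the function names `kraft_sum` and `construct_prefix_code` are illustrative, not from any library). It checks the inequality and builds a binary prefix code following the constructive proof of Section 4.

```python
def kraft_sum(lengths, base=2):
    """Kraft sum: sum of base**(-l) over the codeword lengths."""
    return sum(base ** -l for l in lengths)


def construct_prefix_code(lengths):
    """Build a binary prefix code with the given codeword lengths.

    Follows the constructive proof: sort the lengths, then assign the
    j-th codeword the integer w_j = sum_{i<j} 2**(l_j - l_i), written
    in binary and padded with leading zeros to exactly l_j bits.
    Raises ValueError if the lengths violate the inequality.
    """
    if kraft_sum(lengths) > 1:
        raise ValueError("lengths violate the Kraft-McMillan inequality")
    ls = sorted(lengths)
    codewords = []
    for j, lj in enumerate(ls):
        w = sum(2 ** (lj - ls[i]) for i in range(j))
        codewords.append(format(w, f"0{lj}b"))  # binary, zero-padded to lj bits
    return codewords


if __name__ == "__main__":
    print(construct_prefix_code([3, 1, 2, 3]))  # ['0', '10', '110', '111']
```

For the lengths $\{1, 2, 3, 3\}$ the Kraft sum is exactly 1 and the construction reproduces the code from Section 4; for $\{1, 1, 2\}$ the sum is $1.25$ and a `ValueError` is raised.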
References
- Cover, T. M., and Thomas, J. A. *Elements of Information Theory*, which provides a comprehensive explanation of the Kraft-McMillan inequality and its applications.
- McMillan, B. (1956). "Two Inequalities Implied by Unique Decipherability," which introduced the mathematical foundation of the inequality.
- Kraft, L. G. (1949). *A Device for Quantizing, Grouping, and Coding Amplitude Modulated Pulses*, which formalized the original Kraft inequality.
- van Lint, J. H. *Introduction to Coding Theory*, discussing the mathematical basis of uniquely decodable codes and prefix codes.
- MIT OpenCourseWare on Information Theory (https://ocw.mit.edu), which offers free lectures and notes on the topic.
- NPTEL Online Course on Information Theory and Coding (https://nptel.ac.in), which includes detailed discussions and examples of the Kraft-McMillan inequality.