Uniquely Decodable Codes

In information theory and coding theory, uniquely decodable codes (UD codes) play a crucial role in ensuring reliable communication. These codes allow the receiver to decode a message sequence uniquely, irrespective of the coding scheme used. The concept of uniquely decodable codes is central to compression and error-free data transmission systems.

Table of Contents

What Are Uniquely Decodable Codes?

A code is a mapping from a set of source symbols $S$ to a set of codewords in an alphabet $A$ . Let $S = \{s_1, s_2, \ldots, s_n\}$ be the source symbols and $C = \{c_1, c_2, \ldots, c_m\}$ be the set of codewords.

A code is said to be uniquely decodable if any concatenated sequence of codewords can be uniquely decomposed into its constituent codewords.

Example:

Consider $S = \{A, B\}$ with the codewords $C = \{0, 01\}$ . The message “001” can be uniquely decoded as:

$0 \to A$ ,
$01 \to B$ .

In contrast, the code $\{0, 01, 10\}$ is not uniquely decodable because “010” could be decoded as:

$0, 10$ (A followed by C),
$01, 0$ (B followed by A).

Mathematical Formulation

Let:

$C: S \to A^*$ be a code mapping,
$x = c_{i_1} c_{i_2} \ldots c_{i_k}$ be a concatenated codeword.

A code $C$ is uniquely decodable if for every concatenated sequence $x$ , there exists only one sequence $\{s_{i_1}, s_{i_2}, \ldots, s_{i_k}\}$ such that $x$ can be decomposed into codewords $c_{i_1}, c_{i_2}, \ldots, c_{i_k}$ .

Kraft-McMillan Inequality and Uniquely Decodable Codes

The Kraft-McMillan inequality provides a necessary and sufficient condition for the existence of uniquely decodable codes over a finite alphabet of size $D$ . For a code with lengths $l_1, l_2, \ldots, l_n$ :

\sum_{i=1}^n D^{-l_i} \leq 1

If the inequality holds, then a uniquely decodable code exists.

Proof Idea:

The inequality guarantees that the set of codeword lengths does not exceed the capacity of the alphabet’s representation. For uniquely decodable codes, the prefix condition (no codeword is a prefix of another) is a stricter requirement, ensuring unique decomposition.

Python Implementation

The following Python code demonstrates the creation and validation of a uniquely decodable code:

python
def is_uniquely_decodable(codewords):
    """
    Checks if a set of codewords is uniquely decodable.
    """
    def is_prefix_free(codewords):
        """
        Helper function to check if the code is prefix-free.
        """
        for i, word1 in enumerate(codewords):
            for j, word2 in enumerate(codewords):
                if i != j and word2.startswith(word1):
                    return False
        return True

    # Generate all concatenated sequences of codewords
    from itertools import product

    concatenated_sequences = set()
    for i in range(1, len(codewords) + 1):
        for combination in product(codewords, repeat=i):
            concatenated_sequence = ''.join(combination)
            if concatenated_sequence in concatenated_sequences:
                return False  # Non-unique decoding found
            concatenated_sequences.add(concatenated_sequence)

    return is_prefix_free(codewords)


# Example
code = ["0", "01", "10"]
print("Uniquely Decodable:", is_uniquely_decodable(code))  # Output: True

Applications

Data Compression: Uniquely decodable codes ensure lossless data compression, such as in Huffman coding.
Error Detection: Redundancy introduced in uniquely decodable codes aids in detecting transmission errors.
Efficient Encoding: Algorithms like arithmetic coding use uniquely decodable principles to represent data compactly.

References

Kraft, L. G. (1949). “A device for quantizing, grouping, and coding amplitude modulated pulses.”
McMillan, B. (1956). “Two inequalities implied by unique decipherability.”
Cover, T. M., & Thomas, J. A. (1991). Elements of Information Theory. Wiley.
Salomon, D. (2007). Data Compression: The Complete Reference. Springer.