In coding theory, decoding is the process of translating received messages into codewords of a given code. There have been many common methods of mapping messages to codewords. These are often used to recover messages sent over a noisy channel, such as a binary symmetric channel.
Notation
C
⊂
F
2
n
{\displaystyle C\subset \mathbb {F} _{2}^{n}}
is considered a binary code with the length
n
{\displaystyle n}
;
x
,
y
{\displaystyle x,y}
shall be elements of
F
2
n
{\displaystyle \mathbb {F} _{2}^{n}}
; and
d
(
x
,
y
)
{\displaystyle d(x,y)}
is the distance between those elements.
Ideal observer decoding
One may be given the message
x
∈
F
2
n
{\displaystyle x\in \mathbb {F} _{2}^{n}}
, then ideal observer decoding generates the codeword
y
∈
C
{\displaystyle y\in C}
. The process results in this solution:
-
P
(
y
sent
∣
x
received
)
{\displaystyle \mathbb {P} (y{\mbox{ sent}}\mid x{\mbox{ received}})}
For example, a person can choose the codeword
y
{\displaystyle y}
that is most likely to be received as the message
x
{\displaystyle x}
after transmission.
Decoding conventions
Each codeword does not have an expected possibility: there may be more than one codeword with an equal likelihood of mutating into the received message. In such a case, the sender and receiver(s) must agree ahead of time on a decoding convention. Popular conventions include:
- Request that the codeword be resent – automatic repeat-request.
- Choose any random codeword from the set of most likely codewords which is nearer to that.
- If another code follows, mark the ambiguous bits of the codeword as erasures and hope that the outer code disambiguates them
- Report a decoding failure to the system
Maximum likelihood decoding
Given a received vector
x
∈
F
2
n
{\displaystyle x\in \mathbb {F} _{2}^{n}}
maximum likelihood decoding picks a codeword
y
∈
C
{\displaystyle y\in C}
that maximizes
-
P
(
x
received
∣
y
sent
)
{\displaystyle \mathbb {P} (x{\mbox{ received}}\mid y{\mbox{ sent}})}
,
that is, the codeword
y
{\displaystyle y}
that maximizes the probability that
x
{\displaystyle x}
was received, given that
y
{\displaystyle y}
was sent. If all codewords are equally likely to be sent then this scheme is equivalent to ideal observer decoding.
In fact, by Bayes' theorem,
-
P
(
x
received
∣
y
sent
)
=
P
(
x
received
,
y
sent
)
P
(
y
sent
)
=
P
(
y
sent
∣
x
received
)
⋅
P
(
x
received
)
P
(
y
sent
)
.
{\displaystyle {\begin{aligned}\mathbb {P} (x{\mbox{ received}}\mid y{\mbox{ sent}})&{}={\frac {\mathbb {P} (x{\mbox{ received}},y{\mbox{ sent}})}{\mathbb {P} (y{\mbox{ sent}})}}\\&{}=\mathbb {P} (y{\mbox{ sent}}\mid x{\mbox{ received}})\cdot {\frac {\mathbb {P} (x{\mbox{ received}})}{\mathbb {P} (y{\mbox{ sent}})}}.\end{aligned}}}
Upon fixing
P
(
x
received
)
{\displaystyle \mathbb {P} (x{\mbox{ received}})}
,
x
{\displaystyle x}
is restructured and
P
(
y
sent
)
{\displaystyle \mathbb {P} (y{\mbox{ sent}})}
is constant as all codewords are equally likely to be sent.
Therefore,
P
(
x
received
∣
y
sent
)
{\displaystyle \mathbb {P} (x{\mbox{ received}}\mid y{\mbox{ sent}})}
is maximised as a function of the variable
y
{\displaystyle y}
precisely when
P
(
y
sent
∣
x
received
)
{\displaystyle \mathbb {P} (y{\mbox{ sent}}\mid x{\mbox{ received}})}
is maximised, and the claim follows.
As with ideal observer decoding, a convention must be agreed to for non-unique decoding.
The maximum likelihood decoding problem can also be modeled as an integer programming problem.[1]
The maximum likelihood decoding algorithm is an instance of the "marginalize a product function" problem which is solved by applying the generalized distributive law.[2]
Minimum distance decoding
Given a received vector
x
∈
F
2
n
{\displaystyle x\in \mathbb {F} _{2}^{n}}
, minimum distance decoding picks a codeword
y
∈
C
{\displaystyle y\in C}
to minimise the Hamming distance:
-
d
(
x
,
y
)
=
|
{
i
:
x
i
≠
y
i
}
|
{\displaystyle d(x,y)=|\{i:x_{i}\not =y_{i}\}|}
i.e. choose the codeword
y
{\displaystyle y}
that is as close as possible to
x
{\displaystyle x}
.
Note that if the probability of error on a discrete memoryless channel
p
{\displaystyle p}
is strictly less than one half, then minimum distance decoding is equivalent to maximum likelihood decoding, since if
-
d
(
x
,
y
)
=
d
,
{\displaystyle d(x,y)=d,\,}
then:
-
P
(
y
received
∣
x
sent
)
=
(
1
−
p
)
n
−
d
⋅
p
d
=
(
1
−
p
)
n
⋅
(
p
1
−
p
)
d
{\displaystyle {\begin{aligned}\mathbb {P} (y{\mbox{ received}}\mid x{\mbox{ sent}})&{}=(1-p)^{n-d}\cdot p^{d}\\&{}=(1-p)^{n}\cdot \left({\frac {p}{1-p}}\right)^{d}\\\end{aligned}}}
which (since p is less than one half) is maximised by minimising d.
Minimum distance decoding is also known as nearest neighbour decoding. It can be assisted or automated by using a standard array. Minimum distance decoding is a reasonable decoding method when the following conditions are met:
- The probability
p
{\displaystyle p}
that an error occurs is independent of the position of the symbol.
- Errors are independent events – an error at one position in the message does not affect other positions.
- The probability
p
{\displaystyle p}
These assumptions may be reasonable for transmissions over a binary symmetric channel. They may be unreasonable for other media, such as a DVD, where a single scratch on the disk can cause an error in many neighbouring symbols or codewords.
As with other decoding methods, a convention must be agreed to for non-unique decoding.
Syndrome decoding
Syndrome decoding is a highly efficient method of decoding a linear code over a noisy channel, i.e. one on which errors are made. In essence, syndrome decoding is minimum distance decoding using a reduced lookup table. This is allowed by the linearity of the code.[3]
Suppose that
C
⊂
F
2
n
{\displaystyle C\subset \mathbb {F} _{2}^{n}}
is a linear code of length
n
{\displaystyle n}
and minimum distance
d
{\displaystyle d}
with parity-check matrix
H
{\displaystyle H}
. Then clearly
C
{\displaystyle C}
is capable of correcting up to
-
t
=
⌊
d
−
1
2
⌋
{\displaystyle t=\left\lfloor {\frac {d-1}{2}}\right\rfloor }
errors made by the channel (since if no more than
t
{\displaystyle t}
errors are made then minimum distance decoding will still correctly decode the incorrectly transmitted codeword).
Now suppose that a codeword
x
∈
F
2
n
{\displaystyle x\in \mathbb {F} _{2}^{n}}
is sent over the channel and the error pattern
e
∈
F
2
n
{\displaystyle e\in \mathbb {F} _{2}^{n}}
occurs. Then
z
=
x
+
e
{\displaystyle z=x+e}
is received. Ordinary minimum distance decoding would lookup the vector
z
{\displaystyle z}
in a table of size
|
C
|
{\displaystyle |C|}
for the nearest match - i.e. an element (not necessarily unique)
c
∈
C
{\displaystyle c\in C}
with
-
d
(
c
,
z
)
≤
d
(
y
,
z
)
{\displaystyle d(c,z)\leq d(y,z)}
for all
y
∈
C
{\displaystyle y\in C}
. Syndrome decoding takes advantage of the property of the parity matrix that:
-
H
x
=
0
{\displaystyle Hx=0}
for all
x
∈
C
{\displaystyle x\in C}
. The syndrome of the received
z
=
x
+
e
{\displaystyle z=x+e}
is defined to be:
-
H
z
=
H
(
x
+
e
)
=
H
x
+
H
e
=
0
+
H
e
=
H
e
{\displaystyle Hz=H(x+e)=Hx+He=0+He=He}
To perform ML decoding in a binary symmetric channel, one has to look-up a precomputed table of size
2
n
−
k
{\displaystyle 2^{n-k}}
, mapping
H
e
{\displaystyle He}
to
e
{\displaystyle e}
.
Note that this is already of significantly less complexity than that of a standard array decoding.
However, under the assumption that no more than
t
{\displaystyle t}
errors were made during transmission, the receiver can look up the value
H
e
{\displaystyle He}
in a further reduced table of size
-
∑
i
=
0
t
(
n
i
)
{\displaystyle {\begin{matrix}\sum _{i=0}^{t}{\binom {n}{i}}\\\end{matrix}}}
List decoding
Information set decoding
This is a family of Las Vegas-probabilistic methods all based on the observation that it is easier to guess enough error-free positions, than it is to guess all the error-positions.
The simplest form is due to Prange: Let
G
{\displaystyle G}
be the
k
×
n
{\displaystyle k\times n}
generator matrix of
C
{\displaystyle C}
used for encoding. Select
k
{\displaystyle k}
columns of
G
{\displaystyle G}
at random, and denote by
G
′
{\displaystyle G'}
the corresponding submatrix of
G
{\displaystyle G}
. With reasonable probability
G
′
{\displaystyle G'}
will have full rank, which means that if we let
c
′
{\displaystyle c'}
be the sub-vector for the corresponding positions of any codeword
c
=
m
G
{\displaystyle c=mG}
of
C
{\displaystyle C}
for a message
m
{\displaystyle m}
, we can recover
m
{\displaystyle m}
as
m
=
c
′
G
′
−
1
{\displaystyle m=c'G'^{-1}}
. Hence, if we were lucky that these
k
{\displaystyle k}
positions of the received word
y
{\displaystyle y}
contained no errors, and hence equalled the positions of the sent codeword, then we may decode.
If
t
{\displaystyle t}
errors occurred, the probability of such a fortunate selection of columns is given by
(
n
−
t
k
)
/
(
n
k
)
≈
exp
(
−
t
k
/
n
)
{\displaystyle \textstyle {\binom {n-t}{k}}/{\binom {n}{k}}\approx \exp(-tk/n)}
.
This method has been improved in various ways, e.g. by Stern[4] and Canteaut and Sendrier.[5]
Partial response maximum likelihood
Partial response maximum likelihood (PRML) is a method for converting the weak analog signal from the head of a magnetic disk or tape drive into a digital signal.
Viterbi decoder
A Viterbi decoder uses the Viterbi algorithm for decoding a bitstream that has been encoded using forward error correction based on a convolutional code. The Hamming distance is used as a metric for hard decision Viterbi decoders. The squared Euclidean distance is used as a metric for soft decision decoders.
Optimal decision decoding algorithm (ODDA)
Optimal decision decoding algorithm (ODDA) for an asymmetric TWRC system.[6]
See also
References
- Feldman, Jon; Wainwright, Martin J.; Karger, David R. (March 2005). "Using Linear Programming to Decode Binary Linear Codes". IEEE Transactions on Information Theory. 51 (3): 954–972. CiteSeerX 10.1.1.111.6585. doi:10.1109/TIT.2004.842696. S2CID 3120399.
- Aji, Srinivas M.; McEliece, Robert J. (March 2000). "The Generalized Distributive Law" (PDF). IEEE Transactions on Information Theory. 46 (2): 325–343. doi:10.1109/18.825794.
- Beutelspacher, Albrecht; Rosenbaum, Ute (1998). Projective Geometry. Cambridge University Press. p. 190. ISBN 0-521-48277-1.
- Stern, Jacques (1989). "A method for finding codewords of small weight". Coding Theory and Applications. Lecture Notes in Computer Science. Vol. 388. Springer-Verlag. pp. 106–113. doi:10.1007/BFb0019850. ISBN 978-3-540-51643-9.
- Ohta, Kazuo; Pei, Dingyi, eds. (1998). Advances in Cryptology — ASIACRYPT'98. Lecture Notes in Computer Science. Vol. 1514. pp. 187–199. doi:10.1007/3-540-49649-1. ISBN 978-3-540-65109-3. S2CID 37257901.
- Siamack Ghadimi (2020), Optimal decision decoding algorithm (ODDA) for an asymmetric TWRC system;, Universal Journal of Electrical and Electronic Engineering
Further reading
- Hill, Raymond (1986). A first course in coding theory. Oxford Applied Mathematics and Computing Science Series. Oxford University Press. ISBN 978-0-19-853803-5.
- Pless, Vera (1982). Introduction to the theory of error-correcting codes. Wiley-Interscience Series in Discrete Mathematics. John Wiley & Sons. ISBN 978-0-471-08684-0.
- van Lint, Jacobus H. (1992). Introduction to Coding Theory. Graduate Texts in Mathematics (GTM). Vol. 86 (2 ed.). Springer-Verlag. ISBN 978-3-540-54894-2.