This is the second in a series of short articles on information theory.

## Conditional Entropy

**Conditional entropy** quantifies the amount of information needed to describe the outcome of a random variable Y given that the value of another random variable X is known.

The entropy of Y conditioned on X is written as H(Y|X).
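To make the definition concrete, here is a minimal sketch that computes H(Y|X) directly from a joint distribution, using the standard formula H(Y|X) = -Σ p(x,y) log2(p(x,y) / p(x)). The function name `conditional_entropy`, the dictionary layout, and the example distribution are assumptions made for illustration only.

```python
import math

def conditional_entropy(joint):
    """H(Y|X) in bits for a joint distribution given as {(x, y): p(x, y)}."""
    # Marginal p(x), obtained by summing the joint probabilities over y.
    p_x = {}
    for (x, _y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
    # H(Y|X) = -sum over (x, y) of p(x, y) * log2(p(x, y) / p(x)).
    # Outcomes with zero probability contribute nothing and are skipped.
    return -sum(
        p * math.log2(p / p_x[x])
        for (x, _y), p in joint.items()
        if p > 0
    )

# Hypothetical example: X is a fair coin. When X = 0, Y is always 0;
# when X = 1, Y is itself a fair coin.
joint = {(0, 0): 0.5, (1, 0): 0.25, (1, 1): 0.25}
print(conditional_entropy(joint))  # 0.5
```

Half of the time (X = 0) the value of Y is already determined, and the other half it takes one full bit to describe, so on average 0.5 bits are needed once X is known.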

## Chain rule

Assume that the combined system determined by two random variables X and Y has joint entropy H(X,Y), that is, we need H(X,Y) bits of information to describe its exact state. Now if we first learn the value of X, we have gained H(X) bits of information. Once X is known, we only need H(X,Y) - H(X) bits to describe the state of the whole system. This quantity is exactly H(Y|X), which gives the chain rule of conditional entropy:

H(Y|X) = H(X,Y) - H(X)
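The chain rule can be checked numerically. The sketch below assumes a small hypothetical joint distribution over two binary variables and compares H(X,Y) - H(X) with the conditional entropy computed as the average of H(Y|X=x) over x; the helper `entropy` and all variable names are illustrative.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution p(x, y) over two binary variables.
joint = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.25, (1, 1): 0.25}

# Marginal p(x), obtained by summing the joint over y.
p_x = {}
for (x, _y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p

h_xy = entropy(joint.values())  # H(X,Y)
h_x = entropy(p_x.values())     # H(X)
chain_rule = h_xy - h_x         # H(Y|X) via the chain rule

# H(Y|X) computed directly as the p(x)-weighted average of H(Y|X=x).
direct = sum(
    px * entropy(joint[(x, y)] / px for y in (0, 1))
    for x, px in p_x.items()
)

print(h_xy, h_x, chain_rule, direct)
```

For this distribution H(X,Y) is 1.5 bits and H(X) is 1 bit, so both routes give H(Y|X) = 0.5 bits.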

## Joint entropy

**Joint entropy** is a measure of the uncertainty associated with a set of variables.

**Properties:**

- The joint entropy of a set of variables is **greater than** or equal to each of the **individual entropies** of the variables in the set.

- The joint entropy of a set of variables is **less than** or equal to the sum of the **individual entropies** of the variables in the set.
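Both properties can be illustrated on a small example. The following sketch assumes a hypothetical joint distribution of two binary variables and checks that max(H(X), H(Y)) ≤ H(X,Y) ≤ H(X) + H(Y); the helpers `entropy` and `marginal` are names chosen for this illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution of the variable at position `axis` of each outcome."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[axis]] = out.get(outcome[axis], 0.0) + p
    return out

# Hypothetical joint distribution p(x, y); missing pairs have probability 0.
joint = {(0, 0): 0.5, (1, 0): 0.25, (1, 1): 0.25}

h_x = entropy(marginal(joint, 0).values())
h_y = entropy(marginal(joint, 1).values())
h_xy = entropy(joint.values())

lower_ok = h_xy >= max(h_x, h_y)  # joint entropy is at least each individual entropy
upper_ok = h_xy <= h_x + h_y      # joint entropy is at most the sum of individual entropies
print(h_x, h_y, h_xy, lower_ok, upper_ok)
```

The upper bound holds with equality only when the variables are independent; in this example X and Y are dependent, so H(X,Y) is strictly smaller than H(X) + H(Y).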

Thus, the formula for conditional entropy is:

H(Y|X) = H(X,Y) - H(X)

## References:

**Arndt C.** Information Measures: Information and its Description in Science and Engineering.

**Thomas Cover.** Elements Of Information Theory.
