Information Theory

Shannon Entropy

Shannon Entropy H(X) of a random variable X with distribution P is the expected amount of information in an event drawn from P. It gives a lower bound on the average number of bits needed to encode symbols drawn from P.
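
As a rough illustration of the definition, the sketch below computes H(X) = -sum over x of P(x) log2 P(x) for a discrete distribution given as a list of probabilities. The function name shannon_entropy and the choice of base-2 logarithms (so the result is in bits) are assumptions made for this example, not something fixed by the text above.

```python
import math


def shannon_entropy(probs):
    """Entropy in bits of a discrete distribution given as probabilities.

    Hypothetical helper for illustration; assumes probs sums to 1.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p == 0 are skipped, following the convention 0 * log 0 = 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)


fair_coin = shannon_entropy([0.5, 0.5])    # maximally uncertain two-outcome case
biased_coin = shannon_entropy([0.9, 0.1])  # more predictable, so lower entropy
```

A fair coin needs one bit per toss on average, while the biased coin needs less, which matches the reading of entropy as a lower bound on average code length.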


The Shannon Entropy measures:

Kullback-Leibler Divergence

Kullback-Leibler Divergence KL is a measure of the dissimilarity between two probability distributions P and Q. It is also known as the Relative Entropy of P with respect to Q.
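
A minimal sketch of the discrete case, KL(P || Q) = sum over x of P(x) log2 (P(x) / Q(x)), is given below. The function name kl_divergence, the list-of-probabilities representation, and the use of base-2 logarithms are assumptions for this example. Returning infinity when Q assigns zero probability to an outcome that P allows is the usual convention.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in bits for two discrete distributions over the same outcomes.

    Hypothetical helper for illustration; p and q are aligned probability lists.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # 0 * log(0 / q) is taken to be 0
        if qi == 0:
            return math.inf  # P allows an outcome that Q rules out
        total += pi * math.log2(pi / qi)
    return total


p = [0.5, 0.5]
q = [0.9, 0.1]
forward = kl_divergence(p, q)
reverse = kl_divergence(q, p)
```

The two directions generally give different values, so KL is a divergence rather than a true distance: it is not symmetric in P and Q.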


Mutual Information

Mutual Information MI between two random variables X and Y measures the dissimilarity between the joint distribution p(X,Y) and the factored distribution p(X)p(Y). Equivalently, it measures the reduction in uncertainty about one variable when the value of the other is known.
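
The comparison between p(X,Y) and p(X)p(Y) can be sketched for the discrete case as MI = sum over x,y of p(x,y) log2 (p(x,y) / (p(x) p(y))). In the example below, the function name mutual_information and the nested-list joint table (rows index X, columns index Y) are assumptions for illustration.

```python
import math


def mutual_information(joint):
    """Mutual information in bits from a joint probability table.

    Hypothetical helper for illustration; joint[i][j] = p(X=i, Y=j).
    """
    # Marginals: sum each row for p(X), each column for p(Y).
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # zero-probability cells contribute nothing
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi


independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])
identical = mutual_information([[0.5, 0.0], [0.0, 0.5]])
```

When X and Y are independent the joint equals the factored distribution and the result is zero. When Y is a copy of a fair binary X, knowing one removes all uncertainty about the other and the result is one bit.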


Information Gain

Information Gain IG measures the reduction in entropy, or surprise, obtained by splitting a dataset according to the value of a random variable.
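
One common way to make this concrete, as used for decision-tree splits, is to subtract the size-weighted entropy of the subsets from the entropy of the labels before the split. The sketch below follows that reading. The names entropy and information_gain, and the toy label and feature lists, are assumptions for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy in bits of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, feature_values):
    """Entropy of labels minus the weighted entropy after splitting by feature.

    Hypothetical helper for illustration; the two lists are aligned row by row.
    """
    n = len(labels)
    # Group the labels by the feature value of the row they belong to.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


labels = ["yes", "yes", "no", "no"]
useful_split = information_gain(labels, ["a", "a", "b", "b"])
useless_split = information_gain(labels, ["a", "b", "a", "b"])
```

The first feature separates the labels perfectly, so the split removes all of the entropy. The second leaves each subset as mixed as the original, so nothing is gained.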
