Notations-1
To keep things consistent for the readers (and for my own sanity), this article serves as a reference that introduces the notation used in the following articles:
- Self Information
- Entropy
- Joint, Conditional and Marginal Entropy
- Mutual Information
- Kullback–Leibler Divergence
The common notations relate to random variables, probability density functions (PDFs), and probability mass functions (PMFs).
Notation | Symbol |
---|---|
Random Variable | $X$ |
Sample Value of a Random Variable | $x$ |
Set of Possible Sample Values of $X$ | $\mathcal{X}$ |
Probability Distribution (Both PDF and PMF) | $p(x)$ |
Probability Mass Function (PMF) | $P(x)$ |
Probability Density Function (PDF) | $f(x)$ |
Expectation over Distribution $p$ | $\mathbb{E}_{x \sim p(x)}[\,\cdot\,]$ |
The notation $x \sim p(x)$ indicates that the sample $x$ is drawn from the distribution $p(x)$.
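As a quick sanity check on how these symbols combine, here is a minimal sketch of the expectation notation, assuming the conventional choices in the table above and using $g$ as an arbitrary placeholder function of my own:

$$
\mathbb{E}_{x \sim p(x)}[g(x)] = \sum_{x \in \mathcal{X}} P(x)\, g(x)
\qquad \text{or} \qquad
\mathbb{E}_{x \sim p(x)}[g(x)] = \int_{\mathcal{X}} g(x)\, f(x)\, \mathrm{d}x,
$$

depending on whether $X$ is discrete (PMF $P$) or continuous (PDF $f$).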
Other Notations
- $\log$: Assume the natural logarithm, i.e. base-$e$. Most resources you will come across for these topics tend to use the base-$2$ logarithm, but in the machine learning domain it is common practice to use the base-$e$ logarithm.
- $I(X; Y)$: You can find more information about using a semicolon here. I might as well have used $I(X, Y)$ instead.
- KL Divergence: Even though it depends on the context of what I am talking about, in general, consider $p$ as the true/target distribution which we are trying to approximate and $q$ as the parameterized distribution; the sketch after this list makes this convention concrete. Most of these conventions are only applicable when talking about variational inference.
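To make the last two conventions concrete, a short sketch under the assumptions above (natural logarithm, discrete $X$); the change-of-base identity is only stated as a reminder:

$$
\log_2 x = \frac{\ln x}{\ln 2},
\qquad
D_{\mathrm{KL}}(p \,\|\, q) = \sum_{x \in \mathcal{X}} p(x) \log \frac{p(x)}{q(x)},
$$

where $p$ plays the role of the true/target distribution, $q$ is the parameterized approximation, and any quantity measured in nats converts to bits on division by $\ln 2$.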