# Information Bottleneck Method

The bottleneck algorithm assumes that the joint distribution $P(x,y)$ of the predictor and the relevant variable is known. In practice this is almost never the case: one only has samples drawn from this distribution. How does the uncertainty in estimating the distribution from those samples influence the results of the bottleneck?
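
To make the question concrete, here is a minimal sketch of what happens to the mutual information $I(X;Y)$, the quantity the bottleneck tries to preserve, when $P(x,y)$ is replaced by its empirical (plug-in) estimate. Everything here is an assumption for illustration: the toy joint distribution, the sample sizes, and the helper names `mutual_information` and `empirical_joint` are hypothetical and not part of any bottleneck implementation. The toy joint is chosen so that $X$ and $Y$ are independent, which makes the true mutual information zero and any positive estimate purely a sampling artifact.

```python
import numpy as np


def mutual_information(p_xy):
    """Mutual information I(X;Y) in nats for a joint probability table."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X, shape (|X|, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, |Y|)
    mask = p_xy > 0                          # treat 0 * log(0) as 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))


def empirical_joint(p_xy, n_samples, rng):
    """Plug-in estimate of the joint from n_samples i.i.d. draws."""
    counts = rng.multinomial(n_samples, p_xy.ravel())
    return counts.reshape(p_xy.shape) / n_samples


rng = np.random.default_rng(0)

# Hypothetical toy joint: X and Y independent and uniform over 8 values each.
p_true = np.full((8, 8), 1.0 / 64)
true_mi = mutual_information(p_true)

mean_mi = {}
for n in (50, 500, 5000):
    # Average the plug-in estimate over repeated samples of size n.
    estimates = [
        mutual_information(empirical_joint(p_true, n, rng)) for _ in range(200)
    ]
    mean_mi[n] = float(np.mean(estimates))
    print(f"n={n:5d}  true I(X;Y)={true_mi:.4f}  mean plug-in estimate={mean_mi[n]:.4f}")
```

In this setting the plug-in estimate can never fall below the true value of zero, so the averages show how much apparent "relevant information" is produced by sampling noise alone, and how it shrinks as the sample grows. A bottleneck run on the empirical distribution would be optimizing against this inflated quantity, which is one way the sample uncertainty could enter the result.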