Friday, August 18, 2017

Sums and Differences of Random Numbers


Here's a problem that cropped up in the course of some calculations I have been working on recently. In stating the problem, I'll include the physical context, though this context isn't important for the rest of the discussion, which applies generally to certain manipulations of probability distributions:

A high-energy massive particle (e.g. a proton or α-particle), whose initial kinetic energy is governed by a certain probability distribution, passes through a slab of material. As it travels through the material, it scatters some of the electrons inside the slab, dissipating a small fraction of its energy. The amount of its energy that it loses also has some probability distribution (the so-called 'straggling function'). What is the probability distribution over the energy of the particle as it exits the slab of material? 

I'm sure all of us wrestle with exactly this question several times each day. The problem concerns the probability distribution over the difference between two independent random variables¹ (in this case the particle's initial energy and the energy it deposits in the slab of material).

The route to solving this problem uses the solution to a related problem, namely the probability distribution over the sum of two independent random variables, so first let's look at that.
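
To make the link between the two problems concrete, here is a minimal sketch for the discrete case. It assumes the two variables are independent (as stated above) and takes one common route: the difference X - Y is treated as the sum of X and -Y. This may or may not be the route the rest of the post follows. The function names and the energy values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def sum_distribution(p_x, p_y):
    """Distribution of X + Y for independent discrete X and Y.

    Each argument maps a value to its probability. Every pair (x, y)
    contributes probability p_x[x] * p_y[y] to the outcome x + y,
    which is the discrete convolution of the two distributions.
    """
    out = defaultdict(float)
    for x, px in p_x.items():
        for y, py in p_y.items():
            out[x + y] += px * py
    return dict(out)


def difference_distribution(p_x, p_y):
    """Distribution of X - Y, computed as the sum of X and (-Y)."""
    negated = {-y: py for y, py in p_y.items()}
    return sum_distribution(p_x, negated)


# Hypothetical numbers, in arbitrary energy units (not from the post).
initial_energy = {10: 0.5, 11: 0.5}
energy_loss = {1: 0.75, 2: 0.25}

exit_energy = difference_distribution(initial_energy, energy_loss)
for energy in sorted(exit_energy):
    print(energy, exit_energy[energy])
```

With these made-up inputs, an exit energy of 9 can arise in two ways (10 - 1 or 11 - 2), and the sketch adds the probabilities of both paths. For continuous distributions the double loop becomes an integral, but the idea is the same.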

Thursday, August 3, 2017

Standard Error



In the quantification of uncertainty, there is an important distinction that's often overlooked. This is the distinction between the dispersion of a distribution, and the dispersion of the mean of the distribution. 

By 'dispersion of a distribution,' I mean how poorly the mass of that probability distribution is localized in hypothesis space. If half the employees in Company A are aged between 30 and 40, and half the employees in Company B are aged between 25 and 50, then (all else equal) the probability distribution over the age of a randomly sampled employee from Company B has a wider dispersion than the corresponding distribution for Company A.

A common measure of dispersion is the standard deviation, which is the root-mean-square distance between the parts of the distribution and the mean of that distribution (that is, the square root of the average squared distance, rather than the plain average distance).
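
The distinction between the two kinds of dispersion can be sketched numerically. The snippet below is only an illustration, not the post's own calculation: the ages are invented, the helper name `dispersion_summary` is hypothetical, and the dispersion of the mean is estimated with the usual formula (sample standard deviation divided by the square root of the sample size), which assumes independent samples.

```python
import math
import statistics


def dispersion_summary(sample):
    """Return (mean, standard deviation, standard error of the mean).

    The standard deviation describes the spread of the individual
    values; the standard error describes the uncertainty in the mean.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)   # sample standard deviation (n - 1 divisor)
    sem = sd / math.sqrt(n)         # assumes independent samples
    return mean, sd, sem


# Invented employee ages, for illustration only.
ages = [25, 30, 35, 40, 45, 50]
mean, sd, sem = dispersion_summary(ages)
print(f"mean = {mean:.2f}, sd = {sd:.2f}, standard error = {sem:.2f}")
```

Collecting more samples from the same population would leave the standard deviation roughly unchanged, while the standard error would shrink in proportion to one over the square root of the sample size.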