Meta & resources

Probability, statistics and data analysis

Machine Learning: concepts & procedures

Machine Learning: fundamental algorithms

Machine Learning: model assessment

Artificial neural networks

Natural language processing

The computer science appendix

The mathematics appendix

The Monte Carlo method

What is

The Monte Carlo is a probability-based method, very popular in Physics circles, to perform numerical estimations of quantities. It relies on the very simple idea of *repeated random sampling* and is often used to estimate integrals. In practice, what you do is drawing a lot of random numbers, and observing the number of them which respect a certain property, property that will give you the estimate of the quantity you're looking for and that is difficult to calculate analytically.

The method is a very simple and elegant one, and this is why it's one of the silver bullets of Physics. It was proposed by S Ulam while working on military-related projects at the Los Alamos labs in 1940 and became a big contributor to the work of the Project Manhattan, see this historical review paper for a dive into those times. The name is a clear reference to the casino in Monaco.

Monte Carlo estimations are regularly used in problems of

- optimisation
- numerical integration
- drawing from a probability distribution

The basic idea can be visualised by imagining that we can estimate the surface area of a lake by throwing stones across and counting the number of those which fall in the lake with respect to those that fall outside.

The Monte Carlo method is fundamentally based on the law of large numbers (see page).

Numerical integration: an example estimating

$\pi$

This is a pedagogical example, cited in many places, for instance Wikipedia. Because of the relation linking

$\pi$

to the area $A$

of a circle of radius$r$

, namely$A = \pi r^2 \ ,$

we can easily estimate

$\pi$

via considering a quarter-circle inscribed in a square. The area of the square would simply be$r^2$

, hence the ratio of the two areas comes down to$\pi$

if the radius is 1, which is how we would estimate $\pi$

.1

def f_quartercircle(x):

2

"""Quarter of a cirle in the first quadrant."""

3

return np.sqrt(1-x**2)

4

â€‹

5

â€‹

6

# get 1000 numbers between 0 and 1, equally spaced, and the quarter circle function on them

7

x = np.linspace(0, 1, num=1000)

8

y = [f_quartercircle(item) for item in x]

9

â€‹

10

# loop over the number of extracted points, and extract them uniformly between 0 and 1, both x and y

11

for n in [100, 1000, 5000, 10000, 20000, 30000, 50000]:

12

13

points = np.random.uniform(0, 1, size=(n, 2)) # n points in the plane, randomly (uniformly) extracted in [0,1]

14

15

under_points = []

16

over_points = []

17

18

# select if point is below or above the circle

19

for point in points:

20

if point[1] <= f_quartercircle(point[0]):

21

under_points.append(point)

22

else:

23

over_points.append(point)

24

â€‹

25

# estimate pi as the ratio of number of points below circle to total

26

est_pi = float(len(under_points)) / n * 4

27

28

# compute the relative error to the real pi, in percentage

29

perc_err = abs(float(est_pi - np.pi))/np.pi

30

print('Estimated pi at %d points: %f, with relative error %f' %(n, est_pi, perc_err))

Copied!

This will print the estimation of

$\pi$

with decreasing relative error.1

plt.figure(figsize=(10, 10))

2

â€‹

3

plt.title('Final estimation plot')

4

plt.plot(x, y, color='r', lw=3)

5

plt.plot([point[0] for point in under_points], [point[1] for point in under_points], 'x', alpha=0.5)

6

plt.plot([point[0] for point in over_points], [point[1] for point in over_points], 'x', alpha=0.5)

7

plt.show();

Copied!

Monte-Carloing pi

Distribution sampling

Using Monte Carlo for distribution sampling means doing this:

- 1.generate some independent datasets under the condition of interest
- 2.computer the numerical value of the estimator for each dataset, that is, the test statistic
- 3.compute summary statistics over all the dataset test statistic computed

If for instance we want to estimate the mean and standard deviation of a random variable

$Y$

, and we have a sample of $N$

values of it, the sample mean and the sample standard deviation are effectively Monte Carlo estimations of those which by the law of large numbers we expect to converge to the population ones when $N$

is big enough.References

- 1.
- 2.
- 3.

Previous

Propagation of uncertainty

Next - Probability, statistics and data analysis

Notable brain teasers, paradoxes and how to be careful with data

Last modified 7mo ago