# Introduction

In this blog post, I will explain the concept of a `Random Variable` from a software developer’s perspective.

# Definitions

Before we dive in, I want to define a few terms for those of you who are new to probability and statistics. Feel free to skip this section if you think you are too cool for this.

#### Experiment

An experiment is an experiment… What would you do if you were asked to test whether a coin is fair? I would flip the coin 1000 times and see whether the counts of heads and tails are roughly equal. If heads comes up only 300 times and tails comes up 700 times, you can probably conclude that the coin is not fair!

So what is an experiment? In the context of the example described above, `flipping the coin 1000 times` is an `Experiment`.

#### Trial

A trial is one coin flip. The coin-flip experiment above consists of 1000 trials!
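If you prefer code to words, here is a minimal sketch of a trial and an experiment. The names `flipOnce`, `runExperiment` and `pHeads` are made up for illustration, and I am assuming `Math.random()` is a good enough coin for this purpose.

```typescript
type Outcome = "H" | "T";

// One trial: a single coin flip. `pHeads` is a hypothetical parameter for
// the probability of heads (0.5 would be a fair coin).
function flipOnce(pHeads: number = 0.5): Outcome {
  return Math.random() < pHeads ? "H" : "T";
}

// One experiment: `trials` coin flips, tallied by outcome.
function runExperiment(trials: number, pHeads: number = 0.5): Record<Outcome, number> {
  const counts: Record<Outcome, number> = { H: 0, T: 0 };
  for (let i = 0; i < trials; i++) {
    counts[flipOnce(pHeads)] += 1;
  }
  return counts;
}

// The exact counts vary from run to run.
console.log(runExperiment(1000));
```

With a fair coin the two counts should land somewhere near 500 each. A split like 300 versus 700 is the kind of result that should make you suspicious.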

#### Outcome

An `outcome` is one possible result of a `trial`. In the coin-flip experiment, the outcome of each trial is either heads or tails. The outcome of the whole experiment is the sequence of its trial outcomes.

#### Sample space

A `sample space` is the set of all possible `outcomes` of the experiment. Assume we perform only 3 trials in our coin-flip experiment. The `sample space` will be `{ HHH, HHT, HTH, HTT, THH, THT, TTH, TTT }`, where `H` denotes heads and `T` denotes tails.
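You do not have to write the sample space out by hand. Here is a rough sketch that builds it for any number of trials; `sampleSpaceFor` is a name I made up, and it simply extends every shorter sequence with `H` and then with `T`.

```typescript
type Outcome = "H" | "T";

// Build every possible sequence of `trials` flips by extending each
// shorter sequence with "H" and then with "T".
function sampleSpaceFor(trials: number): Outcome[][] {
  let sequences: Outcome[][] = [[]];
  for (let i = 0; i < trials; i++) {
    const extended: Outcome[][] = [];
    for (const seq of sequences) {
      extended.push([...seq, "H"], [...seq, "T"]);
    }
    sequences = extended;
  }
  return sequences;
}

const sampleSpace = sampleSpaceFor(3).map((seq) => seq.join(""));
console.log(sampleSpace.join(", "));
// HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
```

The size doubles with every extra trial, so 3 trials give 2 × 2 × 2 = 8 outcomes.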

# Cool stuff

So, I learnt about a thing called a `Random Variable` this week in an introductory statistics course I am taking at the moment. According to the lecture notes, a `random variable` is both a function and a variable. It can take on a value like a variable, and it also defines the mapping from each `outcome` in the `sample space` to a value.

For example, let’s consider the coin-flip example. We can define a ```Random Variable``` X to be a function that takes an outcome of the experiment and returns the count of heads.

e.g.

```
X(HHH) = 3
X(HTH) = 2
X(THH) = 2
X(TTT) = 0
```

Now, let’s define another term called `Range`. The `Range` of a `Random Variable` is the set of all possible outputs of the function. Let’s go back to our coin-flip experiment that consists of only three trials. The `Range` of the `Random Variable` X we defined above will be `{ 0, 1, 2, 3 }`. It cannot be larger than three since there are only 3 trials, and it cannot be negative because a count of heads never drops below zero (and because monkeys like bananas).
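To make this concrete, here is a small sketch of X as an actual function, plus its range computed from the sample space rather than worked out by hand. I am representing each outcome as a tuple of `"H"` and `"T"` strings, which is my own choice of encoding, and X returns a plain `number` to keep things simple.

```typescript
type Outcome = "H" | "T";
type ExperimentOutcome = [Outcome, Outcome, Outcome];

// The sample space of the 3-flip experiment.
const sampleSpace: ExperimentOutcome[] = [
  ["H", "H", "H"], ["H", "H", "T"], ["H", "T", "H"], ["H", "T", "T"],
  ["T", "H", "H"], ["T", "H", "T"], ["T", "T", "H"], ["T", "T", "T"],
];

// The random variable X: maps an experiment outcome to its number of heads.
function X(outcome: ExperimentOutcome): number {
  return outcome.filter((flip) => flip === "H").length;
}

// The range of X: the distinct values X takes over the sample space, sorted.
const range = sampleSpace
  .map((outcome) => X(outcome))
  .filter((value, index, all) => all.indexOf(value) === index)
  .sort((a, b) => a - b);

console.log(X(["H", "T", "H"])); // 2
console.log(range.join(", ")); // 0, 1, 2, 3
```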

If you understand static typing, the following snippet may be helpful. If you don’t, TOO BAD!

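```typescript
type Outcome = "Head" | "Tail";
// One outcome of the whole 3-flip experiment.
type ExperimentOutcome = [Outcome, Outcome, Outcome];
// Three flips can yield at most three heads.
type Range = 0 | 1 | 2 | 3;
// A random variable maps an experiment outcome to a value in its range.
type myRandomVar = (outcome: ExperimentOutcome) => Range;
// Holds the 8 outcomes listed under "Sample space" (contents elided here).
declare const sampleSpace: ExperimentOutcome[];
```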

What if you want to select the subset of the `sample space` for which our random variable takes a particular value?

For example, how do you select the subset of the sample space that contains only the outcomes with exactly two heads?

You can just do:

```typescript
// Keep only the outcomes that X maps to 2, i.e. exactly two heads.
sampleSpace.filter((event) => X(event) === 2);
```

Isn’t it neat?
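Here is a self-contained sketch of that filter that you can actually run. In statistics, a subset of the sample space like this one is called an event. The helper name `eventWhere` is made up, and the tuple encoding of outcomes is my own assumption.

```typescript
type Outcome = "H" | "T";
type ExperimentOutcome = [Outcome, Outcome, Outcome];

// The sample space of the 3-flip experiment.
const sampleSpace: ExperimentOutcome[] = [
  ["H", "H", "H"], ["H", "H", "T"], ["H", "T", "H"], ["H", "T", "T"],
  ["T", "H", "H"], ["T", "H", "T"], ["T", "T", "H"], ["T", "T", "T"],
];

// The random variable X: the number of heads in an outcome.
const X = (outcome: ExperimentOutcome): number =>
  outcome.filter((flip) => flip === "H").length;

// Hypothetical helper: the subset of the sample space where X equals `value`.
function eventWhere(value: number): ExperimentOutcome[] {
  return sampleSpace.filter((outcome) => X(outcome) === value);
}

console.log(eventWhere(2).map((outcome) => outcome.join("")).join(", "));
// HHT, HTH, THH
```

Asking for a value outside the range of X, such as `eventWhere(4)`, just gives back an empty array.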