CS 70

Discrete Mathematics and Probability Theory

Fall 2009

Satish Rao, David Tse

Note 11

Conditional Probability
A pharmaceutical company is marketing a new test for a certain medical condition. According to clinical trials, the test has the following properties:
1. When applied to an affected person, the test comes up positive in 90% of cases, and negative in 10%
(these are called “false negatives”).
2. When applied to a healthy person, the test comes up negative in 80% of cases, and positive in 20%
(these are called “false positives”).
Suppose that the incidence of the condition in the US population is 5%. When a random person is tested and the test comes up positive, what is the probability that the person actually has the condition? (Note that this is presumably not the same as the simple probability that a random person has the condition, which is just 1/20.)
This is an example of a conditional probability: we are interested in the probability that a person has the condition (event A) given that he/she tests positive (event B). Let’s write this as Pr[A|B].
How should we define Pr[A|B]? Well, since event B is guaranteed to happen, we should look not at the whole sample space Ω, but at the smaller sample space consisting only of the sample points in B. What should the conditional probabilities of these sample points be? If they all simply inherit their probabilities from Ω, then the sum of these probabilities will be ∑ω∈B Pr[ω] = Pr[B], which in general is less than 1. So we should normalize the probability of each sample point by 1/Pr[B]. I.e., for each sample point ω ∈ B, the new probability becomes

Pr[ω|B] = Pr[ω]/Pr[B].
Now it is clear how to define Pr[A|B]: namely, we just sum up these normalized probabilities over all sample points that lie in both A and B:
Pr[A|B] := ∑ω∈A∩B Pr[ω|B] = ∑ω∈A∩B Pr[ω]/Pr[B] = Pr[A ∩ B]/Pr[B].

Definition 11.1 (conditional probability): For events A, B in the same probability space, such that Pr[B] > 0, the conditional probability of A given B is
Pr[A|B] := Pr[A ∩ B]/Pr[B].

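The definition translates directly into code on a finite probability space. Here is a minimal Python sketch (representing events as sets of sample points and the distribution as a dictionary from sample point to probability is our choice for illustration, not from the note):

```python
from fractions import Fraction

# Definition 11.1 on a finite probability space: an event is a set of
# sample points, and pr maps each sample point w to Pr[w].
def cond_prob(A, B, pr):
    pr_B = sum(pr[w] for w in B)                 # Pr[B]
    assert pr_B > 0, "Pr[B] must be positive"
    return sum(pr[w] for w in A & B) / pr_B      # Pr[A ∩ B] / Pr[B]

# Example: two fair coin tosses. A = "first toss is heads",
# B = "at least one head". Then Pr[A|B] = (2/4)/(3/4) = 2/3.
space = {"HH", "HT", "TH", "TT"}
pr = {w: Fraction(1, 4) for w in space}
print(cond_prob({"HH", "HT"}, {"HH", "HT", "TH"}, pr))  # 2/3
```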
Let’s go back to our medical testing example. The sample space here consists of all people in the US — denote their number by N (so N ≈ 250 million). The population consists of four disjoint subsets:
• TP: the true positives (90% of N/20 = 9N/200 of them);
• FP: the false positives (20% of 19N/20 = 19N/100 of them);
• TN: the true negatives (80% of 19N/20 = 76N/100 of them);
• FN: the false negatives (10% of N/20 = N/200 of them).

Now let A be the event that a person chosen at random is affected, and B the event that he/she tests positive.
Note that B is the union of the disjoint sets TP and FP, so

|B| = |TP| + |FP| = 9N/200 + 19N/100 = 47N/200.

Thus we have Pr[A] = 1/20 and Pr[B] = 47/200.

Now when we condition on the event B, we focus in on the smaller sample space consisting only of those 47N/200 individuals who test positive. To compute Pr[A|B], we need to figure out Pr[A ∩ B] (the part of A that lies in B). But A ∩ B is just the set of people who are both affected and test positive, i.e., A ∩ B = TP. So we have

Pr[A ∩ B] = |TP|/N = 9/200.
Finally, we conclude from the definition of conditional probability that
Pr[A|B] = Pr[A ∩ B]/Pr[B] = (9/200)/(47/200) = 9/47 ≈ 0.19.

This seems bad: if a person tests positive, there’s only about a 19% chance that he/she actually has the condition! This sounds worse than the original claims made by the pharmaceutical company, but in fact it’s just another view of the same data.
[Incidentally, note that Pr[B|A] = (9/200)/(1/20) = 9/10; so Pr[A|B] and Pr[B|A] can be very different. Of course, Pr[B|A] is just the probability that a person tests positive given that he/she has the condition, which we knew from the start was 90%.]
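
These numbers are easy to check by simulation. The following is a minimal Monte Carlo sketch (assuming the rates given above); among simulated patients who test positive, the fraction actually affected comes out near 9/47:

```python
import random

# Simulate one random person, assuming 5% incidence, a 90% true-positive
# rate for affected people and a 20% false-positive rate for healthy ones.
def trial():
    affected = random.random() < 0.05
    positive = random.random() < (0.9 if affected else 0.2)
    return affected, positive

samples = [trial() for _ in range(1_000_000)]
positives = [affected for affected, positive in samples if positive]
print(sum(positives) / len(positives))  # ≈ 9/47 ≈ 0.19
```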

To complete the picture, what’s the (unconditional) probability that the test gives a correct result (positive or negative) when applied to a random person? Call this event C. Then
Pr[C] = (|TP| + |TN|)/N = 9/200 + 76/100 = 161/200 ≈ 0.8.

So the test is about 80% effective overall, a more impressive statistic.
But how impressive is it? Suppose we ignore the test and just pronounce everybody to be healthy. Then we would be correct on 95% of the population (the healthy ones), and wrong on the affected 5%. I.e., this trivial test is 95% effective! So we might ask if it is worth running the test at all. What do you think?
Here are a couple more examples of conditional probabilities, based on some of our sample spaces from the previous lecture note.
1. Balls and bins. Suppose we toss m = 3 balls into n = 3 bins; this is a uniform sample space with 3³ = 27 points. We already know that the probability the first bin is empty is (1 − 1/3)³ = (2/3)³ = 8/27. What is the probability of this event given that the second bin is empty? Call these events A, B respectively. To compute Pr[A|B] we need to figure out Pr[A ∩ B]. But A ∩ B is the event that both the first two bins are empty, i.e., all three balls fall in the third bin. So Pr[A ∩ B] = 1/27 (why?). Therefore,

Pr[A|B] = Pr[A ∩ B]/Pr[B] = (1/27)/(8/27) = 1/8.

Not surprisingly, 1/8 is quite a bit less than 8/27: knowing that bin 2 is empty makes it significantly less likely that bin 1 will be empty.

2. Dice. Roll two fair dice. Let A be the event that their sum is even, and B the event that the first die is even. By symmetry it’s easy to see that Pr[A] = 1/2 and Pr[B] = 1/2. Moreover, a little counting gives us that Pr[A ∩ B] = 1/4. What is Pr[A|B]? Well,

Pr[A|B] = Pr[A ∩ B]/Pr[B] = (1/4)/(1/2) = 1/2.
In this case, Pr[A|B] = Pr[A], i.e., conditioning on B does not change the probability of A.
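
Both conditional probabilities can be checked empirically. A minimal Monte Carlo sketch (the bin indexing and die conventions are our choices):

```python
import random

N = 200_000

# 1. Balls and bins: estimate Pr[bin 1 empty | bin 2 empty] for m = n = 3.
in_B = in_A_and_B = 0
for _ in range(N):
    bins = [random.randrange(3) for _ in range(3)]  # bin chosen by each ball
    if 1 not in bins:                               # B: bin 2 (index 1) empty
        in_B += 1
        if 0 not in bins:                           # A: bin 1 (index 0) empty
            in_A_and_B += 1
print(in_A_and_B / in_B)                            # ≈ 1/8 = 0.125

# 2. Dice: estimate Pr[sum even | first die even].
in_B = in_A_and_B = 0
for _ in range(N):
    d1, d2 = random.randint(1, 6), random.randint(1, 6)
    if d1 % 2 == 0:                                 # B: first die even
        in_B += 1
        if (d1 + d2) % 2 == 0:                      # A: sum even
            in_A_and_B += 1
print(in_A_and_B / in_B)                            # ≈ 1/2
```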

Bayesian Inference
The medical test problem is a canonical example of an inference problem: given a noisy observation (the result of the test), we want to figure out the likelihood of something not directly observable (whether a person is healthy). To bring out the common structure of such inference problems, let us redo the calculations in the medical test example but only in terms of events without explicitly mentioning the sample points of the underlying sample space.
Recall: A is the event the person is affected, B is the event that the test is positive. What are we given?
• Pr[A] = 0.05 (5% of the U.S. population is affected);
• Pr[B|A] = 0.9 (90% of the affected people test positive);
• Pr[B|Ā] = 0.2 (20% of healthy people test positive).
We want to calculate Pr[A|B]. We can proceed as follows:
Pr[A|B] = Pr[A ∩ B]/Pr[B] = Pr[B|A] Pr[A]/Pr[B]    (1)

and
Pr[B] = Pr[A ∩ B] + Pr[Ā ∩ B] = Pr[B|A] Pr[A] + Pr[B|Ā](1 − Pr[A])    (2)

Combining equations (1) and (2), we have expressed Pr[A|B] in terms of Pr[A], Pr[B|A] and Pr[B|Ā]:

Pr[A|B] = Pr[B|A] Pr[A] / (Pr[B|A] Pr[A] + Pr[B|Ā](1 − Pr[A]))    (3)

This equation is useful for many inference problems. We are given Pr[A], which is the (unconditional) probability that the event of interest A happens. We are given Pr[B|A] and Pr[B|Ā], which quantify how noisy
the observation is. (If Pr[B|A] = 1 and Pr[B|Ā] = 0, for example, the observation is completely noiseless.)
Now we want to calculate Pr[A|B], the probability that the event of interest happens given we made the observation. Equation (3) allows us to do just that.
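
As a sketch, equation (3) transcribes directly into code (the function and parameter names are our own):

```python
def posterior(prior, pr_b_given_a, pr_b_given_not_a):
    """Equation (3): Pr[A|B] from Pr[A], Pr[B|A] and Pr[B|Ā]."""
    numerator = pr_b_given_a * prior
    return numerator / (numerator + pr_b_given_not_a * (1 - prior))

# The medical test: Pr[A] = 0.05, Pr[B|A] = 0.9, Pr[B|Ā] = 0.2.
print(posterior(0.05, 0.9, 0.2))  # 9/47 ≈ 0.1915
# A noiseless observation (Pr[B|A] = 1, Pr[B|Ā] = 0) leaves no doubt:
print(posterior(0.05, 1.0, 0.0))  # 1.0
```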
Equation (3) is at the heart of a subject called Bayesian inference, used extensively in fields such as machine learning, communications and signal processing. The equation can be interpreted as a way to update knowledge after making an observation. In this interpretation, Pr[A] can be thought of as a prior probability: our assessment of the likelihood of an event of interest A before making an observation. It reflects our prior knowledge. Pr[A|B] can be interpreted as the posterior probability of A after the observation. It reflects our new knowledge.
Of course, equations (1), (2) and (3) are derived from the basic axioms of probability and the definition of conditional probability, and are therefore true with or without the above Bayesian inference interpretation.
However, this interpretation is very useful when we apply probability theory to study inference problems.

Bayes’ Rule and Total Probability Rule
Equations (1) and (2) are very useful in their own right. The first is called Bayes’ Rule and the second is called the Total Probability Rule. Bayes’ Rule is useful when one wants to calculate Pr[A|B] but is given Pr[B|A] instead; i.e., it allows us to “flip” the conditioning around. The Total Probability Rule applies the strategy of “dividing into cases”, which we learnt in Note 2, to calculating probabilities. We want to calculate the probability of an event B. There are two possibilities: either an event A happens or A does not happen. If A happens, the probability that B happens is Pr[B|A]. If A does not happen, the probability that B happens is Pr[B|Ā]. If we know or can easily calculate these two probabilities and also Pr[A], then the Total Probability Rule yields the probability of event B.

Independent events
Definition 11.2 (independence): Two events A, B in the same probability space are independent if Pr[A ∩ B] = Pr[A] × Pr[B].
The intuition behind this definition is the following. Suppose that Pr[B] > 0. Then we have
Pr[A|B] = Pr[A ∩ B]/Pr[B] = (Pr[A] × Pr[B])/Pr[B] = Pr[A].

Thus independence has the natural meaning that “the probability of A is not affected by whether or not B occurs.” (By a symmetrical argument, we also have Pr[B|A] = Pr[B] provided Pr[A] > 0.) For events A, B such that Pr[B] > 0, the condition Pr[A|B] = Pr[A] is actually equivalent to the definition of independence.
Examples: In the balls and bins example above, events A, B are not independent. In the dice example, events
A, B are independent.
The above definition generalizes to any finite set of events:
Definition 11.3 (mutual independence): Events A1, . . . , An are mutually independent if for every subset I ⊆ {1, . . . , n},

Pr[⋂i∈I Ai] = ∏i∈I Pr[Ai].
Note that we need this property to hold for every subset I.
For mutually independent events A1 , . . . , An , it is not hard to check from the definition of conditional probability that, for any 1 ≤ i ≤ n and any subset I ⊆ {1, . . . , n} \ {i}, we have
Pr[Ai | ⋂j∈I Aj] = Pr[Ai].

Note that the independence of every pair of events (so-called pairwise independence) does not necessarily imply mutual independence. For example, it is possible to construct three events A, B, C such that each pair is independent but the triple A, B, C is not mutually independent.
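
One standard construction (our choice of example, easily verified by enumerating the sample space): toss two fair coins, and let A be the event that the first coin is heads, B that the second is heads, and C that the two coins differ. A short sketch:

```python
from itertools import product

space = list(product([0, 1], repeat=2))   # two fair coins, Pr[w] = 1/4 each

def pr(event):
    # Probability of an event (a predicate on sample points) in a uniform space.
    return sum(1 for w in space if event(w)) / len(space)

A = lambda w: w[0] == 1                   # first coin heads
B = lambda w: w[1] == 1                   # second coin heads
C = lambda w: w[0] != w[1]                # the two coins differ

for E, F in [(A, B), (A, C), (B, C)]:     # every pair is independent
    print(pr(lambda w: E(w) and F(w)) == pr(E) * pr(F))   # True, True, True

# ...but the triple is not mutually independent: A ∩ B ∩ C is empty.
print(pr(lambda w: A(w) and B(w) and C(w)))               # 0.0
print(pr(A) * pr(B) * pr(C))                              # 0.125
```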

Combinations of events
In most applications of probability in Computer Science, we are interested in things like Pr[A1 ∩ · · · ∩ An] and Pr[A1 ∪ · · · ∪ An], where the Ai are simple events (i.e., we know, or can easily compute, the Pr[Ai]). The intersection A1 ∩ · · · ∩ An corresponds to the logical AND of the events Ai, while the union A1 ∪ · · · ∪ An corresponds to their logical OR.
As an example, if Ai denotes the event that a failure of type i happens in a certain system, then A1 ∪ · · · ∪ An is the event that the system fails.
In general, computing the probabilities of such combinations can be very difficult. In this section, we discuss some situations where it can be done.

Intersections of events
From the definition of conditional probability, we immediately have the following product rule (sometimes also called the chain rule) for computing the probability of an intersection of events.
Theorem 11.1: [Product Rule] For any events A, B, we have
Pr[A ∩ B] = Pr[A] Pr[B|A].
More generally, for any events A1 , . . . , An ,
Pr[A1 ∩ A2 ∩ · · · ∩ An] = Pr[A1] × Pr[A2|A1] × Pr[A3|A1 ∩ A2] × · · · × Pr[An | A1 ∩ · · · ∩ An−1].

Proof: The first assertion follows directly from the definition of Pr[B|A] (and is in fact a special case of the second assertion with n = 2).
To prove the second assertion, we will use induction on n (the number of events). The base case is n = 1, and corresponds to the statement that Pr[A] = Pr[A], which is trivially true. For the inductive step, let n > 1 and assume (the inductive hypothesis) that
Pr[A1 ∩ · · · ∩ An−1] = Pr[A1] × Pr[A2|A1] × · · · × Pr[An−1 | A1 ∩ · · · ∩ An−2].

Now we can apply the definition of conditional probability to the two events An and A1 ∩ · · · ∩ An−1 to deduce that

Pr[A1 ∩ · · · ∩ An] = Pr[An ∩ (A1 ∩ · · · ∩ An−1)]
= Pr[An | A1 ∩ · · · ∩ An−1] × Pr[A1 ∩ · · · ∩ An−1]
= Pr[An | A1 ∩ · · · ∩ An−1] × Pr[A1] × Pr[A2|A1] × · · · × Pr[An−1 | A1 ∩ · · · ∩ An−2],

where in the last line we have used the inductive hypothesis. This completes the proof by induction. □
The product rule is particularly useful when we can view our sample space as a sequence of choices. The next few examples illustrate this point.
1. Coin tosses. Toss a fair coin three times. Let A be the event that all three tosses are heads. Then
A = A1 ∩ A2 ∩ A3 , where Ai is the event that the ith toss comes up heads. We have
Pr[A] = Pr[A1] × Pr[A2|A1] × Pr[A3|A1 ∩ A2]
= Pr[A1] × Pr[A2] × Pr[A3]
= 1/2 × 1/2 × 1/2 = 1/8.

The second line here follows from the fact that the tosses are mutually independent. Of course, we already know that Pr[A] = 1/8 from our definition of the probability space in the previous lecture note. The above is really a check that the space behaves as we expect.¹

¹Strictly speaking, we should really also have checked from our original definition of the probability space that Pr[A1], Pr[A2|A1] and Pr[A3|A1 ∩ A2] are all equal to 1/2.
If the coin is biased with heads probability p, we get, again using independence,

Pr[A] = Pr[A1] × Pr[A2] × Pr[A3] = p³.

And more generally, the probability of any sequence of n tosses containing r heads and n − r tails is pʳ(1 − p)ⁿ⁻ʳ. This is in fact the reason we defined the probability space this way in the previous lecture note: we defined the sample point probabilities so that the coin tosses would behave independently.
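
A quick sketch of this formula (the function name is ours, chosen for illustration):

```python
# Probability of one particular sequence of n independent tosses that
# contains r heads and n − r tails, for a coin with heads probability p.
def sequence_prob(n, r, p):
    return p**r * (1 - p)**(n - r)

print(sequence_prob(3, 3, 0.5))   # 0.125 = 1/8: three heads, fair coin
print(sequence_prob(3, 3, 0.6))   # 0.216 = 0.6³, a biased example
```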
2. Balls and bins. Let A be the event that bin 1 is empty. We saw in the previous lecture note (by counting) that Pr[A] = (1 − 1/n)ᵐ, where m is the number of balls and n is the number of bins. The product rule gives us a different way to compute the same probability. We can write A = A1 ∩ · · · ∩ Am, where Ai is the event that ball i misses bin 1. Clearly Pr[Ai] = 1 − 1/n for each i. Also, the Ai are mutually independent since ball i chooses its bin regardless of the choices made by any of the other balls. So

Pr[A] = Pr[A1] × · · · × Pr[Am] = (1 − 1/n)ᵐ.
3. Card shuffling. We can look at the sample space as a sequence of choices as follows. First the top card is chosen uniformly from all 52 cards, i.e., each card with probability 1/52. Then (conditional on the first card), the second card is chosen uniformly from the remaining 51 cards, each with probability 1/51. Then (conditional on the first two cards), the third card is chosen uniformly from the remaining 50, and so on. The probability of any given permutation, by the product rule, is therefore

1/52 × 1/51 × 1/50 × · · · × 1/2 × 1/1 = 1/52!.

Reassuringly, this is in agreement with our definition of the probability space in the previous lecture note, based on counting permutations.
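
The 52-term product can be checked exactly with rational arithmetic; a short sketch:

```python
from fractions import Fraction
from math import factorial

# Product-rule probability of one particular ordering of a 52-card deck:
# 1/52 × 1/51 × · · · × 1/1.
p = Fraction(1)
for remaining in range(52, 0, -1):
    p *= Fraction(1, remaining)
print(p == Fraction(1, factorial(52)))  # True
```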
4. Poker hands. Again we can view the sample space as a sequence of choices. First we choose one of the cards (note that it is not the “first” card, since the cards in our hand have no ordering) uniformly from all 52 cards. Then we choose another card from the remaining 51, and so on. For any given poker hand, the probability of choosing it is (by the product rule):
5/52 × 4/51 × 3/50 × 2/49 × 1/48 = 1/(52 choose 5),
just as before. Where do the numerators 5, 4, 3, 2, 1 come from? Well, for the given hand the first card we choose can be any of the five in the hand: i.e., five choices out of 52. The second can be any of the remaining four in the hand: four choices out of 51. And so on. This arises because the order of the cards in the hand is irrelevant.
Let’s use this view to compute the probability of a flush in a different way. Clearly this is 4 × Pr[A], where A is the event of a Hearts flush. And we can write A = A1 ∩ · · · ∩ A5, where Ai is the event that the ith card we pick is a Heart. So we have

Pr[A] = Pr[A1] × Pr[A2|A1] × · · · × Pr[A5 | A1 ∩ · · · ∩ A4].



Clearly Pr[A1] = 13/52 = 1/4. What about Pr[A2|A1]? Well, since we are conditioning on A1 (the first card is a Heart), there are only 51 remaining possibilities for the second card, 12 of which are Hearts. So Pr[A2|A1] = 12/51. Similarly, Pr[A3|A1 ∩ A2] = 11/50, and so on. So we get

4 × Pr[A] = 4 × 13/52 × 12/51 × 11/50 × 10/49 × 9/48,

which is exactly the same fraction we computed in the previous lecture note.
So now we have two methods of computing probabilities in many of our sample spaces. It is useful to keep these different methods around, both as a check on your answers and because in some cases one of the methods is easier to use than the other.
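
The two methods can indeed be checked against each other exactly; a sketch verifying the product-rule answers for the poker examples above against the counting answers:

```python
from fractions import Fraction
from math import comb

# One particular 5-card hand: 5/52 × 4/51 × 3/50 × 2/49 × 1/48.
hand = Fraction(1)
for k in range(5):
    hand *= Fraction(5 - k, 52 - k)
print(hand == Fraction(1, comb(52, 5)))              # True

# The flush, two ways:
# product rule: 4 × 13/52 × 12/51 × 11/50 × 10/49 × 9/48;
# counting:     4 × C(13,5) flushes out of C(52,5) hands.
by_product = 4 * Fraction(13 * 12 * 11 * 10 * 9, 52 * 51 * 50 * 49 * 48)
by_counting = Fraction(4 * comb(13, 5), comb(52, 5))
print(by_product == by_counting, float(by_product))  # True ≈ 0.00198
```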
5. Monty Hall. Recall that we defined the probability of a sample point by multiplying the probabilities of the sequence of choices it corresponds to; thus, e.g.,
Pr[(1, 1, 2)] = 1/3 × 1/3 × 1/2 = 1/18.

The reason we defined it this way is that we knew (from our model of the problem) the probabilities for each choice conditional on the previous one. Thus, e.g., the 1/2 in the above product is the probability that Carol opens door 2 conditional on the prize door being door 1 and the contestant initially choosing door 1. In fact, we used these conditional probabilities to define the probabilities of our sample points.

Unions of events
You are in Las Vegas, and you spy a new game with the following rules. You pick a number between 1 and 6. Then three dice are thrown. You win if and only if your number comes up on at least one of the dice.
The casino claims that your odds of winning are 50%, using the following argument. Let A be the event that you win. We can write A = A1 ∪ A2 ∪ A3 , where Ai is the event that your number comes up on die i. Clearly
Pr[Ai] = 1/6 for each i. Therefore,

Pr[A] = Pr[A1 ∪ A2 ∪ A3] = Pr[A1] + Pr[A2] + Pr[A3] = 3 × 1/6 = 1/2.

Is this calculation correct? Well, suppose instead that the casino rolled six dice, and again you win iff your number comes up at least once. Then the analogous calculation would say that you win with probability 6 × 1/6 = 1, i.e., certainly! The situation becomes even more ridiculous when the number of dice gets bigger than 6.
The problem is that the events Ai are not disjoint: i.e., there are some sample points that lie in more than one of the Ai . (We could get really lucky and our number could come up on two of the dice, or all three.) So if we add up the Pr[Ai ] we are counting some sample points more than once.
Fortunately, there is a formula for this, known as the Principle of Inclusion/Exclusion:
Theorem 11.2: [Inclusion/Exclusion] For events A1, . . . , An in some probability space, we have

Pr[A1 ∪ · · · ∪ An] = ∑i Pr[Ai] − ∑{i,j} Pr[Ai ∩ Aj] + ∑{i,j,k} Pr[Ai ∩ Aj ∩ Ak] − · · · ± Pr[A1 ∩ · · · ∩ An].

[In the above summations, {i, j} denotes all unordered pairs with i ≠ j, {i, j, k} denotes all unordered triples of distinct elements, and so on.]

I.e., to compute Pr[A1 ∪ · · · ∪ An], we start by summing the event probabilities Pr[Ai], then we subtract the probabilities of all pairwise intersections, then we add back in the probabilities of all three-way intersections, and so on. We won’t prove this formula here; but you might like to verify it for the special case n = 3 by drawing a Venn diagram and checking that every sample point in A1 ∪ A2 ∪ A3 is counted exactly once by the formula.
You might also like to prove the formula for general n by induction (in similar fashion to the proof of the
Product Rule above).
Taking the formula on faith, what is the probability we get lucky in the new game in Vegas?
Pr[A1 ∪ A2 ∪ A3 ] = Pr[A1 ] + Pr[A2 ] + Pr[A3 ] − Pr[A1 ∩ A2 ] − Pr[A1 ∩ A3 ] − Pr[A2 ∩ A3 ] + Pr[A1 ∩ A2 ∩ A3 ].
Now the nice thing here is that the events Ai are mutually independent (the outcome of any die does not depend on that of the others), so Pr[Ai ∩ Aj] = Pr[Ai] Pr[Aj] = (1/6)² = 1/36, and similarly Pr[A1 ∩ A2 ∩ A3] = (1/6)³ = 1/216. So we get

Pr[A1 ∪ A2 ∪ A3] = 3 × (1/6) − 3 × (1/36) + 1/216 = 91/216 ≈ 0.42.

So your odds are quite a bit worse than the casino is claiming!
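
Both the exact Inclusion/Exclusion value and the empirical winning frequency are easy to check; a minimal sketch:

```python
import random
from fractions import Fraction

# Exact value from Inclusion/Exclusion: 3·(1/6) − 3·(1/36) + 1/216.
exact = 3 * Fraction(1, 6) - 3 * Fraction(1, 36) + Fraction(1, 216)
print(exact, float(exact))              # 91/216 ≈ 0.4213

# Monte Carlo check: pick the number 1 (by symmetry any number works);
# you win iff it shows on at least one of three dice.
trials = 200_000
wins = sum(
    any(random.randint(1, 6) == 1 for _ in range(3)) for _ in range(trials)
)
print(wins / trials)                    # ≈ 0.42
```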
When n is large (i.e., we are interested in the union of many events), the Inclusion/Exclusion formula is essentially useless because it involves computing the probability of the intersection of every non-empty subset of the events: and there are 2ⁿ − 1 of these! Sometimes we can just look at the first few terms of it and forget the rest: note that successive terms actually give us an overestimate and then an underestimate of the answer, and these estimates both get better as we go along.
However, in many situations we can get a long way by just looking at the first term:
1. Disjoint events. If the events Ai are all disjoint (i.e., no pair of them contain a common sample point
— such events are also called mutually exclusive), then
Pr[A1 ∪ · · · ∪ An] = ∑i Pr[Ai].

[Note that we have already used this fact several times in our examples, e.g., in claiming that the probability of a flush is four times the probability of a Hearts flush — clearly flushes in different suits are disjoint events.]
2. Union bound. Always, it is the case that
Pr[A1 ∪ · · · ∪ An] ≤ ∑i Pr[Ai].

This merely says that adding up the Pr[Ai] can only overestimate the probability of the union. Crude as it may seem, in the next lecture note we’ll see how to use the union bound effectively in a Computer Science example.
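
For the dice game above, for instance, the union bound gives 3 × 1/6 = 1/2, an overestimate of the exact answer 91/216; a one-line check:

```python
from fractions import Fraction

union_bound = 3 * Fraction(1, 6)        # ∑ Pr[Ai] for the three-dice game
exact = Fraction(91, 216)               # from Inclusion/Exclusion above
print(union_bound >= exact)             # True: the bound only overestimates
```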

