Conditional Probability, Bayes' Rule, Counting, Binomial, Poisson & Normal Approximations, Hypergeometric Sampling



Section 1.1: Equally Likely Outcomes

1. Outcome Space

The outcome space $\Omega$ (omega) is the set of all possible outcomes of an experiment.

$$\Omega = \{1, 2, 3, 4, 5, 6\} \quad \text{(rolling a die)}$$

An event $A$ is any subset of $\Omega$. Example: “rolling an even number”

$$A = \{2, 4, 6\}.$$

2. Equally Likely Probability Formula

When all outcomes in a finite $\Omega$ are equally likely:

$$P(A) = \frac{|A|}{|\Omega|}.$$

Key boundary values: $P(\Omega) = 1$, $P(\emptyset) = 0$.

Quick example: draw a ticket from a box of 100 tickets labeled $1, \ldots, 100$. The event “number has one digit” is $A = \{1, \ldots, 9\}$, so $P(A) = 9/100$.

3. Counting with Pairs (Two Dice)

Rolling two dice: each outcome is an ordered pair $(i, j)$ where $i, j \in \{1, \ldots, 6\}$.

$$|\Omega| = 6 \cdot 6 = 36.$$

Example: “sum is 5”

$$A = \{(1,4), (2,3), (3,2), (4,1)\} \implies P(A) = \frac{4}{36} = \frac{1}{9}.$$

For a general $n$-sided die, $|\Omega| = n^2$. Number of pairs where the second number exceeds the first (above the diagonal):

$$|\text{above}| = 1 + 2 + \cdots + (n-1) = \frac{n(n-1)}{2}.$$

So

$$P(\text{second} > \text{first}) = \frac{\frac{n(n-1)}{2}}{n^2} = \frac{1}{2}\left(1 - \frac{1}{n}\right).$$

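A minimal enumeration sketch of the two formulas above (the choice of dice sizes in the loop is just illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs for two fair six-sided dice.
omega = list(product(range(1, 7), repeat=2))

# P(sum is 5) = |A| / |Omega|
p_sum_5 = Fraction(sum(1 for i, j in omega if i + j == 5), len(omega))
print(p_sum_5)  # 1/9

# P(second > first) for an n-sided die, compared with (1/2)(1 - 1/n).
for n in (4, 6, 20):
    pairs = list(product(range(1, n + 1), repeat=2))
    exact = Fraction(sum(1 for i, j in pairs if j > i), len(pairs))
    assert exact == Fraction(1, 2) * (1 - Fraction(1, n))
    print(n, exact)
```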
4. Odds

Odds in favor of $A$:

$$\text{Odds in favor of } A = \frac{P(A)}{1 - P(A)}.$$

Odds against $A$ is the inverse.

Example: $P(\text{red at roulette}) = 18/38$, so odds against red are $20:18$ (or $10:9$).

5. Fair Odds Rule & House Percentage

Fair Odds Rule: in a fair bet, payoff odds = chance odds.

If you bet \$1 on event $A$ at payoff odds $r_{\text{pay}}$ to 1 against, you receive $(r_{\text{pay}} + 1)$ dollars back (winnings plus stake) when $A$ occurs. A fair price for this bet would be $P(A)(r_{\text{pay}} + 1)$. House percentage:

$$\text{House \%} = \left[1 - P(A)(r_{\text{pay}} + 1)\right] \cdot 100\%.$$

Example (straight play at roulette): $P(A) = 1/38$, $r_{\text{pay}} = 35$.

$$\text{House \%} = \left[1 - \frac{1}{38} \cdot 36\right] \cdot 100\% = \frac{2}{38} \cdot 100\% \approx 5.26\%.$$

Interpretation: for every $1 bet, the house keeps about 5.26 cents on average.

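A small sketch of the house-percentage formula in code; the bet names and payoff odds below are standard American-roulette values, included only for illustration:

```python
from fractions import Fraction

def house_percentage(p_win: Fraction, payoff_odds: int) -> Fraction:
    """House % = [1 - P(A) * (r_pay + 1)] * 100, as a percentage of the amount bet."""
    return (1 - p_win * (payoff_odds + 1)) * 100

# American roulette: 38 pockets.
bets = {
    "straight (one number)": (Fraction(1, 38), 35),
    "split (two numbers)":   (Fraction(2, 38), 17),
    "red":                   (Fraction(18, 38), 1),
}
for name, (p, r) in bets.items():
    print(f"{name}: house % = {float(house_percentage(p, r)):.2f}")
# Every bet listed gives 2/38 * 100 ≈ 5.26%.
```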

Section 1.3: Distributions

1. Events as Sets

An outcome space $\Omega$ is the set of all possible outcomes. Every event is a subset of $\Omega$.

| Event | Set notation |
|---|---|
| not $A$ | $A^c$ |
| $A$ or $B$ (or both) | $A \cup B$ |
| both $A$ and $B$ | $A \cap B$ (or $AB$) |
| $A, B$ mutually exclusive | $AB = \emptyset$ |

2. Partition

Event $B$ is partitioned into $B_1, \ldots, B_n$ if

$$B = B_1 \cup \cdots \cup B_n, \qquad B_i \cap B_j = \emptyset \ \text{for } i \ne j.$$

Every outcome in $B$ belongs to exactly one $B_i$.

3. Three Axioms of Probability

A distribution on $\Omega$ is a function $P$ satisfying:

  1. Non-negativity: $P(B) \ge 0$.
  2. Additivity: if $B_1, \ldots, B_n$ partition $B$, then $P(B) = P(B_1) + \cdots + P(B_n)$.
  3. Total one: $P(\Omega) = 1$.

4. Derived Rules

Complement Rule

$$P(A^c) = 1 - P(A).$$

(Implies $P(\emptyset) = 0$ and $0 \le P(A) \le 1$.)

Difference Rule. If $A \subseteq B$, then

$$P(B \cap A^c) = P(B) - P(A),$$

since $A$ and $B \cap A^c$ partition $B$.

Inclusion–Exclusion (2 events)

$$P(A \cup B) = P(A) + P(B) - P(AB).$$

If $A, B$ are mutually exclusive, then $P(AB) = 0$ and this reduces to $P(A \cup B) = P(A) + P(B)$.

Inclusion–Exclusion (3 events)

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(AB) - P(AC) - P(BC) + P(ABC).$$

Quick example: 10% rich, 5% famous, 3% both $\Rightarrow$ $P(\text{rich or famous}) = 10\% + 5\% - 3\% = 12\%$.

5. Named Distributions

  • Bernoulli($p$): distribution on $\{0, 1\}$ with $P(1) = p$, $P(0) = 1 - p$. (Indicator of an event $A$ with $p = P(A)$.)

  • Uniform on a finite set: if $\Omega = \{1, 2, \ldots, n\}$ with all outcomes equally likely, then $P(i) = 1/n$ and

$$P(B) = \frac{|B|}{n}.$$

  • Uniform($a, b$): a point picked at random from $(a, b)$; probability is proportional to length:

$$P((x, y)) = \frac{y - x}{b - a} \qquad (a < x < y < b).$$

6. Independence (Preview)

Two events $A, B$ are independent if

$$P(AB) = P(A)P(B).$$

Section 1.4: Conditional Probability and Independence

1. Conditional Probability

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0.$$

Equally likely outcomes version:

$$P(A \mid B) = \frac{|A \cap B|}{|B|}.$$

Quick example: 3 fair coin tosses. Let $A = \{\text{2 or more heads}\}$ and $H = \{\text{first toss is heads}\}$. Then $H = \{hhh, hht, hth, htt\}$ and $A \cap H = \{hhh, hht, hth\}$, so $P(A \mid H) = 3/4$.

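A minimal enumeration of the coin-toss example, using the equally-likely form of the definition:

```python
from itertools import product

# All 8 equally likely outcomes of 3 fair coin tosses, e.g. "hht".
omega = ["".join(t) for t in product("ht", repeat=3)]

A = {w for w in omega if w.count("h") >= 2}  # at least 2 heads
H = {w for w in omega if w[0] == "h"}        # first toss is heads

# Equally likely version: P(A | H) = |A ∩ H| / |H|
p_A_given_H = len(A & H) / len(H)
print(p_A_given_H)  # 0.75
```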
2. Multiplication Rule

$$P(A \cap B) = P(A \mid B) \, P(B) = P(B \mid A) \, P(A).$$

3. Rule of Average Conditional Probabilities (Law of Total Probability)

If $B_1, \ldots, B_n$ partition $\Omega$, then

$$P(A) = \sum_{i=1}^{n} P(A \mid B_i) \, P(B_i).$$

Quick example: $P(\text{second card black})$ from a 52-card deck; condition on the color of the first card:

$$P(\text{2nd black}) = \frac{25}{51} \cdot \frac{1}{2} + \frac{26}{51} \cdot \frac{1}{2} = \frac{1}{2}.$$

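A brute-force check of the second-card example, enumerating all ordered pairs of distinct cards with exact arithmetic:

```python
from fractions import Fraction
from itertools import permutations

# 26 black (B) and 26 red (R) cards; enumerate all 52 * 51 ordered draws of two distinct cards.
deck = ["B"] * 26 + ["R"] * 26
pairs = list(permutations(range(52), 2))

p_second_black = Fraction(sum(1 for i, j in pairs if deck[j] == "B"), len(pairs))
print(p_second_black)  # 1/2, matching the average-conditional-probability computation
```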
4. Independence

$A$ and $B$ are independent iff

$$P(A \cap B) = P(A)P(B).$$

Equivalent statements (when the probabilities involved are positive):

  • $P(A \mid B) = P(A)$
  • $P(B \mid A) = P(B)$
  • $P(A \mid B) = P(A \mid B^c)$

Key facts: if $A \perp B$, then also $A^c \perp B$, $A \perp B^c$, and $A^c \perp B^c$.

Mutual independence (3 events)

For events $A, B, C$:

$$\text{mutual independence} \iff \begin{cases} P(AB) = P(A)P(B), \quad P(AC) = P(A)P(C), \quad P(BC) = P(B)P(C), \\ P(ABC) = P(A)P(B)P(C). \end{cases}$$

i.i.d. symmetry lemma

If $X, Y$ are i.i.d. (independent and identically distributed), then

$$P(X > Y) = P(Y > X) = \frac{1 - P(X = Y)}{2}.$$

Series system (both components must work; assume independence):

$$P(W_1 \cap W_2) = P(W_1)P(W_2).$$

Parallel system (at least one component works; assume independence):

$$P(W_1 \cup W_2) = 1 - P(F_1)P(F_2).$$

Section 1.5: Bayes’ Rule

What is Bayes’ Rule?

Bayes’ Rule reverses conditional probability: from $P(A \mid B_i)$ to $P(B_i \mid A)$.

The Formula

For a partition $B_1, \ldots, B_n$:

$$P(B_i \mid A) = \frac{P(A \mid B_i) \, P(B_i)}{\displaystyle\sum_{j=1}^{n} P(A \mid B_j) \, P(B_j)}.$$

Key Terminology

  • Prior: $P(B_i)$
  • Likelihood: $P(A \mid B_i)$
  • Posterior: $P(B_i \mid A)$

How to Derive It (3 steps)

  1. Multiplication rule: $P(B_i \cap A) = P(A \mid B_i) \, P(B_i)$
  2. Total probability: $P(A) = \sum_{i=1}^{n} P(A \mid B_i) \, P(B_i)$
  3. Conditional probability: $P(B_i \mid A) = \dfrac{P(B_i \cap A)}{P(A)}$

Bayes’ Rule for Odds (shortcut)

For two hypotheses $B_1, B_2$:

$$\text{posterior odds} = \text{prior odds} \times \text{likelihood ratio}, \qquad \frac{P(B_1 \mid A)}{P(B_2 \mid A)} = \frac{P(B_1)}{P(B_2)} \cdot \frac{P(A \mid B_1)}{P(A \mid B_2)}.$$

Quick Example

Prevalence $P(D) = 0.01$. Test: $P(+ \mid D) = 0.95$, $P(+ \mid D^c) = 0.02$.

$$P(D \mid +) = \frac{(0.95)(0.01)}{(0.95)(0.01) + (0.02)(0.99)} = \frac{0.0095}{0.0293} = \frac{95}{293} \approx 32\%.$$

Diagnostic Testing Template

Let

$$P(D) = \pi, \quad P(+ \mid D) = s, \quad P(+ \mid D^c) = f.$$

Then

$$P(D \mid +) = \frac{s\pi}{s\pi + f(1 - \pi)}.$$

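A small sketch of the diagnostic-testing template as a function; the parameter names `prevalence`, `sensitivity`, and `false_positive_rate` are illustrative choices matching $\pi$, $s$, and $f$ above:

```python
def posterior_given_positive(prevalence: float,
                             sensitivity: float,
                             false_positive_rate: float) -> float:
    """P(D | +) = s*pi / (s*pi + f*(1 - pi)), Bayes' rule for the partition {D, D^c}."""
    pi, s, f = prevalence, sensitivity, false_positive_rate
    return s * pi / (s * pi + f * (1 - pi))

# Quick example from above: prevalence 1%, sensitivity 95%, false-positive rate 2%.
print(posterior_given_positive(0.01, 0.95, 0.02))  # ≈ 0.324, i.e. about 32%
```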
Appendix 1: Counting

This appendix covers the fundamental counting rules and the standard formulas for sequences, permutations, and combinations. These are the tools behind $\binom{n}{k}$ and thus the Binomial and Hypergeometric distributions.

Three Basic Rules

1. Correspondence Rule. If there exists a bijection $f: B \to C$, then $|B| = |C|$.

2. Addition Rule. If $B_1, \ldots, B_m$ are pairwise disjoint and $B = \bigcup_{i=1}^{m} B_i$, then

$$|B| = \sum_{i=1}^{m} |B_i|.$$

3. Multiplication Rule. If an outcome is generated by $k$ successive choices with $n_j$ available options at stage $j$ (independent of earlier choices), then the total number of outcomes is

$$n_1 \cdot n_2 \cdots n_k = \prod_{j=1}^{k} n_j.$$

Think of it as counting paths through a decision tree.

Selection Types

| Selection type | Order matters? | Repetition allowed? | Count |
|---|---|---|---|
| Sequences | Yes | Yes | $n^k$ |
| Permutations / Orderings | Yes | No | $(n)_k$ |
| Combinations | No | No | $\binom{n}{k}$ |

Sequences (order matters, repetition allowed)

Number of sequences of length $k$ from $n$ elements:

$$n^k.$$

Example: the number of 5-letter “words” from 26 letters is $26^5$.

Permutations / Orderings (order matters, no repetition)

Number of orderings of $k$ distinct elements chosen from $n$ distinct elements:

$$(n)_k = n(n-1)\cdots(n-k+1) = \frac{n!}{(n-k)!}.$$

Conventions: $(n)_0 = 1$, $\;0! = 1$.

Special case: when $k = n$, $(n)_n = n!$ (the number of ways to arrange all $n$ objects in a row).

Example (birthday problem): the number of ways $k$ people can all have different birthdays is $(365)_k$.

Combinations (order doesn’t matter, no repetition)

Number of ways to choose a subset of size $k$ from $n$ elements:

$$\binom{n}{k} = \frac{(n)_k}{k!} = \frac{n!}{k!(n-k)!} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1}.$$

Why divide by $k!$? Each unordered choice of $k$ distinct elements corresponds to exactly $k!$ orderings, so

$$(n)_k = \binom{n}{k} \cdot k!.$$

Dividing by $k!$ removes the effect of ordering.

Equivalent Interpretations of $\binom{n}{k}$

  1. Choose positions: the number of ways to choose $k$ positions out of $n$ positions.
  2. 0–1 sequences: the number of binary sequences of length $n$ with exactly $k$ ones.
  3. $p$/$q$ placements: the number of ways to place $k$ symbols $p$ and $n - k$ symbols $q$ in a row.

These interpretations explain why $\binom{n}{k}$ appears in the Binomial distribution: it counts the number of ways to choose which $k$ of the $n$ trials are successes.

Useful Identities

Symmetry

$$\binom{n}{k} = \binom{n}{n-k}.$$

Pascal’s Rule

$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}.$$

Sum of all subsets

$$\sum_{k=0}^{n} \binom{n}{k} = 2^n.$$

Binomial Theorem

$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$

Multinomial Coefficient

If $k_0 + k_1 + \cdots + k_m = n$, the number of sequences of length $n$ containing exactly $k_i$ copies of symbol $i$ is

$$\frac{n!}{k_0! \, k_1! \cdots k_m!}.$$

Example: MISSISSIPPI has 11 letters (1 M, 4 I, 4 S, 2 P), so the number of distinct rearrangements is

$$\frac{11!}{1! \cdot 4! \cdot 4! \cdot 2!} = 34{,}650.$$

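A short sketch of these counting formulas in code, cross-checked against Python's `math` module; the `multinomial` helper is an illustrative addition:

```python
from math import comb, factorial, perm

# Falling factorial (n)_k and binomial coefficient, checked against math.perm / math.comb.
n, k = 10, 4
falling = factorial(n) // factorial(n - k)
assert falling == perm(n, k)                  # (n)_k = n! / (n-k)!
assert comb(n, k) == falling // factorial(k)  # C(n, k) = (n)_k / k!

def multinomial(*counts: int) -> int:
    """n! / (k0! k1! ... km!) for counts k0, ..., km summing to n."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total

# MISSISSIPPI: 1 M, 4 I, 4 S, 2 P
print(multinomial(1, 4, 4, 2))  # 34650
```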
Section 2.1: The Binomial Distribution

What is this about?

Repeat the same experiment $n$ times independently. Each trial is a success w.p. (with probability) $p$ and a failure w.p. $q = 1 - p$.

The Core Formula

$$P(k \text{ successes in } n \text{ trials}) = \binom{n}{k} p^k q^{\,n-k},$$

where

$$\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1},$$

and $k \in \{0, 1, \ldots, n\}$.

Quick example: exactly 2 sixes in 9 die rolls ($p = 1/6$, $q = 5/6$):

$$\binom{9}{2}\left(\frac{1}{6}\right)^2\left(\frac{5}{6}\right)^7 = 36 \cdot \frac{5^7}{6^9} \approx 0.279.$$

Useful Properties

Binomial expansion (sum to 1)

$$\sum_{k=0}^{n} \binom{n}{k} p^k q^{\,n-k} = (p + q)^n = 1.$$

Fair coin special case ($p = q = 1/2$):

$$P(k \text{ heads in } n \text{ tosses}) = \frac{\binom{n}{k}}{2^n}.$$

Consecutive Odds Ratio

$$R(k) = \frac{P(k)}{P(k-1)} = \frac{n - k + 1}{k} \cdot \frac{p}{q}.$$

Start with $P(0) = q^n$, then $P(1) = P(0) \cdot R(1)$, and so on.

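A minimal sketch of building the whole binomial distribution with the consecutive odds ratio, checked against the closed form:

```python
from math import comb

def binomial_pmf_via_odds_ratio(n: int, p: float) -> list[float]:
    """Build P(0), ..., P(n) from P(0) = q^n and P(k) = P(k-1) * R(k), R(k) = (n-k+1)/k * p/q."""
    q = 1 - p
    probs = [q ** n]
    for k in range(1, n + 1):
        probs.append(probs[-1] * (n - k + 1) / k * (p / q))
    return probs

probs = binomial_pmf_via_odds_ratio(9, 1 / 6)
print(round(probs[2], 3))  # ≈ 0.279, the "2 sixes in 9 rolls" example
assert all(abs(pk - comb(9, k) * (1 / 6) ** k * (5 / 6) ** (9 - k)) < 1e-12
           for k, pk in enumerate(probs))
```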
Mode

$$m = \lfloor np + p \rfloor = \lfloor (n+1)p \rfloor.$$

Probabilities increase up to $m$, then decrease after. If $(n+1)p \in \mathbb{Z}$, there are two modes: $m$ and $m - 1$.

Mean

$$\mu = np.$$

Best-of-$(2n-1)$ series

If Team A wins each game independently with probability $p$, then

$$P(\text{A wins a best-of-}(2n-1)\text{ series}) = \sum_{k=n}^{2n-1} \binom{k-1}{n-1} p^{\,n}(1-p)^{k-n}.$$

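A sketch of the best-of-$(2n-1)$ formula, sanity-checked against the equivalent "imagine all $2n-1$ games are played" binomial tail; the value $p = 0.6$ is just an illustrative choice:

```python
from math import comb

def p_win_series(p: float, n: int) -> float:
    """Sum over the game k at which Team A collects its n-th win."""
    return sum(comb(k - 1, n - 1) * p**n * (1 - p)**(k - n) for k in range(n, 2 * n))

def p_win_series_tail(p: float, n: int) -> float:
    """Equivalent form: P(A wins at least n of 2n-1 games if all games were played)."""
    m = 2 * n - 1
    return sum(comb(m, j) * p**j * (1 - p)**(m - j) for j in range(n, m + 1))

p, n = 0.6, 4  # a best-of-7 series
print(round(p_win_series(p, n), 4))  # ≈ 0.7102
assert abs(p_win_series(p, n) - p_win_series_tail(p, n)) < 1e-12
```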
Section 2.4: Poisson Approximation

When to Use It

The normal approximation to the binomial is poor when $n$ is large but $p$ is very small (or very close to 1). The Poisson approximation depends mainly on $\mu = np$.

The Key Formula

$$P(k \text{ successes}) \approx e^{-\mu} \frac{\mu^k}{k!}, \qquad k = 0, 1, 2, \ldots,$$

where $\mu = np$. Conditions: $n$ large, $p$ small, $\mu = np$ moderate.

Why It Works

$$P(0) = (1 - p)^n \approx (e^{-p})^n = e^{-np} = e^{-\mu}.$$

Consecutive odds ratio:

$$R(k) = \frac{P(k)}{P(k-1)} = \frac{n - k + 1}{k} \cdot \frac{p}{1 - p} \approx \frac{\mu}{k}.$$

Thus

$$P(k) = P(0) \prod_{j=1}^{k} R(j) \approx e^{-\mu} \cdot \frac{\mu^k}{k!}.$$

The Poisson($\mu$) Distribution

$$P_\mu(k) = e^{-\mu} \frac{\mu^k}{k!}, \qquad k = 0, 1, 2, \ldots,$$

and $\displaystyle\sum_{k=0}^{\infty} P_\mu(k) = 1$.

Quick example: 200 items, each 1% likely to be defective. Find $P(\ge 2)$. Here $\mu = np = 200(0.01) = 2$.

$$P(\ge 2) = 1 - P(0) - P(1) = 1 - e^{-2}\frac{2^0}{0!} - e^{-2}\frac{2^1}{1!} = 1 - 3e^{-2} \approx 0.594.$$

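A quick comparison of the exact binomial answer with the Poisson approximation for this example:

```python
from math import comb, exp, factorial

n, p = 200, 0.01
mu = n * p  # 2.0

def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k):
    return exp(-mu) * mu**k / factorial(k)

exact = 1 - binom_pmf(0) - binom_pmf(1)       # exact binomial, ≈ 0.5954
approx = 1 - poisson_pmf(0) - poisson_pmf(1)  # Poisson(2): 1 - 3e^{-2} ≈ 0.5940
print(round(exact, 4), round(approx, 4))
```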
Section 2.2: Normal Approximation

Method

1. The Normal Curve

Normal density with mean $\mu$ and standard deviation $\sigma$:

$$y = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right), \qquad -\infty < x < \infty.$$

$\mu$ controls the center; $\sigma$ controls the spread; the total area under the curve is 1.

2. Standard Units and the Standard Normal

Convert $X \sim N(\mu, \sigma^2)$ to $Z \sim N(0,1)$ via

$$z = \frac{x - \mu}{\sigma}.$$

Standard normal density:

$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}.$$

3. The Standard Normal CDF $\Phi(z)$

$$\Phi(z) = \int_{-\infty}^{z} \phi(y) \, dy = P(Z \le z).$$

Symmetry:

$$\Phi(-z) = 1 - \Phi(z).$$

Interval probability (notation):

$$\Phi(a, b) = \Phi(b) - \Phi(a).$$

Symmetric interval:

$$\Phi(-z, z) = 2\Phi(z) - 1.$$

Three values to memorize:

| Interval | Probability |
|---|---|
| $\Phi(-1, 1)$ | $\approx 68\%$ |
| $\Phi(-2, 2)$ | $\approx 95\%$ |
| $\Phi(-3, 3)$ | $\approx 99.7\%$ |

4. Normal Approximation to the Binomial

For $S_n \sim \text{Binomial}(n, p)$, when $\sqrt{npq}$ is large enough:

$$\mu = np, \qquad \sigma = \sqrt{npq}, \qquad q = 1 - p.$$

Continuity correction:

$$P(a \le S_n \le b) \approx \Phi\!\left(\frac{b + \tfrac{1}{2} - \mu}{\sigma}\right) - \Phi\!\left(\frac{a - \tfrac{1}{2} - \mu}{\sigma}\right).$$

Quick example: $P(S_{100} = 50)$ for 100 fair coin tosses. Here $\mu = 50$, $\sigma = 5$, $a = b = 50$:

$$P(50) \approx \Phi\!\left(\frac{50.5 - 50}{5}\right) - \Phi\!\left(\frac{49.5 - 50}{5}\right) = \Phi(0.1) - \Phi(-0.1) = 2\Phi(0.1) - 1 \approx 0.0796.$$

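A sketch of the continuity-corrected approximation, evaluating $\Phi$ with `math.erf` and comparing against the exact binomial value:

```python
from math import comb, erf, sqrt

def Phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_approx(a: int, b: int, n: int, p: float) -> float:
    """P(a <= S_n <= b) with the continuity correction."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return Phi((b + 0.5 - mu) / sigma) - Phi((a - 0.5 - mu) / sigma)

n, p = 100, 0.5
exact = comb(n, 50) / 2**n                          # ≈ 0.0796
print(round(normal_approx(50, 50, n, p), 4), round(exact, 4))
```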
5. Square Root Law & Confidence Intervals

The success count fluctuates around $np$ on the scale $\sqrt{n}$; the proportion of successes fluctuates around $p$ on the scale $1/\sqrt{n}$.

Conservative 99.99% CI for an unknown $p$, given the observed proportion $\hat{p}$ from $n$ trials:

$$\hat{p} \pm \frac{2}{\sqrt{n}}.$$

Why 99.99%? The worst-case SD of $\hat{p}$ is $\sigma_{\max} = \frac{1}{2\sqrt{n}}$ (attained at $p = 1/2$). So $\frac{2}{\sqrt{n}} = 4\sigma_{\max}$, and $\pm 4\sigma$ under the normal approximation covers $\approx 99.99\%$.

To halve the CI width, you must quadruple $n$.


Section 2.5: Random Sampling

1. Setup

A population of size $N$ contains $G$ good and $B$ bad elements, with $G + B = N$. Sample size $n = g + b$, where $g$ = number of good drawn and $b$ = number of bad drawn.

2. Sampling WITH Replacement

Draws are independent; $p = G/N$. The number of good elements drawn follows $\text{Binomial}(n, p)$:

$$P(g \text{ good and } b \text{ bad}) = \binom{n}{g} \frac{G^g B^b}{N^n} = \binom{n}{g} p^g q^b, \qquad q = \frac{B}{N}.$$

3. Sampling WITHOUT Replacement

Draws are dependent. Hypergeometric distribution:

$$P(g \text{ good and } b \text{ bad}) = \frac{\binom{G}{g}\binom{B}{b}}{\binom{N}{n}}.$$

Equivalent ordered form:

$$P(g \text{ good and } b \text{ bad}) = \binom{n}{g} \frac{(G)_g \, (B)_b}{(N)_n},$$

where $(M)_k = M(M-1)\cdots(M-k+1)$ (falling factorial, $k$ factors).

4. Binomial Approximation to Hypergeometric

When $N$, $G$, and $B$ are large relative to $n$, sampling without replacement $\approx$ sampling with replacement:

$$\frac{(N)_n}{N^n} \to 1 \quad \text{as } N \to \infty,$$

so the hypergeometric probability $\approx$ the binomial probability.

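A small numerical check of the hypergeometric formula and its binomial approximation; the population sizes below are illustrative, chosen so the good-element fraction stays at 30%:

```python
from math import comb

def hypergeometric(g: int, n: int, G: int, B: int) -> float:
    """P(g good in a sample of n drawn without replacement from G good + B bad)."""
    return comb(G, g) * comb(B, n - g) / comb(G + B, n)

def binomial(g: int, n: int, p: float) -> float:
    return comb(n, g) * p**g * (1 - p)**(n - g)

# Sample n = 10; the approximation improves as the population grows.
for G, B in [(30, 70), (300, 700), (30000, 70000)]:
    print(G + B, round(hypergeometric(3, 10, G, B), 4), round(binomial(3, 10, 0.3), 4))
```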
5. Confidence Intervals (brief)

For large $n$, the sample proportion $\hat{p}$ satisfies

$$P\!\left(p - \frac{1}{\sqrt{n}} < \hat{p} < p + \frac{1}{\sqrt{n}}\right) \ge 95\%,$$

so $\hat{p} \pm \dfrac{1}{\sqrt{n}}$ is an approximate 95% CI for the unknown population proportion $p$.
So p^±1n\hat{p} \pm \dfrac{1}{\sqrt{n}} is an approximate 95% CI for unknown population proportion pp.