Correlation measures the degree to which two phenomena are related to one another. For example, there is a correlation between summer temperatures and ice cream sales. When one goes up, so does the other. Two variables are positively correlated if a change in one is associated with a change in the other in the same direction, such as the relationship between height and weight. Taller people weigh more (on average); shorter people weigh less. A correlation is negative if a positive change in one variable is associated with a negative change in the other, such as the relationship between exercise and weight.
The tricky thing about these kinds of associations is that not every observation fits the pattern. Sometimes short people weigh more than tall people. Sometimes people who don’t exercise are skinnier than people who exercise all the time. Still, there is a meaningful relationship between height and weight, and between exercise and weight.
# Galton's height data: correlation between midparent height and child height
library(HistData)
plot(GaltonFamilies$midparentHeight, GaltonFamilies$childHeight)
cor(GaltonFamilies$midparentHeight, GaltonFamilies$childHeight)
## [1] 0.3209499
The correlation coefficient has two fabulously attractive characteristics. First, for math reasons that have been relegated to the appendix, it is a single number ranging from –1 to 1. A correlation of 1, often described as perfect correlation, means that every change in one variable is associated with an equivalent change in the other variable in the same direction.
The closer the correlation is to 1 or –1, the stronger the association. A correlation of 0 (or close to it) means that the variables have no meaningful association with one another, such as the relationship between shoe size and SAT scores.
The second attractive feature of the correlation coefficient is that it has no units attached to it.
Example: The SAT Reasoning Test, formerly known as the Scholastic Aptitude Test, is a standardized exam made up of three sections: math, reading, and writing. Why is a four-hour test so important when college admissions officers have access to four years of high school grades?
The purpose of the test is to measure academic ability and predict college performance.
The correlation between high school grade point average and first-year college grade point average is .56.
The correlation between the SAT composite score (critical reading, math, and writing) and first-year college GPA is also .56.
In fact, the best predictor of all is a combination of SAT scores and high school GPA, which has a correlation of .64 with first-year college grades.
One crucial point in this general discussion is that correlation does not imply causation; a positive or negative association between two variables does not necessarily mean that a change in one of the variables is causing the change in the other.
Example: There is likely a positive correlation between a student’s SAT scores and the number of televisions that his family owns. This does not mean that overeager parents can boost their children’s test scores by buying an extra five televisions for the house. Nor does it likely mean that watching lots of television is good for academic achievement.
The most logical explanation for such a correlation would be that highly educated parents can afford a lot of televisions and tend to have children who test better than average. Both the televisions and the test scores are likely caused by a third variable, which is parental education.
Probability is the study of events and outcomes involving an element of uncertainty.
Investing in the stock market involves uncertainty.
So does flipping a coin, which may come up heads or tails.
Flipping a coin four times in a row involves additional layers of uncertainty, because each of the four flips can result in a head or a tail.
If you flip a coin four times in a row, I cannot know the outcome in advance with certainty (nor can you). Yet I can determine in advance that some outcomes (two heads, two tails) are more likely than others (four heads).
Many events have known probabilities.
Some events have probabilities that can be inferred on the basis of past data.
Example: The Australian Transport Safety Board published a report quantifying the fatality risks for different modes of transport. Despite widespread fear of flying, the risks associated with commercial air travel are tiny. Australia hasn’t had a commercial air fatality since the 1960s, so the fatality rate per 100 million kilometers traveled is essentially zero. The rate for drivers is .5 fatalities per 100 million kilometers traveled. The really impressive number is for motorcycles—if you aspire to be an organ donor. The fatality rate is thirty-five times higher for motorcycles than for cars.
Probability can also sometimes tell us after the fact what likely happened and what likely did not happen.
Example: Humans share similarities in their DNA, just as we share other similarities: shoe size, height, eye color. (More than 99 percent of all DNA is identical among all humans.) If researchers have access to only a small sample of DNA on which only a few loci can be tested, it’s possible that thousands or even millions of individuals may share that genetic fragment. Therefore, the more loci that can be tested, and the more natural genetic variation there is in each of those loci, the more certain the match becomes. Or, to put it a bit differently, the less likely it becomes that the DNA sample will match more than one person.
Often it is extremely valuable to know the likelihood of multiple events’ happening.
The probability of two independent events’ both happening is the product of their respective probabilities. In other words, the probability of Event A happening and Event B happening is the probability of Event A multiplied by the probability of Event B.
This rule is intuitive. If the probability of flipping heads with a fair coin is ½, then the probability of flipping heads twice in a row is ½ × ½, or ¼. The probability of flipping three heads in a row is ⅛, the probability of four heads in a row is 1/16, and so on.
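The multiplication rule is easy to check by simulation; a minimal sketch in R:

```r
# Verify the multiplication rule: flip two fair coins many times
# and count how often both come up heads.
set.seed(1)
Time <- 100000
flip1 <- sample(c("H", "T"), Time, replace = TRUE)
flip2 <- sample(c("H", "T"), Time, replace = TRUE)
mean(flip1 == "H" & flip2 == "H")   # close to 1/2 x 1/2 = 1/4
```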
This explains why the system administrator at your school or office is constantly on your case to improve the “quality” of your password. If you have a six-digit password using only numerical digits, we can calculate the number of possible passwords: 10 × 10 × 10 × 10 × 10 × 10, which equals 10^6, or 1,000,000. That sounds like a lot of possibilities, but a computer could blow through all 1,000,000 possible combinations in a fraction of a second.
There is one crucial distinction here. This formula is applicable only if the events are independent, meaning that the outcome of one has no effect on the outcome of another.
Suppose you are interested in the probability that one event happens or another event happens: outcome A or outcome B (this time assuming that they are mutually exclusive, meaning that they cannot both happen). In this case, the probability of getting A or B is the sum of their individual probabilities: the probability of A plus the probability of B.
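A quick simulation of the addition rule with a die (rolling a 1 and rolling a 2 cannot both happen on a single roll):

```r
# Verify the addition rule for mutually exclusive outcomes:
# the chance of rolling a 1 or a 2 is 1/6 + 1/6 = 1/3.
set.seed(1)
rolls <- sample(1:6, 100000, replace = TRUE)
mean(rolls == 1 | rolls == 2)   # close to 1/3
```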
Probability also enables us to calculate what might be the most useful tool in all of managerial decision making, particularly finance: expected value.
An example makes this clearer. Suppose you are invited to play a game in which you roll a single die. The payoff to this game is $1 if you roll a 1; $2 if you roll a 2; $3 if you roll a 3; and so on. What is the expected value for a single roll of the die? Each possible outcome has a 1/6 probability, so the expected value is:
(1/6)($1) + (1/6)($2) + (1/6)($3) + (1/6)($4) + (1/6)($5) + (1/6)($6) = $21/6, or $3.50.
Suppose you have the chance to play the above game for $3 a throw. Does it make sense to play? Yes, because the expected value of the outcome ($3.50) is higher than the cost of playing ($3.00).
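The expected-value calculation for the die game takes one line of R:

```r
# Expected value of the die game: each payoff weighted by its probability
payoffs <- 1:6
probs <- rep(1/6, 6)
sum(payoffs * probs)
## [1] 3.5
```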
The same basic analysis can illustrate why you should never buy a lottery ticket. In Illinois, the probabilities associated with the various possible payoffs for the game are printed on the back of each ticket. I purchased a $1 instant ticket. (Note to self: Is this tax deductible?) On the back—in tiny, tiny print—are the chances of winning different cash prizes, or a free new ticket: 1 in 10 (free ticket); 1 in 15 ($2); 1 in 42.86 ($4); 1 in 75 ($5); and so on up to the 1 in 40,000 chance of winning $1,000. I calculated the expected payout for my instant ticket by adding up each possible cash prize weighted by its probability. It turns out that my $1 lottery ticket has an expected payout of roughly $.56, making it an absolutely miserable way to spend $1.
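The same weighted sum can be sketched in R. Only the prizes listed above are included (the ticket’s full prize list, the “and so on,” is omitted here), so this partial sum falls short of the roughly $.56 total; the free ticket is valued at $1.

```r
# Partial expected payout from the prizes listed above only;
# the full prize list would bring the total to roughly $.56.
prizes <- c(1, 2, 4, 5, 1000)              # free ticket valued at $1
odds <- c(1/10, 1/15, 1/42.86, 1/75, 1/40000)
sum(prizes * odds)   # roughly $.42 from these prizes alone
```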
The law of large numbers explains why casinos always make money in the long run. The probabilities associated with all casino games favor the house (assuming that the casino can successfully prevent blackjack players from counting cards). If enough bets are wagered over a long enough time, the casino will be certain to win more than it loses.
Expected value can also help us untangle complex decisions that involve many contingencies at different points in time.
Example: The same basic process can be used to explain a seemingly counterintuitive phenomenon. Sometimes it does not make sense to screen the entire population for a rare but serious disease, such as HIV/AIDS. Suppose we can test for some rare disease with a high degree of accuracy. For the sake of example, let’s assume the disease affects 1 of every 100,000 adults and the test is 99.99 percent accurate. The test never generates a false negative (meaning that it never misses someone who has the disease); however, roughly 1 in 10,000 tests conducted on a healthy person will generate a false positive, meaning that the person tests positive but does not actually have the disease. The striking outcome here is that despite the impressive accuracy of the test, most of the people who test positive will not have the disease. This will generate enormous anxiety among those who falsely test positive; it can also waste finite health care resources on follow-up tests and treatment.
Only 1,750 adults have the disease. They all test positive. Over 174 million adults do not have the disease. Of this healthy group who are tested, 99.99 percent get the correct result that they do not have the disease. Only a fraction .0001 (.01 percent) get a false positive.
But .0001 of 174 million is still a big number. In fact, 17,500 people will, on average, get false positives.
Let’s look at what that means. A total of 19,250 people are notified that they have the disease; only 9 percent of them are actually sick!
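These numbers can be reproduced in a few lines of R; the adult population of 175 million is implied by the totals above.

```r
# Reproducing the false-positive arithmetic above.
population <- 175000000       # implied by 1,750 sick at 1 in 100,000
p_disease <- 1 / 100000
false_pos_rate <- .0001       # 1 in 10,000 healthy people test positive

sick <- population * p_disease                        # 1,750 true positives
false_pos <- (population - sick) * false_pos_rate     # roughly 17,500 false positives
sick / (sick + false_pos)     # share of positives who are actually sick, about 9 percent
```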
Example: Securities fraud. The Securities and Exchange Commission (SEC), the government agency responsible for enforcing federal laws related to securities trading, uses a similar methodology for catching insider traders. (Insider trading involves illegally using private information, such as a law firm’s knowledge of an impending corporate acquisition, to trade stock or other securities in the affected companies.) The SEC uses powerful computers to scrutinize hundreds of millions of stock trades and look for suspicious activity, such as a big purchase of shares in a company just before a takeover is announced, or the dumping of shares just before a company announces disappointing earnings. The SEC will also investigate investment managers with unusually high returns over long periods of time. (Both economic theory and historical data suggest that it is extremely difficult for a single investor to get above-average returns year after year.)
Probability is not deterministic. No, you shouldn’t buy a lottery ticket—but you still might win money if you do. And yes, probability can help us catch cheaters and criminals—but when used inappropriately it can also send innocent people to jail.
The “Monty Hall problem” is a famous probability-related conundrum faced by participants on the game show Let’s Make a Deal, which premiered in the United States in 1963 and is still running in some markets around the world.
At the end of each day’s show a contestant was invited to stand with host Monty Hall facing three big doors: Door no. 1, Door no. 2, and Door no. 3. Monty explained to the contestant that there was a highly desirable prize behind one of the doors and a goat behind the other two doors. The player chose one of the three doors and would get as a prize whatever was behind it.
After the contestant chose a door, Monty would open one of the two doors that the contestant had not picked, always revealing a goat. At that point, Monty would ask the contestant if he would like to change his pick—to switch from the closed door that he had picked originally to the other remaining closed door.
Should the contestant switch? Yes.
This answer seems entirely unintuitive at first. It would appear that the contestant has a one-third chance of winning no matter what he does, since there are three closed doors.
The answer lies in the fact that Monty Hall knows what is behind each door. If the contestant picks Door no. 1 and there is a car behind it, then Monty can open either no. 2 or no. 3 to display a goat.
If the contestant picks Door no. 1 and the car is behind no. 2, then Monty opens no. 3.
If the contestant picks Door no. 1 and the car is behind no. 3, then Monty opens no. 2.
By switching after a door is opened, the contestant gets the benefit of choosing two doors rather than one.
Assume that Monty Hall offers you a choice from among 100 doors rather than just three. After you pick your door, say, no. 47, he opens 98 other doors with goats behind them. Now there are only two doors that remain closed, no. 47 (your original choice) and one other, say, no. 61. Should you switch? Of course you should. There is a 99 percent chance that the car was behind one of the doors that you did not originally choose.
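A simulation confirms the two-thirds advantage of switching (the door numbering here is arbitrary, not taken from the show):

```r
# Monty Hall simulation: compare the "stay" and "switch" strategies.
set.seed(1)
Time <- 10000
wins_stay <- 0
wins_switch <- 0
for (i in 1:Time) {
  car <- sample(1:3, 1)
  pick <- sample(1:3, 1)
  # Monty opens a goat door that the contestant did not pick
  goat_doors <- setdiff(1:3, c(car, pick))
  open <- if (length(goat_doors) == 1) goat_doors else sample(goat_doors, 1)
  switch_pick <- setdiff(1:3, c(pick, open))
  if (pick == car) wins_stay <- wins_stay + 1
  if (switch_pick == car) wins_switch <- wins_switch + 1
}
wins_stay / Time     # close to 1/3
wins_switch / Time   # close to 2/3
```

The `if (length(goat_doors) == 1)` guard matters: in R, `sample(x, 1)` on a single integer `n` samples from `1:n` rather than returning `n`.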
Statistics cannot be any smarter than the people who use them. And in some cases, they can make smart people do dumb things. One of the most irresponsible uses of statistics in recent memory involved the mechanism for gauging risk on Wall Street prior to the 2008 financial crisis.
At that time, firms throughout the financial industry used a common barometer of risk, the Value at Risk model, or VaR. In theory, VaR combined the elegance of an indicator (collapsing lots of information into a single number) with the power of probability (attaching an expected gain or loss to each of the firm’s assets or trading positions).
Prior to the financial crisis of 2008, firms trusted the VaR model to quantify their overall risk. The formula even took into account the correlations among different positions. For example, if two investments had expected returns that were negatively correlated, a loss in one would likely have been offset by a gain in the other, making the two investments together less risky than either one separately.
Then, even better, the aggregate risk for the firm could be calculated at any point in time by taking the same basic process one step further. The underlying mathematical mechanics are obviously fabulously complicated, as firms had a dizzying array of investments in different currencies, with different amounts of leverage (the amount of money that was borrowed to make the investment), trading in markets with different degrees of liquidity, and so on.
The primary critique of VaR is that the underlying risks associated with financial markets are not as predictable as a coin flip.
The false precision embedded in the models created a false sense of security. The VaR was like a faulty speedometer, which is arguably worse than no speedometer at all. If you place too much faith in the broken speedometer, you will be oblivious to other signs that your speed is unsafe. In contrast, if there is no speedometer at all, you have no choice but to look around for clues as to how fast you are really going.
Unfortunately, there were two huge problems with the risk profiles encapsulated by the VaR models. First, the underlying probabilities on which the models were built were based on past market movements; however, in financial markets, the future does not necessarily look like the past.
Second, even if the underlying data could accurately predict future risk, the 99 percent assurance offered by the VaR model was dangerously useless, because it’s the 1 percent that is going to really mess you up. In fact, the models had nothing to say about how bad that 1 percent scenario might turn out to be. Very little attention was devoted to the “tail risk,” the small risk (named for the tail of the distribution) of some catastrophic outcome.
Probability offers a powerful and useful set of tools—many of which can be employed correctly to understand the world or incorrectly to wreak havoc on it.
Now that you are armed with this powerful knowledge, let’s assume that you have been promoted to head of risk management at a major airline. Your assistant informs you that the probability of a jet engine’s failing for any reason during a transatlantic flight is 1 in 100,000. Given the number of transatlantic flights, this is not an acceptable risk. Fortunately each jet making such a trip has at least two engines. Your assistant has calculated that the risk of both engines’ shutting down over the Atlantic must be (1/100,000)^2, or 1 in 10 billion, which is a reasonable safety risk.
The two engine failures are not independent events. If a plane flies through a flock of geese while taking off, both engines are likely to be compromised in a similar way. The same would be true of many other factors that affect the performance of a jet engine, from weather to improper maintenance. If one engine fails, the probability that the second engine fails is going to be significantly higher than 1 in 100,000.
If you flip a fair coin 1,000,000 times and get 1,000,000 heads in a row, the probability of getting tails on the next flip is still ½. The very definition of statistical independence between two events is that the outcome of one has no effect on the outcome of the other. Even if you don’t find the statistics persuasive, you might ask yourself about the physics: How can flipping a series of tails in a row make it more likely that the coin will turn up heads on the next flip? The mistaken belief that it can is known as “the gambler’s fallacy.”
You’ve probably read the story in the newspaper, or perhaps seen the news exposé: Some statistically unlikely number of people in a particular area have contracted a rare form of cancer. It must be the water, or the local power plant, or the cell phone tower. Of course, any one of those things might really be causing adverse health outcomes. (Later chapters will explore how statistics can identify such causal relationships.) But this cluster of cases may also be the product of pure chance, even when the number of cases appears highly improbable.
Yes, the probability that five people in the same school or church or workplace will contract the same rare form of leukemia may be one in a million, but there are millions of schools and churches and workplaces. It’s not highly improbable that five people might get the same rare form of leukemia in one of those places.
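The arithmetic behind this is simple. If each of a million independent places has a one-in-a-million chance of producing such a cluster, the probability of seeing at least one cluster somewhere is the complement of seeing none:

```r
# Chance of at least one one-in-a-million cluster across a million places
1 - (1 - 1/1000000)^1000000   # about .63
```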
The same phenomenon can explain why students who do much better than they normally do on some kind of test will, on average, do slightly worse on a retest, and students who have done worse than usual will tend to do slightly better when retested.
When is it okay to act on the basis of what probability tells us is likely to happen, and when is it not okay?
In 2003, Anna Diamantopoulou, the European commissioner for employment and social affairs, proposed a directive declaring that insurance companies may not charge different rates to men and women, because it violates the European Union’s principle of equal treatment. To insurers, however, gender-based premiums aren’t discrimination; they’re just statistics.
Men typically pay more for auto insurance because they crash more. Women pay more for annuities (a financial product that pays a fixed monthly or yearly sum until death) because they live longer. Obviously many women crash more than many men, and many men live longer than many women.
Data are to statistics what a good offensive line is to a star quarterback. In front of every star quarterback is a good group of blockers. They usually don’t get much credit. But without them, you won’t ever see a star quarterback. Assume that you are using good data, just as a cookbook assumes that you are not buying rancid meat and rotten vegetables. But even the finest recipe isn’t going to salvage a meal that begins with spoiled ingredients. So it is with statistics; no amount of fancy analysis can make up for fundamentally flawed data. Hence the expression “garbage in, garbage out.” Data deserve respect, just like offensive linemen.
We generally ask our data to do one of three things.
First, we may demand a data sample that is representative of some larger group or population. One of the most powerful findings in statistics is that inferences made from reasonably large, properly drawn samples can be every bit as accurate as attempting to elicit the same information from the entire population.
The easiest way to gather a representative sample of a larger population is to select some subset of that population randomly. (Shockingly, this is known as a simple random sample.) The key to this methodology is that each observation in the relevant population must have an equal chance of being included in the sample.
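In R, a simple random sample is exactly what `sample()` produces; the population of 100,000 labeled individuals below is made up for illustration:

```r
# Simple random sample: each member of the (hypothetical) population
# has an equal chance of being chosen.
set.seed(1)
population <- 1:100000                  # stand-in for 100,000 people
my_sample <- sample(population, 1000)   # simple random sample of 1,000
mean(population)                        # population mean is 50000.5
mean(my_sample)                         # sample mean lands close to it
```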
A representative sample is a fabulously important thing, for it opens the door to some of the most powerful tools that statistics has to offer.
Getting a good sample is harder than it looks.
Many of the most egregious statistical assertions are caused by good statistical methods applied to bad samples, not the opposite.
Size matters, and bigger is better. The details will be explained in the coming chapters, but it should be intuitive that a larger sample will help to smooth away any freak variation.
The second thing we often ask of data is that they provide some source of comparison. Is a new medicine more effective than the current treatment? Are ex-convicts who receive job training less likely to return to prison than ex-convicts who do not receive such training? Do students who attend charter schools perform better than similar students who attend regular public schools?
In these cases, the goal is to find two groups of subjects who are broadly similar except for the application of whatever “treatment” we care about.
In the physical and biological sciences, creating treatment and control groups is relatively straightforward.
One recurring research challenge with human subjects is creating treatment and control groups that differ only in that one group is getting the treatment and the other is not.
For this reason, the “gold standard” of research is randomization, a process by which human subjects (or schools, or hospitals, or whatever we’re studying) are randomly assigned to either the treatment or the control group. We do not assume that all the experimental subjects are identical. Instead, we assume that randomization will evenly divide all relevant characteristics between the two groups—both the characteristics we can observe, like race or income, but also confounding characteristics that we cannot measure or had not considered, such as perseverance or faith.
We sometimes have no specific idea what we will do with the information—but we suspect it will come in handy at some point. This is similar to a crime scene detective who demands that all possible evidence be captured so that it can be sorted later for clues. Some of this evidence will prove useful, some will not. If we knew exactly what would be useful, we probably would not need to be doing the investigation in the first place.
Behind every important study there are good data that made the analysis possible. And behind every bad study . . . well, read on. People often speak about “lying with statistics.” In fact, some of the most egregious statistical mistakes involve lying with data ; the statistical analysis is fine, but the data on which the calculations are performed are bogus or inappropriate. Here are some common examples of “garbage in, garbage out.”
Selection Bias
Publication Bias
Recall Bias
Survivorship Bias
What is a traditional mutual fund company to do? Bogus data to the rescue! Here is how they can “beat the market” without beating the market. A large mutual fund company will open many new actively managed funds (meaning that experts are picking the stocks, often with a particular focus or strategy). For the sake of example, let’s assume that a mutual fund company opens twenty new funds, each of which has roughly a 50 percent chance of beating the S&P 500 in a given year. (This assumption is consistent with long-term data.) Now, basic probability suggests that about ten of the firm’s new funds will beat the S&P 500 the first year; five funds will beat it two years in a row; and two or three will beat it three years in a row.
Here comes the clever part. At that point, the new mutual funds with unimpressive returns relative to the S&P 500 are quietly closed. (Their assets are folded into other existing funds.) The company can then heavily advertise the two or three new funds that have “consistently outperformed the S&P 500”—even if that performance is the stock-picking equivalent of flipping three heads in a row. The subsequent performance of these funds is likely to revert to the mean, albeit after investors have piled in. The number of mutual funds or investment gurus who have consistently beaten the S&P 500 over a long period is shockingly small.
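The survivorship arithmetic can be sketched directly. Under the coin-flip assumption above, the expected number of three-year “winners” among twenty funds is 20 × (1/2)^3 = 2.5:

```r
# Survivorship sketch: 20 funds, each with an independent 50 percent
# chance of beating the index each year; count 3-year "winners".
set.seed(1)
Time <- 10000
survivors <- rep(0, Time)
for (i in 1:Time) {
  beats <- matrix(runif(20 * 3) < .5, nrow = 20)   # 20 funds x 3 years
  survivors[i] <- sum(rowSums(beats) == 3)         # funds that won all 3 years
}
mean(survivors)   # close to 20 * (1/2)^3 = 2.5
```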
At times, statistics seems almost like magic. We are able to draw sweeping and powerful conclusions from relatively little data. Somehow we can gain meaningful insight into a presidential election by calling a mere one thousand American voters. We can test a hundred chicken breasts for salmonella at a poultry processing plant and conclude from that sample alone that the entire plant is safe or unsafe. Where does this extraordinary power to generalize come from?
Much of it comes from the central limit theorem.
The core principle underlying the central limit theorem is that a large, properly drawn sample will resemble the population from which it is drawn. Obviously there will be variation from sample to sample, but the probability that any sample will deviate massively from the underlying population is very low.
That’s the basic intuition behind the central limit theorem. When we add some statistical bells and whistles, we can quantify the likelihood that you will be right or wrong. For example, we might calculate that in a marathon field of 10,000 runners with a mean weight of 155 pounds, there is less than a 1 in 100 chance that a random sample of 60 of those runners would have a mean weight of 220 pounds or more.
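That calculation can be sketched with R’s normal distribution functions. The runners’ population standard deviation is not given above, so the 25-pound figure below is an assumption for illustration only:

```r
# Back-of-the-envelope version of the marathon example.
pop_mean <- 155
pop_sd <- 25                  # assumed population sd, for illustration
n <- 60
se <- pop_sd / sqrt(n)        # standard error of the sample mean
1 - pnorm(220, mean = pop_mean, sd = se)   # P(sample mean >= 220): essentially zero
```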
This kind of analysis all stems from the central limit theorem, which, from a statistical standpoint, has LeBron James-like power and elegance. According to the central limit theorem, the sample means for any population will be distributed roughly as a normal distribution around the population mean. Hang on for a moment as we unpack that statement.
Consider the household income distribution in the United States. Household income is not distributed normally in America; instead, it tends to be skewed to the right. No household can earn less than $0 in a given year, so that must be the lower bound for the distribution. Meanwhile, a small group of households can earn staggeringly large annual incomes—hundreds of millions or even billions of dollars in some cases.
The median household income in the United States is roughly $51,900; the mean household income is $70,900.
Now suppose we take a random sample of 1,000 U.S. households and gather information on annual household income. On the basis of the information above, and the central limit theorem, what can we infer about this sample?
Quite a lot, it turns out. First of all, our best guess for what the mean of any sample will be is the mean of the population from which it’s drawn. The whole point of a representative sample is that it looks like the underlying population. A properly drawn sample will, on average, look like America. There will be hedge fund managers and homeless people and police officers and everyone else—all roughly in proportion to their frequency in the population. Therefore, we would expect the mean household income for a representative sample of 1,000 American households to be about $70,900. Will it be exactly that? No. But it shouldn’t be wildly different either.
If we took multiple samples of 1,000 households, we would expect the different sample means to cluster around the population mean, $70,900. We would expect some means to be higher, and some to be lower. Might we get a sample of 1,000 households with a mean household income of $427,000? Sure, that’s possible—but highly unlikely.
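One way to see this is to simulate a right-skewed population. A lognormal distribution with the parameters below roughly matches the median (about $51,900) and mean (about $70,900) cited above, though real household income follows no such tidy formula:

```r
# Sketch: repeated samples of 1,000 from a skewed "income" population.
set.seed(1)
mu <- log(51900)                       # median of the lognormal
sigma <- sqrt(2 * log(70900 / 51900))  # chosen so the mean is ~70,900
Time <- 1000
sample_mean <- rep(0, Time)
for (i in 1:Time) {
  sample_mean[i] <- mean(rlnorm(1000, meanlog = mu, sdlog = sigma))
}
mean(sample_mean)                # clusters near the population mean, ~70,900
hist(sample_mean, breaks = 20)   # roughly normal, despite the skewed population
```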
Simulation Examples
# Sampling data from a normal population with mean 2 and standard deviation 4
n <- 200
Time <- 1000
sample_mean <- rep(0, Time)
sample_sd <- rep(0, Time)
for (i in 1:Time) {
  samples <- rnorm(n, mean = 2, sd = 4)
  sample_mean[i] <- mean(samples)
  sample_sd[i] <- sd(samples)
}
MEAN_mean <- mean(sample_mean)
MEAN_mean
## [1] 2.015055
MEAN_mean_sd <- sd(sample_mean)
MEAN_mean_sd
## [1] 0.2820093
4/sqrt(n)  # theoretical standard error: population sd divided by sqrt(n)
## [1] 0.2828427
hist(sample_mean, breaks=20, freq=FALSE)
lines(seq(min(sample_mean), max(sample_mean), by=0.01),dnorm(seq(min(sample_mean), max(sample_mean), by=0.01), mean=2, sd=(4/sqrt(n) )), type="l")
MEAN_sd <- mean(sample_sd)
MEAN_sd
## [1] 4.006302
# Sampling data from a discrete uniform population on 1,2,3,4,5,6, each with probability 1/6
n <- 200
Time <- 1000
sample_mean <- rep(0, Time)
sample_sd <- rep(0, Time)
for (i in 1:Time) {
  samples <- rmultinom(n, size = 1, prob = c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6))
  return_samples <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 6) %*% samples  # map draws to face values
  sample_mean[i] <- mean(return_samples)
  sample_sd[i] <- sd(return_samples)
}
MEAN_mean <- mean(sample_mean)
MEAN_mean
## [1] 3.497865
MEAN_mean_sd <- sd(sample_mean)
MEAN_mean_sd
## [1] 0.1182652
sqrt(sum((c(1:6)-3.5)^2)/6/(200))  # theoretical standard error of the sample mean
## [1] 0.1207615
hist(sample_mean, breaks=20, freq=FALSE)
lines(seq(min(sample_mean), max(sample_mean), by=0.01),dnorm(seq(min(sample_mean), max(sample_mean), by=0.01), mean=3.5, sd=(sqrt(sum((c(1:6)-3.5)^2)/6/(200)))), type="l")
sqrt(sum((c(1:6)-3.5)^2)/6)  # theoretical population standard deviation
## [1] 1.707825
MEAN_sd <- mean(sample_sd)
MEAN_sd
## [1] 1.705266
# Sampling data from a chi-square population with 4 degrees of freedom
n <- 200
Time <- 1000
sample_mean <- rep(0, Time)
sample_sd <- rep(0, Time)
# First, look at a single sample from this skewed population
samples <- rchisq(n, 4)
mean(samples)
## [1] 3.964792
sd(samples)
## [1] 2.310256
hist(samples, breaks=20, freq=FALSE)
lines(seq(min(samples), max(samples), by=0.01),dchisq(seq(min(samples), max(samples), by=0.01), 4), type="l")
for (i in 1:Time) {
  samples <- rchisq(n, 4)
  sample_mean[i] <- mean(samples)
  sample_sd[i] <- sd(samples)
}
MEAN_mean <- mean(sample_mean)
MEAN_mean
## [1] 4.00817
MEAN_mean_sd <- sd(sample_mean)
MEAN_mean_sd
## [1] 0.1929887
sqrt(8)/sqrt(n)  # theoretical standard error: a chi-square(4) population has sd sqrt(8)
## [1] 0.2
hist(sample_mean, breaks=20, freq=FALSE)
lines(seq(min(sample_mean), max(sample_mean), by=0.01),dnorm(seq(min(sample_mean), max(sample_mean), by=0.01), mean=4, sd=(sqrt(8)/sqrt(n) )), type="l")
MEAN_sd <- mean(sample_sd)
sqrt(8)
## [1] 2.828427
MEAN_sd
## [1] 2.82563
The simulations above involve two different measures of dispersion: the standard deviation, which measures the spread of the underlying data, and the standard error, which measures the spread of the sample means and equals the standard deviation divided by the square root of the sample size.
The “big picture” here is simple and massively powerful: if you draw large, properly drawn random samples from any population, the means of those samples will be distributed roughly as a normal distribution around the population mean.
That’s pretty much what statistical inference is about. The central limit theorem is what makes most of it possible.