Let’s Make a Deal (a famous U.S. game show)

 

A player stands facing three big doors: Door no. 1, Door no. 2, and Door no. 3. There is a highly desirable prize behind one of the doors, something like a new car, and a goat behind each of the other two. The player chooses one of the doors and gets whatever is behind it.

As each player stands facing the doors, he or she has a 1 in 3 chance of choosing the door that will be opened to reveal the valuable prize. After the player chooses a door, one of the two remaining doors, one with a goat behind it, is opened. The player is then asked whether he would like to change his mind and switch doors. Remember, two doors are still closed, and the only new information the contestant has received is that a goat showed up behind one of the doors that he didn’t pick.

 

Should he switch?
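One way to explore the question is to simulate the game many times. The following is a minimal sketch, assuming that the opened door is always a goat door among the two the player did not pick, as in the description above. The function names `play` and `win_rate` are illustrative and not part of the original notes.

```python
import random

def play(switch, rng):
    """Play one round; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)    # the prize is placed at random
    pick = rng.choice(doors)   # the player's first choice
    # A goat door that the player did not pick is opened.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the share of rounds won under a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Under these assumptions, staying wins about one time in three and switching wins about two times in three, because switching wins whenever the first pick was a goat.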

 

 

Consider the following hypothetical Internet news flash:

 

“People Who Take Short Breaks at Work Are Far More Likely to Die of Cancer.” Imagine that headline popping up while you are surfing the Web. According to a seemingly impressive study of 36,000 office workers (a huge data set!), those workers who reported leaving their offices to take regular ten-minute breaks during the workday were 41 percent more likely to develop cancer over the next five years than workers who don’t leave their offices during the workday.

  • Clearly we need to act on this kind of finding, perhaps with some kind of national awareness campaign to prevent short breaks on the job?

  • Or maybe we just need to think more clearly about what many workers are doing during that ten-minute break?

  • In fact, many of those workers who report leaving their offices for short breaks are huddled outside the entrance of the building smoking cigarettes (creating a haze of smoke through which the rest of us have to walk in order to get in or out). Hence it’s probably the cigarettes, and not the short breaks from work, that are causing the cancer.
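The reasoning in the bullets above can be illustrated with a small calculation. This is only a sketch: every number in it is hypothetical and chosen for illustration, not taken from any study. Cancer risk is made to depend on smoking alone, yet workers who take breaks still show a higher cancer rate, simply because more of them smoke.

```python
# Hypothetical numbers, chosen only for illustration.
# Cancer risk depends on smoking status and on nothing else.
CANCER_RATE = {"smoker": 0.04, "nonsmoker": 0.02}
# Assumed share of smokers among workers who do and do not take breaks.
SMOKER_SHARE = {"break": 0.50, "no_break": 0.10}

def group_rate(group):
    """Overall cancer rate for a group, mixing smokers and nonsmokers."""
    s = SMOKER_SHARE[group]
    return s * CANCER_RATE["smoker"] + (1 - s) * CANCER_RATE["nonsmoker"]

ratio = group_rate("break") / group_rate("no_break")
print(f"break takers:  {group_rate('break'):.3f}")
print(f"no breaks:     {group_rate('no_break'):.3f}")
print(f"relative risk: {ratio:.2f}")
```

Here the breaks have no effect at all by construction, so any gap between the two groups comes entirely from the difference in smoking.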

 

1. What’s the point?

 

Performance of a Quarterback in U.S. Football

A quarterback’s passer rating is a nice tool for making a quick comparison between the performances of two quarterbacks on a given day.

 

Gini Index

 

What is the point? The point is that statistics helps us process data, which is really just a fancy name for information. Sometimes the data are trivial in the grand scheme of things, as with sports statistics. Sometimes they offer insight into the nature of human existence, as with the Gini index.

 

Consider the following disparate questions:

The world is producing more and more data, ever faster and faster. Statistics is the most powerful tool we have for using information to some meaningful end, whether that is identifying underrated baseball players or paying teachers more fairly.

 

Description and Comparison

 

  • A bowling score is a descriptive statistic. So is a batting average. Most American sports fans over the age of five are already conversant in the field of descriptive statistics. We use numbers, in sports and everywhere else in life, to summarize information.

  • Of course, baseball fans have also come to recognize that descriptive statistics other than batting average may better encapsulate a player’s value on the field.

  • We evaluate the academic performance of high school and college students by means of a grade point average, or GPA. By graduation, when high school students are applying to college and college students are looking for jobs, the grade point average is a handy tool for assessing their academic potential.

  • But it’s not perfect. The GPA does not reflect the difficulty of the courses that different students may have taken. How can we compare a student with a 3.4 GPA in classes that appear to be relatively nonchallenging and a student with a 2.9 GPA who has taken calculus, physics, and other tough subjects?

Descriptive statistics exist to simplify, which always implies some loss of nuance or detail. Anyone working with numbers needs to recognize as much.

 

Inference

One key function of statistics is to use the data we have to make informed conjectures about larger questions for which we do not have full information. In short, we can use data from the “known world” to make informed inferences about the “unknown world.”

 

Political Poll

  • A political poll is one form of sampling. A research organization will attempt to contact a sample of households that are broadly representative of the larger population and ask them their views about a particular issue or candidate. This is obviously much cheaper and faster than trying to contact every household in an entire state or country.

* A methodologically sound poll of 1,000 households will produce roughly the same results as a poll that attempted to contact every household in America.
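The claim about a sample of 1,000 can be made a little more concrete with the standard textbook margin-of-error formula. The sketch below rests on assumptions that are not stated in the notes: a simple random sample, a 95 percent confidence level (z = 1.96), and the worst case of a 50/50 split in opinion. The function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    n: sample size
    p: assumed true proportion (0.5 gives the widest margin)
    z: critical value (1.96 corresponds to about 95% confidence)
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 1_000, 10_000):
    print(n, round(margin_of_error(n), 4))
```

Under these assumptions a sample of 1,000 gives a margin of roughly 3 percentage points. Because the margin shrinks with the square root of the sample size, contacting ten times as many households only narrows it to about 1 point.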

 

Identifying Important Relationships

 

Does smoking cigarettes cause cancer?

  • We have an answer for that question—but the process of answering it was not nearly as straightforward as one might think.

  • The scientific method dictates that if we are testing a scientific hypothesis, we should conduct a controlled experiment in which the variable of interest (e.g., smoking) is the only thing that differs between the experimental group and the control group.

  • If we observe a marked difference in some outcome between the two groups (e.g., lung cancer), we can safely infer that the variable of interest is what caused that outcome.

  • We cannot do that kind of experiment on humans. If our working hypothesis is that smoking causes cancer, it would be unethical to assign recent college graduates to two groups, smokers and nonsmokers, and then see who has cancer at the twentieth reunion.

  • Now, you might point out that we do not need to conduct an ethically dubious experiment to observe the effects of smoking. Couldn’t we just skip the whole fancy methodology and compare cancer rates at the twentieth reunion between those who have smoked since graduation and those who have not?

  • No. Smokers and nonsmokers are likely to be different in ways other than their smoking behavior.

We cannot treat humans like laboratory rats. As a result, statistics is a lot like good detective work. The data yield clues and patterns that can ultimately lead to meaningful conclusions.

 

Lies, Damned Lies, and Statistics

 

  • Even in the best of circumstances, statistical analysis rarely unveils “the truth.” We are usually building a circumstantial case based on imperfect data. As a result, there are numerous reasons that intellectually honest individuals may disagree about statistical results or their implications. At the most basic level, we may disagree on the question that is being answered.

  • There are limits on the data we can gather and the kinds of experiments we can perform.

  • We conduct statistical analysis using the best data and methodologies and resources available. The approach is not like addition or long division, in which the correct technique yields the “right” answer and a computer is always more precise and less fallible than a human.

  • Statistical analysis is more like good detective work (hence the commercial potential of CSI: Regression Analysis). Smart and honest people will often disagree about what the data are trying to tell us.

  • But who says that everyone using statistics is smart or honest? The reality is that you can lie with statistics. Or you can make inadvertent errors. In either case, the mathematical precision attached to statistical analysis can dress up some serious nonsense.

 

What is the point of learning statistics?

 

  • To summarize huge quantities of data.

  • To make better decisions.

  • To answer important social questions.

  • To recognize patterns that can refine how we do everything.

  • To evaluate the effectiveness of policies, programs, drugs, medical procedures, and other innovations.

 

2. Descriptive Statistics

 

Let us ponder for a moment two seemingly unrelated questions:

 

What the two questions have in common is that they can be used to illustrate the strengths and limitations of descriptive statistics, which are the numbers and calculations we use to summarize raw data.

 

 

3. Deceptive Description and other true but grossly misleading statements

 

To anyone who has ever contemplated dating, the phrase “he’s got a great personality” usually sets off alarm bells, not because the description is necessarily wrong, but for what it may not reveal, such as the fact that the guy has a prison record or that his divorce is “not entirely final.”

Example: Many of the Wall Street risk management models prior to the 2008 financial crisis were quite precise. The concept of value at risk allowed firms to quantify with precision the amount of the firm’s capital that could be lost under different scenarios. The problem was with the supersophisticated models themselves. The math was complex and arcane. The answers it produced were reassuringly precise. But the assumptions about what might happen to global markets that were embedded in the models were just plain wrong, making the conclusions wholly inaccurate in ways that destabilized not only Wall Street but the entire global economy.

 

Example: Hollywood studios may be the most egregiously oblivious to the distortions caused by inflation when comparing figures at different points in time—and deliberately so. What were the top five highest-grossing films (domestic) of all time as of 2011?

The most accurate way to compare commercial success over time would be to adjust ticket receipts for inflation. Earning $100 million in 1939 is a lot more impressive than earning $500 million in 2011. So what are the top-grossing films in the U.S. of all time, adjusted for inflation?

In real terms, Avatar falls to number 14; Shrek 2 falls all the way to 31st.
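The adjustment itself is simple arithmetic: scale the old figure by the ratio of a price index now to the same index then. The sketch below uses placeholder index values that are hypothetical and are not real CPI or ticket-price data. The function name `adjust_for_inflation` is likewise illustrative.

```python
def adjust_for_inflation(nominal, index_then, index_now):
    """Express a past nominal amount in 'now' dollars using a price index."""
    return nominal * index_now / index_then

# Hypothetical index values, NOT real CPI or ticket-price data.
INDEX_1939 = 10.0
INDEX_2011 = 150.0

old_film = adjust_for_inflation(100e6, INDEX_1939, INDEX_2011)
new_film = adjust_for_inflation(500e6, INDEX_2011, INDEX_2011)

print(f"1939 film in 2011 dollars: ${old_film:,.0f}")
print(f"2011 film in 2011 dollars: ${new_film:,.0f}")
```

With these made-up index values, the $100 million earned in 1939 is worth more in 2011 dollars than the $500 million earned in 2011. Real rankings depend on the actual index chosen.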

 

Example: In a similar vein, your kindhearted boss might point out that as a matter of fairness, every employee will be getting the same raise this year, 10 percent. What a magnanimous gesture—except that if your boss makes $1 million and you make $50,000, his raise will be $100,000 and yours will be $5,000. The statement “everyone will get the same 10 percent raise this year” just sounds so much better than “my raise will be twenty times bigger than yours.” Both are true in this case.
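The arithmetic behind the two statements is easy to check. This short sketch uses the salaries from the example above; the helper name `raise_amount` is illustrative.

```python
def raise_amount(salary, pct):
    """Dollar value of a percentage raise."""
    return salary * pct / 100

boss = raise_amount(1_000_000, 10)   # the boss's raise in dollars
you = raise_amount(50_000, 10)       # your raise in dollars
ratio = boss / you                   # how many times bigger the boss's raise is

print(boss, you, ratio)
```

The same 10 percent describes both raises, while the dollar amounts differ by a factor of twenty. Which number gets reported is a choice of description, not of math.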

 

Example: If school administrators are evaluated—and perhaps even compensated—on the basis of the high school graduation rate for students in a particular school district, they will focus their efforts on boosting the number of students who graduate. Of course, they may also devote some effort to improving the graduation rate, which is not necessarily the same thing. For example, students who leave school before graduation can be classified as “moving away” rather than dropping out.

This is not merely a hypothetical example; it is a charge that was leveled against former secretary of education Rod Paige during his tenure as the Houston school superintendent. Paige was hired by President George W. Bush to be U.S. secretary of education because of his remarkable success in Houston in reducing the dropout rate and boosting test scores.

Example: Cardiologists obviously care about their “scorecard.” However, the easiest way for a surgeon to improve his mortality rate is not by killing fewer people; presumably most doctors are already trying very hard to keep their patients alive. The easiest way for a doctor to improve his mortality rate is by refusing to operate on the sickest patients.

According to a survey conducted by the School of Medicine and Dentistry at the University of Rochester, the scorecard, which ostensibly serves patients, can also work to their detriment: 83 percent of the cardiologists surveyed said that, because of the public mortality statistics, some patients who might benefit from angioplasty might not receive the procedure; 79 percent of the doctors said that some of their personal medical decisions had been influenced by the knowledge that mortality data are collected and made public. The sad paradox of this seemingly helpful descriptive statistic is that cardiologists responded rationally by withholding care from the patients who needed it most.

Example: Rankings of Universities

 

Statistical malfeasance has very little to do with bad math. If anything, impressive calculations can obscure nefarious motives. The fact that you’ve calculated the mean correctly will not alter the fact that the median is a more accurate indicator. Judgment and integrity turn out to be surprisingly important. A detailed knowledge of statistics does not deter wrongdoing any more than a detailed knowledge of the law averts criminal behavior. With both statistics and crime, the bad guys often know exactly what they’re doing!
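The remark about the mean and the median can be seen in a tiny example. The incomes below are hypothetical and chosen only to show how one very large value pulls the mean away from what is typical.

```python
from statistics import mean, median

# Hypothetical annual incomes; one very large value skews the mean.
incomes = [30_000, 35_000, 40_000, 45_000, 1_000_000]

avg = mean(incomes)     # pulled upward by the single large income
mid = median(incomes)   # the middle value, unaffected by the outlier

print("mean:  ", avg)
print("median:", mid)
```

Both numbers are calculated correctly, but only the median is close to what a typical person in this group earns.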