Lies, Damned Lies, and Statistics

Frederic G. Snider, RPG and Michelle B. Snider, PhD

Course Outline

The material presented in this course provides an overview of the basics of statistical analysis. Terminology is defined and examples are provided for single-valued statistics such as mean and standard deviation, as well as for multi-valued statistics such as correlation and regression. Basic steps are described and explained by example so that the student can understand how different statistical techniques can be applied.

• Introduction
• Chapter 1 – When We Have All the Information
• Single-Value Statistics
• The Box Plot
• Histograms
• Common Shapes of Histograms – Distributions
• Standard Deviation
• Gaussian Distributions (The Bell Curve)
• Problems with Single-Value Statistics
• Multi-Valued Statistics
• Two Data Sets: Correlation
• Correlation Versus Causation
• More Than Two Data Sets: Multiple Regression
• Chapter 2 – When We Don’t Have All the Information
• Statistical Inference
• Sampling Biases
• Addressing Liars in Surveys
• Chapter 3 – Interesting and Amazing Aspects of Statistics
• Lurking Variables – Science Fair Nemesis
• Regression to the Mean
• Benford’s Law and Fraud Detection
• The Numbers Never Lie, but They Can Mislead!
• The Future of Statistics
• Conclusion

The course includes a multiple-choice quiz at the end, which is designed to enhance the understanding of the course materials.

Learning Objectives

The learning objectives for this course are to:

• Become familiar with the terminology and language of basic statistics;
• Understand the basic concepts in statistical analysis;
• Learn the concept of distributions and the most common types;
• Develop a sense of the biases that can be introduced by statistics;
• Understand the implications of basing critical decisions on limited statistics;
• Address the trade-offs that must be made when not all the information is available;
• Learn new concepts such as Lurking Variables and Simpson’s Paradox;
• Explore how data can be used to mislead or confound an issue; and
• Form a good basis for more advanced study if desired or necessary.

Intended Audience

This course is intended for all students, engineers, scientists, architects, engineering managers, project managers, and corporate leaders who are interested in the basics of statistics and, most importantly, how statistics can be used to manipulate information to be purposely misleading.

Benefit to Attendees

Engineers, architects, and professionals in related fields can benefit from a better understanding of the principles and methodologies of statistical analysis, and will ideally become more skeptical of conclusions drawn from purely statistical analysis.

Course Introduction

The title of this course, “Lies, Damned Lies, and Statistics,” comes from a famous quote popularized by Mark Twain (who himself attributed it to Benjamin Disraeli): "There are three kinds of lies: lies, damned lies and statistics." The quote refers to how statistics are commonly misapplied, often on purpose, to support a position or push an agenda. Statistics at its mathematical roots, however, has no such nefarious underpinnings. It is rather a way for us to grasp and communicate patterns and relationships within large sets of data without having to struggle with the data sets themselves.

Statistics is the study of data. A large collection of information by itself is difficult for our brains to process, often leading to conclusions that are at best meaningless and at worst misleading. Statistics is a mode of reasoning, a way of “mathematizing” data into a concise picture. It allows us to put information into context and gives us a way to discern its global behavior. Statistics are used in almost all fields of human endeavor, including:

• Politics: polls on opinions, how people will vote
• Education: assessment of course work, teacher and student performance
• Sociology: how happiness relates to wealth, how much TV people watch
• Sports: records, streaks, betting
• Art: how much things sell for, how much actors get paid, how popular a film is
• Medicine: how to decide whether or not to take a drug
• Science: how to interpret results of experiments

However, when doing statistical analyses, or looking at others' analyses, it is critical that we don’t check our brains at the door. The application of statistical methods and the interpretation of results must involve both mathematics and logic in order for the conclusions to be correct and the implications properly interpreted. We need to be clear about what a statistical statement does or does not imply. If we only consider formulas, we are likely to get incorrect or misleading results.
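As a small illustration of why a formula alone can mislead, the sketch below computes the Pearson correlation coefficient for a made-up data set in which y is completely determined by x (y = x²). The data values and the helper name `pearson_r` are assumptions chosen for this example, not part of the course material.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: y depends entirely on x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

The coefficient comes out as zero even though y is perfectly predictable from x. The formula measures only linear association, so reading "r = 0" as "no relationship" would be a logical error rather than a mathematical one.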

Statistics is more subtle than people may realize, so the underlying logic must be included. The choice of statistical summary, the method of data collection and/or sampling, bias in experimental construction, the arbitrary choices of thresholds for outliers or for invalidation of a hypothesis, all play a role in a result, and need to be considered when one is assessing the value and accuracy of that result.
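To make the point about choices concrete, here is a minimal sketch showing how the choice of summary (mean versus median) and the choice of outlier threshold each change the story told by the same numbers. The salary figures are invented for illustration, the function name `iqr_outliers` is hypothetical, and the fence multipliers 1.5 and 3.0 are common conventions rather than rules from the course.

```python
from statistics import mean, median, quantiles

# Hypothetical annual salaries, in thousands of dollars.
salaries = [42, 45, 47, 50, 52, 55, 58, 60, 65, 100, 400]

def iqr_outliers(data, k):
    """Return values lying more than k interquartile ranges outside the quartiles."""
    q1, _, q3 = quantiles(data, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

print("mean:  ", round(mean(salaries), 1))
print("median:", median(salaries))

# The same data, "cleaned" under two different outlier thresholds.
for k in (1.5, 3.0):
    flagged = iqr_outliers(salaries, k)
    kept = [x for x in salaries if x not in flagged]
    print(f"k={k}: flagged {flagged}, mean of rest {round(mean(kept), 1)}")
```

In this made-up data set, the mean is pulled far above the median by one large value, and the stricter threshold (k = 1.5) flags two values as outliers while the looser one (k = 3.0) flags only one. Neither choice is wrong in itself, but a reader needs to know which was made before trusting the reported "average."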

Course Content

In this lesson, you are required to download and study the following course content:


Course Summary

The material presented in this course provides an overview of the basic concepts in statistics. The course addresses single-valued statistics such as mean and standard deviation, as well as multi-valued statistics such as correlation and regression, including the distinction between correlation and causation. Key terms commonly used in statistical analysis are defined and examples are given. Some of the limitations of statistics, and ways that statistics can be used to mislead, confuse, or obscure the truth, are explored. Finally, some interesting, little-known, and counter-intuitive aspects of statistics are discussed with examples.

Quiz

Once you finish studying the above course content, you need to take a quiz to obtain the PDH credits.

DISCLAIMER: The materials contained in the online course are not intended as a representation or warranty on the part of PDH Center or any other person/organization named herein. The materials are for general information only. They are not a substitute for competent professional advice. Application of this information to a specific project should be reviewed by a registered architect and/or professional engineer/surveyor. Anyone making use of the information set forth herein does so at their own risk and assumes any and all resulting liability arising therefrom.