
Environmental Statistics with R: Simple Linear Regression

Joseph Ofungwu, Ph.D., PE

Course Outline

R is a free software environment for statistical computing and graphics. This course offers participants an irresistible two-in-one value package: a fundamental data analysis methodology, and a freely available, high-powered software system to do all the computations! First, regression is a pivotal statistical technique because it is one of the most practical ways to establish relationships between variables using only limited data samples, and then generalize the results to the larger data population. This course aims to ensure that you will never again be confused by the terminology or statistics associated with simple linear regression. Second, lack of access to affordable, high-quality software has long been an impediment to the widespread use of statistics among environmental professionals. Not anymore! R has brought excitement to statistics. I invite you to become addicted…

This course includes a multiple choice quiz at the end, which is designed to enhance the understanding of the course materials.

Learning Objectives

At the conclusion of this course, the student will be able to:

• Download and install R from the web;
• Compute basic analyses using R commands;
• Operate in R interactively as well as in batch mode;
• Recognize and avoid common R pitfalls;
• Differentiate between the data sample and the data population;
• Comprehend the concept of a conditional distribution;
• Identify the predictor (i.e. X) and response (i.e. Y) variables in a regression;
• Understand the significance of the coefficients;
• Plot the data to assess linearity and identify outliers;
• Compute simple linear regression using the lm function in R;
• Differentiate between the p value and t value;
• Assess overall significance using the F value;
• Determine the significance of the Sums of Squares;
• Define the degrees of freedom for various statistics;
• Quantify uncertainty using confidence intervals;
• Describe the main assumptions of linear regression;
• Evaluate the potential presence and impacts of outliers using regression diagnostics;
• Compute regression with data containing NDs while avoiding arbitrary substitutions for the NDs;
• Understand the mechanics, strengths and limitations of data transformation;
• Understand the strengths and limitations of various types of robust regression; and
• Compare and contrast Ordinary Least Squares and robust regression results.
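As a rough illustration of several of the objectives above (fitting with `lm`, reading the t, p, and F values, and computing confidence intervals), here is a minimal R sketch. It is not taken from the course materials: the data are simulated, and the names `x`, `y`, `fit`, `coefs`, `fstat`, and `ci` are hypothetical choices made for this example.

```r
# Minimal sketch of simple linear regression in R.
# ASSUMPTION: the data below are simulated for illustration only;
# they do not come from the course materials.
set.seed(42)
x <- 1:20                                      # predictor (X)
y <- 2 + 0.5 * x + rnorm(length(x), sd = 1)    # response (Y) with random noise

# Fit Y as a linear function of X by Ordinary Least Squares.
fit <- lm(y ~ x)

# summary() reports the coefficient estimates with their standard errors,
# t values and p values, plus the overall F statistic.
s <- summary(fit)
coefs <- coef(s)        # matrix: Estimate, Std. Error, t value, Pr(>|t|)
fstat <- s$fstatistic   # named vector: value, numdf, dendf

# 95% confidence intervals for the intercept and slope.
ci <- confint(fit, level = 0.95)

# Sums of Squares and degrees of freedom for the regression and residuals.
aov_table <- anova(fit)

print(coefs)
print(fstat)
print(ci)
print(aov_table)
```

With one predictor, the F statistic equals the square of the slope's t value, so the overall F test and the slope's t test lead to the same conclusion. With 20 observations and two estimated coefficients, the residual degrees of freedom are 18.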

Course Content

The course material along with Appendix A can be found in the quiz section of this course. You will be able to access the course content after your purchase.

Course Summary

R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License. It provides a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others. Among engineers and scientists, the use of R is on the rise relative to other popular statistical software such as SAS and SPSS.

Related Links

The R Project for Statistical Computing - R Website
R (programming language) - Wikipedia

Quiz

Once you finish studying the above course content, you need to take a quiz to obtain the PDH credits.

DISCLAIMER: The materials contained in the online course are not intended as a representation or warranty on the part of PDH Center or any other person/organization named herein. The materials are for general information only. They are not a substitute for competent professional advice. Application of this information to a specific project should be reviewed by a registered architect and/or professional engineer/surveyor. Anyone making use of the information set forth herein does so at their own risk and assumes any and all resulting liability arising therefrom.