Environmental Statistics and Computation

AATM315 (#7196) / AENV315 (#9533), Spring 2018

Credits: 4

Professor: Dr. Oliver Elison Timm (oelisontimm@albany.edu)

Class Location: ES232

Class Schedule: Tue/Thu 11:45am-1:05pm & Fri 9:20am-10:15am

Contact Information:

Office: ES 316A

E-mail: oelisontimm@albany.edu

Phone: 518-442-3584

Office hours: Mon 10:15am-11:15am, Wed 9:45am-10:45am, or by appointment

TA: Kevin Biernat (kbiernat@albany.edu)

Office: ES 234

TA’s office hours: Tue/Thu 10:30am-11:30am

 

 

Course Description

 

General Goals

 

In this course you will work with environmental data and learn to apply basic concepts of statistical data analysis. You will use your computer/laptop and learn to solve statistical problems that are often found in atmospheric and environmental research. For example, you will work with computer programs and data examples that illustrate the following typical statistical problems: How can we summarize large data sets in visual graphs? Why is it important to have a large sample size when applying statistical data analysis methods? How can we detect climatic trends in global temperatures or local rainfall? When is a correlation significant? What is the idea behind statistical hypothesis tests? How do we make inferences from noisy and uncertain observational data? What signals can we detect in time series of atmospheric CO2 concentrations? This course covers standard concepts in probability theory, univariate and multivariate statistical analysis methods, the statistical description and visualization of data, hypothesis testing, and time series analysis.

 

Many research activities and jobs in the private sector operate with large volumes of data. In many cases, the analysis of big data sets can be streamlined by means of a few lines of computer code. In this course you will work with the general-purpose programming language Python and practice using it for your data analysis and visualization. The coding goes hand in hand with the statistical analysis techniques used during the class activities. After an introduction to the essential programming concepts (such as variables, loops, code branching, and functions), we will make use of the packages numpy and pylab, which allow us to handle our data analysis problems in an effective and illustrative programming format.
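To give a first impression of the essential programming concepts named above (variables, loops, code branching, and functions), here is a minimal sketch in plain Python. The temperature values and the function names mean and classify are made up for illustration only; they are not taken from the course material, and the class activities may approach these concepts differently.

```python
# Hypothetical temperature readings in degrees Celsius (made-up values,
# used only to illustrate the programming concepts).
temperatures = [12.5, 14.0, 9.5, 16.0, 13.0]


def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    total = 0.0
    for value in values:  # loop: visit every element once
        total += value
    return total / len(values)


def classify(value, reference):
    """Label a value relative to a reference value (code branching)."""
    if value > reference:
        return "above"
    elif value < reference:
        return "below"
    else:
        return "equal"


# Variables hold the results of the function calls.
average = mean(temperatures)
labels = [classify(t, average) for t in temperatures]

print("mean temperature:", average)
print("relative to the mean:", labels)
```

The same mean could be computed in one line with a package such as numpy; writing the loop by hand simply makes each step visible.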

 

Computer and programming prerequisites

 

This course assumes no prior knowledge of programming! What is important is that you come with an open mindset for learning a new language and the logic of computers! You will learn the basic principles of computer coding in Python (Python is available as free software for all common computer operating systems).

Bring your laptop computer to the classes! We will make use of DAES’s own Jupyter Hub in this class. A common web browser (Mozilla Firefox, Chrome) works just fine for our course. In the first class we will run some test code to make sure that all of you have the hardware and software to participate in class, and that you can work at home with the Jupyter Hub. The course instructors and the DAES IT support team will assist you if you encounter technical problems.

 

 

What students can expect

 

The course adopts the team-based learning (TBL) concept. We use the classroom for practicing the statistical and programming concepts by applying them to generic data sets and real-life data examples, and for practicing working in teams on specific tasks. You are expected to prepare for classes through reading assignments. The reading material will provide you with the background information and knowledge that you need to solve the team-based activities successfully. Note that reading about Python programming is not effective unless you practice it on your own. Learning a programming language works best when you try things out and learn from mistakes!

Some class time will be used to provide additional information (e.g. after unexpected problems occur during the activities or when team discussions bring up interesting statistical results), or to answer questions that come up during the team activities. This is done in a lecture-centered format, including short PPT presentations and use of the whiteboard. At the end of each learning unit, each student prepares, as a homework assignment, a written summary of the activities (with a discussion of the results and the statistical methods). The mid-term and final exams are based on a mix of multiple-choice questions and questions with free-format answers.

 

The TBL procedure:

 

The semester course is divided into several units. Before and during each unit, students are required to study the assigned readings. The readiness assurance process tests each student’s individual preparation for the classroom activities (individual Readiness Assessment Test, iRAT).

The iRATs are immediately followed by team tests (tRATs), in which the teams discuss the same questions. As a team you have to form a consensus answer. The tRATs will be evaluated immediately in class and the team scores will be announced (individual scores will be handled discreetly). The RATs are scheduled at the beginning of each unit (or in the middle of a unit).

 

Team appeals:

  

Teams are given the opportunity to submit a written appeal if they feel their answer was correct or if there were problems with the question. In any case, the written appeal (one-page limit) must be based on sound reasoning supported by references to the reading material or by calculations. Appeals must be submitted to the instructors during or by the end of the class, and the instructors will review and decide on the case either in class or by the beginning of the next class.

Successful appeals will earn the associated points in the team’s tRAT score. Teams that do not appeal will not earn additional points in their tRAT score.

 

Team activities:

 

This class is about statistics and the interpretation of environmental data. The tasks will require a combination of data analysis and discussion of the results as well as of the underlying theoretical statistical principles. Participation in these group activities is a substantial part of the overall learning process, individually and through the discussion and communication with team members. From each unit, one team activity report is used for the team’s grade.

 

Peer-evaluation process:

 

Since the team activity is part of the effective learning process, you will conduct two peer evaluations of your team members. The first is formative and does not count toward the final grade; only the second peer evaluation, at the end of the course, will be considered in the final grade calculation. The peer evaluation assesses attendance, preparedness, and constructive participation in the team work assignments.

 

 

Additional performance evaluations:

 

This course has a mid-term exam and a final exam that each student takes individually. Mid-term and final exam scores will contribute to the final grade.

 

 

Course Policies:

 

 

Attendance:

 

Your team needs you and you need your team! Therefore, attendance in class is crucially important. Attendance will be recorded, but not graded. Missed RATs and team activities cannot be made up. Missed iRATs, tRATs, and team activities are scored as 0 points.

 

 

Safety-valve rule:

 

For each individual student, the lowest iRAT score and the lowest tRAT score will be dropped from the final grade calculations.

 

Grading:

 

 RATs (average of individual and team): 20%

 Homework: 30%

 Midterm: 15%

 Final: 20%

 Team Activities: ~10%

 Peer Evaluation: ~5%

 

Note that about 75% of your grade is determined by your individual performance.

The conversion from percentage points to grades follows the SUNY Albany grade conversion table.
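The grading weights and the safety-valve rule can be combined into a short calculation. The following is only a sketch under stated assumptions, not the official grade calculation: it assumes that the 20% for RATs is split evenly between iRATs and tRATs (which is consistent with the statement that about 75% of the grade is individual), it treats the approximate weights (~10%, ~5%) as exact, and all scores and names (WEIGHTS, drop_lowest_mean, course_percentage) are hypothetical.

```python
# Weights in percent, taken from the grading table. Assumption: the 20% for
# RATs is split evenly between iRAT and tRAT; the "~" weights are treated
# as exact values.
WEIGHTS = {
    "irat": 10.0,
    "trat": 10.0,
    "homework": 30.0,
    "midterm": 15.0,
    "final": 20.0,
    "team_activities": 10.0,
    "peer_evaluation": 5.0,
}
INDIVIDUAL = ("irat", "homework", "midterm", "final")


def drop_lowest_mean(scores):
    """Mean of the scores after dropping the single lowest one (safety valve)."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]
    return sum(kept) / len(kept)


def course_percentage(components):
    """Weighted sum of component scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS) / 100.0


# Share of the grade that depends on individual performance.
individual_share = sum(WEIGHTS[name] for name in INDIVIDUAL)

# Hypothetical scores for one student; the tRAT score of 0 is a missed test.
components = {
    "irat": drop_lowest_mean([60, 80, 90, 100, 70]),
    "trat": drop_lowest_mean([0, 90, 100, 90, 100]),
    "homework": 90.0,
    "midterm": 80.0,
    "final": 85.0,
    "team_activities": 90.0,
    "peer_evaluation": 100.0,
}
total = course_percentage(components)

print("individual share of grade (%):", individual_share)
print("course percentage:", total)
```

In this made-up example the missed tRAT (0 points) is the score that gets dropped, which is exactly the situation the safety-valve rule is meant to cover.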

 

Attachments: Course overview


 

AATM/AENV315 Spring 2018 course overview

Instructor: Oliver Elison Timm

 

The timeline outlined below, with its units and associated dates, is tentative. RATs are scheduled on the first day of each learning unit, with a few exceptions (see below).

 

Learning Units:

(0)  Introduction to the AATM/AENV315 course    01/23/18

a.    Syllabus

b.    Team-based learning approach

c.    Grading

d.    Team formation

e.    First Python examples / testing Jupyter Notebook environment

 

(1)  Introduction to Python    01/25/18-01/26/18
RAT1: 01/25/18

a.    Introduction to Jupyter Notebooks

b.    Variables, values and data types

c.    Working with numbers: integer, float

d.    Working with strings

e.    Logical expressions                 

(2)  Writing Python scripts: Data types and flow control    01/30/18-02/09/18
RAT2: 02/06/18

a.    Tuples, lists and dictionaries

b.    Iterations

c.    First plots with matplotlib (pylab)

d.    Branching with if-then-else statements

e.    Using functions

f.     Defining and using your own functions

g.    Name spaces and scope rules

 

(3)  Python for statistical data analysis    02/13/18-02/23/18

a.    Importing packages

b.    Packages numpy and pylab

c.    Introduction to arrays

d.    Working with data arrays

e.    Plotting data with pylab

f.     Linear algebra: mathematical calculations with numpy arrays

 

 

 

 

 

(4)  Descriptive Statistics    02/27/18-03/09/18
RAT3: 02/27/18

a.    Descriptive statistical analysis
(mean, median, standard deviation,
histograms, box and whisker plots)

b.    Role of sample size – part 1

c.    Randomness, random data, data samples, population

d.    Independence, joint probability, conditional probability

 

MID-TERM EXAM    FRIDAY    03/09/18
(Spring break follows)

 

(5)  Distributions    03/20/18-03/23/18

a.    Binomial Distribution

b.    Uniform Distribution, Normal Distribution

c.    Probability Density Functions

d.    Central Limit Theorem



(6)  Hypothesis testing    03/27/18-04/06/18
RAT4: 04/03/18

a.    Introducing the t-test

b.    Rationale behind hypothesis testing

c.    Significance, confidence level

d.    Role of sample size – part 2

e.    Type 1 and type 2 errors



(7)  Covariance and correlation    04/10/18-04/20/18
RAT5: 04/17/18

a.    Covariance between two random variables

b.    Pearson correlation coefficient

c.    Effects of sample size and outliers on correlation

d.    Linear regression

e.    Trend fitting

f.     Multiple linear regression

 

 

 

 

(8)  Time series    04/24/18-04/27/18

a.    Characteristics of time series:
mean, variance, time stepping

b.    Low/High-pass filters

c.    Seasonal cycle

 

 

(9)  Optional: useful statistical tools and Python packages    05/01/18-05/04/18


One or two topics from:

PCA, spectral analysis, ANOVA,
classification methods, interpolation methods,
spatial mapping, forecast verification.

 

(10)  Review session    05/08/18



Final exam*    MONDAY    05/14/18

10:30am-12:30pm

 

(*www.albany.edu/registrar/registrar_assets/Spring_2018_Final_Examination_Schedule.pdf)


The final exam date and time are fixed and cannot be changed. In the case of potential time conflicts, students are referred to the University’s Undergraduate Academic Regulations (http://www.albany.edu/undergraduate_bulletin/regulations.html).