MIS 74013 Spring 2013 Booth
Nonparametric and Robust Statistics
MIS 64013/74013
Instructor: Dr. David Booth
Office: A428 BSA
Phone (Office): 330-672-1143
Office Hours: 2:30-4:30 MW
Email: dbooth@kent.edu
Please note that if I am not in my office at these times you will find a note on my door telling you where I am. Please then go to that location to see me. Please feel free to call me or leave a note in my mailbox if you need to contact me.
Textbook: See reference list at the end of the syllabus
Course objectives:
At the end of the course the student will have:
1. Learned methods of nonparametric and robust statistics.
2. Learned to apply these basic techniques to real situations.
These skills will prepare you for advanced work, since you will commonly encounter outliers, non-normal distributions, etc. in your data sets.
Learning Outcomes:
1. For each technique presented, the student must demonstrate the ability to apply the technique.
2. The student must demonstrate the ability to apply an advanced technique to a problem of his or her choice.
3. Measurement: The student must turn in a project report for each technique under #1 and #2 to demonstrate these abilities. The project in #2 should be submitted to a conference and/or journal and presented in class.
Attendance and Make-up Policy:
In general, students are expected to attend class and are responsible for any material discussed and/or assigned. With respect to make-up, the general policy is that no make-up of missed work (including exams) is allowed and no late work will be accepted. The only exceptions are:
1. A prearranged situation (e.g., a course field trip or an athletic trip).
2. Emergency illness, death in the family, etc. In this case the instructor should be notified as soon as possible.
3. In all cases, contact the instructor early.
Performance Evaluation:
There will be an optional comprehensive final examination, open book and open notes, worth 100 points. There will be four data analysis projects, each worth 100 points. The course will end with a term paper (start early) on a nonparametric or robust method of your choice, approved by the instructor. Get approval early. The paper must include a full data analysis of a real data set using either SAS or R. You will write up your results as a paper for journal publication and present these results to the class. You are encouraged to submit this paper or a derivative to a conference and journal for publication. The instructor will provide as much guidance and help as is needed. To date, six of these papers have been published. The paper is worth 500 points.
Academic dishonesty, in all forms, is prohibited. All material handed in is in the public domain. This syllabus is a guide, not an absolute contract. The grading scale is based on the total number of points attempted, which is either 900 (without the optional final examination) or 1,000 (with it). The scale is 90%+ A, 80%+ B, 70%+ C, etc.
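As a rough illustration of the grading arithmetic, the following Python sketch computes a percentage and letter grade from the point values stated above. The function name, the constant names, and the "below C" label are assumptions made for illustration only; the syllabus states just the A, B, and C cutoffs.

```python
# Point values taken from the Performance Evaluation section.
PROJECT_POINTS = 4 * 100   # four data analysis projects
PAPER_POINTS = 500         # term paper
FINAL_POINTS = 100         # optional comprehensive final


def letter_grade(points_earned, took_final):
    """Return (percent, letter) under the stated 90/80/70 cutoffs.

    Hypothetical helper: grades below C are not spelled out in the
    syllabus, so they are reported here simply as "below C".
    """
    attempted = PROJECT_POINTS + PAPER_POINTS   # 900 points
    if took_final:
        attempted += FINAL_POINTS               # 1,000 points
    percent = 100.0 * points_earned / attempted
    if percent >= 90:
        letter = "A"
    elif percent >= 80:
        letter = "B"
    elif percent >= 70:
        letter = "C"
    else:
        letter = "below C"
    return percent, letter


print(letter_grade(810, took_final=False))  # 810 of 900 points
print(letter_grade(810, took_final=True))   # 810 of 1,000 points
```

The same point total can map to different letters depending on whether the optional final is attempted, because the denominator changes from 900 to 1,000.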
Students with Disabilities:
In accordance with university policy, if you have a documented disability and require accommodations to obtain equal access in this course, please contact the instructor at the beginning of the semester or when given an assignment for which an accommodation is required. Students with disabilities must verify their eligibility through the Office of Student Accessibility Services (SAS), 672-3391.
Course Outline:
Topics | Readings | Data Sets
Introduction | 14, 15, 17, 18, 19, 20, 21, 30, 31 |
Robust Regression | 1, 2 (chapt. 10, 11), 16, 34 | 1
Generalized Additive Models | 3 (pp. 29-31), 4, 34 |
Robust ANOVA | 25, 2, 5, 6, 7 (chapt. 7), 13, 22 (chapt. 4, 6), 33 | 27
Robust Partial Discriminant Analysis | 8 |
Robust Logistic Regression | 9, 22 (chapt. 5) | 23 (p. 239), 24
Robust Principal Components Analysis | 10, 28 |
Robust Proportional Hazards (Cox) Regression | 11, 22 (chapt. 7), 32 | 23 (p. 250), 22 (p. 197), 29 (Prostate Cancer)
Robust Methods in Event Studies | 12 |
References: Most can be found on the internet by entering the title into Google.
1. Booth, D.E. Regression Methods and Problem Banks, UMAP Module No. 626 (1986)
2. Wilcox, R.R. Introduction to Robust Estimation and Hypothesis Testing, Second Edition, 2005
3. Hastie, T. and R.J. Tibshirani, Generalized Additive Models, 1990
4. Kimmel, R., D. Booth and S. Booth, The Analysis of Outlying Data Points by Robust Loess, International Journal of Operational Research 7 (1), 1-15 (2010).
5. Notebaart et al. Co-Regulation of Metabolic Genes Is Better Explained by Flux Coupling Than by Network Distance, PLoS Computational Biology 4 (1): e26, doi:10.1371/journal.pcbi.0040026 (2008)
6. R Package Rfit Feb. 14, 2012
7. Terpstra, J.T. and J.W. McKean, Rank Based Analyses of Linear Models using R, April 22, 2004
8. Booth, D.E. and T.L. Isenhour, On Robust Partial Discriminant Analysis As A Decision Making Tool With Clinical and Analytical Chemical Data, Computers and Biomedical Research 19, 1-12 (1986)
9. Hauser, R and D. Booth, Predicting Bankruptcy with Robust Logistic Regression, J. Data Science 9 (4), 585-605 (2011)
10. Booth, D.E. The Analysis of Outlying Data Points Using Robust Regression: A Multivariate Problem Bank Identification Model, Decision Sciences 13, 71-81 1982
11. Farcomeni, A. and S. Viviani, Robust Estimation for the Cox Regression Model Based on Trimming, Biometrical Journal 53, (6), 956-973 (2011)
12. Sorokina, N., D. Booth and J. Thornton, Market Noise Reduction in Event Studies, submitted to J. of Business & Economic Statistics (2012)
13. Gill, P.S., A Robust Mixed Linear Model Analysis for Longitudinal Data, Statistics in Medicine 19, 975-987 (2000)
14. Erceg-Hurn, D and V. Mirosevich, Modern Robust Statistical Methods, American Psychologist 63 (7), 591-601 (2008)
15. Farcomeni, A. and L. Ventura, An overview of robust methods in medical research
16. Bellio, R. and L. Ventura, An Introduction to Robust Estimation with R Functions (2005)
17. Ricci, V. R Functions for Regression Analysis (2005)
18. Logan, M. Biostatistical Design and Analysis Using R (2010) (wiley.com/go/logan)
19. SAS Manual online
20. R Handout by D. Booth
21. Pett, M. Non Parametric Statistics for Health care Research (1997)
22. Heritier, S. et al. Robust Methods in Biostatistics (2009)
23. Dalgaard, P. Introductory Statistics with R 2nd ed.
24. Pregibon, D. Logistic Regression Diagnostics, Annals of Statistics 9 (4), 705-724 (1981)
25. Carroll, R.J. Robust Methods for Factorial Experiments with Outliers
26. Carroll, R.J. Robust Methods for Factorial Experiments with Outliers, Appl. Statist. 29 (3), 246-251 (1980)
27. John, J.A. Outliers in Factorial Experiments Appl. Statist. 27, 111-119 (1978)
28. Rousseeuw, P.J. and K. Van Driessen, A Fast Algorithm for the Minimum Covariance Determinant Estimator, Technometrics 41, 212-223 (1999)
29. Andrews, D.F. and Herzberg, A.M. (1985) Data: A collection of Problems From Many Fields For the Student and Research Worker
30. Kuhnert, P and Bill Venables, An Introduction to R: software for statistical modeling and computing (2005)
31. Andersen, R, R Functions for Modern Regression (2003)
32. R Package ‘Coxrobust’ (2006)
33. R Package ‘Rfit’ (2011)
34. Venables, W. and B. Ripley (2002), Modern Applied Statistics with S 4th ed. Springer-Verlag