Exploratory Data Analysis

CECS 6800.001 Meets Thursdays from 5:30-8:20 p.m.

Instructor: G. Knezek

Catalog Description: CECS 6800 Exploratory Data Analysis. Students will analyze current research in educational computing as a tool for understanding the unique characteristics of technology-based research activities in educational environments. Special consideration will be given to strategies for separating influences in research designs that incorporate technology as tools and as variables in the design. Students will also identify potential dissertation research topics and prepare preliminary reports that will be critiqued in class in preparation for doing the dissertation.

This class is intended for those who might wish to explore preliminary analysis of data or develop instruments for topics that might someday become the focus of their dissertations. The timetable is specially arranged to allow students to go directly from Simulating Teaching & Learning to Analysis of Research if they wish. Exercises will concentrate on practical data analysis issues using packages such as SPSS.

Deadline for Uploading Final Papers to Moodle: 

Final Presentations from Previous Class

Required Readings:

Salkind, Neil J. Statistics for People Who Think They Hate Statistics
Sage Publications ISBN 978-1-4129-5151-7 (with CD)

Naugher, Jimmie R. Survival Guides to Statistics for the Nonmathematician
July 2010 (from the bookstore)

Campbell & Stanley (1966). Experimental and Quasi-Experimental Designs for Research.

5 Articles (post summary paragraph to Moodle).

Recommended Sources:

International Handbook of Information Technology in Education (Secure Site - Full Text)

International Handbook of Information Technology in Primary and Secondary Education (Publisher Website)

Collis, B., Knezek, G., Lai, K., Miyashita, K., Pelgrum, W., Plomp, T., & Sakamoto, T. (1996). Children and Computers in School. Erlbaum. (ISBN 0805820744 – available on amazon.com)

Computers in the Schools Volume: 24 Issue: 3/4 Year: 2007. (ISSN: 0738-0569. Publisher Site. Article: Effect of Technology-Based Programs on First- and Second-Grade Reading Achievement, Knezek and Christensen)

Optional Readings Book: Bracey, G. W. (2006). Reading Educational Research. Heinemann. ISBN 0-325-00858-2.

SPSS User's Guide (optional, use of SPSS required & help available online)
SPSS Student Package (optional, use of SPSS required & available in lab)

PowerPoint presentations:

Evaluating Educational Research:

http://courseweb.unt.edu/gjones/fall2010/cecs5610/educational_research.html

 

Summary Description of the Course:
This doctoral class provides an opportunity to apply the research designs, statistical methods, and analytical frameworks learned in EDER 6010 and EDER 6020 to real data: your data, grounded in some theory base or set of research questions/perspectives. Students will receive guidance on how to analyze their data for themselves, beginning with discussions of how to formulate questions that can be answered. In addition, classmates will share their datasets and background information with peers, thus giving everyone multiple “real world” examples of data from the field. Your final paper should be the beginning of a publication-quality product.

Grading:

Online participation: 10 pts. (including article summaries)
In-class participation: 10 pts.
Three Assignments: 30 pts. (develop instrument, administer, code/enter data)
Final project: 50 pts. (written paper, including conceptual foundation, analyzing characteristics of measurement scales; suggested improvements)


Contact Information:
Instructor: G. Knezek Voice Mail: 940-565-4195
FAX 940-565-4194
Email: gknezek@gmail.com

Mailing Address:
Technology and Cognition/UNT
P.O. Box 310530
Denton, TX  76203

Course Calendar

Aug. 25th - Face to face

Sept. 1st - Face to face

Sept. 8th - Dr. Knezek in Hawaii

Sept. 15th - Dr. Knezek in D.C.

Sept. 22nd - Face to face

Sept. 29th - Dr. Knezek in Netherlands

Oct. 6th - Dr. Knezek in Netherlands

Oct. 13th - Face to face

Oct. 20th - Face to face

Oct. 27th - Dr. Knezek in Hawaii at Elearn

Nov. 3rd - Face to face

Nov. 10th - Face to face

Nov. 17th - Dr. Knezek in Netherlands

Nov. 24th - Thanksgiving

Dec. 1st - Face to face

Dec. 8th - Face to face and final presentations

Course Outline

I. Exploratory Data Analysis

A. Descriptive statistics (data reduction)

1. Central tendency

a) Mean – Arithmetic average (sum of the scores divided by the number of scores)

b) Median – Middle score, 50th percentile

c) Mode – Most common score
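
The three measures of central tendency can be checked with Python's standard `statistics` module. This is a minimal sketch; the `scores` list is made-up illustration data, not course data.

```python
import statistics

# Hypothetical test scores, invented for illustration only
scores = [60, 70, 80, 80, 90, 100]

mean_score = statistics.mean(scores)      # arithmetic average: sum / count
median_score = statistics.median(scores)  # middle score (50th percentile)
mode_score = statistics.mode(scores)      # most common score

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

For this symmetric set of scores all three measures agree; they drift apart when the distribution is skewed.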

2. Variation – measure of dispersion (how spread out the scores are)

a) Range (lowest to highest)

b) Standard deviation - p. 38

c) Variance (square of the standard deviation)
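
The measures of dispersion can be sketched the same way. The example assumes the sample formulas (dividing by n - 1), which is what `statistics.variance` and `statistics.stdev` compute; the `scores` list is hypothetical.

```python
import statistics

# Hypothetical test scores, invented for illustration only
scores = [60, 70, 80, 80, 90, 100]

score_range = max(scores) - min(scores)  # distance from lowest to highest score
variance = statistics.variance(scores)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)       # sample standard deviation

print("range:", score_range)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
```

Squaring the standard deviation recovers the variance, which is the relationship noted in item c).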

3. Skewness – look at the values to see in which direction the tail of the distribution points.

a) Negative – tail points to smaller values

b) Positive – tail points to larger values

c) Pearson’s coefficient of skewness – p. 61: 3 × (mean – median) / standard deviation
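
Pearson's skewness coefficient compares the mean with the median, scaled by the standard deviation: 3 × (mean - median) / standard deviation. The sketch below assumes the sample standard deviation; the function name `pearson_skewness` and all three data sets are hypothetical.

```python
import statistics

def pearson_skewness(values):
    """Pearson's median skewness: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    return 3 * (mean - median) / sd

# Hypothetical data sets, invented for illustration only
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]        # one very large score
left_tailed = [1, 17, 18, 18, 18, 19, 19, 20]   # one very small score
symmetric = [1, 2, 3, 4, 5]

print("right-tailed:", round(pearson_skewness(right_tailed), 2))
print("left-tailed:", round(pearson_skewness(left_tailed), 2))
print("symmetric:", pearson_skewness(symmetric))
```

A positive result means the tail points to larger values, a negative result means it points to smaller values, and zero means the mean and median coincide.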

4. Kurtosis

a) Platykurtic

b) Leptokurtic

c) Mesokurtic
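
Kurtosis can be quantified as excess kurtosis: the fourth central moment divided by the squared second central moment, minus 3. Platykurtic (flatter) distributions come out negative, leptokurtic (more peaked, heavier-tailed) distributions come out positive, and mesokurtic (normal-like) distributions come out near zero. The sketch below assumes the simple population-moment formula; statistical packages often apply small-sample corrections, so their values can differ slightly. The data are made up.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data, invented for illustration only
flat = list(range(1, 11))    # evenly spread scores: platykurtic
peaked = [5] * 8 + [0, 10]   # most scores at the center, two extremes: leptokurtic

print("flat:", round(excess_kurtosis(flat), 2))
print("peaked:", round(excess_kurtosis(peaked), 2))
```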

B. Inferential Statistics (hypothesis testing/level of significance)

II. Distributions of Data

A. Normal Distribution (assumed for most parametric statistics)

1. Properties of normal distribution (standard normal curve)

a) mean of zero

b) Standard deviation of 1

c) About 68% of the area is within ±1 standard deviation of the mean

d) About 95% of the area is within ±2 standard deviations of the mean

e) About 99.7% of the area is within ±3 standard deviations of the mean

f) Outlier – a score more than ±3 standard deviations from the mean

2. Normal curve is symmetrical about the mean (no skew and mesokurtic)

3. Mean is in the middle and divides the area into halves (50th percentile)

4. Total area under the curve = 1
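
The areas under the standard normal curve can be verified numerically. This sketch assumes Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal curve: mean of zero, standard deviation of 1
standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of the area within +/- k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {area_within(k):.4f}")

# The mean splits the area into two equal halves
print("area below the mean:", standard_normal.cdf(0))
```

The printed proportions come out close to 68%, 95%, and 99.7%.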

B. Properties of other distributions (nonparametric distributions)

1. Can assume any shape or form

2. It is what it is

3. Use nonparametric tests as appropriate, depending on the level of measurement.

4. Often based simply on counting frequencies of occurrence (chi-square test).
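
As an illustration of a test built on counted frequencies, the sketch below computes a chi-square goodness-of-fit statistic by hand. The observed counts are invented, the expected counts assume equal preference across categories, and 5.991 is the tabled critical value for 2 degrees of freedom at the .05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of students choosing each of three software tools
observed = [30, 50, 40]
# Assumption: with no preference, each category is expected to get an equal share
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)

# Tabled chi-square critical value for df = 2 at the .05 level
CRITICAL_VALUE_DF2_05 = 5.991
significant = statistic > CRITICAL_VALUE_DF2_05

print("chi-square:", statistic)
print("significant at .05:", significant)
```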

C. Cumulative data distributions are

1. Also based on counting of occurrences.

2. Normally from smallest to largest.

3. Used to calculate p levels (probability)

4. p levels for tests such as the t-test assume a near-infinite number of cases (N)

5. p levels for ANOVAs (F distributions) assume infinite instances in the population
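
A cumulative distribution can be built by counting occurrences and accumulating them from the smallest value to the largest. The sketch below uses made-up scores and shows how a tail proportion, the idea behind a p level, falls out of the cumulative counts.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical scores, invented for illustration only
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6]

counts = Counter(scores)   # frequency of each distinct score
values = sorted(counts)    # smallest to largest
cumulative = list(accumulate(counts[v] for v in values))
cumulative_proportion = [c / len(scores) for c in cumulative]

for value, count, proportion in zip(values, cumulative, cumulative_proportion):
    print(value, count, round(proportion, 3))

# Proportion of scores at or above the largest value (upper-tail proportion)
upper_tail = 1 - cumulative_proportion[-2]
print("proportion of scores >= 6:", round(upper_tail, 3))
```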

III. Levels of Measurement – Note: level of measurement largely determines which test is appropriate

A. Nominal (Categorical)

B. Ordinal (Rank Order)

C. Interval (Equally spaced units)

D. Ratio (True Zero point)

1. Other Measurement Issues – Reliability (Consistency)

a) Test-retest

b) Internal Consistency (Cronbach’s Alpha)

c) Note: Reliability is the Ceiling for Validity
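
Internal consistency can be sketched with the usual Cronbach's alpha formula: alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical; SPSS reports the same statistic in its reliability analysis.

```python
import statistics

def cronbach_alpha(responses):
    """responses: one row per respondent, one column per scale item."""
    k = len(responses[0])      # number of items
    items = list(zip(*responses))  # transpose rows into item columns
    item_variance_sum = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical Likert responses: 5 respondents, 3 items
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 4, 5],
    [2, 2, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print("Cronbach's alpha:", round(alpha, 3))

# Items that agree perfectly give the maximum alpha of 1
identical_items = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print("perfect agreement:", round(cronbach_alpha(identical_items), 3))
```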

IV. Validity (Relevance, appropriateness)

A. Content (Look See)

B. Construct (Factor Analytic)

C. Criterion

1. Concurrent

2. Predictive

V. Research Designs (Three basic types)

A. Quantitative

B. Qualitative

C. Mixed Methods

VI. Hypothesis Testing

A. p level (probability that the effect would occur by chance)

1. p is less than a chosen threshold such as .05

2. A difference this large would have occurred by chance fewer than 5 times in 100

a) But a difference this large would still occur up to 5 times in 100 purely by chance

B. ‘Type I’ error – when two groups were not different but we concluded that they were

1. Null hypothesis was true but was rejected

C. ‘Type II’ error – when two groups were actually different but we failed to conclude that they were

1. Null hypothesis was false but was not rejected
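
The meaning of the .05 level can be illustrated by simulation. The sketch below repeatedly draws two groups from the same population, so the null hypothesis is true by construction, and counts how often a two-tailed z-test still declares a difference. It assumes a known population standard deviation of 1; the function name, sample size, and number of trials are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable
ALPHA = 0.05
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96 for a two-tailed test

def false_positive_rate(trials=2000, n=30):
    """Share of trials that reject a true null hypothesis (Type I errors)."""
    rejections = 0
    for _ in range(trials):
        # Both groups come from the same population: mean 0, SD 1
        group_a = [random.gauss(0, 1) for _ in range(n)]
        group_b = [random.gauss(0, 1) for _ in range(n)]
        mean_diff = sum(group_a) / n - sum(group_b) / n
        z = mean_diff / (2 / n) ** 0.5  # standard error with known SD of 1
        if abs(z) > critical_z:
            rejections += 1
    return rejections / trials

print("Type I error rate:", false_positive_rate())
```

The observed rate should land near .05, which is the share of "significant" results expected purely by chance.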

D. Effect Size (practical significance)
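
One widely used effect size for comparing two group means is Cohen's d: the difference between the means divided by the pooled standard deviation. Whether the course uses d or another index is an assumption here; the function name `cohens_d` and the scores are hypothetical.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical posttest scores, invented for illustration only
treatment = [82, 85, 88, 90, 95]
control = [78, 80, 85, 87, 90]

d = cohens_d(treatment, control)
print("Cohen's d:", round(d, 2))
```

Unlike a p level, d does not depend on sample size, which is why it is read as practical rather than statistical significance.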

VII. Analysis techniques

A) Association/Prediction

Correlation

Pearson r

Rank order

Kendall's tau

Regression analysis
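
The association and prediction techniques above can be sketched from first principles. The example computes Pearson r, Kendall's tau (the simple tau-a form, which assumes no tied scores), and a least-squares regression line. All function names and data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs / total pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            direction = (x[i] - x[j]) * (y[i] - y[j])
            if direction > 0:
                concordant += 1
            elif direction < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def linear_regression(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: weekly hours of computer use and a rated skill level
hours = [1, 2, 3, 4, 5]
skill = [1, 3, 2, 5, 4]

r = pearson_r(hours, skill)
tau = kendall_tau(hours, skill)
slope, intercept = linear_regression(hours, skill)
print("Pearson r:", round(r, 2))
print("Kendall's tau:", round(tau, 2))
print("regression: skill =", round(intercept, 2), "+", round(slope, 2), "* hours")
```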

B) Comparison of means

C) Other

Citation needed from Amy. She will post in the forum. (I don’t see it.)

Find article about Moments 0-Statistical description of data, a sample page of numerical – Haoli.org Chapter in a book. Galton, Sir Francis, maybe Spearman, Lee Cronbach

Book on Histograms by Knezek.

Jacob Cohen on data type of effect size