Visualizing Statistical Concepts

Probability
       Random Patterns
       Birthday Problem
       Let's Make a Deal

Univariate Distributions
       Examples of Histograms
       Histogram Bin Size
       Cumulative Distributions
       Boxplots
       Mean and Median
       Normal Distribution Shapes
       Normal Distribution Probabilities
       Normal Approximation to Real Data

Bivariate Relationships
       Guessing the Size of a Correlation
       Correlation and Regression

Sampling and Sampling Distributions
       Randomization
       Sampling Distribution
       Law of Large Numbers
       Central Limit Theorem
       Normal Approximation to the Binomial
       The Standard Deviation is Biased (but not the Variance)

Statistical Inference for Means
       Difference Between t and Z
       Confidence Interval for a Mean
       Hypothesis Test for a Mean
       P-Value Calculation for a t-Test
       Randomization Tests
       Power
       Effect Size
       Relative Precision of the Mean and Median
       Analysis of Variance

Statistical Inference for Categorical Data
       Binomial Distribution
       Confidence Interval for a Proportion
       Hypothesis Test for a Proportion
       Power
       Margin of Error and Sample Size Calculations for a Finite Population
       Unimportance of Population Size
       Chi-Square

Power and Sample Size Calculations
 
 


Probability

Random Patterns:
This applet can be used to illustrate the patterns that appear in random processes.
http://users.ece.gatech.edu/~gtz/ee3340/java/random/
(Seems to work a little better in Microsoft Internet Explorer.  Choose the "uniform" distribution option.  Random patterns are nicely produced with about N=300.)
 
Birthday Problem:
These applets use simulations to assess the probability that two (or more) out of N people have the same birthday.
http://www-stat.stanford.edu/~susan/surprise/Birthday.html
(You can vary N, and you can run the simulation in either a fast or slow mode.  This applet has a very attractive display.)

www.math.uah.edu/stat/java/BirthdayExperiment.html
(You can vary N.  You can also run the simulation either one trial at a time or many repetitions at a time, and the results accumulate in either case.)
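
If the applets are unavailable, the same idea can be sketched in a few lines of code.  The following is only an illustrative sketch, not the method used by either applet: it assumes 365 equally likely birthdays, and the function names are made up for this example.  It compares the exact probability of a shared birthday with a simulated estimate.

```python
import random


def birthday_exact(n, days=365):
    """Exact probability that at least two of n people share a birthday,
    assuming all days are equally likely (an assumption of this sketch)."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def birthday_simulated(n, trials=20_000, days=365, seed=0):
    """Estimate the same probability by drawing random birthdays."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:  # a repeated day means a match
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (10, 23, 40):
        print(n, round(birthday_exact(n), 3), round(birthday_simulated(n), 3))
```

With N = 23 the exact probability is just over one half, which is the result that usually surprises students.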

 
Let's Make a Deal:
These web sites and applets use simulation to assess the probabilities of winning in the "Let's Make a Deal" game, which is also called the "Monty Hall" game.
http://cartalk.cars.com/About/Monty/
(An entertaining introduction to the puzzle.  Lets you play one game at a time and also presents a running total of the results from everyone who has ever played the game at this site.  To get to the running total, first click on "go play," and then click on "Statistics.")

www.intergalact.com/threedoor/threedoor.html
(Provides an introduction, plays one game at a time, and keeps a running total of your results; all with nice visuals.)

www.math.uah.edu/stat/java/MontyGame.html
(Plays one game at a time and keeps a running total of your results.)

www.math.uah.edu/stat/java/MontyExperiment.html
(Lets you run many repetitions of the game at a time and lets you vary the probability of switching doors to see which probability maximizes your chances of winning.  If you use this applet, you may first want to run the applet immediately above which better explicates how the game works.)

www.stat.sc.edu/~west/javahtml/LetsMakeaDeal.html
(Plays one game at a time and keeps a running total of your results.)
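
The game can also be simulated directly.  This is a minimal sketch under the usual assumptions (three doors, the host always opens a non-prize door that you did not pick); the function name and the `p_switch` parameter are hypothetical and are not taken from any of the applets above.  Varying `p_switch` mirrors the idea of varying the probability of switching doors.

```python
import random


def monty_hall_win_rate(p_switch, trials=20_000, seed=0):
    """Fraction of games won when the player switches with probability p_switch."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        pick = rng.randrange(3)
        # The host opens a door that is neither the player's pick nor the prize.
        openable = [d for d in range(3) if d != pick and d != prize]
        opened = rng.choice(openable)
        if rng.random() < p_switch:
            # Switch to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / trials


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        print(p, round(monty_hall_win_rate(p), 3))
```

Under these assumptions the win rate rises from about 1/3 (never switch) to about 2/3 (always switch).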

Univariate Distributions

Examples of Histograms:
These web sites present histograms for a variety of data sets and variables.
http://stat-www.berkeley.edu/users/stark/Java/HistHiLite.htm
(Seems to work best in Microsoft Internet Explorer.  You can also change the interval width -- i.e., the bin size.)

http://research.ed.asu.edu/siip/data.gal/
(Scroll down and click on any of the options under the heading "Graphics by dataset.")
 

Histogram Bin Size:
These applets adjust the shape of the histogram as you change the interval width (i.e., the bin size).
www.stat.sc.edu/~west/javahtml/Histogram.html
(A histogram of the times between eruptions of Old Faithful.)
www.psychstat.smsu.edu/introbook/exercises/groupedfrequency.htm
(Requires Microsoft Internet Explorer.  Displays histograms for a variety of what appear to be "hypothetical" data sets.)


Cumulative Distributions:
These applets illustrate cumulative frequency histograms.

www.psychstat.smsu.edu/introbook/exercises/groupedfrequency.htm
(Requires Microsoft Internet Explorer.  Displays cumulative distributions for a variety of what appear to be "hypothetical" data sets.)
www.stat.ucla.edu/textbook/demos/distributions.phtml
(Requires Xlisp-Stat which can be downloaded for free via anonymous ftp [instructions] [file=densdemo].  This applet displays the density curves and cumulative distribution curves for a variety of distributions including the normal.  Very Useful.)


Boxplots:
This applet presents a variety of histograms and boxplots side by side.

http://research.ed.asu.edu/cgi-bin/sidebyside.pl


Mean and Median:
These applets allow you to manipulate the scores in a histogram and view the resulting effects on the mean and median.

http://surfstat.newcastle.edu.au/surfstat/main/1-2-2.html
(Scroll down to the box entitled "Applet Centres.")

www.ruf.rice.edu/~lane/stat_sim/descriptive/
(Click the "Begin" button.)

http://stat-www.berkeley.edu/users/stark/Java/HistHiLite.htm
(Seems to work best in Microsoft Internet Explorer.  Lets you present histograms for a variety of data sets and lets you highlight a section of the histogram.  By highlighting either the upper or lower half of a histogram you can get a visual image of the median, and compare it to the value of the mean.)

www.stat.ucla.edu/textbook/demos/center.phtml
(Requires Xlisp-Stat which can be downloaded for free via anonymous ftp [instructions] [file=meanmed].  In the applet, you can click and drag any of the data points.)
 

Normal Distribution Shapes:
These applets illustrate the variety of shapes of a normal distribution.
http://stat-www.Berkeley.edu/users/stark/Java/StandardNormal.htm
(Seems to work better using the "scroll bars" rather than the "arrow keys.")

www.stat.ucla.edu/textbook/demos/distributions.phtml
(Requires Xlisp-Stat which can be downloaded for free via anonymous ftp [instructions] [file=densdemo].)
 

Normal Distribution Probabilities:
These applets allow you to calculate the areas under various portions of standard and nonstandard normal distributions.
http://u2.newcastle.edu.au/surfstat/main/tables.html
(Calculates Z scores given areas or calculates areas given Z scores.  You can also choose other distributions besides the normal distribution.)

http://www-stat.stanford.edu/~naras/jsm/FindProbability.html
(Calculates areas given Z scores and Z scores given percentiles.)

http://psych.colorado.edu/~mcclella/java/zcalc.html
(Four options:
1. Calculates areas given Z scores
2. Calculates Z scores given percentiles or percentiles given Z scores
3. Calculates probabilities given a score, mean, and standard deviation
4. Calculates the area between a Z score and the mean in four different ways.)
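
The same kinds of calculations can be sketched with Python's standard library.  This is only an illustration, not the code behind any of the calculators above; the helper names and the example numbers (a score of 115 with mean 100 and standard deviation 15) are assumptions made for the example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(0.0, 1.0)


def area_below_z(z):
    """Area under the standard normal curve to the left of z."""
    return STANDARD_NORMAL.cdf(z)


def z_for_percentile(percentile):
    """Z score below which the given percentage (0-100, exclusive) of the area lies."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


def area_below_score(score, mean, sd):
    """Area below a raw score in a normal distribution with the given mean and sd."""
    return NormalDist(mean, sd).cdf(score)


def area_between_z_and_mean(z):
    """Area between a Z score and the mean (Z = 0)."""
    return abs(STANDARD_NORMAL.cdf(z) - 0.5)


if __name__ == "__main__":
    print(round(area_below_z(1.96), 4))
    print(round(z_for_percentile(97.5), 3))
    print(round(area_below_score(115, 100, 15), 4))
    print(round(area_between_z_and_mean(-1.0), 4))
```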
 

Normal Approximation to Real Data:
This applet shows how well real data (100 measurements of the acceleration of gravity) approximate a normal distribution.
http://stat-www.berkeley.edu/users/stark/Java/NormApprox.htm
(You can highlight different portions of the histogram to assess the accuracy of the approximation.)
   

Bivariate Relationships

Guessing the Size of a Correlation:
These applets present scatterplots and ask you to guess the size of the correlation.
www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/
(You are presented with scatterplots of data at random and asked, in multiple choice format, to guess the size of the correlation.  You can learn the correct answer with a click of a button.  You can also try to position the regression line by hand given the MSE for each position, and you can do so with or without knowing the minimum MSE.  With a click of a button, you can then learn the correct positioning of the regression line.  Very Useful.)

www.psychstat.smsu.edu/introbook/exercises/scattertest.htm
(Requires Microsoft Internet Explorer.   Generates scatterplots at random, gives you multiple choice options for the size of the correlation, and then tells you the correct answer.)
 

Correlation and Regression:
These applets present scatterplots that you can use to illustrate a variety of concepts including: how the appearance of a scatterplot differs for data with different degrees of correlation, the positioning of the regression line within a scatterplot, and how both the regression line and the correlation are altered when data points are added or altered.
www.stat.uiuc.edu/~stat100/java/guess/PPApplet.html
(Starts with a blank scatterplot to which you can add, and erase, individual data points by clicking the mouse.  Alternatively, you can create a scatterplot for randomly generated sets of data, for any size N and for any (target) correlation value.  The applet calculates both the correlation and the coefficients of the regression line, and lets you plot the regression line and residuals. Very useful.)

http://stat-www.berkeley.edu/users/stark/Java/ScatterPlot.htm
(Lets you draw scatterplots for a variety of sets of real data -- or import your own data.  The applet calculates the correlation and lets you plot the regression line, a graph of averages, and the residuals, among other options.  You can also add data points and toggle between the results with and without these additional points.  Very useful.)

http://stat-www.berkeley.edu/users/stark/Java/Correlation.htm
(You can vary both the correlation and N with scroll bars and a scatterplot is automatically updated -- in this regard, note that the scroll bars seem to work better than the arrow keys.  The applet lets you plot the regression line, a graph of averages, and the residuals.  You can also add data points and toggle between the results with and without these additional points. Very useful.)

www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/
(You are presented with scatterplots of data at random.  You can try to position the regression line by hand given the MSE for each position, and you can do so with or without knowing the minimum MSE.  With a click of a button, you can then learn the correct positioning of the regression line. Very Useful.)

www.stat.ucla.edu/textbook/demos/correlation.phtml
(Requires Xlisp-Stat which can be downloaded for free via anonymous ftp [instructions] [file=showcorr].  Presents scatterplots with regression lines for any degree of correlation and for any size N.)

www.math.csusb.edu/faculty/stanton/m262/regress/regress.html
(Starts with a blank scatterplot to which you can add individual data points.  The applet calculates the regression coefficients, plots the residuals and gives you the option to plot the regression line with or without the data.)
 
 


Sampling and Sampling Distributions

Randomization:
With these applets, you can select random samples from a population or make random assignments to treatment conditions.
www.assumption.edu/html/academic/users/avadum/applets/applets.html
(Three options:
1. Random assignment to treatment conditions
2. Random assignment to treatment conditions within blocks
3. Simple random sampling)

www.randomizer.org/
(Simple random sampling)

www.stat.ucla.edu/calculators/perm.phtml
(Random permutations of N digits.)

 
Sampling Distribution:
These applets can be used to illustrate the behavior of the sampling distribution of the mean.
  http://acad.cgu.edu/wise/sdmmod/sdm.html
(This applet allows you to draw samples either one at a time, or 100 at a time.  It then plots the scores from a single sample or from all of the sample means.  You can add overlays of the population density and of the sampling distribution density.  You can also vary N and the shape of the population density. Very well crafted and informative.)

www.ruf.rice.edu/~lane/stat_sim/sampling_dist/
(On separate axes, one over the other, this applet plots [a] the population density (which can take different forms), [b] a histogram of sample data, and [c] two sampling distributions for any of a variety of sample sizes and estimators (including the mean, median, standard deviation, and variance).  Very well crafted and informative.)

   
Law of Large Numbers:
This applet plots both the percentage and the number of successes of a Bernoulli process as the number of trials increases, for any specified probability of success.
  http://stat-www.berkeley.edu/users/stark/Java/lln.htm
(Very useful.)
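
A rough sketch of the same demonstration, for use when the applet is not available.  The function name and the choice of p = 0.5 are assumptions made for this example.  The point to watch is that the proportion of successes settles near p, while the raw count can drift away from its expected value n*p (its typical deviation grows like the square root of n).

```python
import random


def bernoulli_running(p, n_trials, seed=0):
    """Return (n, successes, proportion) after each of n_trials Bernoulli trials."""
    rng = random.Random(seed)
    successes = 0
    rows = []
    for n in range(1, n_trials + 1):
        successes += rng.random() < p  # True counts as 1
        rows.append((n, successes, successes / n))
    return rows


if __name__ == "__main__":
    p = 0.5
    rows = bernoulli_running(p, 100_000)
    for n in (10, 100, 1_000, 10_000, 100_000):
        _, count, prop = rows[n - 1]
        # Show both the proportion and how far the count is from n*p.
        print(n, round(prop, 4), round(count - n * p, 1))
```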
 
Central Limit Theorem:
This applet plots an empirical sampling distribution for the sum of from 1 to 5 rolls of a die.
  www.stat.sc.edu/~west/javahtml/CLT.html
(Scroll down and click the "Roll the dice" button.  You can draw one or many samples at a time, both of which continuously update the empirical sampling distribution. A nice illustration.)
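
The dice demonstration is easy to imitate in code.  The sketch below is an assumption-laden stand-in for the applet, not a copy of it: the function names are invented, and the "histogram" is just rows of asterisks.  As the number of dice goes from 1 to 5, the shape of the distribution of sums moves from flat toward bell-shaped.

```python
import random
from collections import Counter


def dice_sum_counts(n_dice, n_samples=10_000, seed=0):
    """Tally the sum of n_dice fair dice over n_samples repetitions."""
    rng = random.Random(seed)
    return Counter(
        sum(rng.randint(1, 6) for _ in range(n_dice)) for _ in range(n_samples)
    )


def print_histogram(counts, width=50):
    """Crude text histogram: one row of asterisks per possible sum."""
    tallest = max(counts.values())
    for total in sorted(counts):
        bar = "*" * round(width * counts[total] / tallest)
        print(f"{total:3d} {bar}")


if __name__ == "__main__":
    for n_dice in (1, 2, 5):
        print(f"Sum of {n_dice} dice")
        print_histogram(dice_sum_counts(n_dice))
```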
   
Normal Approximation to the Binomial:
These applets illustrate how well the normal distribution approximates binomial distributions.
  www.ruf.rice.edu/~lane/stat_sim/binom_demo.html
(This is an older version of the applet listed immediately below, but gives a nicer overall view of the contrast between the binomial histogram and the normal density.  Allows for a variety of binomial distributions.)

www.ruf.rice.edu/~lane/stat_sim/normal_approx/
(Contrasts the binomial histogram to the normal density and calculates the binomial and normal probabilities for any range of outcomes.  Allows for a variety of binomial distributions.)

www.ms.uky.edu/~mai/java/stat/GaltonMachine.html
(An attractive, but time-consuming, quincunx.)
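
For a numerical version of the comparison, the sketch below computes an exact binomial probability and the corresponding normal approximation.  It assumes the usual continuity correction (adding and subtracting 0.5 at the ends of the range), which may or may not match what the applets do; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def binomial_prob(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1)
    )


def normal_approx_prob(n, p, lo, hi):
    """Normal approximation to the same probability, with continuity correction."""
    dist = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)


if __name__ == "__main__":
    for n, p, lo, hi in [(20, 0.5, 8, 12), (20, 0.1, 0, 1), (100, 0.1, 5, 15)]:
        exact = binomial_prob(n, p, lo, hi)
        approx = normal_approx_prob(n, p, lo, hi)
        print(n, p, (lo, hi), round(exact, 4), round(approx, 4))
```

Trying a small N with p far from 0.5 shows where the approximation starts to break down.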
 
 

The Standard Deviation is Biased (but not the Variance):
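This applet can be used to demonstrate that the sample standard deviation is a biased estimator of the population standard deviation but that the sample variance (computed with N - 1 in the denominator) is an unbiased estimator of the population variance.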
  www.ruf.rice.edu/~lane/stat_sim/sampling_dist/
(A very informative display.)
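
The bias can also be checked with a small simulation.  This sketch makes its own assumptions (a standard normal population, samples of size 5, variance computed with N - 1 in the denominator) and uses a made-up function name; it is not how the applet works internally.

```python
import math
import random


def average_estimates(n, reps=20_000, seed=0):
    """Average the sample variance and sample SD over many samples of size n
    drawn from a normal population with variance 1 and SD 1."""
    rng = random.Random(seed)
    total_var = 0.0
    total_sd = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = sum(sample) / n
        var = sum((x - m) ** 2 for x in sample) / (n - 1)  # N - 1 denominator
        total_var += var
        total_sd += math.sqrt(var)
    return total_var / reps, total_sd / reps


if __name__ == "__main__":
    mean_var, mean_sd = average_estimates(5)
    print("average sample variance:", round(mean_var, 3))  # should be close to 1
    print("average sample SD:      ", round(mean_sd, 3))   # should fall below 1
```

The average variance lands close to the population value of 1, while the average standard deviation falls noticeably short of 1 for small samples.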
 

Statistical Inference for Means

Difference Between t and Z:
These applets plot a density curve for the standardized normal distribution overlaid on a density curve for the t distribution with specified degrees of freedom.
http://www-stat.stanford.edu/~naras/jsm/TDensity/TDensity.html
(Seems to work best in Microsoft Internet Explorer.  The degrees of freedom for the t distribution can be varied from 1 to 49.  The resolution is such that you won't see any difference between the normal and t curves except for the smaller degrees of freedom.)

www.stat.ucla.edu/textbook/demos/distributions.phtml
(Requires Xlisp-Stat which can be downloaded for free via anonymous ftp [instructions] [file=densdemo].  Choose the "T" distribution in the first window.)

 
Confidence Interval for a Mean:
These applets illustrate the likelihood that a confidence interval for the mean contains the population mean.
  www.ruf.rice.edu/~lane/stat_sim/conf_interval/
(Draws 100 samples at random, plots the position of the confidence intervals relative to the population mean, and calculates the proportion that cover the population mean for both 95% and 99% confidence intervals.  Allows you to vary the sample size so you can see the effects on the size of the confidence intervals.)

www.stat.sc.edu/~west/javahtml/ConfidenceInterval.html
(Draws 50 samples at random, plots the position of the confidence intervals relative to the population mean, and counts how many fail to cover the population mean.  You can ask for more sets of 50 random samples to be drawn and you can vary alpha to show the effect that has on the width of the confidence intervals.)

http://stat-www.berkeley.edu/users/stark/Java/Ci.htm
(Uses a simulation to calculate the proportion of 68% confidence intervals that contain the population mean.  Lets you draw 1 to 500 samples at a time and plots the position of each confidence interval relative to the population mean.)
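
A coverage simulation along the same lines can be sketched as follows.  To keep it within the standard library, this sketch assumes a normal population with a known standard deviation and so uses z-based intervals; the population values (mean 50, SD 10) and the function name are arbitrary choices for illustration and are not taken from the applets.

```python
import math
import random
from statistics import NormalDist


def ci_coverage(n, confidence=0.95, reps=10_000, mu=50.0, sigma=10.0, seed=0):
    """Proportion of z-based confidence intervals that contain the population mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * sigma / math.sqrt(n)
    covered = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # The interval xbar +/- half_width covers mu when xbar is within half_width of mu.
        covered += abs(xbar - mu) <= half_width
    return covered / reps


if __name__ == "__main__":
    for level in (0.68, 0.95, 0.99):
        print(level, round(ci_coverage(25, confidence=level), 3))
```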
 

Hypothesis Test for a Mean:
These applets can be used to explicate the logic of hypothesis tests.
  http://acad.cgu.edu/wise/hypothesis/hypoth_applet.html
(On separate axes, one above the other, this applet plots a density curve for a normal population and for the sampling distribution of the mean, both assuming the null hypothesis is true and assuming the alternative hypothesis is true. Clicking a button plots a histogram for a random sample on top of  the true population density and plots the sample mean on top of the corresponding sampling distribution.  The z-score test statistic and p-value for the sample are also reported.  Repeating the simulation adds further means to the sampling distribution.  You can vary the size of N, the population means, the population standard deviation, and alpha. Very useful.)

www.stat.ucla.edu/textbook/demos/testing.phtml
(Requires Xlisp-Stat which can be downloaded for free via anonymous ftp [instructions] [file=t-test].  This applet draws samples at random, and plots them on an axis that also shows the sampling distribution of the mean.  You can vary the null and alternative hypotheses, the population standard deviation, and N.  The applet reports the sample mean, the critical values, the p-value, and power. Very Useful.)
 

P-Value Calculation for a t-Test:
With this applet, you can calculate the p-value for a given t value and degrees of freedom.
  www.stat.sc.edu/~ogden/javahtml/pvalcalc.html
(Make sure you scroll to the right to the end of the answer, because it is written in scientific notation.)

Randomization Tests:
This applet illustrates the logic of a randomization test using a variety of data.  You can also enter your own data.
www.stat.psu.edu/~rho/tools/twogroup.htm
(Requires the Neuron plug-in which can be downloaded for free from www.asymetrix.com/products/toolbook2/neuron/dowload.html.)
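
The logic of a randomization test is short enough to write out.  The sketch below is one common two-group version (shuffle the pooled scores, recompute the difference in means, and count how often it is at least as large as the observed difference); it is an assumption that this resembles what the applet does, and the function name and data are invented for the example.

```python
import random


def randomization_test(group_a, group_b, reps=10_000, seed=0):
    """Two-sided randomization p-value for the difference between two group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_diff(pooled)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # randomly reassign scores to the two groups
        if mean_diff(pooled) >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / reps


if __name__ == "__main__":
    treatment = [24, 27, 31, 29, 35]  # hypothetical scores
    control = [22, 25, 21, 28, 23]
    print(randomization_test(treatment, control))
```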
 
Power:
These applets can be used to help students understand the notion of power.
  www.psychstat.smsu.edu/introbook/exercises/errorprobabilities.htm
(Requires Microsoft Internet Explorer, as well as Active X, which "will be automatically obtained from Microsoft" if it is not already on your computer.  The applet plots the sampling distribution of the mean both assuming the null hypothesis is true and assuming the alternative hypothesis is true.  The applet calculates power and shades both the area representing the (one-tailed) rejection region under the null hypothesis distribution and the area representing power under the alternative hypothesis distribution.  You can vary N, the effect size, the population standard deviation, and alpha to see how this affects power.  Very useful.)

www.stat.psu.edu/~rho/tools/power.htm
(This applet plots the sampling distribution of the mean both assuming the null hypothesis is true and assuming the alternative hypothesis is true.  The applet calculates power and shades both the area representing the (one-tailed) rejection region under the null hypothesis distribution and the area representing power under the alternative hypothesis distribution.  You can vary N, the true mean, and alpha to see how this affects power.  Requires the Neuron plug-in which can be downloaded for free from www.asymetrix.com/products/toolbook2/neuron/dowload.html. Very useful.)

www.stat.ucla.edu/textbook/demos/testing.phtml
(Requires Xlisp-Stat which can be downloaded for free via anonymous ftp [instructions] [file=t-test].  This applet draws samples at random, and plots them on an axis that also shows the sampling distribution of the mean.  You can vary the null and alternative hypotheses, the population standard deviation, and N.  The applet reports the sample mean, the critical values, the p-value, and power. Very Useful.)

www.ruf.rice.edu/~lane/stat_sim/repeated_measures/
(Runs simulations to estimate power for both within and between-subject designs.  You can vary N, the population means, the population standard deviations, and the correlation between pairs of scores.  The results of any single simulation can be presented in a scatterplot.  A creative way to come to understand power.)

http://acad.cgu.edu/wise/hypothesis/hypoth_applet.html
(On separate axes, one above the other, this applet plots a density curve for a normal population and for the sampling distribution of the mean, both assuming the null hypothesis is true and assuming the alternative hypothesis is true.  Clicking a button plots a histogram for a random sample on top of  the true population density and plots the sample mean on top of the corresponding sampling distribution.  The z-score test statistic and p-value for the sample are also reported.  Repeating the simulation adds further means to the sampling distribution.  Among other things, this applet can be used to show how power varies with the size of N, the population means, the population standard deviation, and alpha.)

www.stat.sc.edu/~ogden/javahtml/power/power.html
(Scroll down.  This applet plots two sampling distributions of the mean; one assuming the null hypothesis is true and the other assuming the alternative hypothesis is true. The applet shades both the area representing the rejection regions under the null hypothesis distribution and the area representing power under the alternative hypothesis distribution.  You can vary N, the null hypothesis value, the true population mean, and the standard deviation of the population.  This is a beautifully crafted applet but note that changes in N and the other values cause a rescaling of the axis, which may or may not be confusing to students.)
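
The shaded "power" area that several of these applets display can be computed directly in the simplest case.  The sketch below assumes a one-tailed (upper-tail) z-test with a known population standard deviation; the function name and the example values are illustrative assumptions, not taken from any of the applets.

```python
import math
from statistics import NormalDist


def power_one_tailed_z(mu_null, mu_true, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test of H0: mu = mu_null when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)          # edge of the rejection region
    shift = (mu_true - mu_null) * math.sqrt(n) / sigma  # true mean in standard-error units
    # Power is the area of the alternative sampling distribution beyond the critical value.
    return 1.0 - std_normal.cdf(z_crit - shift)


if __name__ == "__main__":
    for n in (9, 36, 100):
        print(n, round(power_one_tailed_z(100, 105, 15, n), 3))
```

Holding everything else fixed, increasing N, increasing the difference between the means, decreasing the standard deviation, or increasing alpha each raises the power.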
 
 

Effect Size:
Though it may take a bit of ingenuity, you can perhaps use these applets to help students intuitively understand effect sizes.
  www.psychstat.smsu.edu/multibook/exercises/dichotomous.htm
(Requires Microsoft Internet Explorer.  The applet has you guess the size of a point-biserial correlation given the two means, standard deviations, and Ns.)

www.rug.rice.edu/~lane/stat_sim/group_diff.html
(Calculates the relative proportion of individuals who are above a specified cut-off point for any size mean difference between two populations.)
 

Relative Precision of the Mean and Median:
This applet can be used to estimate the relative precision of the mean and median for different sample sizes and for populations with different distributions.
  www.ruf.rice.edu/~lane/stat_sim/sampling_dist/
(A very informative display.)  
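
A small simulation gives a feel for the same comparison.  The sketch below assumes a standard normal population and a sample size of 25, and the function name is made up for the example; other population shapes would have to be substituted by hand.

```python
import random
import statistics


def sd_of_estimators(n, reps=10_000, seed=0):
    """Standard deviation of the sample mean and of the sample median
    across many samples of size n from a standard normal population."""
    rng = random.Random(seed)
    means = []
    medians = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pstdev(means), statistics.pstdev(medians)


if __name__ == "__main__":
    sd_mean, sd_median = sd_of_estimators(25)
    print("SD of sample means:  ", round(sd_mean, 3))
    print("SD of sample medians:", round(sd_median, 3))
```

For a normal population the medians vary more from sample to sample than the means do, so the mean is the more precise estimator in that case.
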
Analysis of Variance:
This applet explicates the behavior of a one-way analysis of variance.
  www.dartmouth.edu/~matc/X10/java/anova/Anova.html
(You can choose any number of groups (up to 10), any sample size (up to 100 per group), any population means, and any population variance.  The applet displays the null and noncentral F distributions, generates random data repeatedly, and tallies the results. Very useful.)
 

Statistical Inference for Categorical Data

Binomial Distribution:
These applets plot a histogram for the binomial distribution for any value of N and any value of p.
www.ruf.rice.edu/~lane/stat_sim/binom_demo.html
(This is an older version of the applet listed immediately below, but gives a nicer overall view of the contrast between the binomial histogram and the normal density.)

www.ruf.rice.edu/~lane/stat_sim/normal_approx/
(Contrasts the binomial histogram to the normal density and calculates the binomial and normal probabilities for any range of outcomes.)

www.stat.ucla.edu/textbook/demos/distributions.phtml
(Requires Xlisp-Stat which can be downloaded for free via anonymous ftp [instructions] [file=densdemo]. Very Useful.)


Confidence Interval for a Proportion:
This applet illustrates the likelihood that a confidence interval for a proportion contains the population proportion.

www.ruf.rice.edu/~lane/stat_sim/normal_approx_conf/
(Lets you investigate how well an approximate confidence interval contains the population proportion for any value of N and for any value of the population proportion.)

Hypothesis Test for a Proportion:
This applet can be used to explicate the logic of hypothesis tests.
  http://dartmouth.edu/~matc/X10/java/pvalue/PropSim4.html
(Seems to work best in Microsoft Internet Explorer.  Plots the sampling distribution for a proportion both assuming the null hypothesis is true and assuming the alternative hypothesis is true.  You can vary N and the values of the proportion under both the null and alternative hypotheses.  Clicking a button generates a random sample.  The applet reports the sample proportion and points to its position in the appropriate sampling distribution.  The simulation can be repeated and the applet will tally the number of times the null hypothesis is rejected at the .05 level.)
 
Power:
These applets can be used to explicate the notion of power.
  http://dartmouth.edu/~matc/X10/java/pvalue/PropSim4.html
(Seems to work best in Microsoft Internet Explorer.  Plots the sampling distributions for a proportion for both the null and alternative hypotheses.  You can choose whether the null or alternative hypothesis is true.  Clicking a button generates a random sample.  The simulation reports the sample proportion and points to its position in the appropriate sampling distribution.  The simulation can be repeated and the applet will tally the number of times the null hypothesis is rejected at the .05 level.  With this tally, you can show how power varies with N and with the values of the proportion under both the null and alternative hypotheses.)

www.math.csusb.edu/faculty/stanton/m262/proportions/proportions.html

(Illustrates the likelihood that a sample proportion will lie within the critical values of a hypothesis test for any given null hypothesis value and for any given population proportion. Also can be used to illustrate how changing the level of alpha changes the critical values of the hypothesis test.  Scroll down.)

Margin of Error and Sample Size Calculations for a Finite Population:
This web site calculates either the margin of error (i.e., the half-width of a confidence interval) for a given sample size, or the sample size for a given margin of error.  The calculations are for estimating percentages using a simple random sample drawn from a finite population (of specified size).
  www.surveysystem.com/sscalc.htm#factors
(Scroll down.)

www.researchinfo.com/calculators/sscalc.htm
(This site contains the same sample size calculator as in the applet immediately above.)
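
For readers who want to see the arithmetic, here is a sketch using the standard textbook formulas for a proportion with a finite population correction.  It is an assumption that the calculators above use these same formulas; the function names and the default of p = 0.5 (the most conservative choice) are choices made for this example.

```python
import math
from statistics import NormalDist


def margin_of_error(n, population, p=0.5, confidence=0.95):
    """Margin of error for a proportion from a simple random sample of size n
    drawn without replacement from a finite population."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    fpc = math.sqrt((population - n) / (population - 1))  # finite population correction
    return z * math.sqrt(p * (1 - p) / n) * fpc


def sample_size(margin, population, p=0.5, confidence=0.95):
    """Smallest sample size that achieves the requested margin of error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    n_infinite = z**2 * p * (1 - p) / margin**2  # size needed for a very large population
    return math.ceil(n_infinite / (1 + (n_infinite - 1) / population))


if __name__ == "__main__":
    for population in (1_000, 100_000, 10_000_000):
        print(population,
              sample_size(0.05, population),
              round(margin_of_error(400, population), 4))
```

Running the loop shows the required sample size changing very little once the population is much larger than the sample.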
 

Unimportance of Population Size:
I use the same sites as listed above to demonstrate that precision doesn't depend on the size of the population, as long as the sample is a relatively small proportion of the population.
  www.surveysystem.com/sscalc.htm#factors
(Scroll down.)

www.researchinfo.com/calculators/sscalc.htm
(This site contains the same sample size calculator as in the applet immediately above.)
 

Chi-Square:
These applets can be used to explicate different aspects of chi-square hypothesis tests.
  www.ruf.rice.edu/~lane/stat_sim/chisq_theor/
(You can choose to sample 100 scores at random from either a normal or a uniform distribution.  The applet then presents -- in a very nice graphical form -- the calculations for two chi-square goodness-of-fit tests; one that tests whether the population has a normal distribution and the other that tests whether the population has a uniform distribution. Very useful.)

http://media.tasc.ac.uk/sobol/survey5/
(This web site lets you (a) choose among pairs of variables from a variety of data sets, (b) plot the bivariate frequency table for any given pair and calculate a chi-square test of the relationship, and (c) then see how the results change when you collapse cells from either or both of the two variables -- which is especially useful when you collapse cells that have small Ns.  Follow the directions at the web site.)
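
The chi-square statistic itself is simple to compute by hand or in code.  The sketch below shows a goodness-of-fit calculation against a uniform distribution; the observed counts are invented for the example, the function name is hypothetical, and the p-value is left to a table or calculator because the Python standard library has no chi-square distribution.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical data: 100 scores sorted into 4 bins that a uniform
    # distribution would fill equally (25 expected per bin).
    observed = [30, 20, 28, 22]
    expected = [25, 25, 25, 25]
    stat = chi_square_statistic(observed, expected)
    df = len(observed) - 1
    print("chi-square =", round(stat, 2), "with", df, "degrees of freedom")
    # The .05 critical value for 3 degrees of freedom is 7.815.
    print("reject at .05:", stat > 7.815)
```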
 
 



Power and Sample Size Calculations
The web sites listed below are just a few of the many sites that can be used to calculate either the level of power for a given sample size, or the sample size for a given level of power.  Calculations are provided for a wide variety of hypothesis tests.
www.stat.ucla.edu/calculators/powercalc/

www.stat.ucla.edu/~jbond/HTMLPOWER/

http://members.aol.com/johnp71/javastat.html#Power


___________
Revised 10/18/99.  Support from the Center for Teaching and Learning at the University of Denver is gratefully acknowledged.  Please send suggestions for additions or other comments to Chip Reichardt (creichar@du.edu).