http://www.statlab.uni-heidelberg.de/data/prim/
G. Sawitzki StatLab Heidelberg Last revision: 2016-07-01 by gs

Prim 9 +/- 2

Prim-9 is a software system by J.W. Tukey et al., presented in the video J.W. Tukey, J.H. Friedman and M.A. Fisherkeller: Prim-9 (1973).


Prim-9 has been a foundation stone for interactive graphical data analysis. A breakdown of this video, isolating its main contributions, can be found in the media repository section of G. Sawitzki: Data analysis and visualisation for small dimensions (2007).


Prim-9 is designed for the analysis of data in up to 9 dimensions. For illustration, Tukey uses a 7-dimensional dataset on particle collisions, available as PRIM7 in the R package groc (library(groc)), in the GGobi book, and in various other sources.

In the application, Tukey focuses on two rod-like structures. The background question is: what can be inferred about the underlying structure from the apparent shape of a data structure in a projection? This has been discussed in detail in George W. Furnas & Andreas Buja: Prosection Views: Dimensional Inference through Sections and Projections, JCGS Volume 3, Issue 4, 1994.

In the introduction, Tukey discusses that the seven embedding dimensions of this data set can be reduced by symmetry to at most four structural dimensions. He does not use this information in the video.

Does it help to understand the data structure?

He uses three arguments. Two of them refer to conservation laws. Since energy and momentum are conserved, the sum over the corresponding dimensions must be constant; that is, we have two-dimensional triangular structures in 7-space. Unfortunately the variables are not labeled, so we have to hunt for them. But we know the target: except for non-generic spaces, we should look for triangles in two-dimensional projections.
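The geometric claim can be checked on synthetic data. The sketch below does not use the PRIM7 data: it draws three non-negative variables constrained to a constant sum and confirms that every two-dimensional projection stays inside a triangle. The names `sample_constrained`, `inside_triangle` and `total` are illustrative assumptions, not part of any Prim-9 software.

```python
import random

def sample_constrained(n, total=1.0, seed=0):
    """Draw n points (x, y, z) with x, y, z >= 0 and x + y + z == total.

    Synthetic stand-in for three variables tied by a conservation law;
    this is NOT the PRIM7 data.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # Two sorted uniform cut points split [0, total] into three parts.
        a, b = sorted((rng.uniform(0, total), rng.uniform(0, total)))
        points.append((a, b - a, total - b))
    return points

def inside_triangle(u, v, total, tol=1e-12):
    """True if (u, v) lies in the triangle u >= 0, v >= 0, u + v <= total."""
    return u >= -tol and v >= -tol and u + v <= total + tol

points = sample_constrained(1000, total=1.0)

# The sum is constant, so the points lie on a plane in 3-space.
max_dev = max(abs(sum(p) - 1.0) for p in points)

# Each 2d marginal projection is confined to a triangle.
pairs = [(0, 1), (0, 2), (1, 2)]
all_inside = all(
    inside_triangle(p[i], p[j], 1.0) for p in points for i, j in pairs
)
print(max_dev < 1e-9, all_inside)
```

With unlabeled variables, the same triangle test applied to all pairs of columns is one possible way to shortlist candidates before looking at the plots.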

Marginal plots are sufficient for a first step. A quick look points to variables 3 and 5 as first candidates.

2d marginal plots

To find the third variable of the triangle, we add the sum Var3+Var5 as a new variable and plot it against the remaining variables.


Visual inspection points to variable 1 as the next candidate.

Added variable marginal plot for Var.3, 5
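The added-variable step can be complemented by a numerical screen. The sketch below uses synthetic data, not PRIM7: it adds each remaining variable to the partial sum and ranks the candidates by the spread of the resulting total, since a conserved sum should be nearly constant. The function name `rank_third_variable` and the choice of the standard deviation as criterion are assumptions made for illustration; the analysis on this page relies on visual inspection.

```python
import random
import statistics

def rank_third_variable(columns, pair):
    """Rank remaining variables by how constant pair-sum + candidate is.

    columns: dict mapping variable name -> list of values (equal lengths).
    pair:    names of the two variables already selected.
    Returns a list of (name, stdev of total), smallest spread first.
    """
    partial = [a + b for a, b in zip(columns[pair[0]], columns[pair[1]])]
    scores = []
    for name, values in columns.items():
        if name in pair:
            continue
        total = [s + v for s, v in zip(partial, values)]
        scores.append((name, statistics.pstdev(total)))
    return sorted(scores, key=lambda item: item[1])

# Synthetic example (NOT the PRIM7 data): v1 + v3 + v5 is constant,
# v2 and v4 are unrelated noise.
rng = random.Random(1)
n = 500
v3 = [rng.uniform(0, 0.5) for _ in range(n)]
v5 = [rng.uniform(0, 0.5) for _ in range(n)]
v1 = [1.0 - a - b for a, b in zip(v3, v5)]
v2 = [rng.gauss(0, 1) for _ in range(n)]
v4 = [rng.gauss(0, 1) for _ in range(n)]
columns = {"v1": v1, "v2": v2, "v3": v3, "v4": v4, "v5": v5}

ranking = rank_third_variable(columns, ("v3", "v5"))
print(ranking[0][0])  # the candidate completing the constant sum
```

This criterion only detects an exact linear sum on a common scale. For a curved relation it would be a rough screen at best, which is one reason to keep the plots as the primary tool.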

Inspecting the data set in the three dimensions spanned by variables 1, 3 and 5 gives a clear geometrical impression.

The curved shape indicates that this is not exactly the symmetry pointed out by Tukey, which would give a planar triangle. However, it reveals a symmetry-related aspect of the data, covering a 3d substructure.

The second symmetry used by Tukey can be revealed in analogous steps.

The third symmetry mentioned is rotational symmetry. Unfortunately it is not common to switch between Cartesian and polar coordinate systems, so this requires some more effort.
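The coordinate change itself is short. The sketch below converts one pair of Cartesian variables to polar coordinates using only the Python standard library. Which PRIM7 variables are related by the rotational symmetry is not stated here, so the input is synthetic and the function name `to_polar` is an assumption. Under a rotational symmetry one would expect the structure of interest to show up mainly in the radius.

```python
import math
import random

def to_polar(xs, ys):
    """Convert paired Cartesian coordinates to (radius, angle) lists.

    Angles are in radians in the interval [-pi, pi].
    """
    radius = [math.hypot(x, y) for x, y in zip(xs, ys)]
    angle = [math.atan2(y, x) for x, y in zip(xs, ys)]
    return radius, angle

# Synthetic example (NOT the PRIM7 data): points on a noisy ring,
# which is rotationally symmetric about the origin.
rng = random.Random(2)
n = 500
true_angle = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
true_radius = [2.0 + rng.gauss(0, 0.05) for _ in range(n)]
xs = [r * math.cos(t) for r, t in zip(true_radius, true_angle)]
ys = [r * math.sin(t) for r, t in zip(true_radius, true_angle)]

radius, angle = to_polar(xs, ys)

# In polar coordinates the ring collapses to a narrow band in radius.
print(min(radius) > 1.5, max(radius) < 2.5)
```

The converted columns can then be added to the data set and inspected with the same marginal plots as before.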

For an analysis using projection pursuit and the grand tour, see Dianne Cook, Andreas Buja, Javier Cabrera and Catherine Hurley: Grand Tour and Projection Pursuit, Journal of Computational and Graphical Statistics, Vol. 4, No. 3 (Sep., 1995), pp. 155-172, available on JSTOR.

Software used

The displays have been generated by DataDesk 7.

Media files

prim/media/ScatMat.png
prim/media/ScatMat35x.png
prim/media/prim135.mov
prim/media/prim135.mp4

