The way I understand principleprincipal components is this: Data with multiple variables (height, weight, age, temperature, wavelength, percent survival, etc) can be presented in three dimensions to plot relatedness.
Now if you wanted to somehow make sense of "3D data", you might want to know which 2D planes (cross-sections) of this 3D data contain the most information for a given suite of variables. These 2D planes are the principleprincipal components, which contain a proportion of each variable.
Think of principal components as variables themselves, with composite characteristics from the original variables (this new variable could be described as being part weight, part height, part age, etc). When you plot one principleprincipal component (X) against another (Y), what you're doing is building a 2D map that can geometrically describe correlations between original variables. Now the useful part: since each subject (observation) being compared is associated with values for each variable, the subjects (observations) are also found somewhere on this X Y map. Their location is based on the relative contributions of each underlying variable (i.e. one observation may be heavily affected by age and temperature, while another one may be more affected by height and weight). This map graphically shows us the similarities and differences between subjects and explains these similarities/differences in terms of which variables are characterizing them the most.