The root-mean-square deviation (or error), RMSD, characterizes the differences between two sets of data. Its name spells out its definition: the differences are squared, the squares are averaged, and the square root of that average is the final result.

For example, assume we have two sets of data:

Set A – x: 1, 2, 3 (red)

Set A – y: 1, 1, 2 (red)

Set B – x: 1, 2, 3 (blue)

Set B – y: 2, 3, 1 (blue)

Because the x coordinates of these two sets are identical, the RMSD can be calculated directly from the differences between corresponding y values, denoted e1, e2, and e3, as shown in the figure.

RMSD = sqrt( (e1^2+e2^2+e3^2) / 3 )

Because each difference is squared, the sign of e1 (or e2, or e3) does not matter; only its magnitude affects the RMSD.
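As a quick check of the formula on the two sets above, here is a minimal Python sketch. The function name `rmsd` and the variable names are chosen for illustration only; this is not the Matlab program mentioned later.

```python
import math

def rmsd(y_ref, y_sample):
    """RMSD of two equally long sequences sampled at the same x values."""
    if len(y_ref) != len(y_sample):
        raise ValueError("both sets must have the same number of points")
    # Deviation at each point; squaring below discards the sign.
    errors = [ys - yr for yr, ys in zip(y_ref, y_sample)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

y_a = [1, 1, 2]  # Set A (red)
y_b = [2, 3, 1]  # Set B (blue)

# Deviations are 1, 2, -1, so the result is sqrt((1 + 4 + 1) / 3) = sqrt(2).
print(rmsd(y_a, y_b))
# Swapping the sets only flips the signs of the deviations, so the value is unchanged.
print(rmsd(y_b, y_a))
```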

However, in real numerical calculations, the x coordinates of the two data sets often differ. In this case, we can designate one data set as the reference and the other as the sample. Each sample point is then projected onto the reference data at the same x coordinate to find the corresponding deviation.

For example, if we change the last x coordinate of set B to 3.5 and select set A as the reference, the RMSD is calculated as illustrated in the figure. This time, assuming set A is extended linearly through its last two points, the magnitude of e3 is 1.5 instead of 1.

If the projection falls between two data points of the reference, its value is obtained by interpolation. If it falls outside the x range of the reference, its value is obtained by extrapolation (as in this example).
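The two cases can be sketched in Python as follows. This assumes the reference is treated as a piecewise-linear curve and that extrapolation simply extends the nearest end segment; the text does not say which scheme the Matlab program uses, and the helper name `project` is hypothetical.

```python
def project(x_ref, y_ref, x):
    """Value of the piecewise-linear reference curve at x.

    Assumes x_ref is strictly increasing and has at least two points.
    Returns (value, kind), where kind is "interpolation" or "extrapolation".
    """
    if len(x_ref) < 2:
        raise ValueError("the reference needs at least two points")
    if x < x_ref[0]:
        i, kind = 0, "extrapolation"               # extend the first segment
    elif x > x_ref[-1]:
        i, kind = len(x_ref) - 2, "extrapolation"  # extend the last segment
    else:
        # Find the segment i with x_ref[i] <= x <= x_ref[i + 1].
        i = 0
        while i < len(x_ref) - 2 and x > x_ref[i + 1]:
            i += 1
        kind = "interpolation"
    slope = (y_ref[i + 1] - y_ref[i]) / (x_ref[i + 1] - x_ref[i])
    return y_ref[i] + slope * (x - x_ref[i]), kind

# Set A as the reference.
x_a, y_a = [1, 2, 3], [1, 1, 2]
print(project(x_a, y_a, 2.5))  # inside the x range of A: (1.5, 'interpolation')
print(project(x_a, y_a, 3.5))  # outside the x range of A: (2.5, 'extrapolation')
```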

Note that if we select set B as the reference, the result will be different, as shown in the figure. This time the value of e3 is much smaller than when set A is the reference.
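The dependence on the choice of reference can be reproduced with a short Python sketch. It assumes linear interpolation and linear extrapolation from the end segments, which the text does not state explicitly; the names `line_value` and `rmsd_projected` are hypothetical and are not taken from the Matlab program.

```python
import math

def line_value(x_ref, y_ref, x):
    """Piecewise-linear value of the reference at x.

    Points outside the x range reuse the first or last segment (extrapolation).
    Assumes x_ref is strictly increasing with at least two points.
    """
    i = 0
    while i < len(x_ref) - 2 and x > x_ref[i + 1]:
        i += 1
    slope = (y_ref[i + 1] - y_ref[i]) / (x_ref[i + 1] - x_ref[i])
    return y_ref[i] + slope * (x - x_ref[i])

def rmsd_projected(reference, sample):
    """Return (RMSD, deviations) of the sample points from the reference curve."""
    x_ref, y_ref = reference
    x_s, y_s = sample
    errors = [y - line_value(x_ref, y_ref, x) for x, y in zip(x_s, y_s)]
    return math.sqrt(sum(e * e for e in errors) / len(errors)), errors

set_a = ([1, 2, 3], [1, 1, 2])    # Set A (red)
set_b = ([1, 2, 3.5], [2, 3, 1])  # Set B (blue) with its last x moved to 3.5

rmsd_ref_a, errors_ref_a = rmsd_projected(set_a, set_b)  # A is the reference
rmsd_ref_b, errors_ref_b = rmsd_projected(set_b, set_a)  # B is the reference

print(errors_ref_a, rmsd_ref_a)  # e3 = -1.5, RMSD approximately 1.55
print(errors_ref_b, rmsd_ref_b)  # e3 approximately 0.33, RMSD approximately 1.31
```

Under these assumptions, the third deviation shrinks from a magnitude of 1.5 to about 0.33 when the reference switches from set A to set B, so the two RMSD values differ.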

Here we give a Matlab program to calculate the RMSD. It requires the user to input two data sets and specify which data set is the reference.

link (works with Matlab and GNU Octave)