Psychology Wiki

Kendall tau distance

Revision as of 12:35, December 26, 2011 by Dr Joe Kiff


The Kendall tau distance is a metric that counts the number of pairwise disagreements between two ranking lists. The larger the distance, the more dissimilar the two lists are. It is also called the bubble-sort distance, since it equals the number of adjacent swaps that the bubble sort algorithm would make to place one list in the same order as the other. The Kendall tau distance was created by Maurice Kendall.
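The bubble-sort interpretation can be illustrated with a minimal Python sketch. The function name `bubble_sort_distance` is hypothetical, and the sketch assumes both lists contain the same distinct items. It relabels each item of the first list by its position in the second list, then counts the adjacent swaps a plain bubble sort performs.

```python
def bubble_sort_distance(a, b):
    """Count the adjacent swaps bubble sort makes to reorder `a` into the order of `b`.

    Assumption: `a` and `b` contain the same distinct items.
    """
    # Relabel every item of `a` by its position in `b`; sorting these labels
    # in ascending order is the same as putting `a` into the order of `b`.
    target = {item: pos for pos, item in enumerate(b)}
    seq = [target[item] for item in a]

    swaps = 0
    for end in range(len(seq) - 1, 0, -1):
        for k in range(end):
            if seq[k] > seq[k + 1]:
                # Each adjacent swap removes exactly one pairwise disagreement.
                seq[k], seq[k + 1] = seq[k + 1], seq[k]
                swaps += 1
    return swaps


print(bubble_sort_distance(["a", "b", "c"], ["c", "a", "b"]))  # prints 2
```

In this small case the pairs (a, c) and (b, c) are ordered differently in the two lists, which matches the two swaps.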

Definition

The Kendall tau distance between two lists \tau_1 and \tau_2 is

K(\tau_1,\tau_2) = |\{(i,j): i < j, ( \tau_1(i) < \tau_1(j) \wedge \tau_2(i) > \tau_2(j) ) \vee ( \tau_1(i) > \tau_1(j) \wedge \tau_2(i) < \tau_2(j) )\}|,

where \tau_1(i) and \tau_2(i) are the ranks of element i in \tau_1 and \tau_2 respectively.

K(\tau_1,\tau_2) will be equal to 0 if the two lists are identical and n(n-1)/2 (where n is the list size) if one list is the reverse of the other. Often Kendall tau distance is normalized by dividing by n(n-1)/2 so a value of 1 indicates maximum disagreement. The normalized Kendall tau distance therefore lies in the interval [0,1].
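As a rough illustration of the definition and its normalization, the following Python sketch counts discordant index pairs directly. The function names are hypothetical. Each list is assumed to hold ranks, so that `rank1[i]` plays the role of \tau_1(i). Tied ranks are not counted as disagreements, in line with the strict inequalities in the definition.

```python
from itertools import combinations


def kendall_tau_distance(rank1, rank2):
    """Number of index pairs i < j whose ranks are ordered oppositely in the two lists."""
    if len(rank1) != len(rank2):
        raise ValueError("both lists must have the same length")
    return sum(
        1
        for i, j in combinations(range(len(rank1)), 2)
        # A negative product means one difference is positive and the other negative.
        if (rank1[i] - rank1[j]) * (rank2[i] - rank2[j]) < 0
    )


def normalized_kendall_tau_distance(rank1, rank2):
    """Distance divided by the maximum n(n-1)/2, giving a value in [0, 1]."""
    n = len(rank1)
    if n < 2:
        raise ValueError("normalization needs at least two elements")
    return kendall_tau_distance(rank1, rank2) / (n * (n - 1) / 2)


print(kendall_tau_distance([1, 2, 3, 4], [1, 2, 3, 4]))             # prints 0
print(kendall_tau_distance([1, 2, 3, 4], [4, 3, 2, 1]))             # prints 6
print(normalized_kendall_tau_distance([1, 2, 3, 4], [4, 3, 2, 1]))  # prints 1.0
```

This direct count takes time quadratic in the list size, which is fine for short lists.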

Kendall tau distance may also be defined as

K(\tau_1,\tau_2) = \sum_{\{i,j\}\in P} \bar{K}_{i,j}(\tau_1,\tau_2)

where

  • P is the set of unordered pairs of distinct elements in \tau_1 and \tau_2
  • \bar{K}_{i,j}(\tau_1,\tau_2) = 0 if i and j are in the same order in \tau_1 and \tau_2
  • \bar{K}_{i,j}(\tau_1,\tau_2) = 1 if i and j are in the opposite order in \tau_1 and \tau_2.

Kendall tau distance can also be defined as the total number of discordant pairs.

In the setting of rankings, a permutation (or ranking) is an array of N integers in which each integer between 0 and N-1 appears exactly once. The Kendall tau distance between two rankings is the number of pairs of elements that appear in a different order in the two rankings. For example, the Kendall tau distance between 0 3 1 6 2 5 4 and 1 0 3 6 4 2 5 is four, because the pairs 0-1, 3-1, 2-4, and 5-4 are in different order in the two rankings, while all other pairs are in the same order. [1]
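The permutation example can be checked with a short Python sketch. Note that the arrays here list elements in order, so the comparison is made on the positions at which each element appears, not on the array values at a given index. The function name `discordant_pairs` is hypothetical, and both arrays are assumed to be permutations of the same elements.

```python
from itertools import combinations


def discordant_pairs(order1, order2):
    """Return the element pairs that appear in a different relative order in the two arrays."""
    # Position at which each element appears in each array.
    pos1 = {item: k for k, item in enumerate(order1)}
    pos2 = {item: k for k, item in enumerate(order2)}
    return [
        (x, y)
        for x, y in combinations(order1, 2)
        if (pos1[x] - pos1[y]) * (pos2[x] - pos2[y]) < 0
    ]


pairs = discordant_pairs([0, 3, 1, 6, 2, 5, 4], [1, 0, 3, 6, 4, 2, 5])
print(pairs)       # prints [(0, 1), (3, 1), (2, 4), (5, 4)]
print(len(pairs))  # prints 4
```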

Example

Suppose we rank a group of five people by height and by weight:

Person           A  B  C  D  E
Rank by Height   1  2  3  4  5
Rank by Weight   3  4  1  2  5

Here person A is tallest and third-heaviest, and so on.

In order to calculate the Kendall tau distance, pair each person with every other person and count the number of times the values in list 1 are in the opposite order of the values in list 2.

Pair    Height   Weight   Count
(A,B)   1 < 2    3 < 4
(A,C)   1 < 3    3 > 1    X
(A,D)   1 < 4    3 > 2    X
(A,E)   1 < 5    3 < 5
(B,C)   2 < 3    4 > 1    X
(B,D)   2 < 4    4 > 2    X
(B,E)   2 < 5    4 < 5
(C,D)   3 < 4    1 < 2
(C,E)   3 < 5    1 < 5
(D,E)   4 < 5    2 < 5

Since there are 4 pairs whose values are in opposite order, the Kendall tau distance is 4. The normalized Kendall tau distance is

\frac{4}{5(5 - 1)/2} = 0.4.

A value of 0.4 indicates that 40% of the pairs are ordered differently in the two rankings, so the rankings agree on the remaining 60% of pairs.
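The height and weight example can be reproduced with a few lines of Python. This is only a sketch of the pair-by-pair count described above; the variable names are chosen for illustration.

```python
from itertools import combinations

people = ["A", "B", "C", "D", "E"]
height = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}  # rank by height
weight = {"A": 3, "B": 4, "C": 1, "D": 2, "E": 5}  # rank by weight

# A pair is discordant when the two rankings order its members oppositely.
discordant = [
    (p, q)
    for p, q in combinations(people, 2)
    if (height[p] - height[q]) * (weight[p] - weight[q]) < 0
]

n = len(people)
normalized = len(discordant) / (n * (n - 1) / 2)

print(discordant)  # prints [('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D')]
print(normalized)  # prints 0.4
```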

This page uses Creative Commons Licensed content from Wikipedia.
