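% If you use this code, please cite:
% Michael Lamar*, Yariv Maron*, Mark Johnson, Elie Bienenstock.
% SVD and clustering for unsupervised POS tagging. (ACL 2010)
% For more information, see the above reference.

function [H1, H2, VI] = var_information(P)
% Computes entropies and variation of information from confusion matrix P.
%   H1, H2 : entropies (in bits) of the row and column marginals of P
%   VI     : variation of information (in bits) between the two labelings
% P is not normalized inside this function.

[n1, n2] = size(P);
P = P + eps;                            % keep every entry positive so log2 stays finite

P1 = sum(P, 2);                         % row marginal (n1-by-1)
P2 = sum(P, 1);                         % column marginal (1-by-n2)
P1rep = repmat(P1, 1, n2);              % row marginal tiled to the size of P
P2rep = repmat(P2, n1, 1);              % column marginal tiled to the size of P

H1 = - P1' * log2(P1);                  % entropy of the row marginal
H2 = - P2 * log2(P2');                  % entropy of the column marginal

H12 = - sum(sum(P .* log2(P ./ P2rep))); % conditional entropy of rows given columns
H21 = - sum(sum(P .* log2(P ./ P1rep))); % conditional entropy of columns given rows

VI = H12 + H21;                         % variation of information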
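
% Usage example (a minimal sketch, not part of the original code; it assumes
% P is a joint distribution over two labelings whose entries sum to 1, and
% the count matrix below is hypothetical):
%
%   counts = [10 0; 0 10];             % hypothetical confusion counts
%   P = counts / sum(counts(:));       % normalize the counts to sum to 1
%   [H1, H2, VI] = var_information(P);
%
% For two identical labelings such as this one, H1 and H2 should each be
% close to 1 bit and VI close to 0. The values are not exact because eps is
% added to every entry of P before the logarithms are taken.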