KDD procedures – KL-Collaps




Václav Lín


Václav Lín (theory), Václav Lín (software), Václav Lín (help)


KL-Collaps is a submodule of KL-Miner. Given a K×L contingency table corresponding to some KL-hypothesis discovered by KL-Miner, the user can use KL-Collaps to search for the strongest interactions in the contingency table. Thus the user can gain some additional insight into the results of KL-Miner.

More formally, consider a contingency table with rows r1, …, rK and columns c1, …, cL. Let nkl be the frequency at the intersection of row rk and column cl. Any pair κ1 ⊆ {r1, …, rK}, κ2 ⊆ {c1, …, cL} determines a fourfold contingency table

          κ2    not κ2
κ1        a     b
not κ1    c     d

where a = Σ{nkl: k ∈ κ1, l ∈ κ2}, b = Σ{nkl: k ∈ κ1, l ∉ κ2}, c = Σ{nkl: k ∉ κ1, l ∈ κ2}, d = Σ{nkl: k ∉ κ1, l ∉ κ2} (here k ∈ κ1 is shorthand for rk ∈ κ1, and l ∈ κ2 for cl ∈ κ2). KL-Collaps searches the set of all such pairs of κ1 and κ2 and outputs all the pairs for which the χ² statistic exceeds a critical value supplied by the user. The search set and the output set of pairs [κ1, κ2] can be restricted by optional parameters that constrain the syntactic form and the maximal number of returned pairs. Such a restriction may have a great impact on the computational complexity of the search and on the intelligibility of the output.
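The search described above can be illustrated with a minimal Python sketch. It is not the KL-Collaps implementation: the function names (`collapse`, `chi_square`, `kl_collaps_sketch`) are hypothetical, the statistic is assumed to be the usual Pearson χ² for a fourfold table without continuity correction, χ² = n(ad − bc)² / ((a + b)(c + d)(a + c)(b + d)) with n = a + b + c + d, and none of the optional syntactic restrictions are modelled. Rows and columns are referred to by 0-based index, and only non-empty proper subsets are tried, since an empty or full subset yields a degenerate table.

```python
from itertools import combinations


def nonempty_proper_subsets(n):
    """Yield every non-empty proper subset of range(n) as a frozenset."""
    for size in range(1, n):
        for combo in combinations(range(n), size):
            yield frozenset(combo)


def collapse(table, rows, cols):
    """Collapse a K x L frequency table into the fourfold table (a, b, c, d).

    rows plays the role of kappa1, cols the role of kappa2 (index sets).
    """
    a = b = c = d = 0
    for k, row in enumerate(table):
        for l, n_kl in enumerate(row):
            if k in rows and l in cols:
                a += n_kl
            elif k in rows:
                b += n_kl
            elif l in cols:
                c += n_kl
            else:
                d += n_kl
    return a, b, c, d


def chi_square(a, b, c, d):
    """Pearson chi-square of a fourfold table (assumed form, no correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a zero marginal: treat the table as showing no dependence
    return n * (a * d - b * c) ** 2 / denom


def kl_collaps_sketch(table, critical_value):
    """Return (statistic, rows, cols) for every pair exceeding critical_value."""
    n_rows, n_cols = len(table), len(table[0])
    hits = []
    for rows in nonempty_proper_subsets(n_rows):
        for cols in nonempty_proper_subsets(n_cols):
            stat = chi_square(*collapse(table, rows, cols))
            if stat > critical_value:
                hits.append((stat, sorted(rows), sorted(cols)))
    hits.sort(key=lambda hit: -hit[0])  # strongest interactions first
    return hits
```

This brute-force version examines (2^K − 2)(2^L − 2) pairs, which shows why restricting the search set matters for larger tables. Replacing κ1 or κ2 by its complement leaves the statistic unchanged, so an unrestricted search reports each interaction several times.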

To sum up, the pairs [κ1, κ2] in the output of KL-Collaps describe the most important “sources” of dependence underlying the given KL-hypothesis.

Files to download:
LM.KL.Collaps.zip 400.43 kB October 29, 2009

The original COLLAPS procedure was designed by D. Pokorný in the 1970s [Po 78] (see also [Ha 83]). KL-Collaps is a subset of the original COLLAPS. It was implemented by V. Lín in 2004.
