◤KDD procedures – SDCF-Miner◢
Milan Šimůnek, Jan Rauch
Jan Rauch (theory), Milan Šimůnek (software), Martin Kejkula (help)
The data mining procedure SDCF-Miner mines for patterns of the form (~ R) / (α, β, Cond). Here R is a categorical attribute with categories r_{1}, …, r_{K}, and α, β and Cond are Boolean attributes.
The procedure deals with data matrices. The attribute R corresponds to a column of the analysed data matrix. Boolean attributes α, β and Cond are derived from the other columns of the data matrix.
The intuitive meaning of the pattern (~ R) / (α, β, Cond) is that the conditional frequencies of the categories of attribute R in the set α differ from those in the set β when the condition given by the Boolean attribute Cond is satisfied.
The symbol ~ is called the SDCF-quantifier. It corresponds to a condition imposed on two vectors of frequencies of the particular categories of R. The pattern (~ R) / (α, β, Cond) is verified by applying this condition to the vectors of frequencies of the categories of R in the data matrices M / α / Cond and M / β / Cond.
Here M is the analysed data matrix and M / α is the data matrix consisting of all rows of M that satisfy the Boolean attribute α. Further, M / α / Cond is the data matrix consisting of all rows of M / α that satisfy the condition Cond. Note that M / α / Cond can be understood as M / (α ∧ Cond). The matrix M / β / Cond is defined analogously.
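The verification described above can be illustrated with a minimal Python sketch. It is not the LISp-Miner implementation. The toy data matrix, the attribute names (sex, city), the helper functions and in particular the quantifier max_diff_quantifier with its threshold are all assumptions made for illustration; the actual SDCF-quantifiers offered by the procedure are not specified here.

```python
from collections import Counter

def frequencies(rows, r_attr, categories, *conditions):
    """Frequency vector of the categories of r_attr over the rows
    that satisfy every given Boolean attribute (predicate)."""
    counts = Counter(
        row[r_attr] for row in rows if all(cond(row) for cond in conditions)
    )
    return [counts[c] for c in categories]

def relative(freqs):
    """Turn absolute frequencies into relative ones (all zeros if empty)."""
    total = sum(freqs)
    return [f / total for f in freqs] if total else [0.0] * len(freqs)

def max_diff_quantifier(freq_a, freq_b, threshold):
    """Hypothetical SDCF-quantifier (an assumption, not taken from the source):
    true if some category's relative frequencies differ by at least threshold."""
    rel_a, rel_b = relative(freq_a), relative(freq_b)
    return max(abs(x - y) for x, y in zip(rel_a, rel_b)) >= threshold

# Toy data matrix M; each dict is one row.
M = [
    {"R": "low",  "sex": "F", "city": "Prague"},
    {"R": "low",  "sex": "F", "city": "Prague"},
    {"R": "high", "sex": "F", "city": "Prague"},
    {"R": "high", "sex": "M", "city": "Prague"},
    {"R": "high", "sex": "M", "city": "Prague"},
    {"R": "low",  "sex": "M", "city": "Prague"},
    {"R": "low",  "sex": "M", "city": "Brno"},
    {"R": "high", "sex": "F", "city": "Brno"},
]

categories = ["low", "high"]                # r_1, ..., r_K
alpha = lambda row: row["sex"] == "F"       # Boolean attribute alpha
beta = lambda row: row["sex"] == "M"        # Boolean attribute beta
cond = lambda row: row["city"] == "Prague"  # Boolean attribute Cond

# Frequencies of R in M / alpha / Cond and in M / beta / Cond.
freq_alpha = frequencies(M, "R", categories, alpha, cond)
freq_beta = frequencies(M, "R", categories, beta, cond)

pattern_holds = max_diff_quantifier(freq_alpha, freq_beta, threshold=0.3)
```

Passing both alpha and cond as predicates to the same filter mirrors the remark that M / α / Cond can be read as M / (α ∧ Cond).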
LISp-Miner.Core.OldUI.zip | 33.45 MB | August 13, 2014 |
Legacy LISp-Miner system core files, separated into modules for each GUHA procedure. Also contains the other legacy modules LMAdmin and LMDataSource. |
The procedure SDCF-Miner was suggested by J. Rauch in 2003. The first reason was the need to mine for patterns of the form (~ R) / (α, β, Cond). The second reason was the possibility of reusing the software tools for dealing with strings of bits developed for the 4ft-Miner procedure. The suggestion was published in [RS 04].
The first version of the procedure SDCF-Miner was implemented by M. Šimůnek.
