FP-growth algorithm
Definition
XXXXXXXXX
French
algorithme FP-growth
English
FP-growth algorithm
FP stands for frequent pattern.[23]
In the first pass, the algorithm counts the occurrences of items (attribute-value pairs) in the dataset of transactions, and stores these counts in a 'header table'. In the second pass, it builds the FP-tree structure by inserting transactions into a trie.
Before insertion, the items in each transaction must be sorted in descending order of their frequency in the dataset so that the tree can be processed quickly. Items that do not meet the minimum support requirement are discarded. If many transactions share their most frequent items, the FP-tree provides high compression close to the tree root.
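The two passes can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names `FPNode` and `build_fp_tree`, the sample transactions, and the alphabetical tie-break between equally frequent items are all assumptions made for the example.

```python
from collections import Counter


class FPNode:
    """One trie node: an item, how many transactions pass through it, and its children."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}  # item -> FPNode


def build_fp_tree(transactions, min_support):
    # Pass 1: header table with the number of transactions containing each item.
    header = Counter(item for t in transactions for item in set(t))
    header = {item: n for item, n in header.items() if n >= min_support}

    # Pass 2: drop infrequent items, sort the rest by descending support
    # (ties broken alphabetically, an arbitrary choice), and insert into the trie.
    root = FPNode()
    for t in transactions:
        kept = sorted((i for i in set(t) if i in header), key=lambda i: (-header[i], i))
        node = root
        for item in kept:
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root, header


# Hypothetical sample data: five small transactions.
transactions = [["a", "b"], ["b", "c", "d"], ["a", "b", "d"], ["a", "b", "c"], ["b", "c"]]
root, header = build_fp_tree(transactions, min_support=2)
```

In this sample every transaction contains `b`, so the root has a single child with count 5 that stores the shared prefix once; this is the compression near the root described above.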
Recursive processing of this compressed version of the main dataset grows frequent itemsets directly, instead of generating candidate itemsets and testing them against the entire database (as in the Apriori algorithm).
Growth begins from the bottom of the header table, i.e. the item with the smallest support, by finding all sorted transactions that end in that item. Call this item I.
A new conditional tree is created, which is the original FP-tree projected onto I. The supports of all nodes in the projected tree are re-counted, with each node getting the sum of its children's counts. Nodes (and hence subtrees) that do not meet the minimum support are pruned. Recursive growth ends when no individual items conditional on I meet the minimum support threshold. The resulting paths from the root to I will be frequent itemsets. After this step, processing continues with the next least-supported header item of the original FP-tree.
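The recursive growth step can be sketched as below. This is one common way to write it, under stated assumptions: instead of pruning a copy of the original tree, each projection is rebuilt from the prefix paths that lead to the chosen item (its conditional pattern base). The names `Node`, `build_tree`, `fp_growth`, and `links`, and the sample transactions, are illustrative and not taken from any library.

```python
from collections import defaultdict


class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_tree(weighted_paths, min_support):
    """Build an FP-tree from (items, count) pairs; return node links and supports."""
    support = defaultdict(int)
    for items, count in weighted_paths:
        for item in set(items):
            support[item] += count
    support = {i: s for i, s in support.items() if s >= min_support}

    root = Node(None, None)
    links = defaultdict(list)  # item -> every tree node that holds this item
    for items, count in weighted_paths:
        kept = sorted((i for i in set(items) if i in support),
                      key=lambda i: (-support[i], i))
        node = root
        for item in kept:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                links[item].append(child)
            child.count += count
            node = child
    return links, support


def fp_growth(weighted_paths, min_support, suffix=frozenset()):
    """Yield (itemset, support) for every frequent itemset that extends `suffix`."""
    links, support = build_tree(weighted_paths, min_support)
    # Start with the least-supported item, i.e. the bottom of the header table.
    for item in sorted(support, key=lambda i: (support[i], i)):
        itemset = suffix | {item}
        yield itemset, support[item]
        # Conditional pattern base: the prefix path above each node holding `item`,
        # weighted by that node's count.
        conditional = []
        for node in links[item]:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Recurse on the projection; infrequent items are pruned in build_tree.
        yield from fp_growth(conditional, min_support, itemset)


# Hypothetical sample data, each transaction weighted 1.
transactions = [["a", "b"], ["b", "c", "d"], ["a", "b", "d"], ["a", "b", "c"], ["b", "c"]]
frequent = dict(fp_growth([(t, 1) for t in transactions], min_support=2))
```

With a minimum support of 2, `frequent` holds the four single items plus the pairs {a, b}, {b, c}, and {b, d}; no candidate itemset is ever generated and tested against the full transaction list.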
Once the recursive process has completed, all frequent itemsets will have been found, and association rule creation begins.[24]
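As a rough sketch of that final step, rules can be derived from the itemset supports alone. The text above does not say how rules are scored; the example assumes the usual confidence measure, support(A ∪ B) / support(A), and a hypothetical `min_confidence` threshold. The function name `association_rules` and the sample supports are also illustrative.

```python
from itertools import combinations


def association_rules(supports, min_confidence):
    """Return (antecedent, consequent, confidence) triples from itemset supports."""
    rules = []
    for itemset, support in supports.items():
        # Try every non-empty proper subset of the itemset as the antecedent.
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), size)):
                # Every subset of a frequent itemset is itself frequent,
                # so its support is present in the table.
                confidence = support / supports[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical supports for a five-transaction dataset.
supports = {
    frozenset("a"): 3, frozenset("b"): 5, frozenset("c"): 3, frozenset("d"): 2,
    frozenset("ab"): 3, frozenset("bc"): 3, frozenset("bd"): 2,
}
rules = association_rules(supports, min_confidence=0.8)
```

With these numbers, only the rules a → b, c → b, and d → b pass the 0.8 threshold, each with confidence 1.0; the reverse rules such as b → a (3/5) are filtered out.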
Contributors: Claire Gorjux, Imane Meziani, wiki