Mahout has a Top K Parallel FPGrowth implementation. It is based on the paper http://infolab.stanford.edu/~echang/recsys08-69.pdf, with some optimisations in mining the data.
Given a huge transaction list, the algorithm finds all unique features (sets of field values) and eliminates those whose frequency in the whole dataset is less than minSupport. For each of the N remaining features, it finds the top K closed patterns, generating a total of N x K patterns. The FPGrowth algorithm is a generic implementation, so any Object type can denote a feature, but the current implementation requires you to use a String as the object type. You may implement a version for another object type by creating Iterators, Convertors and TopKPatternWritable for that type. For more information, please refer to the package org.apache.mahout.fpm.pfpgrowth.convertors.string.
For example:
FPGrowth<String> fp = new FPGrowth<String>();
Set<String> features = new HashSet<String>();
fp.generateTopKStringFrequentPatterns(
    // transaction stream to mine
    new StringRecordIterator(new FileLineIterable(new File(input), encoding, false), pattern),
    // frequency list, built from a separate pass over the same input
    fp.generateFList(
        new StringRecordIterator(new FileLineIterable(new File(input), encoding, false), pattern),
        minSupport),
    minSupport,
    maxHeapSize,
    features,
    new StringOutputConvertor(new SequenceFileOutputCollector<Text, TopKStringPatterns>(writer))
);
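The first step described above (counting features and dropping those below minSupport) can be illustrated in plain Java without any Mahout classes. This is only a minimal sketch, not Mahout's code: the class name FListSketch and the method frequentFeatures are hypothetical, and it assumes that a feature repeated within one transaction is counted once and that the surviving features are ordered by descending frequency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Mahout.
public class FListSketch {

    // Counts in how many transactions each feature occurs, drops features
    // whose count is below minSupport, and returns the rest ordered by
    // descending count (ties broken alphabetically).
    static Map<String, Long> frequentFeatures(List<List<String>> transactions, long minSupport) {
        Map<String, Long> counts = new HashMap<>();
        for (List<String> transaction : transactions) {
            // Assumption: a feature repeated inside one transaction counts once.
            for (String feature : new HashSet<>(transaction)) {
                counts.merge(feature, 1L, Long::sum);
            }
        }

        List<Map.Entry<String, Long>> kept = new ArrayList<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (entry.getValue() >= minSupport) {
                kept.add(entry);
            }
        }
        kept.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : kept) {
            result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> transactions = Arrays.asList(
            Arrays.asList("milk", "bread", "butter"),
            Arrays.asList("milk", "bread"),
            Arrays.asList("milk", "eggs"));
        // With minSupport = 2, "butter" and "eggs" are eliminated.
        System.out.println(frequentFeatures(transactions, 2)); // prints {milk=3, bread=2}
    }
}
```

The top K closed patterns would then be mined only for the features that survive this pruning.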
The command line launcher for string transaction data, org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver, has other features, including specifying the regex pattern for splitting a string line of a transaction into its constituent features.
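To illustrate what splitting a transaction line with a regex pattern means, here is a small sketch using only java.util.regex. It does not show how FPGrowthDriver itself is implemented: the class SplitSketch, the method splitTransaction and the example pattern are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Mahout.
public class SplitSketch {

    // Splits one transaction line into features using the given pattern,
    // trimming whitespace and skipping empty tokens.
    static List<String> splitTransaction(String line, Pattern splitter) {
        List<String> features = new ArrayList<>();
        for (String token : splitter.split(line)) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) {
                features.add(trimmed);
            }
        }
        return features;
    }

    public static void main(String[] args) {
        // Example pattern (an assumption): split on runs of spaces, commas or tabs.
        Pattern splitter = Pattern.compile("[ ,\\t]+");
        System.out.println(splitTransaction("milk, bread\tbutter", splitter));
        // prints [milk, bread, butter]
    }
}
```

A different pattern, such as a single comma, would allow feature values that themselves contain spaces.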
Input files have to be in the following format.