Class MinHashingTransformer
java.lang.Object
    ai.libs.jaicore.ml.core.dataset.attribute.transformer.multivalue.MinHashingTransformer
All Implemented Interfaces:
ISingleAttributeTransformer
public class MinHashingTransformer extends java.lang.Object implements ISingleAttributeTransformer
Converts the sets of multi-value features to short signatures. First, the feature value is binarized, i.e. transformed into a 0/1 vector, and MinHashing is then applied to this vector. If two multi-value feature sets are very similar with respect to Jaccard similarity, the two signatures will also be similar with high probability, and that probability depends on the chosen signature length.
For a signature of length n, n permutations are created, and the i-th element of the signature is the index at which the i-th permutation finds the first 1 in the 0/1 vector.
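The procedure described above can be sketched in plain Java, independent of this class. This is a minimal illustration and not the library's implementation. The names MinHashSketch, signature and agreement are hypothetical. The sketch assumes that permutations[i][j] holds the vector index visited at step j of the i-th permutation, that the recorded value is the step j at which the first 1 is found, and that -1 marks an all-zero vector.

```java
import java.util.Arrays;

class MinHashSketch {

    /**
     * Computes a MinHash signature for a 0/1 vector.
     * Assumption: permutations[i][j] is the vector index visited at step j
     * of the i-th permutation.
     */
    static double[] signature(int[] binaryVector, int[][] permutations) {
        double[] signature = new double[permutations.length];
        for (int i = 0; i < permutations.length; i++) {
            signature[i] = -1; // assumption: -1 marks a vector without any 1
            for (int j = 0; j < permutations[i].length; j++) {
                if (binaryVector[permutations[i][j]] == 1) {
                    signature[i] = j; // step at which the first 1 was found
                    break;
                }
            }
        }
        return signature;
    }

    /** Fraction of positions at which two signatures of equal length agree. */
    static double agreement(double[] a, double[] b) {
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b[i]) {
                matches++;
            }
        }
        return (double) matches / a.length;
    }

    public static void main(String[] args) {
        int[][] permutations = {
            {0, 1, 2, 3},
            {3, 2, 1, 0},
            {2, 0, 3, 1}
        };
        double[] first = signature(new int[] {0, 1, 0, 1}, permutations);
        double[] second = signature(new int[] {0, 1, 1, 1}, permutations);
        System.out.println(Arrays.toString(first));  // prints [1.0, 0.0, 2.0]
        System.out.println(Arrays.toString(second)); // prints [1.0, 0.0, 0.0]
        System.out.println(agreement(first, second));
    }
}
```

With uniformly random permutations, the fraction of agreeing positions is an estimate of the Jaccard similarity of the two underlying sets, and longer signatures make this estimate less noisy.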
Constructor Summary
Constructors
Constructor    Description
MinHashingTransformer(int[][] permutations)    Constructor where the user gives predefined permutations.
MinHashingTransformer(int domainSize, int signatureLength, long seed)    Constructor where suitable permutations are created randomly.
Method Summary
Modifier and Type    Method    Description
double[]    transformAttribute(IAttributeValue<?> attributeToTransform)
Constructor Detail
MinHashingTransformer
public MinHashingTransformer(int[][] permutations)
Constructor where the user gives predefined permutations.
Parameters:
permutations - Predefined permutations. The number of permutations defines the length of the signature that MinHashing creates, and each permutation must have a length equal to the domain size.
MinHashingTransformer
public MinHashingTransformer(int domainSize, int signatureLength, long seed)
Constructor where suitable permutations are created randomly.
Parameters:
domainSize -
signatureLength -
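This page does not describe how the random permutations are drawn. As a rough illustration only, the sketch below shows one way to produce signatureLength seeded permutations of the indices 0 to domainSize - 1 using a Fisher-Yates shuffle. The class PermutationSketch and the method randomPermutations are hypothetical names and are not part of the library.

```java
import java.util.Random;

class PermutationSketch {

    /**
     * Creates signatureLength permutations of the indices 0..domainSize-1.
     * Assumption: a single java.util.Random seeded with the given seed
     * drives all shuffles, so the same seed reproduces the same result.
     */
    static int[][] randomPermutations(int domainSize, int signatureLength, long seed) {
        Random random = new Random(seed);
        int[][] permutations = new int[signatureLength][domainSize];
        for (int i = 0; i < signatureLength; i++) {
            // Start from the identity ordering.
            for (int j = 0; j < domainSize; j++) {
                permutations[i][j] = j;
            }
            // Fisher-Yates shuffle: swap each position with a random earlier one.
            for (int j = domainSize - 1; j > 0; j--) {
                int k = random.nextInt(j + 1);
                int tmp = permutations[i][j];
                permutations[i][j] = permutations[i][k];
                permutations[i][k] = tmp;
            }
        }
        return permutations;
    }

    public static void main(String[] args) {
        int[][] permutations = randomPermutations(5, 3, 42L);
        System.out.println(permutations.length + " permutations of length " + permutations[0].length);
    }
}
```

An array produced this way has the shape expected by the constructor that takes predefined permutations: one row per signature element, each row as long as the domain size.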
Method Detail
transformAttribute
public double[] transformAttribute(IAttributeValue<?> attributeToTransform)
Specified by:
transformAttribute in interface ISingleAttributeTransformer