NAME
Set::Similarity::BV - similarity measures for sets using fast bit
vectors (BV)
SYNOPSIS
use Set::Similarity::BV::Dice;
# object method
my $dice = Set::Similarity::BV::Dice->new;
my $similarity = $dice->similarity('af09ff','9c09cc');
# class method
my $class = 'Set::Similarity::BV::Dice';
my $class_similarity = $class->similarity('af09ff','9c09cc');
DESCRIPTION
This is the base class. It mainly provides helper and convenience methods.
Use one of the child classes:
Set::Similarity::BV::Cosine
Set::Similarity::BV::Dice
Set::Similarity::BV::Jaccard
Set::Similarity::BV::Overlap
Overlap coefficient
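The overlap coefficient is the size of the intersection divided by the
size of the smaller set, i.e.
( A intersect B ) / min(A,B)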
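As a minimal sketch of the formula above, assume each set is encoded as a single integer bit mask in which every set bit is one member. The function names popcount and overlap are hypothetical and not part of this module, and returning 0 for an empty set is an assumption of the sketch.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Count the set bits of a non-negative integer.
sub popcount {
    my ($int) = @_;
    my $binary = sprintf '%b', $int;
    return ($binary =~ tr/1//);
}

# Overlap coefficient of two sets, each encoded as one integer bit mask.
sub overlap {
    my ($mask_a, $mask_b) = @_;
    my $smaller = min(popcount($mask_a), popcount($mask_b));
    return 0 unless $smaller;    # assumption: an empty set yields 0
    return popcount($mask_a & $mask_b) / $smaller;
}

printf "%.3f\n", overlap(0b1011, 0b0011);    # prints 1.000
```

The result is 1 here because the smaller set is a subset of the larger one.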
Jaccard Index
The Jaccard coefficient measures similarity between sample sets, and is
defined as the size of the intersection divided by the size of the
union of the sample sets:
( A intersect B ) / (A union B)
The Tanimoto coefficient is the ratio of the number of features common
to both sets to the total number of features, i.e.
( A intersect B ) / ( A + B - ( A intersect B ) ) # the same as Jaccard
The range is 0 to 1 inclusive.
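The two formulas above give the same value, which a short sketch can check. It assumes each set is encoded as a single integer bit mask. The function names popcount, jaccard and tanimoto are hypothetical and not part of this module, and returning 0 for two empty sets is an assumption of the sketch.

```perl
use strict;
use warnings;

# Count the set bits of a non-negative integer.
sub popcount {
    my ($int) = @_;
    my $binary = sprintf '%b', $int;
    return ($binary =~ tr/1//);
}

# Jaccard index: size of the intersection over size of the union.
sub jaccard {
    my ($mask_a, $mask_b) = @_;
    my $union = popcount($mask_a | $mask_b);
    return 0 unless $union;    # assumption: two empty sets yield 0
    return popcount($mask_a & $mask_b) / $union;
}

# Tanimoto form: the union size is computed as A + B - intersection.
sub tanimoto {
    my ($mask_a, $mask_b) = @_;
    my $common = popcount($mask_a & $mask_b);
    my $total  = popcount($mask_a) + popcount($mask_b) - $common;
    return 0 unless $total;    # assumption: two empty sets yield 0
    return $common / $total;
}

# prints 0.25 0.25
printf "%.2f %.2f\n", jaccard(0b1011, 0b0110), tanimoto(0b1011, 0b0110);
```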
Dice coefficient
The Dice coefficient is the number of features common to both sets
relative to the average size of the two sets, i.e.
( A intersect B ) / ( 0.5 * ( A + B ) ) # the same as Sorensen
The weighting factor comes from the 0.5 in the denominator. The range
is 0 to 1.
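A minimal sketch of the Dice formula, assuming each set is encoded as a single integer bit mask. Dividing by half the combined size is the same as multiplying the intersection by two, which is the form used here. The function names popcount and dice are hypothetical and not part of this module, and returning 0 for two empty sets is an assumption of the sketch.

```perl
use strict;
use warnings;

# Count the set bits of a non-negative integer.
sub popcount {
    my ($int) = @_;
    my $binary = sprintf '%b', $int;
    return ($binary =~ tr/1//);
}

# Dice coefficient: intersection / (0.5 * (A + B)) == 2 * intersection / (A + B).
sub dice {
    my ($mask_a, $mask_b) = @_;
    my $combined = popcount($mask_a) + popcount($mask_b);
    return 0 unless $combined;    # assumption: two empty sets yield 0
    return 2 * popcount($mask_a & $mask_b) / $combined;
}

printf "%.1f\n", dice(0b1011, 0b0110);    # prints 0.4
```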
METHODS
All methods can be used as class or object methods.
new
$object = Set::Similarity::BV->new();
similarity
my $similarity = $object->similarity($hex1,$hex2);
$hex1 and $hex2 are strings of hexadecimal characters.
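To picture how a hexadecimal string can act as a bit vector: every hex digit carries four bits. The sketch below splits a string into one integer per digit, which yields an array reference of integers of the kind the other methods accept. The helper name hex_to_integers and the one-digit chunk size are assumptions for illustration; the module may split the string into larger chunks.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a hex string into an array reference of
# integers, one integer (0..15) per hex digit.
sub hex_to_integers {
    my ($hex) = @_;
    die "not a hex string: $hex\n" unless $hex =~ /\A[0-9a-fA-F]*\z/;
    return [ map { hex($_) } split //, $hex ];
}

my $integers = hex_to_integers('af09ff');
print join(',', @$integers), "\n";    # prints 10,15,0,9,15,15
```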
from_integers
my $similarity = $object->from_integers($AoI1,$AoI2);
Croaks if called directly. This method should be implemented in a child
module.
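The division of labour between base class and child class can be sketched with stand-in packages. Everything named My::Similarity::* below is hypothetical and written only for this illustration; it is not the module's actual code, and the helper bodies are assumptions about how such helpers could work.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a base class with helpers and an abstract method.
package My::Similarity::Base {
    use Carp qw(croak);

    sub new { my ($class) = @_; return bless {}, $class }

    # Abstract: child classes are expected to override this.
    sub from_integers { croak 'from_integers() must be implemented in a child class' }

    # Number of bits set in one integer.
    sub bits {
        my ($self, $int) = @_;
        my $binary = sprintf '%b', $int;
        return ($binary =~ tr/1//);
    }

    # Bits set in both vectors, summed over corresponding chunks.
    sub intersection {
        my ($self, $aoi1, $aoi2) = @_;
        my $last = $#$aoi1 < $#$aoi2 ? $#$aoi1 : $#$aoi2;
        my $size = 0;
        $size += $self->bits($aoi1->[$_] & $aoi2->[$_]) for 0 .. $last;
        return $size;
    }

    # Bits set in the first vector plus bits set in the second.
    sub combined_length {
        my ($self, $aoi1, $aoi2) = @_;
        my $size = 0;
        $size += $self->bits($_) for @$aoi1, @$aoi2;
        return $size;
    }
}

# Hypothetical child class supplying the Dice formula.
package My::Similarity::Dice {
    our @ISA = ('My::Similarity::Base');

    sub from_integers {
        my ($self, $aoi1, $aoi2) = @_;
        my $combined = $self->combined_length($aoi1, $aoi2);
        return 0 unless $combined;    # assumption: two empty sets yield 0
        return 2 * $self->intersection($aoi1, $aoi2) / $combined;
    }
}

package main;

printf "%.1f\n", My::Similarity::Dice->new->from_integers([0b1011], [0b0110]);    # prints 0.4
```

Because the helpers never touch object state, they work the same whether they are called on an object or on the class name.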
intersection
my $intersection_size = $object->intersection($AoI1,$AoI2);
$AoI1 and $AoI2 are array references of integers. Returns the size of
the intersection.
combined_length
my $set_size_sum = $object->combined_length($AoI1,$AoI2);
$AoI1 and $AoI2 are array references of integers. Returns the sum of
the sizes of both sets.
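The two quantities described above, the intersection size and the summed set sizes, can be sketched for bit vectors stored as array references of integers. The function names popcount, intersection_size and combined_size are hypothetical, and the one-integer-per-hex-digit layout of the sample data is an assumption for illustration, not necessarily the module's internal layout.

```perl
use strict;
use warnings;

# Count the set bits of a non-negative integer.
sub popcount {
    my ($int) = @_;
    my $binary = sprintf '%b', $int;
    return ($binary =~ tr/1//);
}

# Bits set in both vectors, summed over corresponding chunks.
sub intersection_size {
    my ($aoi1, $aoi2) = @_;
    my $last = $#$aoi1 < $#$aoi2 ? $#$aoi1 : $#$aoi2;
    my $size = 0;
    $size += popcount($aoi1->[$_] & $aoi2->[$_]) for 0 .. $last;
    return $size;
}

# Bits set in the first vector plus bits set in the second.
sub combined_size {
    my ($aoi1, $aoi2) = @_;
    my $size = 0;
    $size += popcount($_) for @$aoi1, @$aoi2;
    return $size;
}

# 'af09ff' and '9c09cc' written as one integer per hex digit
my $set1 = [10, 15, 0, 9, 15, 15];
my $set2 = [ 9, 12, 0, 9, 12, 12];
print intersection_size($set1, $set2), "\n";    # prints 9
print combined_size($set1, $set2), "\n";        # prints 26
```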
min
my $min = $object->min($int1,$int2);
Returns the smaller of the two integers.
bits
my $bits = $object->bits($int);
Returns the number of bits set in the integer $int.
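Counting set bits (a population count) can be done in several ways in core Perl. The sketch below shows two of them; the function names are hypothetical and neither is claimed to be what the module uses internally. The unpack variant relies on the checksum feature of unpack and, as written, assumes the integer fits in 32 bits.

```perl
use strict;
use warnings;

# Count set bits via the binary string representation.
sub bits_by_string {
    my ($int) = @_;
    my $binary = sprintf '%b', $int;
    return ($binary =~ tr/1//);
}

# Count set bits via unpack's checksum feature (32-bit integers only).
sub bits_by_unpack {
    my ($int) = @_;
    return unpack '%32b*', pack('N', $int);
}

for my $int (0, 1, 0xff, 0xaf09ff) {
    printf "%#x has %d bits set\n", $int, bits_by_string($int);
}
```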
SEE ALSO
Set::Similarity::BV::Cosine
Set::Similarity::BV::Dice
Set::Similarity::BV::Jaccard
Set::Similarity::BV::Overlap
SOURCE REPOSITORY
http://github.com/wollmers/Set-Similarity-BV
AUTHOR
Helmut Wollmersdorfer
COPYRIGHT AND LICENSE
Copyright (C) 2016 by Helmut Wollmersdorfer
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.