Calculates the significance of a term found in a sample drawn from a larger superset
The frequency of the term in the selected sample
The size of the selected sample (typically number of docs)
The frequency of the term in the superset from which the sample was taken
The size of the superset from which the sample was taken (typically number of docs)
A "significance" score for the term
A heuristic used to calculate the significance of a term in a subset
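
As a rough illustration of how the four inputs described above can be combined into a score, here is a minimal sketch. It is not the heuristic this documentation belongs to: the function name `significance_score`, the parameter names, and the formula (absolute change in term rate multiplied by relative change) are all assumptions chosen for the example.

```python
def significance_score(subset_freq, subset_size, superset_freq, superset_size):
    """Illustrative heuristic (an assumption, not the documented one):
    absolute change in term rate multiplied by relative change."""
    if min(subset_freq, subset_size, superset_freq, superset_size) < 0:
        raise ValueError("frequencies and sizes must be non-negative")

    # Without a background rate there is nothing to compare against.
    if subset_size == 0 or superset_size == 0 or superset_freq == 0:
        return 0.0

    subset_rate = subset_freq / subset_size        # how common the term is in the sample
    superset_rate = superset_freq / superset_size  # how common it is in the superset

    absolute_change = subset_rate - superset_rate
    if absolute_change <= 0:
        # The term is no more common in the sample than in the superset.
        return 0.0

    relative_change = subset_rate / superset_rate
    return absolute_change * relative_change


# A term in 50 of 100 sampled documents but only 100 of 10,000 overall
# scores about 24.5 under this sketch.
print(significance_score(50, 100, 100, 10_000))
```

Multiplying the two changes is one possible design choice: it favours terms that are both much more frequent in the sample in absolute terms and rare in the superset, so very common and very rare terms do not dominate the ranking.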