List of top term objects
Tokenized documents from which top terms were found.
Minimum number of documents that must contain a sequence for it to be considered strong.
A list of the top terms and/or phrases, sorted by score. A phrase is an array containing more than one term and is ranked by its highest-scoring term.
Once statistically significant terms have been discovered, it is often useful to reorganise them into more readable groups. For example, the keywords musk, doge and elon read better when elon and musk are placed next to each other, in that order, because that is how they typically appear in the text.
This function detects common token sequences within a set of tokenized documents and reorders the terms to follow those sequences while preserving their overall significance ranking.
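The behaviour described above can be illustrated with a minimal Python sketch. It is not the actual implementation: the function name `group_terms_by_sequence`, the `(term, score)` input shape, the restriction to directly adjacent tokens, and the greedy chaining strategy are all assumptions made for illustration.

```python
from collections import Counter


def group_terms_by_sequence(top_terms, tokenized_docs, min_doc_count=2):
    """Hypothetical sketch: group top terms into phrases by common adjacency.

    top_terms: list of (term, score) tuples (assumed shape).
    tokenized_docs: list of token lists.
    min_doc_count: documents that must contain a pair for it to be "strong".
    """
    scores = dict(top_terms)

    # Count documents (not occurrences) in which top term a directly precedes b.
    pair_docs = Counter()
    for tokens in tokenized_docs:
        pairs = {
            (a, b)
            for a, b in zip(tokens, tokens[1:])
            if a in scores and b in scores and a != b
        }
        pair_docs.update(pairs)
    strong = {pair: n for pair, n in pair_docs.items() if n >= min_doc_count}

    def best_neighbour(term, used, forward):
        # Most frequent unused term that follows (or precedes) the given term.
        best, best_count = None, 0
        for (a, b), n in strong.items():
            src, dst = (a, b) if forward else (b, a)
            if src == term and dst not in used and n > best_count:
                best, best_count = dst, n
        return best

    phrases, used = [], set()
    # Seeding in descending score order keeps phrases ranked by their
    # highest-scoring term.
    for term in sorted(scores, key=scores.get, reverse=True):
        if term in used:
            continue
        phrase = [term]
        used.add(term)
        while (nxt := best_neighbour(phrase[-1], used, True)) is not None:
            phrase.append(nxt)
            used.add(nxt)
        while (prev := best_neighbour(phrase[0], used, False)) is not None:
            phrase.insert(0, prev)
            used.add(prev)
        phrases.append(phrase)
    return phrases


example_terms = [("musk", 9.0), ("doge", 7.5), ("elon", 6.0)]
example_docs = [
    ["elon", "musk", "tweets", "about", "doge"],
    ["doge", "rallies", "after", "elon", "musk", "post"],
    ["musk", "and", "doge"],
]
print(group_terms_by_sequence(example_terms, example_docs, min_doc_count=2))
# [['elon', 'musk'], ['doge']]
```

In this sketch, elon directly precedes musk in two documents, so the pair meets the threshold and is emitted as a single phrase. The phrase still ranks first because musk is the highest-scoring term. Raising `min_doc_count` to 3 would leave all three terms ungrouped in their original score order.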