Near Duplicates

This IPRO Analytics function identifies groups of textual near-duplicates in item text. Textual near-duplicates are items in which most of the text also appears, in the same order, in the other items in the group.
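To make the idea of "most of the text, in the same order" concrete, here is a minimal Python sketch. It is not IPRO's algorithm, which this page does not describe. It assumes a simple word-level comparison using the standard library's `difflib.SequenceMatcher`; the function name `word_overlap_score` and the sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def word_overlap_score(text_a: str, text_b: str) -> float:
    """Return a similarity between 0 and 1 based on words shared in the same order.

    Illustrative only: this is an assumed stand-in, not the IPRO scoring method.
    """
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False keeps frequently repeated words in the comparison.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    return matcher.ratio()


# Hypothetical item text, made up for this example.
original = "The quarterly report is due on Friday and must include all sales figures"
revised = "The quarterly report is due on Monday and must include all sales figures"
unrelated = "Lunch will be served in the main conference room at noon"

near_score = word_overlap_score(original, revised)
far_score = word_overlap_score(original, unrelated)

print(f"original vs revised:   {near_score:.2f}")
print(f"original vs unrelated: {far_score:.2f}")
```

In this sketch, the two versions of the report sentence differ by one word and score close to 1, so they would be candidates for the same near-duplicate group. The unrelated sentence shares almost no ordered text and scores close to 0.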

The metadata of each item identified as part of a near-duplicate group is updated with a group ID that identifies the near-duplicate group to which the item belongs. See the Near Duplicates panel.

For Analytics:

The documents are grouped by IA ND Family.

The documents are sorted by IA ND Sort.

The selected fields are BEGDOC, IA ND Family, IA ND Sort, IA ND Words, and IA ND Score.
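The grouping and sorting described above can be pictured with a short Python sketch over exported rows. The field names BEGDOC, IA ND Family, and IA ND Sort come from this page; every value below is hypothetical, the value types are an assumption, and IA ND Words and IA ND Score are left out because this page does not define their contents.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows; the values are invented for illustration only.
rows = [
    {"BEGDOC": "DOC0004", "IA ND Family": 2, "IA ND Sort": 2},
    {"BEGDOC": "DOC0001", "IA ND Family": 1, "IA ND Sort": 1},
    {"BEGDOC": "DOC0003", "IA ND Family": 2, "IA ND Sort": 1},
    {"BEGDOC": "DOC0002", "IA ND Family": 1, "IA ND Sort": 2},
]

# Order by family first so group members are adjacent, then by the sort field.
ordered = sorted(rows, key=itemgetter("IA ND Family", "IA ND Sort"))

# Collect the BEGDOC values of each near-duplicate family in sort order.
groups = {
    family: [row["BEGDOC"] for row in members]
    for family, members in groupby(ordered, key=itemgetter("IA ND Family"))
}

for family, begdocs in groups.items():
    print(family, begdocs)
```

Sorting before calling `groupby` matters because `groupby` only merges adjacent rows that share a key.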

