Searching Confidence Scores

A Confidence Score is assigned to each data set classified for PCI, PHI, or PII, and is visible in the Results View. For the first result shown below, the model is 80% confident that the item contains PCI. See Viewing Classified Data.

You can then perform a Message, Attachment, Document, Federated, or Advanced Search to look for results. Until you know what the search returns, leave the Confidence Score Range at the default setting, which casts the widest net. Review the results and note how many are false positives.

You can then fine-tune the confidence range to find what you are looking for. Remember that a range that starts too low returns more false positives, and a range that starts too high risks missing results. It's all about finding the sweet spot within your data.
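The trade-off can be illustrated with a small Python sketch. It is not part of the product: the function name `evaluate_lower_bound`, the sample scores, and the true/false labels are all hypothetical, and it assumes you have already reviewed a handful of items by hand so you know which ones are genuine matches.

```python
from typing import List, Tuple

# Hypothetical reviewed sample: (confidence score in percent, is it a genuine match?)
Item = Tuple[int, bool]


def evaluate_lower_bound(items: List[Item], lower_bound: int) -> Tuple[int, int]:
    """Return (false_positives, missed_matches) for a given lower bound.

    An item is "returned" when its score is at or above the lower bound.
    """
    # Returned items that are not genuine matches
    false_positives = sum(
        1 for score, genuine in items if score >= lower_bound and not genuine
    )
    # Genuine matches that fall below the lower bound and are therefore not returned
    missed = sum(
        1 for score, genuine in items if genuine and score < lower_bound
    )
    return false_positives, missed


# Made-up data for illustration only
items = [(95, True), (88, True), (72, True), (70, False), (55, False), (40, False)]

for lower_bound in (30, 70, 90):
    print(lower_bound, evaluate_lower_bound(items, lower_bound))
```

With this made-up sample, a lower bound of 30 returns three false positives and misses nothing, while a lower bound of 90 returns no false positives but misses two genuine matches.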

Suggested method for fine-tuning the range

  • Bring the bottom slider to 80%, rerun classification, and review 2 results.
  • Bring the bottom slider to 75%, rerun classification, and review 2 results.
  • Keep lowering the range in increments of 5% until the results are mostly not what you are looking for.
  • Return the bottom slider to the last 5% increment where most of the results are classified correctly.

This method will give you a good indication of the confidence range you should use.
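The suggested method can also be written out as a short Python sketch. This is only an illustration of the procedure, not a product API: `tune_lower_bound` and `sample_results` are hypothetical names, the review verdicts are made up, and "most results are correct" is assumed here to mean more than half of the reviewed results.

```python
from typing import Callable, Optional, Sequence


def tune_lower_bound(
    sample_results: Callable[[int], Sequence[bool]],
    start: int = 80,
    step: int = 5,
    floor: int = 0,
) -> Optional[int]:
    """Lower the bottom of the range in steps and return the last bound
    at which most reviewed results were classified correctly.

    sample_results(bound) stands in for the manual step: set the bottom
    slider to `bound`, rerun, review a few results, and report each one
    as True (correct) or False (not what you are looking for).
    Returns None if even the starting bound fails the check.
    """
    last_good: Optional[int] = None
    bound = start
    while bound >= floor:
        verdicts = sample_results(bound)
        correct = sum(verdicts)
        # Assumption: stop as soon as half or fewer of the reviewed results are correct
        if correct * 2 <= len(verdicts):
            break
        last_good = bound
        bound -= step
    return last_good


# Made-up review outcomes, two results per setting as in the steps above
reviews = {
    80: [True, True],
    75: [True, True],
    70: [True, False],
    65: [False, False],
}

best = tune_lower_bound(lambda bound: reviews[bound])
print(best)
```

In this made-up run, the results at 70% are no longer mostly correct, so the sketch settles on 75% as the bottom of the range.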

Once you have decided on a new confidence range, return to the Search and make the adjustment.

For all details on searching, see About Searching.