Work with the QC Flags Window

The QC Flags tab is located at the bottom right side of the eCapture QC interface (default view). The eCapture QC module comes with numerous default flags that have system-assigned numbers. Most of the system flags cannot be modified. Users can also create additional flags that have user-assigned numbers.

As you navigate through the documents, you may see some flags with a red shaded box indicating that the selected document was "flagged." Flagging a document/page means that you are assigning a condition to that document/page such as "Passed QC", "Threshold Exceeded", "ZeroByte", and so on.

Tip: You can use keyboard shortcuts to toggle flags assigned to documents. QC Flags are numbered sequentially on the Flags tab. To use a keyboard shortcut, press CTRL + the number of the flag. For example, to apply the Inline Images flag (16), press CTRL, press 1, press 6, and release. Repeat the shortcut to remove the flag.
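
The toggle behavior described in the Tip can be pictured with a minimal Python sketch. This is only an illustration of the apply/remove semantics; the function name toggle_flag and the use of a set to hold a document's flags are assumptions made for this example and are not part of eCapture.

```python
from typing import Set

def toggle_flag(assigned: Set[int], flag_number: int) -> Set[int]:
    """Hypothetical helper: return a new set with flag_number added if it
    was absent, or removed if it was already assigned."""
    return assigned ^ {flag_number}

# The first use of the shortcut applies the flag; repeating it removes the flag.
after_first_press = toggle_flag(set(), 16)
after_second_press = toggle_flag(after_first_press, 16)
```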

The flags that display on the Flags tab may differ according to the data set.

Unmodifiable System Flags

The list of the default flags that ship with eCapture QC is as follows:

  1. Passed QC: The document has passed QC. The item in the Documents pane is green.

  2. Exception: This document did not process successfully. The item in the Documents pane is red.

  3. Threshold Exceeded: The number of pages in the document exceeds the Max Pages (count) value specified in the Job's Flex Processor rule.

  4. Text Missing: The document contains pages with no extracted text.

  5. ZeroByte: The file contains no data.

  6. Unprocessable: Items that could not be extracted from .ZIP files, e-mail messages, and so on.

  7. Page Count Discrepancy: Results.Pages differs from the number of pages on disk.

  8. One Page: The document was successfully processed and resulted in only one page. Not applicable for protected Office documents.

  9. QC Deleted Page: Set when a page is deleted from a document while in eCapture QC.

  1. Text Cutoff: Set when the system detects text that was cut off at the right margin when processed.

  2. Alternate Text Extraction: OCR was used to extract text from an image file.

  3. Protected: Set when encountering encrypted documents with protected sections, or password protected files for Office and PDF documents.

  4. Untyped Embedded Control: The Lotus Notes document had an untyped embedded control that had to be replaced for successful processing.

  5. Generated Page Based Text: The document contained Arabic or a related language, so page text was simulated from data extraction of the document.

  6. Inline Images: Set for all emails that contain inline images. Other embedded content, such as files, is not identified by this flag. As of version 2018.5.2, this flag identifies Microsoft Office files (Word, Excel, and PowerPoint) that contain embedded (inline) images. If necessary, use the native reprocessing functionality in the QC module to manually resize the cutoff images, or reprint the message.

    Note: If the Discovery option, Treat email Inline Images as Attachments, is enabled, the inline images are extracted during Discovery and can be OCRed during Processing or Data Extract operations. Text in the inline images will not be lost even if the images are cut off in the emails’ TIFF images.

  7. Missing Document Form: Set when encountering Lotus Notes (NSF) processing errors regarding custom forms. Reverts to use the default Lotus form.

  8. Text not Extractable: Text cannot be extracted from all pages of a document, before any OCR is attempted.

  9. OCR Failure: Set when one or more OCR errors are encountered in images and PDFs.

  10. MHT Failure: Indicates that the native file will be exported if MHT or RTF is selected.

  11. Collapsed Section: Indicates that all sections were expanded in Lotus Notes before printing.

  12. Restricted Office Document: Indicates that a Word or PowerPoint document has restricted fonts. For a Word document, the text data is extracted using Oracle (formerly Stellent). For a PowerPoint document, Oracle is used to print it. In either case, this flag is set so the document may be natively reprocessed.

  13. Legacy Lotus Notes Handling: Indicates either that the processing option Use Legacy Handling (Print from Lotus UI) was selected, or that this Lotus Notes file could not be processed with the current method to extract metadata and is passed off to the Use Legacy Handling (Print from Lotus UI) method.

  14. Lotus Notes Custom Form: Indicates that this Lotus Notes file is a Custom Form that could not be processed with the current method to extract metadata and is passed off to the Use Legacy Handling (Print from Lotus UI) method.

  15. Lotus Notes Imaged Through Word: Indicates that the generated RTF image files of the Lotus Notes document were printed through Microsoft Word.

  16. Lotus Notes DAOS Attachment: Indicates that the document contains a Domino Attachment and Object Service (DAOS) attachment. Set for both Medium Speed and Low Speed (Legacy) processing.

  17. Lotus Notes Encrypted Field: Indicates that the document contains encrypted items (messages and/or attachments).

  18. Lotus Notes Legacy Candidate: Set when the Lotus Notes Legacy handling mode was not selected for Discovery, Processing, or Data Extract and the property EsiThrowOnInvalidTypes (located in the ConfigurationProperties database) is set to 0 (False). The default setting is 1 (True). This flag is set for any documents normally forced to Legacy due to the following Medium Speed checks being bypassed: Embedded Links, Layout Fields, and Embedded OLE documents.

  19. Lotus Notes High Speed Failure: Set when a Lotus Notes email encounters an error when sent through Data Extract using the High Speed method, and the Medium Speed method had to be used instead.

  20. Outlook Missing Recipients Table Entries: Indicates a document in which the recipients table was not populated. Contents are extracted for the recipient display fields.

  21. Email Cutoff Text Handling Failed: Differentiates between a successful attempt through "Email Cutoff Text Handling" and those items that never successfully obtain the cutoff text.

  22. Inline Image Exceeds Page Size: Indicates the possibility that the image size may be larger than the paper’s width or height.

  23. HTML Character Codes Detected: Indicates that HTML code was identified in the text representation of the body.

  24. Negative Text Coordinates Detected: Indicates that the text contains negative coordinates less than -25.

  25. Stellent Processed: Set if a document is data extracted or processed through Stellent. For example, if Office 2010 is on the Worker computer, then older versions of Office (97, 95, and so on) will go through Stellent rather than the native application. This is due to Microsoft limitations whereby Office 2010 cannot handle Office 97 and older documents. Applies to both Processing Jobs and Data Extract Jobs.

  26. Imported Images: Set for images when Image Files is selected as an item type in the eCapture Import Wizard. The flag is document based.

  27. Imported Text: Set for images when Document Text is selected as an item type in the eCapture Import Wizard. The flag is not set if all pages contain missing text and all pages are OCRed due to missing text. The flag is document based.

  28. Placeholder: Applied to documents that receive a placeholder through a Flex Processor rule or by user action in QC. The flag is cleared if the document is reprocessed through QC or post-process operations.

  1. Foreign Language: The document contains a language other than English.

  2. Email Body Contains Tables: Indicates that tables are found in the body of the email.

  3. PDF Crop Box: A PDF document that has at least one crop box that possibly contains hidden content.

  4. Date Field Exists: Set for PowerPoint documents in which a date exists within headers, footers, and body. Applies to Data Extract Job exports and Streaming Discovery export series direct to disk.

    Note: For Streaming Discovery, date time fields are detected within the contents of the PowerPoint documents (but not within the master slides).

  5. OCR Low Confidence: Set for documents that have a low average confidence level. The level is set for both Processing Jobs and Data Extract Jobs under General options.

  6. Possible Header Info In Body: Applied to Lotus Notes emails (for Processing Jobs and Data Extract Jobs) when the Message Header of an in-line email (a previous message in the email thread) is included in the body text when processing through standard (Medium Speed) mode.

  7. Embedded Document: Applied if an embedded file’s parent is not an email file.

  8. Unsupported HTML Font: The document contains the Wingdings font.

  9. XFA Form PDF: Set for LiveCycle PDF forms (XFA). These PDF forms are handled through both Streaming Discovery Jobs and regular Discovery Jobs, Data Extract Jobs, or Processing Jobs. The forms can be attached to emails or embedded in non-email files.

  10. Image Text Size Mismatch: Set if the extracted plain-text body size from a Data Extract Job differs from that of the imaged document during a Processing Job.

  11. Email No Body: The email either had no body, or the body was empty.

  12. Email No Body - Reextracted: The attached email is missing the body after extraction, but the body exists after re-extraction from parent message.

  13. Lotus Notes Edit Control Detected: Flagged when content in tabulated DXL forms is included in the output.

  14. PDF Comments: Set when a PDF contains comments.

  1. Word Revisions: The Word document contains revisions.

  2. Word Comments: The Word document contains comments.

  3. Word Hidden Text: The Word document contains hidden text.

  1. Excel Hidden Rows: The Excel document contains hidden rows.

  2. Excel Hidden Columns: The Excel document contains hidden columns.

  3. Excel AutoFilter: The Excel document has auto filter on.

  4. Excel Hidden Worksheets: The Excel document contains hidden worksheets.

  5. Excel Very Hidden Worksheets: The Excel document contains very hidden worksheets. This can be set only through programming.

  6. Excel Comments: The Excel document contains comments.

  7. Excel Protected Workbook: The Excel workbook is protected.

  8. Excel Pivot Table in Worksheet: The Excel worksheet contains a pivot table.

  9. Excel Protected Worksheet: The Excel worksheet is protected.

  1. PowerPoint Hidden Slides: The PowerPoint document contains hidden slides.

  2. PowerPoint Speaker Notes: The PowerPoint document contains speaker notes.

  1. OfficeLinked Content: Set if any linked content exists. Linked content includes hyperlinks and OLE linked files.

  2. Streaming Discovery Failure: Set if eCapture data extracts a document, and the Job is a Streaming Discovery Job that failed. The Job is reverted to eCapture for processing.

  3. Streaming Discovery Errors Forced Through Export: Identifies parent documents that have container-level errors that were forced through processing and Export (through Publish Errors). Applies to Streaming Discovery Jobs only.

  4. Password Applied: Applies to Streaming Discovery Jobs only, to files unlocked successfully with user-defined passwords. File types include Microsoft Word, Excel, PowerPoint, and Adobe PDF. The restriction-type QC flags describe how access to a native document is restricted by a password. The Password Applied QC flag is independent of them because it describes a processing activity: whether the system used a user-configured password to process the native file. This QC flag is not a property of a native document.

    Note: The Exception QC flag is not applied for documents flagged with the Password Applied QC flag.

  1. Protected_Document: The document is access-protected and cannot be viewed without first entering a password.

  2. Protected_Content: Content within the document is protected; generally, the file can be viewed without a password.

  3. Protected_Functionality: The document can be viewed in read-only mode without a password.

  4. Is Inline Image: Identifies items that are inline images.
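
Two of the flags above state their conditions precisely enough to express as simple checks: Negative Text Coordinates Detected (coordinates less than -25) and Lotus Notes Legacy Candidate (Legacy handling not selected, and EsiThrowOnInvalidTypes set to 0). The following Python sketch restates those conditions only. The function names, parameter names, and the would_be_forced_to_legacy input are hypothetical and chosen for illustration; they do not describe how eCapture itself evaluates the flags.

```python
from typing import Iterable

# Threshold stated in the Negative Text Coordinates Detected description.
NEGATIVE_COORDINATE_LIMIT = -25

def has_negative_text_coordinates(coords: Iterable[float]) -> bool:
    """True if any text coordinate is less than -25."""
    return any(c < NEGATIVE_COORDINATE_LIMIT for c in coords)

def is_legacy_candidate(
    legacy_mode_selected: bool,
    esi_throw_on_invalid_types: int,
    would_be_forced_to_legacy: bool,
) -> bool:
    """True when Legacy handling was not selected, EsiThrowOnInvalidTypes
    is 0 (False), and the document would normally be forced to Legacy
    (Embedded Links, Layout Fields, or Embedded OLE documents)."""
    return (
        not legacy_mode_selected
        and esi_throw_on_invalid_types == 0
        and would_be_forced_to_legacy
    )
```

With the default setting of 1 (True) for EsiThrowOnInvalidTypes, is_legacy_candidate returns False in this sketch, which matches the description that the flag is set only when the property is changed to 0.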

Modifiable System Flags

Three additional flags are included and can be modified. They are provided as examples of the custom-defined flags that organizations can add to the list of available flags to meet their own requirements. For more information, see Create or Delete User-Defined Flags.

  1. Low Priority: Flag this as a low-priority item.

  2. Medium Priority: Flag this as a medium-priority item.

  3. High Priority: Flag this as a high-priority item.

Related Topics

Create or Delete User-Defined Flags

Save the QC Interface Layout