Input

The input for Read from Documents is a single file or folder. This stage supports the following file types:
  • Text
  • PDF
  • Microsoft Outlook
  • Microsoft Word
  • HTML
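As a rough illustration of the "single file or folder" input, the sketch below gathers candidate files and labels each with one of the file types listed above. It is not part of the product: the function name collect_inputs and the extension mapping (for example .msg for Microsoft Outlook) are assumptions made for this example, and the stage itself may recognize file types differently.

```python
from pathlib import Path

# Assumed extension-to-type mapping; the stage's real detection rules may differ.
SUPPORTED_TYPES = {
    ".txt": "Text",
    ".pdf": "PDF",
    ".msg": "Microsoft Outlook",
    ".doc": "Microsoft Word",
    ".docx": "Microsoft Word",
    ".htm": "HTML",
    ".html": "HTML",
}


def collect_inputs(path):
    """Return (file, type) pairs for one file, or for each supported file in a folder."""
    root = Path(path)
    # A folder contributes its direct children; a single file contributes itself.
    candidates = sorted(root.iterdir()) if root.is_dir() else [root]
    return [
        (p, SUPPORTED_TYPES[p.suffix.lower()])
        for p in candidates
        if p.is_file() and p.suffix.lower() in SUPPORTED_TYPES
    ]
```

Files with unrecognized extensions are skipped rather than raising an error, and subfolders are not searched; both are design choices of this sketch, not documented behavior.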
Read from Documents performs four types of extractions:
  • Document—Use the entire document
  • Page—Use a specific page of a document
  • Selective—Use a selected part of a document
  • Bookmarks—Use bookmarks from a PDF document
Read from Documents is part of the Information Extraction Module.