Class OCR

  • All Implemented Interfaces:
    io.annot8.api.components.Annot8ComponentDescriptor<OCR.Processor,OCR.Settings>, io.annot8.api.components.ProcessorDescriptor<OCR.Processor,OCR.Settings>

    @ComponentName("Tesseract OCR")
    @ComponentDescription("Use Tesseract to extract text from images stored in FileContent, or directly from Image content")
    @SettingsClass(Settings.class)
    @ComponentTags({"image","text","ocr","tesseract"})
    public class OCR
    extends io.annot8.common.components.AbstractProcessorDescriptor<OCR.Processor,OCR.Settings>
    Takes FileContent containing either an image or a PDF file, or Image content directly, and produces Text content holding the text that Tesseract extracts from it.
    • Constructor Detail

      • OCR

        public OCR()
    • Method Detail

      • createComponent

        protected OCR.Processor createComponent(io.annot8.api.context.Context context,
                                                OCR.Settings settings)
        Specified by:
        createComponent in class io.annot8.common.components.AbstractComponentDescriptor<OCR.Processor,OCR.Settings>
      • capabilities

        public io.annot8.api.capabilities.Capabilities capabilities()
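
The createComponent signature above suggests a factory-hook pattern: the descriptor holds the settings, and a protected method builds the processor from a context plus those settings. The sketch below illustrates that pattern only. Every type in it (DescriptorSketch, AbstractDescriptor, OcrLikeDescriptor, and the stand-in Settings, Context and Processor classes, including the "language" field and the create and setSettings methods) is a hypothetical stand-in written for this example, not the real Annot8 or OCR API.

```java
import java.util.Objects;

// Illustration of a descriptor/factory-hook pattern.
// All types here are hypothetical stand-ins, NOT the real Annot8 classes.
public class DescriptorSketch {

    /** Stand-in settings; the real OCR.Settings fields are not shown on this page. */
    static final class Settings {
        final String language; // hypothetical field, assumed for the example

        Settings(String language) {
            this.language = Objects.requireNonNull(language);
        }
    }

    /** Stand-in for a runtime context; carries nothing in this sketch. */
    static final class Context {}

    /** Stand-in processor that only remembers the settings it was built with. */
    static final class Processor {
        final Settings settings;

        Processor(Settings settings) {
            this.settings = settings;
        }
    }

    /** Generic base: stores settings and delegates construction to a subclass hook. */
    abstract static class AbstractDescriptor<P, S> {
        private S settings;

        void setSettings(S settings) {
            this.settings = settings;
        }

        // Entry point for callers; forwards to the protected factory hook.
        P create(Context context) {
            return createComponent(context, settings);
        }

        protected abstract P createComponent(Context context, S settings);
    }

    /** Concrete descriptor mirroring the shape of the createComponent override. */
    static final class OcrLikeDescriptor extends AbstractDescriptor<Processor, Settings> {
        @Override
        protected Processor createComponent(Context context, Settings settings) {
            return new Processor(settings);
        }
    }

    public static void main(String[] args) {
        OcrLikeDescriptor descriptor = new OcrLikeDescriptor();
        descriptor.setSettings(new Settings("eng"));
        Processor processor = descriptor.create(new Context());
        System.out.println(processor.settings.language); // prints "eng"
    }
}
```

Keeping the hook protected means callers go through the descriptor's entry point, so a pipeline can configure settings first and defer processor construction until a context is available.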