Class AmazonTranscribe

  • All Implemented Interfaces:
    Serializable, org.apache.tika.config.Initializable, org.apache.tika.parser.Parser

    public class AmazonTranscribe
    extends org.apache.tika.parser.AbstractParser
    implements org.apache.tika.config.Initializable
    Parser implementation backed by Amazon Transcribe. See the setter methods below for configuration options.

    Silently becomes unavailable when client keys are unavailable. N.B. it is not necessary to create the bucket beforehand: this implementation automatically creates the bucket, using the configured bucket name, if one does not already exist.
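
The availability behaviour described above can be sketched with a small stand-in. This is a minimal illustration, not Tika's implementation: `AvailabilitySketch` and `TranscriberSettings` are hypothetical names, and the rule that a client ID, client secret, and bucket name must all be non-blank is an assumption. It shows the pattern of quietly reporting unavailability instead of throwing when keys are missing.

```java
class AvailabilitySketch {

    /** Hypothetical stand-in for the parser's configuration; not part of Tika. */
    static final class TranscriberSettings {
        private String clientId;
        private String clientSecret;
        private String bucket;

        void setClientId(String id) { this.clientId = id; }
        void setClientSecret(String secret) { this.clientSecret = secret; }
        void setBucket(String bucket) { this.bucket = bucket; }

        /** Assumption: all three values must be present for the service to be usable. */
        boolean isAvailable() {
            return hasText(clientId) && hasText(clientSecret) && hasText(bucket);
        }

        private static boolean hasText(String s) {
            return s != null && !s.trim().isEmpty();
        }
    }

    /** Returns a status line rather than throwing when the settings are incomplete. */
    static String describe(TranscriberSettings settings) {
        return settings.isAvailable()
                ? "transcription available"
                : "transcription skipped: missing client keys";
    }

    public static void main(String[] args) {
        TranscriberSettings settings = new TranscriberSettings();
        System.out.println(describe(settings)); // nothing configured yet

        settings.setClientId("example-id");
        settings.setClientSecret("example-secret");
        settings.setBucket("example-bucket");
        System.out.println(describe(settings)); // all values configured
    }
}
```

A caller following this pattern would check availability first and fall back to other parsers when it reports false.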

    Since:
    Tika 2.0
    See Also:
    Serialized Form
    • Field Detail

      • SUPPORTED_TYPES

        protected static final Set<org.apache.tika.mime.MediaType> SUPPORTED_TYPES
    • Constructor Detail

      • AmazonTranscribe

        public AmazonTranscribe()
    • Method Detail

      • getSupportedTypes

        public Set<org.apache.tika.mime.MediaType> getSupportedTypes​(org.apache.tika.parser.ParseContext context)
        Specified by:
        getSupportedTypes in interface org.apache.tika.parser.Parser
      • parse

        public void parse​(InputStream stream,
                          ContentHandler handler,
                          org.apache.tika.metadata.Metadata metadata,
                          org.apache.tika.parser.ParseContext context)
                   throws IOException,
                          SAXException,
                          org.apache.tika.exception.TikaException
        Starts AWS Transcribe Job with language specification.
        Specified by:
        parse in interface org.apache.tika.parser.Parser
        Parameters:
        stream - the source input stream.
        handler - handler to use
        metadata - document metadata (input and output)
        context - the parse context; set the LanguageCode in the ParseContext if known
        Throws:
        org.apache.tika.exception.TikaException - When there is an error transcribing.
        IOException - If an I/O exception of some sort has occurred.
        SAXException - If the SAX events could not be processed.
        See Also:
        AWS Language Code
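
The note on the context parameter can be illustrated with a stand-in for a type-keyed context. This is a sketch under stated assumptions: `TypedContext` and the `Lang` enum are hypothetical substitutes for `ParseContext` and the AWS language code type, used so that the example compiles without Tika or the AWS SDK on the classpath. What the parser does when no language is supplied is not stated here, so the example only reports whether a hint is present.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class LanguageHintSketch {

    /** Hypothetical stand-in for the AWS language code type. */
    enum Lang { EN_US, FR_FR }

    /** Hypothetical stand-in for ParseContext: holds one value per class key. */
    static final class TypedContext {
        private final Map<Class<?>, Object> values = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            values.put(key, value);
        }

        <T> T get(Class<T> key) {
            // Class.cast passes null through, so a missing key yields null.
            return key.cast(values.get(key));
        }
    }

    /** Looks up the language hint, if the caller supplied one. */
    static Optional<Lang> languageHint(TypedContext context) {
        return Optional.ofNullable(context.get(Lang.class));
    }

    public static void main(String[] args) {
        TypedContext context = new TypedContext();
        System.out.println(languageHint(context).isPresent()); // prints false

        context.set(Lang.class, Lang.EN_US);
        System.out.println(languageHint(context).get()); // prints EN_US
    }
}
```

Keying the lookup on the class keeps the hint optional: callers who know the language set it once, and callers who do not simply leave the context untouched.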
      • isAvailable

        public boolean isAvailable()
        Returns:
        true if this Transcriber is probably able to transcribe right now.
        Since:
        Tika 2.1
      • setClientId

        @Field
        public void setClientId​(String id)
        Sets the client ID for the transcriber API.
        Parameters:
        id - The ID to set.
      • setClientSecret

        @Field
        public void setClientSecret​(String secret)
        Sets the client secret for the transcriber API.
        Parameters:
        secret - The secret to set.
      • setBucket

        @Field
        public void setBucket​(String bucket)
        Sets the name of the bucket used by the transcriber.
        Parameters:
        bucket - The bucket to set.
      • setRegion

        @Field
        public void setRegion​(String region)
        Sets the AWS region for the transcriber API.
        Parameters:
        region - The region to set.
      • initialize

        public void initialize​(Map<String,​org.apache.tika.config.Param> params)
                        throws org.apache.tika.exception.TikaConfigException
        Specified by:
        initialize in interface org.apache.tika.config.Initializable
        Throws:
        org.apache.tika.exception.TikaConfigException
      • checkInitialization

        public void checkInitialization​(org.apache.tika.config.InitializableProblemHandler problemHandler)
                                 throws org.apache.tika.exception.TikaConfigException
        Specified by:
        checkInitialization in interface org.apache.tika.config.Initializable
        Throws:
        org.apache.tika.exception.TikaConfigException