Class LlmModule


  • public class LlmModule

    LlmModule is a wrapper around the ExecuTorch LLM. It provides a simple interface to generate text from the model.

    Warning: These APIs are experimental and subject to change without notice.

    • Constructor Summary

      Constructors
      LlmModule(String modulePath, String tokenizerPath, float temperature)
          Constructs an LLM module from the given model path, tokenizer path, and temperature.
      LlmModule(String modulePath, String tokenizerPath, float temperature, String dataPath)
          Constructs an LLM module from the given model path, tokenizer path, temperature, and data path.
      LlmModule(int modelType, String modulePath, String tokenizerPath, float temperature)
          Constructs an LLM module of the given model type from the given model path, tokenizer path, and temperature.
    • Method Summary

      void resetNative()
      int generate(String prompt, LlmCallback llmCallback)
          Start generating tokens from the module.
      int generate(String prompt, int seqLen, LlmCallback llmCallback)
          Start generating tokens from the module.
      int generate(String prompt, LlmCallback llmCallback, boolean echo)
          Start generating tokens from the module.
      int generate(String prompt, int seqLen, LlmCallback llmCallback, boolean echo)
          Start generating tokens from the module.
      native int generate(int[] image, int width, int height, int channels, String prompt, int seqLen, LlmCallback llmCallback, boolean echo)
          Start generating tokens from the module.
      long prefillImages(int[] image, int width, int height, int channels, long startPos)
          Prefill an LLaVA Module with the given image input.
      long prefillPrompt(String prompt, long startPos, int bos, int eos)
          Prefill an LLaVA Module with the given text input.
      native int generateFromPos(String prompt, int seqLen, long startPos, LlmCallback callback, boolean echo)
          Generate tokens from the given prompt, starting from the given position.
      native void stop()
          Stop the current generate() call before it finishes.
      native int load()
          Force-load the module.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • LlmModule

        LlmModule(String modulePath, String tokenizerPath, float temperature)
        Constructs an LLM module from the given model path, tokenizer path, and temperature.
      • LlmModule

        LlmModule(String modulePath, String tokenizerPath, float temperature, String dataPath)
        Constructs an LLM module from the given model path, tokenizer path, temperature, and data path.
      • LlmModule

        LlmModule(int modelType, String modulePath, String tokenizerPath, float temperature)
        Constructs an LLM module of the given model type from the given model path, tokenizer path, and temperature.
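        Which overload to use depends on whether the model needs a companion data file or an explicit model type. A minimal sketch of the three constructors using a stand-in class (the class name, field layout, and file paths here are illustrative, not the real ExecuTorch API):

        ```java
        // Stand-in mirroring the documented constructor overloads; hypothetical,
        // not the real ExecuTorch class.
        class LlmModuleSketch {
            final int modelType;   // 0 assumed here to mean a plain text model
            final String modulePath, tokenizerPath, dataPath;
            final float temperature;

            // Text-only model: model path, tokenizer path, sampling temperature.
            LlmModuleSketch(String modulePath, String tokenizerPath, float temperature) {
                this(0, modulePath, tokenizerPath, temperature, null);
            }

            // Model whose weights live in a separate data file.
            LlmModuleSketch(String modulePath, String tokenizerPath, float temperature,
                            String dataPath) {
                this(0, modulePath, tokenizerPath, temperature, dataPath);
            }

            // Explicit model type, e.g. for a multimodal model such as LLaVA.
            LlmModuleSketch(int modelType, String modulePath, String tokenizerPath,
                            float temperature) {
                this(modelType, modulePath, tokenizerPath, temperature, null);
            }

            private LlmModuleSketch(int modelType, String modulePath, String tokenizerPath,
                                    float temperature, String dataPath) {
                this.modelType = modelType;
                this.modulePath = modulePath;
                this.tokenizerPath = tokenizerPath;
                this.temperature = temperature;
                this.dataPath = dataPath;
            }

            public static void main(String[] args) {
                LlmModuleSketch module = new LlmModuleSketch(
                        "/data/local/tmp/model.pte", "/data/local/tmp/tokenizer.bin", 0.8f);
                System.out.println("temperature = " + module.temperature);
            }
        }
        ```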
    • Method Detail

      • generate

         int generate(String prompt, LlmCallback llmCallback)

        Start generating tokens from the module.

        Parameters:
        prompt - Input prompt
        llmCallback - callback object to receive results.
      • generate

         int generate(String prompt, int seqLen, LlmCallback llmCallback)

        Start generating tokens from the module.

        Parameters:
        prompt - Input prompt
        seqLen - sequence length
        llmCallback - callback object to receive results.
      • generate

         int generate(String prompt, LlmCallback llmCallback, boolean echo)

        Start generating tokens from the module.

        Parameters:
        prompt - Input prompt
        llmCallback - callback object to receive results
        echo - indicates whether to echo the input prompt (text completion vs. chat)
      • generate

         int generate(String prompt, int seqLen, LlmCallback llmCallback, boolean echo)

        Start generating tokens from the module.

        Parameters:
        prompt - Input prompt
        seqLen - sequence length
        llmCallback - callback object to receive results
        echo - indicates whether to echo the input prompt (text completion vs. chat)
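        Each text-only generate() overload blocks while streaming results to the callback instead of returning the generated text. The pattern can be sketched with self-contained stand-in types; the real LlmCallback lives in the ExecuTorch package, and the method names used here (onResult, onStats) are assumptions:

        ```java
        // Stand-in for the real LlmCallback interface (method names assumed).
        interface LlmCallbackSketch {
            void onResult(String token);          // invoked once per generated token
            void onStats(float tokensPerSecond);  // invoked when generation finishes
        }

        class GenerateSketch {
            // Stand-in for LlmModule.generate(prompt, callback): streams each token
            // to the callback and returns a status code (0 taken to mean success).
            static int generate(String prompt, LlmCallbackSketch callback) {
                for (String token : new String[] {"Hello", ",", " world"}) {
                    callback.onResult(token);
                }
                callback.onStats(42.0f);
                return 0;
            }

            public static void main(String[] args) {
                StringBuilder text = new StringBuilder();
                int status = generate("Say hello", new LlmCallbackSketch() {
                    @Override public void onResult(String token) { text.append(token); }
                    @Override public void onStats(float tps) { /* record throughput */ }
                });
                System.out.println(status + ": " + text); // 0: Hello, world
            }
        }
        ```

        The caller accumulates tokens as they arrive, which is what makes incremental UI updates possible while generation is still running.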
      • generate

         native int generate(int[] image, int width, int height, int channels, String prompt, int seqLen, LlmCallback llmCallback, boolean echo)

        Start generating tokens from the module.

        Parameters:
        image - Input image as an int array
        width - Input image width
        height - Input image height
        channels - Input image number of channels
        prompt - Input prompt
        seqLen - sequence length
        llmCallback - callback object to receive results.
        echo - indicates whether to echo the input prompt (text completion vs. chat)
      • prefillImages

         long prefillImages(int[] image, int width, int height, int channels, long startPos)

        Prefill an LLaVA Module with the given image input.

        Parameters:
        image - Input image as an int array
        width - Input image width
        height - Input image height
        channels - Input image number of channels
        startPos - The starting position in the LLM's KV cache for this input.
      • prefillPrompt

         long prefillPrompt(String prompt, long startPos, int bos, int eos)

        Prefill an LLaVA Module with the given text input.

        Parameters:
        prompt - The text prompt to LLaVA.
        startPos - The starting position in the LLM's KV cache for this input.
        bos - The number of BOS (beginning-of-sequence) tokens.
        eos - The number of EOS (end-of-sequence) tokens.
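        prefillPrompt and prefillImages return the updated KV-cache position, which the caller threads into the next prefill call and finally into generateFromPos. A self-contained sketch of that bookkeeping (the token counts, including the 576 image-patch tokens, are illustrative assumptions, not the real module's arithmetic):

        ```java
        class PrefillSketch {
            // Stand-in for prefillPrompt: pretend each whitespace-separated word is
            // one token, plus the requested BOS/EOS tokens; return the new position.
            long prefillPrompt(String prompt, long startPos, int bos, int eos) {
                return startPos + bos + prompt.split("\\s+").length + eos;
            }

            // Stand-in for prefillImages: pretend each image contributes 576
            // image-patch tokens (a number chosen only for illustration).
            long prefillImages(int[] image, int width, int height, int channels,
                               long startPos) {
                return startPos + 576;
            }

            public static void main(String[] args) {
                PrefillSketch module = new PrefillSketch();
                long pos = 0;
                pos = module.prefillPrompt("USER:", pos, 1, 0);                        // 2
                pos = module.prefillImages(new int[336 * 336 * 3], 336, 336, 3, pos); // 578
                pos = module.prefillPrompt("What is in this image?", pos, 0, 0);      // 583
                // generateFromPos("ASSISTANT:", seqLen, pos, callback, false)
                // would then decode from this position.
                System.out.println("startPos for generateFromPos = " + pos);
            }
        }
        ```

        The key point is that each call consumes the position returned by the previous one; reusing a stale startPos would overwrite earlier entries in the KV cache.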
      • generateFromPos

         native int generateFromPos(String prompt, int seqLen, long startPos, LlmCallback callback, boolean echo)

        Generate tokens from the given prompt, starting from the given position.

        Parameters:
        prompt - The text prompt to LLaVA.
        seqLen - The total sequence length, including the prompt tokens and new tokens.
        startPos - The starting position in the LLM's KV cache for this input.
        callback - callback object to receive results.
        echo - indicates whether to echo the input prompt.
      • stop

         native void stop()

        Stop the current generate() call before it finishes.
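        Because generate() blocks the calling thread until it finishes, stop() is normally invoked from another thread, such as a UI cancel handler. That cancellation pattern can be sketched self-contained, with an atomic flag standing in for the native call:

        ```java
        import java.util.concurrent.atomic.AtomicBoolean;
        import java.util.concurrent.atomic.AtomicInteger;

        class StopSketch {
            private final AtomicBoolean stopRequested = new AtomicBoolean(false);

            // Stand-in for the blocking generate(): "decodes" one token per
            // millisecond until seqLen is reached or stop() is called.
            int generate(int seqLen) {
                int produced = 0;
                while (produced < seqLen && !stopRequested.get()) {
                    try { Thread.sleep(1); } catch (InterruptedException e) { break; }
                    produced++;
                }
                return produced;
            }

            // Stand-in for the native stop(): makes the running generate() return early.
            void stop() { stopRequested.set(true); }

            public static void main(String[] args) throws InterruptedException {
                StopSketch module = new StopSketch();
                AtomicInteger produced = new AtomicInteger();
                Thread worker = new Thread(() -> produced.set(module.generate(10_000)));
                worker.start();
                Thread.sleep(50);   // let some generation happen, then cancel
                module.stop();
                worker.join();
                System.out.println("stopped after " + produced.get() + " of 10000 tokens");
            }
        }
        ```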

      • load

         native int load()

        Force-loads the module. Otherwise the model is loaded lazily during the first generate() call.