Class TextChunker
- java.lang.Object
  - com.microsoft.semantickernel.text.TextChunker

public class TextChunker extends Object
Splits text into chunks while attempting to keep meaning intact. For plain text, the splitter looks for newlines first, then periods, and so on. For markdown, it looks for punctuation first, and so on.
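
The "newlines first, then periods" order can be illustrated with a small standalone sketch. This is not the TextChunker source: the class name `LineSplitSketch`, the helper names, the four-characters-per-token estimate, and the fixed-width fallback are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the TextChunker implementation.
class LineSplitSketch {

    // Assumption: roughly four characters per token. A real tokenizer may count differently.
    static int estimateTokens(String s) {
        return (s.length() + 3) / 4;
    }

    // Split on line breaks first. Any line still over budget is split after
    // periods, and a sentence still over budget is cut at a fixed width.
    static List<String> splitLines(String text, int maxTokensPerLine) {
        if (maxTokensPerLine <= 0) {
            throw new IllegalArgumentException("maxTokensPerLine must be positive");
        }
        List<String> result = new ArrayList<>();
        for (String line : text.split("\\R")) {
            addWithinBudget(line.trim(), maxTokensPerLine, result, true);
        }
        return result;
    }

    private static void addWithinBudget(String piece, int maxTokens,
                                        List<String> out, boolean tryPeriods) {
        if (piece.isEmpty()) {
            return; // skip blank lines and empty fragments
        }
        if (estimateTokens(piece) <= maxTokens) {
            out.add(piece);
        } else if (tryPeriods) {
            // The lookbehind keeps each period attached to its sentence.
            for (String sentence : piece.split("(?<=\\.)\\s*")) {
                addWithinBudget(sentence.trim(), maxTokens, out, false);
            }
        } else {
            // Last resort: hard cut at the character width implied by the budget.
            int width = maxTokens * 4;
            for (int i = 0; i < piece.length(); i += width) {
                out.add(piece.substring(i, Math.min(piece.length(), i + width)));
            }
        }
    }
}
```

Unlike a production chunker, this sketch never merges short sentences back together; it only shows the order in which separators are tried. With the library itself, the corresponding call is `TextChunker.splitPlainTextLines(text, maxTokensPerLine)`.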
Constructor Summary
| Constructor | Description |
| --- | --- |
| TextChunker() | |

Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| static List<String> | splitMarkDownLines(String text, int maxTokensPerLine) | Split markdown text into lines |
| static List<String> | splitMarkdownParagraphs(List<String> lines, int maxTokensPerParagraph) | Split markdown text into paragraphs |
| static List<String> | splitPlainTextLines(String text, int maxTokensPerLine) | Split plain text into lines |
| static List<String> | splitPlainTextParagraphs(List<String> lines, int maxTokensPerParagraph) | Split plain text into paragraphs |

Method Detail
splitPlainTextLines
public static List<String> splitPlainTextLines(String text, int maxTokensPerLine)
Split plain text into lines.
- Parameters:
  - text - Text to split
  - maxTokensPerLine - Maximum number of tokens per line
- Returns:
  - List of lines

splitMarkDownLines
public static List<String> splitMarkDownLines(String text, int maxTokensPerLine)
Split markdown text into lines.
- Parameters:
  - text - Text to split
  - maxTokensPerLine - Maximum number of tokens per line
- Returns:
  - List of lines

splitPlainTextParagraphs
public static List<String> splitPlainTextParagraphs(List<String> lines, int maxTokensPerParagraph)
Split plain text into paragraphs.
- Parameters:
  - lines - Lines of text
  - maxTokensPerParagraph - Maximum number of tokens per paragraph
- Returns:
  - List of paragraphs
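
One plausible reading of "lines into paragraphs under a token budget" is a greedy grouping, sketched below. This is an assumption about the behavior, not the library's code: the class name `ParagraphSketch`, the newline joiner, and the four-characters-per-token estimate are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the TextChunker implementation.
class ParagraphSketch {

    // Assumption: roughly four characters per token.
    static int estimateTokens(String s) {
        return (s.length() + 3) / 4;
    }

    // Greedily append lines to the current paragraph until adding the next
    // line would exceed the budget, then start a new paragraph.
    static List<String> groupIntoParagraphs(List<String> lines, int maxTokensPerParagraph) {
        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            String candidate = current.length() == 0 ? line : current + "\n" + line;
            if (current.length() > 0 && estimateTokens(candidate) > maxTokensPerParagraph) {
                // Budget exceeded: close the current paragraph and start over with this line.
                paragraphs.add(current.toString());
                current.setLength(0);
                current.append(line);
            } else {
                // Still within budget (or the paragraph is empty): keep accumulating.
                // A single line that is itself over budget is kept whole here.
                current.setLength(0);
                current.append(candidate);
            }
        }
        if (current.length() > 0) {
            paragraphs.add(current.toString());
        }
        return paragraphs;
    }
}
```

In typical use the two steps are chained: the output of `splitPlainTextLines(text, maxTokensPerLine)` is passed as the `lines` argument of `splitPlainTextParagraphs(lines, maxTokensPerParagraph)`.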