@Generated(value="software.amazon.awssdk:codegen") public final class Block extends Object implements SdkPojo, Serializable, ToCopyableBuilder<Block.Builder,Block>
A Block represents items that are recognized in a document within a group of pixels close to each other.
The information returned in a Block object depends on the type of operation. In text detection for
documents (for example DetectDocumentText), you get information about the detected words and lines of text. In
text analysis (for example AnalyzeDocument), you can also get information about the fields, tables, and
selection elements that are detected in the document.
An array of Block objects is returned by both synchronous and asynchronous operations. In synchronous
operations, such as DetectDocumentText, the array of Block objects is the entire set of results.
In asynchronous operations, such as GetDocumentAnalysis, the array is returned over one or more responses.
For more information, see How Amazon Textract Works.
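The point about asynchronous results arriving over one or more responses can be sketched as a simple accumulation loop. This is only an illustration under stated assumptions: `ResponsePage`, `collectAll`, and `fakeFetch` are hypothetical stand-ins, not SDK types, and blocks are modeled as plain strings so the sketch runs without the SDK on the classpath. In real code the fetch function would wrap a call such as GetDocumentAnalysis and pass along the continuation token from the previous response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class BlockPaging {
    // Hypothetical stand-in for one response: its blocks plus an optional continuation token.
    record ResponsePage(List<String> blocks, String nextToken) {}

    // Calls fetchPage repeatedly, starting with a null token, until no token is returned.
    static List<String> collectAll(Function<String, ResponsePage> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null;
        do {
            ResponsePage page = fetchPage.apply(token);
            all.addAll(page.blocks());
            token = page.nextToken();
        } while (token != null);
        return all;
    }

    // Simulated two-response result set, used only to exercise the loop.
    static ResponsePage fakeFetch(String token) {
        if (token == null) {
            return new ResponsePage(List.of("PAGE", "LINE"), "t1");
        }
        return new ResponsePage(List.of("WORD", "WORD"), null);
    }

    public static void main(String[] args) {
        System.out.println(collectAll(BlockPaging::fakeFetch));
    }
}
```

For a synchronous operation the loop body would run once, because the single response already holds the entire set of results.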
| Modifier and Type | Class and Description |
|---|---|
| static interface | Block.Builder |
| Modifier and Type | Method and Description |
|---|---|
| BlockType | blockType(): The type of text item that's recognized. |
| String | blockTypeAsString(): The type of text item that's recognized. |
| static Block.Builder | builder() |
| Integer | columnIndex(): The column in which a table cell appears. |
| Integer | columnSpan(): The number of columns that a table cell spans. |
| Float | confidence(): The confidence score that Amazon Textract has in the accuracy of the recognized text and the accuracy of the geometry points around the recognized text. |
| List<EntityType> | entityTypes(): The type of entity. |
| List<String> | entityTypesAsStrings(): The type of entity. |
| boolean | equals(Object obj) |
| boolean | equalsBySdkFields(Object obj) |
| Geometry | geometry(): The location of the recognized text on the image. |
| <T> Optional<T> | getValueForField(String fieldName, Class<T> clazz) |
| boolean | hasEntityTypes(): For responses, this returns true if the service returned a value for the EntityTypes property. |
| int | hashCode() |
| boolean | hasRelationships(): For responses, this returns true if the service returned a value for the Relationships property. |
| String | id(): The identifier for the recognized text. |
| Integer | page(): The page on which a block was detected. |
| Query | query() |
| List<Relationship> | relationships(): A list of child blocks of the current block. |
| Integer | rowIndex(): The row in which a table cell is located. |
| Integer | rowSpan(): The number of rows that a table cell spans. |
| List<SdkField<?>> | sdkFields() |
| SelectionStatus | selectionStatus(): The selection status of a selection element, such as an option button or check box. |
| String | selectionStatusAsString(): The selection status of a selection element, such as an option button or check box. |
| static Class<? extends Block.Builder> | serializableBuilderClass() |
| String | text(): The word or line of text that's recognized by Amazon Textract. |
| TextType | textType(): The kind of text that Amazon Textract has detected. |
| String | textTypeAsString(): The kind of text that Amazon Textract has detected. |
| Block.Builder | toBuilder() |
| String | toString(): Returns a string representation of this object. |
Methods inherited from class Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface ToCopyableBuilder: copy
public final BlockType blockType()
The type of text item that's recognized. In operations for text detection, the following types are returned:
PAGE - Contains a list of the LINE Block objects that are detected on a document page.
WORD - A word detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.
LINE - A string of tab-delimited, contiguous words that are detected on a document page.
In text analysis operations, the following types are returned:
PAGE - Contains a list of child Block objects that are detected on a document page.
KEY_VALUE_SET - Stores the KEY and VALUE Block objects for linked text that's detected on a
document page. Use the EntityType field to determine if a KEY_VALUE_SET object is a KEY
Block object or a VALUE Block object.
WORD - A word that's detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.
LINE - A string of tab-delimited, contiguous words that are detected on a document page.
TABLE - A table that's detected on a document page. A table is grid-based information with two or more rows or columns, with a cell span of one row and one column each.
CELL - A cell within a detected table. The cell is the parent of the block that contains the text in the cell.
SELECTION_ELEMENT - A selection element such as an option button (radio button) or a check box that's
detected on a document page. Use the value of SelectionStatus to determine the status of the
selection element.
SIGNATURE - The location and confidence score of a signature detected on a document page. Can be returned as part of a key-value pair or a detected cell.
QUERY - A question asked during the call of AnalyzeDocument. Contains an alias and an ID that attaches it to its answer.
QUERY_RESULT - A response to a question asked during the call of AnalyzeDocument. Comes with an alias and ID for ease of locating in a response. Also contains location and confidence score.
If the service returns an enum value that is not available in the current SDK version, blockType will
return BlockType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from
blockTypeAsString().
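The fallback described above, where the typed accessor yields UNKNOWN_TO_SDK_VERSION and the raw string stays available, can be sketched as follows. The `Kind` enum and `describe` helper are hypothetical stand-ins that only model the pattern; they are not the generated SDK classes, and the real BlockType enum has more constants than are shown here.

```java
final class UnknownEnumDemo {
    // Stand-in for a generated SDK enum; only a few constants are modeled here.
    enum Kind {
        PAGE, LINE, WORD, UNKNOWN_TO_SDK_VERSION;

        // Maps a raw service string to a constant, falling back to the sentinel.
        static Kind fromValue(String raw) {
            for (Kind k : values()) {
                if (k != UNKNOWN_TO_SDK_VERSION && k.name().equals(raw)) {
                    return k;
                }
            }
            return UNKNOWN_TO_SDK_VERSION;
        }
    }

    // Prefers the typed constant, but keeps the raw string when the value is unknown.
    static String describe(String rawType) {
        Kind kind = Kind.fromValue(rawType);
        return kind == Kind.UNKNOWN_TO_SDK_VERSION
                ? "unrecognized type: " + rawType
                : "known type: " + kind;
    }

    public static void main(String[] args) {
        System.out.println(describe("LINE"));
        System.out.println(describe("SOME_FUTURE_TYPE"));
    }
}
```

With the real class, the equivalent check would compare blockType() against BlockType.UNKNOWN_TO_SDK_VERSION and read blockTypeAsString() for the raw value.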
See Also: BlockType
public final String blockTypeAsString()
The type of text item that's recognized. In operations for text detection, the following types are returned:
PAGE - Contains a list of the LINE Block objects that are detected on a document page.
WORD - A word detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.
LINE - A string of tab-delimited, contiguous words that are detected on a document page.
In text analysis operations, the following types are returned:
PAGE - Contains a list of child Block objects that are detected on a document page.
KEY_VALUE_SET - Stores the KEY and VALUE Block objects for linked text that's detected on a
document page. Use the EntityType field to determine if a KEY_VALUE_SET object is a KEY
Block object or a VALUE Block object.
WORD - A word that's detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.
LINE - A string of tab-delimited, contiguous words that are detected on a document page.
TABLE - A table that's detected on a document page. A table is grid-based information with two or more rows or columns, with a cell span of one row and one column each.
CELL - A cell within a detected table. The cell is the parent of the block that contains the text in the cell.
SELECTION_ELEMENT - A selection element such as an option button (radio button) or a check box that's
detected on a document page. Use the value of SelectionStatus to determine the status of the
selection element.
SIGNATURE - The location and confidence score of a signature detected on a document page. Can be returned as part of a key-value pair or a detected cell.
QUERY - A question asked during the call of AnalyzeDocument. Contains an alias and an ID that attaches it to its answer.
QUERY_RESULT - A response to a question asked during the call of AnalyzeDocument. Comes with an alias and ID for ease of locating in a response. Also contains location and confidence score.
If the service returns an enum value that is not available in the current SDK version, blockType will
return BlockType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from
blockTypeAsString().
See Also: BlockType
public final Float confidence()
The confidence score that Amazon Textract has in the accuracy of the recognized text and the accuracy of the geometry points around the recognized text.
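A common use of the confidence score is to discard low-confidence text. The sketch below assumes that the score is a percentage between 0 and 100 and that it may be absent (null) on some blocks; neither detail is stated on this page. `Item` and `confidentText` are hypothetical stand-ins for Block and a caller-side helper.

```java
import java.util.List;
import java.util.stream.Collectors;

final class ConfidenceFilter {
    // Hypothetical stand-in for the two Block accessors used here: text() and confidence().
    record Item(String text, Float confidence) {}

    // Keeps only the text of items whose confidence is present and at least minConfidence.
    static List<String> confidentText(List<Item> items, float minConfidence) {
        return items.stream()
                .filter(i -> i.confidence() != null && i.confidence() >= minConfidence)
                .map(Item::text)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
                new Item("Total", 99.1f),
                new Item("T0tal", 42.0f),
                new Item("n/a", null));
        System.out.println(confidentText(items, 90.0f));
    }
}
```

The null check matters because the accessor returns a boxed Float; comparing an absent value directly would throw a NullPointerException on unboxing.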
public final String text()
The word or line of text that's recognized by Amazon Textract.
public final TextType textType()
The kind of text that Amazon Textract has detected. The value indicates whether the text is handwritten or printed.
If the service returns an enum value that is not available in the current SDK version, textType will
return TextType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from
textTypeAsString().
See Also: TextType
public final String textTypeAsString()
The kind of text that Amazon Textract has detected. The value indicates whether the text is handwritten or printed.
If the service returns an enum value that is not available in the current SDK version, textType will
return TextType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from
textTypeAsString().
See Also: TextType
public final Integer rowIndex()
The row in which a table cell is located. The first row position is 1. RowIndex isn't returned by
DetectDocumentText and GetDocumentTextDetection.
public final Integer columnIndex()
The column in which a table cell appears. The first column position is 1. ColumnIndex isn't returned
by DetectDocumentText and GetDocumentTextDetection.
public final Integer rowSpan()
The number of rows that a table cell spans. Currently this value is always 1, even if the number of rows spanned
is greater than 1. RowSpan isn't returned by DetectDocumentText and
GetDocumentTextDetection.
public final Integer columnSpan()
The number of columns that a table cell spans. Currently this value is always 1, even if the number of columns
spanned is greater than 1. ColumnSpan isn't returned by DetectDocumentText and
GetDocumentTextDetection.
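Because RowIndex and ColumnIndex start at 1, placing cells into a zero-based Java array needs an offset of one. The following is a minimal sketch of that conversion; `Cell` and `toGrid` are hypothetical names, the cell text is assumed to have been resolved already, and spans are ignored because they are documented above as always being 1.

```java
import java.util.List;

final class TableGrid {
    // Hypothetical stand-in for a CELL block with its text already resolved.
    record Cell(int rowIndex, int columnIndex, String text) {}

    // Builds a rows x columns grid, converting the 1-based indexes to 0-based positions.
    static String[][] toGrid(List<Cell> cells) {
        int rows = 0;
        int cols = 0;
        for (Cell c : cells) {
            rows = Math.max(rows, c.rowIndex());
            cols = Math.max(cols, c.columnIndex());
        }
        String[][] grid = new String[rows][cols];
        for (Cell c : cells) {
            grid[c.rowIndex() - 1][c.columnIndex() - 1] = c.text();
        }
        return grid;
    }

    public static void main(String[] args) {
        String[][] grid = toGrid(List.of(
                new Cell(1, 1, "Item"), new Cell(1, 2, "Qty"),
                new Cell(2, 1, "Pen"), new Cell(2, 2, "3")));
        System.out.println(java.util.Arrays.deepToString(grid));
    }
}
```

Positions for which no cell was returned stay null in the grid. With the real class the indexes are boxed Integer values, so a null check is advisable for blocks that are not table cells.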
public final Geometry geometry()
The location of the recognized text on the image. It includes an axis-aligned, coarse bounding box that surrounds the text, and a finer-grain polygon for more accurate spatial information.
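To draw the bounding box on the source image, its coordinates have to be mapped to pixels. The sketch below assumes that the box is expressed as ratios of the overall page width and height (left, top, width, height, each between 0 and 1), which this page does not state. `Box`, `PixelRect`, and `toPixels` are hypothetical stand-ins, and the finer-grained polygon is not handled.

```java
final class BoxToPixels {
    // Hypothetical stand-in for a bounding box given as ratios of the page size.
    record Box(float left, float top, float width, float height) {}

    // Pixel-space rectangle for an image of known dimensions.
    record PixelRect(int x, int y, int width, int height) {}

    // Scales each ratio by the matching image dimension and rounds to the nearest pixel.
    static PixelRect toPixels(Box box, int imageWidth, int imageHeight) {
        return new PixelRect(
                Math.round(box.left() * imageWidth),
                Math.round(box.top() * imageHeight),
                Math.round(box.width() * imageWidth),
                Math.round(box.height() * imageHeight));
    }

    public static void main(String[] args) {
        System.out.println(toPixels(new Box(0.25f, 0.5f, 0.5f, 0.125f), 800, 400));
    }
}
```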
public final String id()
The identifier for the recognized text. The identifier is only unique for a single operation.
public final boolean hasRelationships()
For responses, this returns true if the service returned a value for the Relationships property. This does not
check that the value is non-empty (for which you should check the isEmpty() method on the property).
This is useful because the SDK will never return a null collection or map, but you may need to differentiate
between the service returning nothing (or null) and the service returning an empty collection or map. For
requests, this returns true if a value for the property was specified in the request builder, and false if a
value was not specified.
public final List<Relationship> relationships()
A list of child blocks of the current block. For example, a LINE object has child blocks for each WORD block that's part of the line of text. There aren't Relationship objects in the list for relationships that don't exist, such as when the current block has no child blocks. The list size can be the following:
0 - The block has no child blocks.
1 - The block has child blocks.
Attempts to modify the collection returned by this method will result in an UnsupportedOperationException.
This method will never return null. If you would like to know whether the service returned this field (so that
you can differentiate between null and empty), you can use the hasRelationships() method.
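The parent-to-child link described above, for example a LINE block and its WORD blocks, is typically resolved by indexing all blocks by ID and then following the IDs listed in each relationship. The sketch below assumes that a relationship carries a type string (with "CHILD" marking child links) and a list of block IDs; `Rel`, `Node`, and `childTexts` are hypothetical stand-ins rather than the SDK's Relationship and Block classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ChildText {
    // Hypothetical stand-ins for Relationship and Block, reduced to the fields used here.
    record Rel(String type, List<String> ids) {}
    record Node(String id, String blockType, String text, List<Rel> relationships) {}

    // Returns the text of every child of parent, in the order the relationship lists the IDs.
    static List<String> childTexts(Node parent, List<Node> allBlocks) {
        Map<String, Node> byId = new HashMap<>();
        for (Node n : allBlocks) {
            byId.put(n.id(), n);
        }
        List<String> texts = new ArrayList<>();
        for (Rel rel : parent.relationships()) {
            if (!"CHILD".equals(rel.type())) {
                continue; // other relationship types are ignored in this sketch
            }
            for (String id : rel.ids()) {
                Node child = byId.get(id);
                if (child != null && child.text() != null) {
                    texts.add(child.text());
                }
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        Node w1 = new Node("w1", "WORD", "Hello", List.of());
        Node w2 = new Node("w2", "WORD", "world", List.of());
        Node line = new Node("l1", "LINE", "Hello world", List.of(new Rel("CHILD", List.of("w1", "w2"))));
        System.out.println(childTexts(line, List.of(line, w1, w2)));
    }
}
```

Because block identifiers are only unique within a single operation, the ID index should be built per operation rather than shared across documents.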
public final List<EntityType> entityTypes()
The type of entity. The following can be returned:
KEY - An identifier for a field on the document.
VALUE - The field text.
EntityTypes isn't returned by DetectDocumentText and
GetDocumentTextDetection.
Attempts to modify the collection returned by this method will result in an UnsupportedOperationException.
This method will never return null. If you would like to know whether the service returned this field (so that
you can differentiate between null and empty), you can use the hasEntityTypes() method.
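Since a KEY_VALUE_SET block can be either a key or a value, callers usually split such blocks by entity type before pairing them up. The following sketch shows only that split; `Kv` and `idsWithEntity` are hypothetical stand-ins, and entity types are modeled as strings in the way entityTypesAsStrings() would return them.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyValueSplit {
    // Hypothetical stand-in for a Block, reduced to id, block type, and entity types.
    record Kv(String id, String blockType, List<String> entityTypes) {}

    // Returns the IDs of KEY_VALUE_SET blocks that carry the given entity type ("KEY" or "VALUE").
    static List<String> idsWithEntity(List<Kv> blocks, String entityType) {
        List<String> ids = new ArrayList<>();
        for (Kv b : blocks) {
            if ("KEY_VALUE_SET".equals(b.blockType()) && b.entityTypes().contains(entityType)) {
                ids.add(b.id());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Kv> blocks = List.of(
                new Kv("k1", "KEY_VALUE_SET", List.of("KEY")),
                new Kv("v1", "KEY_VALUE_SET", List.of("VALUE")),
                new Kv("w1", "WORD", List.of()));
        System.out.println(idsWithEntity(blocks, "KEY"));
        System.out.println(idsWithEntity(blocks, "VALUE"));
    }
}
```

The empty list on the WORD block mirrors the SDK behavior described above: the accessor never returns null, so no null check is needed before calling contains.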
public final boolean hasEntityTypes()
For responses, this returns true if the service returned a value for the EntityTypes property. This does not
check that the value is non-empty (for which you should check the isEmpty() method on the property).
This is useful because the SDK will never return a null collection or map, but you may need to differentiate
between the service returning nothing (or null) and the service returning an empty collection or map. For
requests, this returns true if a value for the property was specified in the request builder, and false if a
value was not specified.
public final List<String> entityTypesAsStrings()
The type of entity. The following can be returned:
KEY - An identifier for a field on the document.
VALUE - The field text.
EntityTypes isn't returned by DetectDocumentText and
GetDocumentTextDetection.
Attempts to modify the collection returned by this method will result in an UnsupportedOperationException.
This method will never return null. If you would like to know whether the service returned this field (so that
you can differentiate between null and empty), you can use the hasEntityTypes() method.
public final SelectionStatus selectionStatus()
The selection status of a selection element, such as an option button or check box.
If the service returns an enum value that is not available in the current SDK version, selectionStatus
will return SelectionStatus.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available
from selectionStatusAsString().
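A typical use of the selection status is to list which check boxes or option buttons are ticked. The sketch below assumes the raw status strings "SELECTED" and "NOT_SELECTED", which are not spelled out on this page. `Element` and `selectedIds` are hypothetical stand-ins, with the status modeled as the string that selectionStatusAsString() would return.

```java
import java.util.ArrayList;
import java.util.List;

final class SelectedOptions {
    // Hypothetical stand-in for a Block, reduced to id, block type, and raw selection status.
    record Element(String id, String blockType, String selectionStatus) {}

    // Returns the IDs of SELECTION_ELEMENT blocks whose status is "SELECTED".
    static List<String> selectedIds(List<Element> blocks) {
        List<String> ids = new ArrayList<>();
        for (Element e : blocks) {
            if ("SELECTION_ELEMENT".equals(e.blockType()) && "SELECTED".equals(e.selectionStatus())) {
                ids.add(e.id());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Element> blocks = List.of(
                new Element("s1", "SELECTION_ELEMENT", "SELECTED"),
                new Element("s2", "SELECTION_ELEMENT", "NOT_SELECTED"),
                new Element("w1", "WORD", null));
        System.out.println(selectedIds(blocks));
    }
}
```

Comparing with the constant on the left ("SELECTED".equals(...)) keeps the check safe for blocks that have no selection status at all.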
See Also: SelectionStatus
public final String selectionStatusAsString()
The selection status of a selection element, such as an option button or check box.
If the service returns an enum value that is not available in the current SDK version, selectionStatus
will return SelectionStatus.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available
from selectionStatusAsString().
See Also: SelectionStatus
public final Integer page()
The page on which a block was detected. Page is returned by synchronous and asynchronous operations.
Page values greater than 1 are only returned for multipage documents that are in PDF or TIFF format. A scanned
image (JPEG/PNG) provided to an asynchronous operation, even if it contains multiple document pages, is
considered a single-page document. This means that for scanned images the value of Page is always 1.
Synchronous operations will also return a Page value of 1 because every input document is
considered to be a single-page document.
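For multipage PDF or TIFF input, the page number makes it possible to regroup the flat block list by page. The following is a minimal sketch; `Line` and `group` are hypothetical stand-ins, and treating a missing page value as page 1 is an assumption made for this example, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LinesByPage {
    // Hypothetical stand-in for a LINE block, reduced to page() and text().
    record Line(Integer page, String text) {}

    // Groups line text by page number, keeping pages in ascending order.
    static Map<Integer, List<String>> group(List<Line> lines) {
        Map<Integer, List<String>> byPage = new TreeMap<>();
        for (Line l : lines) {
            int page = l.page() == null ? 1 : l.page(); // assumption: no value means page 1
            byPage.computeIfAbsent(page, p -> new ArrayList<>()).add(l.text());
        }
        return byPage;
    }

    public static void main(String[] args) {
        System.out.println(group(List.of(
                new Line(1, "Invoice"),
                new Line(2, "Terms"),
                new Line(1, "Total"))));
    }
}
```

A TreeMap is used so that iterating the result visits pages in reading order, while the lines within each page keep the order in which the blocks were returned.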
public final Query query()
public Block.Builder toBuilder()
Specified by: toBuilder in interface ToCopyableBuilder<Block.Builder,Block>
public static Block.Builder builder()
public static Class<? extends Block.Builder> serializableBuilderClass()
public final boolean equalsBySdkFields(Object obj)
Specified by: equalsBySdkFields in interface SdkPojo
public final String toString()
Copyright © 2023. All rights reserved.