Package org.apache.arrow.dataset.file
Class JniWrapper
java.lang.Object
org.apache.arrow.dataset.file.JniWrapper
JniWrapper for filesystem-based Dataset implementations.
Method Summary
Modifier and Type | Method | Description
static JniWrapper | get() |
long | makeFileSystemDatasetFactory(String[] uris, int fileFormat) | Create FileSystemDatasetFactory and return its native pointer.
long | makeFileSystemDatasetFactory(String uri, int fileFormat) | Create FileSystemDatasetFactory and return its native pointer.
void | writeFromScannerToFile(long streamAddress, long fileFormat, String uri, String[] partitionColumns, int maxPartitions, String baseNameTemplate) | Write the content in an ArrowArrayStream into files.
Method Details
get
makeFileSystemDatasetFactory
Create FileSystemDatasetFactory and return its native pointer. The pointer points to an intermediate shared_ptr of the factory instance.
Parameters:
uri - file URI to read, either a file or a directory
fileFormat - file format ID
Returns:
the native pointer of the arrow::dataset::FileSystemDatasetFactory instance.
makeFileSystemDatasetFactory
Create FileSystemDatasetFactory and return its native pointer. The pointer points to an intermediate shared_ptr of the factory instance.
Parameters:
uris - list of file URIs to read, each pointing to an individual file
fileFormat - file format ID
Returns:
the native pointer of the arrow::dataset::FileSystemDatasetFactory instance.
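The two overloads differ in what each URI may name: the single-URI form accepts a file or a directory, while the array form expects every entry to be an individual file. A caller might want to check that before crossing into native code. The following is a minimal sketch of such a check, assuming local `file:` URIs only; `UriPrecheckSketch` and `allRegularFiles` are hypothetical names and are not part of the Arrow API.

```java
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration only; not part of the Arrow API. */
final class UriPrecheckSketch {
    private UriPrecheckSketch() {}

    /**
     * Returns true when every entry is a "file" URI naming an existing
     * regular file. Directories and non-file schemes make it return false.
     * Assumption: only local file URIs are checked here.
     */
    static boolean allRegularFiles(String[] uris) {
        for (String uri : uris) {
            URI parsed = URI.create(uri);
            if (!"file".equalsIgnoreCase(parsed.getScheme())) {
                return false;
            }
            Path path = Paths.get(parsed);
            if (!Files.isRegularFile(path)) {
                return false;
            }
        }
        return true;
    }
}
```

When the check fails because an entry is a directory, the single-URI overload is the one documented to accept it.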
writeFromScannerToFile
public void writeFromScannerToFile(long streamAddress, long fileFormat, String uri, String[] partitionColumns, int maxPartitions, String baseNameTemplate)
Write the content in an ArrowArrayStream into files. This internally depends on the C++ write API FileSystemDataset::Write.
Parameters:
streamAddress - the ArrowArrayStream address
fileFormat - target file format (ID)
uri - target file URI
partitionColumns - columns used to partition output files
maxPartitions - maximum partitions to be included in written files
baseNameTemplate - file name template used to make partitions, e.g. "dat_{i}", where i is the current partition ID across all written files.
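To make the baseNameTemplate parameter concrete, here is a minimal sketch of how a template such as "dat_{i}" could expand into file base names. It only illustrates the placeholder substitution; the actual expansion happens in the native writer. `BaseNameTemplateSketch`, `expand`, and `expandAll` are hypothetical names, and both the numbering from 0 and the rejection of templates without "{i}" are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the Arrow API. */
final class BaseNameTemplateSketch {
    private static final String PLACEHOLDER = "{i}";

    private BaseNameTemplateSketch() {}

    /** Replaces the "{i}" placeholder with the given index. */
    static String expand(String baseNameTemplate, int index) {
        if (!baseNameTemplate.contains(PLACEHOLDER)) {
            // Assumption: a template without the placeholder is rejected.
            throw new IllegalArgumentException(
                "template must contain " + PLACEHOLDER);
        }
        return baseNameTemplate.replace(PLACEHOLDER, Integer.toString(index));
    }

    /** Expands the template for indices 0 (inclusive) to count (exclusive). */
    static List<String> expandAll(String baseNameTemplate, int count) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add(expand(baseNameTemplate, i));
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints [dat_0, dat_1, dat_2]
        System.out.println(expandAll("dat_{i}", 3));
    }
}
```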