public class GoogleHadoopSyncableOutputStream
extends java.io.OutputStream
implements org.apache.hadoop.fs.Syncable
Implements the Syncable interface by composing objects created in separate underlying streams for each hsync() call.
Prior to the first hsync(), sync() or close() call, this channel will behave the same way as a basic non-syncable channel, writing directly to the destination file.
On the first call to hsync()/sync(), the destination file is committed and a new temporary file is created. The temporary file's name uses a hidden-file prefix (underscore) plus an additional suffix that differs for each subsequent temporary file in the series. During this time, readers can read the data committed to the destination file, but not the bytes written to the temporary file since the last hsync() call.
On each subsequent hsync()/sync() call, the temporary file is closed, composed onto the destination file, and then deleted; a new temporary file is then opened under a new filename for further writes.
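The commit sequence described above can be modelled in a few lines. The following is a minimal in-memory sketch, not the connector's implementation: the Map standing in for a bucket, the class name ComposeOnSyncSketch, the visible() helper, the value of TEMPFILE_PREFIX, and the temporary-object naming scheme are all assumptions made for illustration. The model also materializes the temporary object only at commit time, which a real stream need not do.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Toy in-memory model of the compose-on-hsync sequence; NOT the connector's code. */
final class ComposeOnSyncSketch {
    // Assumed value: the description only says the prefix is an underscore.
    static final String TEMPFILE_PREFIX = "_";

    private final Map<String, byte[]> store; // hypothetical stand-in for a bucket
    private final String destination;
    private ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private String currentName; // object that the pending bytes will be committed to
    private int tempCount = 0;

    ComposeOnSyncSketch(Map<String, byte[]> store, String destination) {
        this.store = store;
        this.destination = destination;
        this.currentName = destination; // before the first hsync, write to the destination itself
    }

    void write(String text) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        pending.write(data, 0, data.length); // buffered, not yet visible to readers
    }

    void hsync() {
        commitPending();
        tempCount++;
        // Hypothetical naming scheme: prefix plus a suffix that differs per temporary object.
        currentName = TEMPFILE_PREFIX + destination + "." + tempCount;
    }

    void close() {
        commitPending();
    }

    private void commitPending() {
        byte[] tail = pending.toByteArray();
        pending = new ByteArrayOutputStream();
        if (currentName.equals(destination)) {
            store.put(destination, tail); // first commit: the destination object itself
            return;
        }
        store.put(currentName, tail); // 1. close the temporary object
        byte[] head = store.get(destination);
        byte[] composed = new byte[head.length + tail.length];
        System.arraycopy(head, 0, composed, 0, head.length);
        System.arraycopy(tail, 0, composed, head.length, tail.length);
        store.put(destination, composed); // 2. compose it onto the destination
        store.remove(currentName);        // 3. delete the temporary object
    }

    /** What a reader of the destination would see right now. */
    String visible() {
        byte[] bytes = store.getOrDefault(destination, new byte[0]);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

In this model, bytes written after an hsync() stay invisible to visible() until the next hsync() or close(), and no temporary entry remains in the map once the stream is closed without errors.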
Caveat: each hsync()/sync() requires many underlying read and mutation requests occurring sequentially, so latency is expected to be fairly high.
If errors occur mid-stream, one or more temporary files may fail to be cleaned up, and manual intervention may be required to discover and delete any such unused files. Data written prior to the most recent successful hsync() is persistent and safe in such a case.
If multiple writers attempt to write to the same destination file, generation ids used with low-level precondition checks will cause all but one writer to fail their precondition checks during writes, and the single remaining writer will safely occupy the stream.
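The generation-id precondition mentioned above behaves like a compare-and-set. The following is a minimal sketch of that idea under stated assumptions: GenerationPreconditionSketch and its methods are hypothetical names, the object is held in memory, and a failed check simply returns false. How the connector itself reports a failed precondition is not shown here.

```java
/** Toy model of a generation-id precondition check; NOT the GCS API. */
final class GenerationPreconditionSketch {
    private long generation = 0; // bumped on every successful write
    private final StringBuilder content = new StringBuilder();

    synchronized long currentGeneration() {
        return generation;
    }

    /** Appends only if the caller's expected generation is still current. */
    synchronized boolean writeIfGenerationMatches(long expectedGeneration, String data) {
        if (expectedGeneration != generation) {
            return false; // another writer committed first; the precondition fails
        }
        content.append(data);
        generation++;
        return true;
    }

    synchronized String content() {
        return content.toString();
    }
}
```

If two writers both observe generation 0, the first write succeeds and advances the generation, so the second writer's check against 0 fails and its data is never applied.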
| Modifier and Type | Field and Description |
|---|---|
| static java.lang.String | TEMPFILE_PREFIX |
| Constructor and Description |
|---|
| GoogleHadoopSyncableOutputStream(GoogleHadoopFileSystemBase ghfs, java.net.URI gcsPath, org.apache.hadoop.fs.FileSystem.Statistics statistics, CreateFileOptions createFileOptions)<br>Creates a new GoogleHadoopSyncableOutputStream with initial stream initialized and expected to begin at file-offset 0. |
| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| void | hflush()<br>There is no way to flush data to become available for readers without a full-fledged hsync(), so this method is a no-op. |
| void | hsync()<br>This overrides Syncable.hsync(), but is not annotated as such because the method doesn't exist in Hadoop 1. |
| void | sync() |
| void | write(byte[] b, int offset, int len) |
| void | write(int b) |
public static final java.lang.String TEMPFILE_PREFIX
public GoogleHadoopSyncableOutputStream(GoogleHadoopFileSystemBase ghfs, java.net.URI gcsPath, org.apache.hadoop.fs.FileSystem.Statistics statistics, CreateFileOptions createFileOptions) throws java.io.IOException
Throws: java.io.IOException
public void write(int b) throws java.io.IOException
Specified by: write in class java.io.OutputStream
Throws: java.io.IOException
public void write(byte[] b, int offset, int len) throws java.io.IOException
Overrides: write in class java.io.OutputStream
Throws: java.io.IOException
public void close() throws java.io.IOException
Specified by: close in interface java.io.Closeable
Specified by: close in interface java.lang.AutoCloseable
Overrides: close in class java.io.OutputStream
Throws: java.io.IOException
public void sync() throws java.io.IOException
Specified by: sync in interface org.apache.hadoop.fs.Syncable
Throws: java.io.IOException
public void hflush() throws java.io.IOException
Throws: java.io.IOException
public void hsync() throws java.io.IOException
Throws: CompositeLimitExceededException - if this hsync() call would require any future close() call to exceed the component limit. If CompositeLimitExceededException is thrown, no actual GCS operations are taken and it's safe to subsequently call close() on this stream as normal; it just means data written since the last successful hsync() has not yet been committed.
Throws: java.io.IOException
Copyright © 2019. All rights reserved.