public class DatasetProfile
extends java.lang.Object
implements java.io.Serializable
| Modifier and Type | Field and Description |
|---|---|
| static java.lang.String | TAG_PREFIX |
| Constructor and Description |
|---|
| DatasetProfile(java.lang.String sessionId, java.time.Instant sessionTimestamp) |
| DatasetProfile(@NonNull java.lang.String sessionId, @NonNull java.time.Instant sessionTimestamp, java.time.Instant dataTimestamp, @NonNull java.util.Map<java.lang.String,java.lang.String> tags, @NonNull java.util.Map<java.lang.String,ColumnProfile> columns)<br>DEVELOPER API. |
| DatasetProfile(@NonNull java.lang.String sessionId, @NonNull java.time.Instant sessionTimestamp, @NonNull java.util.Map<java.lang.String,java.lang.String> tags)<br>Create a new Dataset profile |
| Modifier and Type | Method and Description |
|---|---|
| static DatasetProfile | fromProtobuf(com.whylogs.core.message.DatasetProfileMessage message) |
| java.util.Map<java.lang.String,ColumnProfile> | getColumns() |
| ModelProfile | getModelProfile() |
| DatasetProfile | merge(@NonNull DatasetProfile other)<br>Merge the data of another DatasetProfile into this one. |
| DatasetProfile | mergeStrict(@NonNull DatasetProfile other) |
| static DatasetProfile | parse(java.io.InputStream in) |
| byte[] | toBytes() |
| java.util.Iterator<com.whylogs.core.message.MessageSegment> | toChunkIterator() |
| com.whylogs.core.message.DatasetProfileMessage.Builder | toProtobuf() |
| com.whylogs.core.message.DatasetSummary | toSummary() |
| void | track(java.util.Map<java.lang.String,?> columns) |
| void | track(java.lang.String columnName, java.lang.Object data) |
| DatasetProfile | withAllMetadata(java.util.Map<java.lang.String,java.lang.String> metadata) |
| DatasetProfile | withClassificationModel(java.lang.String prediction, java.lang.String target, java.lang.String score) |
| DatasetProfile | withClassificationModel(java.lang.String prediction, java.lang.String target, java.lang.String score, java.lang.Iterable<java.lang.String> additionalOutputFields)<br>Returns a new dataset profile with the same backing data structure. |
| DatasetProfile | withMetadata(java.lang.String key, java.lang.String value) |
| DatasetProfile | withRegressionModel(java.lang.String prediction, java.lang.String target) |
| DatasetProfile | withRegressionModel(java.lang.String prediction, java.lang.String target, java.lang.Iterable<java.lang.String> additionalOutputFields) |
| DatasetProfile | withTag(java.lang.String key, java.lang.String value) |
| void | writeTo(java.io.OutputStream out) |
public static final java.lang.String TAG_PREFIX
public DatasetProfile(@NonNull java.lang.String sessionId,
@NonNull java.time.Instant sessionTimestamp,
@Nullable java.time.Instant dataTimestamp,
@NonNull java.util.Map<java.lang.String,java.lang.String> tags,
@NonNull java.util.Map<java.lang.String,ColumnProfile> columns)
Parameters:
sessionId - dataset name
sessionTimestamp - the timestamp for the current profiling session
dataTimestamp - the timestamp for the dataset. Used to aggregate across different cadences
tags - tags of the dataset
columns - the columns that we're copying over. Note that the source of columns should stop using these column objects as they will back this DatasetProfile instead
public DatasetProfile(@NonNull java.lang.String sessionId,
@NonNull java.time.Instant sessionTimestamp,
@NonNull java.util.Map<java.lang.String,java.lang.String> tags)
Parameters:
sessionId - the name of the dataset profile
sessionTimestamp - the timestamp for this run
tags - the tags to track the dataset with
public DatasetProfile(java.lang.String sessionId,
java.time.Instant sessionTimestamp)
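
The note on the five-argument constructor says the supplied column objects will back the new profile, so the caller should stop using them. The sketch below illustrates why with plain JDK maps. `BackedProfileSketch` is a hypothetical stand-in, not the whylogs class, and it assumes the constructor keeps the caller's map without a defensive copy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a profile; NOT the whylogs DatasetProfile class.
final class BackedProfileSketch {
    private final Map<String, Long> columns;

    // Assumption: the caller's map is kept as the backing store (no copy).
    BackedProfileSketch(Map<String, Long> columns) {
        this.columns = columns;
    }

    // Count one observation for the given column.
    void track(String columnName) {
        columns.merge(columnName, 1L, Long::sum);
    }

    long count(String columnName) {
        return columns.getOrDefault(columnName, 0L);
    }

    public static void main(String[] args) {
        Map<String, Long> source = new HashMap<>();
        BackedProfileSketch profile = new BackedProfileSketch(source);

        profile.track("age");
        // The caller's map sees the update because both share one object.
        System.out.println(source.get("age")); // prints 1

        // A later write by the caller silently changes the profile's state.
        source.put("age", 100L);
        System.out.println(profile.count("age")); // prints 100
    }
}
```

Under that assumption, a caller who needs to keep working with its own column map would pass a copy instead.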
public java.util.Map<java.lang.String,ColumnProfile> getColumns()
public ModelProfile getModelProfile()
public DatasetProfile withTag(java.lang.String key, java.lang.String value)
public DatasetProfile withMetadata(java.lang.String key, java.lang.String value)
public DatasetProfile withAllMetadata(java.util.Map<java.lang.String,java.lang.String> metadata)
public void track(java.lang.String columnName,
java.lang.Object data)
public void track(java.util.Map<java.lang.String,?> columns)
public DatasetProfile withClassificationModel(java.lang.String prediction, java.lang.String target, java.lang.String score, java.lang.Iterable<java.lang.String> additionalOutputFields)
public DatasetProfile withClassificationModel(java.lang.String prediction, java.lang.String target, java.lang.String score)
public DatasetProfile withRegressionModel(java.lang.String prediction, java.lang.String target)
public DatasetProfile withRegressionModel(java.lang.String prediction, java.lang.String target, java.lang.Iterable<java.lang.String> additionalOutputFields)
public com.whylogs.core.message.DatasetSummary toSummary()
public java.util.Iterator<com.whylogs.core.message.MessageSegment> toChunkIterator()
public DatasetProfile mergeStrict(@NonNull DatasetProfile other)
public DatasetProfile merge(@NonNull DatasetProfile other)
Merge the data of another DatasetProfile into this one.
Only the shared tags and shared metadata are retained. The timestamps are copied over from this dataset. It is the responsibility of the user to ensure that the two datasets are matched on important grouping information.
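
As a rough illustration of retaining only the shared tags, the sketch below intersects two tag maps using plain JDK collections. It assumes that a tag counts as shared when both maps hold the same key with an equal value. The page does not spell out the exact rule, and `SharedTagsSketch` and `sharedTags` are hypothetical names, not part of the library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; NOT part of the whylogs API.
final class SharedTagsSketch {

    // Assumption: a tag is "shared" when both maps contain the same key
    // with an equal, non-null value.
    static Map<String, String> sharedTags(Map<String, String> mine,
                                          Map<String, String> theirs) {
        Map<String, String> shared = new HashMap<>();
        for (Map.Entry<String, String> entry : mine.entrySet()) {
            String value = entry.getValue();
            if (value != null && value.equals(theirs.get(entry.getKey()))) {
                shared.put(entry.getKey(), value);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        Map<String, String> mine = new HashMap<>();
        mine.put("env", "prod");
        mine.put("region", "us");

        Map<String, String> theirs = new HashMap<>();
        theirs.put("env", "prod");
        theirs.put("region", "eu");

        // "region" differs between the two maps, so only "env" survives.
        System.out.println(sharedTags(mine, theirs)); // prints {env=prod}
    }
}
```

Under this reading, profiles whose tags disagree lose the conflicting tags after a merge, which is one reason to match the two datasets on grouping information beforehand.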
Parameters:
other - a DatasetProfile
Returns:
a DatasetProfile with summed up columns
public com.whylogs.core.message.DatasetProfileMessage.Builder toProtobuf()
public void writeTo(java.io.OutputStream out)
throws java.io.IOException
Throws:
java.io.IOException
public byte[] toBytes()
throws java.io.IOException
Throws:
java.io.IOException
@Nullable
public static DatasetProfile fromProtobuf(@Nullable com.whylogs.core.message.DatasetProfileMessage message)
public static DatasetProfile parse(java.io.InputStream in)
throws java.io.IOException
Throws:
java.io.IOException