Package no.nav.arxaas.model.risk
Class ReIdentificationRisk
- java.lang.Object
-
- no.nav.arxaas.model.risk.ReIdentificationRisk
-
public class ReIdentificationRisk extends Object
-
-
Field Summary
Fields
Modifier and Type              Field
private AttackerSuccess        attackerSuccessRate
private Map<String,Double>     measures
private String                 populationModel
private List<String>           quasiIdentifiers
private static double          THRESHOLD
-
Constructor Summary
Constructors
Constructor
ReIdentificationRisk(Map<String,Double> measures, AttackerSuccess attackerSuccessRate, List<String> quasiIdentifiers, String populationModel)
-
Method Summary
Modifier and Type, Method, Description

private static double averageProsecutorRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
    Returns the average prosecutor re-identification risk found in the data set, based on the defined population model.
static ReIdentificationRisk create(org.deidentifier.arx.DataHandle data, org.deidentifier.arx.ARXPopulationModel pModel)
boolean equals(Object o)
private static double estimatedJournalistRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
    Returns the estimated journalist re-identification risk found in the data set, based on the defined population model.
private static double estimatedMarketerRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
    Returns the estimated marketer re-identification risk found in the data set, based on the defined population model.
private static double estimatedProsecutorRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
    Returns the estimated prosecutor re-identification risk found in the data set, based on the defined population model.
AttackerSuccess getAttackerSuccessRate()
Map<String,Double> getMeasures()
String getPopulationModel()
List<String> getQuasiIdentifiers()
int hashCode()
private static double highestJournalistRisk(org.deidentifier.arx.risk.RiskModelSampleSummary riskModelSampleSummary)
    Returns the highest journalist re-identification risk found in the data set, based on the defined population model.
private static double highestProsecutorRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
    Returns the highest prosecutor re-identification risk found in the data set, based on the defined population model.
private static double lowestProsecutorRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
    Returns the lowest prosecutor re-identification risk found in the data set, based on the defined population model.
private static org.deidentifier.arx.risk.RiskModelPopulationUniqueness.PopulationUniquenessModel populationUniquenessModel(org.deidentifier.arx.risk.RiskEstimateBuilder builder)
    Returns the model used to estimate population uniqueness, assuming that the data set is a uniform sample of the population.
private static double populationUniques(org.deidentifier.arx.risk.RiskEstimateBuilder builder)
    Returns the amount of unique records/fields in the data set that are also unique within the underlying population model of which the data is a part.
private static List<String> quasiIdentifiers(org.deidentifier.arx.DataHandle data)
    Returns a list of strings containing the names of the fields in the data set whose attribute type is quasi-identifying.
private static double recordsAffectByRisk(org.deidentifier.arx.risk.RiskModelSampleRiskDistribution sampleRiskDistribution, double risk)
    Returns the amount of records/fields that are affected by a specific amount of risk.
private static Map<String,Double> riskMeasures(org.deidentifier.arx.DataHandle data, org.deidentifier.arx.ARXPopulationModel pModel)
private static double sampleUniques(org.deidentifier.arx.risk.RiskModelSampleUniqueness riskModelSampleUniqueness)
    Returns the amount of unique records/fields in the data set.
String toString()
-
-
-
Field Detail
-
THRESHOLD
private static final double THRESHOLD
- See Also:
- Constant Field Values
-
attackerSuccessRate
private final AttackerSuccess attackerSuccessRate
-
populationModel
private final String populationModel
-
-
Method Detail
-
create
public static ReIdentificationRisk create(org.deidentifier.arx.DataHandle data, org.deidentifier.arx.ARXPopulationModel pModel)
-
riskMeasures
private static Map<String,Double> riskMeasures(org.deidentifier.arx.DataHandle data, org.deidentifier.arx.ARXPopulationModel pModel)
-
lowestProsecutorRisk
private static double lowestProsecutorRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
Returns the lowest prosecutor re-identification risk found in the data set, based on the defined population model.
- Parameters:
- riskModelSampleRisks - SampleRisks for the dataset
- Returns:
- lowest prosecutor risk found in the data set
-
recordsAffectByRisk
private static double recordsAffectByRisk(org.deidentifier.arx.risk.RiskModelSampleRiskDistribution sampleRiskDistribution, double risk)
Returns the amount of records/fields that are affected by a specific amount of risk.
- Parameters:
- sampleRiskDistribution - RiskModelSampleRiskDistribution for the dataset
- risk - specific amount of risk that affects one or more records
- Returns:
- records affected by the specified amount of risk
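To illustrate what a "records affected by risk" measure can look like, here is a minimal sketch in plain Java. It does not use ARX and is not the implementation behind recordsAffectByRisk. It assumes that a record's risk is 1 divided by the size of its equivalence class (the usual prosecutor-model definition), that "affected by" means the record's risk equals the given value, and that the result is a fraction of all records. The class name RecordsAtRiskSketch and the method fractionAtRisk are hypothetical.

```java
/** Illustrative only: fraction of records whose risk equals a given value. */
final class RecordsAtRiskSketch {

    /**
     * @param classSizes sizes of the equivalence classes in the data set (assumed input)
     * @param risk       the risk value to look for, e.g. 0.5 for classes of size 2
     * @return fraction of all records whose risk (1 / class size) equals {@code risk}
     */
    static double fractionAtRisk(int[] classSizes, double risk) {
        int total = 0;
        int affected = 0;
        for (int size : classSizes) {
            total += size;
            // Compare with a small tolerance because risks are floating-point values.
            if (Math.abs(1.0 / size - risk) < 1e-9) {
                affected += size;
            }
        }
        return total == 0 ? 0.0 : (double) affected / total;
    }

    public static void main(String[] args) {
        int[] classSizes = {1, 2, 5}; // 8 records in three equivalence classes
        System.out.println(fractionAtRisk(classSizes, 1.0)); // only the singleton record
        System.out.println(fractionAtRisk(classSizes, 0.2)); // the five records in the largest class
    }
}
```

Whether the real method counts records at exactly the given risk or at or below it is not stated in this documentation, so treat the equality test above as one possible reading.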
-
averageProsecutorRisk
private static double averageProsecutorRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
Returns the average prosecutor re-identification risk found in the data set, based on the defined population model.
- Parameters:
- riskModelSampleRisks - SampleRisks for the dataset
- Returns:
- average prosecutor risk found in the data set
-
highestProsecutorRisk
private static double highestProsecutorRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
Returns the highest prosecutor re-identification risk found in the data set, based on the defined population model.
- Parameters:
- riskModelSampleRisks - SampleRisks for the dataset
- Returns:
- highest prosecutor risk found in the data set
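The lowest, average, and highest prosecutor risks can be illustrated with a small plain-Java sketch. It is not the ARX implementation and does not call ARX. It assumes the common prosecutor-model definition in which a record in an equivalence class of size f has risk 1/f. The class name ProsecutorRiskSketch and its methods are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only: prosecutor risk derived from equivalence-class sizes. */
final class ProsecutorRiskSketch {

    /** Lowest risk: records in the largest class are the hardest to single out. */
    static double lowest(int[] classSizes) {
        return 1.0 / Arrays.stream(classSizes).max().getAsInt();
    }

    /** Highest risk: records in the smallest class are the easiest to single out. */
    static double highest(int[] classSizes) {
        return 1.0 / Arrays.stream(classSizes).min().getAsInt();
    }

    /**
     * Average risk over all records. Each class of size f contributes f records
     * with risk 1/f, so the sum of risks equals the number of classes.
     */
    static double average(int[] classSizes) {
        int records = Arrays.stream(classSizes).sum();
        return (double) classSizes.length / records;
    }

    public static void main(String[] args) {
        int[] classSizes = {1, 2, 5}; // 8 records in three equivalence classes
        System.out.println("lowest  = " + lowest(classSizes));
        System.out.println("average = " + average(classSizes));
        System.out.println("highest = " + highest(classSizes));
    }
}
```

For the class sizes 1, 2, and 5, the lowest risk is 1/5, the highest is 1/1, and the average is 3 classes over 8 records.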
-
estimatedProsecutorRisk
private static double estimatedProsecutorRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
Returns the estimated prosecutor re-identification risk found in the data set, based on the defined population model.
- Parameters:
- riskModelSampleRisks - SampleRisks for the dataset
- Returns:
- estimated prosecutor risk found in the data set
-
highestJournalistRisk
private static double highestJournalistRisk(org.deidentifier.arx.risk.RiskModelSampleSummary riskModelSampleSummary)
Returns the highest journalist re-identification risk found in the data set, based on the defined population model.
- Parameters:
- riskModelSampleSummary - summary of the dataset risks
- Returns:
- highest journalist risk found in the data set
-
estimatedJournalistRisk
private static double estimatedJournalistRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
Returns the estimated journalist re-identification risk found in the data set, based on the defined population model.
- Parameters:
- riskModelSampleRisks - SampleRisks for the dataset
- Returns:
- estimated journalist risk found in the data set
-
estimatedMarketerRisk
private static double estimatedMarketerRisk(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
Returns the estimated marketer re-identification risk found in the data set, based on the defined population model.
- Parameters:
- riskModelSampleRisks - SampleRisks for the dataset
- Returns:
- estimated marketer risk found in the data set
-
sampleUniques
private static double sampleUniques(org.deidentifier.arx.risk.RiskModelSampleUniqueness riskModelSampleUniqueness)
Returns the amount of unique records/fields in the data set.
- Parameters:
- riskModelSampleUniqueness - RiskModelSampleUniqueness for the dataset
- Returns:
- amount of unique records/fields found in the data set
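As a rough illustration of sample uniqueness, the sketch below groups records by their quasi-identifier values and reports the share of records that sit alone in their group. It is plain Java, not ARX code, and it assumes the "amount" is expressed as a fraction of all records, which this documentation does not confirm. The class name SampleUniquesSketch and the method fractionUnique are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Illustrative only: share of records that are unique on their quasi-identifiers. */
final class SampleUniquesSketch {

    /**
     * @param rows one list of quasi-identifier values per record (assumed input shape)
     * @return fraction of records whose combination of values occurs exactly once
     */
    static double fractionUnique(List<List<String>> rows) {
        if (rows.isEmpty()) {
            return 0.0;
        }
        // Group identical value combinations and count how often each occurs.
        Map<List<String>, Long> classSizes = rows.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        // A class of size 1 holds exactly one unique record.
        long uniques = classSizes.values().stream().filter(n -> n == 1L).count();
        return (double) uniques / rows.size();
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("34", "M"),
                Arrays.asList("34", "M"),
                Arrays.asList("51", "F"),
                Arrays.asList("29", "F"));
        System.out.println(fractionUnique(rows)); // two of the four records are unique
    }
}
```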
-
populationUniques
private static double populationUniques(org.deidentifier.arx.risk.RiskEstimateBuilder builder)
Returns the amount of unique records/fields in the data set that are also unique within the underlying population model of which the data is a part.
- Parameters:
- builder - RiskEstimateBuilder for the dataset
- Returns:
- amount of unique records/fields found in the data set that are also unique in the population model
-
populationUniquenessModel
private static org.deidentifier.arx.risk.RiskModelPopulationUniqueness.PopulationUniquenessModel populationUniquenessModel(org.deidentifier.arx.risk.RiskEstimateBuilder builder)
Returns the model used to estimate population uniqueness, assuming that the data set is a uniform sample of the population.
- Parameters:
- builder - RiskEstimateBuilder for the dataset
- Returns:
- PopulationUniquenessModel for the dataset
-
quasiIdentifiers
private static List<String> quasiIdentifiers(org.deidentifier.arx.DataHandle data)
Returns a list of strings containing the names of the fields in the data set whose attribute type is quasi-identifying.
- Parameters:
- data - tabular data set to be analysed against re-identification risk
- Returns:
- list of strings containing the quasi-identifying field names
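The idea of selecting quasi-identifying fields can be sketched without ARX as a filter over a map from field name to attribute kind. This is an assumption-laden illustration, not the code behind quasiIdentifiers: the Kind enum, the class QuasiIdentifierSketch, and the map-based input are all hypothetical stand-ins for the attribute types held by a DataHandle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: pick the field names marked as quasi-identifying. */
final class QuasiIdentifierSketch {

    /** Hypothetical stand-in for the attribute types of a data set. */
    enum Kind { IDENTIFYING, QUASI_IDENTIFYING, SENSITIVE, INSENSITIVE }

    /** Returns the names of all fields whose kind is QUASI_IDENTIFYING, in map order. */
    static List<String> quasiIdentifiers(Map<String, Kind> attributeKinds) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Kind> entry : attributeKinds.entrySet()) {
            if (entry.getValue() == Kind.QUASI_IDENTIFYING) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, Kind> kinds = new LinkedHashMap<>();
        kinds.put("name", Kind.IDENTIFYING);
        kinds.put("age", Kind.QUASI_IDENTIFYING);
        kinds.put("zipcode", Kind.QUASI_IDENTIFYING);
        kinds.put("diagnosis", Kind.SENSITIVE);
        System.out.println(quasiIdentifiers(kinds));
    }
}
```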
-
getAttackerSuccessRate
public AttackerSuccess getAttackerSuccessRate()
-
getPopulationModel
public String getPopulationModel()
-
-