Class LicenseCompareHelper

java.lang.Object
org.spdx.utility.compare.LicenseCompareHelper

public class LicenseCompareHelper extends Object
Primarily a static class of helper functions for comparing two SPDX licenses
Author:
Gary O'Neall
  • Field Details

    • TOKEN_SPLIT_REGEX

      protected static final String TOKEN_SPLIT_REGEX
    • TOKEN_SPLIT_PATTERN

      protected static final Pattern TOKEN_SPLIT_PATTERN
    • PUNCTUATION

      protected static final Set<String> PUNCTUATION
    • SKIPPABLE_TOKENS

      protected static final Set<String> SKIPPABLE_TOKENS
    • NORMALIZE_TOKENS

      protected static final Map<String,String> NORMALIZE_TOKENS
    • CROSS_REF_NUM_WORDS_MATCH

      protected static final Integer CROSS_REF_NUM_WORDS_MATCH
    • REGEX_QUANTIFIER_PATTERN

      protected static final Pattern REGEX_QUANTIFIER_PATTERN
  • Constructor Details

    • LicenseCompareHelper

      public LicenseCompareHelper()
  • Method Details

    • isLicenseTextEquivalent

      public static boolean isLicenseTextEquivalent(String licenseTextA, String licenseTextB)
      Returns true if two license texts are considered a match per the SPDX License matching guidelines documented at spdx.org (currently http://spdx.org/wiki/spdx-license-list-match-guidelines). Two features are unimplemented: bullets/numbering are not considered, and comments with no whitespace separating them from the text are not skipped.
      Parameters:
      licenseTextA - first license text to compare
      licenseTextB - second license text to compare
      Returns:
      true if the two license texts are considered a match
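      As a rough illustration of the kind of normalization the matching guidelines call for, the sketch below treats differences in letter case and whitespace as insignificant. It is not this method's implementation: EquivalenceSketch, canonical, and roughlyEquivalent are hypothetical names, and the real guidelines cover many more cases (punctuation, equivalent words, template variables).

```java
import java.util.Locale;

final class EquivalenceSketch {
    // Collapse every run of whitespace to a single space and lowercase the result.
    static String canonical(String text) {
        return text.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    // Hypothetical helper: true when the two texts differ only in case or whitespace.
    static boolean roughlyEquivalent(String a, String b) {
        return canonical(a).equals(canonical(b));
    }

    public static void main(String[] args) {
        System.out.println(roughlyEquivalent(
                "Permission is\n  hereby GRANTED",
                "permission is hereby granted"));
    }
}
```

      For real comparisons, call isLicenseTextEquivalent(licenseTextA, licenseTextB) rather than a hand-rolled check like this one.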
    • removeLineSeparators

      public static String removeLineSeparators(String s)
      Parameters:
      s - Input string
      Returns:
      s without any line separators (---, ***, ===)
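      A minimal sketch of what stripping such separators could look like, assuming a separator is a run of three or more '-', '*', or '=' characters. The class and method names are hypothetical, and the library's actual pattern may differ.

```java
import java.util.regex.Pattern;

final class LineSeparatorSketch {
    // Assumption: a separator is three or more repeats of '-', '*', or '='.
    private static final Pattern SEPARATOR = Pattern.compile("-{3,}|\\*{3,}|={3,}");

    // Removes every separator run and leaves all other characters untouched.
    static String stripSeparators(String s) {
        return SEPARATOR.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripSeparators("MIT License\n=====\nPermission is hereby granted"));
    }
}
```

      Shorter runs, such as a single hyphen inside a word, are left alone under this assumption.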
    • removeCommentChars

      public static String removeCommentChars(String s)
      Removes common comment characters from either a template or a license text string
      Parameters:
      s - Input string
      Returns:
      s without the common comment characters
    • normalizeText

      public static String normalizeText(String s)
      Normalize quotes and no-break spaces
      Parameters:
      s - String to normalize
      Returns:
      String normalized for comparison
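      To make the idea concrete, here is a small sketch that maps typographic quotes to their ASCII forms and no-break spaces to ordinary spaces. NormalizeSketch is a hypothetical class, and the exact set of characters the library normalizes is an assumption here.

```java
final class NormalizeSketch {
    // Maps curly quotes to straight quotes and no-break spaces to plain spaces.
    static String normalize(String s) {
        return s
                .replace('\u201C', '"')   // left double quotation mark
                .replace('\u201D', '"')   // right double quotation mark
                .replace('\u2018', '\'')  // left single quotation mark
                .replace('\u2019', '\'')  // right single quotation mark
                .replace('\u00A0', ' ');  // no-break space
    }

    public static void main(String[] args) {
        System.out.println(normalize("provided \u201CAS IS\u201D\u00A0without warranty"));
    }
}
```

      Normalizing before tokenizing means that texts copied from word processors or web pages compare the same as plain ASCII originals.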
    • locateOriginalText

      public static String locateOriginalText(String fullLicenseText, int startToken, int endToken, Map<Integer,LineColumn> tokenToLocation, String[] tokens)
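      Locates the original text starting with the start token and ending with the end token
      Parameters:
      fullLicenseText - the full original license text
      startToken - index of the first token of the text to locate
      endToken - index of the last token of the text to locate
      tokenToLocation - location of each token by line and column
      tokens - the tokens of the license text
      Returns:
      the original text spanning the start token through the end token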
    • tokenizeLicenseText

      public static String[] tokenizeLicenseText(String licenseText, Map<Integer,LineColumn> tokenToLocation)
      Tokenizes the license text, normalizes quotes, lowercases, and converts multi-word tokens for better equivalence comparisons
      Parameters:
      licenseText - text to tokenize
      tokenToLocation - location for all of the tokens by line and column
      Returns:
      tokens
    • getFirstLicenseToken

      public static String getFirstLicenseToken(String text)
      Parameters:
      text - license text
      Returns:
      the first token in the license text
    • isSingleTokenString

      public static boolean isSingleTokenString(String text)
      Parameters:
      text - text to check
      Returns:
      true if the text contains a single token
    • isLicenseEqual

      public static boolean isLicenseEqual(AnyLicenseInfo license1, AnyLicenseInfo license2, Map<String,String> xlationMap) throws SpdxCompareException, InvalidSPDXAnalysisException
      Compares two licenses from potentially two different documents, which may have different license IDs for the same license
      Parameters:
      license1 - first license to compare
      license2 - second license to compare
      xlationMap - Mapping of the license IDs from license 1 to license 2
      Returns:
      true if the licenses are equal
      Throws:
      SpdxCompareException
      InvalidSPDXAnalysisException
    • getNonOptionalLicenseText

      @Deprecated public static List<String> getNonOptionalLicenseText(String licenseTemplate, boolean includeVarText) throws SpdxCompareException
      Deprecated.
      The TemplateRegexMatcher class should be used in place of this method. This method will be removed in the next major release.
      Get the text of a license minus any optional text. Note: this includes the default variable text.
      Parameters:
      licenseTemplate - license template containing optional and var tags
      includeVarText - if true, include the default variable text; if false remove the variable text
      Returns:
      list of strings for all non-optional license text.
      Throws:
      SpdxCompareException
    • getNonOptionalLicenseText

      public static List<String> getNonOptionalLicenseText(String licenseTemplate, FilterTemplateOutputHandler.VarTextHandling varTextHandling) throws SpdxCompareException
      Get the text of a license minus any optional text
      Parameters:
      licenseTemplate - license template containing optional and var tags
      varTextHandling - include original, exclude, or include the regex (enclosed with "~~~") for "var" text
      Returns:
      list of strings for all non-optional license text.
      Throws:
      SpdxCompareException
    • getNonOptionalLicenseText

      public static List<String> getNonOptionalLicenseText(String licenseTemplate, FilterTemplateOutputHandler.VarTextHandling varTextHandling, FilterTemplateOutputHandler.OptionalTextHandling optionalTextHandling) throws SpdxCompareException
      Get the text of a license converting variable and optional text according to the options
      Parameters:
      licenseTemplate - license template containing optional and var tags
      varTextHandling - include original, exclude, or include the regex (enclosed with "~~~") for "var" text
      optionalTextHandling - include optional text, exclude, or include a regex for the optional text
      Returns:
      list of strings for all non-optional license text.
      Throws:
      SpdxCompareException
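      The three overloads above work on license templates that mark optional passages with tags. As a simplified illustration of removing optional text, the sketch below assumes plain <<beginOptional>> and <<endOptional>> tags with no attributes and no nesting. OptionalTextSketch and dropOptional are hypothetical names, and unlike the methods above this returns a single String and leaves "var" tags untouched.

```java
import java.util.regex.Pattern;

final class OptionalTextSketch {
    // Assumption: optional blocks are not nested and the tags carry no attributes.
    // DOTALL lets an optional block span several lines; .*? keeps each match minimal.
    private static final Pattern OPTIONAL_BLOCK =
            Pattern.compile("<<beginOptional>>.*?<<endOptional>>", Pattern.DOTALL);

    // Deletes each optional block together with its surrounding tags.
    static String dropOptional(String template) {
        return OPTIONAL_BLOCK.matcher(template).replaceAll("");
    }

    public static void main(String[] args) {
        String template = "Copyright notice. "
                + "<<beginOptional>>All rights reserved. <<endOptional>>"
                + "Permission is granted.";
        System.out.println(dropOptional(template));
    }
}
```

      A regex cannot track nested optional blocks, which is one reason to rely on getNonOptionalLicenseText for real templates.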
    • nonOptionalTextToPatterns

      @Deprecated public static org.apache.commons.lang3.tuple.Pair<Pattern,Pattern> nonOptionalTextToPatterns(List<String> nonOptionalText, int numberOfWords)
      Deprecated.
      The TemplateRegexMatcher class should be used in place of this method. This method will be removed in the next major release.
      Creates a regular expression pattern to match the start of a license text.
      Parameters:
      nonOptionalText - List of strings of non-optional text from the license template (see List<String> getNonOptionalLicenseText)
      numberOfWords - Number of words to use in the match
      Returns:
      A pair of Patterns: the first matches the start of the license text, and the second matches the end of the license text
    • isTextMatchingTemplate

      public static CompareTemplateOutputHandler.DifferenceDescription isTextMatchingTemplate(String template, String compareText) throws SpdxCompareException, InvalidSPDXAnalysisException
      Parameters:
      template - Template in the standard template format used for comparison
      compareText - Text to compare using the template
      Returns:
      any differences found
      Throws:
      SpdxCompareException
      InvalidSPDXAnalysisException
    • isTextStandardLicense

      public static CompareTemplateOutputHandler.DifferenceDescription isTextStandardLicense(License license, String compareText) throws SpdxCompareException, InvalidSPDXAnalysisException
      Compares license text to the license text of an SPDX Standard License
      Parameters:
      license - SPDX Standard License to compare
      compareText - Text to compare to the standard license
      Returns:
      any differences found
      Throws:
      SpdxCompareException
      InvalidSPDXAnalysisException
    • isTextStandardException

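      public static CompareTemplateOutputHandler.DifferenceDescription isTextStandardException(ListedLicenseException exception, String compareText) throws SpdxCompareException, InvalidSPDXAnalysisException
      Compares exception text to the exception text of an SPDX Standard exception
      Parameters:
      exception - SPDX Standard exception to compare
      compareText - Text to compare to the standard exception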
      Returns:
      any differences found
      Throws:
      SpdxCompareException
      InvalidSPDXAnalysisException
    • isStandardLicenseWithinText

      public static boolean isStandardLicenseWithinText(String text, SpdxListedLicense license)
      Detect if a text contains the standard license (perhaps along with other text before and/or after)
      Parameters:
      text - The text to search within (should not be null)
      license - The standard SPDX license to search for (should not be null)
      Returns:
      True if the license is found within the text, false otherwise (or if either argument is null)
    • isStandardLicenseExceptionWithinText

      public static boolean isStandardLicenseExceptionWithinText(String text, ListedLicenseException exception)
      Detect if a text contains the standard license exception (perhaps along with other text before and/or after)
      Parameters:
      text - The text to search within (should not be null)
      exception - The standard SPDX license exception to search for (should not be null)
      Returns:
      True if the license exception is found within the text, false otherwise (or if either argument is null)
    • matchingStandardLicenseIds

      public static String[] matchingStandardLicenseIds(String licenseText) throws InvalidSPDXAnalysisException, SpdxCompareException
      Returns an array of SPDX Standard License IDs that match the text provided using the SPDX matching guidelines.
      Parameters:
      licenseText - Text to compare to the standard license texts
      Returns:
      Array of SPDX standard license IDs that match
      Throws:
      InvalidSPDXAnalysisException - If an error occurs accessing the standard licenses
      SpdxCompareException - If an error occurs in the comparison
    • matchingStandardLicenseIdsWithinText

      public static List<String> matchingStandardLicenseIdsWithinText(String text, List<String> licenseIds) throws InvalidSPDXAnalysisException, SpdxCompareException
      Returns a list of SPDX Standard License IDs from the provided list that were found within the text, using the SPDX matching guidelines.
      Parameters:
      text - Text to compare to
      licenseIds - License IDs to compare against
      Returns:
      List of SPDX standard license IDs from licenseIds that match
      Throws:
      InvalidSPDXAnalysisException - If an error occurs accessing the standard licenses
      SpdxCompareException - If an error occurs in the comparison
    • matchingStandardLicenseIdsWithinText

      public static List<String> matchingStandardLicenseIdsWithinText(String text) throws InvalidSPDXAnalysisException, SpdxCompareException
      Returns a list of SPDX Standard License IDs that were found within the text, using the SPDX matching guidelines.
      Parameters:
      text - Text to compare to all of the standard licenses
      Returns:
      List of SPDX standard license IDs that match
      Throws:
      InvalidSPDXAnalysisException - If an error occurs accessing the standard licenses
      SpdxCompareException - If an error occurs in the comparison
    • matchingStandardLicenseExceptionIdsWithinText

      public static List<String> matchingStandardLicenseExceptionIdsWithinText(String text, List<String> licenseExceptionIds) throws InvalidSPDXAnalysisException, SpdxCompareException
      Returns a list of SPDX Standard License Exception IDs from the provided list that were found within the text, using the SPDX matching guidelines.
      Parameters:
      text - Text to compare to
      licenseExceptionIds - License exception IDs to compare against
      Returns:
      List of SPDX standard license exception IDs from licenseExceptionIds that match
      Throws:
      InvalidSPDXAnalysisException - If an error occurs accessing the standard license exceptions
      SpdxCompareException - If an error occurs in the comparison
    • matchingStandardLicenseExceptionIdsWithinText

      public static List<String> matchingStandardLicenseExceptionIdsWithinText(String text) throws InvalidSPDXAnalysisException, SpdxCompareException
      Returns a list of SPDX Standard License Exception IDs that were found within the text, using the SPDX matching guidelines.
      Parameters:
      text - Text to compare to all of the standard license exceptions
      Returns:
      List of SPDX standard license exception IDs that match
      Throws:
      InvalidSPDXAnalysisException - If an error occurs accessing the standard license exceptions
      SpdxCompareException - If an error occurs in the comparison
    • isLicensePassBlackList

      public static boolean isLicensePassBlackList(AnyLicenseInfo license, String... blackList) throws InvalidSPDXAnalysisException
      Detects whether a license passes the black list
      Parameters:
      license - license to check
      blackList - license black list
      Returns:
      true if the license passes the black list
      Throws:
      InvalidSPDXAnalysisException
    • isLicensePassWhiteList

      public static boolean isLicensePassWhiteList(AnyLicenseInfo license, String... whiteList) throws InvalidSPDXAnalysisException
      Detects whether a license passes the white list
      Parameters:
      license - license to check
      whiteList - license white list
      Returns:
      true if the license passes the white list
      Throws:
      InvalidSPDXAnalysisException