Class MergeTools

java.lang.Object
com.ssgllc.fish.service.util.published.MergeTools

@Component public class MergeTools extends Object
  • Method Summary

    Modifier and Type
    Method
    Description
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    Double
    calculateMatchScore(DtoT a, DtoT b, String matchExpr)
    Find the match score between two entities provided as DTOs, without attempting to merge them.
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity>
    Double
    calculateMatchScore(EntityT a, EntityT b)
    Find the match score between two domain entities without attempting to merge them, using the configured match expression for the entity type.
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity>
    Double
    calculateMatchScore(EntityT a, EntityT b, String matchExpr)
    Find the match score between two domain entities without attempting to merge them.
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMergeable<EntityT, ?>, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    com.ssgllc.fish.service.dto.CasetivityMergeDTO<?>
    dryRunMerge(DtoT a, DtoT b)
    Dry run the result of merging two transient (not-yet-persisted) entities provided as DTOs.
    static com.ssgllc.fish.service.dto.CasetivityMergeDTO<?>
    dryRunMerge(String entityType, String entityId, String matchId)
    Dry run the result of merging two already-persisted entities by their IDs.
    static String
    getBlockingSetQuery(String entityName)
    Get the blocking set query configured for the given entity type.
    static Map<String,Double>
    getDedupeFeatures(com.ssgllc.fish.domain.CasetivityPerson a, com.ssgllc.fish.domain.CasetivityPerson b)
    Compute the named feature scores used by the ML match model for two person entities.
    static <T extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    com.ssgllc.fish.dedupe.base.DedupeGroupDTO<T>
    getMatches(T dto)
    Find the existing entities matching a proposed one, using the existing dedupe configuration.
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    com.ssgllc.fish.dedupe.base.DedupeGroupDTO<DtoT>
    getMatchesForBlockQueryAndMatchExpr(DtoT dto, String matchExprOverride, String blockQuery)
    Find the existing entities matching a proposed one, using a custom match expression and blocking set query. Prefer getMatchesInBlockingSetForMatchExpr, which takes a blocking List, to avoid putting raw SQL in your scripts.
    static <T extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    com.ssgllc.fish.dedupe.base.DedupeGroupDTO<T>
    getMatchesForMatchExpr(T dto, String matchExprOverride)
    Find the existing entities matching a proposed one, using a custom match expression.
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    com.ssgllc.fish.dedupe.base.DedupeGroupDTO<DtoT>
    getMatchesInBlockingSet(DtoT dto, List<DtoT> blockingSet)
    Find matches between a proposed entity and a given set, using the configured match expression.
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    com.ssgllc.fish.dedupe.base.DedupeGroupDTO<DtoT>
    getMatchesInBlockingSetForMatchExpr(DtoT dto, List<DtoT> blockingSet, String matchExprOverride)
    Find matches between a proposed entity and a given set, using a custom match expression.
    static String
    getMatchExpression(String entityName)
    Get the match expression configured for the given entity type.
    static double
    getMatchThreshold(String entityName)
    Get the match score threshold configured for the given entity type.
    static double
    getNotMatchThreshold(String entityName)
    Get the not-match score threshold configured for the given entity type.
    static <DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    boolean
    markEntitiesDontMatch(DtoT dto, DtoT other)
    If the two entities are possible matches, marks them as not matching.
    static boolean
    markEntitiesDontMatch(String entityType, String entityId, String otherId)
    If the two entities are possible matches, marks them as not matching.
    static <MergeT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMrg<EntityT>, EntityT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMergeable<EntityT, MergeT>>
    boolean
    markEntitiesDontMatch(String entityType, UUID entityId, UUID otherId)
    If the two entities are possible matches, marks them as not matching.
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMergeable<EntityT, ?>, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    DtoT
    merge(DtoT a, DtoT b)
    Merge two entities provided as DTOs.
    static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMergeable<EntityT, ?>, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO>
    DtoT
    merge(String entityType, String entityId, String matchId)
    Merge two already-persisted entities by their IDs.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • dryRunMerge

      public static com.ssgllc.fish.service.dto.CasetivityMergeDTO<?> dryRunMerge(String entityType, String entityId, String matchId) throws ClassNotFoundException
      Dry run the result of merging two already-persisted entities by their IDs.

      This is the preferred overload for entities that already exist in the database. Both entities are loaded directly by ID without saving either one, so neither entity's lastModifiedDate is modified. Completeness scoring uses lastModifiedDate to break ties between otherwise-equal records (the more-recently-modified entity wins), so this overload gives stable, deterministic results.

      For entities that have not yet been persisted, use dryRunMerge(CasetivityEntityDTO, CasetivityEntityDTO) instead.

      Parameters:
      entityType - the entity type name (e.g. "Person")
      entityId - the ID of the first entity
      matchId - the ID of the second entity
      Returns:
      the merge result; .getResult() gives the post-merge entity state
      Throws:
      ClassNotFoundException - if entityType does not correspond to a known entity type
      IllegalArgumentException - if the entity type is not configured for merging
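      The tiebreak described above can be pictured with a minimal Java sketch. This is not the library's implementation: the Candidate class, its integer completeness field, and the preferred helper are hypothetical names introduced only for illustration. The sketch assumes that the more complete record wins and that lastModifiedDate only decides ties.

```java
import java.time.Instant;
import java.util.Comparator;

final class TieBreakSketch {
    // Hypothetical stand-in for an entity; not a Casetivity class.
    static final class Candidate {
        final String id;
        final int completeness;          // assumption: higher means more complete
        final Instant lastModifiedDate;

        Candidate(String id, int completeness, Instant lastModifiedDate) {
            this.id = id;
            this.completeness = completeness;
            this.lastModifiedDate = lastModifiedDate;
        }
    }

    // Completeness is compared first; the later lastModifiedDate only breaks ties.
    static final Comparator<Candidate> PREFERENCE =
            Comparator.<Candidate>comparingInt(c -> c.completeness)
                      .thenComparing(c -> c.lastModifiedDate);

    // Returns the preferred candidate; on a full tie the first argument is returned.
    static Candidate preferred(Candidate a, Candidate b) {
        return PREFERENCE.compare(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        Instant earlier = Instant.parse("2024-01-01T00:00:00Z");
        Instant later = Instant.parse("2024-06-01T00:00:00Z");

        Candidate a = new Candidate("A", 5, earlier);
        Candidate b = new Candidate("B", 5, later);
        // Distinct timestamps: the result does not depend on argument order.
        System.out.println(preferred(a, b).id); // B
        System.out.println(preferred(b, a).id); // B

        // Identical timestamps (as if both records were saved in one transaction):
        // the outcome now depends only on argument order.
        Candidate c = new Candidate("C", 5, later);
        System.out.println(preferred(b, c).id); // B
        System.out.println(preferred(c, b).id); // C
    }
}
```

      The last two calls show why converged timestamps matter: once the tiebreaker has nothing to compare, the outcome is decided by something incidental, here the argument order.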
    • dryRunMerge

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMergeable<EntityT, ?>, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> com.ssgllc.fish.service.dto.CasetivityMergeDTO<?> dryRunMerge(DtoT a, DtoT b) throws ClassNotFoundException
      Dry run the result of merging two transient (not-yet-persisted) entities provided as DTOs. No changes are permanently committed.

      This overload is intended for entities that have not yet been saved to the database. Both entities are saved inside the dry-run transaction to assign them IDs, then the merge test runs. Because the transaction rolls back when the dry run completes, neither save is permanently committed — but both entities receive an updated lastModifiedDate during the transaction. Since completeness scoring uses lastModifiedDate to break ties (the more-recently-modified entity wins), calling this overload on already-persisted entities can cause both timestamps to converge and make the tiebreaker non-deterministic.

      For entities that are already persisted, prefer dryRunMerge(String, String, String): it bypasses the save step entirely and so leaves lastModifiedDate, which the merge logic uses to break ties, untouched.

      Parameters:
      a - first entity DTO to merge
      b - second entity DTO to merge
      Returns:
      the merge result; .getResult() gives the post-merge entity state
      Throws:
      ClassNotFoundException - if the entity type of the arguments isn't mergeable
    • merge

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMergeable<EntityT, ?>, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> DtoT merge(String entityType, String entityId, String matchId) throws ClassNotFoundException
      Merge two already-persisted entities by their IDs.

      This is the preferred overload for entities that already exist in the database. Both entities are loaded directly by ID without saving either one, so neither entity's lastModifiedDate is modified. Completeness scoring uses lastModifiedDate to break ties between otherwise-equal records (the more-recently-modified entity wins), so this overload gives stable, deterministic results.

      For entities that have not yet been persisted, use merge(CasetivityEntityDTO, CasetivityEntityDTO) instead.

      Parameters:
      entityType - the entity type name (e.g. "Person")
      entityId - the ID of the first entity
      matchId - the ID of the second entity
      Returns:
      the merge result DTO for the surviving entity
      Throws:
      ClassNotFoundException - if entityType does not correspond to a known entity type
      IllegalArgumentException - if the entity type is not configured for merging
    • merge

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMergeable<EntityT, ?>, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> DtoT merge(DtoT a, DtoT b) throws ClassNotFoundException
      Merge two entities provided as DTOs.

      This overload is intended for transient entities that have not yet been saved to the database. For already-persisted entities, prefer merge(String, String, String) to avoid modifying lastModifiedDate, which affects which entity's field values are used in the merged result.

      Parameters:
      a - first entity to merge
      b - second entity to merge
      Returns:
      the merge result DTO for the surviving entity
      Throws:
      ClassNotFoundException - if the entity type of the arguments isn't mergeable
    • getMatches

      public static <T extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> com.ssgllc.fish.dedupe.base.DedupeGroupDTO<T> getMatches(T dto) throws ClassNotFoundException
      Find the existing entities matching a proposed one, using the existing dedupe configuration.
      Parameters:
      dto - entity to match
      Returns:
      result object containing matches, potential matches, and match scores
      Throws:
      ClassNotFoundException - if the given entity's type isn't mergeable
    • getMatchesForMatchExpr

      public static <T extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> com.ssgllc.fish.dedupe.base.DedupeGroupDTO<T> getMatchesForMatchExpr(T dto, String matchExprOverride) throws ClassNotFoundException
      Find the existing entities matching a proposed one, using a custom match expression.
      Parameters:
      dto - entity to match
      matchExprOverride - a SpEL-based match expression to use for this operation; if null, the match expression configured for the entity type is used
      Returns:
      result object containing matches, potential matches, and match scores
      Throws:
      ClassNotFoundException - if the given entity's type isn't mergeable
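      The null-fallback rule for matchExprOverride can be sketched in Java as follows. The resolve helper and the placeholder strings are hypothetical and are not part of MergeTools; the sketch only illustrates the documented behavior that a null override falls back to the configured expression.

```java
final class MatchExprOverrideSketch {
    // Returns the override when one is supplied, otherwise the configured expression.
    static String resolve(String matchExprOverride, String configuredExpr) {
        return matchExprOverride != null ? matchExprOverride : configuredExpr;
    }

    public static void main(String[] args) {
        // Placeholders only; real match expressions are SpEL-based strings.
        String configured = "<configured expression>";
        System.out.println(resolve(null, configured));
        System.out.println(resolve("<custom expression>", configured));
    }
}
```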
    • getMatchesForBlockQueryAndMatchExpr

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> com.ssgllc.fish.dedupe.base.DedupeGroupDTO<DtoT> getMatchesForBlockQueryAndMatchExpr(DtoT dto, String matchExprOverride, String blockQuery) throws ClassNotFoundException
      Find the existing entities matching a proposed one, using a custom match expression and blocking set query. Prefer getMatchesInBlockingSetForMatchExpr, which takes a blocking List, to avoid putting raw SQL in your scripts.
      Parameters:
      dto - entity to match
      matchExprOverride - a SpEL-based match expression to use for this operation; if null, the match expression configured for the entity type is used
      blockQuery - String containing SQL query used to select which entities are considered for matching
      Returns:
      result object containing matches, potential matches, and match scores
      Throws:
      ClassNotFoundException - if the given entity's type isn't mergeable
    • getMatchesInBlockingSet

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> com.ssgllc.fish.dedupe.base.DedupeGroupDTO<DtoT> getMatchesInBlockingSet(DtoT dto, List<DtoT> blockingSet) throws ClassNotFoundException
      Find matches between a proposed entity and a given set, using the configured match expression.
      Parameters:
      dto - entity to match
      blockingSet - list of entities to compare against; they may, but need not, exist in the database
      Returns:
      result object containing matches, potential matches, and match scores
      Throws:
      ClassNotFoundException - if the given entity's type isn't mergeable
    • getMatchesInBlockingSetForMatchExpr

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> com.ssgllc.fish.dedupe.base.DedupeGroupDTO<DtoT> getMatchesInBlockingSetForMatchExpr(DtoT dto, List<DtoT> blockingSet, String matchExprOverride) throws ClassNotFoundException
      Find matches between a proposed entity and a given set, using a custom match expression.
      Parameters:
      dto - entity to match
      blockingSet - list of entities to compare against; they may, but need not, exist in the database
      matchExprOverride - a SpEL-based match expression to use for this operation; if null, the match expression configured for the entity type is used
      Returns:
      result object containing matches, potential matches, and match scores
      Throws:
      ClassNotFoundException - if the given entity's type isn't mergeable
    • getMatchExpression

      public static String getMatchExpression(String entityName)
      Get the match expression configured for the given entity type. The expression yields the match score for a pair of entities; scores at or above the match threshold (see getMatchThreshold) indicate an automatic match.
      Parameters:
      entityName - the entity type name (e.g. "Person")
      Returns:
      the SPEL-based match expression, or the system default if none is configured
    • getMatchThreshold

      public static double getMatchThreshold(String entityName)
      Get the match score threshold configured for the given entity type. Entities whose match score meets or exceeds this value are considered automatic matches.
      Parameters:
      entityName - the entity type name (e.g. "Person")
      Returns:
      the match threshold, or the system default if none is configured
    • getNotMatchThreshold

      public static double getNotMatchThreshold(String entityName)
      Get the not-match score threshold configured for the given entity type. Entities whose match score falls at or below this value are considered definite non-matches. Scores between this value and the match threshold are treated as possible matches.
      Parameters:
      entityName - the entity type name (e.g. "Person")
      Returns:
      the not-match threshold, or the system default if none is configured
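      The three score bands described by getMatchThreshold and getNotMatchThreshold can be sketched in Java as below. The Band enum, the classify helper, and the threshold values 0.9 and 0.4 are hypothetical, chosen only for illustration; in a script the thresholds would come from the two getters.

```java
final class MatchBandSketch {
    enum Band { MATCH, POSSIBLE_MATCH, NOT_MATCH }

    // At or above matchThreshold: automatic match.
    // At or below notMatchThreshold: definite non-match.
    // Anything in between: possible match.
    static Band classify(double score, double matchThreshold, double notMatchThreshold) {
        if (score >= matchThreshold) {
            return Band.MATCH;
        }
        if (score <= notMatchThreshold) {
            return Band.NOT_MATCH;
        }
        return Band.POSSIBLE_MATCH;
    }

    public static void main(String[] args) {
        double matchThreshold = 0.9;    // hypothetical value
        double notMatchThreshold = 0.4; // hypothetical value

        System.out.println(classify(0.95, matchThreshold, notMatchThreshold)); // MATCH
        System.out.println(classify(0.90, matchThreshold, notMatchThreshold)); // MATCH
        System.out.println(classify(0.60, matchThreshold, notMatchThreshold)); // POSSIBLE_MATCH
        System.out.println(classify(0.40, matchThreshold, notMatchThreshold)); // NOT_MATCH
    }
}
```

      Both boundaries are inclusive, following the "meets or exceeds" and "at or below" wording above.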
    • getBlockingSetQuery

      public static String getBlockingSetQuery(String entityName)
      Get the blocking set query configured for the given entity type. The blocking set query is a SQL query used to select candidate entities for dedupe comparison.
      Parameters:
      entityName - the entity type name (e.g. "Person")
      Returns:
      the blocking set query string, or null if none is configured
    • calculateMatchScore

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity, DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> Double calculateMatchScore(DtoT a, DtoT b, String matchExpr) throws ClassNotFoundException
      Find the match score between two entities provided as DTOs, without attempting to merge them.
      Parameters:
      a - first entity to compare
      b - second entity to compare
      matchExpr - match expression to use
      Returns:
      the numeric match score
      Throws:
      ClassNotFoundException - if the entity type of the arguments isn't mergeable
    • calculateMatchScore

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity> Double calculateMatchScore(EntityT a, EntityT b)
      Find the match score between two domain entities without attempting to merge them, using the configured match expression for the entity type.
      Groovy example:
        def a = new Person().firstName("Jane").lastName("Doe")
        def b = new Person().firstName("Jane").lastName("Doe")
        def score = mergeTools.calculateMatchScore(a, b)
      The result is a Double score between 0.0 (no match) and 1.0 (exact match).
      Parameters:
      a - first entity to compare
      b - second entity to compare
      Returns:
      the numeric match score
    • calculateMatchScore

      public static <EntityT extends com.ssgllc.fish.domain.CasetivityEntity> Double calculateMatchScore(EntityT a, EntityT b, String matchExpr)
      Find the match score between two domain entities without attempting to merge them. Prefer this overload over the DTO variant when entities are already in scope — it skips the DTO-to-entity conversion that the DTO overload performs internally.
      Groovy example:
        def a = bpmUtil.getEntity("Person", idA)
        def b = bpmUtil.getEntity("Person", idB)
        def score = mergeTools.calculateMatchScore(a, b, mergeTools.getMatchExpression("Person"))
      The result is a Double score between 0.0 and 1.0.
      Parameters:
      a - first entity to compare
      b - second entity to compare
      matchExpr - match expression to use
      Returns:
      the numeric match score
    • getDedupeFeatures

      public static Map<String,Double> getDedupeFeatures(com.ssgllc.fish.domain.CasetivityPerson a, com.ssgllc.fish.domain.CasetivityPerson b)
      Compute the named feature scores used by the ML match model for two person entities. Useful for inspecting which individual features drive a high or low match score.
      Groovy example:
        def a = bpmUtil.getEntity("Person", idA)
        def b = bpmUtil.getEntity("Person", idB)
        def features = mergeTools.getDedupeFeatures(a, b)
      The result is a map of feature name to numeric score (e.g. {firstNameScore=0.9, lastNameScore=1.0}).
      Parameters:
      a - first person entity
      b - second person entity
      Returns:
      map of feature name to numeric score
    • markEntitiesDontMatch

      public static <DtoT extends com.ssgllc.fish.service.dto.CasetivityEntityDTO> boolean markEntitiesDontMatch(DtoT dto, DtoT other) throws ReflectiveOperationException
      If the two entities are possible matches, marks them as not matching. Otherwise does nothing.
      Parameters:
      dto - the first entity
      other - the entity matching the first
      Returns:
      whether any merge records were affected
      Throws:
      ReflectiveOperationException
    • markEntitiesDontMatch

      public static boolean markEntitiesDontMatch(String entityType, String entityId, String otherId) throws ReflectiveOperationException
      If the two entities are possible matches, marks them as not matching. Otherwise does nothing.
      Parameters:
      entityType - the entity type name (e.g. "Person")
      entityId - the ID of the first entity
      otherId - the ID of the entity matching the first
      Returns:
      whether any merge records were affected
      Throws:
      ReflectiveOperationException
    • markEntitiesDontMatch

      public static <MergeT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMrg<EntityT>, EntityT extends com.ssgllc.fish.domain.CasetivityEntity & com.ssgllc.fish.domain.CasetivityMergeable<EntityT, MergeT>> boolean markEntitiesDontMatch(String entityType, UUID entityId, UUID otherId) throws ReflectiveOperationException
      If the two entities are possible matches, marks them as not matching. Otherwise does nothing.
      Parameters:
      entityType - the entity type name (e.g. "Person")
      entityId - the ID of the first entity
      otherId - the ID of the entity matching the first
      Returns:
      whether any merge records were affected
      Throws:
      ReflectiveOperationException