Optimized algorithm to match entities together based on heuristics

I've encountered a problem which I think is rather suited to be solved using a Constraint Satisfaction Problem algorithm. I am however, not entirely certain that this is the best approach, as the specifics seem to deviate a bit from the classical CSP problems. The problem: I've got a list of entities that I need to match together based on a heuristic, which has been calculated using a distance function. The list might look like this: Entity A - H(euristic): 0.1 -> Entity B - H: 0.25 -> Entity D Entity B - H: 0.1 -> Entity A - H: 0.4 -> Entity C Entity C - H: 0.4 -> Entity B Entity D - H: 0.25 -> Entity A Now what I wish to accomplish is to match these entities together 1 to 1, where the only constraint of the problem, is that any entity can only be matched with 1 entity. Which in the case above, obviously mean that the solution would be 2 sets of entities linked together. Now this might be an easy task, but this is where my problem deviates from the CSP problems I've encountered in the past. I wish to find a solution based on the following criterias: - 1. Most amount of matches (Some entities might not know of more than 1 other entity, if the heuristic is too high, it will not be included in the list of possible relationships). - 2. Lowest overall cost of matching entities. Another main feature of this problem, is that it doesn't have to be solved to perfection. Not every Entity needs to have a match. If leaving out a single entity for matching will result in 2 more matches down the road, then that would be prefered. And by the way, a solution to the example above based on the mentioned criterias could be the following: - Entity A Entity D - Entity B Entity C If the entity list was bigger, and there were to be two solutions with the same amount of matches, I'd obviously want to compare these on the heuristic cost, and pick the lowest of these. Now the question is what kind of algorithm would be the best fit for this kind of problem? If CSP is the choice, then how would I go about allowing it to return an (unsolved) dataset, but pick the best partial solution with the least amount of iterations? Edit: Example of dataset that's been run through the algorithm dan1111 provided. I forgot to mention that the higher the heuristic, the better in this case. Should've done a 1-h, before I took the screenshot.

Mar 11, 2025 - 20:34

Optimized algorithm to match entities together based on heuristics

I've encountered a problem which I think is rather suited to be solved using a Constraint Satisfaction Problem algorithm. I am however, not entirely certain that this is the best approach, as the specifics seem to deviate a bit from the classical CSP problems.

The problem:

I've got a list of entities that I need to match together based on a heuristic, which has been calculated using a distance function.

The list might look like this:

Entity A
    - H(euristic): 0.1 -> Entity B
    - H: 0.25 -> Entity D

Entity B
    - H: 0.1 -> Entity A
    - H: 0.4 -> Entity C

Entity C
    - H: 0.4 -> Entity B

Entity D
    - H: 0.25 -> Entity A

Now what I wish to accomplish is to match these entities together 1 to 1, where the only constraint of the problem, is that any entity can only be matched with 1 entity. Which in the case above, obviously mean that the solution would be 2 sets of entities linked together.

Now this might be an easy task, but this is where my problem deviates from the CSP problems I've encountered in the past.

I wish to find a solution based on the following criterias:

 - 1. Most amount of matches (Some entities might not know of more than 1 other entity, if the heuristic is too high, it will not be included in the list of possible relationships).
 - 2. Lowest overall cost of matching entities.

Another main feature of this problem, is that it doesn't have to be solved to perfection. Not every Entity needs to have a match. If leaving out a single entity for matching will result in 2 more matches down the road, then that would be prefered.

And by the way, a solution to the example above based on the mentioned criterias could be the following:

- Entity A <--> Entity D
- Entity B <--> Entity C

If the entity list was bigger, and there were to be two solutions with the same amount of matches, I'd obviously want to compare these on the heuristic cost, and pick the lowest of these.

Now the question is what kind of algorithm would be the best fit for this kind of problem? If CSP is the choice, then how would I go about allowing it to return an (unsolved) dataset, but pick the best partial solution with the least amount of iterations?

Edit:

Example of dataset that's been run through the algorithm dan1111 provided.

I forgot to mention that the higher the heuristic, the better in this case. Should've done a 1-h, before I took the screenshot.