Union-Find (Disjoint Set)
Main idea Imagine you're invited to a party, and as soon as you arrive, the person who invited you introduces you to someone else at the event. You both shake hands and form a connection, starting to build a network of people you know. But it doesn't stop there – the new person you meet introduces you to one of their friends, and now you're connected with even more people at the party. As you continue to make more connections, you start to realize that you're slowly becoming part of a larger group of people who are all connected, either directly or through someone else. Initially, when you first arrive, you're part of a single isolated group – just you and your inviter. But as more introductions happen, you're forming clusters or groups of people who know each other. These connections grow and evolve throughout the night, and you can quickly see who belongs to which group based on who knows whom. As the party progresses, you can ask if two people know each other, and the answer will depend on whether they've been introduced through a series of connections. This process of creating and expanding groups of people through continuous introductions is very similar to how the **Union-Find **algorithm works. The algorithm helps track and manage these connections, grouping people (or elements) into sets. When a new connection (or "union") happens between two people, the algorithm merges their respective groups into one. If you ever want to know if two people belong to the same group (whether they're connected through a chain of introductions), the algorithm can quickly tell you if they're part of the same group or not. Just like in the party scenario, Union-Find helps us manage and track evolving networks of connections. Use-cases Network Connectivity: Determine if two nodes in a network (e.g., users in a social network or computers in a network) are connected or not. Cycle detection: Union-Find checks if adding an edge creates a cycle by seeing if the nodes are already in the same connected component. Image Segmentation: Union-Find can be applied in image processing to group pixels that belong to the same segment based on similarity or connectivity. Key operations Find(x): This finds the main item of the group where 'x' is. Union(x, y): This joins the groups of 'x' and 'y'. Same(x, y): This checks if 'x' and 'y' are in the same group. Implementation We can use tree structures but they come with some overhead, so we use a simple array. Each item points to its parent in the group, while the main item of a group points to itself. Initially every item belongs to its own group. class UnionFind: def __init__(self, size): self.parent = [i for i in range(size)] def find(self, x): while x !== self.root[x]: x = self.root[x] return x def union(self, x, y): parent_x = self.find(x) parent_y = self.find(y) if parent_x != parent_y: self.parent[parent_y] = parent_x def isConnected(self, x, y): return self..find(x) == self.find(y) Time complexity: O(n) for finding parent. Space complexity: O(n) as we store n elements in the array. Example There're 7 people: Alice, Bob, Charlie, David, Eve, Frank, Hannah. Step 1: Initialize the array Let's enumerate people by index: 0: Alice, 1: Bob, 2: Charlie, 3: David, 4: Eve, 5: Frank, 6: Hannah The resulting array: Index: 0 1 2 3 4 5 6 Parent: 0 1 2 3 4 5 6 Step: 2: Add the relationships Add the 1st relationship: Alice - Bob We unite the Alice and Bob by setting a parent Alice to Bob, so the resulting array: Index: 0 1 2 3 4 5 6 Parent: 0 0 2 3 4 5 6 Or we can make the Bob as a parent as well: Index: 0 1 2 3 4 5 6 Parent: 1 1 2 3 4 5 6 The next relationship is Alice - David. We unite Alice and David: Index: 0 1 2 3 4 5 6 Parent: 0 0 2 0 4 5 6 The next relationship is Charlie - Frank. We unite Charlie and Frank by setting Charlie as a parent: Index: 0 1 2 3 4 5 6 7 Parent: 0 0 2 0 4 2 6 7 The next relationship is Frank - Hannah. We unite Frank and Hannah by setting Frank as a parent: Index: 0 1 2 3 4 5 6 Parent: 0 0 2 0 4 2 5 So we have 3 groups (disjoint sets) of people: Group 1: {Alice, Bob, David} Group 2: {Charlie, Frank, Hannah} Group 3: {Eve} Check if the people are in the same group: Alice - David Alice's index is 0 and the parent is 0 so Alice' parent is Alice. David's index is 3 and the parent is 0 so David' parent is Alice. They belong to the same group. The next pair is Frank and Hannah: Frank's index is 5 and the 1st parent is 2 (Charlie). Charlie' parent is Charlie. So Frank' parent is Charlie. Hannah' index is 6 and the parent is 5 (Frank). At this step we don't know that we're checking Frank so we proceed. Frank' parent is Charlie. Charlie' parent is Charlie. They belong to the same group. The next pair is Charlie and Eve: Charlie' parent is Charlie. Simple. Eve' pa

Main idea
Imagine you're invited to a party, and as soon as you arrive, the person who invited you introduces you to someone else at the event. You both shake hands and form a connection, starting to build a network of people you know. But it doesn't stop there – the new person you meet introduces you to one of their friends, and now you're connected with even more people at the party. As you continue to make more connections, you start to realize that you're slowly becoming part of a larger group of people who are all connected, either directly or through someone else.
Initially, when you first arrive, you're part of a single isolated group – just you and your inviter. But as more introductions happen, you're forming clusters or groups of people who know each other. These connections grow and evolve throughout the night, and you can quickly see who belongs to which group based on who knows whom. As the party progresses, you can ask if two people know each other, and the answer will depend on whether they've been introduced through a series of connections.
This process of creating and expanding groups of people through continuous introductions is very similar to how the **Union-Find **algorithm works. The algorithm helps track and manage these connections, grouping people (or elements) into sets. When a new connection (or "union") happens between two people, the algorithm merges their respective groups into one. If you ever want to know if two people belong to the same group (whether they're connected through a chain of introductions), the algorithm can quickly tell you if they're part of the same group or not. Just like in the party scenario, Union-Find helps us manage and track evolving networks of connections.
Use-cases
- Network Connectivity: Determine if two nodes in a network (e.g., users in a social network or computers in a network) are connected or not.
- Cycle detection: Union-Find checks if adding an edge creates a cycle by seeing if the nodes are already in the same connected component.
- Image Segmentation: Union-Find can be applied in image processing to group pixels that belong to the same segment based on similarity or connectivity.
Key operations
- Find(x): This finds the main item of the group where 'x' is.
- Union(x, y): This joins the groups of 'x' and 'y'.
- Same(x, y): This checks if 'x' and 'y' are in the same group.
Implementation
We can use tree structures but they come with some overhead, so we use a simple array. Each item points to its parent in the group, while the main item of a group points to itself. Initially every item belongs to its own group.
class UnionFind:
def __init__(self, size):
self.parent = [i for i in range(size)]
def find(self, x):
while x !== self.root[x]:
x = self.root[x]
return x
def union(self, x, y):
parent_x = self.find(x)
parent_y = self.find(y)
if parent_x != parent_y:
self.parent[parent_y] = parent_x
def isConnected(self, x, y):
return self..find(x) == self.find(y)
Time complexity: O(n) for finding parent.
Space complexity: O(n) as we store n elements in the array.
Example
There're 7 people: Alice, Bob, Charlie, David, Eve, Frank, Hannah.
Step 1: Initialize the array
Let's enumerate people by index:
0: Alice,
1: Bob,
2: Charlie,
3: David,
4: Eve,
5: Frank,
6: Hannah
The resulting array:
Index: 0 1 2 3 4 5 6
Parent: 0 1 2 3 4 5 6
Step: 2: Add the relationships
Add the 1st relationship: Alice - Bob
We unite the Alice and Bob by setting a parent Alice to Bob, so the resulting array:
Index: 0 1 2 3 4 5 6
Parent: 0 0 2 3 4 5 6
Or we can make the Bob as a parent as well:
Index: 0 1 2 3 4 5 6
Parent: 1 1 2 3 4 5 6
The next relationship is Alice - David.
We unite Alice and David:
Index: 0 1 2 3 4 5 6
Parent: 0 0 2 0 4 5 6
The next relationship is Charlie - Frank.
We unite Charlie and Frank by setting Charlie as a parent:
Index: 0 1 2 3 4 5 6 7
Parent: 0 0 2 0 4 2 6 7
The next relationship is Frank - Hannah.
We unite Frank and Hannah by setting Frank as a parent:
Index: 0 1 2 3 4 5 6
Parent: 0 0 2 0 4 2 5
So we have 3 groups (disjoint sets) of people:
Group 1: {Alice, Bob, David}
Group 2: {Charlie, Frank, Hannah}
Group 3: {Eve}
Check if the people are in the same group:
Alice - David
Alice's index is 0 and the parent is 0 so Alice' parent is Alice.
David's index is 3 and the parent is 0 so David' parent is Alice.
They belong to the same group.
The next pair is Frank and Hannah:
Frank's index is 5 and the 1st parent is 2 (Charlie). Charlie' parent is Charlie. So Frank' parent is Charlie.
Hannah' index is 6 and the parent is 5 (Frank). At this step we don't know that we're checking Frank so we proceed. Frank' parent is Charlie. Charlie' parent is Charlie.
They belong to the same group.
The next pair is Charlie and Eve:
Charlie' parent is Charlie. Simple.
Eve' parent is Eve.
They belong to different group.
Optimizations
Path Compression
This technique flattens the structure of the tree whenever we perform a find operation, ensuring that all nodes point directly to the root of their set. This reduces the time complexity of subsequent queries.
Remember we united Frank and Hannah?
If we use path compression, we set both of Frank and Hanna parent Charlie:
Index: 0 1 2 3 4 5 6
Parent: 0 0 2 0 4 2 2
Without path compression:
2
\
5
\
6
With path compression:
2
/ \
5 6
We need to change the find function:
def find(x):
if x == self.root[x]:
return x
self.root[x] = find(self.root[x])
return self.root[x]
Time complexity:
The path compression technique ensures that the find() operation runs in nearly constant time. In practice, its time complexity is O(α(N)), where α(N) is the inverse Ackermann function, which grows very slowly. For all practical purposes, this is close to O(1).
Union by Rank (or Size)
To enhance the parent array method, we can manage the tree's height by using Union by Rank. The concept is to always attach the smaller tree to the root of the larger tree, effectively reducing the overall height of the tree.
Suppose we need to unite the Hannah and Eve in the examples above:
2
/ \
5 6
\
4
So we got 2 groups:
0 2
/ \ / \
1 3 5 6
\
4
If we unite parents 0 and 2, we can choose the root of the larger tree (based on height or rank) as the new root. By always joining the smaller tree under the larger one, we keep the trees balanced.
We need to add rank array to the constructor and change union function:
def __init__(self, size):
self.rank = [0] * size
self.parent = [i for i in range(size)]
def union(self, x, y):
parent_x = self.find(x)
parent_y = self.find(y)
if parent_x == parent_y:
return
# Union by Rank
if self.rank[parent_x] < self.rank[parent_y]:
self.parent[parent_y] = parent_y
elif self.rank[parent_y] < self.rank[parent_y]:
self.parent[parent_y] = parent_x
else:
self.parent[parent_y] = parent_x
self.rank[parent_x] += 1
Conclusion
By combining path compression with Union by Rank, both the find and union operations become highly efficient, approaching constant time complexity. This makes the algorithm exceptionally scalable for handling large graphs and numerous queries.