trees

dfs

dsa

Trees: Depth First Search (Part I)

Trees: Depth First Search

Ankit Josh (f_in_the_chat)

When motivation ends, discipline takes over. When discipline ends, ambition takes over.

In this series of blogs, we'll be going over 15 Tree Questions and try to understand the ideas behind every problem.

I suggest you to try every problem before reading it's solution.

Note: The blog is divided into 3 parts. In each part, we'll be covering 5 questions and try to understand the ideas behind those problems.

Part I

  1. In this problem, we are asked to find the sum of all unique paths in the tree.

1c9962225b484b4be775b21ea4f0cae761cb1c5a

The answer for the first tree is: 8
The answer for the second tree is: 90

Let Path(x,y) give the weight of the simple path from Node (x) to Node (y).

For the first tree the answer is:
Path(1,2) + Path(1,3) + Path(2,3) = 1 + 3 + (1 + 3) = 8

Now, let's make a few observations:

  • Observation:
    The path between any two nodes (x,y) can be considered as the sum of two paths, Path (x,z) + Path (z,y), where z is an intermediate node in the simple Path (x,y). So, removing a node divides the graph into two subtrees.

  • Observation:
    Let N (e) be the number of times the edge (e) appears in the final answer and W be its weight.
    Then, the contriution of the edge (e) = N (e) * W.

Therefore, our final answer is the sum of contribution of every edge.

So, the question boils down to finding the number of times an edge (e) repeats across all the paths.

Let's say the tree consists of N nodes.
Also, let's assume the tree is rooted at an arbitary vertex(say 1).

Now, let's consider a Node (x), which is connected to some Node (y).
If we remove the Node (x), the graph is now divided into two parts.

Now, let c be the number of nodes in the subtree of Node (y) including (y).
So, the edge(x,y) is a common edge for every path between any node in the subtree of Node(y) and the remaining the nodes in the tree.

Therefore, the total contribution of this edge is: c * (N-c) * W, where W is the weight of the edge.

Solution:

Time Complexity: O(n)
Space Complexity: O(n) [Recursive stack space]

Key takeaways:

  • If we remove a node from a tree, it's divided into two parts.
  • Instead of trying to find the contribution of every path, try finding the contribution every edge.
  1. In this problem, we are asked to count the number of distinct color nodes in the subtree of every node.

image002

The answer is: 1 2 3 1 1

The number of nodes with different colors for node 1: 1 (1)
The number of nodes with different colors for node 2: 2 (1, 2)
The number of nodes with different colors for node 3: 3 (1, 2, 3)
The number of nodes with different colors for node 4: 1 (3)
The number of nodes with different colors for node 5: 1 (1)

Now, let' make an observation:

  • Observation:
    Let F(x) give the distinct colors in the subtree rooted at x.
    Now, let there be three nodes (p,q,r) such that p is the parent of both (q, r).
    We have, F(p)=F(q) U F(r), where U is the union operation of two sets.

So, we can use sets to keep track of distinct colors in subtrees of every node.

So, for a parent node, we merge its children' sets into the parent node's set.

Now, how do we merge two sets efficiently? The process is simple.
We always merge the smaller set into the bigger set.

But what if our parent set is smaller than the child set ?

In that case, we can just swap the two sets and then merge the smaller set inside the larger set.

Solution:

Time Complexity: O(n)
Space Complexity: O(n)

Key takeaways:

  • We can use containers like maps, sets to keep track of certain properties for every node.
  • We can merge children' properties together to get the corresponding property for the parent.
  • Merging should be efficiently done.
  1. In this problem, we are asked to choose a manager (node) and find the number of ninjas (nodes) in it's subtrees which when combined take salary upto the given budget. We want the highest satisfaction value that can be achieved.

Screenshot-2023-02-07-at-2.08.54-AM
The answer for the given sample case is: 6

In this testcase, the budget is 4.
Here, we can choose Node (1) as our manager and Node (3) and (4) as our dispatched ninjas. Our total cost should be less than or equal to the given budget.

Total Cost = Cost of Node (3) + Cost of Node (4) = 2 + 2 = 4.

Now, satisfaction of customer = Number of Ninjas dispatched * Leadership Level of Manager.

So, answer = 2 * Level (1) = 2 * 3 = 6.

Now, let's make some observations:

  • Observation:
    If a node is our manager then ninjas will be from it's subtrees. So, our goal is to take as many nodes as possible under given budget.

  • Observation:
    It doesn't matter which nodes we select as our ninjas. So, its' optimal to select the nodes with as low salary as possible.

But how do we do this?

We can maintain a priority queue or set for every node, that contains the salary of all the nodes in it's subtrees.
Now, we should try to take all the nodes.
If it's more than our budget, then we can remove the highest value from our priority queue and decrease the required cost.

Once we find the highest cost which is less than or equal to our budget, the size of our priority queue is our number of nodes.
We can multiply it by leadership level of the node to get satisfaction level.

Now, this technique can be used recursively for all the nodes to get our final answer.

Solution:

Key takeaways:

  • We can use priority queues/sets to keep track of certain properties of nodes in a monotonic order. This order can be used to greedily solve some problems.
    Instead of trying to find the exact nodes as our ninja, we tried to find max number of ninjas possible if some node (x) becomes our manager for all the nodes.
  • The final answer can be found by using recursion for all the nodes.
  1. In this problem, we are asked to find the closest ancestor for each node such that the ancestor has the same color as the node.

image002-2

The answer is: -1 -1 -1 1 3

Since, Node 1 is the root, we don't have any ancestor.
Node 2 has no ancestor with its color.
Node 3 has no ancestor with its color.
Node 4 has Node 1 as its closest ancestor with the same color.
Node 5 has Node 3 as its closest ancestor with the same color.

Now, let's make an observation:

  • Observation:
    If for a node (x) with color (c), some other node (y) is the closest ancestor, then for the next successive node (z) with color (c) in subtree of node (x), node (x) will be the ancestor.
    But, while going back to the other children of parent of node (x), we will clear node (x) and it's children from our list of ancestors.

So, this gives us a hint that we should probably use a stack data-structure for every color to keep track of its ancestors at every level.

So, our algorithm will be doing a plain DFS over the tree and for every color, we will have a stack that keeps track of the closest ancestor for that color.

Example:
In the above tree, when the DFS reaches node (5), we will see that node (3) is an ancestor for color (2). So, for node (5), the answer will be (3).

Solution:

Time Complexity: O(n)
Space Complexity: O(n)

Key takeaways:

  • Data structures like stacks/queues can be used to keep track of level-wise ancestors following a certain property for every node.
  • Once we pop a node out from our ancestor, it can never be our ancestor again for any other node, which means every node will be pushed once and popped once.
    Hence, the time complexity is linear.
  1. In this problem, we are asked to find the number of paths such that for two given nodes (u,v), node (u) doesn't come before node (v).

image002-3

Let's make an observation:

  • Observation:
    Correct paths = Total Paths - Wrong Paths.
    So, if we find the total number of paths in the tree along with the number of wrong paths, our work is done.

  • Observation:
    Keep in mind here {1->2} and {2->1} are treated as different paths.

Let, C(n, r) and P(n, r) be combinatoric functions.
C(n, r) calculates the number of ways to select (r) things out of (n) things.
P(n, r) calculates the number of ways to select (r) things and also arrange them out of (n) things.

So, essentialy we need to select any two nodes from all the N nodes.
That is C(N, 2), but for us the order in which we select the nodes also matter. So, it becomes P(N, 2).
Now, P(N, 2) = N * (N - 1).
So, Total Paths = N * (N - 1).

Now, let's find the number of wrong paths:
Let, Path(u, v) represent the simple path between (u) and (v).
Now, let's remove all the nodes that comes in the Path(u,v) other than (u,v).

Now, all we are left with are the subtrees of node (u) and node (v).
Let, c1, c2 be the number of nodes in the subtree of node (u) and node (v) respectively.

Now, let's select one node from c1 nodes i.e C(c1, 1) = c1.
Also, let's select one node from c2 nodes i.e C(c2, 1) = c2.
So, Wrong Paths = C(c1, 1) * C(c2, 1) = c1 * c2.

Final Answer: N * (N-1) - (c1 * c2).

But, how do we find c1 and c2?
We can keep track of the parent of both our starting and ending nodes.

For getting c2, we can find the number of nodes in the subtrees of ending node other than it's parent.

For getting c1, we can find the number of nodes in the subtrees of starting nodes other than the node that lies in the path to ending node.
Solution:

Key takeaways:

  • Correct Answer = Total Answer - Wrong Answer
  • Total Paths in a tree: N*(N-1)
  • We can find number of nodes in subtrees of two nodes seperately by keeping track of their parents.