Accounts merge

Graphs Hard Problems II Hard
  • This problem is actually a critical part of many modern communication apps and customer relationship management (CRM) systems
  • These systems often need to process thousands of accounts or user profiles from various sources and merge duplicates to ensure consistency and accuracy in managing user or customer data
  • This process, called deduplication, is an everyday problem, especially where data is sourced from multiple inputs, like social media, customer feedback, emails etc
  • An efficient algorithm to solve this in real-time can significantly improve the UX for customer support systems and data analytics tasks

Given a list of accounts where each element account [i] is a list of strings, where the first element account [i][0] is a name, and the rest of the elements are emails representing emails of the account.


Now, merge these accounts. Two accounts definitely belong to the same person if there is some common email to both accounts. Note that even if two accounts have the same name, they may belong to different people as people could have the same name. A person can have any number of accounts initially, but all of their accounts definitely have the same name.


After merging the accounts, return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails in sorted order.

Examples:

Input: N = 4,

accounts =

[["John","johnsmith@mail.com","john_newyork@mail.com"],

["John","johnsmith@mail.com","john00@mail.com"],

["Mary","mary@mail.com"],

["John","johnnybravo@mail.com"]]


Output: [["John","john00@mail.com","john_newyork@mail.com", "johnsmith@mail.com"],

["Mary","mary@mail.com"],

["John","johnnybravo@mail.com"]]


Explanation: The first and the second John are the same person as they have a common email. But the third Mary and fourth John are not the same as they do not have any common email. The result can be in any order but the emails must be in sorted order. The following is also a valid result:

[['Mary', 'mary@mail.com'],

['John', 'johnnybravo@mail.com'],

['John', 'john00@mail.com' , 'john_newyork@mail.com', 'johnsmith@mail.com' ]]

Input: N = 6,

accounts =

[["John","j1@com","j2@com","j3@com"],

["John","j4@com"],

["Raj",”r1@com”, “r2@com”],

["John","j1@com","j5@com"],

["Raj",”r2@com”, “r3@com”],

["Mary","m1@com"]]


Output: [["John","j1@com","j2@com","j3@com","j5@com"],

["John","j4@com"],

["Raj",”r1@com”, “r2@com”, “r3@com”],

["Mary","m1@com"]]


Explanation: The first and the fourth John are the same person here as they have a common email. And the third and the fifth Raj are also the same person. So, the same accounts are merged.

Input: N = 3,


accounts = [

  ["Alice", "alice@mail.com", "alice_work@mail.com"],

  ["Bob", "bob@gmail.com"],

  ["Alice", "alice_personal@mail.com", "alice@mail.com"]

]

Constraints

·  1 <= N <= 1000

·  2 <= accounts[i].size <= 15

·  1 <= accounts[i][j].size <= 30

·  accounts[i][0] consists of English letters.

Hints

  • Map each email to a unique identifier (index) using Union-Find. Merge emails that belong to the same account by performing Union operations on them.
  • "Group all emails by their root parent and store them under the correct name. Sort and format the result to match the required output. "

Company Tags

OYO Rooms Electronic Arts Snowflake Zynga Epic Systems Qualcomm Airbnb Optum DoorDash Ernst & Young Shopify Micron Technology McKinsey & Company Wayfair Bain & Company PayPal GE Healthcare Flipkart Stripe Walmart Cloudflare Red Hat Broadcom HashiCorp Seagate Technology Google Microsoft Amazon Meta Apple Netflix Adobe