Record linkage
Record linkage is the process of identifying records across two or more datasets that refer to the same entity. The term is most common in statistics and healthcare.
For practical purposes, record linkage is a synonym for entity matching: both aim to connect records that describe the same real-world entity despite differences in formatting and completeness. The term originated in statistics and remains the standard one in healthcare and official statistics, where linking individuals across systems is foundational.
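Classical record linkage often uses probabilistic methods, such as the Fellegi–Sunter model, which are valued for being explainable: each field's agreement or disagreement contributes a visible weight to the match decision. Learned approaches can improve accuracy on complex domains but need labeled training pairs, so the choice depends on how much labeled data you have and how much explainability you require.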
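As a rough illustration of the Fellegi–Sunter idea, the sketch below scores a pair of records by summing per-field log-likelihood weights and then compares the total against two thresholds. Everything specific here is an assumption made for the example: the field names, the m- and u-probabilities, the thresholds, and the use of exact equality to decide whether a field agrees. In practice the probabilities are usually estimated from data, and fields are often compared with fuzzy similarity rather than equality.

```python
import math

# Hypothetical parameters, chosen only for illustration (not estimated from data).
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "postcode": (0.90, 0.02),
}


def field_weight(agrees, m, u):
    """Log2 likelihood ratio contributed by one field comparison."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum the field weights, assuming fields are conditionally independent."""
    total = 0.0
    for field, (m, u) in params.items():
        # Simplification: exact equality stands in for a real comparison function.
        agrees = rec_a.get(field) == rec_b.get(field)
        total += field_weight(agrees, m, u)
    return total


def classify(weight, lower=0.0, upper=8.0):
    """Three-way decision using two thresholds (values here are arbitrary)."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"


a = {"surname": "smith", "birth_year": 1980, "postcode": "AB1 2CD"}
b = {"surname": "smith", "birth_year": 1980, "postcode": "ZZ9 9ZZ"}
c = {"surname": "jones", "birth_year": 1975, "postcode": "ZZ9 9ZZ"}

for other in (a, b, c):
    w = match_weight(a, other)
    print(f"{w:7.2f}  {classify(w)}")
```

Because the total is a plain sum, each field's contribution can be inspected on its own, which is the explainability the probabilistic approach is valued for. Pairs that land between the two thresholds are the ones typically sent for manual review.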