Why teams choose MadMatcher.

We compare on approach rather than on logos. Here are four things MadMatcher does differently.

Trainable to your domain

MatchFlow learns a matcher from your labeled pairs, rather than applying one fixed pretrained model. Accuracy improves as you add labels.

Runs in your infrastructure

Everything runs in your own Spark or single-machine environment. Your data does not leave your perimeter for a vendor cloud.

Blocking that is benchmarked

Sparkly’s TF/IDF blocking was published at VLDB 2023, where it outperformed eight state-of-the-art blocking solutions in the paper’s benchmarks. The method and benchmarks are public.
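To make the idea concrete, here is a minimal sketch of TF/IDF top-k blocking in plain Python. It is an illustration of the general technique, not Sparkly's code or API: the function names (`build_index`, `top_k_candidates`), the whitespace tokenizer, the smoothed IDF formula, and the sample records are all assumptions made for this example. One table is indexed, and each record from the other table retrieves its k most similar rows as candidate pairs for the matcher.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def normalize(vec):
    # Scale a sparse vector (token -> weight) to unit L2 length.
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def build_index(table_a):
    """Build IDF weights and an inverted index over the indexed table."""
    docs = [Counter(tokenize(r)) for r in table_a]
    n = len(docs)
    # Document frequency: number of records containing each token.
    df = Counter(tok for d in docs for tok in d)
    # Smoothed IDF (an assumption; other variants work too).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    postings = defaultdict(list)  # token -> [(row id, normalized weight)]
    for i, d in enumerate(docs):
        vec = normalize({t: tf * idf[t] for t, tf in d.items()})
        for t, w in vec.items():
            postings[t].append((i, w))
    return idf, postings


def top_k_candidates(query, idf, postings, k=2):
    """Return ids of the k indexed rows most similar to the query record."""
    counts = Counter(tokenize(query))
    # Tokens unseen in the indexed table carry no weight.
    q = normalize({t: tf * idf[t] for t, tf in counts.items() if t in idf})
    scores = defaultdict(float)
    for t, qw in q.items():
        for i, w in postings[t]:
            scores[i] += qw * w  # accumulate the cosine similarity
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [i for i, _ in ranked[:k]]


# Made-up sample data for illustration only.
table_a = [
    "Apple iPhone 13 128GB blue",
    "Samsung Galaxy S21 128GB",
    "Apple MacBook Air 13 inch",
]
idf, postings = build_index(table_a)
candidates = top_k_candidates("iphone 13 blue apple", idf, postings, k=2)
print(candidates)
```

Only rows that share at least one token with the query are ever scored, which is what keeps blocking cheap compared with comparing every pair of records.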

Composable by design

Use one tool or all three. MadMatcher slots into your existing pipeline instead of making you adopt a full platform.

Backed by published research.

The blocking core was introduced and benchmarked in a peer-reviewed VLDB 2023 paper, and builds on a decade of work from UW–Madison’s Magellan group. You can read the method and the numbers yourself.

Have a matching problem?

Book a call to scope it with the team, or explore the code on GitHub.
