Architecture
When Fraud Finds Your Platform

Three users in one morning. Transactions they hadn’t made. Money gone. By afternoon, a dozen. On a platform processing real financial transactions, this wasn’t a support queue problem but a structural one. By the time you name it fraud, the clock is already running.
The platform wasn’t built with this threat model in mind. When the transaction surface expanded, we attracted bad actors who understood it better than we did. Account takeover fraud (compromised credentials, VPN-masked access, funds moved to temporary accounts before anyone noticed) is a well-worn playbook, but we had never had to defend against it. We had the logs and incident reports but lacked a system to evaluate signals fast enough to act before the damage landed.
So we made the call that’s never clean: stop shipping features and fix this. There’s always a roadmap, commitments, and a product org with quarterly priorities and stakeholders who aren’t watching the same fraud queue. But the math was simple once we said it out loud: every release shipped while the fraud vector was open made the problem bigger. Velocity was the accelerant. We paused.
Build vs. Buy Under Fire
Account takeover fraud was a different discipline—we had no in-house fraud detection expertise. We deployed hotfixes while the system still bled. Engineers pulled from roadmap work were manually investigating incidents, buying time while we found a real solution.
Building from scratch wasn’t the answer. A fraud model requires a calibration cycle measured in quarters and a large enough historical dataset to train on. You can’t label fraud you haven’t instrumented. We needed something production-ready in weeks.
We needed a service we could instrument fast, that covered the signal types relevant to account takeover, and that could operate inside a regulated environment on day one. The first vendor we evaluated demonstrated all three. Under other circumstances, we might have run a longer comparison.
What mattered about its design was that it separated signal from verdict. It evaluated risk across four signal types: IP address, device fingerprint, email, and phone number. Each returned a score plus a set of attributes: connection type, bot indicators, proxy flags. The vendor’s models handled the pattern recognition. Our job was to decide what those signals meant for our specific users.
A fraud score is an input, not a decision. A datacenter IP with an elevated score might be a legitimate enterprise user on a corporate VPN. A residential IP with a proxy flag might be a privacy-conscious user who’s never committed fraud. On a financial platform, a false positive is a trust event—sometimes more damaging than the fraud itself.
Building the Judgment Layer
We didn’t wire scores to decisions. We built a rule evaluation layer on top—one handler per signal type, each consuming the service’s response, applying a configurable threshold, and returning a binary determination.
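A minimal sketch of what such a layer could look like, in Python. This is an illustration under assumptions, not the code we shipped: the `SignalResult` shape, the 0-100 score range, the config key names, and the threshold values are all hypothetical. The point it shows is that each handler looks up its threshold at call time and returns only a yes or no.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


# Hypothetical shape of one vendor response; field names are assumptions.
@dataclass(frozen=True)
class SignalResult:
    signal_type: str                      # "email", "phone", "device", or "ip"
    score: float                          # vendor risk score, assumed 0-100
    attributes: Dict[str, object] = field(default_factory=dict)


# Stand-in for a runtime configuration store (hypothetical keys and values).
CONFIG: Dict[str, float] = {
    "fraud.threshold.email": 85.0,
    "fraud.threshold.phone": 90.0,
}


def make_handler(signal_type: str, config: Dict[str, float]) -> Callable[[SignalResult], bool]:
    """Build a handler that returns True when the signal counts as fraud."""
    key = f"fraud.threshold.{signal_type}"

    def handler(result: SignalResult) -> bool:
        # Read the threshold on every call, so a configuration change
        # takes effect without a deployment.
        return result.score > config[key]

    return handler


# One handler per signal type.
HANDLERS = {name: make_handler(name, CONFIG) for name in ("email", "phone")}


def evaluate(result: SignalResult) -> bool:
    """Dispatch a vendor result to the handler for its signal type."""
    return HANDLERS[result.signal_type](result)
```

With these made-up numbers, an email score of 88 is flagged under a threshold of 85 and stops being flagged once the configured value is raised to 95, with no code change in between.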
The thresholds weren’t hardcoded. They were pulled from configuration at runtime, allowing us to adjust what “fraud” meant for our platform without deploying new code. We knew the thresholds would need tuning as real traffic patterns emerged and wanted to avoid a deployment cycle for calibration changes. The device fingerprint check showed how the layered system operated in practice.
First gate: if the device fingerprint in the request didn’t match the one on record, we triggered step-up verification. On mobile, reinstalls and upgrades produce legitimate mismatches too frequently to justify a hard block.
Second gate: if fraud probability exceeded the configured threshold, we blocked—email and phone signals provided sufficient corroboration.
Third gate: if the score was elevated but below the threshold, we evaluated connection type alongside bot status. A datacenter connection with an active bot flag read differently than a residential connection using a proxy.
Same score, different context, different outcome.
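The three gates can be sketched roughly as follows. Several details here are assumptions made for illustration: that the gates run in order and the first match wins, that "elevated" is defined by a second configurable floor (`elevated_floor`), and the specific outcomes chosen in the third gate. The email and phone corroboration mentioned in the second gate is left out to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    STEP_UP = "step_up"   # ask for additional verification
    BLOCK = "block"


# Hypothetical device signal; field names and value ranges are assumptions.
@dataclass(frozen=True)
class DeviceSignal:
    fingerprint: str
    fraud_probability: float   # assumed 0.0-1.0
    connection_type: str       # e.g. "residential" or "datacenter"
    is_bot: bool
    is_proxy: bool


def evaluate_device(
    signal: DeviceSignal,
    known_fingerprint: str,
    block_threshold: float,
    elevated_floor: float,
) -> Action:
    # Gate 1: a fingerprint mismatch triggers step-up, not a hard block,
    # because reinstalls and upgrades cause legitimate mismatches.
    if signal.fingerprint != known_fingerprint:
        return Action.STEP_UP

    # Gate 2: probability above the configured threshold is blocked.
    if signal.fraud_probability > block_threshold:
        return Action.BLOCK

    # Gate 3: elevated but below the threshold, so context decides.
    # The outcomes below are illustrative assumptions.
    if signal.fraud_probability >= elevated_floor:
        if signal.connection_type == "datacenter" and signal.is_bot:
            return Action.BLOCK
        return Action.STEP_UP

    return Action.ALLOW
```

Under these assumptions, two requests with the same probability of 0.6 diverge: a datacenter connection with the bot flag set is blocked, while a residential connection behind a proxy gets step-up verification.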
The Signal We Kept Dark
We deferred IP address evaluation entirely in the first release.
The account-takeover pattern made IP the most tempting signal and the most dangerous to miscalibrate. Bad actors hid behind VPNs, but so did legitimate users. Blocking on IP characteristics alone would have caught both. Everyone who looked at the false-positive rate reached the same conclusion: IP addresses are the noisiest of the four signal types, and we hadn’t characterized our legitimate user population well enough to set a trustworthy threshold. We had the integration ready but chose not to enable it.
Because each signal type was independently scoped, deferring IP didn’t require touching anything else. Email, phone, and device fingerprint went live. IP went dark but stayed instrumented—we could watch the signal without acting on it, which meant we were building the dataset we’d need to calibrate it later. When we eventually brought it online, it slotted in. Nothing else changed.
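One way to express "instrumented but not enforced" is a per-signal enforcement switch, sketched below. The `ENFORCED` map, the `SignalVerdict` type, and the logger name are hypothetical, and the sketch assumes any single enforced flag is enough to block. Every verdict is logged, but only enforced signals can affect the outcome.

```python
import logging
from dataclasses import dataclass
from typing import Dict, List

logger = logging.getLogger("fraud.signals")


@dataclass(frozen=True)
class SignalVerdict:
    signal_type: str
    score: float
    flagged: bool   # what the rule would decide if it were enforced


# Hypothetical per-signal switch: IP is observed but not acted on.
ENFORCED: Dict[str, bool] = {
    "email": True,
    "phone": True,
    "device": True,
    "ip": False,
}


def decide(verdicts: List[SignalVerdict], enforced: Dict[str, bool]) -> bool:
    """Return True if the request should be blocked."""
    block = False
    for v in verdicts:
        is_enforced = enforced.get(v.signal_type, False)
        # Log every verdict, including observe-only ones, so the data
        # needed to calibrate a dark signal accumulates over time.
        logger.info(
            "signal=%s score=%.1f flagged=%s enforced=%s",
            v.signal_type, v.score, v.flagged, is_enforced,
        )
        if v.flagged and is_enforced:
            block = True
    return block
```

Bringing a dark signal online is then a one-value configuration change: flipping `"ip"` to `True` makes a flagged IP verdict count, and no other handler is touched.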
The Lever We Didn’t Pull
Within weeks of shipping, successful fraud all but stopped. The support ticket clusters disappeared. Unreconciled transaction reports dropped to near zero. The on-call rotation was no longer pulled into fraud incidents at scale. Engineers returned to the roadmap.
The IP deferral seemed like a half-measure at the time. In retrospect, it was the most disciplined call in the project. Shipping an uncalibrated block rule would’ve caused a different kind of harm: legitimate transactions declined, accounts locked, and support volume climbing for the wrong reason. A lever you don’t understand is not yours to pull.








