In AML and fraud monitoring, expected behavior refers to the normal patterns of transactions for a given customer: their typical transfer amounts, transaction frequency, usual beneficiaries or destinations, and so on. Defining this baseline is critical because it enables compliance teams to distinguish routine activity from anomalies. Transactions that deviate significantly from a customer's expected behavior can be red flags for money laundering or fraud. At the same time, what's "unusual" for one customer might be completely ordinary for another, so a one-size-fits-all threshold will either miss risks or generate false alarms.
Regulators and industry standards increasingly demand a risk-based approach that incorporates customer-specific behavior norms. Global authorities such as the FATF and the EBA emphasize continuous monitoring of customer activity against an individualized risk baseline. The importance of tracking expected versus actual behavior isn't just theoretical; real enforcement cases underscore the need. For example, an FCA action against Santander UK revealed that a small business customer, which had told the bank to expect around £5,000 in monthly deposits, was in fact receiving millions per month, far beyond its stated profile. The bank's systems generated some alerts, but poor tuning and follow-up allowed roughly £298 million to flow through the account unchecked. In another case, NatWest was fined after a client's deposits (totaling £265 million) wildly exceeded expected volumes without timely intervention. These incidents show that failing to compare current behavior against a proper baseline can let obvious anomalies slip through, or, if expectations aren't factored in at all, flood analysts with noisy alerts. In short, knowing a customer's normal habits is key to catching the abnormal.
To define what’s “normal” for a customer, analysts use a few fundamental statistical measures. Each has its strengths and weaknesses in capturing typical behavior:
The average (mean) is the sum of all observed values divided by the count of values. It’s straightforward and often used as a quick baseline for transaction amounts or counts. For example, if a customer made 10 transfers totaling $5,000 last month, their average transfer amount is $500. A simple average is easy to compute and understand, which makes it a common reference point.
However, the mean can be misleading when there are outliers. One or two unusually large transactions will skew the average upward (or downward, if the outliers are very small). In other words, the mean is not a robust measure of central tendency; it is sensitive to extreme values. If a user usually sends $200-$500 per transfer but made one $10,000 transfer, the average might shoot up into the thousands, no longer reflecting the typical range. This is a major drawback of relying solely on the mean: it may not represent the true normal when the data distribution is skewed or contains anomalies.
On the pro side, the average does capture overall trends and is useful if the data doesn’t contain extreme outliers. In many cases, compliance teams might start with an average transaction size or average count per period as a baseline, then adjust for its weaknesses using the measures below.
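As a quick illustration, here is a minimal sketch (hypothetical amounts, using Python's standard statistics module) of how a single large transfer drags the mean away from a customer's typical range.

```python
from statistics import mean

# Hypothetical transfer amounts for one customer; a single $10,000 outlier
# pulls the mean far above the customer's typical $200-$500 range.
typical_transfers = [220, 250, 300, 280, 240, 500, 310, 260, 230, 410]
print(mean(typical_transfers))              # 300.0 - a reasonable baseline

with_outlier = typical_transfers + [10_000]
print(mean(with_outlier))                   # ~1181.8 - no longer "typical"
```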
The median is the middle value of a data set when sorted from low to high (or the average of the two middle values if there’s an even number of points). By definition, half of the transactions are above the median and half below. This makes the median a robust indicator of the “typical” value, especially for skewed distributions. Unlike the mean, the median isn’t dragged up or down by a few extreme outliers. In a dataset of transfers that are mostly around $300 but with one or two huge transfers, the median will likely stay near $300, reflecting what’s normal for the majority of transactions.
For AML compliance monitoring, using the median transaction amount can be very useful.
Pros: It provides a realistic center of gravity for a customer's behavior, filtering out sporadic spikes. If a customer's median transfer is $250, that gives a good sense of their usual transaction size even if they occasionally send a $5,000 wire.
Cons: The median doesn't convey anything about variability or range by itself; it won't tell you whether the customer sometimes makes $1,000 or $5,000 transfers unless you also look at other metrics. Also, if the activity volume is low (say, a customer has only a few transactions), the median is less informative; in the extreme case of a single transaction, the median, the mean, and that value are all the same.
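To make the contrast concrete, here is a small sketch (same hypothetical amounts as above) comparing the median and the mean when one large wire is present.

```python
from statistics import mean, median

# Mostly $200-$500 transfers plus one large $10,000 wire (hypothetical data).
transfers = [220, 250, 300, 280, 240, 500, 310, 260, 230, 410, 10_000]
print(median(transfers))   # 280 - still reflects the customer's usual size
print(mean(transfers))     # ~1181.8 - dragged up by the single outlier
```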
While the mean and median gauge the center of the behavior, the standard deviation (std dev) measures the spread of the data: how much variation there is around the average. In transaction monitoring, standard deviation is useful for understanding how volatile or consistent a customer's behavior is. If most of a user's transactions fall in a tight band (e.g. $200-$300), the standard deviation will be relatively small; if their transaction amounts swing wildly between $50 and $5,000, the std dev will be large.
The key use of standard deviation is to set an adaptive threshold for anomaly detection. Statistically, for many distributions (assuming a normal bell-curve approximation), about 95% of observations lie within ±2 standard deviations of the mean, and ~99.7% lie within ±3 standard deviations. That means anything beyond ~2 or 3 std dev from the average is quite rare and can be considered unusual. For example, if a customer's average transfer is $500 with a standard deviation of $100, then an $800 transaction is 3 std dev above the mean: an extreme outlier in context, since fewer than 0.3% of points would naturally fall beyond 3σ in a normal distribution. In practice, such a transaction would be flagged for review because it's far outside the expected range.
Pros: Standard deviation gives a concrete way to quantify how far off a current transaction is from the norm. Using rules like “flag if amount is more than 2σ above the average” adapts to each customer’s variability (what’s high for a usually steady customer might be normal for a volatile one).
Cons: Std dev assumes a somewhat symmetric distribution around the mean; it may not be as meaningful if the distribution is highly skewed (where the median is more relevant) or if there are frequent outliers (which themselves inflate the std dev). Also, calculating a meaningful std dev requires a decent history of data points; it's less reliable for a new customer with little transaction history.
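A minimal sketch of a standard-deviation check, assuming a simple list of historical amounts per customer (the is_outlier helper and the 3-sigma cut-off are illustrative choices, not a prescribed rule):

```python
from statistics import mean, stdev

def is_outlier(amount, history, k=3):
    """Flag an amount more than k standard deviations from this customer's mean."""
    if len(history) < 10:                # too little history for a meaningful std dev;
        return False                     # new customers need other rules
    mu, sigma = mean(history), stdev(history)
    return abs(amount - mu) > k * sigma

# Hypothetical customer: transfers cluster tightly around $500.
history = [480, 520, 510, 490, 500, 470, 530, 505, 495, 500]
print(is_outlier(800, history))   # True  - far beyond 3 sigma for this customer
print(is_outlier(540, history))   # False - within normal variation
```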
In summary, mean, median, and standard deviation together help sketch a customer’s normal behavior profile. The mean gives an overall average, the median provides a robust typical value, and the standard deviation sets a gauge for what counts as a significant deviation from the norm. Next, we’ll see how these metrics are applied in real monitoring scenarios.
Defining a baseline means establishing an expected range for each user’s behavior and then continuously comparing new events against that personal benchmark. Compliance analysts often create a profile like: “This user usually sends $200-$500 per transfer to 2-3 known beneficiaries per week.” With such a baseline, the system can then automatically flag transactions that fall outside this expected pattern.
In practice, this might involve rules such as: flag if a transaction amount exceeds X times the user's average (mean), or if it falls outside 2 standard deviations of their normal amount range. Similarly, frequency-based checks can be used. For example, if a user typically makes around 5 transactions a week (with a certain std dev), suddenly making 20 in a day would be an anomaly. These techniques essentially enable dynamic thresholds that adjust to each customer. Rather than a static rule like "alert on any transfer above $10,000," the threshold for each user can be proportional to their own historical behavior (e.g., "alert if the amount is more than 3× this user's 90-day average"). According to industry best practices, layering in such user-specific behavior signals helps surface risks that wouldn't be visible under one-size-fits-all rules. By comparing each transaction to the customer's own baseline (their median amount, average frequency, usual payees, and so on), institutions can stay responsive as behavior shifts over time.
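A minimal sketch of what such per-user dynamic thresholds could look like in code (function names, data shapes, and cut-offs are illustrative assumptions, not a production rule set):

```python
from statistics import mean, stdev

def amount_alert(amount, amount_history, multiple=3):
    """Alert if the amount exceeds `multiple` times this user's historical average."""
    return bool(amount_history) and amount > multiple * mean(amount_history)

def frequency_alert(recent_count, weekly_counts, k=2):
    """Alert if a recent burst of activity (e.g. a single day's count) exceeds
    the user's weekly mean by more than k standard deviations."""
    if len(weekly_counts) < 10:
        return False
    mu, sigma = mean(weekly_counts), stdev(weekly_counts)
    return recent_count > mu + k * sigma

amounts = [200, 350, 500, 280, 420, 300]        # hypothetical history, ~$342 average
weekly  = [5, 4, 6, 5, 5, 7, 4, 5, 6, 5]        # roughly 5 transactions per week

print(amount_alert(1_200, amounts))             # True: more than 3x the user's average
print(frequency_alert(20, weekly))              # True: 20 in a day dwarfs the weekly norm
```

The design point is that the thresholds are derived from each customer's own history rather than hard-coded, so the same rule is strict for a steady customer and lenient for a naturally volatile one.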
Let's consider two examples of how these baseline metrics help in detection:
Example 1 (amount anomaly): A customer who typically sends $200-$500 per transfer suddenly wires $20,000. Measured against their median amount and standard deviation, the transaction sits far outside the expected range and is flagged for review.
Example 2 (frequency anomaly): A customer who averages around five transactions a week makes 15 in a single day. Against their normal frequency baseline, that burst of activity is a clear deviation worth investigating.
In both examples, the key is context. A $20,000 wire or 15 transactions in a day might not be alarming for some high-net-worth or business customers at all. What makes it suspicious is that it’s out-of-character for that specific user. By establishing each user’s typical behavior (through metrics like median amounts, average frequencies, etc.) and monitoring deviations, banks and fintechs dramatically improve detection accuracy. They can catch truly suspicious outliers while ignoring irrelevant noise. This balance reduces false positives (alerts for routine activity that just looks large in a generic sense) and false negatives (missed true threats).
In the fight against financial crime, one of the most powerful questions a compliance team can ask is, "Is this behavior expected for this customer?" Establishing a baseline for each user, whether it's their typical transaction size, normal login pattern, or usual transaction count, is essential to answering that question. We've seen that different metrics serve different purposes: the median often gives a truer picture of typical transaction value in the presence of outliers, the mean can indicate overall trends, and the standard deviation provides a quantitative yardstick for what's significantly outside the norm. No single metric covers it all; a combination is often ideal. For example, using the median for transaction amounts (to handle skewed distributions) paired with the standard deviation (to gauge variability) can establish a robust expected range. Compliance leads and risk analysts should choose the right tool for each behavior pattern, e.g. median or percentile-based thresholds for transaction amounts, versus standard deviation or velocity measures for frequency of actions.
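As one possible way to combine these ideas, the sketch below builds an expected range from the median and a 95th-percentile cap rather than the mean (the expected_range helper and the 95% cut-off are illustrative assumptions):

```python
from statistics import median, quantiles

def expected_range(history):
    """Return (typical value, upper bound): the median and the 95th percentile."""
    p95 = quantiles(history, n=100, method="inclusive")[94]
    return median(history), p95

# Hypothetical history of transfer amounts for one customer.
history = [220, 250, 300, 280, 240, 500, 310, 260, 230, 410]
typical, upper = expected_range(history)
print(typical, upper)        # 270 and ~459.5 for this data
print(20_000 > upper)        # True: a $20,000 wire sits far outside the expected range
```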
It’s also important to remember that expected behavior isn’t static. A customer’s habits can evolve over time (gradually or suddenly), and what was abnormal last year might be routine this year, or vice versa. This is why continuous monitoring and dynamic baselining are so crucial. By constantly recalibrating what “normal” looks like for each user, you ensure that your detection logic stays relevant and effective. A system grounded in expected behavior will catch the truly suspicious anomalies, those needle-in-haystack deviations that signal risk, while gracefully handling the day-to-day fluctuations that reflect genuine customer activity.
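One simple way to keep the baseline current is to compute it over a rolling window so that older behavior gradually ages out. A minimal sketch using pandas (hypothetical data; the 90-day window is an illustrative choice):

```python
import pandas as pd

# Hypothetical weekly transfer amounts for one customer, indexed by date.
amounts = pd.Series(
    [250, 300, 280, 260, 320, 900, 310, 290],
    index=pd.date_range("2024-01-01", periods=8, freq="W"),
)

# Recompute the "expected" level over a rolling 90-day window so the baseline
# adapts as the customer's behavior shifts.
rolling_median = amounts.rolling("90D").median()
rolling_std = amounts.rolling("90D").std()
print(rolling_median.iloc[-1], rolling_std.iloc[-1])
```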