Analytics: Finding hidden truths using partial and complete ensembles
In this context, an ensemble is defined as a connected set of transactions in support of an identifiable purpose.
The example below assumes that the transactions are financial in nature and are used in breach of anti-money-laundering (AML) laws. Other applications of our methods span a variety of industries, including power grid operation, logistics, retail and social media.
A simple laundering scenario could be:
Person A wishes to pass money to person B, without alerting authorities.
B buys a property on the open market for $1M.
B holds the property for 2 years, then sells to A for $2M.
A holds the property for 2 years, then sells to the market for $1M.
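The net effect of the three sales above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the ledger layout and the names `transfers` and `net_position` are assumptions introduced here, and "MKT" stands for the open market.

```python
from collections import defaultdict

# Each tuple is (payer, payee, amount in dollars) for one property sale.
# "MKT" stands for the open market; all names are illustrative only.
transfers = [
    ("B", "MKT", 1_000_000),  # B buys the property on the open market
    ("A", "B", 2_000_000),    # A buys it from B two years later
    ("MKT", "A", 1_000_000),  # A sells it back to the market after two more years
]

def net_position(transfers):
    """Return each party's net cash flow across all transfers."""
    net = defaultdict(int)
    for payer, payee, amount in transfers:
        net[payer] -= amount
        net[payee] += amount
    return dict(net)

positions = net_position(transfers)
print(positions)
```

Under these figures A ends $1M down, B ends $1M up and the market is flat, although each sale taken on its own looks like an ordinary property transaction.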
Clearly, the scenario can be made even harder to detect by interposing company failures, loans, self-managed superannuation funds (SMSFs), favourable tenancies, etc.
If enforcement authorities suspect such a transaction, they can apply traditional investigative practices to unravel the sometimes quite complex ensemble, and identify the points of malfeasance.
Our work deals with the preceding step: identifying the ensembles so that authorities can be alerted.
As a guide to the scale of difficulty of uninformed detection, a national banking system processes millions of transactions per day, and an AML-proscribed ensemble might span several years. Thus, searching directly for any of the myriad potential scenarios is computationally infeasible.
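A back-of-envelope count gives a feel for that scale. The figures below are assumptions chosen for illustration, not measurements: one million transactions per day (the low end of "millions"), a four-year window (matching the two holding periods in the simple scenario) and ensembles of exactly three transactions.

```python
import math

# Illustrative assumptions, not figures taken from any real banking system.
TX_PER_DAY = 1_000_000   # low end of "millions of transactions per day"
YEARS = 4                # the simple scenario spans two 2-year holding periods
ENSEMBLE_SIZE = 3        # the three sales in the simple scenario

total_tx = TX_PER_DAY * 365 * YEARS
# Number of unordered groups of ENSEMBLE_SIZE transactions in the window.
candidates = math.comb(total_tx, ENSEMBLE_SIZE)
print(f"{total_tx:,} transactions, about {candidates:.2e} unordered triples")
```

Even for this smallest scenario the candidate count exceeds 10^26, and it grows rapidly as intermediaries are added to the ensemble.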
Some progress has been made in detecting singular events that lie outside the norms for their class of participants, such as sudden increases in discretionary spending, but these again feed into an already overloaded system of follow-up audits.
We define a novel method for ensemble detection, based on existing techniques for function optimisation over a metric space in which we define a new, rather unorthodox algebra.
By this method, we replace brute-force detection followed by uninformed manual audits with two challenges: specifying an algebra that embodies the conditions we seek, and detecting local optima. For the latter we have selected simulated annealing as our initial basis, owing to its robustness and its suitability for a high-dimensional, sparsely populated space.
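As a rough illustration of the optimisation step, here is a minimal sketch of simulated annealing over subsets of a toy ledger. The algebra itself is not specified here, so the `score` function is a hypothetical stand-in rather than our objective; the ledger, the cooling schedule and every name and constant are likewise assumptions made for the example.

```python
import math
import random

# Toy ledger of (payer, payee, amount); "MKT" is the open market.
# Entries 0-2 are the simple scenario; the rest are invented noise.
LEDGER = [
    ("B", "MKT", 1_000_000),
    ("A", "B", 2_000_000),
    ("MKT", "A", 1_000_000),
    ("C", "MKT", 400_000),
    ("MKT", "D", 650_000),
    ("E", "F", 30_000),
    ("MKT", "C", 410_000),
    ("D", "MKT", 640_000),
]

def score(subset):
    """Hypothetical objective (a stand-in, NOT the algebra in the text).

    Rewards value shifted to one private party; penalises a market
    imbalance, parties that only pay or only receive, and subset size.
    """
    net, paid, received = {}, set(), set()
    for i in subset:
        payer, payee, amount = LEDGER[i]
        net[payer] = net.get(payer, 0) - amount
        net[payee] = net.get(payee, 0) + amount
        paid.add(payer)
        received.add(payee)
    market_gap = abs(net.pop("MKT", 0))
    gain = max(0, max(net.values(), default=0))
    one_sided = len((paid ^ received) - {"MKT"})
    return gain - market_gap - 1_500_000 * one_sided - 10_000 * len(subset)

def anneal(n_items, score, steps=20_000, t_start=2_000_000.0,
           t_end=1_000.0, seed=0):
    """Maximise score over subsets of range(n_items) by simulated annealing."""
    rng = random.Random(seed)
    current = frozenset()
    current_score = score(current)
    best, best_score = current, current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Neighbour move: toggle one transaction in or out of the subset.
        candidate = current ^ {rng.randrange(n_items)}
        delta = score(candidate) - current_score
        # Accept improvements always, worse moves with prob. exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, current_score + delta
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score

best, best_score = anneal(len(LEDGER), score)
print(sorted(best), best_score)
```

Under this stand-in objective the planted subset {0, 1, 2} scores 970,000, while every one- or two-transaction fragment of it scores below zero. That is the kind of landscape in which accepting occasional downhill moves matters. Whether a given run reaches the planted subset depends on the seed and the cooling schedule.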
This leverages the fact that optimisation techniques are mature, highly efficient processes, whereas graph-theoretic search algorithms are, by comparison, still naive.