In-Depth

Data mining user profile: Daniele Micci-Barreca of ClearCommerce

Daniele Micci-Barreca, director of risk management at ClearCommerce, a user of SPSS Clementine software

"ClearCommerce is a software company, and we provide software for enabling online merchants to process payments and electronic checks. As well, we provide risk management and fraud detection for online merchants."

What is the alternative? "It is usually an in-house development solution."

"We use Clementine as a data mining environment to develop predictive models that are a component of our system."

"We build a data consortium. Customers provide historical data of the transactions they process, and charge-back records. This is assembled in a data mart using an Oracle database. We consolidate the data so there is a single record for a transaction and then we attach a label [that indicates] whether it was fraud or not."
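The consolidation step described here can be sketched in a few lines. This is only an illustration, not ClearCommerce's schema: it uses SQLite rather than Oracle, the table and column names (`transactions`, `chargebacks`, `labeled_transactions`, `is_fraud`) are hypothetical, and it assumes that any charge-back on a transaction marks it as fraud.

```python
import sqlite3


def build_labeled_transactions(conn):
    """Create one row per transaction with an is_fraud label (0 or 1).

    Assumption: a transaction counts as fraud if at least one
    charge-back record refers to it. EXISTS keeps the result at one
    row per transaction even when several charge-backs match.
    """
    conn.executescript(
        """
        CREATE TABLE labeled_transactions AS
        SELECT t.txn_id,
               t.merchant_id,
               t.amount,
               CASE WHEN EXISTS (
                        SELECT 1 FROM chargebacks c WHERE c.txn_id = t.txn_id
                    ) THEN 1 ELSE 0 END AS is_fraud
        FROM transactions t;
        """
    )


def demo_labels():
    """Build a tiny in-memory data mart and return (txn_id, is_fraud) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE transactions (
            txn_id INTEGER PRIMARY KEY, merchant_id TEXT, amount REAL);
        CREATE TABLE chargebacks (txn_id INTEGER, reason TEXT);
        INSERT INTO transactions VALUES
            (1, 'm1', 25.0), (2, 'm1', 900.0), (3, 'm2', 40.0);
        -- Two charge-back records for the same transaction.
        INSERT INTO chargebacks VALUES (2, 'fraud'), (2, 'duplicate notice');
        """
    )
    build_labeled_transactions(conn)
    rows = conn.execute(
        "SELECT txn_id, is_fraud FROM labeled_transactions ORDER BY txn_id"
    ).fetchall()
    conn.close()
    return rows
```

Using `EXISTS` instead of a plain join is a deliberate choice in this sketch: a join would emit transaction 2 twice, which would break the "single record for a transaction" property the quote describes.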

The solution then points Clementine at the data mart, runs the company's algorithms on the data, and produces a neural network model trained to distinguish fraudulent transactions from legitimate ones.
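To make the training step concrete, here is a minimal sketch of the simplest possible neural model: a single sigmoid unit fitted by gradient descent on labeled records. It is not Clementine's algorithm or ClearCommerce's model. The two features (a scaled amount and a country-mismatch flag), the toy data, and the function names are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the unit's estimated fraud probability for one record."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def train_neuron(rows, labels, epochs=2000, learning_rate=0.5):
    """Fit a single sigmoid unit with per-record gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            error = predict(weights, bias, features) - label
            # Gradient of the log loss with respect to each parameter.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def demo_training():
    """Train on hypothetical records: [scaled amount, country mismatch]."""
    rows = [[0.1, 0], [0.2, 0], [0.9, 1], [0.8, 1], [0.3, 0], [0.7, 1]]
    labels = [0, 0, 1, 1, 0, 1]  # 1 = fraud, 0 = not fraud
    weights, bias = train_neuron(rows, labels)
    high_risk = predict(weights, bias, [0.85, 1])
    low_risk = predict(weights, bias, [0.15, 0])
    return high_risk, low_risk
```

A production fraud model would use many more inputs, hidden layers, and held-out data for validation; the sketch only shows how a labeled record set turns into a scoring function.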

"In the past, neurals were seen as a black art by statisticians. [More recently,] the differences between neural nets and statistical models have come together."

Tips
"You want a solid environment. You can find a number of little tools [for free]. You can find source code for specialized algorithms, most of which come out of academia. These are mostly highly experimental. They are not written for commercial purposes."

What is important is the coupling of the data mart with the algorithms.

As systems get used, data ends up being moved back and forth between databases and flat files, because many of the tools work with flat files. That shuttling can become unmanageable.

"ClearCommerce's complete process is built totally in a database. Clementine never moves the data out of the database the way we use it."
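The idea of keeping the data in the database can be illustrated with a small sketch in which scoring is expressed as SQL, so only the model's parameters go in and only a summary comes out. This is not how Clementine generates its database operations; SQLite stands in for Oracle, and the table names, columns, and weights below are hypothetical.

```python
import sqlite3


def score_in_database(conn, bias, w_amount, w_mismatch):
    """Score every transaction with a linear model, entirely in SQL.

    The rows are read and written by the database engine; Python only
    supplies three numbers. A score above 0 is treated as a fraud flag.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS txn_scores "
        "(txn_id INTEGER PRIMARY KEY, score REAL, flagged INTEGER)"
    )
    conn.execute("DELETE FROM txn_scores")
    conn.execute(
        """
        INSERT INTO txn_scores (txn_id, score, flagged)
        SELECT txn_id, s, CASE WHEN s > 0 THEN 1 ELSE 0 END
        FROM (
            SELECT txn_id,
                   ? + ? * amount + ? * country_mismatch AS s
            FROM transactions
        )
        """,
        (bias, w_amount, w_mismatch),
    )
    conn.commit()


def demo_flagged():
    """Return the ids of flagged transactions for a tiny example table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE transactions (
            txn_id INTEGER PRIMARY KEY, amount REAL, country_mismatch INTEGER);
        INSERT INTO transactions VALUES
            (1, 25.0, 0), (2, 900.0, 1), (3, 40.0, 0);
        """
    )
    # Hypothetical weights; a real model would supply learned values.
    score_in_database(conn, bias=-1.0, w_amount=0.001, w_mismatch=2.0)
    flagged = [row[0] for row in conn.execute(
        "SELECT txn_id FROM txn_scores WHERE flagged = 1 ORDER BY txn_id")]
    conn.close()
    return flagged
```

The contrast with a flat-file workflow is that no export, transfer, or re-import step exists to get out of sync with the source tables.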
