Challenges
- Data included 77,445 claim records of which only 120 had been determined to be potentially fraudulent
- So identified potentially fraudulent claims are rare events (0.15%) and therefore hard to detect
- It was however expected that there could be a large number of undetected fraudulent claims
Solution
- Gradient Boosting – a powerful machine learning algorithm was used for detecting potentially fraudulent cases
- Substantial lift demonstrated. – on the client test data set it sufficed to examine 7.75% of all claims to identify 91.67% of all fraudulent claims
