Start time: 9.00 am GMT
End time: 11.00 am GMT
Announcement from the European Actuarial Academy organiser: Actuarial analytics has found its way into several areas of the insurance value chain, mostly through supervised learning tools such as linear or tree-based regression. Unsupervised learning, such as partitional clustering, appears to be used far less often, despite its potential to yield insights into high-dimensional insurance data sets.
Cluster analysis is the task of grouping a set of objects (often data points) in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups. In contrast to simple segmentation (e.g. by geographical location only), clustering uses several features to differentiate among those groups. Potential applications are manifold and centre on questions such as:
• In which customer segments do we mainly generate new business?
• Which typical customer should we have in mind while designing new insurance products?
• How can we make use of granular information, such as diagnosis or treatment codes, while dealing with a limited number of observations or claims?
• How can we identify outliers in our underwriting or claims process?
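To make the definition of cluster analysis given above concrete, here is a minimal sketch of partitional clustering using k-means (Lloyd's algorithm) in plain Python. It is an illustration only, not material from the course: the six data points, the two feature columns (a scaled age and a scaled sum insured) and the choice of the first k points as starting centroids are all assumptions made for this example.

```python
import math

def kmeans(points, k, iterations=20):
    """Minimal k-means (Lloyd's algorithm).

    Assumption for this sketch: the first k points are used as the
    initial centroids, which keeps the example deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical, already scaled features: (age, sum insured).
policies = [
    (0.25, 0.10), (0.30, 0.15), (0.27, 0.11),  # younger, low sums insured
    (0.60, 0.80), (0.65, 0.85), (0.62, 0.90),  # older, high sums insured
]

labels, centroids = kmeans(policies, k=2)
print(labels)
```

On this toy data the two groups are well separated, so the first three policies end up in one cluster and the last three in the other. Real insurance data would usually need feature scaling and a more careful initialisation than the one assumed here.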
The course provides an introduction to clustering that does not require any previous knowledge of this area and gives participants a head start on their own problems. We therefore focus on typical stumbling blocks that arise when clustering techniques are applied in practice, such as interpretability, missing values and mixed data types.
The web session is open to all interested persons. Previous knowledge of partitional clustering is not required; however, basic statistical knowledge is recommended.
Technical requirements: Please check with your IT department whether your firewall and computer settings support web session participation (the programme GoToTraining/GoToWebinar is used for the web session).
Your early-bird registration fee is € 100.00 plus 19% VAT for bookings by 22 March 2021. After this date, the fee will be € 140.00 plus 19% VAT.
- Introduction to standard clustering techniques
- Feature selection
- Cluster validation
- Visualization techniques for clustering results, such as transformation- or perturbation-based approaches
- Brief introduction to the imputation of missing values in pre-processing or during clustering
- Clustering of mixed data types (numerical and categorical features)
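As a rough illustration of the last two agenda items, the sketch below computes a Gower-style dissimilarity between records that mix numerical and categorical features. It is not taken from the course material and is not how any particular R package works. The feature names, the three example policies and the age range of 40 are hypothetical. Missing values are simply skipped pair by pair here, which is a simpler alternative to the imputation mentioned in the agenda.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower-style dissimilarity between two records given as dicts.

    numeric_ranges maps each numerical feature to its range (max - min);
    every other feature is treated as categorical. Features that are
    missing (None) in either record are skipped for that pair.
    """
    total, used = 0.0, 0
    for name in a:
        x, y = a[name], b[name]
        if x is None or y is None:
            continue  # pairwise deletion, not imputation
        if name in numeric_ranges:
            # Numerical feature: absolute difference scaled to [0, 1].
            total += abs(x - y) / numeric_ranges[name]
        else:
            # Categorical feature: 0 if equal, 1 otherwise.
            total += 0.0 if x == y else 1.0
        used += 1
    if used == 0:
        raise ValueError("no feature is observed in both records")
    return total / used

# Hypothetical policies with one numerical and two categorical features.
policy_a = {"age": 30, "region": "north", "smoker": "no"}
policy_b = {"age": 50, "region": "north", "smoker": "yes"}
policy_c = {"age": None, "region": "south", "smoker": "yes"}

ranges = {"age": 40}  # assumed range of ages in the portfolio

d_ab = gower_distance(policy_a, policy_b, ranges)
d_ac = gower_distance(policy_a, policy_c, ranges)
d_bc = gower_distance(policy_b, policy_c, ranges)
print(d_ab, d_ac, d_bc)
```

A dissimilarity of this kind can be passed to any clustering method that works on a distance matrix rather than on raw coordinates, which is one common way to handle mixed data types.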
The theoretical explanations will be accompanied by a practical example in R on a public data set showcasing a typical insurance application.
Dr. Oliver Pfaffel
Oliver has been working in the reinsurance industry for the past eight years. He currently specializes in the use of artificial intelligence for automatic underwriting and information extraction from text. Prior to this role, he worked as an actuarial data analyst using supervised machine learning techniques for advanced pricing approaches, and in risk management on Solvency II related topics such as model validation. Dr. Pfaffel has a PhD in mathematical statistics from the Technical University of Munich (TUM), with research stays at the National University of Singapore and Columbia University in the City of New York. At TUM, he taught a course on life insurance mathematics for master's students. He has several peer-reviewed publications in the areas of financial mathematics and random matrix theory and is the author and maintainer of the CRAN packages FeatureImpCluster and ClustImpute.