Task
1. Briefly describe the business problem, the definition and distribution of the target, and the available data.
2. Split the data into a training set and a validation set. Use your r-number as the seed in the data partition node.
3. Explore the distribution of the variables and their relation with the target:
Study the strength of the univariate relation between each predictor and the target.
Visualize the relation between a few important predictors and the target.
Use a decision tree to explore the multivariate relation between the predictors and the target.
4. Build different types of predictive models:
Logistic regression models that include different types of variable transformations, methods for variable selection, etc.
A decision tree
A random forest model
5. Compare the performance of the constructed models (on validation data) using different criteria (proportion of incorrect classifications, ROC curve, area under the curve, lift, cumulative lift, captured response, etc.).
6. Describe the results of the logistic regression model that yields the most meaningful insights for the business (i.e., describe importance of the variables, interpret regression coefficients, etc.).
7. Indicate which predictive model is most suited to select the top 10% of policyholders with the highest probability to file a claim in the next 6 months. Discuss the predictive performance of this model (i.e., lift, cumulative lift, captured response, etc.).
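
For step 2, a seeded partition can be sketched outside the modelling tool as follows. This is only an illustration in plain Python: the 70/30 split, the seed value 123456 (standing in for the numeric part of an r-number), and the placeholder records are all assumptions, not values given in the task.

```python
import random

def split_train_validation(rows, seed, train_fraction=0.7):
    """Shuffle rows reproducibly and split them into (train, validation)."""
    rng = random.Random(seed)      # fixed seed -> same partition on every run
    shuffled = list(rows)          # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical seed standing in for the numeric part of an r-number.
SEED = 123456
policy_ids = list(range(1000))     # placeholder for the real policyholder records
train, validation = split_train_validation(policy_ids, SEED)
```

Re-running the function with the same seed returns the same two sets, which is the point of fixing the seed in the partition step.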
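
For step 3, one simple way to look at a univariate relation is to compare the claim rate across bins of a predictor. The sketch below assumes a 0/1 target and a numeric predictor; the variable name vehicle_age and the toy values are made up for illustration and do not come from the assignment data.

```python
def claim_rate_by_bin(values, targets, n_bins=4):
    """Sort records by predictor value, cut them into roughly equal-sized
    bins, and return (bin_min, bin_max, claim_rate) for each bin."""
    pairs = sorted(zip(values, targets))
    n = len(pairs)
    summary = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:              # can happen when n_bins > number of records
            continue
        rate = sum(t for _, t in chunk) / len(chunk)
        summary.append((chunk[0][0], chunk[-1][0], rate))
    return summary

# Made-up toy data: a hypothetical predictor and a 0/1 claim flag.
vehicle_age = [1, 2, 3, 4, 5, 6, 7, 8]
claimed = [0, 0, 0, 1, 0, 1, 1, 1]
bins = claim_rate_by_bin(vehicle_age, claimed, n_bins=2)
for low, high, rate in bins:
    print(f"age {low}-{high}: claim rate {rate:.2f}")
```

A claim rate that changes steadily from bin to bin suggests a strong univariate relation; a flat profile suggests a weak one.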
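
For step 5, two of the listed criteria can be computed directly from validation-set scores. The sketch below assumes a 0/1 target and predicted probabilities; the 0.5 cut-off and the toy scores are assumptions for illustration. The AUC is computed from its rank interpretation (the probability that a randomly chosen claimant is scored above a randomly chosen non-claimant), which is quadratic in the number of records and so only suitable for small checks.

```python
def misclassification_rate(y_true, scores, threshold=0.5):
    """Share of records whose predicted class (score >= threshold) is wrong."""
    wrong = sum((s >= threshold) != bool(y) for y, s in zip(y_true, scores))
    return wrong / len(y_true)

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Made-up validation labels and scores for one hypothetical model.
y_valid = [1, 0, 1, 0, 0]
model_scores = [0.9, 0.2, 0.6, 0.7, 0.1]
error = misclassification_rate(y_valid, model_scores)
area = auc(y_valid, model_scores)
```

Computing the same two numbers for each candidate model on the same validation records gives a like-for-like comparison.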
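
For step 6, logistic regression coefficients are on the log-odds scale, so exponentiating a coefficient gives the multiplicative change in the odds of a claim for a one-unit increase in that predictor, holding the others fixed. The coefficient values and predictor names below are hypothetical and only show the arithmetic.

```python
import math

def odds_ratio(coef, delta=1.0):
    """Multiplicative change in the odds when the predictor increases by delta."""
    return math.exp(coef * delta)

# Hypothetical fitted coefficients (log-odds scale); not results from the assignment data.
coefficients = {"prior_claims": 0.40, "vehicle_age": -0.05}

for name, beta in coefficients.items():
    print(f"{name}: odds ratio per unit = {odds_ratio(beta):.3f}")
```

An odds ratio above 1 means the odds of a claim rise with the predictor, a value below 1 means they fall, and a value of exactly 1 means no effect.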
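
For step 7, lift and captured response at a 10% depth can be sketched as below. The function assumes a 0/1 target and model scores where higher means more likely to claim; the toy data (20 policyholders, 4 claims) is made up for illustration.

```python
def top_fraction_metrics(y_true, scores, fraction=0.10):
    """Lift and captured response among the highest-scored fraction of records."""
    ranked = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)
    k = max(1, int(round(fraction * len(ranked))))   # number of records selected
    hits = sum(y for _, y in ranked[:k])             # claims inside the selection
    total = sum(y_true)                              # claims in the whole set
    base_rate = total / len(y_true)
    return {
        "selected": k,
        "lift": (hits / k) / base_rate,              # claim rate in selection vs overall
        "captured_response": hits / total,           # share of all claims captured
    }

# Made-up example: scores decrease with position; claims sit at positions 0, 1, 5, 12.
toy_scores = [1 - i / 20 for i in range(20)]
toy_claims = [1 if i in (0, 1, 5, 12) else 0 for i in range(20)]
result = top_fraction_metrics(toy_claims, toy_scores, fraction=0.10)
```

In this toy case the top 10% is two policyholders, both of whom claimed, so the selection has five times the overall claim rate and captures half of all claims. Evaluating the function at increasing depths (10%, 20%, and so on) gives the points of a cumulative lift curve.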