There is substantial global interest in trends in variables such as obesity, diabetes prevalence, blood pressure and cholesterol, as they are risk factors for non-communicable diseases such as cardiovascular conditions and cancers. The Non-Communicable Disease Risk Factor Collaboration (NCD-RisC) produces estimates of these variables at national level, and shows how they have varied over time. Patterns observed in the data include a general increase in body-mass index and obesity, and complex trends in blood pressure driven by diet and medication use. There are also interesting patterns in diabetes prevalence and cholesterol levels that can be explored.
The task for this project is to investigate aspects of the NCD-RisC data that you feel might be of interest. There are no restrictions on what you can investigate, and you will not be assessed on the complexity of your investigation, the level of sophistication of your analysis, or how “successful” your investigation is. The focus of this assignment is on the use of the tools and techniques from the module to support efficient and reproducible data analysis, rather than the data analysis itself. Feedback on this assessment will be based on the approach to the analysis that you take, and how well you observe best practice in your approach.
For the Analysis Report submission, you should submit a maximum 6-page report (made using R Markdown) describing your analysis, with reference to the phases of CRISP-DM. Your report should describe two consecutive cycles of CRISP-DM.
Note that ProjectTemplate is covered in Week 5 of the module, so don’t panic that we haven’t seen it yet! For the ProjectTemplate Directory submission, you should submit a single zip file containing your ProjectTemplate directory.
The assessment requires careful consideration of a specific dataset, which you should carry out yourself rather than using AI. You should only use AI in a very limited way, such as for debugging of code or spell-checking of your report.
This assessment focuses on the exploration of global health trends using data from the Non-Communicable Disease Risk Factor Collaboration (NCD-RisC). The dataset contains long-term estimates of variables such as obesity, BMI, diabetes prevalence, cholesterol levels, and blood pressure, which are key indicators linked to non-communicable diseases including cardiovascular disease and cancers.
The core objective is not to produce an advanced or highly sophisticated data analysis. Instead, the emphasis is on adopting best practices in data investigation, reproducibility, and analytical documentation. To achieve this, students are expected to:
Select any health-related variable or trend of interest from the NCD-RisC dataset.
Conduct a structured investigation following two complete cycles of the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework.
Use ProjectTemplate to ensure a reproducible workflow, where data, preprocessing scripts, cached files, and configuration settings are properly organised.
Apply dplyr for data manipulation and ggplot2 for visualisation.
Produce a maximum 6-page R Markdown report, fully reproducible and clearly linked to the CRISP-DM phases.
Submit a ZIP file containing the entire ProjectTemplate directory, including:
Data folder
Munge (preprocessing) scripts
Cache files
Config settings
Report folder with the .Rmd file
README explaining how to run the analysis
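As a rough illustration of how the preprocessing and caching pieces listed above might fit together, the sketch below shows a cleaning step of the kind that could live in a munge script. It is only a sketch under stated assumptions: the function name `clean_bmi`, the file names, and the column names (`country`, `sex`, `year`, `mean_bmi`) are hypothetical and are not taken from the actual NCD-RisC files.

```r
# Sketch of a preprocessing step that could live in munge/01-clean.R
# (hypothetical file name). The column names country, sex, year and
# mean_bmi are assumptions for illustration, not real NCD-RisC headers.
library(dplyr)

clean_bmi <- function(raw) {
  raw %>%
    filter(!is.na(mean_bmi)) %>%         # drop rows with no estimate
    mutate(year = as.integer(year)) %>%  # store year as an integer
    arrange(country, sex, year)          # stable, readable row order
}

# Inside a ProjectTemplate project, the munge script might then contain
# (assuming a hypothetical data/bmi.csv that is loaded as `bmi`):
#   bmi_clean <- clean_bmi(bmi)
#   cache("bmi_clean")   # saves the cleaned object into cache/
# and the report would start with:
#   library(ProjectTemplate)
#   load.project()
```

Keeping the cleaning logic in a small function makes it easy to check on a toy data frame before running it on the full dataset, and caching the result means the report does not need to repeat the preprocessing each time it is knitted.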
The assessment aligns with learning outcomes related to data handling, reproducibility, visualisation, scientific reasoning, software lifecycle awareness, and transparent analysis.
The Academic Mentor supported the student throughout the assignment, breaking the task into manageable steps and ensuring that the student followed best practices at each stage of the workflow.
The mentor first explained the importance of NCD risk factors and how they change over time. The student was encouraged to select one variable, such as BMI trends, diabetes prevalence, or cholesterol levels, that was interesting yet manageable within the limited timeframe.
The mentor emphasised:
Choosing a clear and simple research question.
Ensuring that the analysis remains focused and aligned with the available years of data.
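A focused question of this kind, for example how one measure has changed in a handful of countries over a fixed span of years, could be explored with a short dplyr and ggplot2 sketch like the one below. The function name `plot_trend`, its arguments, and the column names (`country`, `year`, `value`) are illustrative assumptions rather than part of the brief or the dataset.

```r
# Minimal sketch: restrict the data to chosen countries and years,
# then draw one line per country. Column names are assumed, not real.
library(dplyr)
library(ggplot2)

plot_trend <- function(df, countries, from, to) {
  df %>%
    filter(country %in% countries, year >= from, year <= to) %>%
    ggplot(aes(x = year, y = value, colour = country)) +
    geom_line() +
    labs(x = "Year", y = "Estimated value", colour = "Country")
}
```

In an R Markdown report, a call to this function would sit in a code chunk, so the figure is regenerated from the data every time the report is knitted.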
© Copyright 2025 My Uni Papers – Student Hustle Made Hassle Free. All rights reserved.