MA7007 Statistical Modelling and Forecasting Case Study Report 2024-2025

Published: 03 Jul, 2025

Category	Assignment	Subject	Education
University		Module Title	MA7007 Statistical Modelling and Forecasting

Word Count	5000 Words
Assessment Type	Case Study
Assessment Title	Report

Submission Date:

11th July 2025

MA7007 Assessment

This document describes the coursework for MA7007. The coursework involves the statistical analysis of real data sets in R and the writing of a report to describe the results. [Note: the emphasis is on the report, which means that producing just computer output is not enough. The output should be complemented with intelligent comments and explanations.]

The coursework consists of the following:

Each student is given two data sets.
Each student is required to find a third data set, related to their own interest.
Each student is expected to analyse the first two data sets following the instructions given below.
For the third data set the student is required to show their own initiative in analysing the data.
Each student should write a small report (less than 5000 words) describing how they have done the three analyses and describing their results. Section 2 gives instructions on writing the report.
The students will have a chance to obtain feedback on their work on two occasions:
– by showing the answers to the first question of the report to the tutor in week 4 for an informal assessment.
– by participating in the tutorials in week 6, where they can show their progress and ask questions.

1 The data

1.1 Instructions on how to analyse the first data set

The first data set is a subset of Body Mass Index (BMI) data obtained from the Fourth Dutch Growth Study, Fredriks et al. 2000 [1].

The data contains BMI for different ages in years for Dutch boys. Each student will be given a different age, for example, 10 to 11 years old. The aim here is to find a suitable distribution of the BMI at this age.

(a) The original data, which contains all ages from zero to twenty-two, exists in the gamlss.data package under the name dbbmi. Each student should analyse a different age. Here we give an example of how to analyse age 10. We first bring the data set into R and then create a subset data.frame containing only a specific age (here from 10 to 11). The following commands can be used:

library(gamlss)
data(dbbmi)
# this value will vary from student to student
old <- 10
da <- subset(dbbmi, age > old & age < old + 1)
bmi10 <- da$bmi
The vector bmi10 now contains only the BMI values of the subjects aged from 10 to 11. You can plot the data using:
hist(bmi10)

or:

library(MASS)
truehist(bmi10,nbins=30)

or:

library(gamlss.ggplots)
gamlss.ggplots:::y_hist(bmi10)
Note that by increasing the argument nbins of the truehist() function the histogram will become more dense. Find a suitable value of nbins for the histogram to look good. You only need to show one histogram in your report.

(b) Fit different parametric distributions to the data and choose an appropriate distribution for the data. Justify the choice by explaining what you have done and why you selected this specific distribution.

(c) Output the parameter estimates for your chosen model using the function summary() and interpret the fitted parameters. You may refer to the GAMLSS distribution book, Rigby et al. (2019) [2] (or to its earlier version, which can be found on the GAMLSS website https://www.gamlss.com), to find what the distribution parameters represent (i.e. location, scale, kurtosis etc.).
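
As an illustration of parts (b) and (c), the sketch below fits a handful of candidate families to the age 10 subset and ranks them by GAIC. The shortlist (NO, GA, LOGNO, BCCGo), the object names and the penalty k = 2 are assumptions made for this example and are not part of the brief; fitDist() in gamlss can automate a wider search.

```r
library(gamlss)   # also attaches gamlss.dist and gamlss.data
data(dbbmi)

old <- 10         # example age from the brief; yours will differ
da  <- subset(dbbmi, age > old & age < old + 1)

# Assumed shortlist of candidate families for a positive response.
# Each model has no covariates (bmi ~ 1), so only the distribution differs.
models <- list(
  NO    = gamlss(bmi ~ 1, data = da, family = NO,    trace = FALSE),
  GA    = gamlss(bmi ~ 1, data = da, family = GA,    trace = FALSE),
  LOGNO = gamlss(bmi ~ 1, data = da, family = LOGNO, trace = FALSE),
  BCCGo = gamlss(bmi ~ 1, data = da, family = BCCGo, trace = FALSE)
)

# GAIC with k = 2 is the AIC; smaller values indicate a better trade-off
# between fit and number of parameters.
gaic <- sapply(models, GAIC, k = 2)
print(sort(gaic))

# Keep the best-ranked model and inspect it.
best <- models[[which.min(gaic)]]
summary(best)   # parameter estimates, reported on the link scale
wp(best)        # worm plot of the residuals as a quick adequacy check
```

summary() reports the coefficients on the link scale of each parameter; fitted(best, "mu")[1] returns the corresponding fitted value on the original scale, which is usually easier to interpret in the report.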

1.2 Instructions on how to analyse the second data set

Cohen et al. (2010) [3] analysed the handgrip (HG) strength in relation to gender and age in English schoolchildren. Here each student is required to analyse a different sample of 1000 from the original 3766 English boys. The data are stored in the package gamlss.data under the name grip and contain the variables grip and age. The aim here is to create centile curves for grip given age.

(a) Read the data file by typing data(grip) into R. Note that the gamlss packages have to be installed and loaded first, i.e. library(gamlss).
(b) In order to select your individual sample a unique seed number will be given to you. (In the example below we use the seed number 243 for demonstration.)

Select your individual sample using your own seed number:
set.seed(243)
index<-sample(3766,1000)
mydata<-grip[index, ]
dim(mydata)

(d) Use the LMS method to fit the data. That is, fit the BCCG distribution for grip:
gbccg <- gamlss(grip ~ pb(age), sigma.fo = ~pb(age), nu.fo = ~pb(age), data = mydata, family = BCCG)
where the smoothing for age uses the P-splines function pb(), i.e. pb(age), in the predictors for the parameters µ, σ and ν.
How many degrees of freedom were used for smoothing in the model? Use the function edf() or edfAll().

(e) Use the fitted values from the LMS model in (d) as starting values for fitting the BCT and the BCPE distributions to the data, e.g.
gbct <- gamlss(grip ~ pb(age), sigma.fo = ~pb(age), nu.fo = ~pb(age), tau.fo = ~pb(age), data = mydata, family = BCT, start.from = gbccg)
What are the effective degrees of freedom fitted for the parameters? Try to interpret the effective degrees of freedom.

(f) Use the generalised Akaike information criterion, GAIC, to compare the three models.

(g) Plot the fitted parameters for the fitted models in (d) and (e) using, for example, fitted.plot(gbccg, gbct, x = mydata$age), where gbccg and gbct are the BCCG and BCT models respectively.

(h) Obtain a centile plot for the fitted models in (d) and (e) using centiles() or centiles.split() and compare them.

(i) Investigate the residuals from the fitted models in (d) and (e) using e.g. plot(), wp() (worm plot) and Q.stats() (Q-statistics).
(j) Choose between the models and give a reason for your choice.
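
Steps (d) to (i) can be strung together roughly as follows. This is a sketch under stated assumptions: it uses the demonstration seed 243 from the brief, the object names gbcpe and cmp are made up for this example, and k = 2 is only one possible GAIC penalty. The smoothed BCT and BCPE fits can take a little while and may warn about convergence.

```r
library(gamlss)   # also attaches gamlss.dist and gamlss.data
data(grip)

set.seed(243)                       # demonstration seed; use your own
index  <- sample(3766, 1000)
mydata <- grip[index, ]

# (d) LMS method: BCCG with P-spline smoothers for mu, sigma and nu
gbccg <- gamlss(grip ~ pb(age), sigma.fo = ~pb(age), nu.fo = ~pb(age),
                data = mydata, family = BCCG, trace = FALSE)

# (e) BCT and BCPE add a fourth parameter (tau); start from the BCCG fit
gbct  <- gamlss(grip ~ pb(age), sigma.fo = ~pb(age), nu.fo = ~pb(age),
                tau.fo = ~pb(age), data = mydata, family = BCT,
                start.from = gbccg, trace = FALSE)
gbcpe <- gamlss(grip ~ pb(age), sigma.fo = ~pb(age), nu.fo = ~pb(age),
                tau.fo = ~pb(age), data = mydata, family = BCPE,
                start.from = gbccg, trace = FALSE)

# Effective degrees of freedom used by each smoother
edfAll(gbccg)
edfAll(gbct)

# (f) Compare the three models; k = 2 gives the AIC,
#     k = log(nrow(mydata)) would give the SBC instead
cmp <- GAIC(gbccg, gbct, gbcpe, k = 2)
print(cmp)

# (g) Fitted parameters plotted against age
fitted.plot(gbccg, gbct, x = mydata$age)

# (h) Centile curves for grip given age
centiles(gbccg, xvar = mydata$age)
centiles(gbct,  xvar = mydata$age)

# (i) Residual diagnostics for one candidate model
plot(gbct)                          # standard residual plots
wp(gbct)                            # worm plot
Q.stats(gbct, xvar = mydata$age)    # Q-statistics by age interval
```

Repeating the diagnostics in (i) for each candidate, and reading them alongside the GAIC table, gives the evidence needed for the choice in (j).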

1.3 Instructions on how to analyse the third data set

a) Make sure that the data set is checked for its suitability by your tutor. We are dealing, in this module, with regression-type models, so the data should contain one response variable (the target) and more than one explanatory variable.
b) Give the website source (or other relevant information) for your data set and explain the purpose of the analysis i.e. why you would like to analyse the data.
c) Perform a preliminary analysis on your data. This usually involves exploratory plots. Comment on how reliable the data are and on possible pitfalls in the original data collection.
d) Find an appropriate statistical model for the response variable in your data using the explanatory variables. This usually involves selecting:
• an appropriate distribution for your response variable and
• a selection of relevant explanatory variables to explain the response.
e) Use diagnostics to check the assumptions of the model.
f) Use the model for prediction.
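
The workflow in (c) to (f) might look like the sketch below. Everything specific in it is an assumption for illustration only: the rent data from gamlss.data stands in for your own data set, and the chosen explanatory variables, the two candidate families and the 200-row hold-out sample are arbitrary choices, not recommendations.

```r
library(gamlss)   # also attaches gamlss.dist and gamlss.data
data(rent)        # placeholder data; replace with your own data set

# (c) Preliminary analysis: summaries and an exploratory plot
summary(rent[, c("R", "Fl", "A", "H", "loc")])
plot(R ~ Fl, data = rent)

# Hold back some rows so that prediction can be checked later
set.seed(1)
test_idx <- sample(nrow(rent), 200)
train <- rent[-test_idx, ]
test  <- rent[test_idx, ]

# (d) Candidate response distributions with the same explanatory variables
m_no <- gamlss(R ~ Fl + A + H + loc, sigma.fo = ~Fl,
               data = train, family = NO, trace = FALSE)
m_ga <- gamlss(R ~ Fl + A + H + loc, sigma.fo = ~Fl,
               data = train, family = GA, trace = FALSE)
print(GAIC(m_no, m_ga, k = 2))      # smaller is better

# (e) Diagnostics on the normalised quantile residuals
plot(m_ga)
wp(m_ga)

# (f) Predict the distribution parameters for the held-out rows
pred <- predictAll(m_ga, newdata = test[, c("Fl", "A", "H", "loc")])
head(cbind(observed = test$R, predicted_mu = pred$mu))
```

For the GA family the fitted mu is the mean of the response, so pred$mu can be compared directly with the observed values in the hold-out sample.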


2 How to write the report

The report should have the following structure:

1. Introduction

2. First data set (fitting distributions to the data)
(a) Comment on the different distributions you are using.
(b) Which distribution did you choose?
(c) Give reasons why you chose the distribution in part (b).
(d) Plot the fitted distribution and comment.
(e) State the fitted parameter values of the final chosen model.

3. Second data set (centile estimation)
(a) Comment on the different models you are using.
(b) Answer the explicit questions in section 1.2.
(c) Use residual diagnostics for checking the model.
(d) Comment on how you selected your final model.
(e) Comment on the final centile plots.

4. Third data set (students’ data)
(a) Explain why you collected the data and what question you are trying to answer.
(b) Give a preliminary analysis on the collected data and comment on the reliability of the data.
(c) Use an appropriate model(s) to fit the data.
(d) Comment on how you selected your final model, including diagnostics.
(e) Show how you will use the model for prediction.

5. Peer Review
(a) Choose one from a selection of student work on the third data set.
(b) Write a short critique on the adequacy of the work. You should look at:

the quality of the explorative analysis of the data set
the choice of the distribution for the response (target) variable
the method for selecting and checking the model
the interpretation of the results

6. Conclusions

7. References

8. Appendix

Important points:

DO NOT PUT DATA in the report (only the first 20 cases in the Appendix).
Any figure you use should have a caption below like: Figure 1: Showing the linear regression of y against time. (You should refer to Figure 1 in the text).
You should only put the important results or figures in the report and comment on them.
Put PAGE Numbers into your report.
The report does not have to be long. Extensive output without comments will get little credit. Your comments and explanation are most important.


MA7007 References

[1] A.M. Fredriks, S. van Buuren, R.J.F. Burgmeijer, J.F. Meulmeester, R.J. Beuker, E. Brugman, M.J. Roede, S.P. Verloove-Vanhorick, and J.M. Wit. Continuing positive secular change in The Netherlands, 1955-1997. Pediatric Research, 47:316–323, 2000.
[2] R.A. Rigby, M.D. Stasinopoulos, G.Z. Heller, and F. De Bastiani. Distributions for modeling location, scale, and shape: Using GAMLSS in R. CRC Press, 2019.
[3] D.D. Cohen, C. Voss, M.J.D. Taylor, D.M. Stasinopoulos, A. Delextrat, and G.R.H. Sandercock. Handgrip strength in English schoolchildren. Acta Paediatrica, 99:1065–1072, 2010.
