| Word Count | 1500 Words |
| Assessment Type | Written Assignment |
| Assessment Title | Assignment 1 |
| Academic Year | 2025-26 |

CMS3503/CMS7503 Machine Learning Assignment 1
General Study Guidance
- The University has regulations relating to academic misconduct, including plagiarism. The Academic Skills Team can advise and help you with how to avoid ‘poor scholarship’ and potential academic misconduct.
- If you have any concerns about your writing, referencing, research, or presentation skills, please consult the Academic Skills Team and book a tutorial appointment.
- Further study resources, including the Academic Skills Team overview, can be found here: Study resources.
Assignment 1: Literature Investigation
1. Assignment Aims
Demonstrate comprehensive knowledge and critical understanding of the use of machine learning techniques in the chosen problem/application area.
2. Learning Outcomes:
Upon completion of this module, learners will be able to:
1. Recognise the multi-disciplinary nature of Data Analysis and/or Artificial Intelligence and the potential application areas.
2. Critically appraise relevant literature to formulate a plan for their own practical/experimental work.
3. Assessment Brief
Literature Investigation: The assessment is to apply machine learning algorithms of your choice to analyse a real-world benchmark problem. Three concrete application areas with benchmark datasets are listed in the Appendix; if you wish to work on another domain, you must consult with your module tutor first. This assignment should produce a scholarly report concerning the use of machine learning methods in the application area you’ve selected. You should evaluate both the appropriateness and the readiness of a set of ML techniques for a problem class.
The word count for this assignment is 1500 words. The weighting of this assignment is 40% of the overall module grade.
The report must be submitted electronically. The specific requirements for this component are as follows:
- The motivation and background to the literature survey in this domain
- An introduction to the problem domain and the area of focus
- A critical review of the associated literature
- The findings or summary of the reviewed literature
This component is assessed through a submitted PDF report.
References should follow the APA 7 style recommended by the university: https://library.hud.ac.uk/pages/apareferencing/
Please note that the specified word count does not include references, and you must explicitly state the word count in your report.
Both assignments must address the same topic/dataset.
4. Marking Scheme and Grading Rubric
Marking scheme for Literature Review – The weighting of this assignment is 40% of the overall module grade.
| | <30% | 30-40% | 40-50% | 50-60% | 60-70% | >70% |
|---|---|---|---|---|---|---|
| Coverage of the area, including background, planning, and literature review. | Not acceptable | Some attempt to cover the area, but with serious limitations. | Brief with significant limitations. | Good coverage, but with some notable limitations. | Very good coverage of the area and associated issues with a good review of the literature. | Excellent coverage, showing a sound understanding of the topic. Excellent critical review of literature. |
| Structure and presentation. References, bibliography. | No clear structure, and the presentation is very weak. Poor or no bibliography, reference list, or citations in the report. | Weak structure, poor presentation. Poor bibliography, reference list, and citations in the report. | Satisfactory approach to structure and presentation. List of references present, but with significant limitations. | Well structured and well presented. Most references are in the correct format from both web and traditional sources. | Very well structured and prepared with only minor limitations. References cited in correct notation from both web and traditional sources. | Highly professional approach; excellent structure. Thorough reference citation from a variety of sources. |
Appendix – Application Area
Please note: You are required to choose one of the benchmark data sets below for your investigation and development task. If you would like to work on an alternative application area, consult the module leader, m.jilani@hud.ac.uk, first.
1. The World Health Organisation (WHO) characterised COVID-19, caused by SARS-CoV-2, as a pandemic on March 11, 2020, while the exponential increase in the number of cases risked overwhelming health systems around the world with a demand for ICU beds far above the existing capacity. This dataset contains anonymised data from patients seen at the Hospital Israelita Albert Einstein in São Paulo, Brazil, who had samples collected for the SARS-CoV-2 RT-PCR and additional laboratory tests during a visit to the hospital. Using this dataset, you may complete one of the two tasks below for this assignment:
a. Task 1: Predict confirmed COVID-19 cases among suspected cases.
i. Based on the results of laboratory tests commonly collected for a suspected COVID-19 case during a visit to the emergency room, would it be possible to predict the test result for SARS-CoV-2 (positive/negative)?
b. Task 2: Predict admission to the general ward, semi-intensive unit or intensive care unit among confirmed COVID-19 cases.
i. Based on the results of laboratory tests commonly collected among confirmed COVID-19 cases during a visit to the emergency room, would it be possible to predict which patients will need to be admitted to a general ward, semi-intensive unit or intensive care unit?
c. More information about the data set and how to download it can be found at the following link: https://www.kaggle.com/dataset/e626783d4672f182e7870b1bbe75fae66bdfb232289da0a61f08c2ceb01cab01
2. Shoulder Implant X-Ray Manufacturer Classification Data Set: Images were collected by Maya Stark at BIDAL Lab at SFSU for her MS thesis project. The original collection included 605 X-ray images. Eight images that appeared to have been taken from the same patients were removed, leaving a final set of 597 images. The final set contains images from the following manufacturers: 83 from Cofield, 294 from Depuy, 71 from Tornier, and 149 from Zimmer, resulting in a 4-class classification problem. Class labels are provided as the manufacturer's name in the file names.
a. https://archive.ics.uci.edu/ml/datasets/Shoulder+Implant+X-Ray+Manufacturer+Classification
b. https://scholarworks.calstate.edu/concern/theses/79407z98n
3. Simulated Falls and Daily Living Activities Data Set: 20 falls and 16 daily living activities were performed by 17 volunteers with 5 repetitions each (3,060 instances) while wearing 6 sensors that were attached to their head, chest, waist, wrist, thigh and ankle.
a. https://archive.ics.uci.edu/ml/datasets/Simulated+Falls+and+Daily+Living+Activities+Data+Set
b. Ozdemir, A.T.; Barshan, B. “Detecting Falls with Wearable Sensors Using Machine Learning Techniques.” Sensors 2014, 14, 10691-10708.
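The dataset sizes quoted in this appendix can be sanity-checked with a few lines of Python before any modelling begins. The sketch below is illustrative only: it uses just the counts stated above, reads nothing from the actual datasets, and the function names (`class_shares`, `majority_baseline`, `falls_instances`) are hypothetical helpers rather than part of any library. It shows how uneven the four implant classes are and what accuracy a trivial "always predict the largest class" model would reach, which may be a useful reference point when judging results reported in the literature.

```python
# Counts copied from the appendix text (assumption: they match the
# downloaded data; verify against the files you actually obtain).
IMPLANT_COUNTS = {"Cofield": 83, "Depuy": 294, "Tornier": 71, "Zimmer": 149}


def class_shares(counts):
    """Return each class's share of the total as a fraction in [0, 1]."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def majority_baseline(counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(counts.values()) / sum(counts.values())


def falls_instances(n_falls=20, n_daily=16, n_volunteers=17, n_repetitions=5):
    """Number of recorded trials: activities x volunteers x repetitions."""
    return (n_falls + n_daily) * n_volunteers * n_repetitions


if __name__ == "__main__":
    print("Total implant images:", sum(IMPLANT_COUNTS.values()))
    for name, share in class_shares(IMPLANT_COUNTS).items():
        print(f"  {name}: {share:.1%}")
    print(f"Majority-class baseline accuracy: {majority_baseline(IMPLANT_COUNTS):.1%}")
    print("Falls data set instances:", falls_instances())
```

Because the implant classes are imbalanced, plain accuracy can look better than it is; a review of this area might therefore pay attention to whether papers report per-class metrics as well.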