| Category | Assignment | Subject | Computer Science |
| University | TAFE NSW - Sydney Institute | Module Title | ITICT209A Machine Learning Foundations |
| Assessment Name | Practical Project Assignment | Academic year | 2025 |
Assessment Description
This is an Individual Assessment.
Content
Students will be provided with a dataset in Moodle. The dataset contains 20,000 records of personal and financial information and is designed for predicting a binary outcome: whether a loan applicant is likely to be approved or denied. The dataset columns are:
- Age: Applicant's age (in years).
- Income: Applicant's annual income (in dollars).
- Experience: Applicant's work experience (in years).
- Loan Amount: Requested loan amount (in dollars).
- Interest Rate: Interest rate on the loan (in percentage).
- Credit History Length: Length of the applicant's credit history (in years).
- Credit Score: A numeric representation of the applicant's creditworthiness.
- Loan Approval Status: Binary outcome indicating loan approval status (yes for approved, no for denied).
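The column list above can be read as a record schema. The following is a minimal sketch of that idea, assuming the file is a CSV whose headers match the names in the list and that the age, experience, credit history and credit score fields are whole numbers. The header names, the `LoanRecord` class, and the two sample rows are illustrative assumptions, not taken from the actual Moodle file.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical header names and made-up rows; the real file may differ.
SAMPLE_CSV = """Age,Income,Experience,Loan Amount,Interest Rate,Credit History Length,Credit Score,Loan Approval Status
34,72000,10,15000,7.5,8,710,yes
22,28000,1,20000,13.2,2,580,no
"""


@dataclass
class LoanRecord:
    age: int
    income: float
    experience: int
    loan_amount: float
    interest_rate: float
    credit_history_length: int
    credit_score: int
    approved: int  # 1 for "yes" (approved), 0 for "no" (denied)


def parse_row(row: dict) -> LoanRecord:
    """Convert one CSV row (all strings) into a typed record."""
    return LoanRecord(
        age=int(row["Age"]),
        income=float(row["Income"]),
        experience=int(row["Experience"]),
        loan_amount=float(row["Loan Amount"]),
        interest_rate=float(row["Interest Rate"]),
        credit_history_length=int(row["Credit History Length"]),
        credit_score=int(row["Credit Score"]),
        # Encode the yes/no target as 1/0 so classifiers can use it directly.
        approved=1 if row["Loan Approval Status"].strip().lower() == "yes" else 0,
    )


records = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(records[0])
```

Encoding the target as 1/0 at load time keeps the later training and metric code free of string comparisons.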
Using the above dataset, students are required to examine the following three classification algorithms: Logistic Regression, Support Vector Machines (SVM), and K-Nearest Neighbour (KNN). Each student is expected to:
- Conduct a brief literature review on each of the three algorithms (the student must read at least three journal articles and cite them in the literature review).
- Train the models on 80% of the dataset and evaluate their performance on the remaining 20%.
- Compare the performance of the algorithms using suitable evaluation metrics and present the results in a table.
- Provide a comprehensive discussion and analysis of the results.
- Summarise the conclusions and include a reference table listing all sources cited in the body of the report.
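The evaluation metrics used for the comparison can all be derived from the four confusion-matrix counts. The sketch below shows the standard definitions of accuracy, precision, recall and F1-score for a binary target, assuming approved is encoded as 1 and denied as 0. The function name `classification_metrics` and the label lists are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = approved)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # approved, predicted approved
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # denied, predicted denied
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # denied, predicted approved
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # approved, predicted denied

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "of the applicants predicted as approved, how many really were?", while recall answers "of the applicants who really were approved, how many did the model find?". Reporting both alongside accuracy matters when the two classes are not equally frequent.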
Structure of the report:
1. Introduction
- Provide a brief explanation of the importance of classification in machine learning.
- Introduce the three selected algorithms: Logistic Regression, Support Vector Machines (SVM), and K-Nearest Neighbour (KNN).
2. Literature Review of Classification Algorithms
- Summarise each algorithm, outlining its key features, strengths, limitations, and typical use cases.
3. Dataset
- Explain any necessary preprocessing steps, such as handling missing data or removing duplicates.
- Describe the process of splitting the dataset into training and testing sets.
- Briefly discuss the purpose of training data in building models and testing data in assessing their performance.
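The preprocessing and splitting steps listed above can be sketched with the standard library alone. This is one possible approach under stated assumptions, not a prescribed method: duplicates are exact row matches, missing values are marked as `None`, and gaps are filled with the column median. The column keys, helper names and rows are hypothetical. Libraries such as pandas (`DataFrame.drop_duplicates`) and scikit-learn (`train_test_split`) offer equivalent operations.

```python
import random
from statistics import median

# Made-up rows for illustration; None marks a missing value.
# The keys are hypothetical names, not the actual column headers.
rows = [
    {"income": 52000, "credit_score": 690, "approved": 1},
    {"income": 31000, "credit_score": 560, "approved": 0},
    {"income": 87000, "credit_score": 745, "approved": 1},
    {"income": None, "credit_score": 610, "approved": 0},
    {"income": 45000, "credit_score": 655, "approved": 1},
    {"income": 31000, "credit_score": 560, "approved": 0},  # exact duplicate
    {"income": 66000, "credit_score": 700, "approved": 1},
    {"income": 28000, "credit_score": 540, "approved": 0},
    {"income": 73000, "credit_score": 720, "approved": 1},
    {"income": 39000, "credit_score": 600, "approved": 0},
    {"income": 58000, "credit_score": 675, "approved": 1},
]


def drop_duplicates(records):
    """Keep the first occurrence of each identical row."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))
    return unique


def split_80_20(records, seed=42):
    """Shuffle with a fixed seed, then hold out 20% of the rows for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(0.2 * len(shuffled))
    return shuffled[n_test:], shuffled[:n_test]


def column_medians(records, columns):
    """Median of the observed (non-missing) values in each column."""
    return {c: median(r[c] for r in records if r[c] is not None) for c in columns}


def fill_missing(records, medians):
    """Replace None with the supplied median for that column."""
    return [
        {k: (medians[k] if v is None and k in medians else v) for k, v in r.items()}
        for r in records
    ]


features = ["income", "credit_score"]
unique_rows = drop_duplicates(rows)
train, test = split_80_20(unique_rows)
medians = column_medians(train, features)  # learned from the training rows only
train = fill_missing(train, medians)
test = fill_missing(test, medians)
print(len(rows), len(unique_rows), len(train), len(test))  # 11 10 8 2
```

The medians are computed after the split and from the training rows only, so that no information from the test rows influences the values the model is trained on. The fixed seed makes the split reproducible between runs.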
4. Analysis of the Three Classification Algorithms
- Train and test each algorithm using the dataset. Evaluate them using at least three metrics, such as accuracy, precision, recall, and F1-score.
- Display the results in a comparative table to clearly show the performance of each model.
- Interpret the outcomes, noting any significant patterns or differences observed.
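The train, evaluate and compare workflow described in this section might look like the following sketch, assuming scikit-learn is installed. Because the Moodle dataset is not reproduced here, the code uses synthetic data from `make_classification` as a stand-in with seven numeric features. The hyperparameters (default `SVC`, `n_neighbors=5`, `max_iter=1000`) and the use of `StandardScaler` are illustrative choices, and any scores printed say nothing about how the algorithms perform on the real loan data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the loan dataset: 7 numeric features, binary target.
X, y = make_classification(
    n_samples=1000, n_features=7, n_informative=5, n_redundant=0, random_state=42
)

# 80% training, 20% testing; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# SVM and KNN are sensitive to feature scale, so each model gets a scaler.
# The scaler is fitted on the training data only, inside the pipeline.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC()),
    "KNN (k=5)": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "Accuracy": accuracy_score(y_test, y_pred),
        "Precision": precision_score(y_test, y_pred),
        "Recall": recall_score(y_test, y_pred),
        "F1": f1_score(y_test, y_pred),
    }

# Comparative table, one row per model.
print(f"{'Model':<22}{'Accuracy':>10}{'Precision':>11}{'Recall':>9}{'F1':>7}")
for name, m in results.items():
    print(
        f"{name:<22}{m['Accuracy']:>10.3f}{m['Precision']:>11.3f}"
        f"{m['Recall']:>9.3f}{m['F1']:>7.3f}"
    )
```

Wrapping each classifier in a pipeline keeps the comparison fair: all three models see identically scaled inputs and the same train/test split, so differences in the table reflect the algorithms rather than the preprocessing.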
5. Conclusion
- Recap the main findings, stating which algorithm performed the best and the reasons why (based on the dataset used).
- Offer any relevant insights or suggestions derived from the analysis.
6. References (up to 3% penalty for incorrect referencing)
- List all cited sources using a consistent and appropriate referencing style (e.g., APA, IEEE).