COS10022 Data Science Principles Assignment 1 Semester 1 2025 | SUT

Published: 14 Jul, 2025

Category: Assignment
Subject: Computer Science
University: Swinburne University of Technology
Module Title: COS10022 Data Science Principles
Assessment Title: Predictive Model Creation and Evaluation
Academic Year: 2025

COS10022 Assessable Item:

One (1) written report of no more than 10 pages, with the signed Assignment Cover Sheet.

The submitted report must be checked by Turnitin, and the similarity outside the template sections should be less than 12%.

The submitted report should answer all questions listed in the assignment task section in sequence.

You must include a digitally signed Assignment Cover Sheet with your submission.

Purpose of Assignment

This assignment aims to evaluate students' achievement of the following unit learning outcomes:

1. Explain the key concepts, techniques, and tools for handling the data and creating prediction models.

2. Work on feature and model selection and implementation in a data science project.

This is an individual assignment that requires peer review and communication with colleagues. Refer to the Unit Outline for the late submission penalty policy. You can ignore the high similarity on the cover page and the template wording, but not in your report content. You must ensure your submitted report has a similarity lower than 12% in total and less than 6% from a single source. Otherwise, your report will not be marked.

Key Lessons:

You are asked to divide the dataset and then use linear and logistic regression to build two models in the KNIME Analytics Platform.

COS10022 Introduction

The dataset contains 150 tuples covering 7 fish species commonly seen in the market. The source data includes 6 attributes. This assignment has two goals: the first is to build a linear regression model for predicting the weight of the fish, i.e., the value in the "Weight_of_Fish_in_Gram" attribute; the second is to build a logistic regression model for predicting the species of the fish. You are expected to follow the instructions for building your predictive models and answer the questions.

Assignment Goal

This assignment aims to give students experience in selecting independent attributes, splitting the data into training and test sets, training a usable predictive model, and explaining the outputs. A small discovery and research component is included in the assignment to expand the students' skill set.

COS10022 Assignment Task

The dataset has been cleaned and organised with no missing data. Your tasks are to select the proper attributes and to create the predictive models according to the instructions, so that you can answer the questions listed below. The source file is "Fish_Species_2024.csv". The report should be prepared using the template and should answer the questions after you have found the required information, split the dataset, and trained and tested the models. A table of contents is not required.

Data Preparation (10%)

You must follow the instructions to split the given dataset into training and test sets. Remember, a well-split dataset is the foundation for model training and testing. You are required to use a Shuffle node with 9214 as the seed value to shuffle the input data. You must then partition 80% of the input data into the training set using the "draw randomly" method, with 9214 as the seed value.
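The shuffle-then-partition step can be sketched outside KNIME as follows. This is only a minimal Python illustration, not the KNIME workflow: `shuffle_rows` and `partition_randomly` are hypothetical helpers, the rows are placeholder integers rather than the fish data, and Python's random number generator differs from KNIME's, so the seed 9214 will not select the same rows as the Shuffle and Partitioning nodes. The point is simply that a fixed seed makes the split repeatable.

```python
import random

SEED = 9214            # seed value required by the assignment
TRAIN_FRACTION = 0.8   # 80% of the rows go to the training set


def shuffle_rows(rows, seed):
    """Return a shuffled copy of rows using a private, seeded generator."""
    rng = random.Random(seed)
    shuffled = list(rows)          # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled


def partition_randomly(rows, train_fraction, seed):
    """Draw a random subset of rows for training; the rest form the test set."""
    rng = random.Random(seed)
    n_train = round(len(rows) * train_fraction)
    train_index = set(rng.sample(range(len(rows)), n_train))
    train = [row for i, row in enumerate(rows) if i in train_index]
    test = [row for i, row in enumerate(rows) if i not in train_index]
    return train, test


# Placeholder row identifiers standing in for the 150 tuples (not the real data).
rows = list(range(150))

train, test = partition_randomly(shuffle_rows(rows, SEED), TRAIN_FRACTION, SEED)
train_again, test_again = partition_randomly(shuffle_rows(rows, SEED), TRAIN_FRACTION, SEED)

print(len(train), len(test))
print(train == train_again and test == test_again)  # same seed, same split
```

Changing either seed changes which rows land in each set, which is why the brief insists on a specific value.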

Linear Regression (40%)

The data source records several details for each fish. Our goal is to build a model that predicts the weight of the fish, which is recorded in the "Weight_of_Fish_in_Gram" attribute of the given file. Your mission is to create a linear regression model in KNIME and visualise the prediction result.
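As a rough illustration of what a linear regression model does, the sketch below fits a straight line by ordinary least squares on the training rows and scores it on the test rows. It is a simplification under stated assumptions: it uses a single predictor (a hypothetical "length" measurement), the numbers are made up rather than taken from Fish_Species_2024.csv, and `fit_line` and `r_squared` are illustrative helpers, not KNIME functions. A KNIME linear regression model would normally use several attributes at once.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up numbers for illustration only (not from the assignment dataset).
train_length = [10.0, 12.0, 14.0, 16.0, 18.0]
train_weight = [52.0, 61.0, 69.0, 80.0, 88.0]
test_length = [11.0, 15.0, 17.0]
test_weight = [56.0, 75.0, 84.0]

# Fit on the training rows only, then evaluate on the unseen test rows.
intercept, slope = fit_line(train_length, train_weight)
predicted = [intercept + slope * x for x in test_length]

print(f"weight ~ {intercept:.2f} + {slope:.2f} * length")
print(f"R^2 on the test rows: {r_squared(test_weight, predicted):.3f}")
```

Plotting `predicted` against `test_weight` is one common way to visualise how close the predictions are to the recorded values.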

Logistic Regression (40%)

Using the same source file, we aim to build a predictive model for classifying the input fish into the corresponding species.
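To make the classification step concrete, the sketch below shows how a multinomial (softmax) logistic regression turns linear scores into class probabilities and picks the most probable species. Everything numeric here is an assumption for illustration: the coefficients are hypothetical rather than learned, only three of the seven species are shown, the two input features are made up, and `softmax` and `predict_species` are illustrative helpers rather than KNIME functionality.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_species(features, weights, biases, labels):
    """Score each class linearly, apply softmax, return the most probable label."""
    scores = [
        sum(w * x for w, x in zip(class_weights, features)) + bias
        for class_weights, bias in zip(weights, biases)
    ]
    probabilities = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best], probabilities


# Hypothetical coefficients: one weight vector and one bias per species.
# A real model would learn these from the training set.
labels = ["Perch", "Smelt", "Whitefish"]
weights = [[0.2, 0.1], [-0.5, 0.3], [0.4, -0.2]]
biases = [0.0, 1.0, -1.0]

species, probabilities = predict_species([3.0, 2.0], weights, biases, labels)
print(species, [round(p, 3) for p in probabilities])
```

Comparing the predicted label with the recorded species for every test row gives the accuracy of the classifier.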

Performance Improvement (10%) - This part can be optional

Creating a linear regression model is simple. Improving the accuracy of its predictions requires a bit more effort. Let's focus on a single species of fish, the Perch. If you may select only three (3) attributes as the input for your linear regression model, find a way to decide which attributes should be included. Note that when building this model, you must ensure that the tuples in the new training and test sets are subsets of the original training and test sets, respectively.
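One possible way to choose the three attributes, shown here only as a sketch, is to rank the candidates by the absolute value of their Pearson correlation with the weight, using the Perch rows filtered from the existing training set. This is an assumption about method, not the expected answer: correlation ranking ignores redundancy between attributes, and alternatives such as comparing the test error of every three-attribute combination are equally valid. In the code, only "Weight_of_Fish_in_Gram" is a real column name; "Species", "attr_a" to "attr_d", and all values are hypothetical placeholders.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def top_attributes(rows, target, candidates, k=3):
    """Return the k candidates most strongly correlated with the target."""
    ys = [row[target] for row in rows]
    scored = [(abs(pearson([row[c] for row in rows], ys)), c) for c in candidates]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


TARGET = "Weight_of_Fish_in_Gram"
CANDIDATES = ["attr_a", "attr_b", "attr_c", "attr_d"]  # hypothetical names

# Made-up training rows for illustration only.
training_set = [
    {"Species": "Perch", TARGET: w, "attr_a": a, "attr_b": b, "attr_c": c, "attr_d": d}
    for w, a, b, c, d in [
        (100.0, 1.0, 5.0, 1.0, 3.0),
        (200.0, 2.0, 3.0, 2.0, 1.0),
        (300.0, 3.0, 4.0, 3.0, 2.0),
        (400.0, 4.0, 1.0, 5.0, 3.0),
        (500.0, 5.0, 2.0, 4.0, 1.0),
    ]
]
training_set.append(
    {"Species": "Smelt", TARGET: 10.0, "attr_a": 9.0, "attr_b": 9.0, "attr_c": 9.0, "attr_d": 9.0}
)

# Filter AFTER splitting, so the Perch rows are a subset of the original training set.
perch_train = [row for row in training_set if row["Species"] == "Perch"]

selected = top_attributes(perch_train, TARGET, CANDIDATES)
print(selected)
```

Filtering the already-split sets, rather than re-splitting the Perch rows, is what keeps the new training and test sets inside the original ones.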

### Important Note ### You must use the seed value specified in the instructions. Otherwise, you will get different results from the correct answer in almost all questions.

This assignment is worth 100 marks in total. Your proposal must address the following tasks.

1. Follow the instructions above to split the source data into training and test sets. Answer the following questions after splitting the data. [10 marks in total]

1) Submit the Assignment 1 workflow via Assignment 1.1. [2.5 marks]
2) How many tuples are included in the training set? [2.5 marks]
3) How many species are included in the test set? [2.5 marks]
4) Do species "Whitefish" and "Smelt" have the same number of tuples included in the test set? [2.5 marks]
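The counting questions above come down to tallying rows per species after the split. A minimal sketch of that tally, assuming the test-set species labels have been exported to a Python list, is given below. The labels are hypothetical and do not reflect the real test set, so the printed values are not the answers to the questions.

```python
from collections import Counter

# Hypothetical species labels for a tiny stand-in test set (not the real data).
test_species = ["Perch", "Smelt", "Perch", "Whitefish", "Smelt", "Perch"]

counts = Counter(test_species)

print(len(test_species))   # number of tuples in this stand-in set
print(len(counts))         # number of distinct species present
# Counter returns 0 for a species that is absent, so this comparison is safe
# even if one of the two species did not make it into the set.
print(counts["Whitefish"] == counts["Smelt"])
```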
