771768 - Introduction to Programming for Artificial Intelligence and Data Science Assignment

Published: 25 Jan, 2025
Category: Assignment
Subject: Computer Science
University: University of Hull
Module Title: 771768 - Introduction to Programming for Artificial Intelligence and Data Science

Context

This assignment is designed to evaluate your ability to read and write file formats of common types used in Data Science, and to manipulate complex data into different representations. The tasks provided here are indicative of Data Pre-processing workloads which are common to all Data Science projects. The techniques learned and evaluated in this module will prepare you for further theoretical and applied topics later on in the programme, where you will further develop your skills.

This assignment makes use of an extensive collection of mocked data. The data have been generated with some resemblance to real-world values and distributions, including some relations between data elements.

Although teaching for this module, both asynchronous and synchronous, stops by Teaching Week 4, ad-hoc support will be available on the MS Teams site until the assignment is submitted.

Tasks

Data Processing (70%)

Using standard Python (no pandas / seaborn) with default libraries (os, sys, time, json, csv, …), you have been given the following tasks:

1. Read in the provided ACW Data using the CSV library.
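As a rough illustration of Task 1, the sketch below reads CSV rows into dictionaries with `csv.DictReader`. The header names and the in-memory sample are hypothetical stand-ins for the provided ACW data, whose real columns are not shown in this brief.

```python
import csv
import io

def read_rows(handle):
    """Return every CSV row as a dict keyed by the header names."""
    return list(csv.DictReader(handle))

# Hypothetical two-row sample standing in for the provided ACW data file.
sample = io.StringIO(
    "First Name,Age (Years)\n"
    "Ada,36\n"
    "Alan,41\n"
)
rows = read_rows(sample)
print(rows[0])  # every value is still a string at this point
```

With a real file, the same function could be called as `read_rows(open(path, newline="", encoding="utf-8"))`, ideally inside a `with` block so the file is closed afterwards.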

2. As a CSV file is an entirely flat file structure, we need to convert our data back into its rich structure. Convert all flat structures into nested structures. These are notably:

a. Vehicle - consists of make, model, year, and type.
b. Credit Card - consists of start date, end date, number, security code, and IBAN.
c. Address - consists of the main address, city, and postcode.

For this task, it may be worthwhile inspecting the CSV headers to see which data columns may correspond to the structures above. Note: Ensure that the values read in are appropriately cast to their respective types.
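One possible shape for the nesting in Task 2 is sketched here. Every column name (for example `"Vehicle Make"`) and every output key is an assumption made for illustration and should be checked against the real CSV headers. The sketch casts the vehicle year to `int` and deliberately keeps the card number and security code as strings so that any leading zeros survive, which is a design choice rather than a requirement of the brief.

```python
def nest_record(row):
    """Group flat vehicle, credit card, and address columns into sub-dicts."""
    return {
        "first_name": row["First Name"],
        "vehicle": {
            "make": row["Vehicle Make"],
            "model": row["Vehicle Model"],
            "year": int(row["Vehicle Year"]),  # cast from str to int
            "type": row["Vehicle Type"],
        },
        "credit_card": {
            "start_date": row["Credit Card Start Date"],
            "end_date": row["Credit Card Expiry Date"],
            "number": row["Credit Card Number"],  # kept as str (assumption)
            "security_code": row["Credit Card CVV"],  # kept as str (assumption)
            "iban": row["IBAN"],
        },
        "address": {
            "street": row["Address Street"],
            "city": row["Address City"],
            "postcode": row["Address Postcode"],
        },
    }

# Hypothetical flat row, as csv.DictReader would produce it (all strings).
flat = {
    "First Name": "Ada",
    "Vehicle Make": "Ford", "Vehicle Model": "Focus",
    "Vehicle Year": "2015", "Vehicle Type": "Hatchback",
    "Credit Card Start Date": "08/18", "Credit Card Expiry Date": "11/27",
    "Credit Card Number": "0123456789012345", "Credit Card CVV": "042",
    "IBAN": "GB00EXAMPLE0000000000",
    "Address Street": "1 Example Road", "Address City": "Hull",
    "Address Postcode": "HU6 7RX",
}
nested = nest_record(flat)
print(nested["vehicle"])
```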

3. The client informs you that they have had difficulty with errors in the dependants column. Some entries are empty (i.e. “ ” or “”), which may hinder your conversion from Task 2. These should be changed into something meaningful when encountered. Print a list of the rows where such corrections take place, e.g. Problematic rows for dependants: [16, 58, 80, 98]
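A minimal sketch of the dependants clean-up in Task 3 follows. It assumes the column is called `"Dependants"`, that a blank entry can reasonably be replaced with 0, and that rows are reported by their zero-based position in the data; none of these choices is fixed by the brief.

```python
def fix_dependants(rows, column="Dependants", default=0):
    """Cast the dependants column to int in place, replacing blank entries.

    Returns the zero-based indices of the rows that needed correcting.
    """
    problem_rows = []
    for index, row in enumerate(rows):
        raw = row[column].strip()
        if raw == "":
            row[column] = default  # assumption: blank means no dependants
            problem_rows.append(index)
        else:
            row[column] = int(raw)
    return problem_rows

# Hypothetical rows; the second and third have blank dependants entries.
rows = [
    {"Dependants": "2"},
    {"Dependants": " "},
    {"Dependants": ""},
    {"Dependants": "1"},
]
problems = fix_dependants(rows)
print(f"Problematic rows for dependants: {problems}")
```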

4. Write all records to a processed.json file in the JSON data format. This should be a list of dictionaries, where each element of the list is a dictionary representing a single person.
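Writing a list of dictionaries as JSON, as Task 4 asks, could look roughly like this. The records and the temporary output directory are placeholders chosen so the sketch runs anywhere; in the assignment the path would simply be `processed.json`.

```python
import json
import os
import tempfile

def write_json(records, path):
    """Write a list of dictionaries to path as a JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=4)

# Hypothetical records standing in for the nested customer data.
people = [
    {"first_name": "Ada", "vehicle": {"make": "Ford", "year": 2015}},
    {"first_name": "Alan", "vehicle": {"make": "Audi", "year": 2019}},
]
out_path = os.path.join(tempfile.mkdtemp(), "processed.json")
write_json(people, out_path)

# Read the file back to confirm it holds a list of dictionaries.
with open(out_path, encoding="utf-8") as handle:
    loaded = json.load(handle)
print(len(loaded), loaded[0]["first_name"])
```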

5. You should create two additional output files, retired.json and employed.json. These should contain all retired customers (as indicated by the retired field in the CSV) and all employed customers (as indicated by the employer field in the CSV) respectively, and be in the JSON data format.
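The two subsets in Task 5 can be selected with simple filters, sketched below. This assumes the `retired` value has already been cast to a real boolean (a raw CSV string such as `"False"` would otherwise count as true) and that an unemployed customer has a blank or `"N/A"` employer; both assumptions, and the key names, need checking against the actual data.

```python
import json

def split_customers(records, no_employer=("", "N/A")):
    """Return (retired, employed) lists selected from the records."""
    retired = [p for p in records if p["retired"]]
    employed = [p for p in records if p["employer"].strip() not in no_employer]
    return retired, employed

# Hypothetical records with already-cast boolean 'retired' values.
people = [
    {"first_name": "Ada", "retired": False, "employer": "Acme Ltd"},
    {"first_name": "Alan", "retired": True, "employer": "N/A"},
    {"first_name": "Grace", "retired": False, "employer": ""},
]
retired, employed = split_customers(people)
print(json.dumps(retired))
print(json.dumps(employed))
```

Each list could then be written with `json.dump` to retired.json and employed.json.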

6. The client states that there may be some issues with credit card entries. Any customers that have more than 10 years between their start and end date need writing to a separate file, called remove_ccard.json, in the JSON data format. The client will manually deal with these later based on your output. They request that you write a function to help perform this, which accepts a single row from the CSV data and outputs whether the row should be flagged. This can then be used when determining whether to write the current person to the remove_ccard file. Note that the dates are shown in the format used on credit cards, which is “MM/YY”.
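The flagging function described in Task 6 might be sketched as follows. The column names are hypothetical, and "more than 10 years" is interpreted here as a gap of more than 120 whole months, so a card valid for exactly 10 years is not flagged. Parsing uses `datetime.strptime` with the `%m/%y` format from the standard library.

```python
from datetime import datetime

def should_flag(row,
                start_key="Credit Card Start Date",
                end_key="Credit Card Expiry Date"):
    """Return True when the card's start and end dates are over 10 years apart."""
    start = datetime.strptime(row[start_key].strip(), "%m/%y")
    end = datetime.strptime(row[end_key].strip(), "%m/%y")
    # Compare in whole months, since the dates carry no day component.
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return months > 120

# Hypothetical rows: a 9-year card, an exactly 10-year card, and a longer one.
ok_row = {"Credit Card Start Date": "08/18", "Credit Card Expiry Date": "11/27"}
edge_row = {"Credit Card Start Date": "01/10", "Credit Card Expiry Date": "01/20"}
bad_row = {"Credit Card Start Date": "01/10", "Credit Card Expiry Date": "02/20"}
print(should_flag(ok_row), should_flag(edge_row), should_flag(bad_row))
```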

7. You have been tasked with calculating some additional metrics which will be used for ranking customers. You should create a new data attribute for our customers called “Salary-Commute”. Reading in from processed.json: 

a. Add, and calculate appropriately, this new attribute. It should represent the Salary that a customer earns, per Km of their commute.
i. Note: If a person travels 1 or fewer commute Km, then their Salary-Commute would be just their salary.

b. Sort these records by that new metric, in ascending order.

c. Write the output in JSON format to a commute.json file.
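The Salary-Commute calculation and sort in Task 7 can be sketched as below. The `salary` and `commute_distance` keys are assumed names for values already cast to numbers; the in-memory list stands in for the records that would be loaded from processed.json.

```python
def add_salary_commute(person):
    """Add the 'Salary-Commute' attribute: salary per Km of commute."""
    distance = person["commute_distance"]
    salary = person["salary"]
    # A commute of 1 Km or less leaves the metric equal to the salary.
    person["Salary-Commute"] = salary if distance <= 1 else salary / distance
    return person

# Hypothetical records standing in for the contents of processed.json.
records = [
    {"first_name": "Ada", "salary": 30000, "commute_distance": 10},
    {"first_name": "Alan", "salary": 20000, "commute_distance": 0.5},
    {"first_name": "Grace", "salary": 50000, "commute_distance": 25},
]
ranked = sorted(
    (add_salary_commute(p) for p in records),
    key=lambda p: p["Salary-Commute"],  # ascending by default
)
print([(p["first_name"], p["Salary-Commute"]) for p in ranked])
```

The sorted list could then be passed to `json.dump` to produce commute.json.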
