ALY 6110 Data Management & Big Data Spring Assignment: Download and Process Two Datasets with Spark

Published: 25 Jan, 2025
Category: Assignment
Subject: Computer Science
University: Northeastern University
Module Title: ALY 6110 Data Management & Big Data Spring

Overview and Rationale

Spark is intended for use with data lakes, which were discussed previously. Being able to process these large data sets effectively with Spark is important, and this assignment gives you practice in using Spark to analyze a large data set.

Course Outcomes

This assignment is directly linked to the following key learning outcomes from the course syllabus:

Use methodologies to analyze big data sets and determine insights, including the use of Spark

Assignment Summary

For this assignment, you will download and process, with Spark, two of the following datasets OR two datasets your team selected for the final project. Note, however, that this is an individual assignment. If you choose your project dataset, your group may choose to incorporate your work into their assignment, but that is not required.

Government Sourced

Annual House Price Indexes (see Working Papers):

https://www.fhfa.gov/research/papers/wp1601?redirect=

https://www.fhfa.gov/research/papers/wp1602?redirect=

https://www.fhfa.gov/research/papers/wp1604?redirect=

Three-Digit ZIP Codes (Developmental Index; Not Seasonally Adjusted)

CSV download (for Spark)

Instructions

Select two datasets, use the SparklyR library in R to load them into Spark, and use a combination of SparklyR and traditional R commands to draw insights from the data, using different types of graphs and charts.
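
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not a prescribed solution: it assumes Spark is already installed locally, and the column names (zip3, year, hpi), the table name "hpi", and the toy rows are all hypothetical placeholders invented so the sketch runs end to end. They are not real FHFA values or the real file layout; swap in the path and columns of the CSV you actually downloaded.

```r
library(sparklyr)
library(dplyr)
library(ggplot2)

# Toy rows invented purely for illustration; NOT real FHFA data.
# Replace csv_path with the path to your downloaded CSV file.
toy <- data.frame(
  zip3 = c("021", "021", "100", "100"),
  year = c(2014L, 2015L, 2014L, 2015L),
  hpi  = c(100, 104, 100, 103)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Connect to a local, single-machine Spark instance.
sc <- spark_connect(master = "local")

# Load the CSV into Spark as a table named "hpi" (hypothetical name).
hpi_tbl <- spark_read_csv(
  sc,
  name = "hpi",
  path = normalizePath(csv_path, winslash = "/"),
  header = TRUE,
  infer_schema = TRUE
)

# The grouping and averaging run inside Spark; collect() then brings
# the small summarised result back into R as an ordinary data frame.
yearly_avg <- hpi_tbl %>%
  group_by(year) %>%
  summarise(mean_hpi = mean(hpi, na.rm = TRUE)) %>%
  collect() %>%
  arrange(year)

# Traditional R plotting on the collected result.
p <- ggplot(yearly_avg, aes(x = year, y = mean_hpi)) +
  geom_line() +
  geom_point() +
  labs(x = "Year", y = "Mean index value",
       title = "Mean index by year (toy data)")
print(p)

# Close the Spark connection when finished.
spark_disconnect(sc)
```

Summarising in Spark before calling collect() keeps the heavy work on the Spark side and pulls only a small table into R memory, which is the main reason to combine SparklyR with ordinary R plotting.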

Write a 3-5 page report (excluding Appendix) that includes a section for each data set you choose to analyze. (Note: this is a short analysis, so keep your business problem focused and your analysis brief.)

For each data set include:

A description of the steps you took to perform the analysis (with code snippets as needed)
A secondary appendix file that is either the commented R code used (recommended) or commented screenshots of your analysis steps
Results of your analysis
Your insights based on your analysis

Format & Guidelines

The paper should use the following format:

Introduction

Provide a short description of the dataset you analyzed and the purpose of the analysis. Identify the questions you are attempting to answer with, or the insights you want to gain from, the analysis.

Analysis and Results
Outline your steps and provide the results of your analysis. Connect the results and your analysis to the purpose described in the introduction. Be specific.

Insights
Provide your insights based on your analysis. Connect your insights to the purpose of the analysis.

References Cited
Likely brief, but at least include citations to your data sources.

Appendix

Provide commented R code (recommended) or commented screenshots outlining the complete analysis; you will hit on the key analysis steps in the Analysis and Results section while referencing this Appendix for the complete process.

For your reference, please see the SparklyR Getting Started Guide (https://therinspark.com/starting) for how to install Spark via the SparklyR library in RStudio and a general introduction to the SparklyR commands.
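
As a rough sketch of what the installation steps in that guide amount to (assuming a working Java installation and internet access; the CRAN mirror URL is just one common choice, and the guide remains the authoritative walkthrough):

```r
# Install the sparklyr package from CRAN if it is not already present.
if (!requireNamespace("sparklyr", quietly = TRUE)) {
  install.packages("sparklyr", repos = "https://cloud.r-project.org")
}

library(sparklyr)

# Download and install a local copy of Spark (one-time step).
# Called without arguments, sparklyr chooses a default version.
spark_install()

# List the Spark versions that sparklyr has installed locally.
installed <- spark_installed_versions()
print(installed)

# Open a connection to a local, single-machine Spark instance.
sc <- spark_connect(master = "local")

# Report which Spark version the session is running.
print(spark_version(sc))

# Close the connection when finished.
spark_disconnect(sc)
```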

Windows users: when installing the Java JDK, be sure to choose the option to install in a different location (e.g., c:/java8) rather than the standard Program Files directory.
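
If the JDK is installed in a custom folder, the R session may need to be told where it is before connecting to Spark. The snippet below is a minimal sketch under two assumptions: that the JDK lives at the example path from the note above, and that the reason for avoiding Program Files is the space in that folder name (the assignment does not state the reason).

```r
# Example install location taken from the note above; adjust to your machine.
java_home <- "c:/java8"

# Point this R session at the JDK. Do this before calling spark_connect().
Sys.setenv(JAVA_HOME = java_home)

# Sanity check 1: confirm the variable is set for this session.
print(Sys.getenv("JAVA_HOME"))

# Sanity check 2: confirm the chosen path contains no spaces
# (assumed to be the problem with the Program Files directory).
has_space <- grepl(" ", java_home, fixed = TRUE)
print(has_space)

# Sanity check 3: warn if the folder does not exist on this machine.
if (!dir.exists(java_home)) {
  message("JDK folder not found at ", java_home)
}
```

Sys.setenv() affects only the current R session, so the call needs to be repeated in each new session or placed at the top of the analysis script.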
