PSYC-1115 Applied Epidemiology and Statistics, Summative Assessment, University of Greenwich

Published: 04 Jan, 2025
Category: Assignment
Subject: Statistics
University: University of Greenwich
Module Title: PSYC-1115 Applied Epidemiology and Statistics

Questions from THIS Year (AY24-25):

1.    Question: Hi Nadya, I was trying to do a chi-square test to examine the relationship between my categorical outcome variable and another categorical variable, but I keep getting an error message that says the test is failing because the test assumptions are being violated (specifically, that the contingency table has at least one cell with a frequency of less than 5). What should I do??

Answer: FIRST: make sure you’ve coded your categorical variable correctly. It’s unusual for these datasets to dip below 5 in contingency tables, so double-check the coding before anything else. NEXT: if your categorical outcome in its current form still leaves at least one cell of the chi-square contingency table below 5, the test won’t be able to run. Ask yourself two questions, each of which points to a possible solution: 1) Do I have to use the categorical version of the outcome variable in my inferential statistics? (No, see the instructions under “inferential statistics”.) 2) Do I have to use the categorical coding (3 categories) that you asked me to create for the descriptive statistics in my inferential statistics? (Also no. You can recode your outcome variable to a binary outcome variable and use that in your inferential statistics if you want, as long as your new binary outcome variable has clinically sensible cut-offs that you can describe and justify in your methods section.)
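To see why merging categories can fix the small-cell problem, here is a minimal sketch in Python (not Stata, and not part of the course materials). It assumes the usual rule of thumb, which is stated in terms of the expected count in each cell under independence (row total × column total ÷ grand total). The counts and the function names `expected_counts` and `cells_below` are made up for illustration.

```python
def expected_counts(table):
    """Return expected cell counts under independence for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def cells_below(table, threshold=5):
    """List (row, column, expected) for every cell whose expected count is below threshold."""
    return [
        (i, j, e)
        for i, row in enumerate(expected_counts(table))
        for j, e in enumerate(row)
        if e < threshold
    ]


# Hypothetical counts: 3 outcome categories (rows) by 2 groups (columns).
three_by_two = [[40, 45], [30, 35], [2, 3]]
# The same data after merging the last two outcome categories into one row.
two_by_two = [[40, 45], [32, 38]]

print(cells_below(three_by_two))  # the sparse third row is flagged
print(cells_below(two_by_two))    # nothing is flagged after merging
```

Merging only counts as a solution if the merged categories still make clinical sense, which is why the cut-offs need to be justified in the methods section.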

2.  Question: Hi Nadya, how do I decide whether to create Means (and SDs) or Medians (and IQRs) for a variable’s descriptive statistics? 

Answer: First, this is definitely something that you can google and learn more about yourself. We covered it veeerryy early in the term, if you want to go find it in lectures. 

But here’s what happens when I google your question:
[Screenshot of the top search result; image not reproduced here.]
And oh look! This is the first thing that the internet says. YOU have the skills to check whether these conditions are true for the variable you’re choosing means vs. medians for. A GREAT answer would also explain WHY we use means for normal distributions w/ few outliers and medians for skewed distributions or distributions with outliers.
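If you want to convince yourself with numbers, here is a small sketch in Python (made-up values, standard library only) comparing the mean and the median of two lists that differ in a single observation.

```python
import statistics

# Hypothetical measurements; the two lists differ only in the last value.
symmetric = [21, 22, 23, 24, 25, 26, 27]
with_outlier = [21, 22, 23, 24, 25, 26, 90]

for name, values in [("symmetric", symmetric), ("with outlier", with_outlier)]:
    mean = statistics.mean(values)
    median = statistics.median(values)
    print(f"{name}: mean = {mean:.1f}, median = {median}")
```

One extreme value drags the mean from 24 up to 33, while the median stays at 24. Explaining what that means for your own variable is still your job.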

3.   Question: Hi Nadya, I’m writing code to create a new categorical variable but when I try and use it in statistical testing, I get an error message where Stata says something like “This is a string variable and I want the variable to not be a string variable.”

Answer: Remember what we worked on in the last lab, with the formative assessment? You had to create a categorical variable from a numerical variable, but you had to assign numbers to the groups in the new categorical variable and then assign labels to those numbers. We had to do that because Stata is just a simple computer, and needs your categories assigned to numbers (even though it’s a categorical, non-numerical variable). You can think of this as Stata calling them “Group 1”, “Group 2”, “Group 3”, etc.

Questions from Students Last Year (AY23-24) that I thought might help you this year:

1.    On how to indicate an Interquartile Range (IQR):
The IQR can be represented in two ways:
1)    As a range where you take the Third Quartile (Q3) and subtract the First Quartile (Q1), IQR = Q3 – Q1, so you’re representing the IQR as one number
2)    As two numbers, INDICATING a range, namely the first quartile (Q1) then a comma, followed by the third quartile (Q3), something like (“Q1, Q3”) 

Both give you almost the same information and both are valid ways to represent the IQR. 
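Both representations come from the same two quartiles, as this short Python sketch with made-up data shows. It uses `statistics.quantiles` from the standard library. Different software interpolates quartiles slightly differently, so the exact values may not match what Stata reports for the same data.

```python
import statistics

values = [18, 20, 21, 23, 24, 26, 29, 31, 35]  # hypothetical data

# With n=4, statistics.quantiles returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(values, n=4)

iqr = q3 - q1
print(f"IQR as one number: {iqr}")
print(f"IQR as a range: ({q1}, {q3})")
```

Whichever form you choose, use the same one for every variable in your table.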

2.   Why is there no obesity variable in my Pinkland dataset?
You are correct: there’s no continuous BMI variable or categorical BMI variable (Obese is one of the categories of categorical BMI) in the dataset.
What that means is you’ll have to create those variables yourself, and you DO have the height and weight variables, which is what you need to create a continuous BMI variable. Be careful though, height is in centimeters…
All of this was covered in the last computer lab; now you have to apply it.
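As a reminder of the arithmetic, BMI is weight in kilograms divided by the square of height in metres, so a height recorded in centimetres has to be divided by 100 first. The Python sketch below assumes weight is recorded in kilograms (check the codebook, this FAQ does not say). The function name `bmi` is made up for illustration.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2, converting height from centimetres to metres first."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


# Hypothetical person: 70 kg, 175 cm.
print(round(bmi(70, 175), 1))
```

If you forget the conversion, every BMI comes out 10,000 times too small, which is an easy mistake to spot in a histogram.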

3.   Can I make boxplots of a categorical variable?
Um… friend? Have you been to class? ABSOLUTELY NOT. Boxplots are for continuous variables ONLY. Just like histograms. You can create side-by-side boxplots of a continuous variable BY a categorical variable though, if you want to compare the distribution of the continuous variable across categorical groupings. But if you try to make a boxplot of a categorical variable, you’re basically committing the same mistake as trying to calculate a mean or a median of a categorical variable like eye color. It doesn’t make sense.

4.  Why do I have to reference things if everything in the technical report is in my own words? 
The introduction section will require you to set the scene with a few outside statistics (e.g. citations about how much of an issue your health outcome is becoming in the UK [you can use UK statistics for “Pinkland”], what types of consequences that health outcome has for health and wellbeing, etc.). SET THE SCENE. Then in the limitations and policy section, there are PLENTY of other things to cite: What are the assumptions and limitations of your hypothesis tests? Was there anything strange about how the data were collected? Are there any types of bias that your study is open to? Are there any policy next steps that you could imagine working in this sample based on your results? ETC. One important note: EVEN WHEN YOU’RE CITING SOURCES, you still have to paraphrase. You CANNOT copy the words from the source that you’re citing. That’s academic misconduct and it will tank your grade because you’ll be reported to the academic misconduct team (it’s a very quick way to get a zero on this assessment).

5.   Can I copy Stata table output into my technical report? 

No. Do not not not not not not copy Stata table output directly into your report, I will burn my laptop if I see raw Stata table output in the summative assessment. And then I’ll find you and burn your laptop. ALWAYS make your own tables. You’re writing a report for someone who is not familiar with Stata, keep that in mind when you’re deciding what goes into your tables. Get rid of the numbers that won’t make sense to a public health professional who doesn’t code much.
[But Nadya, WHY can’t I just use Stata’s tabular output?] It’s just too ugly. That’s the simple answer. They aren’t “publication-quality tables,” which is what you should be aiming for in your report. 
THE BIG EXCEPTION IS FIGURES. Stata produces publication-quality figures. You are allowed to copy the figures (boxplots, histograms, whatever other figures YOU make) directly from Stata into your report.

6.   Continuous outcome (BMI, A1c, cholesterol) versus categorical versions of those outcomes (BMI categories, Diabetes status, Cholesterol status) – Which one should I use for my hypothesis tests?

No easy answers here. The summative report is asking you to establish whether there’s a relationship between your outcome variable and a few different predictor variables. The Assessment Specifications document tells you that the main focus of the assessment is to determine whether there’s a statistically significant relationship/association between your main outcome variable and the other variables. The Assessment Specifications document also tells you that you can examine the relationship using continuous outcome OR categorical outcome. You can also use a COMBINATION of continuous outcome and categorical outcome (e.g., use continuous outcome for some of the hypothesis tests and then use categorical outcome for other hypothesis tests). 

7.  Question about categorical outcome: if I want to use categorical outcome for my inferential statistics, do I have to use the categories that I use in the descriptive statistics table (3 categories) or can I create a NEW binary categorical variable (obese vs. not-obese, diabetes vs. no diabetes, high cholesterol vs. not-high cholesterol)? 

You can use a binary categorical variable if you want, but you’ll have to create it yourself and you will have to create it CORRECTLY. Maybe it goes without saying, but categorical variables have to have mutually exclusive and exhaustive categories (no overlap between categories, and no observations left without a category).
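Here is a rough sketch of what “correctly” means, written in Python rather than Stata. It assumes a continuous BMI variable already exists and uses a cut-off of 30 for obesity purely as an illustration (you still need to justify your own cut-offs). The function name `obese_flag` and the values are made up.

```python
def obese_flag(bmi, cutoff=30.0):
    """Return 1 (obese) or 0 (not obese); a missing BMI stays missing."""
    if bmi is None:
        return None
    return 1 if bmi >= cutoff else 0


# Numeric codes with labels attached, as in the lab.
labels = {0: "Not obese", 1: "Obese"}

bmis = [18.2, 24.9, 29.9, 30.0, 41.3, None]  # hypothetical values, one missing
codes = [obese_flag(b) for b in bmis]

# Every non-missing observation falls into exactly one labelled category.
assert all(c in labels for c in codes if c is not None)
print([labels.get(c, "missing") for c in codes])
```

Note the explicit handling of the missing value. Stata treats a missing numeric value as larger than any number, so a condition like `bmi >= 30` on its own would put people with missing BMI into the obese group.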

8.  When we’re describing the sample at the beginning of the methods section, can we just work off understanding and paraphrasing the “Assessment data description” document (screenshot below) OR do we have to go find other evidence/sources on the internet for where this dataset comes from?

You don’t have to find outside sources to cite when it comes to describing sampling methods and how the dataset you’re using came into being. You only have to use the information provided in the document “Assessment data description & codebook”. You don’t have to cite that document either; you can assume that it is shared knowledge between you and your fellow health department employees (the ones reading your technical report). IF you want to get fancy with the study LIMITATIONS, you may have to use outside sources that talk about things like measurement error or information bias and how dangerous they can be.
