Category | Coursework | Subject | Management |
---|---|---|---|
University | University Of Greenwich | Module Title | BIOT1012 Research Methods, Biostatistics & Data Management |
1. During this course, you will be mainly using Microsoft Excel (or SPSS/other software if you want) to analyse your data and then you will need to produce a word file report to be submitted as a formative assessment (date of submission is TBD).
. The purpose of the IT session is to give you the opportunity to learn through example data and to figure out using Excel as a software for statistical analyses and interpretation.
3. In the IT lab, please run the analyses in Excel file, and then once you are done with a particular analyses, put an organized form of your answer (This should contain figures, tables or interpretations as requested in each task) in a separate word file.
4. Note that this word file report that you will generate by the end of the IT sessions will represent the formative assessment for this module. This will be submitted towards the end of the IT sessions.
5. How you create the coursework report (the word file)?
3.1. At the end of all IT session, you will need to submit a coursework report in a form of a word file. This will be your formative assessment. You create this word file based on your answers in the Excel file.
3.2. Each IT lab will have its own exercises (numbered 1, 2, etc.) and datasets (aligned as possible with the lecture contents of the same week) and you would need to answer the tasks associated with these, put your answers in Excel file and use this to create your final word file coursework report. .
3.3. In your Excel file, type each data set into a spreadsheet (name the sheet accordingly), save the Excel file data using a suitable file name in a location that you can retrieve for future use. It is entirely your responsibility to ensure that you have adequate backups of your data. There is no excuse for losing data and this is not an excuse for poor performance in the final MCQ assessment.
3.4. Make sure to update any copy you have made accordingly. If you do not understand how to store your data safely, please consult the tutor/module leader.
6. By the end, you will produce one word file report containing the answers for the tasks.
In the context of data analyses, it is important to be able to summarize your data in a way that makes it easier for others to absorb. It is also important to check whether the collected data are normally distributed or not. This is of importance because it determines whether you will be analysing your data using parametric or non-parametric tests. There are several ways of testing data normality. A quick way is to plot the data using specific graphs (e.g. histogram) and to explore data descriptive statistics, which could give an indication of normality. A numerical method, that is more accurate, is to calculate expected values using the equation for a Gaussian distribution (Equation 1.1), and then use a “chi-squared”, χ2, “goodness-of-fit” test to check that your observed data are not statistically significantly different from that predicted by a Gaussian distribution.
Where f(x) is the frequency of variable x, σ is the population standard deviation, x is the variable that you are probing, and µ is the population mean.
The data:
Table 1.1 below shows a data set representing the measurements of absorption rate of drugs A and B (mg/hr) collected from multiple study participants. Absorption rates of drug A were collected from 30 participants, whereas absorption rates of drug B were collected from 73 participants. Note that table 1.1 is continued on the next page.
Table 1.1. Absorption rates of drug A and B (mg/hr) for the participants of the study. Note that the table is continued in the next page |
|||
Absorption Rate of drug A (mg/hr) N of participants = 30 |
Absorption Rate of drug B (mg/hr). N of participants = 73 |
||
1 |
0 |
4 |
7 |
1.2 |
0 |
4 |
7 |
1.5 |
0 |
4 |
7 |
2 |
0 |
4 |
7 |
2.5 |
1 |
4 |
7 |
3 |
1 |
4 |
7 |
3.5 |
1 |
4 |
7 |
4 |
1 |
4 |
7 |
5 |
1 |
4 |
7 |
5.5 |
2 |
4 |
8 |
6 |
2 |
6 |
8 |
6.5 |
2 |
6 |
8 |
7 |
2 |
6 |
8 |
7.5 |
2 |
6 |
9 |
8 |
2 |
6 |
10 |
9 |
2 |
6 |
|
10 |
2 |
6 |
|
12 |
2 |
6 |
|
14 |
2 |
6 |
|
13 |
5 |
3 |
|
12 |
5 |
3 |
|
19 |
5 |
3 |
|
14 |
5 |
3 |
|
19.8 |
5 |
3 |
|
20 |
5 |
3 |
|
20 |
5 |
3 |
|
1.2 |
5 |
3 |
|
1.5 |
5 |
3 |
|
20.5 |
5 |
3 |
|
21 |
You are tasked with calculating the frequency of the observations for both groups (drugs A and B). In doing this, please consider the bin values for each drug group shown in table 1.2 below. You will notice that the bin values in drug A were determined for a range of numbers (the range is shown in column “range”), which is not the case for drug B, where the bin values represent a single number rather than a range. In either case, you will calculate the frequency and plot the graph based on the bin values. Use the bin ranges for drug A only in case you would like to check the correctness of the frequency output that you will calculate in excel.
Table 1.2. Bin values for both groups |
||
Drug A |
Drug B |
|
Bin values |
Bin range (mg/hr) |
Bin values |
5 |
1.0 - 5.0 |
0 |
10 |
5.1 - 10.0 |
1 |
15 |
10.1 - 15.0 |
2 |
20 |
15.1 - 20.0 |
3 |
25 |
20.1 - 25.0 |
5 |
|
|
6 |
|
|
7 |
|
|
8 |
where “expected” is the predicted numbers to fall within a given range (predicted frequency) and the “observed” is the number of values actually observed to fall within a given range (observed frequency).
6. Find the χ 2critical from Table 1.3 below given the p-value threshold = 0.05 and the degrees of freedom = 8. You will find Table 1.3 also on moodle.
7. For both groups, confirm your results by calculating the predicted p value for this test and for this data. In Excel, this can be done using the function “CHITEST”.
Table 1.3. Critical values of χ 2
Background Information:
The presence of an outlier value would bias your analyses.
You have measured (in ppt – parts per trillion) the concentration of dioxins (toxin compound) in blood plasma samples taken from 10 individuals, who have been exposed to industrial pollution. You measured the concentrations.
The data:
You have the following data (measured in ppt – parts per trillion) for Blood dioxin concentrations (ppt):
0.9, 1.0, 1.1, 1.2, 10.0, 0.5, 0.6, 0.7, 2.0, 2.5,
Because you are trying the identification protocol for the first time, you suspect that one value is an outlier, but you do not know which. Use data analyses to detect which value might be the outlier using Dixon’s Q Test, and then confirm this with quartiles and then determine what effect did this outlier has on the parameters of the data?
Use the following equation to run the Dixon’s Q test.
x:is the suspected outier
x_((next) ):this is value next in order to x
x_((max) ): The largest value in the dataset
x_min:The smallest value in the dataset
Use Table 2.1 to calculate Q table
To calculate the quartiles, use the equation QUARTILE in Excel.
The function looks like this:
array: The range of cells that contains the data set.
quart: The quartile to return:
0: Minimum value (the 0th percentile).
1: First quartile (Q1), also known as the 25th percentile.
2: Second quartile (Q2), also known as the median or 50th percentile.
3: Third quartile (Q3), also known as the 75th percentile.
4: Maximum value (the 100th percentile).
Use the materials from the lecture to determine how to confirm outliers using Dixon’s test and quartile.
What you have to do and report:
Table 2.1 Critical values for Dixon’s Q test.
Background Information
For a particular value X, the NORM.DIST function in Excel (if you the cumulative probability which corresponds to the TRUE selection when you write the formula) calculates you the probability (the chance) that a randomly selected value from a normally distributed population will be ≤ the value X. In other words, it calculates the area under the normal distribution curve to the left of the value X.
For example: Let's say you have a normal distribution with a mean of 100 and a standard deviation of 15, and you want to calculate the cumulative probability for x = 120.
The formula in Excel will be like this
Make sure to select the option TRUE at the end of this equation. This will get you the cumulative probability.
Running this function will shows like this:
=NORM.DIST(120, 100, 15, TRUE)
This formula will calculates the probability that a value from the distribution is less than or equal to 120. In other words, it calculates the area under the curve to the left of x = 120.
Interpretation:
If the result is 0.908, it means that there is an 90.8% chance that a value from this distribution will be less than or equal to 120.
It is a usual practice for commercial provider (scheme provider) to test the efficiency of analytical methods in laboratories by comparing the laboratory results against their established standards.
A scheme provider established its own standard measures (See table 3.1 below) for the concentration of sodium bicarbonate and is using his measurement to approve the quality of other laboratories. Two laboratories submitted their measurements to the scheme provider to be approved.
Laboratory A: mean (μ) concentration of sodium bicarbonate = 100.2 mg/L
Laboratory B: mean (μ) concentration of sodium bicarbonate = 105 mg/L
Imagine you are the scheme provider; use your data analysis skills to determine which of the two laboratories would obtain approvals (if their concentrations lie in the range determined by the scheme provider’s value). You can do this by using NORM.DIST function in excel.
What you have to do and report in the word file:
- Obtain the mean and standard deviation of the values measured by the scheme providers (the values in table 3.1).
- Run the equation of NORM.DIST in Excel to obtain the cumulative probability
- What is the probability that a randomly picked value from the measurements of that provider would be ≤ the means of those laboratories?
- If you were the scheme provide, based on these results would you accept/approve the results from Lab A or lab B and why?
Table 3.1 Commercial provider values |
||
98.4 |
101 |
98.3 |
100.3 |
102 |
100.2 |
99.6 |
99.1 |
98.6 |
100.8 |
98.5 |
100.3 |
101.1 |
99.3 |
101.6 |
99.5 |
99.2 |
99.9 |
100.4 |
100.7 |
101.5 |
99.4 |
102.1 |
100 |
99.8 |
99.2 |
97 |
101.2 |
99.8 |
100.9 |
98.8 |
100 |
100.6 |
98.7 |
102.3 |
101.3 |
100.5 |
100.1 |
98.9 |
|
99.7 |
100.1 |
|
|
|
Background Information:
There is a published standard for the measurement of sun protection factors (SPF) of sunscreens that always come in different formulations and textures. Determining variances in SPF among suncreams formulations is important because of reasons such as efficacy assessments, consumers informed choices and product development guide.
Table 4.1 below shows the values of sun protection factors (SPFs) of 75 product that were claimed by the product manufacturers on the label of the sunscreen (column name: SPF values based on ISO 2). Also of note is the texture of the actual sunscreen product, which may influence the outcome of the tests. This came as 4- categories (i.e. creamy, fluid, liquid and paste) under the column named: Texture. Note that table 4.1 is continued in the next page
What you have to do:
Remember: To calculate the Benferroni correction, follow the following:
What you have to report in the word file:
Table 4.1. Sun protection factors (SPF) for commercial sunscreen products and their different textures. Note: the table is continued on next page.
Background Information:
Analysing batch effects (this effect happens when samples from different batches show significant differences) in nutritional supplements is crucial to ensure safety, quality and effectiveness, which in turn could support consumer confidence and regulatory compliance.
The Data
The following table (table 5.1) shows the values measured for 5-nutritional content (i.e. from the same original pool) over two batches (batch A and B). The aim is to analyse whether there is any batch effect by revealing the differences between the two batches for the same supplement using a paired t-test.
Table 5.1. The content of two batches of nutritional supplement. |
||
Supplement |
Batch A |
Batch B |
Fat |
40 |
17.1 |
Vit C |
25.5 |
7.9 |
Cabony lhydrate |
33.5 |
17.8 |
Fiber |
45.8 |
32.3 |
Protein |
35.4 |
20.2 |
What you need to do and report:
1. Given the paired nature of these data, visualise the data in a way that most easily shows the difference between the two batches for each supplement.
2. Report the full results of t-test (for 1- and 2-tailed test).
3. Is there a significant difference in the values of the supplements between the two batches at p = 0.05
4. Is there a significant difference in the values of the supplements between the two batches at p = 0.01?
5. Comment on the results of the t-test and what this means in practice?
Looking for the best Management Assignment Help Services for BIOT1012 Research Methods, Biostatistics & Data Management coursework? We are here to help you! If you need help with assignments or Online Coursework Help, our expert team will provide you with high-quality solutions. Having trouble writing Practical IT labs assignments? Contact us and get free consultation and free samples. No need to stress about best grades anymore, we are with you contact us today and boost your academic grads!
Let's Book Your Work with Our Expert and Get High-Quality Content