When we find quantitative information such as a number, a table, or a whole dataset, we want to be able to judge its quality so that we can select the right information.
Quality needs to be defined more precisely. There are several aspects:
One of the biggest hurdles you will encounter when trying to assess the original data is getting access to the raw data in the first place. The facts and figures that appear in peer-reviewed journal articles are often published without the raw data, which would show how the data was entered into the system, what was included or excluded, whether the data was manipulated in any way before being plotted or graphed, or even whether some of the data was faked. The lack of access to raw data is one of the leading issues in science, as Professor Tsuyoshi Miyakawa of Fujita Health University, Editor-in-Chief of the journal Molecular Brain, discussed in a 2020 article:
I have handled 180 manuscripts since early 2017 and have made 41 editorial decisions categorized as “Revise before review,” requesting that the authors provide raw data. Surprisingly, among those 41 manuscripts, 21 were withdrawn without providing raw data, indicating that requiring raw data drove away more than half of the manuscripts. I rejected 19 out of the remaining 20 manuscripts because of insufficient raw data. Thus, more than 97% of the 41 manuscripts did not present the raw data supporting their results when requested by an editor, suggesting a possibility that the raw data did not exist from the beginning, at least in some portions of these cases (Miyakawa 2020: 1).
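The percentages in this quotation can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only the counts stated in the quotation; the variable names are illustrative and do not come from the article.

```python
# Counts taken from the Miyakawa (2020) quotation; variable names are illustrative.
requested = 41   # "Revise before review" decisions requesting raw data
withdrawn = 21   # manuscripts withdrawn without providing raw data
rejected = 19    # manuscripts rejected because of insufficient raw data

remaining = requested - withdrawn          # manuscripts not withdrawn
without_raw_data = withdrawn + rejected    # manuscripts lacking adequate raw data

share_withdrawn = withdrawn / requested
share_without_raw_data = without_raw_data / requested

print(f"Remaining after withdrawals: {remaining}")
print(f"Share withdrawn: {share_withdrawn:.1%}")
print(f"Share without adequate raw data: {share_without_raw_data:.1%}")
```

Running the sketch confirms that 20 manuscripts remained after the withdrawals, that the withdrawn share is just over half, and that 40 of the 41 manuscripts (more than 97%) did not present adequate raw data.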
Data can be manipulated in several ways to make it appear significant, or it can even be faked, but such manipulation and fakery are visible only to someone with access to the raw data. For you, this means that assessing the quality of the data in an article is no easy task, even if you follow the steps above. What you are looking at is unlikely to be the raw data, and measuring the trustworthiness of published data is a problem that the scientific community as a whole currently faces.
Source: Miyakawa, T. (2020). No raw data, no science: another possible source of the reproducibility crisis. Molecular Brain, 13(1), 1-6. https://doi.org/10.1186/s13041-020-0552-2
In this exercise, you will assess the original dataset.