The data we collected from the survey was scanty.
The data we collected from the survey were scanty.
Which sentence is correct?
I have observed that there is often confusion over whether “data” should be treated as a singular or a plural noun. Why? The confusion stems from the origins of “data”, so let us begin there.
“Data” comes from Latin, where it is a plural noun: the plural of “datum”, which means “something given”. “Datum” exists in English as the singular of “data”, though it is seldom used in everyday English; you are more likely to see it in technical and academic documents. “Datum” also means a starting point for measurement in surveying and engineering.
So in the strictest grammatical sense, “data” is a plural noun. For this reason academic faculties, especially in the sciences, insist on treating “data” as a plural noun. After all, they are the ones most likely to use the otherwise rare “datum”. I myself have proofread papers that use it.
However, in non-academic English it is widespread and acceptable to use “data” as a singular collective noun. That is the way I have always used it myself. And let’s admit it: although “the data on my hard drive were corrupted” is the stricter form grammatically, it would sound very formal to many people, who would be more likely to expect me to say “the data on my hard drive was corrupted”.
So my thoughts on whether to use data as a singular or plural noun are:
When writing an academic paper, use data as a plural noun to be on the safe side of academia. This is how I proofread the word “data” when I proofread an academic document. If you are unsure about which agreement to use, consult your supervisor or style guide.
Mind you, when I proofread “data” as a plural noun in academic documents, I regard the rule behind it as an academic convention rather than a grammatical one. I make the change because academic style demands it, not because the singular is ungrammatical.
Outside academic circles, it is perfectly acceptable and more common to use data as a singular collective noun. It would still not be grammatically wrong to use the plural, but you may have to consider whether it is appropriate to the degree of formality you are using.
After all, grammar check allows “data” to be used as both a singular and a plural noun. Have you ever noticed that grammar check never flags the verb agreement of “data”, regardless of whether the verb is singular or plural?