A dataset (or data set) is a collection of data. Data is any item of information, usually numerical, that has not yet been interpreted. A dataset is essentially a list of numbers or other pieces of information that can be used in statistical analysis.
"Big Data" is a term that describes an extremely large dataset. Computational manipulation (using computers to handle data) is usually required to make sense of big datasets.
Statistics is the practice of using data in a variety of ways: statistics collects, classifies, arranges, manipulates, and interprets datasets. If you see a report with charts and graphs, you will typically see an accompanying written analysis, or qualitative data analysis, that explains the data.
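As a small illustration of the steps named above (collecting, arranging, and summarizing data), here is a minimal Python sketch. The scores are made-up numbers and the `summarize` function is a hypothetical helper written for this example. It uses only Python's built-in `statistics` module and is not a method prescribed by this guide.

```python
import statistics

# Hypothetical raw dataset: made-up exam scores, used only for illustration.
scores = [72, 85, 90, 66, 85, 78]


def summarize(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    ordered = sorted(values)  # arrange the raw data from smallest to largest
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
    }


summary = summarize(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The numbers this produces are still only a summary. Interpreting them, such as deciding what a mean score says about a class, is the written analysis that usually accompanies charts and graphs in a report.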
Some of the websites in this guide will lead you to raw data. Others will offer reports that interpret this data. Ask a Librarian if you are unsure how to get what you need, and read a helpful book on how to use statistics in your research.
Statistics can be complex, and data can be manipulated in many ways to serve different ends. It is good practice to study the methodology behind a survey or data collection so that you understand the results reported by the researcher or institution.
You can cite datasets and statistics in any formatting style:
APA (OWL Purdue) - Used in the Social Sciences
MLA (OWL Purdue) - Used in the Humanities
As a rule of thumb, include as many of these elements as possible in your citation:
If you have published an article that uses datasets and want to, or are required to, openly share your data, good news! You can post your data in The University of Tampa's Institutional Repository (IR). Please see this guide to learn more about the IR and this guide to learn about scholarly publishing and managing datasets.