Data Management and You – A look at NSF requirements for data organization and sharing

This is Part 1 of a discussion series on data management requirements for government-funded research.

Data is powerful. From data comes information, and from information comes knowledge. Data is also a critical component in quantitative analysis and for proving or disproving scientific hypotheses. But what happens to data after it has served its initial purpose? And what are your obligations, and potential benefits, with respect to openly sharing data with other researchers?

Data management and data sharing are increasingly important in today’s research environment, particularly in the eyes of government funding agencies. Not only is data management a requirement for most proposals using public funding, but effective data sharing can also work in your favor in the proposal review process. Consider two accomplished scientists, both conducting excellent research and publishing results in top journals. Only one of them has made their data openly available, and thousands of other researchers have already accessed that data for further research. Clearly, the scientist who has shared data has created substantial additional impact on the community and facilitated a greater return on investment beyond the initially funded research. Such accomplishments can and should be included in your proposals.

As one example, let’s examine the data management requirements for proposals submitted to the U.S. National Science Foundation. What is immediately obvious when preparing an NSF proposal is the need to incorporate a two-page Data Management Plan as an addendum to your project description. Requirements for the Data Management Plan are outlined in the “Proposal and Award Policies and Procedures Guide” (2013) within both the “Grant Proposal Guide” and the “Award & Administration Guide.” Note that in some cases particular NSF Directorates and Divisions also have their own data management requirements, which must be followed when submitting proposals to those programs.

To quote from the guide’s requirements for the Data Management Plan: “Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.” Accordingly, the proposal will need to describe the “types of data… to be produced in the course of the project”, “the standards to be used for data and metadata format”, “policies for access and sharing”, “policies and provisions for re-use, re-distribution, and the production of derivatives”, and “plans for archiving data… and for preservation of access.” Proposals cannot be submitted without such a plan.

As another important consideration, if “any PI or co-PI identified on the project has received NSF funding (including any current funding) in the past five years”, the proposal must include a description of past awards, including a synopsis of data produced from these awards. Specifically, in addition to a basic summary of past projects, this description should include “evidence of research products and their availability, including, but not limited to: data, publications, samples, physical collections, software, and models, as described in any Data Management Plan.”

Along these same lines, NSF also recently adjusted the requirements for the Biographical Sketch to specify “Products” rather than just “Publications.” Thus, in addition to previous items in this category, such as publications and patents, “Products” now also includes data.

The overall implication is that NSF is interested in seeing both past success in impacting the community through data sharing and specific plans for how this will be accomplished in future research. Be sure to keep this in mind when writing your next proposal. And remember… data is powerful.

