Resources for Information Managers
Information managers typically work on behalf of a research group (e.g. lab or field station) to prepare data for publication, handle the evaluation and upload process, and utilize related EDI services to fulfill project requirements and research goals. The resources on this page support information management tasks occurring throughout the research life cycle and emphasize data curation best practices.
If you have data to publish and would like to work with an EDI Data Curator, then check out the resources for data authors.
Getting started
Check out the Information Manager Quickstart to sign up for an EDI account, get oriented to EDI projects, and become part of the EDI community. The resources for data authors page offers an example data curation approach to adopt or refine, along with perspectives from the data author.
The data package
A data package is the unit of publication within the EDI Data Repository. Understanding the components and characteristics of a data package provides context to most information management processes related to the EDI Data Repository. For more information see The Data Package.
Data management planning
A data management plan helps organize the data collected throughout a research project and ensures its preservation for future use. For a data management planning tool and example text to include EDI in the data management planning section of a proposal, see Data Management Planning.
Quality Assurance
Applying quality assurance measures and logging noteworthy events during data collection increases the likelihood that the data are of high quality and can be understood during later use. For more information see Quality Assurance.
Cleaning data and Quality Control
Before publishing data, consider the data quality expectations of potential user communities. Doing so increases the chance of the data being reused and cited. Cleaning data may entail restructuring and reformatting a dataset, establishing consistency within the data, and checking the integrity of the data values. Improving data quality is an ongoing information management process in many organizations. For more information see Cleaning Data and Quality Control.
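The cleaning steps named above (establishing consistency and checking the integrity of values) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the column names, the missing-value codes, and the valid temperature range are hypothetical and not part of any EDI requirement.

```python
import csv
import io

# Hypothetical raw table; column names and values are invented for this sketch.
RAW = """site,date,temp_c
A,2020-01-01,4.2
a,2020-01-02,NA
B,2020-01-03,-999
B,2020-01-04,51.0
"""

# Assumed set of codes that the data provider used to mean "no value".
MISSING_CODES = {"", "NA", "-999"}


def clean_rows(text, valid_range=(-40.0, 50.0)):
    """Normalize site codes and missing values, and flag out-of-range readings."""
    cleaned, flagged = [], []
    reader = csv.DictReader(io.StringIO(text))
    # The header is line 1 of the file, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        # Consistency: one spelling per site code.
        row["site"] = row["site"].strip().upper()
        raw = row["temp_c"].strip()
        if raw in MISSING_CODES:
            # Consistency: a single representation for missing values.
            row["temp_c"] = None
        else:
            value = float(raw)
            # Integrity: flag values outside the assumed plausible range.
            if not valid_range[0] <= value <= valid_range[1]:
                flagged.append((line_number, "temp_c", value))
            row["temp_c"] = value
        cleaned.append(row)
    return cleaned, flagged


cleaned, flagged = clean_rows(RAW)
for line_number, column, value in flagged:
    print(f"line {line_number}: {column}={value} is outside the expected range")
```

Flagged values are reported rather than silently changed, so the decision to correct, annotate, or remove them stays with the person who knows the data.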
Designing a data package
When designing a data package, consider multiple factors in order to satisfy local community research needs. Decisions such as whether to publish data as a single package or as a set of related packages determine how well the data package is optimized for discovery and reuse. For more information see Designing a Data Package.
Creating metadata for a data package
Metadata are data that describe the structure and context of other data. They are vital to the discovery and reuse of a dataset, and are a required element of an EDI data package. The EDI Data Repository operates on the EML metadata standard, a widely adopted and actively maintained metadata specification. For more guidance on how to create EML metadata and associated best practices see Creating Metadata for Publication.
Evaluating a data package
Evaluating a data package is key to ensuring that it contains a valid, well-formed EML document that accurately describes the data, and that the data entities are consistent with the documentation. For more on how to run data package quality evaluations, interpret results, and resolve issues see Evaluating a Data Package.
Publishing a data package
Once a data package has passed evaluation and the metadata render correctly in the staging environment, the data package can be published to the EDI Data Repository production environment. This upload step can be done manually or programmatically. For more information see Publishing a Data Package.
Updating a data package
Perform a data package update whenever data or metadata need to be changed or added to an existing data package. Updates may be performed routinely or sporadically and will result in a new "revision". A revision of a data package has the same identifier, but receives a new version number and is assigned a new DOI. All revisions of a data package are linked in the EDI Data Repository. Users who end up on the landing page of an older revision will be notified that they are not viewing the most recent version of the data package. For more information see Updating a Data Package.
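The relationship between an identifier and its revisions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the repository's implementation: it assumes a package ID written as scope.identifier.revision, uses a made-up ID ("edi.100.2"), and assumes the next revision is simply the current one plus one. The helper names are hypothetical.

```python
from typing import NamedTuple


class PackageId(NamedTuple):
    """Assumed three-part package ID: scope, identifier, revision."""
    scope: str
    identifier: int
    revision: int

    def __str__(self) -> str:
        return f"{self.scope}.{self.identifier}.{self.revision}"


def parse_package_id(text: str) -> PackageId:
    """Split a dotted package ID string into its three parts."""
    scope, identifier, revision = text.strip().split(".")
    return PackageId(scope, int(identifier), int(revision))


def next_revision(package_id: PackageId) -> PackageId:
    """Keep scope and identifier; bump only the revision number."""
    return package_id._replace(revision=package_id.revision + 1)


current = parse_package_id("edi.100.2")  # made-up ID for illustration
updated = next_revision(current)
print(f"{current} -> {updated}")
```

Only the last part changes between revisions, which is what lets all revisions of a package be linked under one identifier.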
Deprecating a data package
Deprecation enables data authors to discourage the continued use of a single data package, or series, in favor of a better alternative. It's typically applied when data issues cannot be resolved through a data package update, or when maintenance of a data package has moved to a new data package series. For more information see Deprecating a Data Package.
Adding citations to a data package
Adding citations to data packages allows authors to measure the impact that their data publications are making and allows potential users to see the context in which the data have been previously applied. For more information see Adding Journal Citations to a Data Package.
Permissions on a data package
The permissions governing who can make changes to a data package are set in the access control rules of the EML metadata file of the most recent version of a data package. To change the permissions on a data package (add or remove users), the access control rules of the current EML metadata file must be edited and uploaded in a data package revision by one of the users in the current list of access control rules. For more information see Permissions on a Data Package.
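To review who is currently listed before editing the rules, the access block can be read with Python's standard XML parser. The sketch below is hedged: the fragment is a simplified, hand-written example of an EML access element, and the principals, the authSystem URL, and the helper name `list_rules` are all invented for illustration. A real EML file will contain more structure and should be checked against the EML schema.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical access block; principals and authSystem are made up.
ACCESS_XML = """
<access authSystem="https://example.org/auth" order="allowFirst" scope="document">
  <allow>
    <principal>uid=alice,o=EXAMPLE,dc=example,dc=org</principal>
    <permission>all</permission>
  </allow>
  <allow>
    <principal>public</principal>
    <permission>read</permission>
  </allow>
</access>
"""


def list_rules(access_xml):
    """Return (rule type, principal, permission) tuples from an access element."""
    root = ET.fromstring(access_xml.strip())
    rules = []
    for rule in root:
        # Only allow/deny children are access rules.
        if rule.tag not in ("allow", "deny"):
            continue
        principals = [p.text.strip() for p in rule.findall("principal")]
        permissions = [p.text.strip() for p in rule.findall("permission")]
        # A rule may pair several principals with several permissions.
        for principal in principals:
            for permission in permissions:
                rules.append((rule.tag, principal, permission))
    return rules


for rule_type, principal, permission in list_rules(ACCESS_XML):
    print(f"{rule_type}: {principal} -> {permission}")
```

Listing the rules first makes it easier to confirm that the person uploading the revision is among the principals already granted access.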
The EDI Dashboard
The EDI Dashboard displays information and processes related to the functioning of the EDI Data Repository. The dashboard provides tools for generating site- or package-specific reports, checking the status of data package uploads and evaluations, and monitoring the health of EDI services and portals. For more information see the EDI Dashboard.
Creating a data catalog
Customized data catalogs can leverage the archived data and metadata in the EDI Data Repository to create a searchable index and display of data for a personal or project website. This approach eliminates overhead and facilitates customized branding. For more information see Create a Local Data Catalog for a Website.
Thematic Standardization
Data in the EDI Repository are thematically and structurally diverse. Converting them into standardized formats, according to theme, facilitates interoperability and reuse. To learn more about EDI Thematic Standardization and how to create data in these formats see the page on Thematic Standardization.
Data package best practices
More "Best Practice" recommendations for ecological and environmental data packages are available here.
Defining data package replication
Ensuring data consumers are aware of replicated data across repositories is crucial for reducing data user confusion, data misuse and errors, and inefficiencies for data harvesters. For more information see Defining Data Package Replication.