Updating a Data Package
Perform a data package update whenever data or metadata need to be changed or added to a published data package. Updates may be performed routinely or sporadically and will result in a new "revision". A revision of a data package has the same identifier, but receives a new version number and is assigned a new DOI. All revisions of a data package are linked in the EDI Data Repository. Users who end up on the landing page of an older revision will be notified that a newer version is available.
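The identifier scheme described above can be illustrated with a small Python sketch. It assumes package IDs follow the scope.identifier.revision pattern seen in examples such as edi.10.1, and that the next revision is simply the current number plus one. The function name next_revision is hypothetical and is not part of any EDI tool.

```python
import re

# Assumed pattern for a package ID: scope.identifier.revision (e.g. edi.10.1).
PACKAGE_ID = re.compile(
    r"^(?P<scope>[A-Za-z0-9-]+)\.(?P<identifier>\d+)\.(?P<revision>\d+)$"
)


def next_revision(package_id: str) -> str:
    """Return the ID with the same scope and identifier and the revision plus one."""
    match = PACKAGE_ID.match(package_id)
    if match is None:
        raise ValueError(f"not a scope.identifier.revision ID: {package_id!r}")
    revision = int(match.group("revision")) + 1
    return f"{match.group('scope')}.{match.group('identifier')}.{revision}"


print(next_revision("edi.10.1"))  # edi.10.2
```

Only the last component changes, which mirrors how a revision keeps its identifier while receiving a new version number.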
Note to Information Managers: Be aware of the EML <access> element. Only credentials already specified in an existing version of a data package can be used to publish an update (e.g. to publish edi.10.2, the credentials must already be specified in edi.10.1). If you are unable to publish a revision for this reason, contact the EDI Curation Team.
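One way to see which credentials an existing version names is to list the rules in its <access> element. The Python sketch below is illustrative only: the sample EML fragment, the principal strings, and the function names are made up for the example, and it assumes the usual EML layout of allow rules containing principal and permission elements.

```python
import xml.etree.ElementTree as ET

# Made-up EML fragment for illustration; a real document has many more elements.
SAMPLE_EML = """<eml>
  <access authSystem="example-auth-system" order="allowFirst" scope="document">
    <allow>
      <principal>uid=exampleuser,o=EDI,dc=edirepository,dc=org</principal>
      <permission>all</permission>
    </allow>
    <allow>
      <principal>public</principal>
      <permission>read</permission>
    </allow>
  </access>
</eml>"""


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}' from a tag name."""
    return tag.rsplit("}", 1)[-1]


def allowed_principals(eml_text: str) -> list[tuple[str, str]]:
    """Return (principal, permission) pairs from every allow rule in any access element."""
    root = ET.fromstring(eml_text)
    pairs = []
    for element in root.iter():
        if local_name(element.tag) != "access":
            continue
        for rule in element:
            if local_name(rule.tag) != "allow":
                continue
            principals = [
                c.text.strip() for c in rule
                if local_name(c.tag) == "principal" and c.text
            ]
            permissions = [
                c.text.strip() for c in rule
                if local_name(c.tag) == "permission" and c.text
            ]
            pairs.extend((p, perm) for p in principals for perm in permissions)
    return pairs


for principal, permission in allowed_principals(SAMPLE_EML):
    print(principal, permission)
```

If the credential you plan to publish with does not appear in the output for the current version, that is a likely sign the update will be rejected.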
Metadata to include
It is important to communicate the changes and their significance in the metadata of an updated data package so that users can understand what has changed and why. This information is included in the maintenance section of EML metadata. Guidance on adding this information is provided below.
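For readers who edit EML by hand rather than with the tools described below, the following Python sketch shows roughly where such a note lives. It assumes the maintenance section sits at dataset/maintenance/description and holds para elements. The sample document, the note text, and the function name add_maintenance_note are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed EML document for illustration only.
SAMPLE_EML = """<eml>
  <dataset>
    <title>Example dataset</title>
    <maintenance>
      <description>
        <para>Data are updated annually.</para>
      </description>
    </maintenance>
  </dataset>
</eml>"""


def add_maintenance_note(eml_text: str, note: str) -> str:
    """Append a <para> describing the revision to the existing maintenance description."""
    root = ET.fromstring(eml_text)
    description = root.find("dataset/maintenance/description")
    if description is None:
        # Creating the section is left out on purpose: EML expects elements in
        # a fixed order, so a new <maintenance> cannot simply be appended.
        raise ValueError("no dataset/maintenance/description element to extend")
    para = ET.SubElement(description, "para")
    para.text = note
    return ET.tostring(root, encoding="unicode")


updated = add_maintenance_note(
    SAMPLE_EML, "Revision 2: corrected units in the example table."
)
print(updated)
```

ElementTree may rewrite namespace prefixes when it serializes a real EML file, so treat the output as a draft to review rather than a file ready to publish.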
Editing data and metadata
ezEML
Edit data and metadata using ezEML:
- Open the EML document for the original data package. If the package was created outside of ezEML, or you no longer have access to the original ezEML data package, select Fetch a Package from EDI from the Import/Export menu to retrieve and import an existing data package:
  - Select the scope of your data package (e.g. edi, knb-lter-ntl).
  - Select the package scope.identifier to start the import.
  - Note any errors that may have occurred during import (if the package was originally made outside of ezEML).
  - Select the option to Get Associated Data Files if you plan to edit/reupload one or more tables.
- Describe the changes in the new revision. From the Maintenance tab, add a new paragraph to the Description text.
- Submit to EDI: click Send to EDI and add a note mentioning that this is an update to an existing data package (e.g. "This submission is a revision to package edi.101.1").
- The EDI curation team will receive the submission and iterate through the review process before the update is published.
EML created with ezEML can be downloaded directly and published to the EDI Repository. If opting to publish your own updates, remember to enter an incremented version number in the Data Package ID tab of ezEML.
EMLassemblyline
Edit data and metadata using EMLassemblyline:
- Get the metadata templates and make_eml() function call for the original data package. If these don't exist, use the eml2eal() and eml2eal_losses() functions to create them.
- Update the metadata templates and make_eml() function arguments to reflect the changes. Describe the changes made between revisions using the maintenance.description parameter of make_eml().
- Increment the data package version number in the package.id parameter of make_eml().
- Run make_eml().
Publishing edited data and metadata
EDI Data Portal
An updated data package can be uploaded via the EDI Data Portal similarly to a new data package, but with one key difference:
Use the Allow PASTA+ to skip… option if any of the data files are unchanged between versions. This allows the EDI Data Repository (a.k.a. PASTA) to skip re-uploading unchanged data, which can save time and repository space. Caution: make sure the checksum value documented in the metadata for each data file is accurate and up to date.
For more information on this option, watch this video.
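The checksum caution above can be checked locally before uploading. The Python sketch below computes a file's checksum and compares it with the value documented in the metadata. It assumes MD5; substitute whichever algorithm your metadata names. The function names are hypothetical helpers, not part of any EDI tool.

```python
import hashlib


def file_checksum(path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, documented: str, algorithm: str = "md5") -> bool:
    """Compare a file's checksum with the value documented in the metadata."""
    return file_checksum(path, algorithm) == documented.strip().lower()
```

A call such as checksum_matches("my_table.csv", documented_value) returning False suggests that either the file or the documented value is out of date.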
EDIutils
An updated data package can be uploaded via the EDIutils R package using the update_data_package() function. To update with this function, all data files must be web-hosted and associated with static data links.
Set the useChecksum argument to TRUE if any of the data objects are unchanged between versions. This allows the EDI Data Repository to skip re-uploading unchanged data, which can save time and repository space. Caution: make sure the checksum values documented in the metadata are accurate and up to date.
For a language-agnostic solution, see the REST API documentation for Updating a Data Package.
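As a rough illustration of what such a request might look like, the Python sketch below builds (but does not send) an HTTP request carrying the updated EML. The base URL, the PUT method, and the /eml/{scope}/{identifier} path are assumptions to confirm against the REST API documentation. Authentication is deliberately left out, and build_update_request is a hypothetical helper.

```python
import urllib.request

# Assumed base URL of the production repository; confirm before use.
BASE_URL = "https://pasta.lternet.edu/package"


def build_update_request(
    scope: str, identifier: int, eml_bytes: bytes
) -> urllib.request.Request:
    """Build an unsent request that carries the revised EML document as its body."""
    # Assumed endpoint shape: the revision number travels inside the EML itself.
    url = f"{BASE_URL}/eml/{scope}/{identifier}"
    return urllib.request.Request(
        url,
        data=eml_bytes,
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )


request = build_update_request("edi", 10, b"<eml>...</eml>")
print(request.get_method(), request.full_url)
```

Sending the request would also require valid credentials, and the repository's response should be checked against the documentation rather than assumed.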
Submit via email
Email the EDI Curation Team either a description of the desired changes or the new and/or updated data, along with the updated EML file. Make sure to mention the identifier of the data package that is being updated. The EDI Curation Team will create a proof for review before the update is published.