DataverseLV Digital Preservation Policy

1. Policy purpose and scope 

DataverseLV is Latvia’s national research data repository, providing long-term preservation, access, and reuse of datasets created by the research community. The DataverseLV Digital Preservation Policy (hereinafter, the Policy) sets out the repository’s commitment and approach to the responsible and sustainable long-term storage, accessibility, and usability of research data. The Policy has been developed in accordance with the Latvian Open Science Strategy, the OAIS reference model, the TRUST principles, the FAIR data principles, and the CoreTrustSeal requirements. The Policy applies to all datasets and their metadata published in DataverseLV, as well as to all repository processes from ingest to deaccessioning. The intended audience of the Policy includes dataset depositors, data curators, repository administrators, the LVRTC infrastructure operator, and the general DataverseLV user community.

Definitions of terms used in the Policy 

Hosting — deployment and maintenance of applications and services on server infrastructure (OS, application server, database, network) that ensures DataverseLV operation.​

Data storage — secure preservation of digital data in storage systems with replication, backups, and availability restoration in accordance with the SLA.​

OAIS (ISO 14721) — an international reference model for an Open Archival Information System for long-term preservation and access, defining functions, processes, and terminology.​

TRUST principles — guidelines for ensuring trustworthy repositories: Transparency, Responsibility, User focus, Sustainability, Technology.​

Fixity (integrity check) — evidence that a file has not changed at the bit level; the originally recorded and re-calculated checksums are compared.​

DOI — a persistent digital identifier for datasets and their versions that ensures durable citation and findability.​

Deaccessioning (withdrawal) — the removal of published data or parts thereof in exceptional cases; the DOI and metadata remain with a tombstone record explaining the reason.​

Embargo — a period during which data are not publicly available, after which access is opened according to the rules.​

Bit-level preservation — preservation focused on bit integrity (checksums, replication, backups) without engaging in content transformation.​

Normalization — conversion of files to recommended, preservation-friendly formats to mitigate risk and obsolescence.​

Format migration — planned transfer from an obsolete format to a current one while preserving essential characteristics and usability.​

SLA — a service-level agreement on availability, incident reporting/remediation, and recovery objectives between VPC and LVRTC.​

RTO/RPO — recovery time objective and recovery point objective.​

Curator — a repository specialist who helps depositors prepare data/metadata, checking compliance and quality before publication.​

OAI-PMH — a standard protocol that allows external systems to harvest and index metadata from the repository.

API – a set of technical protocols that allows other software and systems to automate tasks such as uploading data or searching the repository without using the website interface.

2. Organizational responsibility and governance 

The repository is operated by the Higher Education and Science Information Technology Shared Service Centre (VPC), the central administrative body responsible for the repository’s management, processes, and software administration. The physical infrastructure and hosting of the repository are provided by the Latvian State Radio and Television Centre (LVRTC), an operator of national critical infrastructure, under the mutual SLA. Partner universities manage their institutional collections and provide data curators for data preparation and quality monitoring; the remaining institutional collections are supervised by VPC data curators.

2.1. DataverseLV roles 

Depositor: ensures dataset usage rights, ethical and legal compliance, use of recommended formats, and complete documentation when submitting data for publication.

Partner institution curators: verify the quality and compliance of datasets from their institution, advise on metadata, licenses, formats, and access, and coordinate publication.

VPC data curators: oversee institutional collections not managed by partner universities, ensuring consistent quality practices across the repository.

Repository administrators (VPC): maintain DataverseLV software, access permissions, version control, and process documentation.

LVRTC: provides repository hosting and data storage with replication and backups, in compliance with the SLA and Cabinet of Ministers Instruction No. 5.

VPC board: approves the policy and strategic decisions, ensuring continuity of governance.

3. OAIS compliance and functions

DataverseLV preservation processes are aligned with the OAIS model and encompass pre-ingest, ingest, archival storage, data management, access, administration, and preservation planning.
During pre-ingest, the repository provides guidelines, training, and lists of recommended file formats for depositors.
During ingest, data integrity and completeness are checked, metadata are validated, MD5 checksums are generated, and data quality is reviewed prior to publication.
In archival storage, replication and backups are implemented across two geographically separated locations.
In data management, administrative, descriptive, and access metadata are maintained, along with a version history in which each version has a unique DOI.
Access is provided via a web interface, OAI-PMH, and API with flexible access control.
Administration functions are shared between VPC and partners with clear roles and responsibilities.

3.1. Implementation of the TRUST principles

DataverseLV implements transparency, responsibility, user focus, sustainability, and the use of secure, modern technology across the entire data lifecycle.
Policies, procedures, and the SLA provisions covering availability, incidents, and recovery are documented and communicated to stakeholders.
Sustainability is ensured through stable governance, funding, and critical infrastructure services.

4. Preservation strategy 

The goal is to ensure long-term accessibility and usability by accepting and storing data in formats that are documented, widely supported, and suitable for preservation. 

4.1. Format policy 

The repository accepts only recommended, preservation-suitable file formats for deposit; if data are in another format, the depositor must convert them before publication, with curator support.
The list of recommended formats and the related guidance are available in the dataverse.lv guide, and curators help assess compliance prior to publication.

4.2. Acceptance and integrity 

At the point of ingest, checksums are calculated and recorded for all files to enable bit-level integrity verification in subsequent stages. 
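The ingest-time checksum and the later comparison can be illustrated with a minimal sketch. It assumes MD5, the algorithm named elsewhere in this Policy; the function names `md5_checksum` and `fixity_ok` are hypothetical and are not part of the DataverseLV software.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_checksum: str) -> bool:
    """Compare a freshly computed checksum with the one recorded at ingest."""
    return md5_checksum(path) == recorded_checksum.lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        sample = Path(workdir) / "data.csv"
        sample.write_bytes(b"id,value\n1,42\n")
        recorded = md5_checksum(sample)           # value stored at ingest
        print(fixity_ok(sample, recorded))        # True: file unchanged
        sample.write_bytes(b"id,value\n1,43\n")   # simulate a bit-level change
        print(fixity_ok(sample, recorded))        # False: mismatch detected
```

A mismatch only shows that the file differs from what was ingested; deciding how to respond remains a matter for the incident and recovery procedures.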

4.3. Storage and access 

Data and metadata are stored on secure national infrastructure with replication across two geographically separated locations and regular backups in accordance with the SLA. 
Availability, recovery objectives (RTO/RPO), and incident processes are defined in the VPC–LVRTC SLA and are periodically tested. 

4.4. Changes and versions 

Any change to the data or documentation results in a new dataset version with a unique DOI and a complete change history, preserving citability. 
Corrections are also recorded separately in the quality checklist, which is stored in internal VPC documentation, and previous versions remain accessible for audit purposes. 

4.5. Format obsolescence 

The repository does not perform centralized post-publication data improvement or general format migration; the responsibility for using recommended formats lies with the depositor, with curator support during the preparation stage. 
If a risk of format obsolescence is identified, the repository advises the depositor on potential republication in a recommended format and ensures version continuity. 

4.6. Exceptions and support 

In exceptional cases where applying the recommended format is not practically possible, a curator assessment is performed and an alternative is agreed with clear documentation of the choice in the quality checklist. 
In all cases, preference is given to open, well-documented, and community-supported formats to minimize dependencies and future migration risks. 

5. Dataset ingest and description 

Ingest criteria include checks for rights compliance, protection or anonymization of sensitive information, adherence to legal requirements, and absence of malware. 
During ingest, integrity checks, MD5 checksum generation, completeness verification, and, if needed, format identification are performed. 
International standards are used for metadata description to ensure findability and interoperability. 

5.1. FAIR principles support 

DataverseLV implements the FAIR (Findable, Accessible, Interoperable, Reusable) principles to ensure that research data are findable, accessible, interoperable, and reusable for both humans and machines: 
Findable: Each dataset is assigned a unique, persistent DOI for precise citation and discoverability in search engines. Datasets are described with rich metadata using international standards and are indexed in discovery systems, enabling users to find data by multiple criteria. 
Accessible: Datasets and their metadata are available through clearly defined access protocols: the web interface, OAI-PMH, and the REST API. Even when files are access-restricted, metadata remain open, indicating the data’s existence and access conditions. Access rights and licenses are clearly stated. 
Interoperable: The repository supports open metadata standards (e.g., Dublin Core, DataCite) for system-to-system compatibility and integration across contexts. The OAI-PMH protocol and APIs enable automated harvesting and integration with other research infrastructures. Using recommended file formats ensures technical compatibility. 
Reusable: Datasets include sufficient documentation (ReadMe files, codebooks, methodological descriptions), clear use licenses, and version history, enabling others to reuse the data in new studies. A versioning policy maintains a transparent, citable change record to support research reproducibility and traceability. 
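
As an illustration of the automated harvesting mentioned above, the following minimal sketch builds an OAI-PMH `ListRecords` request and reads record identifiers from a response. The base URL `https://dataverse.lv/oai` is an assumption (it is not stated in this Policy), the sample response and its identifiers are invented for illustration, and the helper names are hypothetical. The verb, the `metadataPrefix` argument, the `oai_dc` prefix, and the XML namespace come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: the OAI-PMH endpoint location is not given in the Policy.
ASSUMED_BASE_URL = "https://dataverse.lv/oai"
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional: restrict to one set
    if from_date:
        params["from"] = from_date    # optional: selective harvesting by date
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(xml_text):
    """Extract record identifiers from an OAI-PMH ListRecords response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]


# Hypothetical response fragment, shortened for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>doi:10.0000/EXAMPLE1</identifier></header></record>
    <record><header><identifier>doi:10.0000/EXAMPLE2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url(ASSUMED_BASE_URL))
    print(record_identifiers(SAMPLE_RESPONSE))
```

A real harvester would also follow the protocol's `resumptionToken` to page through large result sets; that step is omitted here to keep the sketch short.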

5.2. Draft dataset management 

Drafts are operational records maintained only until publication or deletion per the depositor’s preferences; long-term preservation applies to published datasets and metadata. 
Drafts may be deleted by the depositor or a curator if they do not meet requirements or are not progressing toward publication, and all actions are recorded in quality control logs. 

6. Version control and change management 

Minor versions apply to metadata corrections or additions, while major versions include changes to data or documentation files. Each version has a unique DOI, with a complete version history preserved and access to previous versions maintained. Users can cite a specific version or always the latest version, depending on need. 
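
The minor/major rule above can be expressed as a short sketch. It assumes a "major.minor" numbering scheme (for example 1.0, 1.1, 2.0), which this Policy does not spell out; the `Version` type and `next_version` function are hypothetical names used for illustration only.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A dataset version number in the assumed major.minor form."""
    major: int
    minor: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}"


def next_version(current: Version, files_changed: bool) -> Version:
    """Return the version number that follows `current`.

    files_changed=True  -> data or documentation files changed (major version)
    files_changed=False -> metadata corrections or additions only (minor version)
    """
    if files_changed:
        return Version(current.major + 1, 0)
    return Version(current.major, current.minor + 1)


if __name__ == "__main__":
    v = Version(1, 0)
    v = next_version(v, files_changed=False)   # metadata fix
    print(v)                                   # 1.1
    v = next_version(v, files_changed=True)    # new data file
    print(v)                                   # 2.0
```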

7. Dataset deaccessioning 

Deaccessioning is permitted in cases of legal or ethical violations, copyright disputes, a substantiated depositor request, or irreversible loss of integrity. 
Files are deleted while retaining the DOI and metadata via a tombstone record that states the reason for deaccessioning. 
All deaccessioning cases are recorded in the dataset’s quality control checklist for governance and compliance purposes. 

8. Infrastructure, SLA and security 

Data and metadata are stored on LVRTC infrastructure with synchronous replication across two geographically separate sites and daily, weekly, and monthly backups. 
Availability, incident reporting, recovery procedures, RTO, and RPO are defined in the VPC–LVRTC SLA and are regularly tested in accordance with the agreement. 
The infrastructure is maintained in line with information security standards and national regulations. 
The LVRTC SLA is developed and applied in accordance with Cabinet Instruction No. 5 of 8 November 2022 “Procedure for ensuring the State Electronic Communications Service Centre.” 

8.1. Information security and access control  

Secure authentication, role-based access control, firewalls, encrypted data transfer, and auditing are used. Administrative access is strictly limited and emergency access is only possible in defined cases under the SLA. Compliance with data protection requirements is ensured according to defined roles and processes. 

9. Risk assessment and incident management 

Risk identification, assessment, monitoring, and documentation are carried out in accordance with the VPC and LVRTC cooperation model and the LVRTC Service Level Agreement (SLA), which is based on Cabinet Instruction No. 5 of 8 November 2022. Availability indicators, incident reporting channels, response times, escalation procedures, and reporting are defined in the LVRTC SLA; the VPC coordinates these processes operationally. 
Fixity checks and other bit-level integrity audits are internal DataverseLV processes and tools that operate independently of the SLA and monitor data immutability. 
Backup restoration tests and event logging are performed periodically according to a defined schedule to verify that the RTO/RPO targets can be met and that processes are ready. If integrity discrepancies or other incidents are detected, the data are restored from the last secure copy by applying the incident and recovery procedures set out in the SLA, coordinated by the VPC. 
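
How a restoration test might be checked against recovery objectives can be sketched as follows. The actual RTO and RPO values are defined in the SLA and are not reproduced in this Policy, so the targets used below (4 hours and 24 hours) are purely hypothetical, as are the `RestoreTest` and `meets_objectives` names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RestoreTest:
    """Timestamps recorded during one backup restoration test."""
    last_good_backup: datetime   # point the data can be restored to
    incident_start: datetime     # when the (simulated) incident began
    service_restored: datetime   # when service was available again


def meets_objectives(test: RestoreTest, rto: timedelta, rpo: timedelta) -> dict:
    """Check measured downtime against RTO and data-loss window against RPO."""
    downtime = test.service_restored - test.incident_start
    data_loss_window = test.incident_start - test.last_good_backup
    return {"rto_met": downtime <= rto, "rpo_met": data_loss_window <= rpo}


if __name__ == "__main__":
    test = RestoreTest(
        last_good_backup=datetime(2025, 1, 10, 2, 0),
        incident_start=datetime(2025, 1, 10, 9, 30),
        service_restored=datetime(2025, 1, 10, 12, 0),
    )
    # Hypothetical targets; the real values are set in the SLA.
    print(meets_objectives(test, rto=timedelta(hours=4), rpo=timedelta(hours=24)))
```

In this example the downtime is 2.5 hours and the data-loss window is 7.5 hours, so both hypothetical targets are met.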

10. Succession and continuity 

Continuity is ensured by LVRTC as a national critical infrastructure operator: DataverseLV runs on secure, geographically replicated infrastructure with backups and monitoring. No changes to the hosting of the repository are planned; service availability, incident reporting procedures, response times, and recovery targets (RTO/RPO) are implemented in accordance with the LVRTC SLA, which is operationally coordinated by the VPC. 
In the event of critical disruptions, the incident and restoration procedures established by the SLA are applied until the service is fully restored on the same infrastructure, with users informed about the status and progress of operations. DOIs and metadata are maintained unchanged, ensuring citability and findability even during incidents. During planned or unplanned work, a temporary access restriction (e.g. read-only mode) may be applied, with prior notice when possible. 

11. Retention periods and availability

Published data are retained without a time limit. The DOI is maintained even if a dataset is removed, and a publicly available “tombstone” record explains the reason for the removal and preserves citability. Access to datasets is determined according to usage needs, supporting open access, restricted access, and embargo periods. 

12. Financial sustainability

DataverseLV was created within the framework of the project “Support for the implementation of open science in practice, as well as solutions for sharing science data and participation in the EU open science cloud” (RRF project No. 2.1.3.1.i) with financial support from the European Union Recovery Fund and the State of Latvia. Further maintenance is to be ensured through the partner funding model and the national reference funding for open science initiatives, with an operational horizon of at least five years and a regular budget review. The aim is to ensure the continuity of staff, repository hosting, and platform development by planning scenarios for different funding conditions. 

13. Technical platform and standards 

DataverseLV uses the open-source Dataverse platform with regular updates and support from the international community. The technical environment includes Linux, Payara, PostgreSQL, Solr, and additional tools for monitoring activity. The platform supports OAI-PMH, a REST API, and DOI registration, as well as data and metadata export. 

14. Policy review and update 

The Policy shall be reviewed at least every three years, or sooner in the event of changes in technology, the legal framework, or international standards. Changes are documented, coordinated, and made public, maintaining a public version history and ensuring transparency. The current version is published on the DataverseLV website. 

15. Community involvement and support 

User needs are identified through curatorial support, working groups, consultations, and guidelines, including recommended-format lists and quality checklists. International collaborations with the Dataverse community and European infrastructures ensure alignment with best practices. Feedback is systematically used to improve processes and services. 

16. Contacts 

Email: info@vpc.lv
Phone: +371 67 969 580
Address: Zigfrīda Annas Meierovica bulvāris 14, Rīga, LV-1050, Latvia
Website: https://dataverse.lv/