A global consistent database of plankton and detritus from in situ imaging by the Underwater Vision Profiler 5
Dataset summary
Plankton and detritus are essential components of the Earth’s oceans, influencing biogeochemical cycles and carbon sequestration. Climate change impacts their composition and marine ecosystems as a whole. To improve our understanding of these changes, standardized observation methods and integrated global datasets are needed to enhance the accuracy of ecological and climate models. Here, we present a global dataset of plankton and detritus obtained with two versions of the Underwater Vision Profiler 5 (UVP5). This release contains the images, classified into 33 homogenized categories, together with their associated metadata. It comprises 3,114 profiles and ca. 8 million objects acquired globally between 2008 and 2018. The geographical distribution of the dataset is unbalanced: the Equatorial region (30° S - 30° N) is the most represented, followed by the high latitudes of the Northern Hemisphere and, lastly, the high latitudes of the Southern Hemisphere. Detritus is the most abundant category in terms of concentration (90%) and biovolume (95%), although its classification into different morphotypes is still not well established. Copepoda was the most abundant taxon within the plankton, followed by Trichodesmium colonies. The two versions of the UVP5 (SD and HD) have different imagers, which results in different effective size ranges for analysing plankton and detritus from the images (HD: objects >600 µm; SD: objects >1 mm). Morphological properties (grey levels, etc.) present similar patterns in both versions, although their ranges may differ. A large number of images of plankton and detritus will be collected in the future by the UVP5. Making this dataset publicly available allows it to be used as a training set for machine learning and to be improved by the scientific community. This will reduce uncertainty by classifying previously unclassified objects and expanding the classification categories, ultimately enhancing biodiversity quantification.
Data tables
The data set is organised into two kinds of records:
- samples: Underwater Vision Profiler 5 profiles, taken at a given point in space and time.
- objects: individual UVP images, taken at a given depth along each profile, on which various morphological features were measured and which were then classified taxonomically in EcoTaxa.
Samples and objects have unique identifiers. The sample_id is used to link the different tables of the data set together. All files are tab-separated values, UTF-8 encoded and gzip compressed.
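As an illustration of the file format described above, the following is a minimal sketch of how one of the tables could be read with the Python standard library only. The function names `iter_rows` and `index_samples` are hypothetical helpers, not part of the dataset or of its accompanying code. Rows are streamed rather than loaded at once, since the objects table holds several million rows.

```python
import csv
import gzip
import io


def iter_rows(source):
    """Yield each row of a gzip-compressed, UTF-8, tab-separated table as a dict.

    `source` may be a file path or a binary file object.
    All values are returned as text; type conversion is left to the caller.
    """
    with gzip.open(source, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")


def index_samples(sample_rows):
    """Map sample_id (kept as text) to its row, so other tables can look it up."""
    return {row["sample_id"]: row for row in sample_rows}
```

For example, `index_samples(iter_rows("samples.tsv.gz"))` would give a lookup table in which the sample of any object can be found through that object's `sample_id`.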
samples.tsv.gz
- sample_id <int> unique sample identifier
- sample_name <text> original sample identifier
- project <text> EcoPart project title
- lat, lon <float> location [decimal degrees]
- datetime <text> date and time of start of profile [ISO 8601: YYYY-MM-DDTHH:MM:SSZ]
- pixel_size <float> size of one pixel [mm]
- uvp_model <text> version of the UVP: SD: standard definition, ZD: zoomed, HD: high definition
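The column types listed above can be applied when reading a row of `samples.tsv.gz`. The sketch below assumes that `lat` and `lon` are two separate columns, that every field is filled in, and that the trailing `Z` of the timestamp denotes UTC, as it does in ISO 8601. `parse_datetime` and `parse_sample` are hypothetical helper names.

```python
from datetime import datetime, timezone


def parse_datetime(text):
    """Parse 'YYYY-MM-DDTHH:MM:SSZ' into a timezone-aware UTC datetime."""
    naive = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return naive.replace(tzinfo=timezone.utc)


def parse_sample(row):
    """Convert one text row of the samples table to typed values."""
    return {
        "sample_id": int(row["sample_id"]),
        "sample_name": row["sample_name"],
        "project": row["project"],
        "lat": float(row["lat"]),                # decimal degrees
        "lon": float(row["lon"]),                # decimal degrees
        "datetime": parse_datetime(row["datetime"]),
        "pixel_size": float(row["pixel_size"]),  # mm
        "uvp_model": row["uvp_model"],           # SD, ZD or HD
    }
```

Rows with missing values would raise a `ValueError` here; a more defensive version would need to decide how to represent them.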
samples_volume.tsv.gz
Along a profile, the UVP takes many images, each of a fixed volume. The profiles are cut into 5 m depth bins in which the number of images taken is recorded and hence the imaged volume is known. This is necessary to compute concentrations.
- sample_id <int> unique sample identifier
- mid_depth_bin <float> middle of the depth bin (2.5 = from 0 to 5 m depth) [m]
- water_volume_imaged <float> volume imaged = number of full images × unit volume [L]
objects.tsv.gz
- object_id <int> unique object identifier
- object_name <text> original object identifier
- sample_id <int> unique sample identifier
- depth <float> depth at which the image was taken [m]
- mid_depth_bin <float> corresponding depth bin [m]; to match with samples_volume
- taxon <text> original taxonomic name as in EcoTaxa; is not consistent across projects
- lineage <text> taxonomic lineage corresponding to that name
- classif_author <text> unique, anonymised identifier of the user who performed this classification
- classif_datetime <text> date and time at which the classification was performed
- group <text> broader taxonomic name, for which the identification is consistent over the whole dataset
- group_lineage <text> taxonomic lineage corresponding to this broader group
- area_mm2 <float> measurements on the object, in real-world units (i.e. comparable across the whole dataset) …
- major_mm <float>
- area <float> measurements on the object, in [pixels] and therefore not directly comparable among the different UVP models and units
- mean <float> …
- skeleton_area <float>
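Because each object carries its `sample_id` and `mid_depth_bin`, it can be matched to the imaged volume of its 5 m bin, which is what a concentration requires. The sketch below counts objects per sample, bin and `group`, then divides by `water_volume_imaged`. It is an illustration under these assumptions, not the method of the R script distributed with the dataset, and `concentrations_per_bin` is a hypothetical name.

```python
from collections import Counter


def concentrations_per_bin(objects, volumes):
    """Return {(sample_id, mid_depth_bin, group): objects per litre}.

    `objects` and `volumes` are iterables of dict rows read from
    objects.tsv.gz and samples_volume.tsv.gz respectively.
    """
    # Imaged volume [L] of each 5 m bin of each profile.
    volume_by_bin = {
        (int(v["sample_id"]), float(v["mid_depth_bin"])): float(v["water_volume_imaged"])
        for v in volumes
    }
    # Number of objects of each group in each bin.
    counts = Counter(
        (int(o["sample_id"]), float(o["mid_depth_bin"]), o["group"]) for o in objects
    )
    result = {}
    for (sample_id, mid_depth, group), n in counts.items():
        volume = volume_by_bin.get((sample_id, mid_depth))
        if volume:  # skip bins whose imaged volume is unknown or zero
            result[(sample_id, mid_depth, group)] = n / volume
    return result
```

Only groups actually observed in a bin appear in the result. A bin that was imaged but contains no object of a given group has a concentration of zero, which the caller would have to add explicitly.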
properties_per_bin.tsv.gz
The information above makes it possible to compute concentrations, biovolumes, and average grey levels within a given depth bin. The code to do so is in `summarise_objects_properties.R`.
- sample_id <int> unique sample identifier
- depth_range <text> depth range over which the concentration and biovolume are computed, written as (start,end] in [m], where `(` means the bound is excluded and `]` means it is included
- group <text> broad taxonomic group
- concentration <float> concentration [ind/L]
- biovolume <float> biovolume [mm3/L]
- avg_grey <float> average grey level of particles [no unit; 0 is black, 255 is white]
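Since `depth_range` is stored as text, it has to be parsed before rows can be filtered by depth. The sketch below assumes the string looks like `(0,5]`, possibly with spaces, and that the bounds are written in any notation Python's `float()` accepts. The exact formatting in the file is not stated here, so the pattern may need adjusting. `parse_depth_range` and `in_depth_range` are hypothetical helpers.

```python
import re

# "(start,end]" with optional whitespace around the two bounds.
_DEPTH_RANGE = re.compile(r"^\(\s*([^,\s]+)\s*,\s*([^,\s\]]+)\s*\]$")


def parse_depth_range(text):
    """Return (start, end) in metres from a '(start,end]' string."""
    match = _DEPTH_RANGE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised depth range: {text!r}")
    return float(match.group(1)), float(match.group(2))


def in_depth_range(depth, bounds):
    """True if start < depth <= end, i.e. start excluded and end included."""
    start, end = bounds
    return start < depth <= end
```

With this convention a depth of exactly 5 m belongs to `(0,5]` and not to `(5,10]`.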
ODV_biovolumes.txt, ODV_concentrations.txt, ODV_grey_levels.txt
This is the same information as above, formatted so that Ocean Data View (https://odv.awi.de) can read it. In ODV, go to Import > ODV Spreadsheet and accept all default choices.
Images
The images are provided in a separate, much larger, zip file. They are stored under paths of the form `sample_id/object_id.jpg`, where `sample_id` and `object_id` are the integer identifiers used in the data tables above.
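The path convention above means that the image of any row of the objects table can be located from its two identifiers. The sketch below builds the archive member name and reads the image bytes with the standard `zipfile` module. It assumes the `sample_id` folders sit at the top level of the archive, which should be checked against the actual file. `image_member` and `read_image_bytes` are hypothetical names.

```python
import io
import zipfile
from pathlib import PurePosixPath


def image_member(sample_id, object_id):
    """Archive member name of an object's image: 'sample_id/object_id.jpg'."""
    return str(PurePosixPath(str(int(sample_id))) / f"{int(object_id)}.jpg")


def read_image_bytes(archive, sample_id, object_id):
    """Return the raw JPEG bytes of one object from the image archive.

    `archive` may be a path to the zip file or a binary file object.
    Assumption: sample folders are stored at the top level of the archive.
    """
    with zipfile.ZipFile(archive) as images:
        return images.read(image_member(sample_id, object_id))
```

When many images are needed, it would be more efficient to open the archive once and call `read` repeatedly than to reopen it for each object as this helper does.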
- Date (Publication)
- 2025-07-11
- Date (Revision)
- 2025-08-27
- Other citation details
- Nocera Ariadna, Stemmann Lars, Babin Marcel, Biard Tristan, Coustenoble Julie, Carlotti François, Coppola Laurent, Courchet Lucas, Drago Laetitia, Elineau Amanda, Guidi Lionel, Hauss Helena, Jalabert Laëtitia, Karp-Boss Lee, Kiko Rainer, Laget Marion, Lombard Fabien, McDonnell Andrew, Merland Camille, Motreuil Solène, Panaïotis Thelma, Picheral Marc, Rogge Andreas, Waite Anya, Irisson Jean-Olivier (2025). A global consistent database of plankton and detritus from in situ imaging by the Underwater Vision Profiler 5. SEANOE. https://doi.org/10.17882/107583
- Theme
- plankton
- particles
- Underwater Vision Profiler
- image
- Biological oceanography
- project
- H2020 TRIATLAS (Agreement: 817578)
- H2020 AtlantECO (Agreement: 862923)
- H2020 Blue-Cloud (Agreement: 862409)
- Use constraints
- Other restrictions
- Date (Publication)
- 2022
- Unique resource identifier
- 10.3389/fmars.2022.894372
- Association Type
- Cross reference
- Initiative Type
- Study
- Date (Publication)
- 2022
- Unique resource identifier
- 10.1002/lom3.10492
- Association Type
- Cross reference
- Initiative Type
- Study
- Date (Publication)
- 2022
- Unique resource identifier
- 10.1146/annurev-marine-041921-013023
- Association Type
- Cross reference
- Initiative Type
- Study
- Date (Publication)
- Unique resource identifier
- 10.1111/geb.13741
- Association Type
- Cross reference
- Initiative Type
- Study
- Date (Publication)
- Unique resource identifier
- 10.5194/essd-14-4315-2022
- Association Type
- Cross reference
- Initiative Type
- Study
- Date (Publication)
- 2021
- Unique resource identifier
- 10.1594/pangaea.924375
- Association Type
- Cross reference
- Initiative Type
- dataset
- Unique resource identifier
- 10.17600/8010090
- Association Type
- Cross reference
- Initiative Type
- Platform
- Unique resource identifier
- 10.17600/15001200
- Association Type
- Cross reference
- Initiative Type
- Platform
- Unique resource identifier
- 10.17600/13020010
- Association Type
- Cross reference
- Initiative Type
- Platform
- Unique resource identifier
- 10.18142/131
- Association Type
- Cross reference
- Initiative Type
- Platform
- Unique resource identifier
- 10.18142/235
- Association Type
- Cross reference
- Initiative Type
- Platform
- Unique resource identifier
- 10.17600/14007500
- Association Type
- Cross reference
- Initiative Type
- Platform
- Metadata language
- English
- Topic category
- Oceans
- Distribution format
- TEXTE
- IMAGE
- TEXTE
- OnLine resource
- Quality controlled data (WWW:DOWNLOAD-1.0-link--download): Data tables - 1 GB
- OnLine resource
- Quality controlled data (WWW:DOWNLOAD-1.0-link--download): Images - 33 GB
- OnLine resource
- DOI of the product ( WWW:LINK-1.0-http--metadata-URL )
- OnLine resource
- Seanoe ( rel-canonical )
- Hierarchy level
- Dataset
- File identifier
- seanoe:107583 XML
- Metadata language
- English
- Character set
- UTF8
- Hierarchy level
- Dataset
- Date stamp
- 2025-08-27
- Metadata standard name
- ISO 19115:2003/19139
- Metadata standard version
- 1.0