MalaMix dataset: contextual and metabarcoding data

  • Pedro C Junger (Creator)
  • Hugo Sarmento (Creator)
  • Caterina R. Giner (Creator)
  • Mireia Mestre (Creator)
  • Marta Sebastián (Creator)
  • Xosé A. G. Moran (Centro Oceanográfico (IEO-CSIC), Gijón/Xixón, Spain.) (Creator)
  • Javier Arístegui (Creator)
  • Susana Agusti (Creator)
  • Carlos Duarte (Creator)
  • Silvia G. Acinas (Creator)
  • Ramon Massana (Creator)
  • Josep M. Gasol (Creator)
  • Ramiro Logares (Creator)



1. INTRODUCTION

MalaMix is a compiled metabarcoding dataset composed of 451 marine samples collected across a range of depths, from the surface (3 m) to deep waters (as far down as 4800 m). The dataset covers three ocean layers: the epipelagic (0-200 m, including the deep chlorophyll maximum, DCM), the mesopelagic (200-1000 m) and the bathypelagic (1000-4000 m). MalaMix combines samples obtained during two oceanographic expeditions with similar sampling strategies:

  i) the Malaspina-2010 global expedition, which produced 263 samples collected between December 2010 and July 2011 from 120 stations distributed along the tropical and subtropical portions (latitudes between 35° N and 40° S) of the Pacific, Atlantic and Indian oceans; and
  ii) the HotMix trans-Mediterranean cruise, which produced 188 samples collected between April and May 2014 at 29 stations distributed along the whole Mediterranean Sea (from 5° W to 33° E) and the adjacent Northeast Atlantic Ocean.

MalaMix comprises:

  • a 16S-V4V5 rRNA gene ASV table (MalaMix_16S.csv);
  • an 18S-V4 rRNA gene ASV table (MalaMix_18S.csv);
  • two tables of contextual metadata (MalaMix_EnvData_16S and MalaMix_EnvData_18S) including 6 standardized environmental parameters (temperature [°C], salinity, fluorescence, PO₄³⁻ [µmol L⁻¹], NO₃⁻ [µmol L⁻¹], and SiO₂ [µmol L⁻¹]) as well as species taxonomic and phylogenetic diversity metrics;
  • a table (MalaMix_FCdata.csv) with flow cytometry microbial counts [cell mL⁻¹] and bacterial activity measurements [pmol Leu L⁻¹ h⁻¹];
  • a README file (README_Metadata.csv) describing the meaning and units of each variable column in the metadata tables.
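The depth ranges quoted above can be turned into a small helper for labelling samples by ocean layer. The following is a minimal sketch, not part of the dataset: the function name `ocean_layer` is hypothetical, the handling of the shared boundaries (200 m and 1000 m assigned to the shallower layer) is an assumption, and samples deeper than 4000 m are assumed to count as bathypelagic because the description reports samples down to 4800 m.

```python
def ocean_layer(depth_m: float) -> str:
    """Map a sampling depth in metres to one of the three layers named in the description.

    Assumptions (not stated in the dataset description):
    - boundary depths (200 m, 1000 m) belong to the shallower layer;
    - anything deeper than 1000 m, including samples below 4000 m,
      is labelled bathypelagic.
    """
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    if depth_m <= 200:
        return "epipelagic"
    if depth_m <= 1000:
        return "mesopelagic"
    return "bathypelagic"


# Example: label a few depths spanning the reported sampling range.
for depth in (3, 150, 500, 4800):
    print(depth, ocean_layer(depth))
```

Applied to a depth column of the metadata tables, such a helper would give a layer label per sample; the actual column names are documented in README_Metadata.csv and are not assumed here.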
The raw DNA sequences are publicly available at the European Nucleotide Archive under the following accession numbers:

  • Malaspina surface dataset: PRJEB23913 [18S rRNA genes] and PRJEB25224 [16S rRNA genes];
  • Malaspina vertical profiles: PRJEB23771 [18S rRNA genes] and PRJEB45015 [16S rRNA genes];
  • Malaspina deep sea dataset: PRJEB45011 [16S rRNA genes] and PRJEB45014 [18S rRNA genes];
  • HotMix expedition: PRJEB44683 [18S rRNA genes] and PRJEB44474 [16S rRNA genes].

Further methodological details are available here:

2. FUTURE FORMAT CHANGES

No major changes to the general format of the database are expected.

3. ACKNOWLEDGMENTS

The current dataset was generated with funds from the projects INTERACTOMICS (CTM2015-69936-P, MINECO, Spain), MicroEcoSystems (240904, RCN, Norway), MINIME (PID2019-105775RB-I00, AEI, Spain), and PID2021-125469NB-C31 (AEI, Spain), as well as DOREMI (CTM2012-34294) and HOTMIX (CTM2011-30010-C02-01 and CTM2011-30010-C02-02) of the Spanish Ministry of Economy and Innovation, co-financed with FEDER funds.

4. COPYRIGHT NOTICE

This database is provided “as is”, without warranty of any kind, and is openly available for non-commercial purposes (CC BY-NC). CC BY-NC means that users can make use of the work (including copying, distributing, adapting and building upon the work), but only for non-commercial purposes and as long as attribution is given to the creator:
Date made available: Sep 20 2023
