The subarctic Pacific is inhabited by three copepod congeners in the genus Neocalanus with an overlapping biogeographic range that includes the open ocean, marginal seas and fjord systems. Two distinct genetic variants of Neocalanus flemingeri have been reported from the western Pacific: the “small form” with an annual life cycle is found throughout the region, while the “large form” population with a 2-year life cycle is centered in the Sea of Okhotsk. Using a molecular approach, this study exami...
Collections and sample preservation
Collections during the spring and fall were made during oceanographic cruises of the Seward Line Long-term Observation Program (LTOP) and northern Gulf of Alaska Long Term Ecological Research programs (https://nga.lternet.edu/) between 2015 and 2022. Additional samples were collected at nearshore stations (GAK1 and RES2.5) in 2019 and 2023, and in 2019 in the Gulf of Alaska Seamount region from below 1000 m (see Block (2024) for more information about sampling).
RNA-Seq Data
Pre-adult Neocalanus flemingeri stage CV were collected from the upper 100 m in April and May from 5 to 6 stations. Upon retrieval of the net (QuadNet, 53 µm mesh), the plankton collection was live sorted under the microscope and preserved in RNALater. In 2019, N. flemingeri were collected in mid-April for an incubation experiment with individuals maintained in the laboratory for up to 2 months before preservation for RNA-Seq (Roncalli et al., 2023). In September, samples were collected from depth (300 to 700 m) using a 0.25 m² Multinet (Hydrobios), live sorted and preserved for RNA-Seq immediately or after laboratory incubation. Fall collections were primarily from Prince William Sound (stations PWS2 and KIP2) with the exception of July 2019, when adult females were collected from depth (>1000 m) in the Gulf of Alaska (stations GAK19 and “Quinn deep”).
Sample processing (RNA extraction, library preparation and high-throughput sequencing) has been described previously (Roncalli et al., 2018, 2019, 2021). Paired-end sequencing was done on the Illumina platform (NextSeq), and short-sequence read lengths were set to either 75 or 150 bp with a sequencing depth of 10M or greater. Raw sequence data are available through the National Center for Biotechnology Information (NCBI), with additional metadata available through BCO-DMO (Lenz et al., 2024a,b). Raw sequence data were quality checked, sequences with Phred scores below 30 were removed, and sequences were trimmed to remove adapters and the first 9 bp (Roncalli et al., 2018). In addition, rRNA transcripts were removed using SortMeRNA (version 4.2.0) (Kopylova et al., 2012).
Lineage identification was based on cytochrome c oxidase subunit-1 (mtCOI) reference sequences that were used to establish the sequence differences between the two forms (Machida and Tsuda, 2010). For the RNA-Seq data, a mtCOI reference database was generated using full-length sequences for small-form and large-form N. flemingeri, as well as N. plumchrus, N. cristatus, Calanus marshallae, Eucalanus bungii and Metridia pacifica (see Supplemental File "species_list_copepods.csv" for additional copepod species name information and taxonomic identifiers). The reference consisted of consensus sequences obtained by comparing sequences downloaded from NCBI and from de novo assemblies (Hartline et al., 2023; 2024). RNA-Seq libraries from each individual were mapped against the mtCOI reference using Bowtie2 (version 2.3.5.1, default options in paired-end mode), following an approach originally developed to identify and quantify Calanus congeners in RNA-Seq data (Lenz et al., 2021a,b). Because mtCOI mapping is highly specific, this approach successfully distinguishes between closely related species and genetic lineages. Cross-mapping is minimal: even between the two N. flemingeri lineages it is typically below 2% (maximum 7%).
Data analysis involved tallying the number of individuals that mapped to each lineage for each station/year collection. Spatial and temporal patterns in the number of small-form vs. large-form individuals were analyzed by combining data by year from multiple locations, and by combining regional data for both pre-adult and adult individuals to examine interannual differences.
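The per-individual tally described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the read counts, the individual and reference labels, and the `assign_lineage` helper are all hypothetical, and it assumes that the number of reads mapped to each mtCOI reference has already been extracted from the Bowtie2 output.

```python
from collections import Counter

# Hypothetical read counts per individual, one entry per mtCOI reference.
# In practice these would be derived from the Bowtie2 alignments.
mapped_reads = {
    "ind01": {"Nf_small": 980, "Nf_large": 12, "N_plumchrus": 0},
    "ind02": {"Nf_small": 15, "Nf_large": 1485, "N_plumchrus": 0},
    "ind03": {"Nf_small": 2000, "Nf_large": 0, "N_plumchrus": 0},
}


def assign_lineage(counts):
    """Return (best-matching reference, percent of reads mapped elsewhere)."""
    total = sum(counts.values())
    if total == 0:
        return None, 0.0  # no mtCOI reads: the individual cannot be assigned
    best = max(counts, key=counts.get)
    cross_mapping = 100.0 * (total - counts[best]) / total
    return best, cross_mapping


assignments = {ind: assign_lineage(c) for ind, c in mapped_reads.items()}

# Tally the number of individuals assigned to each lineage.
tally = Counter(lineage for lineage, _ in assignments.values())

for ind, (lineage, cross) in assignments.items():
    print(f"{ind}: {lineage} (cross-mapping {cross:.1f}%)")
print(dict(tally))
```

Grouping the same tally by station and year would give the per-collection counts used for the spatial and temporal comparisons.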
DNA Sequencing
In 2016, additional individuals were collected during the spring and fall cruises for DNA extraction (n=123), amplification of mtCOI, and Sanger sequencing. Individual N. flemingeri were preserved in RNALater prior to DNA extraction using the DNeasy Blood and Tissue Kit (Qiagen). Extracted DNA was amplified using the universal DNA primers LCO1490 and HCO2198, which consistently amplify a 710-bp region of the mitochondrial cytochrome oxidase subunit I gene from a variety of metazoan invertebrates (Folmer et al., 1994). PCR products were checked for expected size using gel electrophoresis, and purified using Qiagen's Purification Kit prior to Sanger sequencing at the Advanced Studies in Genomics, Proteomics and Bioinformatics (ASGPB) at the University of Hawai‘i at Mānoa. Sequence data were edited for quality using Geneious and searched on NCBI for the closest match using BLAST (Altschul et al., 1990). Top N. flemingeri hits were compared to published lineage-specific sequences (Machida and Tsuda, 2010) for small-form vs. large-form identification.
Metabarcoding
For DNA metabarcoding, bulk samples containing nauplii of N. flemingeri were collected in Resurrection Bay between January and March 2023, size fractionated and preserved in ethanol (Block, 2024). DNA was extracted from the small size fraction of the bulk samples (53–210 µm) using the Qiagen DNeasy Blood and Tissue Mini Kit with an extended 24-hour proteinase K incubation to ensure adequate lysis. The mtCOI was amplified using the mlCOIintF and jgHCO2198 primers (Leray et al., 2013), sequenced and processed using established pipelines as described elsewhere (Block, 2024). Amplicon sequences were quantified, clustered and identified to lineage using reference sequences downloaded from the MetaZooGene database (Bucklin et al., 2021b). The two N. flemingeri genetic lineages were represented by two distinct operational taxonomic units (OTUs) that differed by ca. 3%. Ten of the 307 base pairs are diagnostic (highlighted in the original FASTA sequences, reproduced here without highlighting):
>Nf1_Neocalanus flemingeri OTU 1 (small form)
GTCTAGAAATATTGCCCATGCGGGAGGTTCTGTAGACTTCGCTATTTTCTCACTTCATTT
AGCAGGTGTGAGATCTATTTTAGGGGCCGTAAACTTCATTAGAACCCTCGGAAACTTACG
AGTATTTGGTATATTATTAGACCGAATACCTTTATTTGCCTGAGCTGTTCTTATTACTGC
TGTTCTCCTTCTCCTGTCTTTACCAGTATTAGCTGGAGCTATTACAATATTGTTAACAGA
TCGTAACCTAAATACTTCTTTTTATGATGTTGGCGGGGGCGGTGACCCTATTCTGTACCA
GCATCTA
>Nf2_Neocalanus flemingeri OTU 2 (large form)
CTCTAGAAATATTGCCCATGCGGGAGGTTCTGTAGACTTCGCTATTTTCTCACTTCACTT
GGCAGGTGTGAGATCTATTTTAGGGGCCGTAAACTTCATTAGGACCCTGGGAAACTTGCG
AGTATTTGGTATATTATTAGACCGAATACCTTTATTTGCCTGAGCTGTTCTTATTACTGC
TGTTCTCCTTCTCCTGTCTTTACCGGTATTAGCTGGAGCTATTACAATATTGTTAACAGA
TCGTAACCTAAATACTTCTTTCTATGATGTTGGGGGGGGCGGTGACCCTATTCTATACCA
GCATCTA
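Because the highlighting does not survive in plain text, the diagnostic sites can be recovered by comparing the two sequences position by position. The sketch below does this for the two FASTA records above; the variable names and the `diagnostic_sites` helper are illustrative and not part of the original processing pipeline. It assumes the two sequences are already aligned and of equal length, which holds for the records as given.

```python
# The two OTU sequences from the FASTA records above, joined across lines.
NF1_SMALL = (
    "GTCTAGAAATATTGCCCATGCGGGAGGTTCTGTAGACTTCGCTATTTTCTCACTTCATTT"
    "AGCAGGTGTGAGATCTATTTTAGGGGCCGTAAACTTCATTAGAACCCTCGGAAACTTACG"
    "AGTATTTGGTATATTATTAGACCGAATACCTTTATTTGCCTGAGCTGTTCTTATTACTGC"
    "TGTTCTCCTTCTCCTGTCTTTACCAGTATTAGCTGGAGCTATTACAATATTGTTAACAGA"
    "TCGTAACCTAAATACTTCTTTTTATGATGTTGGCGGGGGCGGTGACCCTATTCTGTACCA"
    "GCATCTA"
)
NF2_LARGE = (
    "CTCTAGAAATATTGCCCATGCGGGAGGTTCTGTAGACTTCGCTATTTTCTCACTTCACTT"
    "GGCAGGTGTGAGATCTATTTTAGGGGCCGTAAACTTCATTAGGACCCTGGGAAACTTGCG"
    "AGTATTTGGTATATTATTAGACCGAATACCTTTATTTGCCTGAGCTGTTCTTATTACTGC"
    "TGTTCTCCTTCTCCTGTCTTTACCGGTATTAGCTGGAGCTATTACAATATTGTTAACAGA"
    "TCGTAACCTAAATACTTCTTTCTATGATGTTGGGGGGGGCGGTGACCCTATTCTATACCA"
    "GCATCTA"
)


def diagnostic_sites(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


sites = diagnostic_sites(NF1_SMALL, NF2_LARGE)
divergence = 100.0 * len(sites) / len(NF1_SMALL)

for pos, small_base, large_base in sites:
    print(f"position {pos}: small form {small_base}, large form {large_base}")
print(f"{len(sites)} differences in {len(NF1_SMALL)} bp ({divergence:.1f}%)")
```

On the sequences as given here, the comparison finds 10 differing positions in 307 bp, consistent with the ca. 3% divergence stated above.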
The proportion of the two lineages was estimated from the relative counts of each OTU.
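As a small worked example of this estimate, the sketch below converts per-sample OTU read counts into relative proportions. The counts, the OTU labels and the `lineage_proportions` helper are hypothetical; the actual counts come from the metabarcoding pipeline described in Block (2024).

```python
def lineage_proportions(otu_counts):
    """Convert read counts per OTU into proportions of the sample total."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("no reads assigned to either OTU")
    return {otu: count / total for otu, count in otu_counts.items()}


# Hypothetical read counts for a single bulk sample.
sample_counts = {"OTU1_small": 750, "OTU2_large": 250}
proportions = lineage_proportions(sample_counts)

for otu, fraction in proportions.items():
    print(f"{otu}: {fraction:.2f}")
```

Relative read counts are used here as a proxy for relative abundance, which assumes that both lineages amplify with similar efficiency.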
Block, L. N., Lenz, P. H. (2025) Molecular identification of genetic variants of Neocalanus flemingeri in the Gulf of Alaska from samples collected from 2015 to 2023. Biological and Chemical Oceanography Data Management Office (BCO-DMO). (Version 1) Version Date 2025-02-21 [if applicable, indicate subset used]. http://lod.bco-dmo.org/id/dataset/954181 [access date]
Terms of Use
This dataset is licensed under Creative Commons Attribution 4.0.
If you wish to use this dataset, it is highly recommended that you contact the original principal investigators (PI). Should the relevant PI be unavailable, please contact BCO-DMO (info@bco-dmo.org) for additional guidance. For general guidance please see the BCO-DMO Terms of Use document.