MAKER2
2.31.8b
Halruf.gff3
Aug. 28, 2018, midnight
Genes
Genome sequences used as transcript and protein homology evidence for gene prediction were obtained from NCBI for the following species: Crassostrea gigas (GCA_000297895.1, Zhang et al. 2012), Crassostrea virginica (GCA_002022765.4), Mytilus galloprovincialis (GCA_001676915.1, Murgarella et al. 2016) and Mizuhopecten yessoensis (GCA_002113885.2, Wang et al. 2017). Genomes were downloaded and transcripts were extracted using the gene models in the corresponding GFF files (Supplemental Note 7).

Three lanes of 150 bp paired-end (insert size ~300 bp) Illumina HiSeq 3000 RNA-Seq data for Haliotis rufescens were obtained from twelve tissues from a single female (cephalic tentacle, epipodium, epipodal tentacle, ganglion, gonad, heart, kidney, liver, foot, gill, mantle and post-esophagus), two tissues from a single male (gonad, light receptor), and pools of individuals from each of the following early-life developmental stages: egg, 1 day, 6 days, 7 days, 10 days post-hatch, and 1 month post-hatch. The 7 day time point included a 24 hour acute carbon dioxide exposure (~1200 ppm) and a control CO2 exposure. All samples were extracted in duplicate to create replicate libraries. RNA-Seq read data have been deposited in the NCBI Sequence Read Archive (BioProject accession PRJNA488641). The RNA-Seq data were assembled using Trinity version 2.3.2 (Grabherr et al. 2011) with default parameters, and the assembly was subsequently used as EST evidence. EST evidence was also gathered from all available Bivalvia ESTs in NCBI on Jan 26, 2018.

MAKER2 version 2.31.8b (Holt and Yandell 2011) was run on the masked genome using all of the data described above as evidence (Supplemental Note 7).

In the final annotation, functional information was added to the predicted gene models. The curated SwissProt/UniProt databases (UniProt Consortium 2016, accessed Oct 5, 2017) were used to identify putative function based on blastp homology with default parameters and an upper e-value cutoff of 1e-5 (Camacho et al. 2009). InterProScan version 5.26-65.0 (Jones et al. 2014) was run with default parameters to search against the member databases of the InterPro Consortium.
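
The e-value filtering used for the blastp-based functional assignment can be illustrated with a minimal sketch. This is not the pipeline used for this annotation. It assumes blastp results were written in tabular format (`-outfmt 6`), whose default columns place the e-value in column 11 and the bit score in column 12. The function name `best_hits`, the choice of keeping the highest-scoring hit per query, and the example rows (`gene1`, `subjA`, and so on) are all hypothetical.

```python
import csv

EVALUE_CUTOFF = 1e-5  # upper e-value cutoff stated in the text


def best_hits(tabular_lines, cutoff=EVALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue, bitscore)} with one hit per query.

    Expects BLAST tabular output (-outfmt 6) with the 12 default columns.
    Hits whose e-value exceeds the cutoff are dropped. Among the remaining
    hits, the one with the highest bit score is kept for each query
    (an assumption made for this sketch).
    """
    best = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > cutoff:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical rows: qseqid sseqid pident length mismatch gapopen
# qstart qend sstart send evalue bitscore
EXAMPLE = (
    "gene1\tsubjA\t45.0\t200\t110\t0\t1\t200\t1\t200\t1e-50\t180\n"
    "gene1\tsubjB\t30.0\t150\t105\t0\t1\t150\t1\t150\t1e-8\t60\n"
    "gene2\tsubjC\t25.0\t80\t60\t0\t1\t80\t1\t80\t0.003\t35\n"
    "gene3\tsubjD\t33.0\t120\t80\t0\t1\t120\t1\t120\t2e-6\t48\n"
)

hits = best_hits(EXAMPLE.splitlines())
for query, (subject, evalue, bitscore) in sorted(hits.items()):
    print(f"{query}\t{subject}\t{evalue:g}\t{bitscore:g}")
```

In this sketch, `gene2` receives no putative function because its only hit has an e-value above the cutoff, while `gene1` is assigned its stronger hit.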