Construction of Mushroom MNP marker library and its application in cultivar identification
Zhang Tuo1, Chuan-Zheng Wei1, Lu-Yu Xie2*, Huan Yang1, Zhi-Wen Lv1, and Bao-Gui Xie1*
1Mycological Research Center, Fujian Agriculture and Forestry University
2College of Computer Science, National University of Singapore
Current cultivar identification technologies, such as SSR, ISSR, SCAR, and AFLP, are based on analyzing polymorphisms in PCR products. These methods require many closely related cultivars as controls, which makes the work heavy and time-consuming, and they often give false-positive results because of mismatches between primers and genomes. Given the maturity, low cost, and accuracy of next-generation sequencing (NGS), we collected the main cultivars of Flammulina filiformis, Pleurotus eryngii, and Hypsizigus marmoreus from China and overseas, together with some wild strains, for library construction. After whole-genome sequencing, the NGS reads of each sample were mapped to the reference genome with Bowtie2, and SNPs were then called with GATK. SNPs were filtered in three steps: those with a minor allele frequency (MAF) > 0.05 were retained, those with a missing rate > 20% were removed, and those with a sequencing depth below 50 were removed. The remaining SNPs were used to construct the multiple nucleotide polymorphism (MNP) library. The genome was scanned with a 100-bp window, and any window containing 2 to 10 SNPs was defined as an MNP marker. MNPs were ranked by their polymorphism information content (PIC), and non-overlapping MNPs with PIC > 0.5 that lay more than 50 kb apart were included in the library. MNP libraries were constructed for F. filiformis, P. eryngii, and H. marmoreus, containing 428, 503, and 369 markers, respectively.
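The filtering, window scanning, and marker selection steps described above can be sketched in Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Snp` record, all function names, the tiling of windows from the next unused SNP, the Botstein-style PIC formula, and the greedy spacing rule are hypothetical choices, since the abstract does not specify them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Snp:
    pos: int          # position on one contig (bp)
    maf: float        # minor allele frequency across samples
    miss_rate: float  # fraction of samples without a call
    depth: int        # sequencing depth at the site


def filter_snps(snps, min_maf=0.05, max_miss=0.20, min_depth=50):
    """Keep SNPs with MAF > 0.05, missing rate <= 20%, depth >= 50."""
    return [s for s in snps
            if s.maf > min_maf and s.miss_rate <= max_miss and s.depth >= min_depth]


def scan_windows(positions, window=100, min_snps=2, max_snps=10):
    """Return (start, end, snp_count) for windows holding 2 to 10 SNPs.

    Assumption: windows are anchored at the next unused SNP and never overlap.
    """
    positions = sorted(positions)
    markers, i = [], 0
    while i < len(positions):
        start = positions[i]
        j = i
        while j < len(positions) and positions[j] < start + window:
            j += 1
        count = j - i
        if min_snps <= count <= max_snps:
            markers.append((start, start + window - 1, count))
        i = j  # move past every SNP covered by this window
    return markers


def pic(alleles):
    """PIC of one marker from the haplotype observed in each strain.

    Assumption: Botstein-style formula,
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    freqs = [c / len(alleles) for c in Counter(alleles).values()]
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2 * freqs[a] ** 2 * freqs[b] ** 2
                for a in range(len(freqs)) for b in range(a + 1, len(freqs)))
    return 1.0 - homozygosity - cross


def select_markers(markers, min_pic=0.5, min_gap=50_000):
    """Greedy pick: highest PIC first, each more than 50 kb from those kept.

    `markers` is a list of (start, pic_value) pairs on one contig.
    """
    kept = []
    for start, value in sorted(markers, key=lambda m: m[1], reverse=True):
        if value > min_pic and all(abs(start - k) > min_gap for k, _ in kept):
            kept.append((start, value))
    return sorted(kept)
```

In practice the SNP records would come from the GATK output rather than being built by hand, and the selection would be run separately for each contig of the reference genome.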
When identifying a new cultivar, the MNPs of the candidate are compared one by one with those of each strain in the library, and the number of shared MNPs (n) is recorded to calculate the genetic similarity (GS) between strains: GS = n/N × 100%, where N = 428 for F. filiformis, N = 503 for P. eryngii, and N = 369 for H. marmoreus.
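As a worked illustration of the GS formula, the following Python sketch counts shared markers between a candidate and one library strain. It assumes that a "shared" MNP means both strains carry the same haplotype at that marker. The dictionary layout, the marker identifiers, and the function name are hypothetical. Only the library sizes come from the text.

```python
# Library sizes N reported for each species.
LIBRARY_SIZE = {"F. filiformis": 428, "P. eryngii": 503, "H. marmoreus": 369}


def genetic_similarity(candidate, reference, species):
    """Return GS = n / N * 100 for one candidate vs. one library strain.

    `candidate` and `reference` map a marker ID to the haplotype string
    observed at that MNP (an assumed data layout).
    """
    total = LIBRARY_SIZE[species]
    # n: markers where both strains carry the same haplotype
    shared = sum(1 for marker, hap in candidate.items()
                 if reference.get(marker) == hap)
    return shared / total * 100.0


# Toy example with three markers, two of which match.
candidate = {"mnp001": "ACG", "mnp002": "TT", "mnp003": "GA"}
library_strain = {"mnp001": "ACG", "mnp002": "TC", "mnp003": "GA"}
gs = genetic_similarity(candidate, library_strain, "H. marmoreus")
```

Here n = 2 and N = 369, so the toy GS is 2/369 × 100%. A real comparison would cover all markers in the library.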
To build the Mushroom Molecular Identification Workstation (http://www.mrclab.top/), we rented a Linux server from Aliyun and provide free services for species identification, cultivar identification, hybrid identification, and molecular marker screening. Among these services, molecular marker screening is computationally intensive, so the rented server can only analyze strains with high genetic similarity, searching for differing DNA fragments longer than 200 base pairs. At present, the workstation supports only F. filiformis, P. eryngii, and H. marmoreus. Users only need to upload their NGS data to the workstation and will receive the results in about five hours.
When an uploaded cultivar is a new one, the workstation updates the MNP library automatically.
The mushroom MNP marker-based cultivar identification technology does not depend on controls or closely related species, and it is accurate, stable, reliable, low in workload, and low in cost.