An Open Platform for Cannabis Genomics Research

Locedie Mansueto (0), Ramil Mauleon (0), Tobias Kretzschmar (0), Graham King (0)
(0) Southern Cross Plant Science - Southern Cross University

Find me on Tues Nov 24th, 1:40-3pm AEDT in Remo, table 141

Abstract
Recent global appreciation of the therapeutic properties of Cannabis has led to revised national legislation under the Single Convention on Narcotic Drugs (https://www.unodc.org/unodc/en/treaties/single-convention.html), and in turn to rapidly growing hempseed and medicinal cannabis industries. Government regulation is partly concerned with limiting THC content, although some countries still prohibit possession. The high heterozygosity of the Cannabis genome results in wide variation in morphology and metabolites, including cannabinoid (THC, CBD and up to 60 others) and terpene content. It is a challenge for growers to produce legally compliant products while maintaining other desired properties. Publicly available molecular data for Cannabis could aid crop improvement, but these data are scarce and scattered. Looking at these datasets, we see that Cannabis is an ideal model system for multi-omics study, since chemovar or metabolite concentration is the primary phenotype of interest and may readily be associated with the genome through the transcriptome and proteome. In this project, we aim to build a bioinformatics portal to enable community-inclusive cannabis 'omics and genetics research. We adopted Tripal (tripal.info), a well-established toolkit with numerous modules for storage and visualization of biological datasets. Publicly available omics-type data were obtained, re-analysed, and updated to add value beyond the original publication, then harmonized to interoperate across the Tripal modules. Public genome data include the CBDRx cs10 genome with its NCBI RefSeq annotation and the Purple Kush genome, which we annotated using the GeneMark-EP pipeline; these data can be accessed in a JBrowse genome browser. Genome variant calls were generated from five genome sequencing projects and can be mined using the embedded SNP-Seek interface.
Gene expression data from GEO and publications can be visualized on curated biological pathways using the MapManJS interface. More public data and Tripal modules are in active development.
Using Tripal as the underpinning infrastructure allows the cannabis research community to contribute relevant content: users can register, be assigned to roles and groups with different levels of access to the site (such as viewing, editing, or creating certain content), and post comments on the content to encourage discussion. We hope this portal will bring together the cannabis research community and encourage sharing of relevant data and information for the advancement of cannabis research.