Direct RNA sequencing identifies isoform-specific structures

Yue Wan0
(0) Genome Institute of Singapore

Abstract
The ability to correctly assign structure information to an individual transcript in a continuous and phased manner is critical to understanding RNA function. RNA structures play important roles in every step of an RNA's lifecycle; however, current short-read, high-throughput RNA structure mapping strategies are lengthy and complex, and cannot assign unique structures to individual gene-linked isoforms in the sequences they share. To address these limitations, we present PORE-cupine, an approach that combines structure probing with the SHAPE-like compound NAI-N3, nanopore direct RNA sequencing, and one-class support vector machines to detect secondary structures on near full-length RNAs. PORE-cupine provides rapid, direct, accurate and robust structure information along known RNAs and recapitulates global structural features in human embryonic stem cells. The majority of gene-linked isoforms showed structural differences in shared sequences, both local and distal to the alternative splice site, highlighting the importance of long-read sequencing for phasing structures. Structural differences between gene-linked isoforms are associated with differential translation efficiencies globally, pointing to structure as a pervasive mechanism for regulating isoform-specific gene expression inside cells.