PaperPlayer biorxiv bioinformatics

Multimodal LLC

Audio versions of bioRxiv paper abstracts

959 Episodes

  • A comprehensive analysis of the global human gut archaeome from a thousand genome catalogue

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.21.392621v1?rss=1 Authors: Chibani, C. M., Mahnert, A., Borrel, G., Almeida, A., Werner, A., Brugere, J.-F., Gribaldo, S., Finn, R. D., Schmitz, R. A., Moissl-Eichinger, C. Abstract: The human gut microbiome plays an important role in health and disease, but the archaeal diversity therein remains largely unexplored. Here we report the pioneering analysis of 1,167 non-redundant archaeal genomes recovered from human gastrointestinal tract microbiomes across countries and populations. We identified three novel genera and 15 novel species including 52 previously unknown archaeal strains. Based on distinct genomic features, we warrant the split of the Methanobrevibacter smithii clade into two separate species, with one represented by the novel Candidatus M. intestini. Patterns derived from 1.8 million proteins and 28,851 protein clusters coded in these genomes showed a substantial correlation with socio-demographic characteristics such as age and lifestyle. We infer that archaea are actively replicating in the human gastrointestinal tract and are characterized by specific genomic and functional adaptations to the host. We further demonstrate that the human gut archaeome carries a complex virome, with some viral species showing unexpected host flexibility. Our work furthers our current understanding of the human archaeome, and provides a large genome catalogue for future analyses to decipher its role and impact on human physiology. Copy rights belong to original authors. Visit the link for more info
  • Computational cell cycle analysis of single cell RNA-Seq data

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.21.392613v1?rss=1 Authors: Moussa, M. M. R., Mandoiu, I. I. Abstract: The variation in gene expression profiles of cells captured in different phases of the cell cycle can interfere with cell type identification and functional analysis of single cell RNA-Seq (scRNA-Seq) data. In this paper, we introduce SC1CC (SC1 - Cell Cycle analysis tool), a computational approach for clustering and ordering single cell transcriptional profiles according to their progression along cell cycle phases. We also introduce a new robust metric, GSS (Gene Smoothness Score) for assessing the cell cycle based order of the cells. SC1CC is available as part of the SC1 web-based scRNA-Seq analysis pipeline, publicly accessible at https://sc1.engr.uconn.edu/. Copy rights belong to original authors. Visit the link for more info
  • Frameshift and frame-preserving mutations in zebrafish presenilin 2 affect different cellular functions in young adult brains

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.21.392761v1?rss=1 Authors: Barthelson, K., Pederson, S. M., Newman, M., Jiang, H., Lardelli, M. Abstract: Background: Mutations in PRESENILIN 2 (PSEN2) cause early disease onset familial Alzheimer's disease (EOfAD) but their mode of action remains elusive. One consistent observation for all PRESENILIN gene mutations causing EOfAD is that a transcript is produced with a reading frame terminated by the normal stop codon : the 'reading frame preservation rule'. Mutations that do not obey this rule do not cause the disease. The reasons for this are debated. Methods: A frameshift mutation (psen2N140fs) and a reading frame-preserving mutation (psen2T141_L142delinsMISLISV) were previously isolated during genome editing directed at the N140 codon of zebrafish psen2 (equivalent to N141 of human PSEN2). We mated a pair of fish heterozygous for each mutation to generate a family of siblings including wild type and heterozygous mutant genotypes. Transcriptomes from young adult (6 months) brains of these genotypes were analysed. Bioinformatics techniques were used to predict cellular functions affected by heterozygosity for each mutation. Results: The reading frame preserving mutation uniquely caused subtle, but statistically significant, changes to expression of genes involved in oxidative phosphorylation, long term potentiation and the cell cycle. The frameshift mutation uniquely affected genes involved in Notch and MAPK signalling, extracellular matrix receptor interactions and focal adhesion. Both mutations affected ribosomal protein gene expression but in opposite directions. Conclusion: A frameshift and frame-preserving mutation at the same position in zebrafish psen2 cause discrete effects. Changes in oxidative phosphorylation, long Copy rights belong to original authors. Visit the link for more info
  • RepairSig: Deconvolution of DNA damage and repair contributions to the mutational landscape of cancer

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.21.392878v1?rss=1 Authors: Wojtowicz, D., Hoinka, J., Amgalan, B., Kim, Y.-A., Przytycka, T. M. Abstract: Many mutagenic processes leave characteristic imprints on cancer genomes known as mutational signatures. These signatures have been of recent interest regarding their applicability in studying processes shaping the mutational landscape of cancer. In particular, pinpointing the presence of altered DNA repair pathways can have important therapeutic implications. However, mutational signatures of DNA repair deficiencies are often hard to infer. This challenge emerges as a result of deficient DNA repair processes acting by modifying the outcome of other mutagens. Thus, they exhibit non-additive effects that are not depicted by the current paradigm for modeling mutational processes as independent signatures. To close this gap, we present RepairSig, a method that accounts for interactions between DNA damage and repair and is able to uncover unbiased signatures of deficient DNA repair processes. In particular, RepairSig was able to replace three MMR deficiency signatures previously proposed to be active in breast cancer, with just one signature strikingly similar to the experimentally derived signature. As the first method to model interactions between mutagenic processes, RepairSig is an important step towards biologically more realistic modeling of mutational processes in cancer. The source code for RepairSig is publicly available at https://github.com/ncbi/RepairSig. Copy rights belong to original authors. Visit the link for more info
  • A probabilistic model for indel evolution: differentiating insertions from deletions

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.22.393108v1?rss=1 Authors: Loewenthal, G., Rapoport, D., Avram, O., Moshe, A., Itzkovitch, A., Israeli, O., Azouri, D., Cartwright, R. A., Mayrose, I., Pupko, T. Abstract: Insertions and deletions (indels) are common molecular evolutionary events. However, probabilistic models for indel evolution are under-developed due to their computational complexity. Here we introduce several improvements to indel modeling: (1) while previous models for indel evolution assumed that the rates and length distributions of insertions and deletions are equal, here, we propose a richer model that explicitly distinguishes between the two; (2) We introduce numerous summary statistics that allow Approximate Bayesian Computation (ABC) based parameter estimation; (3) We develop a neural-network model-selection scheme to test whether the richer model better fits biological data compared to the simpler model. Our analyses suggest that both our inference scheme and the model-selection procedure achieve high accuracy on simulated data. We further demonstrate that our proposed indel model better fits a large number of empirical datasets and that, for the majority of these datasets, the deletion rate is higher than the insertion rate. Finally, we demonstrate that indel rates are negatively correlated to the effective population size across various phylogenomic clades. Copy rights belong to original authors. Visit the link for more info
  • Inferring the spatial code of cell-cell interactions and communication across a whole animal body

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.22.392217v1?rss=1 Authors: Armingol, E., Joshi, C. J., Baghdassarian, H., Shamie, I., Ghaddar, A., Chan, J., Her, H.-L., O'Rourke, E. J., Lewis, N. E. Abstract: Cell-cell interactions are crucial for multicellular organisms as they shape cellular function and ultimately organismal phenotype. However, the spatial code embedded in the molecular interactions that drive and sustain spatial organization, and in the organization that in turn drives intercellular interactions across a living animal, remains to be elucidated. Here we use the expression of ligand-receptor pairs obtained from a whole-body single-cell transcriptome of Caenorhabditis elegans larvae to compute the potential for intercellular interactions through a Bray-Curtis-like metric. Leveraging a 3D atlas of C. elegans' cells, we implement a genetic algorithm to select the ligand-receptor pairs most informative of the spatial organization of cells. Validating the strategy, the selected ligand-receptor pairs are involved in known cell-migration and morphogenesis processes and we confirm a negative correlation between cell-cell distances and interactions. Thus, our computational framework helps identify cell-cell interactions and their relationship with intercellular distances, and decipher molecular bases encoding spatial information in a whole animal. Furthermore, it can also be used to elucidate associations with any other intercellular phenotype and applied to other multicellular organisms. Copy rights belong to original authors. Visit the link for more info
  • Genome Wide Association Studies on 7 Yield-related Traits of 183 Rice Varieties in Bangladesh

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.22.393074v1?rss=1 Authors: Roy, N., Kabir, A. H., Zahan, N., Mouna, S. T., Chakravarty, S., Rahman, A. H., Bayzid, M. S. Abstract: Rice genetic diversity is regulated by multiple genes and is largely dependent on various environmental factors. Uncovering the genetic variations associated with the diversity in rice populations is the key to breed stable and high yielding rice varieties. We performed Genome Wide Association Studies (GWAS) on 7 rice yielding traits (grain length, grain width, grain weight, panicle length, leaf length, leaf width, and leaf angle) based on 39,40,165 single nucleotide polymorphisms (SNPs) in a population of 183 rice landraces of Bangladesh. Our studies reveal various chromosomal regions that are significantly associated with different traits in Bangladeshi rice varieties. We also identified various candidate genes, which are associated with these traits. This study reveals multiple candidate genes within short intervals. We also identified SNP loci, which are significantly associated with multiple yield-related traits. The results of these association studies support previous findings as well as provide additional insights into the genetic diversity of rice. This is the first known GWAS study on various yield-related traits in the varieties of Oryza sativa available in Bangladesh, the fourth largest rice producing country. We believe this study will accelerate rice genetics research and breeding stable high-yielding rice in Bangladesh. Copy rights belong to original authors. Visit the link for more info
  • Identification of cell-type marker genes from plant single-cell RNA-seq data using machine learning

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.22.393165v1?rss=1 Authors: Yan, H., Song, Q., Lee, J., Schiefelbein, J., Li, S. Abstract: An essential step of single-cell RNA sequencing analysis is to classify specific cell types with marker genes in order to dissect the biological functions of each individual cell. In this study, we integrated five published scRNA-seq datasets from the Arabidopsis root containing over 25,000 cells and 17 cell clusters. We have compared the performance of seven machine learning methods in classifying these cell types, and determined that the random forest and support vector machine methods performed best. Using feature selection with these two methods and a correlation method, we have identified 600 new marker genes for 10 root cell types, and more than 70% of these machine learning-derived marker genes were not identified before. We found that these new markers not only can assign cell types consistently as the previously known cell markers, but also performed better than existing markers in several evaluation metrics including accuracy and sensitivity. Markers derived by the random forest method, in particular, were expressed in 89-98% of cells in endodermis, trichoblast, and cortex clusters, which is a 29-67% improvement over known markers. Finally, we have found 111 new orthologous marker genes for the trichoblast in five plant species, which expands the number of marker genes by 58-170% in non-Arabidopsis plants. Our results represent a new approach to identify cell-type marker genes from scRNA-seq data and pave the way for cross-species mapping of scRNA-seq data in plants. Copy rights belong to original authors. Visit the link for more info
  • Clipper: p-value-free FDR control on high-throughput data from two conditions

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.19.390773v1?rss=1 Authors: Ge, X., Chen, Y. E., Song, D., McDermott, M., Woyshner, K., Manousopoulou, A., Wang, L. D., Li, W., Li, J. J. Abstract: High-throughput biological data analysis commonly involves the identification of "interesting" features (e.g., genes, genomic regions, and proteins), whose values differ between two conditions, from numerous features measured simultaneously. To ensure the reliability of such analysis, the most widely-used criterion is the false discovery rate (FDR), the expected proportion of uninteresting features among the identified ones. Existing bioinformatics tools primarily control the FDR based on p-values. However, obtaining valid p-values relies on either reasonable assumptions of data distribution or large numbers of replicates under both conditions, two requirements that are often unmet in biological studies. To address this issue, we propose Clipper, a general statistical framework for FDR control without relying on p-values or specific data distributions. Clipper is applicable to identifying both enriched and differential features from high-throughput biological data of diverse types. In comprehensive simulation and real-data benchmarking, Clipper outperforms existing generic FDR control methods and specific bioinformatics tools designed for various tasks, including peak calling from ChIP-seq data, differentially expressed gene identification from RNA-seq data, differentially interacting chromatin region identification from Hi-C data, and peptide identification from mass spectrometry data. Notably, our benchmarking results for peptide identification are based on the first mass spectrometry data standard that has a realistic dynamic range. Our results demonstrate Clipper's flexibility and reliability for FDR control, as well as its broad applications in high-throughput data analysis. Copy rights belong to original authors. Visit the link for more info
  • MetENP/MetENPWeb: An R package and web application for metabolomics enrichment and pathway analysis in Metabolomics Workbench

    Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.20.391912v1?rss=1 Authors: Choudhary, K. S., Fahy, E., Coakley, K., Sud, M., Maurya, M. R., Subramaniam, S. Abstract: With the advent of high throughput mass spectrometric methods, metabolomics has emerged as an essential area of research in biomedicine with the potential to provide deep biological insights into normal and diseased functions in physiology. However, to achieve the potential offered by metabolomics measures, there is a need for biologist-friendly integrative analysis tools that can transform data into mechanisms that relate to phenotypes. Here, we describe MetENP, an R package, and a user-friendly web application deployed at the Metabolomics Workbench site extending the metabolomics enrichment analysis to include species-specific pathway analysis, pathway enrichment scores, gene-enzyme information, and enzymatic activities of the significantly altered metabolites. MetENP provides a highly customizable workflow through various user-specified options and includes support for all metabolite species with available KEGG pathways. MetENPweb is a web application for calculating metabolite and pathway enrichment analysis. Copy rights belong to original authors. Visit the link for more info