seqBIM 2022 workshop program

The program is online:

Invited speakers

This year, we welcome two invited speakers:

Brona Brejova (Computational Biology, Comenius University in Bratislava, Slovakia)

Title: Two Probabilistic Models in Genomics

Abstract: In the first part of the talk, I will discuss probabilistic models of k-mer abundance in sequencing reads. Many successful tools in bioinformatics are based on working with k-mers, substrings of length k of the input sequences. Models of k-mer abundance capture how the abundance depends on various phenomena, such as the size and repeat content of the genome, heterozygosity levels, and the sequencing error rate. This in turn allows us to estimate these properties from k-mer abundance histograms observed in real data.

In the second part, I will talk about our work on estimating the statistical significance of overlaps between two genomic annotations. Genome annotations are a common way to represent genomic features such as genes, regulatory elements or epigenetic modifications. The amount of overlap between two annotations is often used to assess whether there is an underlying biological connection between them. We provide efficient algorithms for estimating statistical significance when the null hypothesis is formulated using Markov chains.

Joint work with Askar Gafurov, Michal Hozza, Paul Medvedev and Tomas Vinar.

Rayan Chikhi (Sequence Bioinformatics, Institut Pasteur, Paris)

Title: The tumultuous fate of sequence bioinformatics ideas

Abstract: In this keynote talk, I will give a behind-the-scenes view of a project that led to the development of minimizer-space de Bruijn graphs (MDBG). MDBGs are an adaptation of classical de Bruijn graphs, using a tokenized alphabet, for performing efficient genome assembly of PacBio HiFi reads. After briefly presenting the scientific concept and results, I will explain how this project was conducted. It will illustrate a common disconnect between how projects are presented at conferences and how they are really carried out in the lab. In particular, this project was abandoned for nearly a year and subsequently pivoted from its original goal. I will then expand the scope towards what I consider to be “tumultuous” ideas in bioinformatics, with a focus on the story of the WaveFront Alignment algorithm (WFA), discovered in 2021, but rooted in forgotten alignment theory dating back to 1983.

List of accepted talks

Pengfei Wang, Eric Rivals and Michelle Sweering. Combinatorics of period sets
Roland Faure, Jean-François Flot and Dominique Lavenier. Hairsplitter: assembling an unknown number of haplotypes
Nikolai Romashchenko, Benjamin Linard, Fabio Pardi and Eric Rivals. Mutual Information-based feature selection of phylo-k-mers
Emile Benoist, Guillaume Fertin, Géraldine Jean and Dominique Tessier. Un modèle intégratif pour le problème d’inférence de protéines
Sandra Romain and Claire Lemaitre. SV Jedi-graph: using a variation graph to improve structural variant genotyping with long reads
Lucas Robidou and Pierre Peterlongo. fimpera: drastic improvement of Approximate Membership Query data-structures with counts
Guillaume Blin, Alexandru Popa, Mathieu Raffinot and Raluca Uricaru. Algorithmic results for the approximate cover problem
Guillaume Rizk and Jennifer Del Giudice. Design considerations and methodology of .ORA format to achieve efficient lossless up to 5X genomic compression
Léa Vandamme, Antoine Limasset and Bastien Cazaux. Kmer2Reads: an associative index for Third Generation Sequencing data
Théo Boury, Laurent Bulteau, Bertrand Marchand and Yann Ponty. Parameterized algorithms for the RNA Energy Barrier problem
Kristoffer Sahlin, Thomas Baudeau, Bastien Cazaux and Camille Marchet. A survey of mapping algorithms in the long-reads era
Timothé Rouzé, Camille Marchet and Antoine Limasset. Memory efficient subsampling strategy for large scale analysis of sequencing data
Nikolai Romashchenko, Benjamin Linard, Fabio Pardi and Eric Rivals. RAPPAS2: Efficient and accurate alignment-free phylogenetic placement