The conference programme is available as a PDF.
All seqBIM 2021 abstracts: Actes_seqBIM_2021.pdf
This year, we welcome 2 invited speakers:
- Paola Bonizzoni (AlgoLab, University of Milan, Italy)
- Title: Back and forth in pangenomics: data structures for querying large collections of sequence data
- Abstract:
Genome data are being produced at a speed, driven by advances in sequencing technologies, that far outpaces the slow progress in developing new methods for analyzing multiple related genomes. Most recent advances in the field are still based on notions rooted in the established and fairly old literature on combinatorics on words and space-efficient data structures.
In this talk we will go back and forth through the state of the art, with the goal of analyzing query operations and data structures that may help in managing and analyzing multiple genomes: the theoretical foundations of computational pangenomics.
- Thierry Lecroq (TIBS team, LITIS, Rouen)
- Title: Cartesian Pattern Matching
- Abstract:
Cartesian trees are associated with strings of numbers. They are structured as heaps, and the original strings can be recovered by a symmetric (in-order) traversal of the trees. Let x be a string of numbers of length m. The Cartesian tree of x is the binary tree where:
- the root corresponds to the index i of the minimal element of x (if there are several occurrences of the minimal element, the leftmost one is chosen);
- the left subtree of the root corresponds to the Cartesian tree of x[1..i-1];
- the right subtree of the root corresponds to the Cartesian tree of x[i+1..m].
Cartesian pattern matching can be applied to find patterns in time series data, with problems such as:
- given a text and a pattern that consist of sequences of numbers, find all the substrings of the text that have the same Cartesian tree as the pattern;
- given a text and a finite set of patterns that consist of sequences of numbers, find all the substrings of the text that have the same Cartesian tree as one of the patterns;
- given two strings that consist of sequences of numbers, find the length of the longest substrings of the two strings that have the same Cartesian tree.
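The definition above and the first matching problem can be illustrated with a minimal Python sketch. It is not the method presented in the talk: it builds each Cartesian tree with a standard stack-based construction and then compares every window of the text against the pattern, which takes time proportional to the text length times the pattern length. The function names `cartesian_tree_parents` and `cartesian_match` are chosen here for illustration, and indices are 0-based, whereas the abstract uses 1-based positions.

```python
from typing import List, Sequence


def cartesian_tree_parents(x: Sequence[float]) -> List[int]:
    """Return the parent index of each position in the Cartesian tree of x.

    The root has parent -1. Ties are broken so that the leftmost
    minimum becomes the root, as in the definition above.
    """
    parent = [-1] * len(x)
    stack: List[int] = []  # indices on the rightmost path, values non-decreasing
    for i, value in enumerate(x):
        last = -1
        # Pop strictly larger elements: they end up in the left subtree of i.
        while stack and x[stack[-1]] > value:
            last = stack.pop()
        if last != -1:
            parent[last] = i
        if stack:
            parent[i] = stack[-1]
        stack.append(i)
    return parent


def cartesian_match(text: Sequence[float], pattern: Sequence[float]) -> List[int]:
    """Naive search: 0-based start positions of the substrings of text
    whose Cartesian tree equals that of pattern."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    target = cartesian_tree_parents(pattern)
    return [
        start
        for start in range(len(text) - m + 1)
        if cartesian_tree_parents(text[start:start + m]) == target
    ]


if __name__ == "__main__":
    print(cartesian_tree_parents([3, 1, 4, 1, 5]))
    print(cartesian_match([5, 3, 8, 6, 2, 9, 7], [2, 1, 3]))
```

Comparing parent arrays is enough here because, for a fixed length, the parent of each position determines the tree: a child is a left child when its index is smaller than its parent's and a right child otherwise.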
List of accepted talks
- Estéban Gabory and Laurent Bulteau. Parametrized algorithms for consensus problems with swaps
- Albane Lysiak, Guillaume Fertin, Géraldine Jean and Dominique Tessier. SpecGlob: a new Dynamic Programming Algorithm to interpret Mass Spectra
- R. Charbel Maroun and Georges Khazen. PPIMem, a novel approach for predicting transmembrane protein-protein complexes
- Yoshihiro Shibuya, Djamal Belazzougui and Gregory Kucherov. Space-efficient representation of genomic k-mer count tables
- Clara Delahaye and Jacques Nicolas. Answer Set Programming based haplotype phasing of long read for di-polyploid species
- Grigorii Sukhorukov and Macha Nikolski. VirHunter: a deep learning-based method for detection of novel viruses in plant sequencing data
- Marie Mille, Julie Ripoll, Bastien Cazaux and Eric Rivals. Algorithms for searching dinucleotidic Position Weight Matrices (di-PWM)
- Roland Faure, Nadege Guiglielmoni and Jean-François Flot. GraphUnzip: unzipping assembly graphs with long reads and Hi-C
- Lucas Robidou and Pierre Peterlongo. On the fly reduction of Bloom filter false positives
- Téo Lemane, Rayan Chikhi and Pierre Peterlongo. Advances in k-mer matrix construction for analysis of large sequencing collections
- Victor Epain, Rumen Andonov, Jean-François Gibrat and Dominique Lavenier. The advantage of DNA reads overlaps’ reverse complement symmetry for their storage in an oriented graph
- Michael Sheinman, Ksenia Arkhipova, Peter Arndt, Bas Dutilh, Rutger Hermsen and Florian Massip. Identical sequences found in distant genomes reveal frequent horizontal transfer across the bacterial domain