PhD Thesis: Understanding Phylogenetic Conflicts (M/F) - 34
Job description
Doctorat.Gouv.Fr - Montpellier (34) - CDD (fixed-term contract)
Published on 1 April 2026
Institution: Université de Montpellier
Doctoral school: GAIA - Biodiversité, Agriculture, Alimentation, Environnement, Terre, Eau
Research laboratory: ISEM - Institut des Sciences de l'Evolution - Montpellier
Thesis supervisor: Nicolas GALTIER
Thesis start date: 2026-10-01
Application deadline: 2026-05-07T23:59:59
Phylogenies reconstructed from different segments of the genome frequently differ, a situation known as phylogenetic conflict. Some of these conflicts are due to reconstruction errors, but others reflect a biological reality, in particular genetic exchange between species or lineages that had previously diverged, i.e., gene flow. The first objective of this project is to evaluate the performance of existing methods for detecting gene flow from discordant gene trees, taking into account reconstruction errors, genome sampling biases, the confounding effect of incomplete lineage sorting, and the potential contribution of unsampled ("ghost") lineages. On this basis, existing methods will be combined and optimised to quantify the contribution of ancient gene flow to phylogenetic conflicts between genes in various plant and animal taxa.
Phylogenetic Conflicts
Phylogenetic trees are a key object in evolutionary biology. They are classically seen as the representation of species divergence history, and are thought to be best recovered via molecular data analysis. Decades of molecular phylogenetics research, however, have revealed that distinct genes often yield distinct trees - a situation called incongruence, or conflict. For a long time, phylogenetic conflicts were considered to be mainly due to reconstruction errors, i.e., a problem to solve. Scientists are now realizing that conflicts often reflect a biological reality, i.e., an opportunity to learn about evolution (Scornavacca, Delsuc & Galtier 2020, section 3).
Causes of conflicts
Incomplete lineage sorting (ILS) and gene flow (=introgression, GF) are the two documented biological processes able to induce phylogenetic conflicts. They are quite different in essence: ILS is determined by drift, whereas GF reflects migration and possibly selection. Both are expected to apply if internal branches are sufficiently short, such that ancestral polymorphism survives two speciation events (ILS) or non-sister species exchange genes (GF). Determining how prevalent these two processes are in shaping gene trees is a major goal of current phylogenomics.
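As a point of reference for the ILS side, the textbook three-species result under the multispecies coalescent without gene flow can be written as below. The notation is introduced here for illustration and does not come from the project description: the species tree is assumed to be ((A,B),C), $t$ is the length of the internal branch in generations, $N_e$ is the (assumed constant, diploid) effective size of the ancestral population of A and B, and $T$ is the same branch length in coalescent units.

```latex
\begin{align}
P\big(((A,B),C)\big) &= 1 - \tfrac{2}{3}\, e^{-T}, \\
P\big(((A,C),B)\big) = P\big(((B,C),A)\big) &= \tfrac{1}{3}\, e^{-T},
\qquad T = \frac{t}{2N_e}.
\end{align}
```

Under these assumptions the two discordant topologies are equally likely, and all three topologies approach a frequency of 1/3 as $T$ tends to 0, which is the sense in which short internal branches favour conflict.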
GF and ILS leave distinct signals regarding the expected topologies (Durand et al. 2011) and branch lengths (Galtier 2024) of discordant gene trees. A number of methods have been developed to distinguish between the two processes. These methods have not yet been thoroughly assessed or compared. Some are computationally demanding (CoalHMM, BPP), others are limited to 3 taxa (ABBA-BABA, QuIBL, Aphid). Recently, a confounding factor was identified: gene flow from so-called «ghost» lineages (i.e., extinct or unsampled lineages). Ghost GF is expected to generate potentially asymmetric numbers of the two discordant topologies (like regular GF) and long terminal branches (like ILS). Two articles suggest that the importance of this process has so far been underestimated (Tricou et al. 2022a,b). This is to be confirmed and quantified. Finally, the recent literature has often attributed all the observed conflict to either GF or ILS, denying the existence of reconstruction errors. This is probably too extreme a viewpoint.
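The topology-count signal mentioned above is commonly summarised by the D statistic of the ABBA-BABA test (Durand et al. 2011), which contrasts the counts of the two discordant site patterns. The following is a minimal sketch of that idea, not the procedure the project will use. The function names and the counts in the usage note are hypothetical. The z-test treats sites as independent, which is an assumption made here for brevity; published analyses typically rely on a block jackknife to account for linkage.

```python
from math import erfc, sqrt


def d_statistic(n_abba: int, n_baba: int) -> float:
    """Patterson's D: normalised excess of ABBA over BABA patterns."""
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("at least one ABBA or BABA pattern is required")
    return (n_abba - n_baba) / total


def asymmetry_z_test(n_abba: int, n_baba: int) -> tuple[float, float]:
    """Two-sided normal-approximation test of ABBA == BABA.

    ASSUMPTION: patterns are independent draws, so under the null
    n_abba ~ Binomial(total, 0.5) and Var(n_abba - n_baba) = total.
    """
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("at least one ABBA or BABA pattern is required")
    z = (n_abba - n_baba) / sqrt(total)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value
```

For example, `d_statistic(60, 40)` gives 0.2 with hypothetical counts of 60 ABBA and 40 BABA patterns. A D close to 0 is what ILS alone predicts, whereas a significant departure from 0 is compatible with gene flow, including gene flow from a ghost lineage.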
Data analysis challenges
In the era of high-throughput sequencing, scientists typically collect and compare genome-wide data across closely related species - a situation where GF and ILS are likely to play a role. Analysing such data is challenging. Genomes are made of contiguous non-recombining segments separated by recombination breakpoints. Each segment has its own genealogy, neighbouring genealogies being correlated. This genome-wide coalescence process is described by the multi-species version of the Ancestral Recombination Graph - a particularly complex conceptual object, even in the absence of GF. The combination of large data sets and complex biological processes makes inference particularly challenging, hence the need for smart algorithms - for scalability and CO2 emission reasons.
Importantly, the user does not know where recombination events have really happened. Defining the segments to use for gene tree building is therefore not straightforward. Published studies typically apply arbitrary ways of subsampling the genome. What these arbitrary choices entail in terms of inference is currently unclear. Of particular concern would be the occurrence of recombination within the selected segments. What are the consequences of undetected recombination on the inference of gene trees, species trees, their branch lengths, and GF/ILS prevalence? These questions have only rarely been addressed so far.
Goals
Phylogenetic conflict analysis is a complex field, in which no methodological standard has emerged yet. Many applied papers have been published, most of which ignore parts of the problem - for instance, CoalHMM-based papers neglect GF, ABBA-BABA-based papers neglect symmetrical GF, Aphid/QuIBL-based papers neglect ghost GF, and most of these neglect recombination. It is high time to clarify which of these simplifications are plausible, which are problematic, and under which conditions.
Hence the objectives of this thesis:
· Assess the performance of existing methods for analysing conflicting gene trees, and particularly for detecting gene flow in the face of reconstruction errors and incomplete lineage sorting, with a focus on so-called «ghost» lineages and recombination
· Improve/optimize gene flow detection methods and propose a strategy of phylogenetic conflict analysis for multi-species genome-wide data sets
· Revisit the analysis and interpretation of phylogenetic conflicts in existing data sets in plants and animals
Use the multi-species coalescent model to simulate phylogenomic datasets with gene flow accounting for ghost lineages and recombination.
Test the performance of existing approaches (ABBA-BABA, QuIBL, Aphid, BPP) using gene trees reconstructed from correct (=non-recombining) or incorrect (=recombining) genome segments, and particularly their ability to accurately infer gene flow.
Deduce the optimal strategy for analysing phylogenetic conflicts and assessing gene flow prevalence by combining/adapting/improving existing approaches, and addressing the n>3 and computational scalability challenges.
Revisit existing data sets in which the prevalence of gene flow has been discussed; this includes plants (Glémin et al. 2019), Anopheles mosquitoes (Fontaine et al. 2015), and mammals (Foley et al. 2026), including primates (Vanderpool et al. 2020).
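The simulation step listed above can be illustrated in its simplest possible form. The sketch below draws gene tree topologies for three species under the multispecies coalescent with ILS only. It is a toy example under stated assumptions, not the project's simulation framework: the species tree is fixed to ((A,B),C), one lineage is sampled per species, branch lengths are in coalescent units, and gene flow, ghost lineages and recombination are all left out. The function names and parameter values are hypothetical.

```python
import random
from collections import Counter

TOPOLOGIES = ["((A,B),C)", "((A,C),B)", "((B,C),A)"]


def simulate_topology(t_internal: float, rng: random.Random) -> str:
    """Draw one gene tree topology for species tree ((A,B),C).

    ASSUMPTION: ILS only. The A and B lineages coalesce at rate 1 (in
    coalescent units) inside their ancestral population of length t_internal.
    """
    if rng.expovariate(1.0) < t_internal:
        # A and B coalesce before reaching the root population: concordant tree.
        return TOPOLOGIES[0]
    # Otherwise three lineages enter the root population and the first pair
    # to coalesce is chosen uniformly at random among the three pairs.
    return rng.choice(TOPOLOGIES)


def simulate_counts(t_internal: float, n_loci: int, seed: int = 0) -> Counter:
    """Count gene tree topologies over n_loci independent loci."""
    rng = random.Random(seed)
    return Counter(simulate_topology(t_internal, rng) for _ in range(n_loci))
```

Calling `simulate_counts(1.0, 20000)` returns a `Counter` of the three topologies. With ILS alone the two discordant topologies should appear in roughly equal numbers, so a simulator extended with gene flow or ghost lineages could be compared against this baseline.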