BACKGROUND: Estimating the time since HIV infection (TSI) at the population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give only a binary classification of infections as recent or non-recent within a window of several months, and so cannot assess the cumulative impact of an intervention. RESULTS: We developed a Random Forest Regression model, HIV-phyloTSI, which combines measures of within-host diversity and divergence to generate continuous TSI estimates directly from viral deep-sequencing data, with no need for additional variables. HIV-phyloTSI provides a continuous measure of TSI up to 9 years, with a mean absolute error of less than 12 months overall and less than 5 months for infections with a TSI of up to a year. Based on data from African and European cohorts, it performs equally well for all major HIV subtypes. CONCLUSIONS: We demonstrate how HIV-phyloTSI can be used for incidence estimates at the population level.
Journal article
2025-08-14T00:00:00+00:00
26
HIV, Next-generation sequencing, Random forest, Recency of infection, Time since infection, HIV Infections, Humans, HIV-1, Incidence, High-Throughput Nucleotide Sequencing, Cross-Sectional Studies, Phylogeny