BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//132.216.98.100//NONSGML kigkonsult.se iCalcreator 2.20.4//
BEGIN:VEVENT
UID:20250712T230040EDT-1265l5WA5t@132.216.98.100
DTSTAMP:20250713T030040Z
DESCRIPTION:Matrix completion in genetic methylation studies: LMCC\, a Linear Model of Coregionalization with informative Covariates\n\nAbstract:\n\nDNA methylation is an important epigenetic mark that modulates gene expression through the inhibition of transcriptional proteins binding to DNA. As in many other omics experiments\, missing values are an issue\, and appropriate imputation techniques are important to avoid an unnecessary sample size reduction as well as to optimally leverage the information collected. We consider the case where a relatively small number of samples are processed via an expensive high-density Whole Genome Bisulfite Sequencing (WGBS) strategy and a larger number of samples are processed using more affordable low-density array-based technologies. In such cases\, one can impute/complete the data matrix of the low-coverage (array-based) methylation data using the high-density information provided by the WGBS samples. In this work\, we propose an efficient Linear Model of Coregionalization with informative Covariates (LMCC) to predict missing values based on observed values and informative covariates. Our model assumes that at each genomic position\, the methylation vector of all samples is linked to a set of fixed factors (covariates) and a set of latent factors. Furthermore\, we exploit the functional nature of the data and the spatial correlation across positions/sites by assuming Gaussian processes on the fixed and latent coefficient vectors\, respectively. Our simulations show that the use of covariates can significantly improve the accuracy of imputed values\, especially in cases where missing data contain some relevant information about the explanatory variable. We also show that the proposed model is efficient when the number of columns is much greater than the number of rows in the data matrix\, which is usually the case in methylation data analysis. Finally\, we apply the proposed method and compare it with alternative approaches to complete a matrix of DNA methylation containing 15 rows (methylation samples) and 1 million columns (sites). Joint work with Melina Ribaud and Aurelie Labbe (HEC\, Montreal).\n\nSpeaker\n\nKarim Oualkacha is a professor in the Department of Mathematics at Université du Québec à Montréal (UQAM). He received a BSc in Mathematics and an MSc in Statistics and Operational Research from Université Cadi Ayyad (Marrakech\, Morocco)\, and an MSc and a PhD in Statistics from Université Laval (Quebec City). His research interests focus on sparse multivariate statistical methods for high-dimensional data and modelling of multidimensional dependencies based on copulas\, with applications in statistical genetics.\n\nhttps://mcgill.zoom.us/j/82678428848\n\nMeeting ID: 826 7842 8848\n\nPasscode: None
DTSTART:20240216T203000Z
DTEND:20240216T213000Z
LOCATION:Room 1104\, Burnside Hall\, CA\, QC\, Montreal\, H3A 0B9\, 805 rue Sherbrooke Ouest
SUMMARY:Karim Oualkacha (UQAM)
URL:/mathstat/channels/event/karim-oualkacha-uqam-355350
END:VEVENT
END:VCALENDAR