Organizer: Rodolphe Garbit
Upcoming seminars
We investigate a general class of stochastic gradient descent (SGD) algorithms, called conditioned SGD, based on a preconditioning of the gradient direction. Under mild assumptions, we establish almost sure convergence and asymptotic normality for a broad class of conditioning matrices. In particular, when the conditioning matrix is an estimate of the inverse Hessian at the optimal point, the algorithm is proved to be asymptotically optimal. The benefits of this approach are validated on simulated and real datasets.
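As a toy illustration of the preconditioning idea, here is a minimal sketch on a noisy quadratic objective; the matrix `A`, the step sizes `1/t`, and the choice of the inverse Hessian as conditioning matrix `C` are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[10.0, 0.0], [0.0, 1.0]])   # ill-conditioned Hessian
w_star = np.array([1.0, -2.0])            # minimizer of the quadratic

def noisy_grad(w):
    # true gradient A (w - w*) plus Gaussian noise
    return A @ (w - w_star) + 0.1 * rng.standard_normal(2)

C = np.linalg.inv(A)                      # conditioning matrix: inverse Hessian
w = np.zeros(2)
for t in range(1, 2001):
    # conditioned SGD step with decreasing step size 1/t
    w -= (1.0 / t) * (C @ noisy_grad(w))

print(w)  # should be close to w_star
```

With `C` equal to the inverse Hessian, the preconditioned update rescales all directions equally, which is the mechanism behind the asymptotic optimality claimed in the abstract.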
Past seminars
Risk management in finance and insurance is often based on the computation of a single quantile (or Value-at-Risk). One drawback of quantiles is that they only measure the frequency of an extreme event and give no information about the impact of such an event. In insurance, another drawback is that quantiles do not define a coherent risk measure. In this talk, I will explain how, starting from the formulation of the quantile as the solution of an optimization problem, one can construct two alternative families of risk measures, called expectiles and extremiles. I will give an overview of their properties, as well as some results on their estimation at extreme levels for heavy-tailed distributions, and I will also explain, through a few applications to real data, why these measures are reasonable complements to quantiles in risk management. This talk is based on results obtained in collaboration with Abdelaati Daouia, Irène Gijbels, Stéphane Girard and Antoine Usseglio-Carleve.
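The optimization formulation mentioned above can be made concrete numerically: the level-t quantile minimizes an asymmetrically weighted absolute loss, and replacing the absolute loss by a squared loss yields the expectile. A small sketch on simulated Gaussian data; the grid search and sample size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.standard_normal(10_000)
t = 0.9
grid = np.linspace(0.0, 3.0, 3001)

def quantile_loss(c):
    # asymmetric absolute ("pinball") loss: its minimizer is the t-quantile
    return np.mean(np.abs(t - (x < c)) * np.abs(x - c))

def expectile_loss(c):
    # asymmetric squared loss: its minimizer is the t-expectile
    return np.mean(np.abs(t - (x < c)) * (x - c) ** 2)

q = grid[np.argmin([quantile_loss(c) for c in grid])]
e = grid[np.argmin([expectile_loss(c) for c in grid])]
print(q, e)  # q should be close to the standard normal 0.9-quantile 1.2816
```

For the normal distribution at level 0.9, the expectile is less extreme than the quantile, which is one reason expectiles behave differently as risk measures.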
We obtain non-asymptotic Gaussian concentration bounds for the difference between the invariant measure ν of an ergodic Brownian diffusion and the empirical measure of a decreasing-step approximation scheme, evaluated along an admissible class of test functions f such that f − ν(f) is a coboundary of the infinitesimal generator. We show that these bounds can be improved when the squares of certain norms of the diffusion coefficient also belong to this class. From these bounds we derive explicitly computable non-asymptotic confidence intervals for the approximation scheme. As a theoretical application, we also obtain non-asymptotic deviation bounds for the almost sure central limit theorem. (Joint work with I. Honoré and G. Pagès.)
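A minimal sketch of the kind of decreasing-step scheme involved, under illustrative assumptions: a 1-D Ornstein-Uhlenbeck diffusion dX = -X dt + √2 dW (invariant law N(0, 1)), steps γ_k = k^(-1/2), and the test function f(x) = x².

```python
import numpy as np

rng = np.random.default_rng(9)
n = 100_000
gamma = np.arange(1, n + 1) ** -0.5            # decreasing steps gamma_k = k^(-1/2)
xi = rng.standard_normal(n)

x = np.empty(n + 1)
x[0] = 0.0
for k in range(n):
    # one Euler step for dX = -X dt + sqrt(2) dW with step gamma_k
    x[k + 1] = x[k] - x[k] * gamma[k] + np.sqrt(2 * gamma[k]) * xi[k]

# weighted empirical measure: nu_n(f) = sum_k gamma_k f(X_k) / sum_k gamma_k
f_vals = x[1:] ** 2
nu_f = np.sum(gamma * f_vals) / np.sum(gamma)
print(nu_f)   # should approach nu(f) = E[X^2] = 1 for the N(0, 1) invariant law
```

The concentration bounds of the talk quantify, non-asymptotically, how fast such weighted empirical averages concentrate around ν(f).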
In this talk I present an L0 penalization method called the adaptive ridge (first introduced by Frommlet and Nuel, 2016) in a survival analysis context. This method is shown to be of interest for the estimation of the piecewise constant hazard model. Starting from a large grid, the number and locations of the cuts of the hazard function can be determined using this L0 penalization method by forcing similar adjacent hazard values to be equal. Two extensions of this method to survival data are presented. First, in the age-cohort-period setting, the hazard function is considered as a bi-dimensional function. Since the number of parameters can be quite large compared to the sample size, the L0 penalization method is used to avoid overfitting. The method is illustrated on the SEER data for survival after breast cancer: the adaptive ridge technique produces a parsimonious representation of the risk of death after breast cancer as a bi-dimensional function of the date of diagnosis and the time elapsed since cancer onset. Second, for interval-censored data, a Cox model with a piecewise constant baseline hazard is introduced. The adaptive ridge procedure is applied to the baseline function, resulting in a flexible regression model. The method is illustrated on a dental dataset.
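A hedged sketch of the adaptive-ridge idea: the L0 penalty on adjacent differences is approximated by an iteratively reweighted ridge penalty, forcing similar adjacent values to become equal. This toy version works on a noisy piecewise-constant signal rather than hazard rates, and all tuning values (`lam`, `eps`, the jump threshold) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
beta_true = np.repeat([0.0, 2.0], 50)          # one true cut in the middle
y = beta_true + 0.3 * rng.standard_normal(100)

n, lam, eps = len(y), 5.0, 1e-4
D = np.diff(np.eye(n), axis=0)                 # first-difference operator
w = np.ones(n - 1)                             # adaptive weights on the jumps
for _ in range(50):
    # weighted ridge step: minimize ||y - beta||^2 + lam * sum_j w_j (diff beta)_j^2
    beta = np.linalg.solve(np.eye(n) + lam * D.T @ (w[:, None] * D), y)
    # reweighting: w_j * (diff beta)_j^2 approximates the L0 count of jumps
    w = 1.0 / (np.diff(beta) ** 2 + eps ** 2)

jumps = np.flatnonzero(np.abs(np.diff(beta)) > 0.5)
print(jumps)                                   # detected cut location(s)
```

At convergence, each surviving jump contributes roughly a constant penalty, mimicking an L0 cost per cut, while within-segment differences are driven to zero.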
Considering a Poisson process observed on a bounded, fixed interval, we are interested in the problem of detecting an abrupt change in its distribution, characterized by a jump in its intensity. Formulated as an off-line change-point problem, we address two distinct questions: detecting a change-point, and estimating the jump location once a change-point has been detected. This study proposes a non-asymptotic minimax testing set-up, first to construct a minimax and adaptive detection procedure, and then to give a minimax study of a multiple testing procedure designed for change-point localisation.
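For intuition, here is a toy off-line localization of an intensity jump: simulate a Poisson process on [0, 1] whose intensity jumps at a point `tau`, then scan candidate change-points and maximize the profile Poisson log-likelihood. This is a plain maximum-likelihood sketch, not the minimax testing procedure of the talk; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
lam0, lam1, tau = 50.0, 150.0, 0.6   # intensity jumps from 50 to 150 at 0.6

# simulate the Poisson process on [0, 1] with one intensity jump
n0 = rng.poisson(lam0 * tau)
n1 = rng.poisson(lam1 * (1 - tau))
events = np.sort(np.concatenate([rng.uniform(0, tau, n0),
                                 rng.uniform(tau, 1, n1)]))

def profile_loglik(t):
    # Poisson log-likelihood with both intensities replaced by their MLEs
    k = int(np.searchsorted(events, t))   # number of events before the cut
    n = len(events)
    out = 0.0
    if k:
        out += k * np.log(k / t) - k
    if n - k:
        out += (n - k) * np.log((n - k) / (1 - t)) - (n - k)
    return out

grid = np.linspace(0.05, 0.95, 181)
tau_hat = grid[np.argmax([profile_loglik(t) for t in grid])]
print(f"estimated change-point location: {tau_hat:.3f}")
```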
Markov Chain Monte Carlo (MCMC) is a class of algorithms to sample complex and high-dimensional probability distributions. The Metropolis-Hastings (MH) algorithm, the workhorse of MCMC, provides a simple recipe to construct reversible Markov kernels. Reversibility is a tractable property which implies a less tractable but essential property here, invariance. Reversibility is, however, not necessarily desirable when considering performance. This has prompted recent interest in designing kernels breaking this property. At the same time, an active stream of research has focused on the design of novel versions of the MH kernel, some nonreversible, relying on the use of complex invertible deterministic transforms. While standard implementations of the MH kernel are well understood, the aforementioned developments have not received the same systematic treatment to ensure their validity. In this talk, we will develop general tools to ensure that a class of nonreversible Markov kernels, possibly relying on complex transforms, has the desired invariance property and leads to convergent algorithms. This results in a set of simple and practically verifiable conditions.
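To fix ideas, here is a minimal random-walk Metropolis-Hastings implementation targeting a standard normal. It illustrates only the classical reversible recipe mentioned above, not the nonreversible extensions discussed in the talk; the proposal scale and chain length are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_target(x):
    return -0.5 * x * x          # standard normal, up to an additive constant

x, chain = 0.0, []
for _ in range(50_000):
    prop = x + rng.normal(scale=1.0)          # symmetric random-walk proposal
    # accept with prob min(1, pi(prop)/pi(x)); this enforces detailed balance,
    # i.e. reversibility, which in turn implies invariance of the target
    if np.log(rng.random()) < log_target(prop) - log_target(x):
        x = prop
    chain.append(x)

chain = np.array(chain[5_000:])               # discard burn-in
print(chain.mean(), chain.var())              # close to 0 and 1
```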
Nowadays, machine learning procedures are used in many application areas, with the notable exception of so-called sensitive domains (health, justice, defence, to name only a few) in which the decisions to be made have serious consequences. In these domains, an accurate decision is necessary, but to actually be deployed these algorithms must also provide an explanation of the mechanism leading to the decision and, in this sense, be interpretable. Unfortunately, the most accurate algorithms today are often the most complex. A classical technique for explaining their predictions consists in computing indicators measuring the strength of the link between each input variable and the output variable to be predicted. In this talk, we will focus on an importance measure designed for decision trees, the Mean Decrease Impurity, and we will see how its theoretical study sheds light on its use in practice.
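A short sketch of the Mean Decrease Impurity (MDI) in practice: scikit-learn's `feature_importances_` attribute for trees implements MDI. In this synthetic example only the first feature carries information, so its MDI score should dominate; the data-generating setup is purely illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)
X = rng.standard_normal((500, 2))
y = (X[:, 0] > 0).astype(int)     # only feature 0 determines the label

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.feature_importances_)  # MDI scores; feature 0 should dominate
```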
We propose a mathematical model of oncolytic virotherapy incorporating the virus-specific CTL response, which contributes to the killing of infected tumor cells. To improve the understanding of the dynamic interactions between tumor cells and virus-specific CTLs, stochastic differential equation models are constructed. We obtain sufficient conditions for existence, persistence and extinction of the stochastic system. In relation to therapy control, we also analyze the role of stochasticity in the stability of equilibrium points. A Monte Carlo algorithm is used to estimate the mean extinction time and the extinction probability of cancer cells or virus-specific CTLs. Our simulations highlight the switch of the system leaving the attractor basin of the three-species coexistence equilibrium towards that of cancer cell extinction or that of virus-specific CTL depletion. This allowed us to characterize the spaces of cancer control parameters. Finally, we assess the robustness of the model solution by analyzing the sensitivity of its characteristic parameters. Our results demonstrate the strong dependence of the success or failure of the virotherapy on the combination of the stochastic diffusion parameters with the maximum per capita growth rate of uninfected tumor cells, the transmission rate, the viral cytotoxicity and the strength of the CTL response.
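A much-simplified stand-in for the Monte Carlo approach described above: Euler-Maruyama simulation of a one-dimensional stochastic logistic equation dX = r X (1 - X) dt + σ X dW, with the extinction probability estimated over many paths. This is not the tumor/CTL model of the talk; the parameters are chosen (with σ²/2 > r) so that the noise is strong enough for extinction to dominate.

```python
import numpy as np

rng = np.random.default_rng(4)
# illustrative parameters: growth rate r, noise intensity sigma
r, sigma, dt, T, n_paths = 1.0, 1.6, 0.01, 50.0, 2000
steps = int(T / dt)

x = np.full(n_paths, 0.5)                     # initial population level
for _ in range(steps):
    dW = np.sqrt(dt) * rng.standard_normal(n_paths)
    # Euler-Maruyama step for dX = r X (1 - X) dt + sigma X dW
    x = np.maximum(x + r * x * (1 - x) * dt + sigma * x * dW, 0.0)

p_ext = float(np.mean(x < 1e-3))              # Monte Carlo extinction estimate
print(f"estimated extinction probability: {p_ext:.2f}")
```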
Optimal transport (OT) has recently gained a lot of interest in machine learning. It is a natural tool to compare probability distributions in a geometrically faithful way. It finds applications in both supervised learning (using geometric loss functions) and unsupervised learning (to perform generative model fitting). OT is, however, plagued by the curse of dimensionality, since it may require a number of samples growing exponentially with the dimension. In this talk, I will explain how to leverage entropic regularization methods to define computationally efficient loss functions, approximating OT with a better sample complexity. More information and references can be found on the website of our book "Computational Optimal Transport": https://optimaltransport.github.io/
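A sketch of entropy-regularized OT computed via Sinkhorn iterations, the standard computational workhorse behind such loss functions; the point clouds, dimensions, and regularization strength `eps` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal((5, 2))                # source point cloud
y = rng.standard_normal((7, 2)) + 1.0          # target point cloud
a = np.full(5, 1 / 5)                          # uniform source weights
b = np.full(7, 1 / 7)                          # uniform target weights

C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)  # squared-distance cost
eps = 0.1                                      # entropic regularization strength
K = np.exp(-C / eps)                           # Gibbs kernel

u = np.ones(5)
for _ in range(500):                           # Sinkhorn fixed-point iterations
    v = b / (K.T @ u)
    u = a / (K @ v)

P = u[:, None] * K * v[None, :]                # entropic optimal coupling
print(P.sum(axis=1))                           # row marginals recover the weights a
```

Each Sinkhorn iteration costs only a matrix-vector product, which is what makes the regularized loss computationally attractive compared to exact OT solvers.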
We study the problem of non-parametric estimation of the density $\pi$ of the stationary distribution of a stochastic two-dimensional damping Hamiltonian system $(Z_t)_{t\in[0,T]}=(X_t,Y_t)_{t \in [0,T]}$. From the continuous observation of the sample path on $[0,T]$, we study the rate of estimation of $\pi(x_0,y_0)$ as $T \to \infty$. We show that kernel-based estimators can achieve the rate $T^{-v}$ for some explicit exponent $v \in (0,1/2)$. One finding is that the rate of estimation depends on the smoothness of $\pi$ and is completely different from the rate appearing in the standard i.i.d. setting or in the case of two-dimensional non-degenerate diffusion processes. In particular, this rate also depends on $y_0$. Moreover, we obtain a minimax lower bound on the $L^2$-risk for pointwise estimation, with the same rate $T^{-v}$, up to $\log(T)$ terms. (Joint work with Sylvain Delattre, Univ. Paris Diderot, and Nakahiro Yoshida, Univ. of Tokyo.)
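For intuition only, here is a kernel estimator of a stationary density from one long trajectory, in a much simpler setting than the talk's degenerate 2-D system: a non-degenerate 1-D Ornstein-Uhlenbeck process simulated by an Euler scheme. The bandwidth `h` and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
dt, n = 0.01, 200_000
xi = rng.standard_normal(n)

x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    # Euler scheme for dX = -X dt + sqrt(2) dW; stationary law is N(0, 1)
    x[i] = x[i - 1] * (1 - dt) + np.sqrt(2 * dt) * xi[i]

def kde(x0, sample, h=0.1):
    # Gaussian kernel estimator of the stationary density at x0
    return np.mean(np.exp(-0.5 * ((x0 - sample) / h) ** 2)) / (h * np.sqrt(2 * np.pi))

print(kde(0.0, x))  # should be close to 1/sqrt(2*pi), about 0.399
```

In the talk's hypoelliptic setting the attainable rate for such estimators degrades and even depends on the point $y_0$, unlike in this non-degenerate toy case.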
In this talk, we start by introducing the intriguing van Dantzig problem, which consists in characterizing the subset of Fourier transforms of probability measures on the real line that remain invariant under the composition of the reciprocal map with a complex rotation. We first focus on the so-called Lukacs class of solutions, that is, the ones that belong to the set of Laguerre-Pólya functions, which are entire functions with only real zeros. In particular, we show that the Riemann hypothesis is equivalent to the membership of the Riemann ξ function in the Lukacs class. We state several closure properties of this class, including adaptations of known results of Pólya, de Bruijn and Newman, as well as some new ones. We proceed by presenting a new class of entire functions, in bijection with a set of continuous negative definite functions, that are solutions to the van Dantzig problem, and we discuss the possibility that the Riemann ξ function belongs to this class.
The bioinformatics laboratory at the ICO works on the development of bioinformatics tools to improve breast cancer treatments. In particular, we seek to select predictive tumor variables that would allow the efficacy of a treatment to be assessed beforehand. To this end, we use machine learning algorithms in a supervised setting. Unfortunately, building a predictive model is particularly difficult in healthcare because the number of tumor variables far exceeds the number of observations, and the models generated so far have prediction performance too low to be usable in the clinic. In order to increase the number of observations, we combined datasets of different origins, comparing several combination methods, and we tested the performance of several algorithms on the combined data.
Let p(t,x,y) be the fundamental solution of the equation \partial_t u(t,x) = \Delta^{\alpha/2} u(t,x).
I will consider the integral equation
\tilde{p}(t,x,y) = p(t,x,y) + \int_0^t \int_{\mathbb{R}^d} p(t-s,x,z) q(z) \tilde{p}(s,z,y) dz ds,
where q(z) = \frac{\kappa}{|z|^{\alpha}} and \kappa is some constant. The function \tilde{p} solving this equation is called the Schrödinger perturbation of p by q. I will present results concerning estimates of \tilde{p} in both cases \kappa > 0 and \kappa < 0.
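For context, a standard way to construct a solution of this integral equation (a sketch of the usual perturbation-series argument; controlling the convergence of this series is precisely what estimates on \tilde{p} are needed for) is by iterating the equation:

```latex
\tilde{p}(t,x,y) = \sum_{n=0}^{\infty} p_n(t,x,y), \qquad p_0 = p, \qquad
p_n(t,x,y) = \int_0^t \int_{\mathbb{R}^d} p(t-s,x,z)\, q(z)\, p_{n-1}(s,z,y)\, dz\, ds .
```

Substituting the series into the right-hand side of the integral equation reproduces it term by term, with p_n collecting the contribution of n interactions with the potential q.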