Colloquium/Seminar

Coming event(s)


  • Thursday, 25th July, 2019

    Title: Information retrieval from electronic medical record
    Speaker: Dr. Xiang WAN, Shenzhen Research Institute of Big Data, Shenzhen, China
    Time/Place: 15:00  -  15:45
    FSC1217, Fong Shu Chuen Library, HSH Campus, Hong Kong Baptist University
    Abstract: Information retrieval (IR) in natural language processing is a standard technique for efficiently accessing information in large collections of text. In this talk, I will first present how to improve information access in electronic medical records (EMRs) with advanced natural language processing (NLP) techniques. We concentrate specifically on the NLP tasks of named entity recognition (NER), relation extraction, and text classification. Second, we recently proposed a flexible framework, field embedding, to jointly learn Chinese word embeddings; it incorporates morphological, phonetic and other linguistic information. Experiments demonstrate that our model can make use of multiple fields to extract semantic information, while other existing methods cannot. Empirical results on word similarity, word analogy, and text classification tasks show that the proposed model outperforms state-of-the-art methods such as word2vec, CWE, JWE, and cw2vec.


  • Thursday, 25th July, 2019

    Title: New HSIC-based tests for independence between two stationary multivariate time series
    Speaker: Dr Guochang WANG, Department of Statistics, Jinan University, Guangzhou, China
    Time/Place: 15:45  -  16:15
    FSC1217, Fong Shu Chuen Library, HSH Campus, Hong Kong Baptist University
    Abstract: This paper proposes some novel one-sided omnibus tests for independence between two multivariate stationary time series. These new tests apply the Hilbert-Schmidt independence criterion (HSIC) to test the independence between the innovations of both time series. Under regular conditions, the limiting null distributions of our HSIC-based tests are established. Next, our HSIC-based tests are shown to be consistent. Moreover, a residual bootstrap method is used to obtain the critical values for our HSIC-based tests, and its validity is justified. Compared with the existing cross-correlation-based tests for linear dependence, our tests examine the general (including both linear and non-linear) dependence to give investigators more complete information on the causal relationship between two multivariate time series. The merits of our tests are illustrated by some simulation results and a real example.
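
As a rough illustration of the statistic named in the abstract above, the following is a minimal sketch of the standard biased (V-statistic) empirical HSIC, trace(KHLH)/n^2, with Gaussian kernels. It is not the speaker's test: it treats the two samples as i.i.d. pairs rather than fitted time-series innovations, and it omits the residual bootstrap for critical values. The function names and the fixed bandwidth are assumptions made for this example.

```python
import math
import random


def gaussian_gram(points, bandwidth=1.0):
    """Gaussian-kernel Gram matrix for a list of equal-length vectors."""
    def kernel(a, b):
        sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return [[kernel(a, b) for b in points] for a in points]


def double_center(gram):
    """Return H K H with H = I - (1/n) 11^T, for a symmetric Gram matrix K."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    grand_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic_biased(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) HSIC estimate: trace(K H L H) / n^2."""
    n = len(xs)
    kc = double_center(gaussian_gram(xs, bandwidth))
    lc = double_center(gaussian_gram(ys, bandwidth))
    # trace(K H L H) equals the elementwise inner product of the centered Grams.
    return sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n)) / n ** 2


if __name__ == "__main__":
    rng = random.Random(1)
    xs = [[rng.gauss(0, 1)] for _ in range(100)]
    ys = [[x[0] ** 2 + 0.1 * rng.gauss(0, 1)] for x in xs]  # non-linear dependence
    print("HSIC estimate:", hsic_biased(xs, ys))
```

A larger value suggests stronger dependence; in practice the value has to be compared with a critical value, which the abstract obtains by a residual bootstrap.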


  • Thursday, 25th July, 2019

    Title: Encoding the category to select the feature genes for single-cell RNA-seq classification
    Speaker: Dr Yan ZHOU, Department of Statistics, Shenzhen University, Shenzhen, China
    Time/Place: 16:15  -  17:00
    FSC1217, Fong Shu Chuen Library, HSH Campus, Hong Kong Baptist University
    Abstract: With the development of next-generation sequencing techniques, microRNA-seq and single-cell RNA-seq (scRNA-seq) data are becoming popular in biological and medical studies, for example for detecting differentially expressed (DE) genes or for disease diagnosis. Using microRNA-seq or scRNA-seq data to diagnose the type of disease is an effective approach in medical research. For microRNA-seq or scRNA-seq data, several statistical methods have been developed for classification, including Poisson linear discriminant analysis (PLDA), negative binomial linear discriminant analysis (NBLDA) and zero-inflated Poisson logistic discriminant analysis (ZIPLDA). Feature genes are vitally important for microRNA-seq or scRNA-seq data classification. In fact, the majority of genes are not differentially expressed and are irrelevant for class distinction. To improve classification performance and save computation time, it is necessary to remove the irrelevant genes and detect the important feature genes. The widely used methods in the literature assume that the data are normally distributed, so they may not be suitable for microRNA-seq and scRNA-seq data. In this paper, we propose an encoding-the-category (ENTC) method to select the feature genes for single-cell RNA-seq data classification. The novel method encodes the category again by employing the rank of the samples for each gene in each class. We then consider the correlation coefficient between gene and class, using the rank of the sample and the new rank of the category. The genes with the highest correlation coefficients are taken to be the differentially expressed genes, which are the most effective for classifying the samples. We also establish the sure screening and rank consistency properties of the proposed ENTC method. Simulation studies show that the classifier using the proposed ENTC method performs better than, or at least as well as, the existing methods in most settings. Two real datasets, a microRNA-seq dataset and an scRNA-seq dataset, are also analyzed, and the results demonstrate the superior performance of the proposed method over the existing competitors. To cater for the demands of applications, we have also developed an R package called “ENTC” and have made it freely available for download. Availability: The R package named “ENTC” is available at https://github.com/zhangli1109/ENTC.


  • Monday, 29th July, 2019

    Title: PhD Oral Exam: Stochastic Gradient Descent for Pairwise Learning: Stability and Optimization Error
    Speaker: Mr SHEN Wei, Department of Mathematics, Hong Kong Baptist University, HKSAR
    Time/Place: 10:00  -  12:00
    FSC1217, Fong Shu Chuen Library, HSH Campus, Hong Kong Baptist University
    Abstract: In this presentation, we focus on the stability and its trade-off with optimization error for stochastic gradient descent (SGD) algorithms in the pairwise learning setting. Pairwise learning refers to a learning task which involves a loss function depending on pairs of instances, among which notable examples are bipartite ranking, metric learning, area under ROC curve (AUC) maximization and the minimum error entropy (MEE) principle. Our contribution is twofold. Firstly, we establish the stability results for SGD for pairwise learning in the convex, strongly convex and non-convex settings, from which generalization errors can be naturally derived. Moreover, we also give the stability results of buffer-based SGD and projected SGD. Secondly, we establish the trade-off between stability and optimization error of SGD algorithms for pairwise learning. This is achieved by lower-bounding the sum of stability and optimization error by the minimax statistical error over a prescribed class of pairwise loss functions. From this fundamental trade-off, we obtain lower bounds for the optimization error of SGD algorithms and the excess expected risk over a class of pairwise losses. In addition, we illustrate our stability results by giving some specific examples and experiments of AUC maximization and MEE.
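
To make the pairwise setting in the abstract above concrete, here is a minimal sketch of SGD on a pairwise loss for AUC maximization with a linear scorer. It assumes a logistic surrogate loss on score differences, uniform sampling of one positive and one negative instance per step, and a step size decaying as 1/sqrt(t); none of these choices is taken from the thesis, and the buffer-based and projected variants it mentions are not shown. All names are hypothetical.

```python
import math
import random


def pairwise_sgd(positives, negatives, steps=2000, step_size=0.5, seed=0):
    """SGD on log(1 + exp(-w.(x_pos - x_neg))) over randomly sampled pairs."""
    rng = random.Random(seed)
    w = [0.0] * len(positives[0])
    for t in range(1, steps + 1):
        pos = rng.choice(positives)
        neg = rng.choice(negatives)
        diff = [p - q for p, q in zip(pos, neg)]
        margin = sum(wi * di for wi, di in zip(w, diff))
        # d/dmargin of log(1 + exp(-margin)) is -1 / (1 + exp(margin));
        # the margin is clamped only to avoid overflow in math.exp.
        scale = -1.0 / (1.0 + math.exp(min(margin, 50.0)))
        eta = step_size / math.sqrt(t)
        w = [wi - eta * scale * di for wi, di in zip(w, diff)]
    return w


def empirical_auc(w, positives, negatives):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    def score(x):
        return sum(wi * xi for wi, xi in zip(w, x))
    pos_scores = [score(x) for x in positives]
    neg_scores = [score(x) for x in negatives]
    total = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            total += 1.0 if sp > sn else (0.5 if sp == sn else 0.0)
    return total / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    rng = random.Random(42)
    positives = [[rng.gauss(1, 1), rng.gauss(0, 1)] for _ in range(100)]
    negatives = [[rng.gauss(-1, 1), rng.gauss(0, 1)] for _ in range(100)]
    w = pairwise_sgd(positives, negatives)
    print("weights:", w, "AUC:", empirical_auc(w, positives, negatives))
```

Each update touches a pair of instances rather than a single one, which is what distinguishes the pairwise setting from pointwise SGD.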


  • Tuesday, 30th July, 2019

    Title: PhD Oral Exam: Order Determination for Large Matrices with Spiked Structure
    Speaker: Mr ZENG Yicheng, Department of Mathematics, Hong Kong Baptist University, HKSAR
    Time/Place: 10:00  -  12:00
    FSC1217, Fong Shu Chuen Library, HSH Campus, Hong Kong Baptist University
    Abstract: We investigate order determination for large dimensional matrices with spiked structures in which the dimensions of the matrices are proportional to the sample sizes. Because the asymptotic behaviors of the estimated eigenvalues differ completely from those in fixed dimension scenarios, we discuss the largest possible order, say q, that we can identify and introduce criteria for different settings of q. When q is assumed to be fixed, we propose a “valley-cliff” criterion with two versions. This generic method is very easy to implement and computationally inexpensive, and it can be applied to various matrices. As examples, we focus on spiked population models, spiked Fisher matrices and factor models with auto-covariance matrices. For the case of divergent q, we propose a scale-adjusted truncated double ridge ratio (STDRR) criterion, where a scale adjustment is implemented to deal with the bias in the scale parameter for large q. Again, examples include spiked population models and spiked Fisher matrices. Numerical studies are conducted to examine the finite sample performance of the method and to compare it with existing methods. As for theoretical contributions, we investigate the limiting properties, including convergence in probability and central limit theorems, for spiked eigenvalues of spiked Fisher matrices with divergent q.


  • Thursday, 8th August, 2019

    Title: PhD Oral Exam: Nonlinear Optimized Schwarz Preconditioning for Heterogeneous Elliptic Problems
    Speaker: Mr GU Yaguang, Department of Mathematics, Hong Kong Baptist University, HKSAR
    Time/Place: 14:30  -  16:30
    FSC1217, Fong Shu Chuen Library, HSH Campus, Hong Kong Baptist University
    Abstract: In the oral defense, we study problems with heterogeneities using the zeroth order optimized Schwarz preconditioning. There are three main parts. In the first part, we propose an Optimized Restricted Additive Schwarz Preconditioned Exact Newton approach (ORASPEN) for nonlinear diffusion problems. In this approach, we use the Robin condition to communicate subdomain errors. We will see that for problems with large heterogeneities, the Robin parameter has a significant impact on the convergence behavior when subdomain boundaries cut through the discontinuities. In the second main part, therefore, we perform an algebraic analysis for a linear diffusion model problem. In this analysis, we will carefully discuss two possible choices of Robin parameters on the artificial interfaces and derive asymptotic expressions of both the optimal Robin parameter and the convergence rate for each choice. Finally, in the third main part, we will study a time-dependent nonequilibrium Richards equation (NERE), which can be used to model preferential flow in physics. We semi-discretize the NERE in time, and then study how the ORASPEN approach performs for the resulting elliptic problems.


  • Tuesday, 20th August, 2019

    Title: PhD Oral Exam: Statistical Methods of Mendelian Randomization Using GWAS Summary Data
    Speaker: Ms HU Xianghong, Department of Mathematics, Hong Kong Baptist University, HKSAR
    Time/Place: 10:30  -  12:30
    FSC1217, Fong Shu Chuen Library, HSH Campus, Hong Kong Baptist University
    Abstract: Mendelian Randomization (MR) is a powerful tool for assessing the causal effect of an exposure on an outcome using GWAS data. However, the accuracy of the MR causal effect estimates can be compromised when the MR assumptions are violated. The biases can be attributed to weak effects arising from polygenicity, the presence of horizontal pleiotropy, and other sources, e.g., selection bias. In this thesis, we first propose a Bayesian weighted Mendelian randomization (BWMR) method for causal inference, which takes into account weak effects and violation of MR assumptions due to pleiotropy. Based on the framework of BWMR, we further develop a method for correction of selection bias in MR analysis. We evaluate the performance of our methods through comprehensive simulations and real data analysis, demonstrating advantages over competitors. With the increasing availability of GWAS summary data, our methods are believed to be of great practical value.


  • Wednesday, 28th August, 2019

    Title: PhD Oral Exam: Some New Developments in Data Transformation and Meta-analysis with Small Number of Studies
    Speaker: Mr LIN Enxuan, Department of Mathematics, Hong Kong Baptist University, HKSAR
    Time/Place: 10:30  -  12:30
    FSC1217, Fong Shu Chuen Library, HSH Campus, Hong Kong Baptist University
    Abstract: In this presentation, we focus on three critical issues in the statistical parts of meta-analysis. The first issue is how to convert OR to RR in the case-control study. In view of this, we establish a new formula for this transformation to fill the gap. The performance of the new method will be examined through simulations and real data analysis. Our method and formulas can not only handle meta-analyses with different effect sizes, but also offer some insights for medical researchers to further understand the meaning of OR and RR in both cohort and case-control studies. Another issue is model selection in meta-analyses with few studies. We propose to further improve the estimation accuracy of the average effect in the fixed-effects model by assigning a different weight to each study as well as fully utilizing the information in the within-study variances. Through theory and simulation, we demonstrate that the fixed-effects model can serve as the most convincing model for meta-analysis with few studies. Most importantly, with a total of three candidate models, we expect that meta-analysis can be conducted more flexibly, more meaningfully, and more accurately. The third issue is that most existing methods for heterogeneity measurement were derived under the assumption of known within-study variances. In practice, however, a direct use of the reported within-study variance estimates may largely reduce the power of the tests and also lower the accuracy of the estimates, especially when the sample sizes in some studies are not sufficiently large. To overcome this problem, we propose a family of shrinkage estimators for the within-study variances that are able to borrow information across the studies, and derive the optimal shrinkage parameters under the Stein loss function. We then apply the new estimates of the within-study variances to some well-known methods for measuring heterogeneity. Simulation studies and real data examples show that our shrinkage estimators can dramatically reduce the estimation bias and hence improve on the existing literature.
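
For readers unfamiliar with the weighting idea in the abstract above, the sketch below shows the textbook inverse-variance pooled estimate for a common-effect meta-analysis, in which study i receives weight 1/v_i and the pooled variance is 1/sum(1/v_i). This is only the classical baseline, assuming the within-study variances are known; it is not the weighting scheme or the shrinkage estimator proposed in the thesis. The function name is hypothetical.

```python
def inverse_variance_pool(effects, variances):
    """Classical inverse-variance pooled effect and its variance.

    effects   -- per-study effect estimates (e.g. log odds ratios)
    variances -- per-study within-study variances, treated as known
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal length")
    if any(v <= 0 for v in variances):
        raise ValueError("within-study variances must be positive")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_variance = 1.0 / total_weight
    return pooled, pooled_variance


if __name__ == "__main__":
    # Three hypothetical studies; the numbers are made up for illustration.
    estimate, variance = inverse_variance_pool([0.10, 0.30, 0.25], [0.04, 0.09, 0.02])
    print("pooled effect:", estimate, "variance:", variance)
```

Because the weights depend directly on the reported variances, noisy variance estimates from small studies feed straight into the pooled estimate, which is the concern the abstract raises.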