Similarity Algorithms by Anas Ismail Aug 2, 09:00 - 11:00 B3 R5209 Abstract Here we provide two similarity algorithms, each for a specific type of data. First, we provide a new algorithm to calculate the Gromov hyperbolicity constant, which is a measure of how similar a metric is to a tree metric. We also provide a new algorithm for determining how similar two spatial trajectories are.
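For readers unfamiliar with the hyperbolicity constant named in the abstract above, the following is a minimal brute-force sketch based on the standard four-point condition. It is not the new algorithm announced in the talk. It assumes the metric is given as a symmetric distance matrix and uses the convention that halves the gap between the two largest pairwise sums (some texts omit the factor of 1/2). The function name `four_point_delta` is hypothetical.

```python
from itertools import combinations


def four_point_delta(dist):
    """Brute-force Gromov hyperbolicity via the four-point condition.

    `dist` is assumed to be a symmetric n x n matrix (list of lists) of
    pairwise distances. For every quadruple of points, the three ways of
    pairing them give three distance sums; delta is half the largest gap
    between the two biggest sums. Runs in O(n^4) time.
    """
    n = len(dist)
    delta = 0.0
    for x, y, z, w in combinations(range(n), 4):
        sums = sorted((
            dist[x][y] + dist[z][w],
            dist[x][z] + dist[y][w],
            dist[x][w] + dist[y][z],
        ))
        # In a tree metric the two largest sums are always equal.
        delta = max(delta, (sums[2] - sums[1]) / 2)
    return delta


# Shortest-path metric of a path on 4 vertices (a tree): delta is 0.
path_metric = [[abs(i - j) for j in range(4)] for i in range(4)]

# Shortest-path metric of a 4-cycle (not a tree): delta is 1.
cycle_metric = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]

print(four_point_delta(path_metric), four_point_delta(cycle_metric))
```

A value of zero means the metric embeds exactly in a tree; larger values indicate a metric that is further from tree-like. The quartic cost of this exhaustive check is what makes faster algorithms of practical interest.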
AI in Cancer Precision Medicine Workshop Dr. Paul Schofield Jul 25, 08:00 - Jul 26, 13:00 B3 R5220 We will investigate how novel AI technologies, including progress in machine learning, knowledge representation, and reasoning, can be applied to improving the diagnosis and treatment of cancer in the era of genomic medicine.
Neural Inductive Matrix Factorization for Predicting Disease-Gene Associations Siqing Hou, M.S., Computer Science Apr 18, 10:00 - 11:30 B3 R5208 bioinformatics machine learning Disease-Gene Associations In silico prioritization of undiscovered associations can help find causal genes of newly discovered diseases. Some existing methods are based on known associations and side information of diseases and genes. We explore the possibility of using a neural network model, Neural Inductive Matrix Completion (NIMC), for disease-gene prediction.
Ontology Design Patterns for Combining Pathology and Anatomy: Application to Study Ageing and Longevity in Inbred Mouse Strains Sarah Alghamdi, Ph.D. Student, Computer Science Apr 10, 13:00 - 14:30 B9 R3120 biomedicine Ontologies data analysis semantic analysis computation techniques Abstract In biomedical research, ontologies are widely used to represent knowledge as well as to annotate datasets. Many of the existing ontologies cover a single type of phenomenon, such as a process, cell type, gene, pathological entity or anatomical structure. Consequently, multiple ontologies are required to fully characterize the observations in the datasets. Although this allows precise annotation of different aspects of a given dataset, it limits our ability to use the ontologies in data analysis, as the ontologies are usually disconnected and their combination cannot be exploited
Big Data in Biodiversity and Health Mar 26, 09:30 - Mar 28, 13:30 B3 L5 5209 About We are witnessing today an enormous increase in the volume and complexity of data across a variety of domains, including bioscience. Extracting useful information from such data is challenging. Although many approaches have already been developed, efficient analysis of big data in the bioscience domain is far from satisfactory. Biodiversity and health are prominently characterized by a high volume of data with highly complex information content, which leads to various approaches to data analysis. The goal of this workshop is to present a selection of efforts currently being made at
Computational and Statistical Interface to Big Data Xin Gao, Program Chair, Computer Science Mar 19, 08:00 - Mar 21, 17:00 B9 L2 H2 We are now in the fourth paradigm of science: Data Science. The massive amount of structured and unstructured data has posed new challenges and opportunities to the fields of computer science and statistics. Traditional computational and statistical methods for data storage, curation, sharing, querying, updating, visualization, analysis, and privacy have been shown to fail in the big data scenario due to the unprecedented volume, velocity, variety, veracity and value of big data. This conference will bring together a number of prominent researchers in Computer Science and Statistics with common interests and active research in big data, as well as the researchers at KAUST who regularly generate or face big data, such as those in bioscience and Red Sea research.
Symbolic AI in Computational Biology Robert Hoehndorf, Associate Professor, Computer Science Mar 12, 12:00 - 13:00 B9 H1 R2322 About The life sciences have invested significant resources in the development and application of semantic technologies to make research data accessible and interlinked, and to enable the integration and analysis of data. Utilizing the semantics associated with research data in data analysis approaches is often challenging. Now, novel methods are becoming available that combine symbolic methods and statistical methods in Artificial Intelligence. In my talk, I will describe how to combine symbolic and statistical Artificial Intelligence approaches for the analysis of biological and biomedical
Causality-based new drug development: Some successful cases and a new challenge Naoyuki Kamatani, MD, PhD Mar 11, 12:00 - 13:00 B2 L5 R5209 artificial intelligence biomedicine drug development Abstract The success rate of new drug development is extremely low. It is even lower when the new drug has a novel mechanism of action. From my experience, I propose that the success rate can be dramatically increased by predicting the effects of a drug in humans based on causality-confirmed data. It is dangerous to develop new drugs based on data in which causality is not confirmed. In biology, there are three different types of relationships in which causality is confirmed, i.e., the relationships between parent and child, between gene and phenotype and between intervention and
Symbolic AI in Computational Biology: Applications to Disease Gene and Drug Target Identification Robert Hoehndorf, Associate Professor, Computer Science Feb 26, 16:30 - 17:30 The University of Cambridge in the United Kingdom Abstract KAUST Assistant Professor Robert Hoehndorf will give a seminar on "Symbolic AI in Computational Biology: Applications to Disease Gene and Drug Target Identification" at the University of Cambridge in the United Kingdom. The life sciences have invested significant resources in the development and application of semantic technologies to make research data accessible and interlinked, and to enable the integration and analysis of data. Utilizing the semantics associated with research data in data analysis approaches is often challenging. Now, novel methods are
Keynote Speaker | The 8th BEAR PGR Conference & Users Forum 2018 Robert Hoehndorf, Associate Professor, Computer Science Feb 23, 09:00 - 16:30 The University of Birmingham in the United Kingdom High Performance Computing cloud storage data visualisation Abstract KAUST Assistant Professor Robert Hoehndorf will be a keynote speaker at the 8th BEAR PGR Conference & Users Forum at the University of Birmingham in the United Kingdom. The event focuses on, but is not limited to, computational analysis and numerical modeling. The conference will cater to researchers from all schools who are interested in BEAR facilities, such as the high-performance computing (HPC) system BlueBEAR, cloud storage, or data visualization.
Fifth KAUST-NVIDIA Workshop on Accelerating Scientific Applications Using GPUs Timothy Lanfear, Brent Leback Feb 18, 08:00 - Feb 20, 17:00 B4 B5 A0215 supercomputing The KAUST Supercomputing Laboratory is co-organizing with NVIDIA, a leader in accelerated computing and artificial intelligence, a full-day workshop on accelerating scientific applications using GPUs on Tuesday, February 20th, 2018 in the auditorium between buildings 4 and 5.
KAUST Research Workshop on Optimization and Big Data Peter Richtarik, Professor, Computer Science Feb 5, 08:00 - Feb 7, 05:00 B19 L3 H2 optimization machine learning Social Network Analysis asynchronous algorithms The age of "big data" is here: data of unprecedented sizes is becoming ubiquitous, which brings new challenges and new opportunities. With this comes the need to solve optimization problems of unprecedented sizes.
Novel Computational Methods to Predict Drug–target Interactions Using Graph Mining and Machine Learning Approaches Rawan Olayan, Ph.D., Computer Science Dec 11, 10:00 - 12:00 B3 L5 R5220 bioinformatics data integration data mining graph mining machine learning Abstract Computational drug repurposing aims at finding new medical uses for existing drugs. The identification of novel drug-target interactions (DTIs) can be a useful part of such a task. Finding DTIs computationally is a convenient strategy to identify potentially new DTIs at low cost with reasonable accuracy. However, current DTI prediction methods suffer from a high false-positive prediction rate. Here, we present a comprehensive review of the recent progress in the field of DTI prediction from data-centric and algorithm-centric perspectives that can help in constructing novel reliable
Big Data Analyses in Evolutionary Biology Dec 4, 08:00 - Dec 6, 17:00 B9 H2 big data Big data analysis evolutionary biology This event is organized by CBRC with financial support from the KAUST Office of Sponsored Research
Contributions to In Silico Genome Annotation Manal Kalkatawi, Ph.D., Computer Science Nov 9, 10:00 - 13:00 B3 L5 R5209 bioinformatics data mining machine learning Deep learning genomics Abstract Genome annotation is an important topic since it provides the foundation for downstream genomic and biological research. It is considered a way of summarizing part of the existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying the functions of these regions is known as functional annotation. In silico approaches can facilitate both tasks, which would otherwise be difficult and time-consuming. This study contributes to genome annotation by introducing
PCCFD - Predictive Complex Computational Fluid Dynamics David Keyes, Professor, Applied Mathematics and Computational Science May 22, 08:45 - May 24, 05:00 B9 L2 H1 CFD algorithms applied mathematics numerical analysis Computer science The PCCFD workshop will focus on cutting-edge research in the field of algorithmic development for CFD and multi-scale complex flow simulations.
Mining Genome-Scale Growth Phenotype Data through Constant-Column Biclustering Majed Alzahrani, Ph.D., Computer Science May 17, 15:00 - 17:00 B3 L5 R5209 data mining machine learning Computational biology Growth phenotype profiling of genome-wide gene-deletion strains over stress conditions can offer a clear picture of how the essentiality of genes depends on environmental conditions. In this dissertation, we first demonstrate that detecting such "co-fit" gene groups can be cast as a less well-studied problem in biclustering, i.e., constant-column biclustering. Despite significant advances in biclustering techniques, very few were designed for mining growth phenotype data.
Breaking the Boundaries: from Structure to Algorithms Vadim Lozin, Professor, University of Warwick, UK Apr 17, 14:00 - 15:00 KAUST maximum independent set line graphs boundary classes of graphs Abstract Finding a maximum independent set in a graph is an NP-hard problem. However, restricted to the class of line graphs, this problem becomes polynomial-time solvable due to the celebrated matching algorithm of Jack Edmonds. What makes the problem easy in the class of line graphs, and what other restrictions can lead to an efficient solution? To answer these questions, we employ the notion of boundary classes of graphs. In this talk, we shed some light on the structure of the boundary separating difficult instances of the problem from polynomially solvable ones and analyze algorithmic tools
Computational Methods for ChIP-seq Data Analysis and Applications Haitham M. Ashoor, Ph.D., Computer Science Apr 10, 16:00 - 17:30 B3 L5 5209 computation techniques machine learning bioinformatics data analysis Abstract The development of Chromatin immunoprecipitation followed by sequencing (ChIP-seq) technology has enabled the construction of genome-wide maps of protein-DNA interaction. Such maps provide information about transcriptional regulation at the epigenetic level (histone modifications and histone variants) and at the level of transcription factor (TF) activity. This dissertation presents novel computational methods for ChIP-seq data analysis and applications. The work of this dissertation addresses four main challenges. First, I address the problem of detecting histone modifications from
Genetic Algorithms for Optimization of Machine-learning Models and their Applications in Bioinformatics Arturo Magana Mora, Ph.D., Computer Science Apr 10, 13:00 - 15:00 B3 L5 R5209 machine learning data mining biology genetics bioinformatics Abstract Machine-learning (ML) techniques have been widely applied to solve different problems in biology. However, biological data are large and complex, which often results in extremely intricate ML models. Frequently, these models may have poor performance or may be computationally unfeasible. This study presents a set of novel computational methods and focuses on the application of genetic algorithms (GAs) for the simplification and optimization of ML models and their applications to biological problems. The dissertation addresses the following three challenges. The first challenge is
Novel Computational Methods that Facilitate Development of Cyanofactories for Free Fatty Acid Production by Olaa Motwalli Olaa A. Motwalli, Ph.D., Computer Science Apr 9, 16:00 - 17:00 B3 L5 R5209 machine learning bioinformatics graph mining genomics Abstract Finding a source from which high-energy-density biofuels can be derived at an industrial scale has become an urgent challenge for renewable energy production. Some microorganisms can produce free fatty acids (FFA) as precursors towards such high-energy-density biofuels. In particular, photosynthetic cyanobacteria are capable of directly converting carbon dioxide into FFA. However, current engineered strains need several rounds of engineering to reach a commercially viable level of FFA production. Thus, new chassis strains that require less engineering are needed
Novel Data Mining Methods for Virtual Screening of Biological Active Chemical Compounds by Othman Soufan Othman Soufan, Ph.D., Computer Science Nov 16, 14:00 - 15:00 H2 B9 machine learning data mining Computational biology biomedical applications Chemical compounds visualization Abstract Drug discovery is a process that takes many years and hundreds of millions of dollars to reach a confident conclusion about a specific treatment. Part of this sophisticated process is based on preliminary investigations to suggest a set of chemical compounds as candidate drugs for the treatment. Computational resources have been playing a significant role in this part through a step known as virtual screening. From a data mining perspective, the availability of rich data resources is key in training prediction models. Yet, the difficulties imposed by big expansion in data and its