Computer Science and Statistics, UC Berkeley; Mathematical Sciences, QUT
Peter Bartlett is a professor in the Computer Science Division and the Department of Statistics at the University of California, Berkeley, and a professor in Mathematical Sciences at the Queensland University of Technology. He is the co-author, with Martin Anthony, of the book Neural Network Learning: Theoretical Foundations, has edited four other books, and has co-authored many papers in the areas of machine learning and statistical learning theory. He has served as an associate editor of several machine learning, AI, and statistics journals, and as program committee co-chair for COLT and NIPS. He has consulted for a number of organizations, including General Electric, Telstra, SAC Capital Advisors, and Sentient AI. He has held visiting and honorary positions at the University of Queensland, UC Berkeley, and the University of Paris. He was awarded the Malcolm McIntosh Prize for Physical Scientist of the Year in Australia in 2001, and was chosen as an Institute of Mathematical Statistics Medallion Lecturer in 2008, an IMS Fellow in 2011, and an Australian Laureate Fellow in 2011. He was elected to the Australian Academy of Science in 2015. His research interests include machine learning and statistical learning theory.
Deep neural networks have had a huge impact, lifting state-of-the-art performance in an impressive range of prediction problems. These networks differ from those widely studied through the 1990s in several key respects: they have more layers, many more parameters, and different activation functions. In this talk, we consider how these properties affect performance. First, we consider how the representational power of a network depends on the number of parameters, the depth, and the choice of activation function. We study combinatorial dimensions, which determine the sample complexity of learning. For a piecewise constant activation function, a result of Baum and Haussler implies that there is no dependence on the depth; for a piecewise linear activation function, the dependence is linear; for a piecewise polynomial activation function, it is no more than quadratic. Second, we investigate computational hardness results for learning neural networks, giving an easy reduction from the problem of learning noisy parity. Third, we review constructive learning algorithms for networks with an unbounded number of hidden units and bounded parameters. These algorithms run in polynomial time provided that the fan-in of the hidden units is bounded by a constant. We show a strong equivalence between these algorithms and local optimization approaches (Frank-Wolfe methods, also called "convex neural networks") that have been considered recently. We conclude with open problems, suggesting some theoretical directions that might improve our understanding of the performance of learning methods for deep neural networks.
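The depth dependence described in the abstract can be summarized in rough form (a hedged sketch from the VC-dimension literature, not the talk's exact statements; here W denotes the number of parameters and L the depth, and constants and log factors vary across results):

```latex
\mathrm{VCdim} \;=\;
\begin{cases}
O(W \log W), & \text{piecewise constant activations (no dependence on depth)},\\
O(W L \log W), & \text{piecewise linear activations (linear in depth)},\\
O(W L^{2} + W L \log W), & \text{piecewise polynomial activations (at most quadratic in depth)}.
\end{cases}
```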
Deepak Agarwal leads the relevance and machine learning team at LinkedIn, which is responsible for optimizing and personalizing the user experience across all consumer and enterprise products. Prior to that, he was a Principal Research Scientist at Yahoo! Research, where his work on optimizing content on the Yahoo! front page won him the Yahoo! Super Star award. He is a Fellow of the American Statistical Association and serves on the board of SIGKDD. He is an associate editor for two top-tier statistical journals and regularly serves on the program committees of major data mining and machine learning conferences. He was a program co-chair of the 2012 ACM SIGKDD conference. Most recently, he co-authored the book Statistical Methods for Recommender Systems, published by Cambridge University Press.
Recommender systems that arise in the context of social networks have characteristics that give rise to new technical challenges. I will provide an overview and discuss two examples from LinkedIn: (a) people recommendations and (b) feed optimization. The talk will focus on both the scientific methodologies and the engineering challenges involved in deploying and maintaining such systems in a large-scale industrial environment like LinkedIn.
Carnegie Mellon University
Eduard Hovy is a professor at the Language Technology Institute in the School of Computer Science at Carnegie Mellon University. He holds adjunct professorships at universities in the US, China, and Canada, and is co-Director of Research for the DHS Center for Command, Control, and Interoperability Data Analytics, a distributed cooperation of 17 universities. Dr. Hovy completed a Ph.D. in Computer Science (Artificial Intelligence) at Yale University in 1987, and was awarded honorary doctorates from the National Distance Education University (UNED) in Madrid in 2013 and the University of Antwerp in 2015. He is one of the initial 17 Fellows of the Association for Computational Linguistics (ACL). From 1989 to 2012 he directed the Human Language Technology Group at the Information Sciences Institute of the University of Southern California. Dr. Hovy’s research addresses several areas in Natural Language Processing, including machine reading of text, question answering, information extraction, automated text summarization, the semi-automated construction of large lexicons and ontologies, and machine translation. His contributions include the co-development of the ROUGE text summarization evaluation method, the BLANC coreference evaluation method, the Omega ontology, the Webclopedia QA Typology, the FEMTI machine translation evaluation classification, the DAP text harvesting method, the OntoNotes corpus, and a model of Structured Distributional Semantics. Dr. Hovy is the author or co-editor of six books and over 350 technical articles and is a popular invited speaker. In 2001 Dr. Hovy served as President of the ACL, in 2001–03 as President of the International Association of Machine Translation (IAMT), and in 2010–11 as President of the Digital Government Society. Dr. Hovy regularly co-teaches courses and serves on Advisory Boards for institutes and funding organizations in Germany, Italy, Netherlands, and the USA.
Computer systems that educate themselves by reading text have been a longstanding dream of AI. Despite progress in NLP on information extraction and text mining, no NLP system to date tries to represent the entirety of a single document in depth. One of the main obstacles is the inadequacy of our representations of semantic content, that is, the actual meaning of the symbols used in semantic propositions. The traditional extensional and intensional models of semantics are difficult to flesh out in practice, and no large-scale models exist. Recent developments in so-called Distributional Semantics, based either on word co-occurrence statistics or on neural encodings thereof, offer some exciting new possibilities that are being very actively explored. However, these approaches are not true semantics either, because they lack certain requirements. In this talk I outline one way to combine traditional symbolic logic-based proposition-style semantics (of the kind used in older NLP) with Distributional Semantics. Our core resource is the PropStore, a single lexico-semantic 'lexicon' that can be used for a variety of tasks. I describe how to define and build such a lexicon and how to use its contents for various NLP tasks, and I describe experiments on composing its contents to form larger representation units. A serious problem is data sparsity (the PropStore is only about 2% full, despite containing most of Wikipedia and much of Gigaword), and I describe our current efforts to condense its representations into latent dimensions. Using the PropStore as a kind of background knowledge model, one can address learning by reading in a new way.
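As background for the co-occurrence flavor of Distributional Semantics mentioned above, here is a minimal toy sketch: represent each word by its co-occurrence counts with nearby words, then compare words by cosine similarity. This is purely illustrative (the corpus, window size, and all names are assumptions); the PropStore described in the talk stores structured lexico-semantic propositions, not plain co-occurrence counts.

```python
import numpy as np

# A tiny toy corpus (illustrative only).
corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the mouse ate the cheese",
]

# Build a word-by-word co-occurrence matrix with a +/-2-word window.
vocab = sorted({w for s in corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))
for s in corpus:
    words = s.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 2), min(len(words), i + 3)):
            if j != i:
                counts[idx[w], idx[words[j]]] += 1

def cosine(u, v):
    """Cosine similarity between two co-occurrence vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Words sharing contexts (here "the", "chased") get nonzero similarity.
sim = cosine(counts[idx["cat"]], counts[idx["mouse"]])
```

Real distributional models differ in scale and weighting (e.g. PMI reweighting, dimensionality reduction), but the underlying idea is this: meaning is approximated by context statistics.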
University of Minnesota
Vipin Kumar is a Regents Professor at the University of Minnesota, where he holds the William Norris Endowed Chair in the Department of Computer Science and Engineering. Kumar received the B.E. degree in Electronics & Communication Engineering from Indian Institute of Technology Roorkee (formerly, University of Roorkee), India, in 1977, the M.E. degree in Electronics Engineering from Philips International Institute, Eindhoven, Netherlands, in 1979, and the Ph.D. degree in Computer Science from University of Maryland, College Park, in 1982. Kumar's current research interests include data mining, high-performance computing, and their applications in Climate/Ecosystems and Biomedical domains. Kumar is the Lead PI of a 5-year, $10 Million project, "Understanding Climate Change - A Data Driven Approach", funded by the NSF's Expeditions in Computing program that is aimed at pushing the boundaries of computer science research. He is author of over 300 research papers/articles and 11 widely used textbooks. Kumar's foundational research in data mining and its applications to scientific data was honored by the ACM SIGKDD 2012 Innovation Award, which is the highest award for technical excellence in the field of Knowledge Discovery and Data Mining (KDD).
This talk will present an overview of research being done in a large interdisciplinary project on the development of novel data-driven approaches that take advantage of the wealth of climate and ecosystem data now available from satellite and ground-based sensors, the observational record for atmospheric, oceanic, and terrestrial processes, and physics-based climate model simulations. These information-rich datasets offer huge potential for monitoring, understanding, and predicting the behavior of the Earth's ecosystem and for advancing the science of global change. This talk will discuss some of the challenges in analyzing such datasets and our early research results.
Université catholique de Louvain
Yurii Nesterov is a Russian mathematician and an internationally recognized expert in convex optimization, especially in the development of efficient algorithms and numerical optimization analysis. He is currently a professor at the Université catholique de Louvain (UCL). He has been associated with Moscow State University, the Central Economic-Mathematical Institute of the Russian Academy of Sciences, and UCL, where he has worked since 1993 in the Department of Mathematical Engineering of the Polytechnic School of Louvain and at the Center for Operations Research and Econometrics. He has received the Dantzig Prize and the John von Neumann Theory Prize. Nesterov is best known for his work in convex optimization.
We provide the Frank-Wolfe (conditional gradient) method with a convergence analysis that allows approaching a primal-dual solution of a convex optimization problem with a composite objective function. Additional properties of the complementary part of the objective (strong convexity) significantly accelerate the scheme. We also justify a new variant of this method, which can be seen as a trust-region scheme applying a linear model of the objective function. Our analysis also works for a quadratic model, allowing us to justify the global rate of convergence of a new second-order method. To the best of our knowledge, this is the first trust-region scheme supported by a worst-case complexity bound.
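As background for the method named in the abstract, here is the basic textbook Frank-Wolfe (conditional gradient) iteration on a toy problem. The quadratic objective, the simplex feasible set, and all names below are illustrative assumptions, not the talk's composite setting; the primal-dual and trust-region variants the talk analyzes go beyond this sketch.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, iters=200):
    """Basic Frank-Wolfe: repeatedly minimize a linear model of the
    objective over the feasible set (via the linear minimization
    oracle `lmo`), then step toward that minimizer."""
    x = x0.astype(float)
    for k in range(iters):
        g = grad(x)
        s = lmo(g)                 # argmin over the feasible set of <g, s>
        gamma = 2.0 / (k + 2.0)    # standard step size; gives an O(1/k) gap
        x = (1 - gamma) * x + gamma * s
    return x

# Toy instance: minimize ||x - b||^2 over the probability simplex,
# whose linear minimization oracle is a coordinate (vertex) vector.
b = np.array([0.2, 0.5, 0.3])
grad = lambda x: 2 * (x - b)

def simplex_lmo(g):
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

x = frank_wolfe(grad, simplex_lmo, np.ones(3) / 3)
```

Note that each iterate is a convex combination of simplex vertices, so feasibility is maintained without any projection step; this projection-free structure is the method's main appeal.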
Krithi Ramamritham is a professor in the Department of Computer Science and Engineering at IIT Bombay. His research explores timeliness and consistency issues in computer systems, in particular databases, real-time systems, and distributed applications. His recent work addresses these issues in the context of dynamic data in sensor networks, embedded systems, mobile environments, and the web, and has also involved the use of Information and Communication Technologies to create tools for socio-economic development. He obtained a Ph.D. in Computer Science from the University of Utah in 1981, after a B.Tech. in Electrical Engineering (1976) and an M.Tech. in Computer Science (1978), both from the Indian Institute of Technology Madras.
These days, unless something carries the epithet "smart", it counts for little. Smart Energy solutions promise cleaner, cheaper, and more reliable energy. Smart Cities promise a better quality of life for their citizens. We will argue that for a "system" to be SMART, it should Sense Meaningfully, Analyze, and Respond Timely. Using real-world examples from the domains of Smart Energy and Smart Cities, this talk will illustrate the central role of data in being SMART.
Submission deadline: 20th October 2015
Paper decisions: 15th December 2015
Final camera-ready: 21st January 2016
Conference: 13th-16th March 2016