AAII Seminar: May 20, 2020

When:  9am PT
How:  Zoom at https://ucsc.zoom.us/j/272379932
Title: SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
Abstract: SimCLR is a simple framework for contrastive learning of visual representations. It simplifies recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
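The contrastive objective at the heart of SimCLR is the normalized temperature-scaled cross-entropy (NT-Xent) loss. As a rough illustration, here is a toy pure-Python sketch (not the paper's implementation; the function names and the batch layout with the two views of each example at adjacent even/odd indices are our own conventions):

```python
import math

def cosine_sim(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent_loss(z, tau=0.5):
    """NT-Xent loss over 2N projected embeddings z, where z[2k] and
    z[2k+1] are the two augmented views of example k. Each view is
    pulled toward its positive partner and pushed away from the other
    2N - 2 embeddings in the batch."""
    n = len(z)
    loss = 0.0
    for i in range(n):
        j = i + 1 if i % 2 == 0 else i - 1  # index of the positive pair
        pos = math.exp(cosine_sim(z[i], z[j]) / tau)
        denom = sum(math.exp(cosine_sim(z[i], z[k]) / tau)
                    for k in range(n) if k != i)
        loss += -math.log(pos / denom)
    return loss / n
```

When the two views of each example align and negatives are dissimilar, the loss is small; the sketch also makes clear why larger batches help, since every extra example contributes more negatives to the denominator.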

Bio: Ting Chen is a research scientist on the Google Brain team. His main research interests are representation learning, discrete structures, and generative modeling. He joined Google in 2019 after finishing his PhD at the University of California, Los Angeles.

AAII Seminar: May 6, 2020

Held Jointly with the Astrophysics Machine Learning Club — mlclub.net

When/where: May 6, 2020 at 9am PT;   Zoom only — https://ucsc.zoom.us/j/272379932

Talk 1:
Title: Quasar continua predictions with neural spline flows
Speaker: David Reiman (UCSC)

Talk 2:
Title: QuasarNET
Speaker: James Farr (UCL)

AAII Seminar: April 8, 2020

Held Jointly with the Astrophysics Machine Learning Club — mlclub.net

When/where: April 8, 2020 at 12pm PT;   Zoom only — https://ucsc.zoom.us/j/611217103

Title: Neural Networks with Euclidean Symmetry for Physical Sciences
Speaker: Tess E. Smidt (LBNL)

Abstract: Neural networks are built for specific data types, and assumptions about those data types are built into the operations of the network. For example, fully connected layers assume the components of a vector input are independent, and 2D convolutional layers assume that the same features can occur anywhere in an image. In this talk, I show how to build neural networks for the data types of physical systems, geometry and geometric tensors, which transform predictably under rotation, translation, and inversion — the symmetries of 3D Euclidean space. These are traditionally challenging representations to use with neural networks because coordinates are sensitive to 3D rotations and translations and there is no canonical orientation for physical systems. I present a general neural network architecture that naturally handles 3D geometry and operates on the scalar, vector, and tensor fields that characterize physical systems. Our networks are locally equivariant to 3D rotations and translations at every layer. In this talk, I describe how the network achieves these equivariances and demonstrate its capabilities on simple tasks. I also present applications of Euclidean neural networks to quantum chemistry and geometry generation using our Euclidean equivariant learning framework, e3nn.
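To make the idea of Euclidean symmetry concrete, here is a minimal sketch (our own toy example, not e3nn) of the simplest kind of symmetry-respecting feature: scalars built from pairwise distances, which are invariant under rotations, translations, and inversion. Full Euclidean networks like e3nn go much further, producing equivariant vector and tensor outputs as well:

```python
import math

def rotate_z(points, theta):
    """Rotate a list of 3D points about the z axis by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def invariant_features(points):
    """Scalar features built only from sorted pairwise distances.
    Such scalars are unchanged by any rotation, translation, or
    inversion of the point cloud."""
    dists = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dists.append(math.dist(points[i], points[j]))
    return sorted(dists)
```

Rotating the input point cloud leaves these features bit-for-bit unchanged (up to floating-point error), which is exactly the property a coordinate-based representation lacks.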

AAII Seminar: March 18, 2020

When/where: March 18, 2020 at 12pm PT;   Zoom only — https://ucsc.zoom.us/j/562952785

Title: Deep Learning for Predicting Domain Prices
Speaker: Jason Ansel, Distinguished Engineer at GoDaddy

Learn how GoDaddy uses neural networks to predict the price of a
domain name in the aftermarket. GoDaddy Domain Appraisals (GoValue) is
available to millions of GoDaddy customers and provides estimated
values to help both buyers and sellers more effectively price domain
names. GoValue is 1.25x better at predicting past domain name sale
prices than human experts.

This talk will explain the hybrid recurrent neural networks behind
GoValue. It will discuss some of the practical aspects of scaling and
deploying a sophisticated machine learning system. Finally, we will
dive into recent research at GoDaddy that created a new neural network
structure for outputting tighter prediction intervals than preexisting
techniques.
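GoDaddy's interval method is the company's own research and is not described in this announcement; as background, one standard way to make a network output prediction intervals is quantile regression with the pinball loss. A minimal sketch (the function names are illustrative):

```python
def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss: minimizing it over a dataset drives
    y_pred toward the q-th quantile of the target distribution."""
    diff = y_true - y_pred
    return max(q * diff, (q - 1) * diff)

def interval_loss(y_true, lo, hi, alpha=0.1):
    """Train two output heads at quantiles alpha/2 and 1 - alpha/2 to
    obtain a (1 - alpha) prediction interval [lo, hi]."""
    return (pinball_loss(y_true, lo, alpha / 2)
            + pinball_loss(y_true, hi, 1 - alpha / 2))
```

The asymmetry of the loss is the key design point: at q = 0.9, underpredicting is penalized nine times more heavily than overpredicting, so the fitted output settles at the 90th percentile rather than the mean.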

Try GoDaddy Domain Appraisals for yourself:
https://www.godaddy.com/domain-value-appraisal


AAII Seminar: March 4, 2020

When/where: 12pm/E2-215 (pizza will be provided!)

Presenter: Erica Rutter (UC Merced; Applied Math)

Title: Methods for Few-Shot Biomedical Image Segmentation and Learning Equations from Data
Abstract: Machine learning methods have had a powerful impact in the field of biomedical image segmentation and analysis. Traditionally, these machine learning methods have had two drawbacks: they do not ensure contiguity of a segmented object, and they require many expensive manually annotated images to train. I will present a novel methodology for image segmentation by reformulating the machine learning method to trace the boundary of an object, the way a human would. We explore the accuracy of the method on benchmark cell segmentation datasets as well as in-house ‘real’ data. We further compare the ability of the method to work in the low-data limit and show that our method can be used to generate training data for more sophisticated segmentation methods, thereby reducing the burden of human annotation. I will then explore a hybrid framework that combines elements from dynamical systems and machine learning to analyze the quantifiable data generated from such image segmentations. In particular, I will present a robust method for learning underlying dynamical systems from noisy spatiotemporal data, drawing from examples in cell migration and cancer.
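The "learning equations from data" theme can be illustrated with a toy sparse-regression example in the spirit of SINDy-style methods. This sketch is our own and uses a fixed two-term candidate library; the speaker's method is designed to be robust to noisy spatiotemporal data:

```python
def learn_dynamics(xs, dxs, thresh=0.05):
    """Fit dx/dt ~ c1*x + c2*x**2 by least squares over a small
    candidate library [x, x**2], then zero out negligible coefficients
    (the 'sparse' step that selects which terms belong in the model)."""
    # Normal equations for the two-term library
    a11 = sum(x * x for x in xs)
    a12 = sum(x ** 3 for x in xs)
    a22 = sum(x ** 4 for x in xs)
    b1 = sum(x * d for x, d in zip(xs, dxs))
    b2 = sum(x * x * d for x, d in zip(xs, dxs))
    det = a11 * a22 - a12 * a12
    c1 = (b1 * a22 - b2 * a12) / det
    c2 = (a11 * b2 - a12 * b1) / det
    return [c if abs(c) > thresh else 0.0 for c in (c1, c2)]
```

Given noise-free samples of logistic growth, dx/dt = x - 0.5 x^2, the regression recovers the governing coefficients exactly; the interesting (and hard) part, and the subject of the talk, is doing this reliably when the measurements are noisy.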

AAII Seminar: February 26, 2020

When/where: 12pm/E2-215 (pizza will be provided!)

Presenter: Vanessa Boehm (LBNL)

Title:  The powers and pitfalls of deep generative models in scientific applications

Abstract: Of all machine learning methods, generative models are particularly interesting for scientific applications because of their probabilistic nature and ability to fit complex data and probability distributions. However, in their vanilla form, generative models have a number of shortcomings and failure modes which can be a hindrance to their application: they can be difficult to train on high-dimensional data, and they can fail in crucial tasks such as outlier detection, correct uncertainty estimation, or the generation of realistic artificial data. In my talk I will explore the reasons for these failures and propose new generative models and generative-model-based approaches that are robust to these shortcomings. The proposed approaches are easy to train and validate, numerically stable, and do not require fine-tuning. They should thus be particularly well suited for scientific applications. I will demonstrate how these approaches can be used for scientifically relevant tasks such as realistic data generation, probabilistic inference on corrupted data, and outlier detection.

AAII Seminar: January 29, 2020

Where/when:  E2-215  (pizza will be provided!)

Presenter:  David Shih (Rutgers University)

Title: “New Approaches to Anomaly Detection Inspired by High Energy Physics”

Abstract: With an enormous and complex dataset, and the ability to cheaply generate realistic simulations, the Large Hadron Collider (LHC) provides a novel arena for machine learning and artificial intelligence. In this talk I will give an overview of a number of recently proposed methods for anomaly detection that were motivated by the search for new particles and forces at the LHC. This includes methods based on autoencoders, weak supervision, density estimation and simulation-assisted reweighting. I will also summarize the status of the ongoing “LHC Olympics 2020” anomaly detection data challenge, where many of these techniques are being applied to “black box” datasets by a number of groups from around the world. Finally, while these methods are being developed in the context of high energy physics, I will attempt to highlight the ways in which they are general and could be applied to data-driven discovery in other branches of science and beyond.
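As a cartoon of the density-estimation approach mentioned above (a toy 1-D Gaussian fit of our own, far simpler than the flexible neural density estimators used in practice at the LHC), one can fit a background-only density and score events by how unlikely the model finds them:

```python
import math

def fit_gaussian(samples):
    """Fit a 1-D Gaussian density to background-only samples."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

def anomaly_score(x, mu, var):
    """Negative log-likelihood under the fitted background density;
    large values flag events the background model considers unlikely."""
    return 0.5 * math.log(2 * math.pi * var) + (x - mu) ** 2 / (2 * var)
```

The same logic, with the Gaussian replaced by a learned high-dimensional density, underlies unsupervised searches: events assigned very low probability by the background model become anomaly candidates.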


AAII Seminar: January 22, 2020

When/Where:   E2-215 at 12pm on January 22nd, 2020

Presenter:   Ioannis “Yianni” Anastopoulos  (UC Santa Cruz)

Title: Generalizing Drug Response from Cell Lines to Patients

Abstract:
Cancer treatment poses a unique challenge in part due to the heterogeneous nature of the disease. It is estimated that a staggering 75% of cancer patients do not respond to any type of cancer treatment. Preclinical models, such as cell lines and patient-derived xenografts, have been used to better understand the genes and pathways that contribute to tumorigenesis, as well as to identify markers of treatment response. For my thesis project, I am leveraging deep learning techniques to generalize drug response prediction from preclinical models to patients.

Standard machine learning methods have modest prediction accuracy and are challenged by the ever-increasing dimensionality of available data. Deep learning solves many shortcomings of previous methods and has been used successfully to advance drug design, drug-target profiling, and drug repositioning. In my preliminary results, I have shown that a deep learning model incorporating transcriptome and drug structure information achieves competitive drug response prediction performance on the Cancer Cell Line Encyclopedia (CCLE) dataset. I plan to extend this work to enable generalization to future drug compounds.

AAII Seminar: January 15, 2020

We kick off the New Year with a seminar at 12pm in E2-215, or by Zoom, given by:

Speaker:  Rich Caruana (Microsoft)


Title: Friends Don’t Let Friends Use Black-Box Models: The Importance of Interpretability in Machine Learning

Abstract: Every data set is flawed, often in ways that are unanticipated and difficult to detect. If you can’t understand what your model learned from the data, your model probably is less accurate than it could be, and might even be risky to deploy. Unfortunately, historically there has been a tradeoff between accuracy and intelligibility: accurate models such as deep neural nets, boosted trees and random forests are not very intelligible, and intelligible models such as linear regression and decision trees usually are less accurate. In mission-critical domains such as healthcare, where being able to understand, validate, edit and ultimately trust a model is important, one often had to choose less accurate models. But this is changing. We have developed a learning method based on generalized additive models with pairwise interactions (GA2Ms) that is as accurate as full-complexity models, yet even more interpretable than linear regression. In this talk I’ll show how interpretable, high-accuracy machine learning is helping us discover what our models learned and uncover flaws lurking in our data. I’ll also show how we’re using these models to uncover bias where fairness and transparency are important. Code for GA2Ms is available at https://github.com/interpretML.
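As a rough illustration of why additive models are interpretable, here is a toy backfitting sketch for a plain GAM. This is our own simplification: it uses per-value lookup tables as shape functions and omits the pairwise interaction terms that distinguish GA2Ms; the real implementation is in the interpretML code linked in the abstract:

```python
def fit_additive_model(X, y, n_rounds=10):
    """Crude backfitting for an additive model: each feature gets a
    one-dimensional 'shape function' (here, a per-value lookup of mean
    residuals), so every feature's contribution to a prediction can be
    read off and inspected directly."""
    n, d = len(X), len(X[0])
    base = sum(y) / n
    shapes = [dict() for _ in range(d)]  # feature value -> contribution

    def predict(x):
        return base + sum(shapes[j].get(x[j], 0.0) for j in range(d))

    for _ in range(n_rounds):
        for j in range(d):
            # Refit feature j's shape on residuals with its own
            # current contribution removed.
            groups = {}
            for x, t in zip(X, y):
                r = t - (predict(x) - shapes[j].get(x[j], 0.0))
                groups.setdefault(x[j], []).append(r)
            shapes[j] = {v: sum(rs) / len(rs) for v, rs in groups.items()}
    return base, shapes, predict
```

Because the model is a sum of one-dimensional terms, each entry of `shapes[j]` is exactly that feature value's contribution to the prediction, which is what makes plotting and auditing the learned shape functions possible.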

Bio: Rich Caruana is a Senior Principal Researcher at Microsoft. His research focus is on intelligible/transparent modeling, machine learning for medical decision making and computational ecology, and deep learning. Before joining Microsoft, Rich was on the faculty in Computer Science at Cornell, at UCLA’s Medical School, and at CMU’s Center for Learning and Discovery. Rich’s Ph.D. is from CMU, where he worked with Tom Mitchell and Herb Simon. His thesis on Multitask Learning helped create interest in the subfield of machine learning called Transfer Learning. Rich received an NSF CAREER Award in 2004 (for Meta Clustering), best paper awards in 2005 (with Alex Niculescu-Mizil), 2007 (with Daria Sorokina), and 2014 (with Todd Kulesza, Saleema Amershi, Danyel Fisher, and Denis Charles), and co-chaired KDD in 2007 with Xindong Wu.

