Month: May 2019

AAII Seminar: May 29, 2019

When/where:  E2-215 at 12pm

Presenter: Jaehoon Lee (Google Brain)

Title:  Everything you wanted to know about batch size (in neural net training) but were afraid to ask

Abstract: Recent hardware developments have made unprecedented amounts of data parallelism available for accelerating neural network training. Among the simplest ways to harness next-generation accelerators is to increase the batch size in standard mini-batch neural network training algorithms. In this work, we aim to experimentally characterize the effects of increasing the batch size on training time, as measured in the number of steps necessary to reach a goal out-of-sample error. Eventually, increasing the batch size will no longer reduce the number of training steps required, but the exact relationship between the batch size and how many training steps are necessary is of critical importance to practitioners, researchers, and hardware designers alike. We study how this relationship varies with the training algorithm, model, and data set and find extremely large variation between workloads. Along the way, we reconcile disagreements in the literature on whether batch size affects model quality. Finally, we discuss the implications of our results for efforts to train neural networks much faster in the future.

Reference: https://arxiv.org/abs/1811.03600

 

AAII Seminar: May 22, 2019

When/where:  E2-215 at 12pm

Presenter: David Haan (PBSE)

Title:  LURE (Learning UnRealized Events): Finding New (or Equivalent) Driver Mutation Events using Supervised Machine Learning

Abstract:  Cancer is a genetic disease typically resulting from an accumulation of mutations.  Mutations in normal cells generally result in repair or cell suicide.  In cancer cells, the mutations accumulate, leading to uncontrolled growth, otherwise known as a tumor.  There are two broadly defined types of mutations: driver and passenger mutations.  Tumors contain around 2-5 driver mutations, which cause and accelerate cancer, and about 10-200 passenger mutations, which are accidental byproducts resulting from thwarted DNA repair mechanisms.  The driver mutations define the tumor and its subtype, and they are therapeutic targets.

The Cancer Genome Atlas (TCGA) is a publicly accessible atlas of cancer-related data from the National Cancer Institute (NCI).  This atlas is a comprehensive analysis of 9000 patients and 33 cancer subtypes, cataloging mutation data, DNA, mRNA, methylation, and protein expression.  In particular, the TCGA study of Papillary Thyroid Carcinoma identified two subtypes: one harboring mutations in BRAF and the other more RAS-like, with mutations in KRAS, NRAS, or HRAS.  The study identified driver mutations, whether BRAF or H/K/NRAS, in about 95% of the samples, leaving about 5% with no known driver mutations.  Here we present a tool, “Learning UnRealized Events” (LURE), designed to identify driver mutations in those samples without known driver mutations.

AAII Seminar: May 15, 2019

When/where:  E2-215 at 12pm

Presenter:  J. Xavier Prochaska (Astronomy & Astrophysics)

Title:  Activation Atlas

Abstract:  While conventional wisdom still holds that Deep Learning techniques are largely impenetrable “black boxes”, there is a growing effort to peer into the box.  I will describe a few of the initial efforts and then focus on the Activation Atlas built by researchers at Google and OpenAI.  I have been willing to state out loud that I find this akin to peering into the brain of the network.

Here is the link to the primary paper:  https://distill.pub/2019/activation-atlas/

Pizza will be provided.

AAII Seminar: May 8, 2019

No formal presentation this week.  Those who attend are encouraged to bring ~5 min of material to discuss: their own research, a paper that has excited them, etc.

Pizza will be provided.
