
MLonMB Conference @ UCSC: November 10, 2021

See the main website for more!

We look forward to MLonMB on November 10, 2021 from 9am-3:30pm at the UCSC Hay Barn (coffee and fruit will be available from 8am-9am). Our last requests of you prior to the meeting are described below.  They include input on your dietary requirements, parking, and a bit more.

Logistical info

This Google doc describes logistics like parking and COVID protocol.  We will supplement it when asked.

Please fill out this form to provide us with your dietary information and other salient info.  We request this information by October 20th.

Introductory Slide

We are asking each participant to provide a single slide introducing themselves.  At a minimum, please provide a photo, a brief bio, and a few interesting tidbits related to your attendance at the meeting.

Here is an example from one of the organizers.

You can copy that Google Slide and edit it, or generate your own PDF with any software.  We will compile the slides from all attendees into a single deck and share it.

We request this slide by November 1st.  Send it to cjhangen@ucsc.edu with subject “MLonMB slide”.

Agenda

A draft of the agenda is below.  It emphasizes interaction, i.e. meeting each other and holding informal discussions.

Talks

Time | Activity | Participants/Notes
8-9am | Coffee/fruit | All
9-9:05am | Welcome (refer to attendee intro slide packet; intro to Idea Boards) | All
9:05-9:30am | Introductory Bingo | All
9:30-10:30am | Data Talks (10 x 6 minutes): who I am and represent; data I have; why I obtain it; what I aim to do with it | Capture on Idea Boards
10:30-10:48am | Break |
10:48am-12pm | ML Talks (12 x 6 minutes): who I am and represent; technical expertise; what it is useful for; data I’d love to work on | All
12-1pm | Lunch; review Idea Boards | All; Organizers
1-2pm | Results from Idea Boards; break into smaller groups | Capture on Idea Boards focused on “areas of interest”
2-3pm | Idea group discussion / walk-around? |
3:15-3:30pm | Wrap-Up + Future engagement | Avoid Friday!
3:30pm-end | Collaboration building: hike (for those who want), otherwise at site |

AAII Seminar: March 31, 2021

When: 9am PT, noon ET, 6pm CET
How: http://MLclub.net
Who:

  • Helena Dominguez-Sanchez (ICE)
  • Sandra M. Faber (UCSC)
  • Josh Peek (STScI)
  • Simon D.M. White (MPA)

Title: “Galaxy Morphology in the machine learning era: revolution or incremental science?”
Abstract: An online debate on whether Machine Learning is revolutionizing the study of galaxy morphology or advancing it only incrementally.

AAII Seminar: March 17, 2021

When: 9am PT, noon ET, 5pm CET
How: http://MLclub.net
Who:

  • Gary Bernstein (UPenn)
  • Olivier Ilbert (LAM, Marseille)
  • Alex Malz (Ruhr U., Bochum)
  • Emmanuel Bertin (IAP, Paris)

Title: “Will Machine Learning solve photometric redshifts?”
Abstract: An online debate on the challenges of photometric redshift estimation and the role Machine Learning can play in wide-field surveys.

AAII Seminar: March 3, 2021

When: 9am PT, noon ET, 6pm CET
How: http://MLclub.net
Who:

  • Julia Kempe (NYU Center for Data Science)
  • David Spergel (Flatiron Institute)
  • Alex Szalay (Johns Hopkins Univ.)
  • J. Xavier Prochaska (UC Santa Cruz, AAII)

Title: “How should ML penetrate the natural sciences? Do we need ML institutes?”
Abstract: An online debate on the role of ML institutes / Data Science centers in research and education.

 

 

AAII Seminar: March 18, 2020

When/where: March 18, 2020 at 12pm PT;   Zoom only — https://ucsc.zoom.us/j/562952785

Title: Deep Learning for Predicting Domain Prices
Speaker: Jason Ansel, Distinguished Engineer at GoDaddy

Learn how GoDaddy uses neural networks to predict the price of a
domain name in the aftermarket. GoDaddy Domain Appraisals (GoValue) is
available to millions of GoDaddy customers and provides estimated
values to help both buyers and sellers more effectively price domain
names. GoValue is 1.25x better at predicting past domain name sale
prices than human experts.

This talk will explain the hybrid recurrent neural networks behind
GoValue. It will discuss some of the practical aspects of scaling and
deploying a sophisticated machine learning system. Finally, we will
dive into recent research at GoDaddy that created a new neural network
structure for outputting tighter prediction intervals than preexisting
techniques.

Try GoDaddy Domain Appraisals for yourself:
https://www.godaddy.com/domain-value-appraisal

 

AAII Seminar: May 29, 2019

When/where:  E2-215 at 12pm

Presenter: Jaehoon Lee (Google Brain)

Title:  Everything you wanted to know about batch size (in neural net training) but were afraid to ask

Abstract: Recent hardware developments have made unprecedented amounts of data parallelism available for accelerating neural network training. Among the simplest ways to harness next-generation accelerators is to increase the batch size in standard mini-batch neural network training algorithms. In this work, we aim to experimentally characterize the effects of increasing the batch size on training time, as measured in the number of steps necessary to reach a goal out-of-sample error. Eventually, increasing the batch size will no longer reduce the number of training steps required, but the exact relationship between the batch size and how many training steps are necessary is of critical importance to practitioners, researchers, and hardware designers alike. We study how this relationship varies with the training algorithm, model, and data set and find extremely large variation between workloads. Along the way, we reconcile disagreements in the literature on whether batch size affects model quality. Finally, we discuss the implications of our results for efforts to train neural networks much faster in the future.

Reference: https://arxiv.org/abs/1811.03600

 

AAII Seminar: May 22, 2019

When/where:  E2-215 at 12pm

Presenter: David Haan (PBSE)

LURE (Learning UnRealized Events): Finding New (or Equivalent) Driver Mutation Events using Supervised Machine Learning

Cancer is a genetic disease typically resulting from an accumulation of mutations.  Mutations in normal cells generally result in repair or cell suicide. In cancer cells, the mutations accumulate, leading to an uncontrolled growth otherwise known as a tumor. There are two broadly defined types of mutations: driver and passenger mutations.  Tumors contain around 2-5 driver mutations, which cause and accelerate cancer, and about 10-200 passenger mutations, which are accidental byproducts of thwarted DNA repair mechanisms. The driver mutations define the tumor and its subtype, and they are therapeutic targets.

The Cancer Genome Atlas (TCGA) is a publicly accessible atlas of cancer-related data from the National Cancer Institute (NCI).  It is a comprehensive analysis of 9000 patients and 33 cancer subtypes, cataloging mutation data, DNA, mRNA, methylation, and protein expression.  In particular, the TCGA study of Papillary Thyroid Carcinoma identified two subtypes, one harboring mutations in BRAF and the other more RAS-like, with mutations in KRAS, NRAS, or HRAS.  The study identified driver mutations, whether BRAF or H/K/NRAS, in about 95% of the samples, leaving about 5% with no known driver mutations. Here we present a tool, “Learning UnRealized Events” (LURE), designed to identify driver mutations in those samples without known driver mutations.

AAII Seminar: May 8, 2019

No formal presentation this week.  Those who attend are encouraged to bring ~5 min of material to discuss, such as their own research or a paper that has excited them.

Pizza will be provided.

AAII Seminar: May 1, 2019 at 12pm in E2-215

Speaker: Majid Moghadam (CS)
Title: Tactical Decision Making for Autonomous Driving Using Deep Reinforcement Learning
Abstract:  Following the recent advances in AI, autonomous driving has gained considerable attention in both academia and industry. The classical paradigm for autonomous driving is a hierarchical architecture of perception, planning, and control, but recent progress in deep learning points to AI-based approaches as alternative solutions to the problem. Companies are pushing hard to produce the first fully autonomous self-driving cars. Various approaches, ranging from end-to-end deep learning techniques to multi-layer hierarchical architectures, are being taken to achieve this goal. In most of these approaches, advanced driving assistance systems (ADAS) play a pivotal role in enhancing the driving intelligence. Our work is mostly focused on the decision-making layer of ADAS. High-level decision making is a critical feature for ADAS that involves several challenges, such as uncertainty in other drivers’ behaviors and the trade-off between safety and agility. In this work, we develop a novel simulation environment that emulates these challenges and train a deep reinforcement learning agent that yields consistent performance in a variety of dynamic and uncertain traffic scenarios.
Pizza will be provided.
