Gearing Up for Data Science
Eight years ago, venture capitalist Geoffrey Moore tweeted this prescient assertion: “Without big-data analytics, companies are blind and deaf.”
Increasingly, data runs our lives. It allows companies to compete, drives web advertising and supports policy making. It can lead to improved performance by farmers, airlines, athletes, highway systems and banks. It fuels artificial intelligence and the Internet of Things, and catalyzes scientific discovery on many levels.
But, on its own, data is just raw information. It takes people — data scientists, in particular — to give meaning to the numbers. With their unique skillsets, data scientists are able to extract knowledge and glean insights from the extensive datasets that society now generates every day. As a result, data science is, increasingly, where the jobs are.
To answer the growing demand for data scientists, the National Science Foundation (NSF) has awarded UC Santa Barbara a three-year, nearly $920,000 grant to fund what is known as the Central Coast Data Science Initiative. The program will support coursework and project-based classes at the community college and undergraduate levels at four partnering colleges and universities.
The initiative will include classes and student projects as well as collaboration among data scientists. Most of the funds will go toward undergraduate fellowships for these new data scientists, explained Tim Robinson, an academic coordinator in the Department of Computer Science. UC Santa Barbara plans to provide 65 fellowships over a two-year period.
“The overall idea is to build a culture around data science on campus,” said computer science professor Ambuj Singh, who leads the university’s working group on data science and is the principal investigator on the NSF project. The other principal investigators on the UC Santa Barbara team are Mike Ludkovski, professor and chair of statistics and applied probability; Alexander Franks and Sang-Yun Oh, assistant professors of statistics and applied probability; and computer scientist Yekaterina Kharitonova.
Unlike most other disciplines, which are defined by a set of common topics of interest, data science is a collection of tools and techniques that enable people to glean patterns, insights and knowledge from data. It encompasses aspects of mathematics, statistics and computer science and finds application in nearly every pursuit imaginable.
As a result, demand for data science courses and projects has soared among students. For instance, the number of majors in the Department of Statistics and Applied Probability (PSTAT) has more than tripled in the past six years, in part due to the surge of interest in data science.
“Our students desperately want research projects,” said Ludkovski. The NSF grant will enable the university to expand its offerings to meet this burgeoning interest.
The computer science and PSTAT departments are collaborating to establish a year-long, three-course capstone project in data science, similar to PSTAT’s highly successful actuarial science capstone project and a very popular capstone course sequence in computer science. The curriculum will involve small student teams and industry partnerships. A mix of faculty from the two departments will teach the series, which starts in fall 2020, Ludkovski said.
The university’s interdepartmental effort reflects the interdisciplinary nature of data science itself. “You can situate data science in a single department and let that department expand, or you can create something new,” said Singh. “We’re leaning toward the latter approach. It requires more work up front, but we hope that a broader perspective will bring together faculty and students from different streams and departments, resulting in a more unified approach.”
Joining UC Santa Barbara on the grant are Cal Poly San Luis Obispo, California State University San Bernardino and Santa Barbara City College. The NSF grant totals more than $1.2 million across the four institutions.
“The program will provide a pathway for a number of community college students to explore and move into a four-year university,” said Robinson. “It also connects the more practically minded Cal State University system with the research-oriented UC system.”
The data science consortium is part of a growing emphasis UC Santa Barbara is placing on data science. For instance, the Bren School of Environmental Science & Management will soon offer a Master of Environmental Data Science in addition to its existing Master of Environmental Science and Management. The inaugural class will begin in fall 2021.
In addition, close to 200 students participate in the campus’s very active student-run data science club, which organizes quarterly data science projects and hosts workshops and speakers. The organization provides students from various departments a place to grow their data science skills, regardless of how much or how little they know coming in.
“Working with such great professors and having such great classmates and mentors in the club made me realize how much I like working with data,” said Natalie Rozak, a senior statistics major and one of the club’s project group directors. “It gets applied in every field, so it’s very marketable as a career.”
Rozak, who has a job with the chemical company BASF waiting for her after she graduates, is excited about the new initiative and the capstone projects it supports. It’s fantastic that the series will include both theory and hands-on experience, she said, since both are important in learning how to be a data scientist.
The faculty is equally enthusiastic. “With the NSF grant, we will provide data science fellowships for over 30 undergraduates per year,” said Singh, who added that he’s looking forward to working with the first class of fellows.