West Hub Collaborates with UC Berkeley for Data Science Education Workshop

By Arnav Gupta, San Diego Supercomputer Center REHS Intern

The sixth annual National Workshop on Data Science Education recently took place at UC Berkeley. UC Berkeley's Division of Computing, Data Science, and Society led the mid-June event with support from Microsoft and the West Big Data Innovation Hub.

Academic institutions from across the country – ranging from secondary schools to community colleges and four-year universities – were represented at the workshop. Attendees collaborated on creating a cohesive educational ecosystem for undergraduates studying data science.

The workshop featured two panels organized by the West Hub. Both were designed to present current work and a vision for the future.

The first panel – Project Based Learning Case Studies – provided an in-depth look at projects running at UC Berkeley’s Data Science Discovery Program, the University of Washington’s Data Science for Social Good, and the University of Pittsburgh and Caldwell University (DataJam).

Anthony Suen, director of the Data Science Discovery Program at Berkeley, discussed the value of project-based experiential data learning as an essential component of data science education. Suen’s Data Science Discovery program, which has encompassed more than 750 student projects, aims to nurture and accelerate data science research by providing real-world projects for data science students at Berkeley.

Sarah Stone, executive director of the University of Washington’s eScience Institute, introduced the Data Science for Social Good (DSSG) program. DSSG – a laboratory for understanding and applying ethical aspects of data science – includes projects focused on topics such as housing affordability and city planning. Stone also discussed DSSG’s ten-week summer program, which will be accepting project proposals and student fellow applications in January for Summer 2024.

Judy Cameron, founder and director of the University of Pittsburgh’s DataJam, explained how the data science competition works and how it empowers participating youth. DataJam gives high school students a way to answer their own questions and be recognized for their findings, providing a core data science experience while gauging their interest in the field. Resources (datasets, guides, past projects) and training from experienced mentors give students the tools to explore their interests.

The second panel – Data Science Experiential Pathways (DSXP) – described the West Big Data Innovation Hub’s plan to leverage the strengths of these existing projects to create a connected data science learning pathway. The keynote, titled “NSF Perspective on Project-Based Experiential Learning about Data Science Programs,” was delivered by National Science Foundation (NSF) Program Director Jennifer Noll.

According to a preliminary analysis by Noll, NSF has awarded 293 data science education grants since 2007. Nearly a quarter of those were labeled possible project-based learning awards, and most were aimed at undergraduate and graduate students. Few of these awards have targeted teachers or K-12 students, groups that need to develop interest and capacity in learning data science if the pipeline is to expand. More research is also needed into the hows and whys of project-based learning.

The DSXP pathway is designed to address the gaps Noll described.

“The vision for the pathway is to offer an opportunity for students to be engaged in experiential data science learning as they progress from high school to community college/undergraduate to graduate school – with a goal of preparing them for a successful career built on a solid understanding of data science,” West Hub Executive Director Ashley Atkins said. “In addition to data-related technical skills, these students gain translational experience by learning how to communicate with stakeholders and researchers across disciplines. Additionally, they leave with an understanding of how to navigate data ethics considerations like data governance, privacy, security, and potential bias.”

Identified needs included a research community to support the expansion of pathways for future data leaders, as well as the experiential data learning projects described above. Each of the projects, from DataJam to DSSG, is designed to nurture and grow data science skills in students from high school through graduate school.

The DSXP team will be presenting the pathway model at additional conferences and meetings throughout the summer and fall. DSXP leaders are looking to engage with partners to determine how this core connected structure can be leveraged and adapted to better meet pressing needs for communities, sectors, and institutions across the country. Those interested can reach Atkins at

