Community-Engaged Data Science

Short-term 2022

Bates College

Students will work in multi-disciplinary teams on a research problem identified by a community partner from business, industry, or government. Together with community partners, students will work to gain insight from data, building skills in reproducible analysis (literate programming and version control) and collaboration, using modern programming tools and techniques.

This course emphasizes putting knowledge into practice, including going beyond individual fields of study to solve real world problems and understand community partner needs. Students will build skills in project management, using agile methodologies and weekly meetings with community partners designed to foster co-development and iterative and incremental project delivery.

Students will develop their mathematical and programming skills as well as skills and traits valued by employers of STEM professionals, such as teamwork, reproducible analysis, effective communication, independent thinking, and problem solving.

Timetable

Afternoons: Project Work. Morning: Optional Tutorials.

Monday

Sprint planning and sprint review

Tuesday, Wednesday

Project work and meetings with community partners

Thursday

Presentations

Welcome to Community-Engaged Data Science!

Course Schedule

Overview

This is a tentative course schedule. The flow of topics might change slightly depending on how quickly / slowly it feels right to …

Week 1 Project Design

Get acquainted with the projects, the technology, the workflow, and create a personal development plan for the skills you want to …

Week 2 - Exploratory Analysis

Exploratory Data Analysis.

Week 3 - Feedback and Iteration

Get feedback early and often to create a shared vision and meet partner needs

Week 4 - Wrapping Up

This week is about final communication, documenting, and future proofing our analysis so that it can be carried on and continued in the …

Projects

Developing a data pipeline for monitoring temperature and intertidal biodiversity data in the Gulf of Maine.

Developing a data pipeline for monitoring temperature and intertidal marine communities in the Gulf of Maine collected by the …

Mapping Primary Care Providers for Lyme Disease Education Outreach

Maine has high rates of tickborne diseases including Lyme disease, anaplasmosis, and babesiosis. Tickborne diseases are spread to humans …

Parole in Maine - what can we learn from public perceptions, media coverage, and parole outcomes in other states?

What are the public’s attitudes towards parole for violent and non-violent crimes? How is parole for prisoners portrayed through …

Understanding tobacco-use and smoke exposure among immigrants in Maine

AK Health and Social Services is a community based nonprofit organization that provides services to immigrants, refugees, asylum …

Syllabus

Course Description

Hello and welcome to Community-engaged Data Science. This course is intended for students with a strong interest in applied problem-solving. Students will work in multi-disciplinary teams on a research problem identified by a community partner from the local community, industry, or government. Together with community partners, students will work to gain insight from data, building skills in reproducible analysis (literate programming and version control) and collaboration, using modern programming tools and techniques.

This course emphasizes putting knowledge into practice, including going beyond individual fields of study to solve real world problems and understand community partner needs. The goal of this course is not to “complete the project” but rather to explore and reflect on the problem-solving process, including the parts that are messy and don’t go smoothly. Come prepared to engage with your teams and community partners and be ready to roll with the challenges and think creatively about the project.

During the course, students will build skills in project management, using agile methodologies and weekly meetings with community partners designed to foster co-development and iterative and incremental project delivery.

Students will develop their mathematical and programming skills as well as skills and traits valued by employers of STEM professionals, such as teamwork, reproducible analysis, effective communication, independent thinking, and problem solving.

The course content is organized in three units:

  • Week 1 - Project Planning and Scoping: This unit is an introduction to the projects and toolkit of the course. Students will define and identify key project milestones, brainstorm ideas and approaches to solving the problem, and plan the project tasks.
  • Weeks 2 and 3 - Exploring and analyzing the data: This unit focuses on exploring the data and creating a first mock-up of the project for partners.
  • Week 4 - Final Project Presentations and Handover: After completing the exploratory phase, students wrap up the project in a report where they present their findings and next steps to partners.

Learning Objectives

This course is designed as a community learning journey. Together, we will:

  • Use research skills to integrate and apply multiple forms of knowledge to an issue of interest to a community partner.
  • Demonstrate community leadership skills as a collaborator who shares strengths, works on weaknesses, and contributes to a broader shared understanding.
  • Gain experience in data collection, wrangling, visualization, exploratory data analysis, predictive modeling and effective oral and written communication of results to audiences beyond Bates.
  • Build skills valued by employers including teamwork, reproducible analysis, effective communication, independent thinking, and problem solving.
  • Identify areas for personal growth and develop and implement a personal development plan to work towards individual and collaborative goals related to the project.

It is also my hope that in this course you:

  • Cultivate an interdisciplinary understanding of what it means to “do research”, including embracing multiple ways of knowing and doing beyond those of your discipline and the academic environment.
  • Develop an appreciation for reproducibility, transparency, accessibility and inclusivity in data collection, analysis, and communication.
  • Build knowledge and skills in data science to tackle questions that are important to you.
  • Engage and reflect on contemporary issues in environmental and social justice related to your digital world, community and positionality.

Course Model

The central goal of the course is to work with community partners on a jointly-defined research project of benefit to the wider Maine community. While there are specific deliverables for each project (weekly oral presentations, a final written report, and anything else agreed upon with the partner), this course also requires deep listening, respectful and thoughtful feedback, and collaborative engagement within the classroom. You will be evaluated both on the final products you produce and on the process by which you get there: your individual contributions to the project, to group dynamics, and to the collaborative environment of mutual support within the classroom.

Grades

A growing body of research indicates that traditional approaches to grading fail to produce the sorts of meaningful learning desired by both teachers and students. Such approaches often reinforce inequitable power dynamics between teachers and students, promote faulty reward systems that disincentivize creativity and risk-taking, and devalue important aspects of learning (including revision and feedback). Given this context, instead of a traditional approach to grading in which you do work that is evaluated solely by me, this course assumes that you ought to take ownership of and responsibility for your performance and engagement with the class. To make this happen, this course uses a “contract grading” scheme, which gives you a voice in the grading process, provides you with the agency to specify your intended course performance, and lets you share in the responsibility for evaluating whether or not you fulfilled your intended obligations. Please see the contract grading document (on Lyceum) for a more-fleshed-out explanation of this approach and how it will operate in the course.

Course Requirements:

How you choose to engage in this course—both in and out of class—is central to the success of your project and our collective learning. As such, participation constitutes a substantial part of my expectations. A rubric (see Lyceum) explicitly defines what is meant by process components such as substantive feedback and emotional intelligence. Much of this will require individual reflection on your part, and be documented through a midterm and end-of-course peer- and self-evaluation. I will spend time talking about these expectations during class.

Group project:

Five groups of 4 students each will undertake projects with our community partners. Each group will determine how to structure their project; what forms of information are necessary and how to gather them; how to allocate and organize tasks; and how to produce the “deliverables” agreed upon with the community partner.

  • Project plan: In the first week of the course, each project group develops a written plan in consultation with their community partner, the instructors, and their peers. The plan should include a clear statement of particular questions you will address, how you will address them, and project milestones you will work towards. As part of your consultation with project partners, you will also develop a kanban board for the project that breaks the project down into specific tasks that will be added to throughout the project. Kanban boards are a common tool used by data science and software development teams to manage projects. You will use the kanban board to brainstorm, plan and allocate work - both temporally and among yourselves - to complete the project.

The reader of the project plan should understand why you are doing this project, what methods and approaches you will use to accomplish the project goals, and what materials, people, and organizations may be of assistance as you work on the project. The plan should indicate what the final product is for the partner, key milestones, and who the users for your final product will be as well as any additional potential efforts to disseminate the outcomes of your project. Ultimately, the reader should be convinced that you can professionally complete the work you outline in the time-frame indicated. I will give you lots of feedback on your plan and kanban board to help guide the project and give you aspects that will be reusable in your final report.
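
To make the kanban idea concrete, the following is a minimal sketch in Python of a board as columns of task cards. The column names, task titles, and student names are hypothetical and only for illustration; in practice teams usually keep their board in a hosted tool rather than in code.

```python
from dataclasses import dataclass, field

# Column names are an assumption; teams may prefer different stages.
COLUMNS = ("To do", "In progress", "Done")


@dataclass
class Task:
    title: str
    owner: str = "unassigned"  # team member responsible for the task
    sprint: int = 1            # week of the course the task is planned for


@dataclass
class KanbanBoard:
    columns: dict = field(default_factory=lambda: {name: [] for name in COLUMNS})

    def add(self, task, column="To do"):
        """Place a new task card in the given column."""
        self.columns[column].append(task)

    def move(self, title, to_column):
        """Move the card with this title to another column."""
        for cards in self.columns.values():
            for task in cards:
                if task.title == title:
                    cards.remove(task)
                    self.columns[to_column].append(task)
                    return task
        raise KeyError(f"No task titled {title!r}")

    def workload(self):
        """Count unfinished tasks per owner, to check that work is shared."""
        counts = {}
        for name in ("To do", "In progress"):
            for task in self.columns[name]:
                counts[task.owner] = counts.get(task.owner, 0) + 1
        return counts


# Hypothetical tasks and owners, for illustration only.
board = KanbanBoard()
board.add(Task("Draft questions for partner meeting", owner="Student A"))
board.add(Task("Clean temperature logger data", owner="Student B"))
board.add(Task("Sketch first map of sites", owner="Student A", sprint=2))
board.move("Clean temperature logger data", "In progress")
board.move("Draft questions for partner meeting", "Done")

for name, cards in board.columns.items():
    print(f"{name}: {[t.title for t in cards]}")
print(board.workload())
```

Each card carries an owner and a sprint, which mirrors the two ways the plan asks you to allocate work: among yourselves and over time.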

  • Partner Questions and Notes: There will be an opportunity to meet and check in with your partners each week. Ahead of each meeting you will brainstorm a list of questions for the project partners, and during the meeting you will take notes on the answers as well as any actions arising from it. The notetaker role will rotate through the team, and part of your participation grade will be based on the questions you prepare and the notes you take.

  • Mid-project presentation and feedback: Each group will give a presentation (15 minutes) to update the project partner on progress to date, including key findings and accomplishments, and any roadblocks or challenges encountered along the way. This is an opportunity to meet with project partners and get feedback and guidance to inform the final project. During the presentation, groups will note what needs updating in the project and use these notes to update the tasks on the kanban board.

  • Final presentations and feedback: Each group must make a final presentation of their project to their community partner during the last week of class. Groups will negotiate with partners regarding appropriate venue and format, and the timing should be both appropriate for the audience and scheduled so that at least one instructor can attend. Each group will give a ‘dry run’ of their final presentation to the class (the week before), where we will give you substantial feedback ahead of your final presentation. Generous and honest feedback on your peers’ presentations is an essential and expected component of the course.

  • Final Written Report: Each group submits a final written report, including any final products and documented code, to both the community partner and the instructors. This is a primary outcome of the course and should represent top-level writing and professional standards of presentation. The format and structure of reports and the final product may vary, based on the nature of your particular project. All code must be well documented and organized in a way that the project partner can continue to build on the analysis/visualization. In addition to the final group report and final product, each individual student writes a final self- and peer-evaluation. All components – the group report, final products, and the evaluation – will be considered in grading the final written report. The group report will be sent electronically to the instructors. Each group should give their community partners electronic copies of the final report and all supporting documents, including code and references.

Note: Because the final reports from this class are put onto a website (this website!) that can be viewed by the public, you must pay particular attention to copyright issues in preparing your report. Any images you incorporate into the final report must be available in the public domain. Care must also be exercised in the usage and citing of textual material. If you have questions about whether inclusion of something in your report would infringe on copyright laws, you can refer to the following site: http://www.bates.edu/ils/policies/access-use/copyright/. You should also speak with Chris Schiff, Music and Arts Librarian (cschiff@bates.edu), if you have any concerns about intellectual property.

  • Personal Development Plan and Reflections: As part of this project you will submit a plan for personal development. I will meet with each of you one on one during the first week to discuss and form your project goals. Throughout the course there will be weekly reflections on your progress towards these goals.

Key Course Policies:

Communication

Working collaboratively requires cultivating mutual respect and engaging in clear, honest, and thoughtful communication. My aim in this course is to create a collective learning experience animated by these values. I expect the kinds of practices in the classroom that would be expected in any professional situation where a large team and a number of sub-teams are working together with a shared mission and smaller team assignments that serve that larger mission. If you are struggling and unable to meet your commitments to the group, your partner, or the whole class (or if you have feedback for us that might help to improve your experience in the class and in your group work), please communicate these concerns to me directly in a timely manner. It is my intention to ensure that all students have a successful short-term experience.

Course components

Sharing / reusing code

I am well aware that a huge volume of code is available on the web to solve any number of problems. Unless I explicitly tell you not to use something, the course’s policy is that you may make use of any online resources (e.g. StackOverflow), but you must explicitly cite where you obtained any code you directly use (or use as inspiration). Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism. On individual assignments you may not directly share code with another student in this class, and on team assignments you may not directly share code with another team in this class. You are welcome to discuss the problems together and ask for advice, but you may not send or make use of code from another team.

Academic integrity

Bates College takes academic misconduct very seriously and is committed to ensuring that so far as possible it is detected and dealt with appropriately.

Cheating or plagiarizing on assignments is a breach of trust with classmates and faculty, violates College policies, and will not be tolerated. Make sure to review the Student Guide to Academic Integrity Policies and Procedures on the Bates website. Such incidents will result in a 0 grade for all parties involved.

DCS Values, Goals, and Practices

The primary purpose of Digital and Computational Studies is to bridge the liberal arts education to computing and the digital world. In this, we are committed to actively creating digital and computational spaces that are radically inclusive. Our core commitment is to integrating equity and social justice throughout our curriculum, and engaging students in metacognition to support this work. Digital and Computational Studies Courses fall into a continuum of experiences that range from critical digital studies to programming, with an integrated core that embraces both. See the DCS website for details on DCS values, goals, and practices.

Universal Learning and Learning in Community

Many of us learn in different ways. For example, you may process information by speaking and listening, so while lectures are quite helpful for you, some of the written material may be difficult to absorb. You might have difficulty following lectures, but are able to quickly assimilate written information. You may need to fidget to focus in class. You might take notes best when you can draw a concept. For some of you, speaking in class can be a stressful or daunting experience. For some of you, certain topics or themes might be so traumatic as to be disruptive to learning. The principle of Universal Design for Learning calls for our classrooms, our virtual spaces, our practices and our interactions to be designed to include as many different modes of learning as possible, and is a principle I take seriously in this class.

It is also my goal to create an inclusive classroom, which depends on community building, and which requires everyone to come to class with mutual respect, civility, and a willingness to listen to and observe others. As such the syllabus serves as a contract of some expectations between all members of the class, including myself.

If you anticipate or experience any barriers to learning in this course, please reach out to me and your student support advisor. If you have a disability, or think you may have a disability, you may also want to meet with Abigail Nelson, Assistant Dean of Accessible Education and Student Support, to begin this conversation or request an official accommodation. You can find more information about the Office of Accessible Education and Student Support, including contact information, on the accessible education page of the Bates website. If you have already been approved for accommodations through the Office of Accessible Education please let me know! We can meet 1-1 to explore concerns and potential options. If you do not have a documented disability, remember that student support services are available to all students through the Math and Stats Workshop (R, Math, and Stats).

The college prohibits discrimination on the basis of race, color, national or ethnic origin, religion, sex, sexual orientation, gender identity or gender expression, age, disability, genetic information or veteran status and other legally protected statuses in the recruitment and admission of its students, in the administration of its education policies and programs, or in the recruitment of its faculty and staff. Bates College adheres to all applicable state and federal equal opportunity laws and regulations. Violations of this policy can be reported to Gwen Lexow, Director of Title IX and Civil Rights Compliance or through the Bates website: www.bates.edu/sexual-respect/non-discrimination-policy/

Learning during a pandemic

I want to make sure that you learn everything you were hoping to learn from this class. If this requires flexibility, please don’t hesitate to ask.

  • You never owe me personal information about your health (mental or physical) but you’re always welcome to talk to me. If I can’t help, I likely know someone who can.

  • I want you to learn lots of things from this class, but I primarily want you to stay healthy, balanced, and grounded during this crisis.

Help

Most of you will need help at some point and we want to make sure you can identify when that is without getting too frustrated and feel comfortable seeking help.

  • Lyceum Forum: The best way to get answers to questions about course content, technology, logistics, or policies is to post them on the Lyceum forum. You are encouraged to answer each other's questions there as well. When you post a question on Lyceum, you can choose to do so anonymously to your classmates. Note that the course instructor and tutors can always see your name, and this is for a good reason! We want to be able to identify students who might be struggling so that we can extend help. Similarly, we want to know who you are if you’re providing great answers to others' questions!
  • Student hours: We will see each other most days during the course and there will be tutorial slots for additional instruction in some of the techniques you will find useful in your projects. If you wish to meet one on one outside of these hours, please send me an email to arrange a meeting.
  • Email: Please refrain from emailing any course content questions (those should go on Lyceum), and only use email for questions about personal matters that may not be appropriate for the public course forum (e.g. illness, missed assignments).
  • The Math and Statistics Workshop at Bates is open Mon-Thurs from 11-4 and 11-3 on Friday.
  • For more general support and advice, please make use of the following resources:

Make good use of this support system; it is there for you! And if you’re not sure where to go for help, just ask.

Personal Development Plan

Defining your learning goals and working towards them.

What is a Personal Development Plan?

As part of the course you will design and put together your own personal development plan. Personal development plans (PDPs) are commonly used in industry and are typically planned and agreed upon with your manager. A PDP is about reflecting on your current skills and knowledge and thinking about the areas you want to grow in. For this course your PDP will consist of setting at least three personal goals for areas where you would like to grow in the class and putting together a plan for reaching those goals.

Creating SMART Goals

These goals should be SMART (Specific, Measurable, Achievable, Relevant, and Time-based). An example of a SMART goal might be:

Example Goal 1: Learn to use GitHub for version control

Plan (during course):

  • Create a GitHub account,
  • Attend GitHub tutorial,
  • Practice version control on project (committing, pushing, pulling),
  • Practice writing good commit messages,
  • Learn to resolve a merge conflict,
  • Identify and work through and reflect on areas I still have questions on.
  • Relevance: Learning GitHub will allow me to collaborate on the project with my teammates.

Example Goal 2: Improve my skills in data visualization

Plan (during course):

  • Attend ggplot2 tutorial,
  • Work through independent gganimate/RShiny tutorials/leaflet tutorials,
  • Create a new visualization about the data each week.

Other goals could be:

  • Learn and implement a machine learning model,
  • Learn and grow confidence in R programming,
  • Learn and grow confidence in Python programming,
  • Explore data science career options,
  • Improve leadership skills (lead sprint planning, or sprint retrospective, meetings with partners),
  • Improve science communication writing skills,
  • Learn about different forms of community engagement,
  • Learn about the Lewiston community,
  • Learn and implement text analysis.

Support

There will be several morning tutorials and readings designed to help you work toward your goals. There will also be open tutorial slots where you can come and work on an independent tutorial. The R for Data Science and Python for Data Science Trello boards can help you get ideas for technical skills. I will help you find materials for the independent tutorials, but you can also find your own. I encourage you to attend at least 2 morning tutorials during the course. You may also plan your own tutorial work time outside of the course (signed off by me).

Resources

Course Materials required:

Books:

All books for this course are freely available online as e-books and/or linked articles. There will be specific readings for each project.

For project management and reproducibility you might find the following useful:

For programming and statistics you might find the following useful:

Technology:

  • Bring a laptop to every class; we will be programming most days! You can choose whatever programming language you feel most comfortable using. However, some project partners may request a particular language based on the tool that they are most comfortable using and where they will continue the analysis. You will each have access to an RStudioCloud account where you can analyze your data. Feel free to use an Integrated Development Environment (IDE) of your choice. If you have any difficulties with accessing a computer, let me know and we can arrange a Chromebook from the library.

Tools

Cheatsheets

Extra credit

A key goal of this course is to participate in knowledge exchange, particularly in equipping community partners with the data science tools to tackle issues of local importance. I am waiting to confirm dates, but there will likely be an opportunity to deliver an R training workshop to partners centered around the data science tools and technology to analyze and visualize their data.

Volunteering during the workshop will be counted towards one half letter grade, e.g. moving an A to A+ or B- to B.

People

Course organisers

Acknowledgements

This course takes inspiration from Data Study Groups at the Alan Turing Institute, project management at the Data Science Campus, and a series of workshops on Data Science for Social Good programs.

This course draws on several resources including the Turing Way and includes featured artwork by @AllisonHorst and Scriberia.

The course structure and grading system draw heavily upon course organization and policies in ENVR 417, Community Engaged Research and PIC MATH, and the work of Bates faculty Ethan Miller, Misty Beck, Francis Eanes, Adriana Salerno as well as scholars beyond Bates.

The website structure is inspired by the 2020 Data Science course website taught by Mine Çetinkaya-Rundel at the University of Edinburgh.