DSC 223 - Introduction to Data Science Syllabus
Spring 2024 Block 7
Instructor
Dr. Tyler George
Cornell College, West 311
Class Meetings
March 18th - April 9th
9am-11am & 1pm-3pm
West 201
Office Hours
Monday - Friday
3:15pm-4:15pm
West 311
I am available well beyond the times listed. Please email me and we can set up a time to chat about class material or whatever you prefer! I will generally announce changes to office hours in class, but I still suggest checking the Course Calendar to verify availability.
You Are A Priority
My goal this block is to help you learn the material. I want to first and foremost recognize that you are an individual and thus are unique and may learn uniquely. Additionally, your health and wellbeing are priority one. Learning cannot happen effectively if you don’t meet your other personal needs. That all being said, I have structured the class in a way that I, from experience teaching and learning myself, think will be most beneficial for the majority of students. I promise you that I will do my best to create an inclusive and engaging learning environment. I ask that you keep an open line of communication between us for when you may need help and/or flexibility. You and your learning are why I am here.
Course Description
Managing and interpreting an overwhelming amount of raw data is part of the foundation of our information society and economy. People use computers and statistics to translate, process, and visualize raw data, enabling new understandings that in turn contribute new knowledge to the world. Data Science is a newly developing field that merges ideas from both statistics and computer science to address these issues. In this course statistics will inform the discussion about what appropriate goals are for learning from the data and how the data will answer the questions raised. The computer science perspective will help us figure out which goals are actually feasible computationally, and how to achieve them.
Learning Objectives
At the end of this course I would like you to be able to use software, including RStudio and GitHub, to respect, explore, understand, and utilize data in a way that is replicable. This course supports the Educational Priorities and Outcomes of Cornell College with emphasis on knowledge, inquiry, reasoning, communication, ethical behavior, citizenship, and vocation. Your emphasis on knowledge is in the skills you will learn and apply in various interdisciplinary fields. You will inquire by using the skills gained on data related to current issues. Your reasoning skills are built and tested when making decisions based on the data and your own programmed visualizations and numerical summaries. Your group work in class and group project presentations will help you practice your communication of statistical analysis. When you make decisions about what data to work with, how to treat the data, and how to talk about your results in an ethical way, you practice good ethical behavior. Some of our analyses will use data from institutions such as governments or organizations that have an influence on the public – these types of analyses can inform public policies and are our way, as data scientists, to practice citizenship. Lastly, you will learn about the field of data science and the types of knowledge and training that would be required to support your vocation as a data scientist.
Prerequisite
To be successful in this class, you should have completed Statistical Methods I (STA 201).
Open Access Books – Free!
All of the materials for this class are free.
Our primary book is Data Science in a Box by Mine Çetinkaya-Rundel. It is a fabulous book for both R and version control (our major topics). Our secondary books are:
- R for Data Science, 2ed, by Wickham, Çetinkaya-Rundel, and Grolemund
- Introduction to Modern Statistics by Mine Çetinkaya-Rundel and Johanna Hardin.
Course Site and Moodle
Almost the entirety of the course will be on our own course website and handled with Git/GitHub. Moodle will have the syllabus and the link to the website.
Software – No need to install
We will use a combination of technologies in this course including Git, GitHub, R, and RStudio (server). Luckily for you, I have put lots of effort into setting all of this up on a machine we have on campus that we will all access with a web browser! You don’t need to install anything – in fact, for a while I prefer you don’t. More on this in class. If you are an off-campus student, please let me know right away, as you may need to check out a laptop (free) from IT to work on homework from home.
If you have any technical problems you should contact IT as soon as possible. Submit a Work Order!
Group Work
In this class, I would like you to work in groups for a variety of reasons. A large part of this class is communicating analysis – not just completing analysis. At the beginning of the block, groups will be formed. You should expect to work with this group every day. When we work in groups in class we will decide on roles (specifically, who is controlling the one screen will rotate). Group members will rotate roles between tasks to help make sure everybody shares the workload, feels included, and learns equally. Groups will be randomized and change twice during the block. You won’t be working in a group for everything; your quizzes and exams may all be individual. I may also have some activities where some, if not all, of the work will be completed individually. This is to encourage creative thinking.
Evaluations and grades
Grade Category Descriptions
Homework
Homework will be graded for correctness. There will be a combination of homework that requires coding, book readings, and ethics reflections. Homework requiring coding will be distributed via your own private Git repositories and submitted by pushing your completed work to that same repository. Other homework formats will not be uniform, but I will post them on the course website and give you separate submission instructions (likely via Moodle).
Labs and Application Exercises
Labs and Application Exercises (AE) will be graded for completeness and/or effort. We will typically complete these in class, with leftovers being homework. Application exercises are very directed (to a single concept) and generally each person will work on their own, but you can help each other. Labs are longer and more involved but are completed in in-class groups. These will be collected in the same manner as homework.
Group Project
This will entail multiple stages of submission with details accessed through “Project” on the left side of the course website (once available). Some class time will be given for discussing projects with me and your group, but not enough to complete the project during class times. Part of this score is your attendance at all group meetings. Each missed meeting (without justification) will result in a 5% loss in your own final project score. The required number of meetings will be based on the project and group.
Exams
There will be a midterm exam (3/29) and a final exam (morning of 4/10). They will be a mix of in-class and take-home components. These dates reflect when the in-class portion is scheduled, but the take-home portion will likely be assigned around these dates.
Assignment | Points |
---|---|
Homework | 200 |
Labs | 50 |
Application Exercises (AE) | 50 |
Group project | 300 |
Exams (two at 200 points each) | 400 |
Total | 1000 |
Grade | Range | Grade | Range |
---|---|---|---|
A | 93-100% | C | 73-76% |
A- | 90-92% | C- | 70-72% |
B+ | 87-89% | D+ | 67-69% |
B | 83-86% | D | 63-66% |
B- | 80-82% | D- | 60-62% |
C+ | 77-79% | F | <60% |
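To make the arithmetic behind the two tables above concrete, here is a minimal sketch of how points could translate into a percentage and a letter grade. It is written in Python purely for illustration (we use R in class) and is not an official grade calculator. The function names and the example scores are hypothetical. The sketch also assumes that a percentage falling between two listed ranges (for example, 92.5%) receives the lower grade, with no rounding; the tables themselves do not say how such cases are handled.

```python
# Point values copied from the Assignment table in this syllabus.
CATEGORY_POINTS = {
    "Homework": 200,
    "Labs": 50,
    "Application Exercises (AE)": 50,
    "Group project": 300,
    "Exams": 400,
}

# Lower bound (in percent) of each letter grade, from the grade table.
# Assumption: no rounding, so 92.5% is treated as an A-, not an A.
GRADE_FLOORS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"), (60, "D-"),
]


def course_percentage(earned):
    """Return the overall percentage given points earned per category."""
    total_possible = sum(CATEGORY_POINTS.values())  # 1000 points
    total_earned = sum(earned.get(name, 0) for name in CATEGORY_POINTS)
    return 100 * total_earned / total_possible


def letter_grade(percent):
    """Return the first letter grade whose lower bound is met, else F."""
    for floor, letter in GRADE_FLOORS:
        if percent >= floor:
            return letter
    return "F"


# Hypothetical scores, made up for this example only.
example_scores = {
    "Homework": 180,
    "Labs": 50,
    "Application Exercises (AE)": 45,
    "Group project": 270,
    "Exams": 340,
}

percent = course_percentage(example_scores)
print(percent, letter_grade(percent))  # 88.5 B+
```

In this made-up example the student earns 885 of 1000 points, which is 88.5% and lands in the B+ range.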
Use of AI
I expect you to generate your own work in this class. When you submit any kind of work (including projects, exams, and homework), you are asserting that you have generated and written the text and code, unless you indicate otherwise by the use of quotation marks and proper attribution for the source. Submitting content as your own that has been generated by someone other than you, or was created or assisted by a computer application or tool, including artificial intelligence (AI) tools such as ChatGPT, is cheating and constitutes a violation of our Academic Honesty policy. You may use simple word processing tools to update spelling and grammar in your assignments, but unless given permission otherwise, you may not use AI tools to draft your work, even if you edit, revise, or paraphrase it. There may be opportunities for you to use AI tools in this class. Where they exist, I will clearly specify when and in what capacity it is permissible for you to use these tools.
DISABILITIES AND ACCOMMODATIONS POLICY
Cornell College makes reasonable accommodations for persons with disabilities. Students should notify the Office of Academic Support and Advising and their course instructor of any disability-related accommodations within the first three days of the term for which the accommodations are required, due to the fast pace of the block format. For more information on the documentation required to establish the need for accommodations and the process of requesting the accommodations.
ACADEMIC HONESTY POLICY
Cornell College expects all members of the Cornell community to act with academic integrity. An important aspect of academic integrity is respecting the work of others. A student is expected to explicitly acknowledge ideas, claims, observations, or data of others, unless generally known. When a piece of work is submitted for credit, a student is asserting that the submission is her or his work unless there is a citation of a specific source. If there is no appropriate acknowledgment of sources, whether intended or not, this may constitute a violation of the College’s requirement for honesty in academic work and may be treated as a case of academic dishonesty. The procedures regarding how the College deals with cases of academic dishonesty appear in The Catalog, under the heading “Academic Honesty.”
Illness Policy
If you are experiencing COVID-19 symptoms, do not attend class. Perform a home test or contact Director of Student Health Services Lynn O’Brien at student_health@cornellcollege.edu immediately to arrange a COVID-19 test at the Health Center. If you need to isolate due to COVID-19, or if you become unable to attend class for any other health reason, contact me as soon as possible to determine if you are able to continue in the class. A Withdrawal for Health Reasons may be required.
Mandatory Reporter Reminder
It is my goal that you feel supported and able to share information related to your life experiences during classroom discussions, in your written work, and in any one-on-one meetings with me. You should also know that all Cornell College faculty and staff are mandatory reporters. This means that I will keep information you share with me private to the greatest extent possible. However, I am required to share information regarding sexual assault, abuse, criminal behavior, or about a student who may be a danger to themselves or to others. If you wish to speak to someone confidentially who is not a mandatory reporter, you can schedule an appointment with one of the counselors in the Ebersole Health and Wellbeing Center or contact the College Chaplain, Rev. Melea White, at mwhite@cornellcollege.edu.