Syllabus
Click here to download a PDF copy of the syllabus.
Office hours
Click here for the instructor and TA office hours locations and Zoom links. You are welcome to attend the office hours for any STA 199 TA, regardless of section.
Textbooks
All books are freely available online.
R for Data Science, 2e | Grolemund, Wickham | O’Reilly, 2nd edition, 2022 | Hard copy available only for the 1st edition |
Introduction to Modern Statistics | Çetinkaya-Rundel, Hardin | OpenIntro Inc., 1st Edition, 2021 | Hard copy available on Amazon |
Course learning objectives
By the end of the semester, you will…
- learn to explore, visualize, and analyze data in a reproducible and shareable manner
- gain experience in data wrangling and munging, exploratory data analysis, predictive modeling, and data visualization
- work on problems and case studies inspired by and based on real-world questions and data
- learn to effectively communicate results through written assignments and project presentation
Course community
Duke Community Standard
As a student in this course, you have agreed to uphold the Duke Community Standard as well as the practices specific to this course.
Inclusive community
It is my intent that students from all diverse backgrounds and perspectives be well-served by this course, that students’ learning needs be addressed both in and out of class, and that the diversity that the students bring to this class be viewed as a resource, strength, and benefit. It is my intent to present materials and activities that are respectful of diversity and in alignment with Duke’s Commitment to Diversity and Inclusion. Your suggestions are encouraged and appreciated. Please let me know ways to improve the effectiveness of the course for you personally, or for other students or student groups.
Furthermore, I would like to create a learning environment for my students that supports a diversity of thoughts, perspectives and experiences, and honors your identities. To help accomplish this:
- If you feel like your performance in the class is being impacted by your experiences outside of class, please don’t hesitate to come and talk with me. If you prefer to speak with someone outside of the course, your academic dean is an excellent resource.
- I (like many people) am still in the process of learning about diverse perspectives and identities. If something was said in class (by anyone) that made you feel uncomfortable, please let me or a member of the teaching team know.
Accessibility
If there is any portion of the course that is not accessible to you due to challenges with technology or the course format, please let me know so we can make appropriate accommodations.
The Student Disability Access Office (SDAO) is available to ensure that students are able to engage with their courses and related assignments. Students should be in touch with the Student Disability Access Office to request or update accommodations under these circumstances.
Communication
All lecture notes, assignment instructions, an up-to-date schedule, and other course materials may be found on the course website.
Announcements will be emailed through Sakai Announcements periodically. Please check your email regularly to ensure you have the latest announcements for the course.
Where to get help
- If you have a question during lecture or lab, feel free to ask it! There are likely other students with the same question, so by asking you will create a learning opportunity for everyone.
- The teaching team is here to help you be successful in the course. You are encouraged to attend office hours to ask questions about the course content and assignments. Many questions are most effectively answered as you discuss them with others, so office hours are a valuable resource. Please use them!
Check out the Help tab for more resources.
If there is a question that’s not appropriate for the public forum, you are welcome to email me directly. If you email me, please include “STA 199” in the subject line. Barring extenuating circumstances, I will respond to STA 199 emails within 48 hours Monday - Friday. Response time may be slower for emails sent Friday evening - Sunday.
Activities & Assessment
The activities and assessments in this course are designed to help you successfully achieve the course learning objectives. They are designed to follow the Build, Train, Create format.
Build: We have designed material to help build up an initial foundation of topics in data science. These materials include short videos, reading assignments, and lectures to introduce new concepts and ensure a basic comprehension of the material. The goal is to help you prepare for the in-class activities during lecture.
Train: During class, you will train your brain using in-class application exercises. These exercises help develop both the skills to accomplish data science tasks and the ability to problem-solve and extend your knowledge to new situations. These activities will be graded for completion, as they are designed for you to gain experience with the statistical and computing techniques before working on graded assignments.
Create: We have designed assessments for you to create material that demonstrates your understanding of the covered content. This includes labs, homework, exams, and the project. These assignments extend the Build and Train aspects of this course and are your opportunity to demonstrate your understanding of the course material and how it is applied to analyze real-world data.
Lectures (Train)
Part of the class time will be lectures that introduce new concepts or review topics from the preparation videos. Lectures will not repeat everything in the videos; instead, they will highlight important concepts and those known to be complex, and will be supplemented with live coding activities. You are expected to attend every lecture. Lectures will be recorded and made available to students with an excused absence upon request.
Application exercises (Train)
A majority of the in-class lectures will be dedicated to working on Application Exercises (AEs). These exercises give you an opportunity to practice applying the statistical concepts and code introduced in the prepare assignment. The AEs done in class (Tu, Th) are due the following Friday by 11:59 PM ET. You are responsible for turning in material up to the point we covered in class.
Because these AEs are for practice, they will be graded based on completion, i.e., a good-faith effort has been made in attempting all parts. Successful on-time completion of at least 80% of AEs will result in full credit for AEs in the final course grade.
In addition to AEs, there will be periodic activities to help build a learning community. These will be short, fun activities that will help everyone in the class connect throughout the semester.
Labs (Create)
In labs, you will apply the concepts discussed in lecture to various data analysis scenarios, with a focus on the computation. Most lab assignments will be completed in teams, and all team members are expected to contribute equally to the completion of each assignment. You are expected to use the team’s Git repository on the course’s GitHub page as the central platform for collaboration. Commits to this repository will be used as a metric of each team member’s relative contribution for each lab, and there will be periodic peer evaluation of the team collaboration. Lab assignments will be completed using Quarto, tracked in the corresponding GitHub repository, and submitted for grading in Gradescope.
The lowest lab grade will be dropped at the end of the semester.
Homework (Create)
In homework, you will apply what you’ve learned during lecture and lab to complete data analysis tasks. You may discuss homework assignments with other students; however, homework should be completed and submitted individually. Similar to lab assignments, homework must be typed up using Quarto and GitHub and submitted as a PDF in Gradescope.
The lowest homework grade will be dropped at the end of the semester.
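As an unofficial illustration of the drop-lowest rule (which the Labs section states for labs as well), here is a minimal Python sketch. It assumes every assignment in a category is scored on a 0-100 scale and weighted equally, which this syllabus does not specify; the function name and the example scores are hypothetical.

```python
def average_with_lowest_dropped(scores):
    """Average a category's scores after dropping the single lowest one.

    Assumption: all assignments are on a 0-100 scale and equally weighted.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]  # discard the lowest score
    return sum(kept) / len(kept)


# Hypothetical homework scores, for illustration only.
print(average_with_lowest_dropped([70, 90, 100, 80]))  # 90.0
```

Only one score is dropped per category, so a second missed assignment would still count toward the average.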
Exams (Create)
There will be two take-home, open-note exams. Through these exams you have the opportunity to demonstrate what you’ve learned in the course thus far. Each exam will include small analysis and computational tasks related to the content in the prepare, practice, and perform assignments. More details about the content and structure of the exams will be discussed during the semester.
Project (Create)
The purpose of the project is to apply what you’ve learned throughout the semester to analyze an interesting data-driven research question. The project will be completed with your lab teams, and each team will present their work in video and in writing during the final exam period. More information about the project will be provided during the semester.
Team work policy
The final project and several labs will be completed in teams. GitHub commits will be used to measure individual contribution to the assignment. All group members are expected to participate equally. Commit history may be used to give individual team members different grades. Your grade may differ from the rest of your group.
Grading
The final course grade will be calculated as follows:
Category | Percentage |
---|---|
Homework | 25% |
Labs | 15% |
Project | 20% |
Exam 01 | 18% |
Exam 02 | 18% |
Application Exercises | 4% |
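To make the weighting concrete, the following Python sketch computes a final course grade as a weighted average of the category scores. The weights come from the table; the function name and the example scores are hypothetical, and the sketch assumes each category score is already on a 0-100 scale (after any dropped assignments).

```python
# Category weights from the grading table (percent of the final grade).
WEIGHTS = {
    "Homework": 25,
    "Labs": 15,
    "Project": 20,
    "Exam 01": 18,
    "Exam 02": 18,
    "Application Exercises": 4,
}


def final_course_grade(category_scores):
    """Weighted average of category scores, each on a 0-100 scale."""
    assert sum(WEIGHTS.values()) == 100  # weights must cover the whole grade
    total = sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)
    return total / 100


# Hypothetical student scores, for illustration only.
example_scores = {
    "Homework": 90,
    "Labs": 95,
    "Project": 88,
    "Exam 01": 80,
    "Exam 02": 85,
    "Application Exercises": 100,
}
print(final_course_grade(example_scores))  # 88.05
```

Because the two exams together carry 36% of the grade, a 10-point change on one exam moves the final grade by 1.8 points in this sketch.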
The final letter grade will be determined based on the following thresholds:
Letter Grade | Final Course Grade |
---|---|
A | >= 93 |
A- | 90 - 92.99 |
B+ | 87 - 89.99 |
B | 83 - 86.99 |
B- | 80 - 82.99 |
C+ | 77 - 79.99 |
C | 73 - 76.99 |
C- | 70 - 72.99 |
D+ | 67 - 69.99 |
D | 63 - 66.99 |
D- | 60 - 62.99 |
F | < 60 |
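The thresholds can be read as a simple lookup from highest to lowest. The Python sketch below is an unofficial illustration: it assumes the final course grade is not rounded before the comparison, so a grade strictly between 92.99 and 93 maps to A-. The syllabus does not state a rounding rule, and the function name is hypothetical.

```python
# Lower bounds from the letter grade table, highest first.
THRESHOLDS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(final_grade):
    """Return the letter for a final course grade on a 0-100 scale.

    Assumption: no rounding is applied before comparing to a threshold.
    """
    for lower_bound, letter in THRESHOLDS:
        if final_grade >= lower_bound:
            return letter
    return "F"  # anything below 60


print(letter_grade(88.05))  # B+
print(letter_grade(92.99))  # A-
```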
Course policies
Academic honesty
TL;DR: Don’t cheat!
Please abide by the following as you work on assignments in this course:
- You may discuss individual homework and lab assignments with other students; however, you may not directly share (or copy) code or write-ups with other students. For team assignments, you may collaborate freely within your team. You may discuss the assignment with other teams; however, you may not directly share (or copy) code or write-ups with another team. Unauthorized sharing (or copying) of code or write-ups will be considered a violation for all students involved.
- You may not discuss or otherwise work with others on the exams. Unauthorized collaboration or using unauthorized materials will be considered a violation for all students involved. More details will be given closer to the exam date.
- Reusing code: Unless explicitly stated otherwise, you may make use of online resources (e.g. StackOverflow) for coding examples on assignments. If you directly use code from an outside source (or use it as inspiration), you must explicitly cite where you obtained the code. Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism.
Any violations in academic honesty standards as outlined in the Duke Community Standard and those specific to this course will automatically result in a 0 for the assignment and will be reported to the Office of Student Conduct for further action.
AI & Assessments
The use of generative AI (such as ChatGPT) is strongly discouraged on homework, and will be forbidden on exams.
I acknowledge that ChatGPT can be a powerful learning tool, but using it can rob you of the chance to build a foundational understanding of coding and data science concepts. We build this foundation by critically thinking and working hands-on with the class material, not by asking AI to think for us.
If you were learning a new language, would you practice or would you type everything into Google translate?
Further, generative AI is often not entirely accurate, especially with R code, as R is a complex coding language. In many cases, generative AI suggests code outside the scope of this course, or suggests wrong code that may be hard to distinguish from something that is correct. Given that we are seeing and learning R for the first time, this type of workflow is not recommended and is unproductive for you as a learner of data science.
Late work & extensions
The due dates for assignments are there to help you keep up with the course material and to ensure the teaching team can provide feedback in a timely manner. We understand that things come up periodically that could make it difficult to submit an assignment by the deadline. Note that the lowest homework and lab assignment will be dropped to accommodate such circumstances.
Late work
- Homework and labs may be submitted up to 1 day (24 hours) late. There will be a 10% deduction for a late assignment.
- There is no late work accepted for application exercises, since these are designed to help you prepare for labs and homework.
- Exams cannot be turned in late and can only be excused under exceptional circumstances.
- The late work policy for the project will be provided with the project instructions.
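The homework and lab rule above can be sketched in Python as follows. This is an unofficial illustration resting on two assumptions that the syllabus does not spell out: the 10% deduction is read as 10 points off a 0-100 score, and work submitted more than 24 hours late is treated as not accepted (returned as `None`). The function name and the dates are hypothetical.

```python
from datetime import datetime, timedelta

LATE_WINDOW = timedelta(hours=24)  # homework and labs: up to 1 day late
LATE_DEDUCTION = 10                # assumption: 10 points on a 0-100 scale


def score_after_late_policy(raw_score, due, submitted):
    """Apply the homework/lab late rule to a raw score on a 0-100 scale.

    Returns None when the submission falls outside the late window
    (assumption: such work is not accepted).
    """
    if submitted <= due:
        return raw_score  # on time, no deduction
    if submitted - due <= LATE_WINDOW:
        return max(raw_score - LATE_DEDUCTION, 0)
    return None


# Hypothetical deadline, for illustration only.
due = datetime(2023, 9, 15, 23, 59)
print(score_after_late_policy(85, due, due + timedelta(hours=3)))   # 75
print(score_after_late_policy(85, due, due + timedelta(hours=25)))  # None
```

Application exercises, exams, and the project follow their own rules and are not covered by this sketch.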
Waiver for extenuating circumstances
The Duke policy for illness requires a short-term illness report or a letter from the Dean; except in emergencies, all other absenteeism must be approved in advance (e.g., an athlete who must miss class may be excused by prior arrangement for specific days). Please email Ed if you fall into this situation. For emergencies, email notification is needed at the first reasonable time. Please note that accommodations are not retroactive.
A last-minute technical issue, being gone for vacation, or forgetting a deadline is not an extenuating circumstance.
If there are circumstances that are having a longer-term impact on your academic performance, please let your academic dean know, as they can be a resource. Please let Dr. Elijah Meyer know if you need help contacting your academic dean.
Regrade requests
Every effort will be made to mark your work accurately. We are on your side, and want you to receive every point you have worked to earn. However, sometimes grading mistakes happen. If you believe that an error has been made, please make the request on Gradescope within 4 days of when grades are released. The question will be regraded in full by the individual who graded it the first time.
The following claims will be considered for re-grading:
– (i) points are not totaled correctly;
– (ii) the grader did not see a correct answer that is on your paper;
– (iii) your answer is the same as the correct answer, but in a different form (e.g., you wrote a correct answer as 1/3 and the grader was looking for .333);
– (iv) your answer to a free response question is essentially correct but stated slightly differently than the grader’s expectation.
The following claims will not be considered for re-grading:
– arguments about the number of points lost;
– arguments about question wording.
Considering re-grades consumes time and resources that TAs and the instructor would rather spend helping you understand material. Please bring only claims of type (i), (ii), (iii), or (iv) to our attention.
No grades will be changed after the project presentations.
Class recording requests
Lectures will be recorded on Panopto and will be made available to students with an excused absence upon request. Videos shared with such students will be available for a week. To request a particular lecture’s video, please email me (your professor). Please also include any official documentation in the email, such as STINFs, Dean’s excuses, NOVAPs, and quarantine/removal-from-class notices from student health.
Attendance policy
- COVID Symptoms, Exposure, or Infection: Student health, safety, and well-being are the university’s top priorities. To help ensure your well-being and the well-being of those around you, please do not come to class if you have tested positive for COVID-19 or have possible symptoms and have not yet been tested. If any of these situations apply to you, you must follow university guidance related to the ongoing COVID-19 pandemic and current health and safety protocols. If you are experiencing any COVID-19 symptoms, contact student health (dshcheckin@duke.edu, 919-681-9355). Learn more about current university policy related to COVID-19 at https://coronavirus.duke.edu. To keep the university community as safe and healthy as possible, you will be expected to follow these guidelines. Please reach out to me and your academic dean as soon as possible if you need to quarantine or isolate so that we can discuss arrangements for your continued participation in class.
- Inclement weather: In the event of inclement weather or other connectivity-related events that prohibit class attendance, I will notify you how we will make up missed course content and work. Asynchronous catch-up methods may apply.
- Religious accommodations: Students are permitted by university policy to be absent from class to observe a religious holiday. Accordingly, Trinity College of Arts & Sciences and the Pratt School of Engineering have established procedures to be followed by students for notifying their instructors of an absence necessitated by the observance of a religious holiday. Please submit requests for religious accommodations at the beginning of the semester so that we can work to make suitable arrangements well ahead of time. You can find the policy and relevant notification form here: https://trinity.duke.edu/undergraduate/academic-policies/religious-holidays.
Important dates
- Aug 28th: Classes begin.
- Sep 4th: Labor Day. No classes are held.
- Sep 8th: Drop/Add for Term 1 ends
- Oct 13th: Fall break begins (7:00 PM)
- Oct 18th: Classes resume from Fall break (8:30 AM)
- Nov 10th: Last day to withdraw with “W”
- Nov 21st: Thanksgiving recess begins (10:30 PM)
- Nov 27th: Classes resume from Thanksgiving recess (8:30 AM)
- Dec 7th: Last Day of Class.
- We do not have a final exam in this course
For more important dates, see the full Duke Academic Calendar.