
MS&E 125: Applied Statistics [Spring 2024]

Concept check form

Lecture 14 abridged slides

Lecture 14 Worksheet

Announcements 📣

HW6 will be released on Tuesday 5/21. Its deadline is extended from Friday 5/24 → Wednesday 5/29. The final project deadline is also extended by 3 days, from Thursday 5/30 → Sunday 6/2.

Table of contents 🥗

  1. Course description 📰
  2. Course staff 🍎
  3. Lecture schedule 🧑‍🏫
  4. Office hours (OH) 🗓️
  5. Grading policies 💯
  6. Lecture attendance policy 👥
  7. Lecture technology policy ❌ 📱
  8. Lecture recordings 🎥
  9. Homework submission 📝
  10. Study groups
  11. Policy on Large Language Models (LLMs) 💬
  12. Extra credit policy ➕
  13. Expected time commitment ⌛
  14. Course communication 🗣️
  15. Computing environment 🖥️
  16. Access and accommodations
  17. Diversity statement
  18. Acknowledgements

Course description 📰

An increasing amount of data is now generated in a variety of disciplines, ranging from finance and economics to the natural and social sciences. Making use of this information requires both statistical tools and an understanding of how the substantive scientific questions should drive the analysis. In this hands-on course, we learn to explore and analyze real-world datasets. We cover techniques for summarizing and describing data, methods for statistical inference, and principles for effectively communicating results.

Prerequisites: An introductory statistics or probability course taken at Stanford (e.g., MS&E 120 or CS 109), and CS 106A or equivalent.

Course staff 🍎

Josh Grossman (he/him) (Instructor)

Please call me "Josh"! No need for Prof., Mr., etc.

Feona Dong (they/them) (CA)

Salil Goyal (he/him) (CA)

Mark Khalil (he/him) (CA)

Mike Van Ness (he/him) (CA)

To contact the course staff, please create a private Ed post. If necessary, we will ask you to follow up via email.

Lecture schedule 🧑‍🏫

Tuesdays & Thursdays @ 1:30pm - 2:50pm PT in Shriram 104.

Lecture cannot be attended remotely.

Office hours (OH) 🗓️

  • Mondays @ 8-9pm (Mark, Zoom only)

  • Tuesdays @ 3:30-5pm (Josh, in person @ History Corner 105)

  • Tuesdays @ 7-9pm (Feona, Zoom only, 7-8pm is a guided review session)

  • Wednesdays @ 5-7pm (Mike, Zoom and in person @ Gates 100)

  • Thursdays @ 3:30-5pm (Josh, in person @ History Corner 105)

  • Thursdays @ 5:30-7:30pm (Salil, Zoom and in person @ Y2E2 101)

  • Fridays @ 11am - 1pm (Mark, Zoom only)

For security reasons, the Zoom links for OH are posted on Canvas.

To be successful in MS&E 125, we recommend attending at least one OH every week.

  • The Thursday and Friday OH tend to be the most crowded, as homework is typically due Fridays at midnight. If you need extensive individualized help, come to office hours earlier in the week.
  • We may add or reschedule OH if needed. Please create a private Ed post if you have conflicts with all of the current times.

OH is a great opportunity to discuss not only topics directly related to the course, but also anything else that's on your mind.

  • For example, we welcome questions about career trajectories and research opportunities in MS&E and beyond.
  • Keep in mind that you do not need to come to office hours with an agenda. Listening in is welcomed and encouraged!
  • Finally, attending and participating in office hours is a great way to set yourself up for a terrific letter of recommendation (this is true for most courses).

Please note that there are no regular OH during the first week of class, but feel free to schedule an appointment if you would like to meet.

Grading policies 💯

  • 6 homework assignments (50%)
  • Lecture attendance (10%)
  • Quiz 1 (10%)
  • Quiz 2 (15%)
  • Project proposal (5%)
  • Final project (10%)

Your lowest homework grade will be dropped.

In-person lecture attendance is mandatory, unless otherwise excused.

  • To take attendance, we will have an in-class concept check during every lecture.
  • You are allowed two unexcused absences.
  • See below for the full attendance policy.

Quiz 1 has a redemption policy.

  • If you score higher on Quiz 2 than Quiz 1, your Quiz 1 score will be replaced by your Quiz 2 score.

Every homework answer in MS&E 125 is eligible for extra credit. See below for the course extra credit policy.
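To see how these pieces fit together, here is a minimal sketch in R of one way the final percentage could be computed. It is an illustration, not the official calculation. It assumes that every component is scored from 0 to 100 and that homework assignments are equally weighted (neither is stated above), and it ignores extra credit. The function name `course_grade` and all of the scores are made up.

```r
# Illustrative sketch only: not the official grade calculation.
# Assumptions (not stated in the syllabus): every component is scored on a
# 0-100 scale, and the homework assignments are weighted equally.
course_grade <- function(homework, attendance, quiz1, quiz2, proposal, project) {
  # Drop the single lowest homework score, then average the rest.
  hw_kept <- sort(homework)[-1]
  hw_avg <- mean(hw_kept)

  # Quiz 1 redemption: a higher Quiz 2 score replaces the Quiz 1 score.
  quiz1_final <- max(quiz1, quiz2)

  # Weights from the grading policy (they sum to 100%).
  0.50 * hw_avg +
    0.10 * attendance +
    0.10 * quiz1_final +
    0.15 * quiz2 +
    0.05 * proposal +
    0.10 * project
}

# Hypothetical scores for one student.
grade <- course_grade(
  homework = c(90, 70, 95, 85, 100, 80),
  attendance = 100,
  quiz1 = 60,
  quiz2 = 80,
  proposal = 90,
  project = 85
)
print(grade)
```

With these made-up scores, the homework score of 70 is dropped, the Quiz 1 score of 60 is replaced by the Quiz 2 score of 80, and the total works out to 88.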

Lecture attendance policy 👥

In-person lecture attendance is mandatory. It is critically important to practice learning in a live setting.

  • Lecture attendance is a substantial component of your grade.
  • Lecture cannot be attended remotely.
  • You are allowed two unexcused lecture absences. Each additional absence will impact your lecture attendance grade.

If you cannot attend a lecture due to an extenuating circumstance, please complete this form before the lecture starts.

Acceptable extenuating circumstances include:

  • Illness. DO NOT come to class if you are sick! Even a sniffle!
  • Personal emergencies.
  • Important life events (e.g., weddings)
  • Pre-planned collegiate athletic events in which you are a participant.
  • This list is not exhaustive. If you think an absence should be excused, complete this form and explain your reasoning. We cannot guarantee that your absence will be excused, but we will be reasonable.

We will use in-class concept checks to track attendance.

  • Concept checks are not graded.
  • However, when deciding borderline final grades, we may consider demonstrated effort on concept checks.

Lecture technology policy ❌ 📱

Most lectures will consist of an interactive problem-solving session, followed by a hands-on coding session.

  • Laptops and tablets with attached keyboards are not allowed during the problem-solving session, though you are permitted to use a tablet to take handwritten notes. This article explains why we have this policy. Long story short, laptop use can negatively impact the learning of nearby students.

  • If you need to use technology for accessibility reasons, the previous bullet does not apply to you.

  • Laptop use is permitted (and encouraged!) during the hands-on coding session.

Lecture recordings 🎥

Lectures will be recorded. We cannot guarantee audio or video quality.

Lecture recordings are posted on Canvas.

Office hours are not recorded.

The homework assignments may ask you to watch additional recordings to supplement the lecture material.

Homework submission 📝

Homework is generally due on Fridays at midnight.

  • Unless otherwise stated, assignments are to be done individually. You are welcome to work with others to master the principles and approaches used to solve the homework problems, but the work you turn in should be your own.

You are allotted five slip days for homework assignments and the project proposal. Each slip day adds 24 hours to the deadline.

  • You are allowed to use, at most, two slip days per assignment. In other words, assignments will not be accepted more than 48 hours after the original due date. This policy ensures that we can grade all assignments in a timely fashion.

  • Slip days are intended to account for unexpected delays, like minor illness or homework overload.

  • If you plan to use slip days, do not contact the course staff. We will automatically account for slip days when calculating grades.

  • Extensions will only be granted if required by an OAE accommodation letter, or in extraordinary circumstances (e.g., medical emergencies).
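
The slip-day arithmetic above can be sketched in R as follows. This is only an illustration under stated assumptions: lateness is measured in hours past the original deadline, `slip_days_used` is a hypothetical helper, and the lateness values are made up. The course staff track slip days for you. The sketch also does not model what happens once all five slip days are spent, since that is not specified above.

```r
# Illustrative sketch only: the course staff account for slip days themselves.
total_slip_days <- 5     # allotted across homework and the project proposal
max_per_assignment <- 2  # nothing is accepted more than 48 hours late

# Slip days consumed by one submission, given hours past the original deadline.
# Returns NA when the submission is too late to be accepted.
slip_days_used <- function(hours_late) {
  days <- max(0, ceiling(hours_late / 24))
  if (days > max_per_assignment) NA_integer_ else as.integer(days)
}

# Hypothetical lateness (in hours) for four assignments.
hours_late <- c(hw1 = 0, hw2 = 10, hw3 = 30, hw4 = 50)
used <- sapply(hours_late, slip_days_used)

# Slip days left after the accepted submissions.
remaining <- total_slip_days - sum(used, na.rm = TRUE)

print(used)
print(remaining)
```

In this example, hw2 uses one slip day, hw3 uses two, and hw4 falls past the 48-hour cutoff, which leaves two slip days.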

Poorly organized assignments will be docked points at the discretion of the grader. It is critical to have empathy for the person who will be reviewing your work, whether a member of the course staff, another student providing feedback, or your future manager.

Study groups

We encourage you to work together in groups to solidify your understanding of the course material. If you would like assistance forming a study group, please complete this form by Monday, April 8 at 5pm PT. Our goal is to form the study groups the following day, so students can begin discussing the first homework assignment.

Policy on Large Language Models (LLMs) 💬

LLMs (e.g., ChatGPT) are becoming increasingly essential in the workplace. To that end, the use of LLMs is not only permitted in this course, but encouraged. Use this course as an opportunity to learn where LLMs are most useful, and where they fall short.

Potential uses of LLMs in MS&E 125:

  • Generating practice quiz questions
  • Explaining course concepts
  • Helping you code

Many coding problems from past iterations of the course can now be fully solved by freely available LLMs. With this advantage in mind, the difficulty and extent of coding required for this course may be increased substantially compared to previous years.

It can be easy to recognize default LLM text output. For example, if you copy and paste answers directly from ChatGPT, be warned that your grader may interpret your answer as lacking in effort. Take the time to understand and paraphrase the information LLMs provide you.

If you find an especially interesting use case of an LLM for any component of the course, please share it with the course staff! We are excited to hear what you find.

Extra credit policy ➕

Every question on every assignment in this course is eligible for extra credit. We want to provide space for you to get excited about particular topics, and be rewarded for it.

  • If your answer is particularly insightful, or goes markedly above and beyond the requirements of the question, you will be awarded substantial extra credit. This is a very high, but achievable, bar. Do not expect to reach this bar more than a handful of times throughout the entire quarter.

  • Note that writing more text does not always constitute a better answer. Brevity is valued in this course.

To be eligible for extra credit, you should self-nominate your answer in a way that is very obvious (e.g., a short note that is all caps, bolded, and/or highlighted). We want you to build confidence with strategic self-promotion, which is (for better or worse) a critical skill for climbing the ranks in the workplace.

  • Impressive answers that are not self-nominated may also be eligible for extra credit, but it's less likely that the course staff will recognize your extra effort.

Expected time commitment ⌛

MS&E 125 is a 4-unit course. Each course unit at Stanford is intended to require, on average, 3 hours of weekly work.

Here is an example breakdown of 4 units (i.e., 12 weekly hours) in MS&E 125:

  • 3 hours in lecture
  • 1 hour watching additional videos as part of the homework
  • 2 hours in office hours
  • 6 hours working individually on the homework

If MS&E 125 requires substantially more than 12 hours of your time each week, please reach out to the course staff. We can help you design a study plan.

Course communication 🗣️

We use the Ed platform to manage course questions and discussion.

In general, do not email the course staff.

  • Exception: You are welcome to email individual members of the course staff if you have a private concern that you do not want shared with the entire course staff.

Please post publicly when possible.

  • Public posts help many more students than private posts.
  • We may ask you to change your private post to a public post if the answer could be of use to other students.
  • You are always allowed to remain anonymous!

If you include code in your Ed post, please use the code editing fonts:

Hard to read:

── Attaching packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ── ✓ ggplot2 3.3.2 ✓ purrr 0.3.4 ✓ tibble 3.0.3 ✓ dplyr 1.0.2 ✓ tidyr 1.1.2 ✓ stringr 1.4.0 ✓ readr 1.3.1 ✓ forcats 0.5.0 ── Conflicts ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ── x dplyr::filter() masks stats::filter() x dplyr::lag() masks stats::lag()

# here's my plot code

x <- ggplot(df) + geom_point(aes(x = year, y = count))

Easy to read:

── Attaching packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ ggplot2 3.3.2     ✓ purrr   0.3.4
✓ tibble  3.0.3     ✓ dplyr   1.0.2
✓ tidyr   1.1.2     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.5.0
── Conflicts ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()

# here's my plot code
x <- ggplot(df) + geom_point(aes(x = year, y = count))

Computing environment 🖥️

The official course materials use the R programming language. Lecture notes and assignments are written with Google Colab.

Important note: The concepts taught in this course are language-agnostic. In other words, everything you learn in this class can be readily implemented using a combination of other tools (e.g., Python, SQL, and Microsoft Excel). LLMs are an excellent aid for translating your knowledge across different programming languages and software.

Access and accommodations

Stanford is committed to providing equal educational opportunities for students with disabilities.

If you experience disability, please register with the Office of Accessible Education (OAE). Professional staff will evaluate your needs, support appropriate and reasonable accommodations, and prepare an Academic Accommodation Letter for faculty. To get started, or to re-initiate services, please visit oae.stanford.edu.

If you already have an Academic Accommodation Letter, we invite you to share your letter with us. Academic Accommodation Letters should be shared at the earliest possible opportunity so we may partner with you and OAE to identify any barriers to access and inclusion that might be encountered in your experience of this course.

Diversity statement

It is our intent that students from all backgrounds and perspectives be well served by this course, that students' learning needs be addressed both in and out of class, and that the diversity that students bring to this class be viewed as a resource, strength, and benefit. We aim to present materials and conduct activities in ways that are respectful of this diversity. Your suggestions are encouraged and appreciated. Please let us know if you have ideas to improve the effectiveness of the course for you personally or for other students or student groups.

Acknowledgements

The MS&E 125 materials were adapted from course content originally developed by Sharad Goel. Thanks Sharad!