Cornell University was founded in 1865 in Ithaca, New York, by Andrew D. White and Ezra Cornell, the latter famously stating, “I would found an institution where any person can find instruction in any study.” The founders could not have envisioned the full extent of modern data science, of course, but scientific research of all types has been at the heart of Cornell’s mission since its beginning. Statistics itself – the original discipline underlying data science – first came to prominence at Cornell after World War II, when two seminal figures in the field, Jack Kiefer and Jacob Wolfowitz, were on the faculty. Since then, Cornell’s Department of Statistics and Data Science (as it is now called) has been, and remains, home to many prominent researchers in theoretical and applied statistical methods.
Data Science Programs at Cornell
Cornell University offers two undergraduate degrees in statistics and data science, as well as the M.S. and Ph.D., all of which enroll numerous students who find successful careers upon graduation. Its flagship Master of Professional Studies in Applied Statistics, or M.P.S., however, is the only program of its type offered by an Ivy League university. The M.P.S. is a two-semester master’s degree program that provides training in a broad array of applied statistical methods. It has several components: (i) a theoretical core focusing on the underlying mathematical theory of probability and statistical inference (with a 2-year calculus prerequisite); (ii) a wide selection of applied courses including (but not limited to) data mining, time series analysis, survey sampling, and survival analysis; (iii) required certification in the SAS® programming language; (iv) a professional development component including in-depth training in career planning and job searching, interviewing and resume writing, and professional standards and etiquette; and (v) a year-long, hands-on, start-to-finish professional data analysis “capstone” project.
The Dynamic Leadership
Dr. John Bunge was the founding director of the M.P.S. in 1999-2000 and served in that role for 12 years. The position was then held by another Statistics professor, and at the end of that professor’s six-year term Dr. Bunge again became director and will continue through 2021. Dr. Bunge has witnessed the program’s growth from an initial enrollment of 6 students to its current steady state of 60, which is about the institution’s maximum capacity.
The number of M.P.S. applications appears to keep increasing, so demand for the available places grows ever more intense. “We are content with many of the decisions we made in designing the program (as long ago as the 1990s), but we continue to monitor professional trends in data science and to adapt our program accordingly,” Dr. Bunge said. “In particular, in the past decade we have added a second ‘concentration’ to the M.P.S., so that students may now specialize more in classical (and modern) statistical data analysis; or (the second concentration) in more computationally oriented data science, including topics such as Python programming, database management and SAS, and big data management and analysis.”
Prominent Features of the Program
Dr. Bunge believes one of the chief advantages of the program and curriculum is that Cornell gives students a broad foundation in the fundamentals of statistical analysis and related computing. These skills are transferable: they can be used in finance, pharmaceutical and biological research, survey sampling and public opinion research, data security and privacy control, and many other fields. The institution has seen graduates move among seemingly unrelated application areas because their fundamentals are sound. Other advantages include: (i) Cornell’s extraordinary, world-class faculty in Statistics and Data Science, who teach most of the M.P.S. courses; (ii) its dedicated professional development advisers and support staff, who spend enormous amounts of time with the M.P.S. students; and (iii) Cornell’s focus on continuous improvement of the program, and its desire to anticipate (to the degree possible) future developments in the professional data science landscape.
Offering Extraordinary Industry Exposure
The main type of practical exposure offered to M.P.S. students is the M.P.S. project. During the fall semester, the faculty identifies a number of current applied research projects, some within Cornell or from Weill Cornell Medicine (the university’s medical school in New York City), some from external clients in the private or nonprofit sectors. The M.P.S. class is then divided randomly into teams of 3 or 4 students, and each team ranks the available projects by preference. The faculty then assigns projects to teams, attempting to accommodate preference as well as possible (this is known as the “fair item assignment” problem). Teams then have until the end of the spring semester to complete their projects. In the course of this, the team must communicate continuously with the client; formulate and re-formulate the problem in statistical terms; organize and manage relevant data (provided by the client); carry out statistical analyses using suitable computational methods and software; and finally provide both a written and an oral presentation of the results.
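The assignment step described above can be illustrated with a small sketch. The article does not say how the faculty actually matches teams to projects; the version below is an assumption that simply minimizes the sum of each team's rank for its assigned project, by brute force, which is only practical for a handful of teams. The function name `assign_projects` and the team and project labels are hypothetical, and the sketch assumes every team ranks every project.

```python
from itertools import permutations


def assign_projects(rankings):
    """Assign one distinct project to each team.

    rankings: dict mapping team name -> list of all projects, most preferred first.
    Returns (assignment, cost), where assignment maps team -> project and cost is
    the sum of rank positions (0 = first choice). Returns (None, None) if there
    are fewer projects than teams.

    Assumption: "as well as possible" is read here as "smallest total rank";
    other fairness criteria would give different assignments.
    """
    teams = list(rankings)
    projects = sorted({p for prefs in rankings.values() for p in prefs})

    best_assignment, best_cost = None, None
    # Try every way of giving distinct projects to the teams (brute force).
    for chosen in permutations(projects, len(teams)):
        cost = sum(rankings[team].index(proj) for team, proj in zip(teams, chosen))
        if best_cost is None or cost < best_cost:
            best_assignment, best_cost = dict(zip(teams, chosen)), cost
    return best_assignment, best_cost


if __name__ == "__main__":
    # Hypothetical preferences for three teams and three projects.
    example = {
        "Team A": ["P1", "P2", "P3"],
        "Team B": ["P1", "P3", "P2"],
        "Team C": ["P2", "P1", "P3"],
    }
    assignment, cost = assign_projects(example)
    print(assignment, cost)
```

In this example Teams A and B both want P1 first, so one of them has to settle for a lower choice. For a class-sized problem, a polynomial-time assignment solver would be a more realistic choice than enumerating permutations.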
Upon completion, the projects are evaluated by the students themselves, the clients, and the faculty, and each year one or two “best project” awards are made. This is the closest experience to actual on-the-job statistical consulting that can be obtained within the academy, and it is very effective both as a learning process and as proof of competency for M.P.S. graduates.
In addition, Cornell allows M.P.S. students to elect to take an additional semester of study, which then introduces the opportunity for an internship in the intervening summer, another form of practical exposure for students.
Overcoming Academic and Industry Challenges
Dr. Bunge feels the most significant challenge is simple, and characteristic of any aspect of the technological or scientific enterprise: keeping abreast, or preferably ahead, of current developments. In practical terms, for example, what software will the students need to be familiar with? SAS® is still important, but R is increasingly so, not to mention scripting languages such as Python, and big data resources or environments such as Hadoop. It is a major undertaking to stay current with developments in these areas, much less to predict their future directions, and academics, while experts in their own fields, are less conversant with trends in industry, government, banking and so forth. From a broader perspective, what will be the industries of the future, and how will they apply data science? A forward-looking program cannot ignore, to take just three examples, quantum computing, genome editing (CRISPR), and for-profit space exploration (e.g., asteroid mining). “These may seem like science fiction at present, but in no time at all, we will be sending our data science graduates to work in these fields, and we must prepare them accordingly,” he said.
Remarkable Accomplishments of the University
The most important achievements of the university are the outcomes for its M.P.S. graduates. First, Cornell has a near-perfect placement rate: for the class of 2017, 96% of its 52 graduates were placed in statistics or data science fields, with a median salary of $75K/year (range $50K-$155K) and job titles such as Data Scientist, Data Analyst, Statistician, and Engineer. Cornell also offers an aggressive program of exploring H-1B visa opportunities for international students. “In addition, we know that our M.P.S. program is highly regarded both within the U.S. and internationally, based on the information from our partners, clients, and employers, and also from our own graduates. Indeed, probably the best endorsement of our program is that our own M.P.S. graduates often hire later graduates into their own firms,” Dr. Bunge added.