How to create a free distributed data collection “app” with R and Google Sheets

A really neat concept that reminds me of my time tracking method:

Jenny Bryan, developer of the googlesheets R package, gave a talk at useR! 2015 about the package.

One of the things that got me most excited about the package was an example she gave in her talk of using the Google Sheets package for data collection at ultimate frisbee tournaments. One reason is that I used to play a little ultimate back in the day.

Another is that her idea is an amazing one for producing cool public health applications. One of the major issues with public health is being able to do distributed data collection cheaply, easily, and reproducibly. So I decided to write a little tutorial on how one could use Google Sheets and R to create a free distributed data collection “app” for public health (or anything else really).

There’s plenty of hype around “big data” in every field, including education, but I’m convinced there are many interesting questions that could be investigated at the school/district level with a “small data” approach supported by this workflow. There are obvious limitations - you wouldn’t want to put individual student/teacher data in a public Google Sheet - but it’s worth thinking about the kinds of programs/practices in schools that could benefit from research supported by distributed data collection.

Long-Term Orientation and Educational Performance

If there’s one common thread between Robin Lake relating The Boys in the Boat to schools, Neerav Kingsland on a recent charter study, and Robert Pondiscio on Hillbilly Elegy, it’s that culture plays a huge role in education. A new NBER Working Paper provides powerful evidence that it may be even more critical than was previously understood.

The authors examine the performance of first- and second-generation immigrant students in Florida to understand the relationship between the “long-term orientation” of the cultures they come from and their educational outcomes. They find that students whose families come from countries with a high degree of long-term orientation score higher and grow more over time than similar students from countries with lower long-term orientation. These students also tend to have better attendance and fewer disciplinary incidents, and are more likely to enroll in more advanced classes.

To put the size of the “long-term orientation” effect into context, the authors note that in their data, a student whose mother holds a college degree scored 40% higher in math than the child of a high school dropout. The same data show that, holding everything else equal, moving from the lowest long-term orientation measure (Puerto Rico) to the highest (South Korea) translates to a 73% higher math score, almost twice the impact of having a parent with a college education rather than one who dropped out of high school.

Education reform is often focused on factors that can be addressed via policy change, like human capital management, standards/curriculum, and school governance/accountability. Yet as challenging as those issues are, we can’t ignore the power of culture, and more specifically, values that support delayed gratification. Fordham’s Education for Upward Mobility conference is a start, but given the power of culture to impact educational outcomes, this conversation deserves a more prominent place within the ed reform community.

ESSA and the Administrative State

Chad Aldeman on the delegation of policy making in ESSA:

Congress was able to reach broad bipartisan agreement on ESSA mainly because it punted on a number of key policy questions. Any reading of ESSA leaves one wondering what exactly Congress meant when it asked states to “meaningfully differentiate” among schools, when it required that states give “substantial weight” to each indicator, or when it stipulated that academic indicators count for “much greater weight” than non-academic ones.

While I am pleased that ESSA will give states more latitude to develop school accountability models, I share Aldeman’s concern that we don’t really know how any of the candidates for president would approach ESSA rule making. This exemplifies a long-term trend in which Congress cedes power to the administrative agencies tasked with implementing legislation - a topic covered in the most recent episode of The Federalist Radio Hour.

If this habit is not broken, we will continue to find ourselves lamenting the uncertain implementation of major legislation.

Aggregation Theory and Education

Five years ago, Marc Andreessen took to the pages of the Wall Street Journal to describe how software was “eating the world”:

“More and more major businesses and industries are being run on software and delivered as online services—from movies to agriculture to national defense. Many of the winners are Silicon Valley-style entrepreneurial technology companies that are invading and overturning established industry structures. Over the next 10 years, I expect many more industries to be disrupted by software, with new world-beating Silicon Valley companies doing the disruption in more cases than not.”

It’s not hard to come up with a list of examples of this prediction coming true, from taxis (Uber) to hotels (Airbnb), but software-driven disruption has yet to significantly change K-12 education. Existing online education offerings, such as Coursera and Khan Academy, show promise, but few would argue that their impact on K-12 education to date could be described as anything close to “disruptive.” Jay P. Greene explains why all-online education has failed to gain traction:

Online courses appear to be less effective in getting the average student to learn and I suspect the problem is that teaching online is less able to create social communities and authentic relationships that are necessary to motivate students. Having a human being in front of students who would be disappointed if students did not learn the material seems important and something that online instruction has not been able to simulate. Students appear to be better motivated to learn when they have an in-person, authentic relationship with a teacher and when they try to please that teacher by working hard to learn. Digital instruction or a human being on the other side of the internet may not be able to create that same relationship and motivation.

The question remains: if K-12 education is going to be disrupted by software, what would it look like? Is there a way to combine the scalability of software with the power of strong interpersonal relationships?

Delving deeper into the disruption of other industries might give us an answer. Ben Thompson’s Aggregation Theory provides a great framework to understand how the Internet is changing entire sectors of our economy:

“The value chain for any given consumer market is divided into three parts: suppliers, distributors, and consumers/users. The best way to make outsize profits in any of these markets is to either gain a horizontal monopoly in one of the three parts or to integrate two of the parts such that you have a competitive advantage in delivering a vertical solution. In the pre-Internet era the latter depended on controlling distribution.

For example, printed newspapers were the primary means of delivering content to consumers in a given geographic region, so newspapers integrated backwards into content creation (i.e. supplier) and earned outsized profits through the delivery of advertising. A similar dynamic existed in all kinds of industries, such as book publishers (distribution capabilities integrated with control of authors), video (broadcast availability integrated with purchasing content), taxis (dispatch capabilities integrated with medallions and car ownership), hotels (brand trust integrated with vacant rooms), and more. Note how the distributors in all of these industries integrated backwards into supply: there have always been far more users/consumers than suppliers, which means that in a world where transactions are costly owning the supplier relationship provides significantly more leverage.

The fundamental disruption of the Internet has been to turn this dynamic on its head. First, the Internet has made distribution (of digital goods) free, neutralizing the advantage that pre-Internet distributors leveraged to integrate with suppliers. Secondly, the Internet has made transaction costs zero, making it viable for a distributor to integrate forward with end users/consumers at scale.”

Applying Aggregation Theory, K-12 schools can be viewed as an industry that integrates the supply and distribution of educational content. As with newspapers, this integrated product is limited to a certain geographic area. The theory also helps us understand that most popular education reforms only seek to improve certain aspects of this relationship instead of fundamentally changing it. For example, charter schools may introduce more competition for students (customers), but they still (generally) operate with the same model of integrated supply and distribution of educational content to students.

Some tech companies are trying to flip the equation by integrating the delivery of educational content more directly with the student experience, making the actual content more of a modular component. Apple, Google, and Facebook (in collaboration with Summit Charter Schools) each offer products that allow teachers to provide a digital classroom experience for their students while also maintaining in-person relationships. This may help them avoid the pitfalls of the all-online education efforts identified by Greene, but can any of them truly change the integration of content and delivery in K-12 education?

Of the three large tech companies, I think the Facebook/Summit approach has the most potential to deliver Aggregation Theory-style disruption. The online classroom products of Apple and Google are mostly abstractions of a traditional classroom in the cloud: teachers assign work to each student, then students submit it by a given due date. All of this happens online, but everyone in the class is still on the same timeline.

The Facebook/Summit collaboration, named Basecamp, takes a more Netflix-ian approach to delivering content. Here’s how it’s described on the Basecamp website:

“Students work through playlists of content at their own pace and take assessments on demand. They also work with teachers to set short-term and long-term goals and connect these back to their daily actions.”

Under the Facebook/Summit approach, time is no longer a constraint for students who have already mastered certain content, nor is it a limitation for students who may need more time to fully grasp a concept before moving on to the next lesson.

This represents a fundamental shift in the relationship between students, how they learn, and the role of educators in supporting students through that process. The Basecamp platform is still in its infancy and has much room to develop, particularly to address the needs of students with disabilities or limited English proficiency. However, we should still recognize the potential this approach could have in changing how K-12 schools operate.

Why American Schools Are Even More Unequal Than We Thought

Using data to inform our conversations about public school performance is a good idea, but too often, the measures we use are reduced to imprecise terms like “proficiency,” which can carry several different meanings when describing a local, state, or national assessment [1].

As Susan Dynarski notes in The Upshot, this is also a common problem with the most-frequently used proxy for “poverty” in education, Free/Reduced Price Lunch (FRPL) eligibility:

“Nearly half of students nationwide are eligible for a subsidized meal in school. Children whose families earn less than 185 percent of the poverty threshold are eligible for a reduced-price lunch, while those below 130 percent get a free lunch. For a family of four, the cutoffs are $32,000 for a free lunch and $45,000 for a reduced-price one. By way of comparison, median household income in the United States was about $54,000 in 2014.

Eligibility for subsidized school meals is clearly a blunt indicator of economic status. But that is the measure that policy makers, educators and researchers rely on when they gauge gaps in academic achievement in schools, districts and states.”

In practice, this means that when we refer to FRPL students as “economically disadvantaged,” we’re really painting with a broad brush. Thankfully, Dynarski and her co-author, Katherine Michelmore, devised a way to use current FRPL data to produce a more precise picture of student economic disadvantage: instead of looking at FRPL-eligibility in the current school year, we can use longitudinal datasets to look at how many years a student has been FRPL-eligible.

The concept is simple. Suppose you are comparing two fifth grade students, student A and student B. If student A has been FRPL-eligible for one year and student B has been FRPL-eligible for five years, it’s clear that student B has a greater economic disadvantage than student A.
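To make the idea concrete, here is a minimal Python sketch of that counting step. It is not Dynarski and Michelmore’s actual method or code; the record layout (student ID, school year, eligibility flag), the function name, and the sample data are all assumptions made up for illustration.

```python
from collections import defaultdict


def years_frpl_eligible(records):
    """Count the distinct school years in which each student was FRPL-eligible.

    `records` is an iterable of (student_id, school_year, eligible) tuples.
    This layout is a hypothetical stand-in for a longitudinal dataset.
    """
    eligible_years = defaultdict(set)
    for student_id, school_year, eligible in records:
        # Touch the key so students with zero eligible years still appear.
        eligible_years[student_id]
        if eligible:
            # A set ignores duplicate rows for the same student and year.
            eligible_years[student_id].add(school_year)
    return {sid: len(years) for sid, years in eligible_years.items()}


# Hypothetical data: student A is eligible only in the most recent year,
# while student B is eligible in all five years on record.
records = (
    [("A", year, year == 2016) for year in range(2012, 2017)]
    + [("B", year, True) for year in range(2012, 2017)]
)

counts = years_frpl_eligible(records)
print(counts)  # {'A': 1, 'B': 5}
```

Both students would look identical under a single-year FRPL flag, but the count of eligible years separates them. In practice the input would come from a state or district longitudinal data system rather than a hand-built list.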

Dynarski continues:

No one ever actively decided that eligibility for subsidized meals was the best way to measure students’ economic disadvantage. The metric was widely available and became by default the standard way to distinguish between poorer and richer children. But it was always an imprecise measure, and we can do better at little cost.

We’ve already seen researchers stand up to advocate for better ways to quantify “proficiency” - I hope we see a similar movement by researchers to advocate for better measures of poverty. Supporting Dynarski’s approach would be a good (and cost-efficient!) step in that direction.

  1. This is a particular problem when NAEP results are released. Just remember, friends don’t let friends engage in misNAEPery.