# A New Science of Cities?

**Published by:** [Pioneering Spirit](https://pioneeringspirit.xyz/)
**Published on:** 2026-06-19
**Categories:** cities
**URL:** https://pioneeringspirit.xyz/a-new-science-of-cities

## Content

In late 2014, when I was a graduate student at NYU's Center for Urban Science and Progress, I wrote a long essay trying to make sense of something that felt genuinely exciting and slightly overhyped at the same time: the idea that cities had become legible to science in some new way. Geoffrey West and Luis Bettencourt had published their scaling law work. Michael Batty had written "The New Science of Cities." Santa Fe Institute researchers were finding power law relationships in urban data that felt analogous to metabolic laws in biology. The pitch was heady -- cities as organisms, quantifiable, governed by deep mathematical regularities that transcended their particular histories and politics. I was skeptical but genuinely curious, and I tried to work through what I actually thought.

Reading it again a decade later, a few things stand out. The basic epistemic cautions still hold. Cities remain stubbornly resistant to unified theory, and the scaling law claims have aged into something more nuanced and contested than their early proponents suggested. What surprised me more was how much the informatics speculation in the middle section pointed toward things that actually happened: real-time urban data, longitudinal learning analytics, the attempt to use web-scale data to ask better research questions. We got a lot of that. We also got machine learning systems optimizing for prediction while remaining largely mute on causation, which is its own kind of epistemic problem. The illegible, messy, tangled, crooked timber of human life in cities remains, and in many ways the disconnect has only grown, as increasingly obsolete institutions persist through a zombie-like logic in the midst of dramatic change in information technology.

The rest of this post is the original paper.

### A New Science of Cities?
“There is no single paradigm that has come to dominate our understanding of cities, and as we will elaborate here, there is unlikely to be one.” – Michael Batty, “The New Science of Cities”

In the past few decades, a diverse collection of physicists, urban planners, computer scientists, engineers and researchers from other technical disciplines has renewed interest in the quantification of urban phenomena. Proponents argue that the discovery of urban scaling laws, increasingly sophisticated agent-based models and the increased availability of data brought on by the digital revolution point towards the development of an entirely new science. Whole research programs have sprung up to study these issues more deeply. What should we make of this conjectured new science of cities?

Two points are unequivocal. First, it’s early days. Urban scaling laws hold for particular countries – primarily developed ones – across the past few decades. Yet will the same sublinear infrastructure scaling relationship hold for Mumbai, or a newly developed city in China, or a colony on Mars? Will those patterns hold across massive technological change? A robust dataset spanning centuries for every city in human civilization would enable answers closer to the level of generality to which this new science aspires.

Second, the tools this science uses may be new, but the questions it raises are decidedly not. The aspiration towards a deeply quantitative understanding of human affairs is at least as old as Adolphe Quetelet’s 1835 “Essays on Social Physics” and his finding of a (then) surprisingly invariant annual murder count for certain cities across time. Considering murder an act of free will, Quetelet expected a high variance that he did not find. That mathematical pattern encouraged Quetelet’s aspirations towards a social physics that has not come to pass.
More broadly, techniques developed in the natural sciences have a long history of application to societal questions, not least the development of statistics itself! Ordinary least squares regression – a technique essentially ubiquitous today across the social sciences – began as a method for combining astronomical observations. This quantification of human affairs, particularly in economics, has endured criticism portraying the effort as simplistic and reductionist at best and useless at worst. Further, quantitative models played a pivotal role in the fires that decimated the South Bronx in the late sixties and in the 2008 financial crisis.

That is not to say data and quantitative methods have played no part in recent human progress. Over the past two centuries, increasingly robust census demographic data and quantitative modeling have improved everything from infrastructure development to school choice. It is hard if not impossible to imagine the tremendous human progress of that period without improved tools for collecting and manipulating data. How would one even begin to conduct the Federal Reserve’s business today with pen and paper?

What then are we to make of the potential for understanding cities with new data collection and manipulation tools pioneered in the natural sciences? Looking at the same challenge from another angle, what is the current state of social science inquiry? Where are the opportunities for improvement?

### When is ceteris really paribus?
“In contrast to the situation of controlled experimentation that motivated much of the development of modern statistical methods, where variables of interest can be varied orthogonally, the political analyst typically confronts a situation where an assortment of equally plausible theories suggest several closely related (and therefore highly correlated) variables as possible causal factors.” – Philip Schrodt, “Seven Deadly Sins of Contemporary Quantitative Political Analysis”

Cities embody dense causal webs. A huge variety of natural environment, artificial infrastructure and human factors coalesce in a single place. Within that complexity, where does one look to find what caused a given phenomenon?

Consider a basic education question: how does class size impact student achievement? A researcher might study how test scores vary by class size across cities. Yet any number of other factors could be related to class size and be the real driving cause. Class size could relate to the level of wealth in the surrounding community, which affects the number of opportunities students have outside of school. A researcher might “control” for those factors with other variables, such as neighborhood home prices, in a linear regression, but that model yields stable coefficients only when the explanatory variables are not strongly correlated with one another. And as Schrodt makes clear, that is not generally a safe assumption:

“As a consequence, linear regression results are notoriously unstable – even minor changes in model specification can lead to coefficient estimates that bounce around like a box full of gerbils on methamphetamines. This is great for generating large numbers of statistical studies… but not so great at ever coming to a conclusion.”

Collinearity can result in coefficients oscillating dramatically or even changing sign across specifications. Further, in a causally dense system like a city or an urban public school, we will likely fail to account for every causally relevant variable within our model.
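To make the collinearity point above concrete, here is a minimal sketch in plain Python. Every number in it is invented for illustration: `wealth` and `home_price` are hypothetical, mean-centred proxies for community affluence that are almost identical, and the small `noise` term is deliberately contrived to line up with the tiny difference between them. The true coefficients on both proxies are 1, yet adding the second proxy to the specification sends the estimates to roughly -9 and 11.

```python
# Toy illustration of collinearity. All data are invented, not drawn from any study.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_one(x, y):
    """OLS slope for y ~ x with no intercept (the data are mean-zero)."""
    return dot(x, y) / dot(x, x)

def fit_two(x1, x2, y):
    """OLS slopes for y ~ x1 + x2 (no intercept) via the 2x2 normal equations."""
    s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    r1, r2 = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12  # close to zero when x1 and x2 are nearly collinear
    b1 = (r1 * s22 - r2 * s12) / det
    b2 = (r2 * s11 - r1 * s12) / det
    return b1, b2

pattern = [1.0, -1.0, -1.0, 1.0]
wealth = [-2.0, -1.0, 1.0, 2.0]                                # hypothetical proxy 1
home_price = [w + 0.01 * p for w, p in zip(wealth, pattern)]   # proxy 2, nearly the same
noise = [0.1 * p for p in pattern]                             # small, contrived noise

# True model: score = 1 * wealth + 1 * home_price + noise
score = [w + h + n for w, h, n in zip(wealth, home_price, noise)]

b_single = fit_one(wealth, score)            # specification A: one proxy
b1, b2 = fit_two(wealth, home_price, score)  # specification B: both proxies

print(f"score ~ wealth:              {b_single:.2f}")
print(f"score ~ wealth + home_price: {b1:.2f}, {b2:.2f}")
```

Specification A attributes the combined effect to the single proxy; specification B splits it between two nearly indistinguishable variables, and a small amount of noise is enough to flip the sign of one of them.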
These omitted variables will bias the coefficient estimates of the included variables. Perhaps then, we should not be surprised at this 1979 statistical meta-analysis summary of the previous few decades of research into the impact of class size on student achievement.

“There is no point in recording the obvious about class size: that teachers worry about it more than nearly everything, that administrators want to increase it, that it is economically important, and the like. The problem with class size is the research. It is unclear. It has variously been read as supporting larger classes, supporting smaller classes, and supporting nothing but the need for better research.” – Gene Glass and Mary Smith, “Meta-Analysis of Research on Class Size and Achievement”

The standard social science research answer to this challenge is simple: the randomized controlled trial. Randomization minimizes selection effects – the influence of whatever else sorts students into a given class size, for instance – and allows for statistical inference about the impact of a particular policy treatment. And a well-designed experiment did indeed test the efficacy of class size reduction. The Tennessee STAR study randomly assigned teachers and students to classes of different sizes, ranging from 13 to 25 students. This blue ribbon effort was used to justify smaller class sizes in Tennessee but also, pivotally, in California. Here was a model of social science research still used as a canonical example in econometrics textbooks.

There’s a simple problem. It didn’t really work in California. The class size reduction policy passed just six weeks before the start of the school year, and implementation was a nightmare. High-poverty school districts struggled to hire qualified teachers to staff the new classrooms, and inequities increased between rich and poor schools. Across the state, the biggest legacy of the policy was the now-common schoolyard bungalow classroom.
What lessons can we learn from this failure about the opportunities for improvement within modern social science?

First, note that many social scientists will likely argue that the fault lay not in the research but rather in its improper application by policymakers. Yet the common practice in social science research is to state all the statistical reasons for uncertainty – the problem of generalizing from a specific randomized controlled trial to other situations, for instance – and then move on to make policy recommendations in no uncertain terms. So what are policymakers supposed to do after decades of yes, no, maybe so education reform answers from the research community? Ignore blue ribbon research with strong conclusions? California had money to spend on education and chose a policy aligned with social science research. What else do social scientists want from policymakers? The world demands decisions, and public school administrators and policymakers must size classes.

Second, class size illustrates the faults of focusing on average treatment effects. Why do we assume class size can be optimized like tuning the batch size of an assembly line process? What would even be the ideal experiment to test the class size optimization hypothesis? Would a statistically robust sample of randomized field trials, with randomly chosen students assigned to randomly assigned teachers in random school sites across a wide variety of student-to-teacher ratios, really determine the relationship of class size to student achievement? These questions illustrate the methodological incoherence of the current social science research paradigm.

Yet even if such a research agenda of randomized controlled trials were possible, how do we know that the relationship is constant across time? Might students of different cultural time periods and technological contexts learn differently in the same sized class?
Why should we be confident that test scores really reflect student achievement over the long run? Why base our research on average outcomes anyway? What logic do we have to support the assumption that optimizing for average students is even possible, let alone desirable?

The dime store metaphysics embedded in the class size research agenda fails a basic task of intellectual inquiry: questioning assumptions. Why should we have one class size to fit all students? More broadly, this type of one-size-fits-all thinking seems far too endemic to current social science. Why should understanding average treatment effects be the default goal of social science? After all:

“Human nature is not a machine to be built after a model, and set to do exactly the work prescribed for it, but a tree, which requires to grow and develop itself on all sides, according to the tendency of the inward forces which make it a living thing.”

“Such are the differences among human beings in their sources of pleasure, their susceptibilities of pain, and the operation on them of different physical and moral agencies, that unless there is a corresponding diversity in their modes of life, they neither obtain their fair share of happiness, nor grow up to the mental, moral, and aesthetic stature of which their nature is capable.” – John Stuart Mill, On Liberty (1859)

Yet the social sciences, founded in the wake of the industrial triumphs of the late 19th century, seem to have ignored Mill’s exhortation. So how might we do better? Where are the opportunities for a science of cities to inform better social science research? Why does informatics matter? What strategies do new digital tools enable?

### The new informatics-enabled frontier

“The idea that no one really knew how to run a government led to the idea that we should arrange a system by which new ideas could be developed, tried out, and tossed out if necessary, with more new ideas brought in – a trial and error system.
This method was a result of the fact that science was already showing itself to be a successful venture at the end of the eighteenth century. Even then it was clear to socially minded people that the openness of possibilities was an opportunity, and that doubt and discussion were essential to progress into the unknown. If we want to solve a problem that we have never solved before, we must leave the door to the unknown ajar.” – Richard Feynman, “The Value of Science”

How might scaling laws and other ideas developed in the nascent science of cities inform the class size debate specifically and the education reform discussion more broadly? These megatrends abstract too far from the day-to-day activity of a classroom to be useful directly, but they offer insight into potential avenues of productive inquiry. So what hypotheses might we generate from the superlinear scaling of intellectual productivity, as measured by patent production, for instance?

I conducted a naïve regression of patents per capita on dropout rates to illustrate the potential for further inquiry. Higher patents per capita correlate with lower dropout rates. In many ways, this result is not surprising. Education does not occur in a vacuum, after all. Children learn outside of formal public schooling from the world around them, and we would expect the intellectual vibrancy of an area – as measured by patents produced, among other potential metrics – to correlate positively with student achievement.

What if, taking inspiration from Michael Batty’s work on network connections through urban environments, we explored how educational attainment, skill development and other measures of intellectual vibrancy change throughout a city spatially and across time? What if we used new data streams enabled by the web – peer-certified skills verified on LinkedIn, or intellectual discourse on social media newsfeeds – to understand how conditions outside of school impact student achievement?
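As a rough illustration of what "superlinear scaling" means in the passage above, the sketch below fits a power law Y = y0 · N^β by ordinary least squares on log-transformed values. The city populations and patent counts are synthetic, generated from an assumed exponent of 1.2 chosen purely for illustration; this is not the regression described in the essay and uses no real data. The function name `fit_power_law` is hypothetical.

```python
import math

def fit_power_law(populations, outputs):
    """Estimate (y0, beta) in Y = y0 * N**beta by OLS on log(N), log(Y)."""
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in outputs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    beta = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            / sum((x - mean_x) ** 2 for x in xs))
    y0 = math.exp(mean_y - beta * mean_x)
    return y0, beta

# Synthetic cities generated from an ASSUMED exponent of 1.2 (not real patent data).
populations = [50_000, 200_000, 1_000_000, 5_000_000]
patents = [0.001 * n ** 1.2 for n in populations]

y0, beta = fit_power_law(populations, patents)
per_capita = [p / n for p, n in zip(patents, populations)]

print(f"estimated exponent beta = {beta:.3f}")
print("patents per capita by city size:", [f"{v:.4f}" for v in per_capita])
```

An exponent above 1 means per-capita output rises with city size, which is why a per-capita measure like the one used in the regression above would tend to be higher in larger cities under this assumption; an exponent below 1, as claimed for infrastructure, implies the opposite.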
And why not use more robust data sources than end-of-year Scantrons alone to understand student achievement? Why not track how students learn throughout the year, as evidenced by blended learning tools like Khan Academy quizzes? This data will certainly have its imperfections but could add real insight into students’ learning trajectories within the school year. Going further, why not longitudinally track how students actually apply their skills after graduation through interfaces like LinkedIn, among other tools? Together, would this not bring our understanding of student learning far closer to the simple intuition that students learn from the entire world around them – not just what standardized tests capture after sitting in a formal classroom?

At the very least, this research program would give us a picture of how well students are learning the skills the world demands. This research would inform how best to build bridges between public schools and employers in the broader city, an urgent question in the education community:

“Everywhere I go, I talk to employers saying we can’t find the skilled workforce we need, and I talk to young people in schools saying we want opportunities and we can’t find them. Somehow the matchmaking, the connection (between business and education) doesn’t happen in too many places.” – U.S. Secretary of Education Arne Duncan

That challenge has nontrivial implications for the economy and for future scientific research – not to mention the moral fiber of the nation.

This informatics-enabled approach to education research would offer additional improvements over the status quo epitomized by the class size question. For starters, by measuring student achievement on more granular time scales, both within public school and beyond graduation, we can examine what makes sense for more specific types of students – such as an athletically inclined math geek – and even individual students.
That research rests on a sounder philosophical framework than simply assuming all students are roughly similar and optimizing class sizes for average student achievement.

Further, why not improve the connection between research and practice? The feedback loop between experimental design, randomized field trial, study and policy change spanned decades in the class size research question, creating large disconnects between the maps researchers build and the territory of actual practice. Imagine an alternative approach. What if we randomly assigned different nonprofit programs aiming to connect students to business mentors as a matter of standard practice, similar to how web companies regularly conduct A/B tests of new features? Experiments on that timescale would enable policy changes in response to what we learn about local conditions, rather than generalizing from a few scant studies. Those experiments assume an enlightened and agile school administration, but there is a lot of opportunity nonetheless – particularly at the scale the web offers.

In conducting complementary nonexperimental analyses, we might use web scraping, access to organic social media data, real-time blended learning assessment data and other web data such as surveys, mashed up with standard student achievement metrics. The web also enables creative aggregation of qualitative and anthropological data. In addition to standard classroom visits, imagine crowdsourcing firsthand accounts of how local neighborhoods interact with specific schools.

With that data in hand, we might ask questions like: what is the optimal algebra class size for students who successfully understand geometric visualizations but struggle with open-ended word problems? We can further contextualize those curricular questions by neighborhood characteristics and the broader learning environment students live in (whether they have internet access at home or not, for instance).
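The A/B-testing idea raised earlier in this section can be sketched in a few lines of standard-library Python. Everything here is hypothetical: the program names, the outcome scores and the helper functions `assign_randomly` and `permutation_p_value` are invented for illustration, and a permutation test is only one of several reasonable ways to compare two randomly assigned groups.

```python
import random

def assign_randomly(student_ids, programs, rng):
    """Shuffle students, then deal them round-robin into the programs."""
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    groups = {program: [] for program in programs}
    for i, student in enumerate(shuffled):
        groups[programs[i % len(programs)]].append(student)
    return groups

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(a, b, rng, n_permutations=2000):
    """Share of random relabelings whose absolute mean difference
    is at least as large as the observed one."""
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return hits / n_permutations

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Step 1: randomize 20 hypothetical students across two mentoring programs.
groups = assign_randomly(range(20), ["mentoring_a", "mentoring_b"], rng)

# Step 2: compare hypothetical follow-up scores collected from each arm.
outcomes_a = [72, 85, 78, 90, 66, 81, 77, 88, 70, 83]
outcomes_b = [68, 74, 80, 65, 71, 79, 69, 73, 75, 67]
p_value = permutation_p_value(outcomes_a, outcomes_b, rng)

print("difference in means:", mean(outcomes_a) - mean(outcomes_b))
print("permutation p-value:", p_value)
```

Because assignment is random, a difference between the arms cannot be explained by which students chose which program, which is the property the class size literature struggled to obtain from observational data.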
Rather than ask what the ideal class size is for all students at any time anywhere, we might work to improve the classroom environment for this unique group of students at this unique stage in their learning trajectory.

These questions show how social science can use lessons from the nascent science of cities to generate contextual hypotheses and flexibly use new informatics tools to gather the appropriate data far more easily than was possible in the research world before the web. Too often social science investigates questions not for any theoretically cogent reason but simply because those are the questions that accessible data allows. Case in point: class size and standardized test scores. Yet as we have seen, new informatics tools enable us to inform whole new categories of research questions with new data sources, such as how classroom characteristics correlate with lifelong learning and career development. In short, we can ask questions based more on the boundaries we understand to be actually relevant than on the categories we impose on the world.

What are the broader takeaways for the nascent science of cities? What does this new research approach mean for how we understand cities specifically and human civilization more broadly? What have new informatics tools not changed? What challenges endure?

### Computation, causality, and the enduring challenges of human inquiry

“…a single problem: how to combine the perspective of a particular person inside the world with an objective view of that same world, the person and his viewpoint included.” – Thomas Nagel, “The View from Nowhere”

The debates above about class size may have acquired a particular focus in the context of twentieth century public education, but the underlying questions about how we understand human affairs have been debated for millennia. How do we know what we know? What knowledge is unbiased by our particular opinions and feelings?
How can we make sense of the world in light of Nagel’s eloquently posed problem?

These questions acquire special urgency in the context of human affairs. Cities are not merely physical systems that we study. They are collections of citizens with opinions that may conflict with our very research agenda! More deeply, there are several reasons why we will not achieve the same precision in modeling cities as we do with purely physical systems.

First, no model can include every relevant variable, so every model will contain specification error. Achieving the right level of abstraction and choosing which variables to leave out is especially important when mapping human behavior, a point Jorge Luis Borges eloquently makes in his parable of a map the size of the empire it attempted to describe.

Second, we will always have measurement error. In addition to data collection errors – typos in data entry or a biased instrument – we can also have disagreement about what constitutes the right metric for a given variable. See the debate about measuring student achievement, for instance. Reconciling those disagreements requires the researcher to make choices.

Third, cities embody incredibly complex and chaotic systems. The nature of chaotic systems is that they can produce widely divergent results from small changes in inputs. How we choose to account for and appropriately model that variance represents another choice on the part of the researcher.

Together these choices influence how we measure, model and thus scientifically understand the systems we study. And unlike in the natural sciences, we do not study a world that we can plausibly isolate from the influences of human agency. What makes cities such fascinating objects of study is the incredible interactions and emergent phenomena that bubble up when so many humans – each with their own plans, projects and objectives – live, work and play in such close proximity to each other.
Further, unlike in the natural sciences, these research choices cannot appeal to simply understanding the world as it objectively is. Researchers have their own political, economic, social, cultural and other biases that affect how any human understands any other. Scientists studying cities cannot come at these questions from a view purporting to originate from nowhere. In fact, the very rationale for studying cities scientifically is to improve how cities operate! Such normative appeals by definition necessitate a view built upon moral judgments decidedly coming from somewhere.

These issues stemming from human agency come to a head when working to understand causality in the context of cities. We do not have merely Hume’s billiard balls to contend with. Rather than imagining several possible outcomes when one isolated billiard ball strikes another isolated ball, in cities we study vastly more complex causal interactions. Class size might plausibly impact student achievement contingent on a nurturing home environment or a safe neighborhood. Perhaps, further, those relationships only hold at certain thresholds.

The legal system expends tremendous energy to uncover whether an event would have occurred “but for” the cause in question. Yet researchers looking not just to understand what happened in the past but to build theories with predictive power for the future must clear a higher bar. We cannot simply understand isolated causes but need to work to understand how higher order interaction effects result in different outcomes. Class size may help or hurt student achievement depending on teacher quality, for instance.

Further, we must remember that omitted variable bias acquires special urgency when we seek to understand a causally dense environment like a city. A variable unspecified in our model might flip the sign or dramatically alter the magnitude of another variable’s relationship.
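The interaction effect mentioned above, where class size helps or hurts depending on teacher quality, can be written as a model with a product term. The sketch below is purely illustrative: the coefficients are invented, the 0 to 10 teacher quality scale is an assumption, and the function names are hypothetical.

```python
# Hypothetical coefficients for:
#   score = intercept + size * class_size + quality * teacher_quality
#           + size_x_quality * class_size * teacher_quality
# None of these values come from data; teacher_quality is an assumed 0-10 scale.
coef = {"intercept": 70.0, "size": -0.5, "quality": 2.0, "size_x_quality": 0.1}

def predicted_score(class_size, teacher_quality, c=coef):
    return (c["intercept"]
            + c["size"] * class_size
            + c["quality"] * teacher_quality
            + c["size_x_quality"] * class_size * teacher_quality)

def marginal_effect_of_size(teacher_quality, c=coef):
    """Change in predicted score from adding one student to the class."""
    return c["size"] + c["size_x_quality"] * teacher_quality

for quality in (2, 5, 8):
    effect = marginal_effect_of_size(quality)
    print(f"teacher quality {quality}: one more student changes the score by {effect:+.2f}")
```

With these made-up numbers the effect of an extra student is negative for a low-quality teacher, zero at a quality of 5 and positive above it, so a single "effect of class size" averaged over all classrooms would describe none of them well.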
And in inferring from our imperfect models, we must remember that understanding the direction in which causality flows, if it can even be said to have a single direction, can be a matter of subjective interpretation. What causes a school to be good? What is the causal connection between school quality and poverty?

In a deep sense, we should not be surprised that causal questions are so tricky. Causes represent a special kind of knowledge, a tool we humans create to make sense of the world around us. As such, causes are irrevocably embedded with the biases stemming from our particular position in time or place, a challenge that acquires particular urgency when studying cities. Our understanding of cities and civilization remains contingent on the time, place and political economy arrangements that produced such ideas.

### Embracing contingency with both arms

“All models are wrong. Some are useful.” – George Box, statistician

What then are we to do? What should we make of a science of cities’ aspiration to a unifying predictive understanding of “the structure, dynamics, growth and organization of cities”? We can appreciate that our study of cities, or really of any aspect of human civilization, is not immune to tradeoffs.

We might choose to ignore the challenges of contingency and assert that the new science of cities articulates universal objective truths irrespective of the particular conditions that gave rise to their formulation. We might aspire to “know everything” about a city and not bother with appreciating how understanding shifts with perspective. There lies the road to metaphysics and dogmatism.

Alternatively, we might choose to obsess over the complications created by the deep links between contingency and our all too human condition. We could follow in the path of postmodernism and assert that meta-narratives are dead, that an understanding that extends beyond our finite position in time and space is only a naïve delusion. There lies the road to stultified skepticism.
Or we might choose a third option. Rather than choosing a single path, we might recognize that many valid perspectives exist for understanding cities. We can appreciate that the new scale of digital data allows us to see cities in a new light. By developing an understanding of the complex interactions between place, flows and networks, this nascent science of cities allows greater appreciation for the landscape of megatrends that bound urban life.

Grand narratives like superlinear patent scaling laws and theories about agglomeration effects do not determine the universal future of our cities. A science of cities cannot tie all of humanity up in a neat mathematical bow. Instead it proceeds from the deep intuition that we as humans do share common conditions regardless of our contingent circumstances. And such a science aims to articulate the deep patterns that tend to emerge from the intersection of so many human experiences in a single place.

By highlighting commonalities and how communities differ, such insights can inform the choices that we make at a local level in areas such as education. Other sources, such as established areas of social science and the ground-level anecdotes of practitioners, can equally offer insights. Those decisions do not have right answers, and no model can replace human judgment. Rather, such questions engage debates about justice that have been ongoing for thousands of years. The English word “city” itself stems from the Latin civitas, the body of citizens, which derives in turn from civis, citizen.

Perhaps then there is no better function for a science of cities than to serve as a forum for the great debates about the future of our cities. By articulating megatrends that affect cities and pioneering computational methods that cut across disciplines, this inquiry can bring together diverse disciplines in a common undertaking – an academic corollary to the nature of cities themselves.

So in sum, what should we make of a new science of cities?
A tremendous opportunity to engage the questions that have challenged humanity since the dawn of civilization.

### References

- Michael Batty, “The New Science of Cities”
- Philip Schrodt, “Seven Deadly Sins of Contemporary Quantitative Political Analysis”
- Gene Glass and Mary Smith, “Meta-Analysis of Research on Class Size and Achievement”
- John Stuart Mill, On Liberty (1859)
- Richard Feynman, “The Value of Science”
- Arne Duncan, U.S. Secretary of Education
- Thomas Nagel, “The View from Nowhere”
- Stephen Stigler, The History of Statistics: The Measurement of Uncertainty before 1900

## Publication Information

- [Pioneering Spirit](https://pioneeringspirit.xyz/): Publication homepage
- [All Posts](https://pioneeringspirit.xyz/): More posts from this publication
- [RSS Feed](https://api.paragraph.com/blogs/rss/@pioneering-spirit): Subscribe to updates
- [Twitter](https://twitter.com/patwater): Follow on Twitter