I have written over 30 articles on cities, and from this a pattern has emerged: It seems to me that the emergence of cities and their optimization can be treated a bit like physics, like a deterministic science.1 I have seen bits and pieces of evidence for this across the Internet, but nothing holistic. That’s what I want to do in this series, with an overview of three big points:
How did cities emerge? (this article)
How did cities grow?
What makes a city better?
1. How Did Villages Appear?
Historically, the biggest driver of the number of people that could live in a specific area—the density of people—was food per acre:2 If one acre can produce a lot of food, it can feed lots of mouths, and lots of people will live there. Less productive land will not feed as many people, so population density will be lower.
Hunting and Gathering
This has been true since forever. As hunter-gatherers, we could only survive if there was enough food around. This usually meant game (hunters) and fruit (gatherers). In other words, calories.3
Places with few calories available, like mountains or deserts, couldn’t host many people. Instead, people concentrated where these resources were most naturally abundant: tropical forests, coasts, riversides, and estuaries. This is why the bountiful US Pacific Northwest could sustain more than 50 ppl/km² thanks to its whales, salmon, and other resources, while the more hostile Amazon Rainforest could only sustain 0.2 ppl/km², and Arctic regions as few as 0.002 ppl/km².
The same rule was valid once humans developed agriculture over the last 10k years or so. But how many people could fit in one place with agriculture?
The Malthusian Trap
More people can produce more food from a farm, up to a certain point.

Beyond that point, more people don’t mean more food production, but they do mean more food consumption.
The balance between production and consumption will determine the size of a village.
Food Production
Let’s take the baseline of 2000 calories per person per day. This means a person needs to consume approximately that amount every day of a year to survive, so 730k calories per year from harvests.4 Then, you have to keep some of the harvest to replant the following year, plus there’s some storage and spoilage. All in all, let’s say ancient societies needed 1M calories per year to sustain one person.5
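As a sanity check, the arithmetic above can be sketched in a few lines of Python. The 10% reseeding and 10% spoilage shares are the rough figures from footnote 5. The function name, and the choice to treat both shares as fractions of the gross harvest, are assumptions made only for this sketch.

```python
# Rough per-person calorie budget. All rates are ballpark assumptions.
DAILY_CALORIES = 2000   # baseline per person per day
DAYS_PER_YEAR = 365

def required_harvest(reseed_share=0.10, spoilage_share=0.10):
    """Calories that must be harvested per person per year.

    Both shares are treated as fractions of the gross harvest,
    which is a simplification made for this sketch.
    """
    eaten = DAILY_CALORIES * DAYS_PER_YEAR
    return eaten / (1 - reseed_share - spoilage_share)

eaten_per_year = DAILY_CALORIES * DAYS_PER_YEAR
harvest_needed = required_harvest()

print(f"Eaten per year:   {eaten_per_year:,} calories")
print(f"Harvest required: {harvest_needed:,.0f} calories")
```

With those shares, the requirement comes out at a bit over 910k calories per person per year, which I round up to 1M. Raising the reseeding share to 25% pushes it past 1.1M.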
How many calories could an acre produce per year? It varies a lot, but ChatGPT, Claude, and Grok tell me it ranged from ~0.6M per acre in Medieval England to 1.5M in Rome and 3M (of rice) in China. This means that England could feed 0.6 people per acre at most (about 150 ppl/km²), while Rome could feed 1.5 (375 ppl/km²) and China 3 (750 ppl/km²).6
Of course, that’s a maximum. Not all land was cultivated, some calories came from other sources like meat, fish, vegetables, or dairy, etc. But this gives us a ballpark of how things worked.
It also tells us something about the mechanics at work in societies at the time. For example, a household had ~5 people,7 so it needed about 5M calories per year. This means the bare minimum farm size for a household ranged from less than 2 acres in China (6M calories produced per year) to 4 acres in Ancient Rome (6M) and a little over 8 acres in Medieval England (8 acres produce 4.8M calories per year).
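To make the density and farm-size numbers easy to reproduce, here is a minimal Python sketch. The yields are the rough figures quoted above, the 250 acres per km² conversion is a rounding (1 km² is about 247 acres), and the function and variable names are mine.

```python
ACRES_PER_KM2 = 250               # rounded; 1 km² is about 247 acres
CALORIES_PER_PERSON = 1_000_000   # harvested calories per person per year (rounded)
HOUSEHOLD_SIZE = 5                # average household

# Calories per acre per year: the rough yields quoted in the text
YIELDS = {
    "Medieval England": 600_000,
    "Ancient Rome": 1_500_000,
    "China (rice)": 3_000_000,
}

def people_per_acre(yield_per_acre):
    """Upper bound on people fed by one acre."""
    return yield_per_acre / CALORIES_PER_PERSON

def min_household_acres(yield_per_acre):
    """Smallest farm area that covers a household's yearly calories."""
    return HOUSEHOLD_SIZE * CALORIES_PER_PERSON / yield_per_acre

summary = {}
for region, y in YIELDS.items():
    density = people_per_acre(y) * ACRES_PER_KM2
    acres = min_household_acres(y)
    summary[region] = (density, acres)
    print(f"{region}: {density:.0f} people/km², {acres:.1f} acres per household")
```

The exact break-even farm sizes come out at about 8.3, 3.3, and 1.7 acres, which I round to 8, 4, and 2 acres to keep the math simple.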
What this tells us is that the population density of one area was the result of two forces:
Soil quality
Technology8
Soil quality was normally highest around volcanoes and close to rivers, thanks to the sediments each one brought and the natural irrigation of rivers and mountains.
Humans could have an influence using technology: clearing forests, irrigating, using manure, improving plows, selecting the best grains to replant, harvest after harvest…
This means the carrying capacity of a piece of land was determined basically by the luck of local soil quality, and by the evolution of human ingenuity, which could produce more from that soil.
Now we know how many people can live on a piece of land, but not how they distribute themselves.
Agglomeration Effects of Villages
Anybody who has lived in a village knows why they exist:
When harvest time comes, households can help each other, and that’s crucial because farm work is highly seasonal: Some weeks have no work, other weeks it’s all hands on deck.
More households hedge risks: If somebody gets sick, the neighbor can tend to them. If one has a bad harvest, the other can share their crop.
Infrastructure is expensive: A single person can hardly build a granary, a mill, a road, a bridge, an irrigation system… When groups get together, they can share these costs.
If you’re alone and a foreigner attacks you, you are cooked. But if you stand together, you are more likely to fend off invaders.
So most people didn’t live alone, and bundled together instead. This has been true of all humans since we were hunter-gatherers.
If people benefit from living together, what’s the maximum size of a village?
Marchetti’s Constant for Villages
Marchetti noticed that city sizes in antiquity were limited to ~30 minutes of walking distance from the center to the edge. The intuition behind that is that people would not want to walk more than 30 min each way to go to work. This then carried through to more modern transportation systems: Trains and cars allowed these 30 min to take you farther, thus growing cities.
If this is true, it probably applied to villages, too, so people wouldn’t want their farms to be much farther than 30 min walking distance from the village. Otherwise, they would just create or join another hamlet.

Since human walking speed is about 4 km/h, in 30 min a person could walk 2 km. This limited how far a village reached: A 2 km radius around the village covers about 3,000 acres,9 which would fit ~375 farms / households at most (~1,800 people) in Medieval England, and 1,500 households (~7,500 people) in China.
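The cap can be recomputed without rounding in a short Python sketch. It assumes a perfectly circular catchment that is entirely farmland, and it reuses the rounded farm sizes of 8 acres (Medieval England) and 2 acres (China). The function name and constants are mine.

```python
import math

WALKING_SPEED_KMH = 4    # typical walking speed
MAX_WALK_HOURS = 0.5     # 30 minutes each way
ACRES_PER_KM2 = 250      # rounded; 1 km² is about 247 acres
HOUSEHOLD_SIZE = 5       # average household

def village_cap(acres_per_farm):
    """Return (radius in km, farmland in acres, households, people)
    for a village whose fields all lie within walking range."""
    radius_km = WALKING_SPEED_KMH * MAX_WALK_HOURS
    area_acres = math.pi * radius_km ** 2 * ACRES_PER_KM2
    households = area_acres / acres_per_farm
    return radius_km, area_acres, households, households * HOUSEHOLD_SIZE

england = village_cap(8)   # rounded minimum farm size, Medieval England
china = village_cap(2)     # rounded minimum farm size, China

for name, (radius, area, households, people) in (("England", england), ("China", china)):
    print(f"{name}: {radius:.0f} km radius, {area:,.0f} acres, "
          f"{households:,.0f} households, {people:,.0f} people")
```

Without rounding the area down to 3,000 acres, the caps come out a little higher, at roughly 390 households in England and 1,570 in China. The order of magnitude stays the same: a few thousand people at most.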
This is why the largest ancient towns hosted at most a few thousand people.10
Note that this is the maximum size of a village, one that would consume everything it produces and would die off in bad harvest years. One where people would be crammed into their households and have only small plots to farm. One where most fields were so far away from the house that the farmer could only see one of them each day, and would waste a ton of time walking there.
Logically, this was not what happened in most cases.
Actual Village Sizes
Most villages in history actually had 150-300 people.
Instead of having big villages every 4 km,11 it would make sense to be closer to the fields, so there would be more small villages, closer together. I took a random sample from India and found the distance between villages is about half that today.
In China, it’s about 500 m!
But this sample from Ancient Greece shows the distance between villages was 5 km!

Of course, the soil of China’s heartland is much more productive than Greece’s. The more food, the more population density. But why did this translate into more small villages rather than bigger villages?
Dispersion Effects
Imagine you’re an Ancient Roman with an 8 acre farm and five children who survived to adulthood, two daughters and three sons. Normally, the sons inherited the land, but if you divide your land in three, each son will get only 2.7 acres—not enough to feed them and their families! Even dividing it between two sons will still give them only 4 acres each, which is not far from subsistence level. Dangerous: Mortality was very high at the time, and a bad harvest could mean much less to eat for everybody. Young, healthy people could survive, but very small children and old people would be weakened, and any infection could kill them. People wanted to avoid getting close to that point.
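The inheritance squeeze is easy to see in numbers. This Python sketch assumes each heir ends up supporting a household of about 5 people and uses the rough Roman yield of 1.5M calories per acre per year from earlier. The names are mine.

```python
CALORIES_PER_PERSON = 1_000_000   # harvested calories per person per year (rounded)
HOUSEHOLD_SIZE = 5                # assumption: each heir supports a household of ~5
ROMAN_YIELD = 1_500_000           # calories per acre per year (rough figure)

# Acres one household needs just to break even (about 3.3)
SUBSISTENCE_ACRES = HOUSEHOLD_SIZE * CALORIES_PER_PERSON / ROMAN_YIELD

def split_farm(farm_acres, heirs):
    """Return (acres per heir, ratio of that plot to the subsistence minimum)."""
    plot = farm_acres / heirs
    return plot, plot / SUBSISTENCE_ACRES

results = {heirs: split_farm(8, heirs) for heirs in (1, 2, 3)}

for heirs, (plot, ratio) in results.items():
    status = "below subsistence" if ratio < 1 else f"{ratio:.1f}x subsistence"
    print(f"{heirs} heir(s): {plot:.1f} acres each, {status}")
```

Three heirs end up below the break-even line, and two heirs get only about 1.2 times the minimum, which leaves little buffer for a bad harvest.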

So the farm owner might want to keep his farm in one piece and give it to the eldest son. But then the other two sons will remain farmless. What will they do?
Whenever a village became too crowded, people would just leave and settle some more land. In the Neolithic, or in the Dark Ages, that could have meant clearing some adjoining forest. But in Roman times, it meant enlisting in the army to go kill some neighboring peoples, take their land, and farm it yourself. Indeed, citizenship and land were the pension, the final prize for having served in the army for years or decades.

Another factor is that not everybody was a farmer who worked on their fields. The lords and their entourage had large estates, but they were not the ones to work on them—the other farmers were. The surplus from their work was dedicated to feeding nobles, along with any other person who didn’t farm directly—mostly warriors, clergy, merchants, artisans, and other townspeople.12
Takeaways
So now we have the main drivers of village emergence:
Food productivity determined the population density of the land.
Farming increased the food productivity of the land, so it increased the population density.
Agglomeration effects pushed people to live together.
The size of a village was capped at a few hundred to a few thousand people because of the transportation costs of walking.
People tended to leave their villages before the cap was reached and they became overcrowded. This usually meant either clearing new fields and settling nearby, or joining armies to conquer new farmland.
How do we go from villages to cities? And how do you actually organize cities to make them great?
1. Christopher Alexander’s A Pattern Language is an amazing book that tries to do something similar, but is much more ambitious (it goes from the ideal size of states to the ideal types of chairs around your dinner table), and more intuitive. I intuit that urbanism should be a much more precise science than it currently seems to be, given how terrible most new urban landscapes are.
2. More precisely, calories per unit of land, but that’s harder to understand, so to introduce the concept, let’s keep it simple.
3. Technically, kilocalories, but everybody calls them calories. I’m confused about this.
4. This is an average. Let’s assume most calories come from grain, which I believe is reasonable for our math. Also, apparently, humans need a bit more than 2000 calories per day.
5. It looks like about 10% of harvests were dedicated to reseeding and 10% to spoilage, which means a total of 910k calories per person would be needed. Also, not all the calories produced were consumed (e.g., the plant stems are not edible). And there was also usually a 10% or so tax, bringing the total to 1M calories harvested per farmer. But farmers are producers, not consumers. The tax was dedicated to feeding non-farmers. So the number is probably around 900k per person, but for rounding purposes I’ll use 1M. The 10% spoilage for grain comes from Medieval England. It’s 20% in extreme situations in Africa today. The reseeding seems to have reached 25% in the Middle Ages. In Rome, apparently it was around 10%.
6. I’m simplifying here and making assumptions, but I’m trying to get to the simplest model to expose how things have probably worked.
7. A typical size was 6-7 people, but there were also a bunch that had only a few or just one, so the average turned out to be ~5.
8. In the broadest sense, this includes better crops, crop rotation, better institutions, better ideas on how to sow, till, harvest…
9. The surface is π × r² = π × 2² ≈ 12.5 km². Since one km² fits about 250 acres, that’s about 3,125 acres.
10. According to Claude, Grok, and ChatGPT. I spot-checked them and they seemed correct.
11. 2 km to go from one village to the Marchetti limit, and another 2 km to go from that limit to the next village.
12. This doesn’t change the fact that the amount of food per acre was the main determinant of total population density. Put another way, if say 70% of workers were farmers, and 30% did other things, the 70% worked the farms that fed the 100%. Those farms, and their production, were the limiting factor for the entire population, since farmers could work on more than just their own farm.
Very exciting!
I think Smil has some of the numbers you may be looking for in Energy and Civilization.
I am curious how this fits in with the arguments you’ve listed in Why Some Cities Thrive, particularly about being situated on fall lines.
Also, Nick Szabo has a cool post on the trade-off between security and productivity of farms.
https://unenumerated.blogspot.com/2005/12/security-and-productivity-of-farms.html?m=1
He argues that unified Japan and England got productivity dividends since their security costs were lower than non-island settings.
I thought villages, and later on cities, grew at crossroads or along waterways, where people did commerce. Farmers don't need to live in villages or cities.