
90% of All Human Data Was Created in Just Two Years — What That Number Actually Means

March 28, 2026 · 4 min read

The Fact

About 90% of the world's data has been created in just the last two years, driven by social media, IoT, and streaming.

An Exponential Curve, Not a Straight Line

Human beings have been recording information for thousands of years. Cave paintings, clay tablets, papyrus scrolls, printed books, photographs, audio recordings, and video tapes each represented a leap in our capacity to capture and store information. Yet all of those millennia of human documentation combined represent a fraction of the data now generated in a single year — and the gap widens with every passing month.

The reason is exponential growth. When a technology doubles its data output every year or two, the most recent period always dwarfs everything that came before. Each time output doubles, the newest period alone outweighs all prior periods combined, and after twenty doublings the history before the recent surge becomes almost statistically invisible. The 90 percent figure is not a coincidence or an exaggeration; it is the mathematical consequence of sustained exponential growth in data generation.
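To make the arithmetic concrete, here is a minimal sketch in Python. The growth factors are assumptions chosen purely for illustration, not measured figures: if output doubles every year, the most recent two years hold roughly three quarters of everything ever produced; if output roughly triples every year, the share climbs toward the famous 90 percent.

```python
# Minimal sketch: what share of all data ever created falls in the most
# recent two years, under an assumed constant annual growth factor?
# The growth factors below are illustrative assumptions, not measurements.

def recent_two_year_share(years: int, growth: float) -> float:
    """Fraction of cumulative output produced in the final two of `years` annual periods."""
    yearly_output = [growth ** y for y in range(years)]  # 1 unit in year 0, growing each year
    return sum(yearly_output[-2:]) / sum(yearly_output)

for growth in (2.0, 3.0):
    share = recent_two_year_share(years=10, growth=growth)
    print(f"growth x{growth:.0f} per year -> last two years hold {share:.0%} of all data")
# growth x2 per year -> last two years hold 75% of all data
# growth x3 per year -> last two years hold 89% of all data
```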

Three Engines Driving the Explosion

IBM and other research organizations that have studied global data creation identify three primary drivers of this extraordinary growth.

Social media platforms generate enormous quantities of data through posts, images, videos, comments, likes, and the behavioral signals left by billions of users. A single major platform can collect petabytes — millions of gigabytes — of new data daily. Every image uploaded, every story viewed, every advertisement clicked generates data points that are stored, analyzed, and used to improve algorithmic recommendations.

The Internet of Things has connected billions of physical devices — thermostats, fitness trackers, industrial sensors, traffic cameras, agricultural monitors — to networks that continuously report their readings. A modern smart factory might have thousands of sensors, each generating data points every second, and that information must be stored and processed to enable the automation and efficiency that make the installation worthwhile.
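To get a feel for the volumes involved, here is a rough back-of-envelope calculation. Every number in it is an illustrative assumption rather than a figure from any real facility:

```python
# Back-of-envelope estimate of raw sensor data from a hypothetical smart factory.
# Sensor count, sampling rate, and reading size are all illustrative assumptions.

sensors = 5_000              # assumed sensors on the factory floor
readings_per_second = 1      # assumed sampling rate per sensor
bytes_per_reading = 64       # assumed size of one timestamped reading

seconds_per_day = 24 * 60 * 60
daily_bytes = sensors * readings_per_second * bytes_per_reading * seconds_per_day

print(f"{daily_bytes / 1e9:.1f} GB of raw readings per day")         # ~27.6 GB/day
print(f"{daily_bytes * 365 / 1e12:.1f} TB of raw readings per year")  # ~10.1 TB/year
```

Even at these assumed rates, a single site produces terabytes a year before any video feeds, logs, or derived analytics are counted.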

Streaming services for video and music account for a staggering share of global internet traffic, and all of that content must be stored on servers, replicated across data centers for reliability, and served to users on demand. Netflix, YouTube, and their competitors together constitute one of the largest repositories of data on earth, and the content library grows continuously.

What Gets Lost in the Numbers

The 90 percent statistic is a striking way to communicate data growth, but it requires careful interpretation. Not all data is equivalent. A high-resolution 4K video file occupies orders of magnitude more storage than the digitized text of a medieval manuscript, but that does not mean the video is more historically or culturally significant. Much of the data generated in the modern era is ephemeral (temporary files, sensor readings that have served their purpose, cached web content), and a significant portion is duplicate, stored in multiple locations for redundancy.
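A rough comparison makes the disparity concrete. The file sizes below are purely illustrative assumptions: a plain-text transcription of a long manuscript on the order of half a megabyte, and a feature-length 4K video at typical streaming bitrates on the order of 20 gigabytes.

```python
# Illustrative storage comparison; both file sizes are assumptions, not measurements.
manuscript_text_bytes = 500 * 1024       # ~500 KB plain-text transcription
video_4k_bytes = 20 * 1024**3            # ~20 GB feature-length 4K file

ratio = video_4k_bytes / manuscript_text_bytes
print(f"The video takes roughly {ratio:,.0f} times the storage of the text")  # ~42,000x
```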

Data is also not the same as information or knowledge. The challenge for businesses, governments, and researchers is not merely generating data but making sense of it. The field of data science and the rise of machine learning have both emerged largely in response to this challenge — the need for tools and techniques capable of finding meaningful patterns in quantities of information no human analyst could review manually.

The Infrastructure Challenge

Storing two years of output that amounts to 90 percent of all human data requires infrastructure of extraordinary scale. Global data center capacity has grown enormously, consuming land, water for cooling, and electricity in quantities that have drawn significant environmental scrutiny. The energy footprint of cloud computing and data storage has become a meaningful factor in corporate sustainability reporting, and engineering innovations in chip design, storage density, and cooling efficiency are driven substantially by the need to handle ever-larger data volumes.

The curve shows no signs of flattening. Artificial intelligence systems themselves generate new data as they are trained and deployed, adding another layer to the growth. The data generated today will likely be dwarfed by what the next two years produce — and so on indefinitely, until either the technology changes or the infrastructure constraints become insurmountable.



FactOTD Editorial Team


The FactOTD editorial team researches and verifies every fact before publication. Our mission is to make learning effortless and accurate.
