AI Now Decodes Lost Languages Faster Than Historians

In a basement archive in Naples sit papyrus scrolls that were sealed inside volcanic rock for almost two millennia. When researchers first tried to unroll them in the eighteenth century, the carbonized layers crumbled into black powder. For centuries the texts, possibly the only ancient library still in existence, were deemed unreadable. Then, in October 2023, Federica Nicolardi, a researcher at the University of Naples, got an email that included a picture. Without ever opening one of those scrolls, a group of computer scientists had used artificial intelligence to read words from it. The first word they were able to retrieve was “purple.”

It’s difficult not to sense something subtly seismic in that moment. It wasn’t a conference keynote or a big announcement from a tech company. It was just a picture of old ink, recreated pixel by pixel by a machine that had never been told what language it was looking at. That is where we are now: AI is reading things that people have spent entire careers unable to read.

Key Facts & Reference Overview
Field	Computational Linguistics / AI Archaeology
Key Institution	MIT CSAIL (Computer Science & Artificial Intelligence Lab)
Lead Researchers	Regina Barzilay, Jiaming Luo (MIT); Yuan Cao (Google Brain)
Languages Cracked So Far	Linear B (1400 BC), Ugaritic, Akkadian cuneiform, ancient Greek inscriptions, Joseon-era Hanja
Notable AI System	Ithaca (DeepMind / Oxford); ProtoSnap (Cornell & Tel Aviv University); MIT Neural Decipherment system
Accuracy Achieved	Ithaca: 62% alone, 72% with human collaboration; Linear B cognates: 67.3% correct
Vesuvius Challenge	2023 project using 3D scanners + AI to read carbonized Herculaneum scrolls — previously considered unreadable
Current Frontier	Linear A (still undeciphered), Proto-Elamite, Rongorongo (Easter Island script)
Human Languages (Total Estimated)	~31,000 across all of history; only ~6,500–7,000 still spoken today

The larger project, called the Vesuvius Challenge, combined three-dimensional scanning with neural networks trained on the subtle texture differences between ink and papyrus beneath layers of char. The scrolls came from a villa in Herculaneum that was buried by the eruption of Vesuvius in 79 AD. Scholars have long suspected that the library held writings by Epicurus and other philosophers that could alter our understanding of ancient philosophy. Those texts are now gradually coming to light. It’s still unclear how much more can be recovered, but it feels remarkable that anything has come back at all.

Researchers at MIT and Google Brain were among the first to show machine decipherment at scale. Professor Regina Barzilay and PhD candidate Jiaming Luo of MIT, working with Yuan Cao of Google Brain, developed a system that could decipher Linear B, a script used to write an early form of Greek. British architect Michael Ventris needed many years of diligent, brilliant work to crack it in the early 1950s. The machine correctly matched just over 67% of Linear B words to their Greek cognates in a fraction of the time. With similar outcomes, the team also tackled Ugaritic, an ancient Semitic language related to Hebrew that dates back more than three millennia.

What made the method unique was the fundamental realization that languages don’t change at random. Some sound shifts are common and others are almost impossible: a “p” in a parent language frequently becomes a “b” in a descendant, for example. By encoding those constraints mathematically, the MIT system could navigate the vast space of potential character mappings without a human having to guess the correct starting point. Surprisingly, it could also decide for itself whether two languages were related, a question that can take linguists decades to settle.
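To make the idea of encoded sound-shift constraints concrete, here is a toy Python sketch. It is not the MIT system, which is a neural model; it swaps in a much simpler technique, a weighted edit distance in which substitutions between sounds that commonly shift into one another cost less than arbitrary ones. The shift pairs, the costs, and the function names are all illustrative assumptions.

```python
# Toy illustration only: a weighted edit distance, not the MIT neural model.
# The shift pairs and costs below are hand-picked assumptions.
PLAUSIBLE_SHIFTS = {
    frozenset("pb"), frozenset("td"), frozenset("kg"),
    frozenset("sz"), frozenset("fv"),
}

def substitution_cost(a: str, b: str) -> float:
    """Cheap for identical sounds or plausible shifts, expensive otherwise."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in PLAUSIBLE_SHIFTS:
        return 0.2
    return 1.0

def shift_aware_distance(source: str, target: str) -> float:
    """Edit distance where insertions and deletions cost 1.0 each and
    substitutions use substitution_cost."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, a in enumerate(source, 1):
        curr = [float(i)]
        for j, b in enumerate(target, 1):
            curr.append(min(
                prev[j] + 1.0,                         # delete a
                curr[j - 1] + 1.0,                     # insert b
                prev[j - 1] + substitution_cost(a, b)  # substitute a -> b
            ))
        prev = curr
    return prev[-1]

def best_cognate(word: str, candidates: list[str]) -> str:
    """Pick the candidate reachable from word by the cheapest sound changes."""
    return min(candidates, key=lambda c: shift_aware_distance(word, c))

print(best_cognate("pater", ["mater", "bater", "water"]))  # prints "bater"
```

Because “p” to “b” is listed as a plausible shift, “bater” sits much closer to “pater” than “mater” or “water” do, even though each differs by a single letter. A real system would learn such preferences from data rather than take them from a hand-written table.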

Then there’s the work from Google DeepMind and Oxford, where a model called Ithaca was trained on digitized Greek inscriptions dating from the seventh century BC to the fifth century AD. Ithaca completed the missing characters in damaged political decrees from early democratic Athens with 62% accuracy, compared to 25% for human experts working alone. The more intriguing figure, however, was 72%. That was the accuracy when the human experts used Ithaca as a kind of extremely quick and well-read collaborator. The scholar and the machine together were better than either one acting alone. There is a lesson in that, though which lesson depends on who you ask.
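For a rough sense of what restoring missing characters involves, the following Python sketch fills isolated single-character gaps using trigram counts from a small reference corpus. This is a deliberately simple stand-in and not how Ithaca works; Ithaca is a deep neural network. The corpus, the `?` gap marker, and the function names are assumptions made for illustration, and the example uses English rather than Greek.

```python
import string
from collections import Counter

def train_trigrams(corpus: list[str]) -> Counter:
    """Count every three-character window in the corpus lines."""
    counts = Counter()
    for line in corpus:
        padded = f" {line} "  # pad so word edges also get context
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    return counts

def restore(text: str, counts: Counter,
            alphabet: str = string.ascii_lowercase, gap: str = "?") -> str:
    """Fill each isolated gap with the letter seen most often between
    its left and right neighbours. Assumes gaps are one character wide
    and never adjacent to each other."""
    chars = list(f" {text} ")
    for i, ch in enumerate(chars):
        if ch == gap:
            left, right = chars[i - 1], chars[i + 1]
            chars[i] = max(alphabet, key=lambda c: counts[left + c + right])
    return "".join(chars).strip()

# Hypothetical miniature "corpus" of intact inscriptions.
corpus = ["the council and the people decided", "the people and the council"]
counts = train_trigrams(corpus)
print(restore("the pe?ple", counts))  # prints "the people"
```

A model this small only looks at immediate neighbours, which is why it needs a human to judge whether a suggestion makes sense. That division of labour is the same one behind the 72% figure.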

Another type of historical excavation is taking place in South Korea. Few Korean scholars today are proficient in Hanja, the classical Chinese script used to write the records of the Joseon Dynasty, which ruled for over 500 years. To fill in the gaps left by incomplete translations, researchers have been developing multilingual neural translation systems that train concurrently on the ancient script, old Korean, and contemporary languages, learning patterns across massive datasets. One of the documents describes a king who begged the court historian not to record his fall from his horse during a military drill. The historian recorded it anyway. AI helped us read that, too.

The material is even older in the deserts of Syria and Iraq. For generations, museum collections have held Akkadian cuneiform tablets, which contain some of the oldest writing ever created. Their wedge-shaped marks were pressed into wet clay and are only partially understood. These days, deep learning models trained on Semitic language families are being used to reconstruct damaged texts, predict missing lines, and uncover ceremonial phrases that scholars had been unable to place in any known context. More than 100,000 annotated cuneiform lines, along with linguistic information from Hebrew, Arabic, and related languages, were fed into a system known as CDLI-100M. Some of what it is recovering may represent truly new knowledge about Mesopotamian religion and governance, although verifying that still firmly belongs to humans.

Observing all of this could easily lead one to believe that historical linguistics is on the verge of being automated. That appears to be the wrong reading. AI is most helpful with the tediously mechanical aspects of decipherment, such as mapping the probabilities of sound shifts across language families and comparing character distributions by brute force. Interpreting a recovered text, placing it in its cultural context, and assessing its true significance for our understanding of history still require years of human thought. Ithaca performed better when it worked with human experts than when it worked in their place.

Scripts that have resisted all of this still exist. The Minoan civilization used Linear A, the older cousin of Linear B, which has never been cracked. Without additional data to train on, it’s not clear that AI alone will be able to crack it. The Rongorongo script from Easter Island, Proto-Elamite from ancient Iran, and the Phaistos Disk are all still genuinely mysterious, and it’s not because the machines haven’t tried. Sometimes the record is just too scant, too fragmentary, and too disconnected from anything known. According to one researcher, human ingenuity is still crucial. The algorithms sharpen the chisel, but the decision of where to strike remains a human one.

However, for the languages with sufficient data, whether tablets, inscriptions, or scorched papyrus fragments, the speed of current progress is truly hard to comprehend. Texts that were unintelligible for as long as three millennia are being read in just a few months. We remember kings who fell off horses. Word by word, philosophers whose writings were believed to have burned at Herculaneum are speaking again thanks to a neural network trained in California. It’s an odd thing to see. It turns out that history wasn’t completely lost. A portion of it was simply waiting for the kind of patience that only machines possess.