Research

Harvard Trains AI Model on Pre-1931 Public Domain Content

Harvard Magazine · May 20, 2026 · 1 min brief

In brief

Researchers at Harvard have trained a large language model called Talkie on public domain content published before 1931 and held in Harvard libraries.
- The model responds fluently to prompts about early aviation or 1920s social customs but falters on modern topics.
The model is significant because it shows how artificial intelligence can learn from historical data.
Since its release, users have tested whether Talkie can forecast future events or generalize to concepts it was not taught.
Given small snippets of Python, Talkie has demonstrated the ability to produce new code.
Talkie's development may change how we think about the connection between artificial intelligence and libraries and archives.
- AI may come to rely on these institutions as much as on technology companies.
Researchers will now watch how Talkie and similar models perform.

Terms in this brief

Talkie: A large language model trained by Harvard researchers on public domain content published before 1931. Talkie can answer questions about early aviation or 1920s social customs but struggles with modern topics. It shows how AI can learn from historical data, and it is also notable for producing new code when given small Python snippets.
