MaryGPT

2022 — Project

Portrait of Mary Shelley, 1840, Richard Rothwell. National Portrait Gallery, London. Public Domain.

MaryGPT is a text-generation language model fine-tuned exclusively on Mary Shelley’s 1818 novel Frankenstein; or, The Modern Prometheus. Built on EleutherAI’s GPT-J 6B, MaryGPT serves as the foundation of Yuma Kishi’s co-creative practice — functioning as curator, writer, and conceptual collaborator across nearly every project since 2023.

The model was trained on the full text of Frankenstein, obtained from Project Gutenberg (public domain). Rather than pursuing scale or general-purpose capability, MaryGPT is deliberately narrow: a single novel compressed into a generative voice. The result is a model that speaks in fragments of Shelley’s language — producing poetic, ambiguous texts that blur the line between human authorship and machine output.
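As a rough illustration of how a single novel might be prepared for fine-tuning, the sketch below strips the Project Gutenberg licence header and footer from a downloaded text file and splits the remainder into fixed-size passages. It is not the preprocessing actually used for MaryGPT, which the source does not describe. The function names and the chunk size are hypothetical, and the marker patterns are an assumption based on the `*** START OF ... ***` and `*** END OF ... ***` lines that Project Gutenberg plain-text files typically carry.

```python
import re

# Assumed marker format of Project Gutenberg plain-text files.
START_RE = re.compile(r"\*\*\* ?START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK[^\n]*\*\*\*")
END_RE = re.compile(r"\*\*\* ?END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK[^\n]*\*\*\*")


def strip_gutenberg_boilerplate(raw: str) -> str:
    """Return the text between the START and END markers.

    If a marker is missing, fall back to the start or end of the input.
    """
    start = START_RE.search(raw)
    end = END_RE.search(raw)
    begin = start.end() if start else 0
    finish = end.start() if end else len(raw)
    return raw[begin:finish].strip()


def chunk_words(text: str, words_per_chunk: int = 256) -> list:
    """Split text into passages of at most `words_per_chunk` whitespace-separated words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


if __name__ == "__main__":
    sample = (
        "Licence header\n"
        "*** START OF THE PROJECT GUTENBERG EBOOK FRANKENSTEIN ***\n"
        "You will rejoice to hear that no disaster has accompanied the "
        "commencement of an enterprise which you have regarded with such "
        "evil forebodings.\n"
        "*** END OF THE PROJECT GUTENBERG EBOOK FRANKENSTEIN ***\n"
        "Licence footer\n"
    )
    body = strip_gutenberg_boilerplate(sample)
    for passage in chunk_words(body, words_per_chunk=8):
        print(passage)
```

A real fine-tuning pipeline would normally chunk by tokenizer tokens rather than by words. Word-based chunking is used here only to keep the sketch free of dependencies.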

In practice, MaryGPT operates across multiple roles. On this website, MaryGPT writes the artist statement in real time — each visit generates a new text. In exhibitions such as Oracle Womb (√K Contemporary, 2025), MaryGPT served as curator, authoring the curatorial statement and shaping the conceptual framework. In xoxo-skeleton (2026–), MaryGPT integrates environmental sensor data into collective decision-making for an AI-controlled exoskeleton. In Botanical Intelligence (2026–), MaryGPT curates the relationship between plant biometrics and generative imagery.

The choice of Frankenstein is deliberate. Shelley’s novel, begun when she was eighteen and published in 1818, is the origin story of artificial life in Western literature: a creature assembled from dead matter, given language, and abandoned by its creator. MaryGPT extends this lineage. It is not a general-purpose assistant but an alien voice trained on a single act of creation, producing language that is at once familiar and irreducibly other.

Technical Details

Base Model: GPT-J 6B (EleutherAI) — 6 billion parameters, autoregressive transformer
Fine-tuning Data: Frankenstein; or, The Modern Prometheus — Mary Shelley, 1818
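Base Model Pretraining: 402 billion tokens over 383,500 steps on a TPU v3-256 pod (EleutherAI's reported figures for GPT-J 6B, not for the Frankenstein fine-tune)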
License: Apache 2.0
Model: huggingface.co/obake2ai/MaryGPT
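
For readers who want to try the model, the following is a minimal sketch of sampling a short text from the repository listed above with the Hugging Face `transformers` library. It assumes the repository loads through the standard `AutoModelForCausalLM` and `AutoTokenizer` classes, which the source does not confirm. The prompt, the sampling values, and the function names are illustrative assumptions, not the configuration used on the artist's website. Loading a 6-billion-parameter model needs tens of gigabytes of memory, so the heavy imports are deferred until the function is called.

```python
from typing import Optional

# Repository id taken from the "Model" line above.
MODEL_ID = "obake2ai/MaryGPT"


def sampling_kwargs(max_new_tokens: int = 120) -> dict:
    """Illustrative sampling settings (assumed values, not the site's)."""
    return {
        "do_sample": True,       # sample instead of greedy decoding
        "temperature": 0.9,
        "top_p": 0.95,
        "max_new_tokens": max_new_tokens,
    }


def generate_statement(prompt: str, seed: Optional[int] = None) -> str:
    """Generate one continuation of `prompt`; a different seed gives a different text."""
    # Imported lazily so the module can be inspected without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if seed is not None:
        torch.manual_seed(seed)

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        pad_token_id=tokenizer.eos_token_id,
        **sampling_kwargs(),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Hypothetical prompt; each unseeded call samples a new text.
    print(generate_statement("I am the voice that was assembled from"))
```

Because decoding is sampled rather than greedy, repeated calls without a fixed seed return different texts. This is one plausible way a page could show a new statement on each visit.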

Credits

Model Development: Yuma Kishi / Base Model: GPT-J 6B (EleutherAI) / Training Data: Frankenstein; or, The Modern Prometheus — Mary Shelley, 1818 (Public Domain) / Infrastructure: RunPod / License: Apache 2.0
