Documentation

Everything you need to know about auto-seed — from quick setup to advanced usage.

Getting Started

auto-seed requires Node.js 18+ and runs directly with npx or bunx — no installation needed.

1. Configure your LLM provider

Run the interactive setup to choose your provider and enter your API key. Configuration is stored at ~/.auto-seed/config.json with file permissions restricted to 0600.

$ npx auto-seed init

2. Generate seed data

Point auto-seed at your project. It auto-detects your schema file, calls the LLM once to plan data generation, then produces your seed file locally.

$ npx auto-seed generate

That's it. You now have a ready-to-run seed file. Customize with flags like --rows, --format, and --seed for more control.

Commands

auto-seed init

Interactive first-run setup. Prompts you to select a provider (Anthropic, OpenAI, or Google Gemini), enter your API key, choose a default model, and set the default output format.

auto-seed generate

The primary command. Parses your schema, calls the LLM to create a generation strategy, produces rows with a local deterministic engine, and writes the output file. See CLI Reference for all available flags.

auto-seed config

Manage configuration without editing the JSON file directly.

Subcommand	Description
config set <key> <value>	Set a configuration value
config get <key>	Read a configuration value (API keys are masked)
config list	Show all configuration
config path	Print the config file path

CLI Reference

All flags for auto-seed generate.

Flag	Default	Description
--schema <path>	auto-detect	Path to schema file
--format <ts|sql>	sql	Output format
--rows <spec>	10	Global count or per-table spec (e.g. `users:25,posts:200`)
--mode <plan|direct>	plan	Generation mode
--out <path>	./seed.<ext>	Output file path
--seed <number>	random	Deterministic RNG seed for reproducible output
--tables <a,b,c>	all	Generate only specified tables
--locale <code>	en	Faker locale hint (e.g. de, fr, ja)
--provider <name>	from config	Override LLM provider for this run
--model <id>	from config	Override LLM model for this run
--dialect <name>	postgresql	SQL dialect (postgresql, mysql, sqlite)
--hint <text>	—	Domain context for realistic data (e.g. "fintech app")
--dry-run	false	Print Seed Plan and summary without writing
--plan-only	false	Write only the Seed Plan JSON (no row generation)
--plan <path>	—	Reuse a saved Seed Plan (skips LLM call entirely)
--max-direct-rows <n>	200	Row cap for direct mode
-y, --yes	false	Skip confirmation prompts (CI-friendly)
--verbose	false	Debug logging and stack traces
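
The --rows flag accepts either a single global count or a comma-separated per-table spec. The TypeScript sketch below shows one way such a spec could be parsed. `parseRowsSpec` and the `RowsSpec` shape are hypothetical names for illustration, not auto-seed's internals, and the fallback of 10 rows for unlisted tables is an assumption based on the documented default.

```typescript
// Hypothetical helper: not part of auto-seed's public API.
interface RowsSpec {
  defaultCount: number;             // assumed to apply to tables not listed explicitly
  perTable: Record<string, number>; // e.g. { users: 25, posts: 200 }
}

function parseRowsSpec(spec: string, fallback = 10): RowsSpec {
  const trimmed = spec.trim();
  // A bare integer such as "100" is treated as a global count.
  if (/^\d+$/.test(trimmed)) {
    return { defaultCount: Number(trimmed), perTable: {} };
  }
  const perTable: Record<string, number> = {};
  for (const part of trimmed.split(",")) {
    const pieces = part.split(":").map((s) => s.trim());
    const table = pieces[0];
    const count = pieces[1];
    if (pieces.length !== 2 || !table || !count || !/^\d+$/.test(count)) {
      throw new Error(`Invalid --rows entry: "${part}"`);
    }
    perTable[table] = Number(count);
  }
  return { defaultCount: fallback, perTable };
}
```

Rejecting malformed entries up front corresponds to the "invalid flag" case listed under exit code 1.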

Schema Support

auto-seed parses your schema into a common intermediate representation, extracting models, fields, relations, and constraints. Three formats are supported.

Prisma (.prisma)

Parsed via @prisma/internals. Supports @id, @unique, @@id, @@unique, @relation, @default, optionality (?), and enums.

Auto-detected at: prisma/schema.prisma, schema.prisma

SQL DDL (.sql)

Parsed via node-sql-parser. Supports CREATE TABLE statements with inline and table-level PRIMARY KEY, UNIQUE, FOREIGN KEY, SERIAL/AUTO_INCREMENT, NOT NULL, and DEFAULT.

Dialects: postgresql (default), mysql, sqlite. Auto-detected at: schema.sql, db/schema.sql, or any *.sql at the project root.

TypeORM Entities (.entity.ts)

Parsed via ts-morph. Supports @Entity, @Column, @PrimaryGeneratedColumn, @ManyToOne, @OneToMany, @OneToOne, @JoinColumn, @Unique, @CreateDateColumn, and @UpdateDateColumn.

Auto-detected at: src/**/*.entity.ts, entities/**/*.entity.ts

Auto-detection order

When --schema is not specified, auto-seed searches in this order:

1. prisma/schema.prisma or schema.prisma
2. schema.sql, db/schema.sql, or any *.sql at root
3. TypeORM entities in src/**/*.entity.ts or entities/**/*.entity.ts
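
The search order above can be sketched as a prioritized rule list. This TypeScript example works on a plain list of project-relative paths so that it stays self-contained. `detectSchema`, `SchemaKind`, and the regular expressions are illustrative assumptions, not auto-seed's actual detection code.

```typescript
// Hypothetical sketch: the rules mirror the documented order only.
type SchemaKind = "prisma" | "sql" | "typeorm";

interface Detected {
  kind: SchemaKind;
  paths: string[];
}

const RULES: { kind: SchemaKind; matches: (p: string) => boolean }[] = [
  { kind: "prisma", matches: (p) => p === "prisma/schema.prisma" || p === "schema.prisma" },
  // "any *.sql at root" is read here as: no directory separator in the path.
  { kind: "sql", matches: (p) => p === "db/schema.sql" || /^[^/]+\.sql$/.test(p) },
  // src/**/*.entity.ts or entities/**/*.entity.ts, at any depth.
  { kind: "typeorm", matches: (p) => /^(src|entities)\/(.+\/)?[^/]+\.entity\.ts$/.test(p) },
];

function detectSchema(projectFiles: string[]): Detected | null {
  // The first rule with at least one matching file wins.
  for (const rule of RULES) {
    const paths = projectFiles.filter(rule.matches);
    if (paths.length > 0) return { kind: rule.kind, paths };
  }
  return null; // the CLI documents exit code 2 for "no schema found"
}
```

Passing --schema bypasses this search entirely, which is the safer choice when a project contains more than one schema format.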

Generation Modes

Plan Mode (default)

auto-seed makes a single LLM call to design a structured generation strategy, the Seed Plan. A local deterministic engine then produces all rows using Faker.js and a seeded RNG. Because row generation happens locally, this scales to millions of rows for the cost of one inexpensive API call.

$ npx auto-seed generate --mode plan
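
To illustrate why a seeded RNG makes output reproducible, here is a minimal TypeScript sketch. It uses the Park-Miller "minimal standard" generator purely as a stand-in, since the RNG that auto-seed actually uses is not documented here. `seededRandom` and `sampleAges` are hypothetical names.

```typescript
// Park-Miller "minimal standard" generator (a Lehmer LCG). Chosen for brevity;
// it stands in for whatever seeded RNG the real engine uses.
function seededRandom(seed: number): () => number {
  const M = 2147483647; // 2^31 - 1, a prime modulus
  let state = Math.floor(Math.abs(seed)) % M;
  if (state === 0) state = 1; // zero is a fixed point and must be avoided
  return () => {
    state = (state * 48271) % M;  // product stays below 2^53, so it is exact
    return (state - 1) / (M - 1); // value in [0, 1)
  };
}

// Every value is derived from the seed, so equal seeds give equal rows.
function sampleAges(seed: number, count: number): number[] {
  const rand = seededRandom(seed);
  const ages: number[] = [];
  for (let i = 0; i < count; i++) {
    ages.push(18 + Math.floor(rand() * 60)); // integers from 18 to 77
  }
  return ages;
}
```

Calling `sampleAges(42, 5)` twice returns the same array both times, which is the property --seed relies on.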

Direct Mode

The LLM produces literal row data as JSON. Useful for small datasets where narrative coherence matters (e.g. a blog with realistic post titles and content). Capped at 200 rows by default; change the cap with --max-direct-rows.

$ npx auto-seed generate --mode direct --rows "posts:20" --hint "blog about cooking"

Seed Plan Format

The Seed Plan is a JSON structure generated by the LLM that describes how to generate data for each column. The local engine reads this plan and produces rows deterministically.

{
  "version": 1,
  "generationOrder": ["users", "posts", "comments"],
  "tables": [
    {
      "table": "users",
      "rowCount": 50,
      "columns": [
        { "column": "id", "strategy": { "type": "sequence", "start": 1 } },
        { "column": "email", "strategy": { "type": "faker", "method": "internet.email" } },
        { "column": "name", "strategy": { "type": "faker", "method": "person.fullName" } },
        { "column": "role", "strategy": { "type": "enum", "values": ["admin", "user", "editor"], "weights": [0.1, 0.7, 0.2] } }
      ]
    }
  ]
}

Column Strategy Types

Type	Description	Example
sequence	Auto-incrementing integers	`{"type": "sequence", "start": 1}`
uuid	UUID v4 generation	`{"type": "uuid"}`
faker	Calls @faker-js/faker v10 methods	`{"type": "faker", "method": "internet.email"}`
enum	Weighted random selection from values	`{"type": "enum", "values": ["a","b"]}`
pattern	String template with `{{index}}` or `{{faker:method}}` tokens	`{"type": "pattern", "template": "user_{{index}}"}`
reference	Foreign key reference to a parent table column	`{"type": "reference", "table": "users", "column": "id"}`
static	Fixed value for all rows	`{"type": "static", "value": true}`
null	Always null (nullable columns)	`{"type": "null"}`

Any column can also set nullRatio (0–1), the probability that a nullable column receives NULL instead of a generated value.
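
As a rough sketch of how an engine might evaluate these strategies, the TypeScript below models a subset of them (sequence, enum, pattern, static, null) as a discriminated union. The field names follow the documented JSON, but the types, the `generateValue` function, the zero-based row index, and the equal-weight fallback for enum are assumptions made for illustration.

```typescript
// Illustrative types only; not auto-seed's own definitions.
type Strategy =
  | { type: "sequence"; start: number }
  | { type: "enum"; values: string[]; weights?: number[] }
  | { type: "pattern"; template: string }
  | { type: "static"; value: unknown }
  | { type: "null" };

interface ColumnPlan {
  column: string;
  strategy: Strategy;
  nullRatio?: number; // probability (0 to 1) of emitting null
}

// `rand` returns a number in [0, 1); pass a seeded generator for reproducibility.
// `index` is assumed to be the zero-based row index.
function generateValue(col: ColumnPlan, index: number, rand: () => number): unknown {
  if (col.nullRatio !== undefined && rand() < col.nullRatio) return null;
  const s = col.strategy;
  switch (s.type) {
    case "sequence":
      return s.start + index;
    case "enum": {
      // Equal weights are assumed when none are given.
      const weights = s.weights ?? s.values.map(() => 1);
      const total = weights.reduce((a, b) => a + b, 0);
      let r = rand() * total;
      for (let i = 0; i < s.values.length; i++) {
        r -= weights[i] ?? 0;
        if (r < 0) return s.values[i];
      }
      return s.values[s.values.length - 1]; // guard against rounding at the top end
    }
    case "pattern":
      // Only the {{index}} token is handled in this sketch.
      return s.template.replace(/\{\{index\}\}/g, String(index));
    case "static":
      return s.value;
    case "null":
      return null;
  }
}
```

The faker, uuid, and reference strategies are left out because they depend on external state (the Faker library and previously generated parent rows).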

Output Formats

SQL Output

Generates INSERT INTO statements wrapped in a BEGIN/COMMIT transaction. Tables are ordered topologically by foreign key dependencies. Includes a commented-out DELETE FROM preamble for cleanup.

-- Generated by auto-seed
-- model: anthropic/claude-haiku-4-5-20251001

BEGIN;

-- users: 50 rows
INSERT INTO "users" ("id", "email", "name") VALUES
  (1, 'jordan@example.com', 'Jordan Chen'),
  (2, 'priya@example.com', 'Priya Sharma');

-- posts: 200 rows
INSERT INTO "posts" ("id", "title", "author_id") VALUES
  (1, 'Getting Started with GraphQL', 1),
  (2, 'Why We Moved to Postgres', 2);

COMMIT;
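
The topological ordering mentioned above can be sketched with Kahn's algorithm: a table becomes eligible for insertion once all of its parent tables have been emitted. The input shape (a map from each table to the tables it references) and the `generationOrder` name are hypothetical, and this sketch treats every cycle as an error, whereas auto-seed documents a failure only for cycles without a nullable side.

```typescript
// Kahn's algorithm over a "table -> parent tables" map.
// Hypothetical input shape; every parent must also appear as a key.
function generationOrder(deps: Record<string, string[]>): string[] {
  const tables = Object.keys(deps);
  const unmet: Record<string, number> = {};        // parents not yet emitted
  const dependents: Record<string, string[]> = {}; // parent -> child tables
  for (const t of tables) {
    unmet[t] = 0;
    dependents[t] = [];
  }
  for (const t of tables) {
    for (const parent of deps[t] ?? []) {
      const list = dependents[parent];
      if (list === undefined) throw new Error(`Unknown parent table: ${parent}`);
      list.push(t);
      unmet[t] = (unmet[t] ?? 0) + 1;
    }
  }
  // Start with tables that reference nothing.
  const ready = tables.filter((t) => unmet[t] === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const table = ready.shift() as string;
    order.push(table);
    for (const child of dependents[table] ?? []) {
      const n = (unmet[child] ?? 0) - 1;
      unmet[child] = n;
      if (n === 0) ready.push(child);
    }
  }
  if (order.length !== tables.length) {
    throw new Error("Cyclic foreign keys: no valid insert order");
  }
  return order;
}
```

For a schema where comments reference posts and users, and posts reference users, this yields users, then posts, then comments, matching the order of the INSERT statements in the sample output.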

TypeScript Output

Output varies by schema source. Prisma schemas produce PrismaClient.createMany() calls. TypeORM schemas produce DataSource/repository inserts. SQL schemas produce typed arrays.

// Generated by auto-seed
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

async function main() {
  await prisma.user.createMany({
    data: [
      { id: 1, email: "jordan@example.com", name: "Jordan Chen" },
      { id: 2, email: "priya@example.com", name: "Priya Sharma" },
    ],
    skipDuplicates: true,
  });
}

main()
  .catch(e => { console.error(e); process.exit(1); })
  .finally(async () => { await prisma.$disconnect(); });

Configuration

Configuration is stored at ~/.auto-seed/config.json with file permissions 0600. You can edit it directly or use auto-seed config commands.

{
  "provider": "anthropic",
  "models": {
    "anthropic": "claude-haiku-4-5-20251001",
    "openai": "gpt-4o-mini",
    "gemini": "gemini-2.0-flash"
  },
  "apiKeys": {
    "anthropic": "sk-ant-...",
    "openai": "sk-...",
    "gemini": "AIza..."
  },
  "defaults": {
    "format": "sql",
    "rows": "10",
    "locale": "en"
  }
}

Environment Variables

Environment variables take precedence over the config file.

Variable	Description
ANTHROPIC_API_KEY	Anthropic API key
OPENAI_API_KEY	OpenAI API key
GEMINI_API_KEY	Google Gemini API key
AUTO_SEED_PROVIDER	Override provider (anthropic, openai, gemini)
AUTO_SEED_MODEL	Override model ID
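
The precedence rule can be sketched as a lookup that consults the environment first and falls back to the config file. `resolveSettings` and its types are hypothetical, and the sketch covers only environment variables versus config; how the --provider and --model flags rank against environment variables is not stated here, so it is left out.

```typescript
// Hypothetical sketch of "environment variables take precedence over the config file".
interface Config {
  provider: string;
  models: Record<string, string>;
  apiKeys: Record<string, string>;
}

type Env = Record<string, string | undefined>;

// Documented API key variable for each provider.
const KEY_VARS: Record<string, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
};

function resolveSettings(env: Env, config: Config) {
  const provider = env["AUTO_SEED_PROVIDER"] ?? config.provider;
  const model = env["AUTO_SEED_MODEL"] ?? config.models[provider];
  const keyVar = KEY_VARS[provider];
  // Prefer the provider's environment variable, then the stored key.
  const apiKey = (keyVar ? env[keyVar] : undefined) ?? config.apiKeys[provider];
  return { provider, model, apiKey };
}
```

In a Node.js script the first argument would typically be `process.env`. Note that `??` only falls back on undefined, so an environment variable set to an empty string would still win in this sketch.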

Default Models

Provider	Default Model
Anthropic	claude-haiku-4-5-20251001
OpenAI	gpt-4o-mini
Gemini	gemini-2.0-flash

Examples

Basic usage

# Auto-detect schema, 10 rows/table, SQL output
$ npx auto-seed generate

Custom counts and format

# Prisma schema, TypeScript output, custom row counts, deterministic
$ npx auto-seed generate \
    --schema prisma/schema.prisma \
    --format ts \
    --rows "users:25,posts:200,comments:600" \
    --seed 42 \
    --out prisma/seed.ts

Save and reuse a plan

Generate the plan once (one API call), then reuse it offline with different row counts — zero cost after the first run.

# Save the plan
$ npx auto-seed generate --plan-only --out plan.json

# Reuse offline with different row counts
$ npx auto-seed generate --plan plan.json --rows 5000 --format sql

Domain hints

# Provide context for more realistic data
$ npx auto-seed generate --hint "fintech app: accounts, ledgers, transactions"

Direct mode for narrative data

# Small dataset with coherent content
$ npx auto-seed generate \
    --mode direct \
    --rows "posts:20" \
    --hint "blog about retirement planning"

CI / non-interactive

# Deterministic, no prompts
$ npx auto-seed generate --seed 12345 --format ts --rows 100 --yes

Subset of tables

$ npx auto-seed generate --tables users,posts,comments

Override provider for one run

$ npx auto-seed generate --provider openai --model gpt-4o

Exit Codes

Code	Meaning
0	Success
1	User or config error (missing key, invalid flag)
2	Schema parse error (no schema found, parser failure)
3	LLM/API error (auth, rate limit, invalid response)
4	Generation error (cyclic FK without nullable side, FK exhaustion)
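
For CI wrappers, the table above can be encoded as a small lookup. The TypeScript below is a sketch: `describeExit` and `shouldRetry` are hypothetical helpers, and retrying only on code 3 is an assumed policy (rate limits can be transient), not documented auto-seed behavior.

```typescript
// Meanings copied from the exit code table.
const EXIT_MEANINGS: Record<number, string> = {
  0: "Success",
  1: "User or config error",
  2: "Schema parse error",
  3: "LLM/API error",
  4: "Generation error",
};

function describeExit(code: number): string {
  return EXIT_MEANINGS[code] ?? `Unknown exit code ${code}`;
}

// Assumed policy: only LLM/API failures are worth retrying automatically.
function shouldRetry(code: number): boolean {
  return code === 3;
}
```

A wrapper script could call `describeExit` on the child process status to produce a readable failure message in build logs.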