Documentation

Everything you need to know about auto-seed — from quick setup to advanced usage.

Getting Started

auto-seed requires Node.js 18+ and runs directly with npx or bunx — no installation needed.

1. Configure your LLM provider

Run the interactive setup to choose your provider and enter your API key. Configuration is stored at ~/.auto-seed/config.json with restricted file permissions.

$ npx auto-seed init

2. Generate seed data

Point auto-seed at your project. It auto-detects your schema file, calls the LLM once to plan data generation, then produces your seed file locally.

$ npx auto-seed generate

That's it. You now have a ready-to-run seed file. Customize with flags like --rows, --format, and --seed for more control.

Commands

auto-seed init

Interactive first-run setup. Prompts you to select a provider (Anthropic, OpenAI, or Google Gemini), enter your API key, choose a default model, and set the default output format.

auto-seed generate

The primary command. Parses your schema, calls the LLM to create a generation strategy, produces rows with a local deterministic engine, and writes the output file. See CLI Reference for all available flags.

auto-seed config

Manage configuration without editing the JSON file directly.

Subcommand                  Description
config set <key> <value>    Set a configuration value
config get <key>            Read a configuration value (API keys are masked)
config list                 Show all configuration
config path                 Print the config file path

CLI Reference

All flags for auto-seed generate.

Flag                     Default        Description
--schema <path>          auto-detect    Path to schema file
--format <ts|sql>        sql            Output format
--rows <spec>            10             Global count or per-table spec (e.g. users:25,posts:200)
--mode <plan|direct>     plan           Generation mode
--out <path>             ./seed.<ext>   Output file path
--seed <number>          random         Deterministic RNG seed for reproducible output
--tables <a,b,c>         all            Generate only specified tables
--locale <code>          en             Faker locale hint (e.g. de, fr, ja)
--provider <name>        from config    Override LLM provider for this run
--model <id>             from config    Override LLM model for this run
--dialect <name>         postgresql     SQL dialect (postgresql, mysql, sqlite)
--hint <text>            -              Domain context for realistic data (e.g. "fintech app")
--dry-run                false          Print Seed Plan and summary without writing
--plan-only              false          Write only the Seed Plan JSON (no row generation)
--plan <path>            -              Reuse a saved Seed Plan (skips LLM call entirely)
--max-direct-rows <n>    200            Row cap for direct mode
-y, --yes                false          Skip confirmation prompts (CI-friendly)
--verbose                false          Debug logging and stack traces
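
The --rows flag accepts either a single global count or a comma-separated list of table:count pairs. The sketch below shows one way such a spec could be parsed. parseRowsSpec and RowSpec are hypothetical names, not part of auto-seed's API, and the rule that unlisted tables fall back to the default of 10 is an assumption.

```typescript
// Hypothetical helper for illustration; auto-seed's own parser may differ.
type RowSpec = { defaultCount: number; perTable: Map<string, number> };

function parseRowsSpec(spec: string, fallback = 10): RowSpec {
  const trimmed = spec.trim();
  // A bare integer such as "100" is treated as a global count for every table.
  if (/^\d+$/.test(trimmed)) {
    return { defaultCount: Number(trimmed), perTable: new Map() };
  }
  const perTable = new Map<string, number>();
  for (const part of trimmed.split(",")) {
    const pieces = part.split(":").map((s) => s.trim());
    const table = pieces[0];
    const count = pieces[1];
    if (pieces.length !== 2 || !table || !count || !/^\d+$/.test(count)) {
      throw new Error(`Invalid --rows entry: "${part}"`);
    }
    perTable.set(table, Number(count));
  }
  // Assumption: tables not named in the spec use the fallback count.
  return { defaultCount: fallback, perTable };
}
```

With this sketch, parseRowsSpec("users:25,posts:200") yields a per-table map with two entries, while parseRowsSpec("100") yields an empty map and a global count of 100.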

Schema Support

auto-seed parses your schema into a common intermediate representation, extracting models, fields, relations, and constraints. Three formats are supported.

Prisma (.prisma)

Parsed via @prisma/internals. Supports @id, @unique, @@id, @@unique, @relation, @default, optionality (?), and enums.

Auto-detected at: prisma/schema.prisma, schema.prisma

SQL DDL (.sql)

Parsed via node-sql-parser. Supports CREATE TABLE statements with inline and table-level PRIMARY KEY, UNIQUE, FOREIGN KEY, SERIAL/AUTO_INCREMENT, NOT NULL, and DEFAULT.

Dialects: postgresql (default), mysql, sqlite. Auto-detected at: schema.sql, db/schema.sql, or any *.sql at the project root.

TypeORM Entities (.entity.ts)

Parsed via ts-morph. Supports @Entity, @Column, @PrimaryGeneratedColumn, @ManyToOne, @OneToMany, @OneToOne, @JoinColumn, @Unique, @CreateDateColumn, and @UpdateDateColumn.

Auto-detected at: src/**/*.entity.ts, entities/**/*.entity.ts

Auto-detection order

When --schema is not specified, auto-seed searches in this order:

  1. prisma/schema.prisma or schema.prisma
  2. schema.sql, db/schema.sql, or any *.sql at root
  3. TypeORM entities in src/**/*.entity.ts or entities/**/*.entity.ts
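
The search order above can be sketched as a pure function over a list of project-relative paths. detectSchema is a hypothetical name, the glob patterns are approximated with prefix and suffix checks, and the tie-breaking inside each step (for example prisma/schema.prisma before schema.prisma) is an assumption based on the order in which the paths are listed.

```typescript
type SchemaKind = "prisma" | "sql" | "typeorm";
type Detected = { kind: SchemaKind; paths: string[] };

// Hypothetical helper: `files` holds project-relative paths using "/" separators.
function detectSchema(files: string[]): Detected | undefined {
  // 1. Prisma schema in one of the two known locations.
  for (const candidate of ["prisma/schema.prisma", "schema.prisma"]) {
    if (files.includes(candidate)) return { kind: "prisma", paths: [candidate] };
  }
  // 2. SQL DDL: the named locations first, then any *.sql at the project root.
  for (const candidate of ["schema.sql", "db/schema.sql"]) {
    if (files.includes(candidate)) return { kind: "sql", paths: [candidate] };
  }
  const rootSql = files.find((f) => !f.includes("/") && f.endsWith(".sql"));
  if (rootSql) return { kind: "sql", paths: [rootSql] };
  // 3. TypeORM entities anywhere under src/ or entities/.
  const entities = files.filter(
    (f) => (f.startsWith("src/") || f.startsWith("entities/")) && f.endsWith(".entity.ts"),
  );
  return entities.length > 0 ? { kind: "typeorm", paths: entities } : undefined;
}
```

A project that contains both a Prisma schema and TypeORM entities would resolve to Prisma under this sketch; passing --schema avoids relying on the order at all.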

Generation Modes

Plan Mode (default)

auto-seed makes one LLM call to design a structured generation strategy, the Seed Plan. A local deterministic engine then produces all rows using Faker.js and a seeded RNG. Because rows are generated locally, this scales to millions of rows for the cost of a single API call.

$ npx auto-seed generate --mode plan
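
Reproducible output depends on every random choice flowing from one seeded generator. The sketch below illustrates the idea with mulberry32, a small public-domain PRNG used here as a stand-in; the generator auto-seed actually uses is not stated in this documentation. generateNames and the name lists are purely illustrative.

```typescript
// mulberry32: a compact 32-bit PRNG, used here only to illustrate seeding.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Pick one element using the shared generator instead of Math.random().
function pick<T>(rng: () => number, items: readonly T[]): T {
  return items[Math.floor(rng() * items.length)] as T;
}

// Illustrative row generator: the same seed always yields the same names.
function generateNames(seed: number, count: number): string[] {
  const rng = mulberry32(seed);
  const first = ["Jordan", "Priya", "Sam", "Alex"];
  const last = ["Chen", "Sharma", "Okafor", "Silva"];
  return Array.from({ length: count }, () => `${pick(rng, first)} ${pick(rng, last)}`);
}
```

Calling generateNames(42, 5) twice returns identical arrays, which is the property --seed relies on.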

Direct Mode

The LLM directly produces literal row data as JSON. Useful for small datasets where narrative coherence matters (e.g. a blog with realistic post titles and content). Capped at 200 rows by default.

$ npx auto-seed generate --mode direct --rows "posts:20" --hint "blog about cooking"

Seed Plan Format

The Seed Plan is a JSON structure generated by the LLM that describes how to generate data for each column. The local engine reads this plan and produces rows deterministically.

{
  "version": 1,
  "generationOrder": ["users", "posts", "comments"],
  "tables": [
    {
      "table": "users",
      "rowCount": 50,
      "columns": [
        { "column": "id", "strategy": { "type": "sequence", "start": 1 } },
        { "column": "email", "strategy": { "type": "faker", "method": "internet.email" } },
        { "column": "name", "strategy": { "type": "faker", "method": "person.fullName" } },
        { "column": "role", "strategy": { "type": "enum", "values": ["admin", "user", "editor"], "weights": [0.1, 0.7, 0.2] } }
      ]
    }
  ]
}
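
For tooling that reads a saved plan, the structure above could be typed roughly as follows. The field names are inferred from the example only; the real format may carry additional fields, and parseSeedPlan is a hypothetical helper that checks shape, not semantics.

```typescript
// Types inferred from the example plan; not an official schema.
type Strategy = { type: string; [option: string]: unknown };
type ColumnPlan = { column: string; strategy: Strategy };
type TablePlan = { table: string; rowCount: number; columns: ColumnPlan[] };
type SeedPlan = { version: number; generationOrder: string[]; tables: TablePlan[] };

// Parse a plan file's contents and reject obviously malformed input.
function parseSeedPlan(json: string): SeedPlan {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Seed Plan must be a JSON object");
  }
  const plan = raw as Partial<SeedPlan>;
  if (typeof plan.version !== "number") throw new Error("missing numeric version");
  if (!Array.isArray(plan.generationOrder)) throw new Error("missing generationOrder");
  if (!Array.isArray(plan.tables)) throw new Error("missing tables");
  for (const t of plan.tables) {
    if (typeof t.table !== "string" || typeof t.rowCount !== "number" || !Array.isArray(t.columns)) {
      throw new Error("malformed table entry");
    }
  }
  return plan as SeedPlan;
}
```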

Column Strategy Types

Type         Description                                                 Example
sequence     Auto-incrementing integers                                  {"type": "sequence", "start": 1}
uuid         UUID v4 generation                                          {"type": "uuid"}
faker        Calls @faker-js/faker v10 methods                           {"type": "faker", "method": "internet.email"}
enum         Weighted random selection from values                       {"type": "enum", "values": ["a","b"]}
pattern      String template with {{index}} or {{faker:method}} tokens   {"type": "pattern", "template": "user_{{index}}"}
reference    Foreign key reference to a parent table column              {"type": "reference", "table": "users", "column": "id"}
static       Fixed value for all rows                                    {"type": "static", "value": true}
null         Always null (nullable columns)                              {"type": "null"}

A nullable column can also specify a nullRatio between 0 and 1: the probability that any given row receives NULL for that column.
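
To make the strategy types concrete, here is a minimal sketch of how an engine might evaluate a few of them for one row. It covers sequence, enum, pattern (only the {{index}} token), static, and null, and leaves out uuid, faker, and reference. The function names, the zero-based row index, the equal weighting when weights are omitted, and the placement of nullRatio on the column object are all assumptions.

```typescript
type Strategy =
  | { type: "sequence"; start: number }
  | { type: "enum"; values: string[]; weights?: number[] }
  | { type: "pattern"; template: string }
  | { type: "static"; value: unknown }
  | { type: "null" };

// Assumption: nullRatio sits next to the strategy on the column object.
type ColumnPlan = { column: string; strategy: Strategy; nullRatio?: number };

// Pick a value with probability proportional to its weight; r is in [0, 1).
function weightedPick(values: string[], weights: number[] | undefined, r: number): string {
  const w = weights ?? values.map(() => 1);
  const total = w.reduce((sum, x) => sum + x, 0);
  let threshold = r * total;
  for (let i = 0; i < values.length; i++) {
    threshold -= w[i] ?? 0;
    if (threshold < 0) return values[i] as string;
  }
  return values[values.length - 1] as string; // guards against floating-point drift
}

// Produce one cell value; `index` is the zero-based row number.
function generateValue(col: ColumnPlan, index: number, rng: () => number): unknown {
  if (col.nullRatio !== undefined && rng() < col.nullRatio) return null;
  const s = col.strategy;
  switch (s.type) {
    case "sequence":
      return s.start + index;
    case "enum":
      return weightedPick(s.values, s.weights, rng());
    case "pattern":
      return s.template.replace(/\{\{index\}\}/g, String(index));
    case "static":
      return s.value;
    case "null":
      return null;
  }
}
```

Passing the random source in as a parameter keeps the function deterministic under a seeded generator and easy to test with a fixed stub.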

Output Formats

SQL Output

Generates INSERT INTO statements wrapped in a BEGIN/COMMIT transaction. Tables are ordered topologically by foreign key dependencies. Includes a commented-out DELETE FROM preamble for cleanup.

-- Generated by auto-seed
-- model: anthropic/claude-haiku-4-5-20251001

BEGIN;

-- users: 50 rows
INSERT INTO "users" ("id", "email", "name") VALUES
  (1, 'jordan@example.com', 'Jordan Chen'),
  (2, 'priya@example.com', 'Priya Sharma');

-- posts: 200 rows
INSERT INTO "posts" ("id", "title", "author_id") VALUES
  (1, 'Getting Started with GraphQL', 1),
  (2, 'Why We Moved to Postgres', 2);

COMMIT;
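
The topological ordering mentioned above ensures that parent rows are inserted before the rows that reference them. A minimal sketch using Kahn's algorithm follows; orderTables and the shape of its input are hypothetical. This version treats every cycle as an error, whereas the exit code description for generation errors suggests auto-seed can handle a cycle when one side is nullable.

```typescript
// `deps` maps each table to the tables its foreign keys point at.
function orderTables(deps: Record<string, string[]>): string[] {
  const tables = Object.keys(deps);
  const remaining = new Map<string, number>(); // unmet parent count per table
  const children = new Map<string, string[]>(); // parent -> dependent tables
  for (const t of tables) {
    remaining.set(t, 0);
    children.set(t, []);
  }
  for (const t of tables) {
    for (const parent of deps[t] ?? []) {
      const dependents = children.get(parent);
      if (!dependents) throw new Error(`Unknown parent table: ${parent}`);
      if (dependents.includes(t)) continue; // several FKs to the same parent
      dependents.push(t);
      remaining.set(t, (remaining.get(t) ?? 0) + 1);
    }
  }
  // Start with tables that reference nothing, then release their dependents.
  const ready = tables.filter((t) => remaining.get(t) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const t = ready.shift() as string;
    order.push(t);
    for (const child of children.get(t) ?? []) {
      const left = (remaining.get(child) ?? 0) - 1;
      remaining.set(child, left);
      if (left === 0) ready.push(child);
    }
  }
  if (order.length !== tables.length) {
    throw new Error("Cyclic foreign keys: no valid insert order");
  }
  return order;
}
```

For a schema where comments reference posts and users, and posts reference users, the function returns users, posts, comments, which matches the insert order in the SQL sample.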

TypeScript Output

Output varies by schema source. Prisma schemas produce createMany() calls on the PrismaClient model delegates (for example prisma.user.createMany()). TypeORM schemas produce DataSource/repository inserts. SQL schemas produce typed arrays.

// Generated by auto-seed
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

async function main() {
  await prisma.user.createMany({
    data: [
      { id: 1, email: "jordan@example.com", name: "Jordan Chen" },
      { id: 2, email: "priya@example.com", name: "Priya Sharma" },
    ],
    skipDuplicates: true,
  });
}

main()
  .catch(e => { console.error(e); process.exit(1); })
  .finally(async () => { await prisma.$disconnect(); });

Configuration

Configuration is stored at ~/.auto-seed/config.json with file permissions 0600. You can edit it directly or use auto-seed config commands.

{
  "provider": "anthropic",
  "models": {
    "anthropic": "claude-haiku-4-5-20251001",
    "openai": "gpt-4o-mini",
    "gemini": "gemini-2.0-flash"
  },
  "apiKeys": {
    "anthropic": "sk-ant-...",
    "openai": "sk-...",
    "gemini": "AIza..."
  },
  "defaults": {
    "format": "sql",
    "rows": "10",
    "locale": "en"
  }
}

Environment Variables

Environment variables take precedence over the config file.

Variable              Description
ANTHROPIC_API_KEY     Anthropic API key
OPENAI_API_KEY        OpenAI API key
GEMINI_API_KEY        Google Gemini API key
AUTO_SEED_PROVIDER    Override provider (anthropic, openai, gemini)
AUTO_SEED_MODEL       Override model ID
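
The precedence rule can be sketched as a small resolver. The documentation states that environment variables override the config file and that --provider and --model override settings for a single run; ranking those flags above the environment variables is an assumption of this sketch, as are the names resolveSettings, Flags, and KEY_VARS.

```typescript
type Config = {
  provider: string;
  models: Record<string, string>;
  apiKeys: Record<string, string>;
};
type Env = Record<string, string | undefined>;
type Flags = { provider?: string; model?: string };

// Environment variable that holds the API key for each provider.
const KEY_VARS: Record<string, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
};

// Assumed order: CLI flag, then environment variable, then config file.
function resolveSettings(flags: Flags, env: Env, config: Config) {
  const provider = flags.provider ?? env["AUTO_SEED_PROVIDER"] ?? config.provider;
  const model = flags.model ?? env["AUTO_SEED_MODEL"] ?? config.models[provider];
  const keyVar = KEY_VARS[provider];
  const apiKey = (keyVar ? env[keyVar] : undefined) ?? config.apiKeys[provider];
  return { provider, model, apiKey };
}
```

Taking the environment as a parameter rather than reading it globally keeps the resolver easy to test; a real CLI would pass in its process environment.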

Default Models

Provider     Default Model
Anthropic    claude-haiku-4-5-20251001
OpenAI       gpt-4o-mini
Gemini       gemini-2.0-flash

Examples

Basic usage

# Auto-detect schema, 10 rows/table, SQL output
$ npx auto-seed generate

Custom counts and format

# Prisma schema, TypeScript output, custom row counts, deterministic
$ npx auto-seed generate \
    --schema prisma/schema.prisma \
    --format ts \
    --rows "users:25,posts:200,comments:600" \
    --seed 42 \
    --out prisma/seed.ts

Save and reuse a plan

Generate the plan once (one API call), then reuse it offline with different row counts — zero cost after the first run.

# Save the plan
$ npx auto-seed generate --plan-only --out plan.json

# Reuse offline with different row counts
$ npx auto-seed generate --plan plan.json --rows 5000 --format sql

Domain hints

# Provide context for more realistic data
$ npx auto-seed generate --hint "fintech app: accounts, ledgers, transactions"

Direct mode for narrative data

# Small dataset with coherent content
$ npx auto-seed generate \
    --mode direct \
    --rows "posts:20" \
    --hint "blog about retirement planning"

CI / non-interactive

# Deterministic, no prompts
$ npx auto-seed generate --seed 12345 --format ts --rows 100 --yes

Subset of tables

$ npx auto-seed generate --tables users,posts,comments

Override provider for one run

$ npx auto-seed generate --provider openai --model gpt-4o

Exit Codes

Code    Meaning
0       Success
1       User or config error (missing key, invalid flag)
2       Schema parse error (no schema found, parser failure)
3       LLM/API error (auth, rate limit, invalid response)
4       Generation error (cyclic FK without nullable side, FK exhaustion)
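
A CI wrapper can branch on these codes. The sketch below maps each code to its meaning and marks only code 3 as worth retrying, on the assumption that rate limits are transient while the other failures need a change to the config, schema, or plan. describeExit and shouldRetry are hypothetical helpers, not part of auto-seed.

```typescript
// Meanings copied from the exit code table.
const EXIT_CODES = {
  0: "Success",
  1: "User or config error",
  2: "Schema parse error",
  3: "LLM/API error",
  4: "Generation error",
} as const;

// Translate a numeric exit code into a log-friendly label.
function describeExit(code: number): string {
  const known: Record<number, string> = EXIT_CODES;
  return known[code] ?? `Unknown exit code ${code}`;
}

// Assumption: only LLM/API failures (for example rate limits) may be transient.
function shouldRetry(code: number): boolean {
  return code === 3;
}
```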