Getting Started
Let's go from nothing on your disk to a working, reproducible analysis. You can read this top to bottom without running anything, or follow along — every command is copy-paste ready.
What you'll build: a small two-output analysis that fits a linear model on
a public dataset and sweeps one methodological decision (whether to standardize
features). The result is two universes, baseline and raw, each with its
own r2 metric and fit_plot figure — a clean comparison ready for a paper
figure.
Make sure you've finished the install first.
1. Create a project
lc init is a one-shot setup. It creates a small, opinionated directory
layout and stops; it doesn't ask any questions.
r2-decision-demo/
├── astra.yaml # the spec — this is where everything lives
├── CLAUDE.md # short note for the agent (resumes context across sessions)
├── .gitignore
├── .git # initialized git repository (skip with --no-git)
├── .venv/ # Python virtual env (skip with --no-venv)
├── .claude/ # Claude Code plugin — skills, agents, hooks
├── .lightcone/ # internal scratchpad — don't edit by hand
├── Containerfile # build instructions for a local testing container
├── requirements.txt # software dependencies
├── universes/
├── src/
└── results/
The two files you'll actually look at:
astra.yaml — the single source of truth for your analysis. Inputs,
outputs, methodological decisions, recipes. Everything else lightcone-cli does
is downstream of this file. The boilerplate from lc init has one example
output and an empty decisions block — enough to run lc run and see something
materialize, but not yet a real analysis.
CLAUDE.md — a short note that tells Claude Code about the project. The
skills will update this as you go (filling in working notes, design context).
You can edit it by hand whenever you want.
2. Open Claude Code
Launch Claude Code from inside the project directory to open an interactive
session there. Claude Code reads astra.yaml and CLAUDE.md, so it has context
from the start.
3. The slash commands
Inside Claude Code, the entry-point slash commands (/lc-new and the
/lc-from-* family) are organized by what you're starting from. We'll use
/lc-new in this guide; the /lc-from-* commands work the same way.
| Command | Use it when… |
|---|---|
| /lc-new | You're starting from a research question and an empty astra.yaml. |
| /lc-from-code | You have an existing codebase you want wrapped in ASTRA. |
| /lc-from-paper | You have a published paper (DOI / arXiv ID) you want to reproduce. |
| /lc-feedback | Something broke and you want to file a GitHub issue without leaving the session. |
These are structured entry points for common starting situations. Once inside a
project you can also just describe what you're trying to do to Claude —
astra.yaml, lc run, and lc verify keep things tracked regardless of how
you got there.
4. Scope the analysis with /lc-new
Type /lc-new at the prompt.
The agent banner switches to RESEARCH QUESTION and asks something like "What are you trying to learn?" Reply in plain prose:
I want to know how much R² changes on the diabetes dataset depending
on whether I standardize features before fitting a linear regression.
A few follow-ups will sharpen this. After Phase 1 your astra.yaml already
has a name, description, and version. Open it in another window if
you're curious; it's under 30 lines.
In Phase 2 (ANALYSIS STRUCTURE) the agent asks about inputs, outputs, and whether this should be one analysis or split into stages. For our case, one analysis is right:
- Input: diabetes (sklearn's bundled toy dataset).
- Output 1: r2, type metric.
- Output 2: fit_plot, type figure.
In Phase 3 (DEEP DIVE), say "skip the literature pass" to keep this a quick demo. The agent will still walk you through identifying the decision: does it preprocess? what options? what's the default?
You'll end up with something like this in astra.yaml:
version: "1.0"
name: "R² with and without feature standardization"
description: "Linear regression on the diabetes dataset, sweeping the standardization choice."
inputs: []
decisions:
  standardize:
    label: "Feature standardization"
    rationale: "Standardizing changes coefficient scales and can shift R² for ridge-like models."
    default: standardized
    options:
      standardized: { label: "StandardScaler before fit" }
      raw: { label: "No preprocessing" }
outputs:
  - id: r2
    type: metric
    description: "Coefficient of determination on the test split."
    recipe:
      command: python scripts/fit.py --standardize {standardize} --output {output[0]}
  - id: fit_plot
    type: figure
    description: "Predicted vs true scatter."
    recipe:
      command: python scripts/plot.py --r2_dir {input.r2} --output {output[0]}
      inputs: [r2]
container: Containerfile
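The {standardize}, {output[0]}, and {input.r2} placeholders in the recipe commands get filled in per universe before anything runs. As a rough illustration only, Python's built-in str.format happens to understand the same index and attribute syntax. This is not how lightcone-cli does it (it hands recipes to Snakemake, whose substitution rules may differ), and the paths below are made up for the example.

```python
from types import SimpleNamespace

# Recipe commands copied from the astra.yaml above.
FIT_CMD = "python scripts/fit.py --standardize {standardize} --output {output[0]}"
PLOT_CMD = "python scripts/plot.py --r2_dir {input.r2} --output {output[0]}"


def render(template, decisions, inputs, outputs):
    """Fill a recipe template with decision values, named inputs, and output paths."""
    return template.format(
        **decisions,                      # {standardize} -> chosen option
        input=SimpleNamespace(**inputs),  # {input.r2}    -> attribute lookup
        output=outputs,                   # {output[0]}   -> list indexing
    )


# Hypothetical paths for the "raw" universe; the real layout may differ.
raw = {"standardize": "raw"}
fit_cmd = render(FIT_CMD, raw, {}, ["results/raw/r2"])
plot_cmd = render(PLOT_CMD, raw, {"r2": "results/raw/r2"}, ["results/raw/fit_plot"])
print(fit_cmd)
print(plot_cmd)
```

Unused keyword arguments are ignored by str.format, which is why the plot command can be rendered with the same decisions dict even though it never mentions {standardize}.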
Phase 4 (FINALIZE) runs astra validate astra.yaml, writes
universes/baseline.yaml, and fills in the narrative: block. You're handed
back a short summary table — two outputs, one decision, zero prior insights.
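One way to picture a universe is as a named assignment of one option to every decision. The sketch below is an assumption about the idea, not lightcone-cli's actual code: it builds the baseline from each decision's default and derives one extra universe per non-default option, named after that option. Under that hypothetical naming rule it reproduces the baseline and raw pair used in this guide.

```python
# The decisions block from astra.yaml, reduced to a plain dict.
DECISIONS = {
    "standardize": {
        "default": "standardized",
        "options": ["standardized", "raw"],
    },
}


def enumerate_universes(decisions):
    """Return {universe_name: {decision: option}}, varying one decision at a time."""
    baseline = {name: spec["default"] for name, spec in decisions.items()}
    universes = {"baseline": baseline}
    for name, spec in decisions.items():
        for option in spec["options"]:
            if option != spec["default"]:
                # Hypothetical naming rule: name the universe after the option it switches to.
                universes[option] = {**baseline, name: option}
    return universes


universes = enumerate_universes(DECISIONS)
for label, choices in universes.items():
    print(label, choices)
```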
The agent may suggest /clear to free up context. Take its advice.
5. Implement the spec
/clear
Implement this analysis from astra.yaml. Write the scripts, run the baseline universe, and verify the result.
The agent reads the spec, the universe file, and the empty scripts/ dir,
then makes an implementation checklist:
1. Add Python deps (scikit-learn, matplotlib) to requirements.txt
2. Write Containerfile if missing
3. scripts/fit.py — accepts --standardize {standardized,raw}, writes r2.json
4. scripts/plot.py — reads r2_dir, writes fit_plot.png
5. lc run --universe baseline
6. lc status
7. astra validate astra.yaml
8. lc verify
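For orientation, here is a minimal sketch of what scripts/fit.py from item 3 might look like. It is not the agent's actual output. It assumes that --output names a directory, that the metric is stored as {"r2": ...} in r2.json, and that the split is 80/20 with a fixed seed; the spec pins down none of these. scikit-learn is imported lazily so the argument parsing can be exercised on its own.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Fit a linear model on the diabetes dataset.")
    # The decision value arrives as a plain CLI flag, restricted to the spec's options.
    parser.add_argument("--standardize", choices=["standardized", "raw"], required=True)
    parser.add_argument("--output", type=Path, required=True)  # assumed to be a directory
    return parser.parse_args(argv)


def fit_and_score(standardize):
    """Return R² on a held-out split, with or without StandardScaler."""
    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_diabetes(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    steps = [StandardScaler()] if standardize == "standardized" else []
    model = make_pipeline(*steps, LinearRegression())
    model.fit(X_train, y_train)
    return float(model.score(X_test, y_test))  # score() is R² for regressors


def write_metric(out_dir, r2):
    """Write the metric as JSON inside out_dir and return the file path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "r2.json"
    path.write_text(json.dumps({"r2": r2}, indent=2))
    return path


def main(argv=None):
    args = parse_args(argv)
    write_metric(args.output, fit_and_score(args.standardize))
```

A real script would end by calling main() under the usual `if __name__ == "__main__":` guard.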
It works through the checklist one item at a time. You'll see commands like:
Expected lc status output:
lc verify and astra validate should exit cleanly — no tampering, no broken
chains. If anything fails, ask the agent to fix the concrete error and rerun.
The agent commits after each successful output, so your git log is a clean
record of the build.
6. Verify integrity
lc verify recomputes data hashes for every output and walks the input chain
back to report whether anything has been tampered with since materialization.
It is useful before publication, when archiving a project, or any time you
want a stronger guarantee than lc status provides.
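Conceptually, a check like this boils down to re-hashing files and comparing against what was recorded. The sketch below assumes SHA-256 and a made-up manifest shape (output_path, output_hash, and an inputs list of upstream manifests). This guide does not describe lightcone-cli's real manifest format or hashing scheme, and they may differ.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file's bytes in chunks so large outputs don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(manifest):
    """Return a list of problems for this output and everything upstream of it."""
    problems = []
    # Hypothetical manifest keys: "output_path", "output_hash", "inputs".
    path = Path(manifest["output_path"])
    if not path.exists():
        problems.append(f"missing: {path}")
    elif sha256_of(path) != manifest["output_hash"]:
        problems.append(f"hash mismatch: {path}")
    for upstream in manifest.get("inputs", []):
        problems.extend(verify(upstream))  # walk the input chain back
    return problems
```

An empty list means every recorded hash still matches; any entry names the file that changed or went missing.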
What just happened
- astra.yaml was the only file you "wrote", and the agent did most of the typing.
- The agent wrote scripts/fit.py and scripts/plot.py with argparse-driven decision injection.
- lc run generated .lightcone/Snakefile from your spec, dispatched each rule through Snakemake, and wrote a per-output sidecar manifest recording the recipe, container image, decisions, input hashes, and output hash.
- lc status and lc verify rely on those manifests. They don't re-execute anything; they just check.
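To make the manifest idea concrete, here is a rough sketch of a sidecar record holding the fields listed above. The JSON layout, the field names, the .manifest.json suffix, and the use of SHA-256 are all assumptions for illustration; this guide does not specify the real format.

```python
import hashlib
import json
from pathlib import Path


def file_hash(path):
    """SHA-256 of a file's contents (assumed hash function)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def write_manifest(output_path, recipe, container, decisions, input_paths):
    """Write a JSON sidecar next to an output and return the sidecar's path."""
    output_path = Path(output_path)
    manifest = {
        "recipe": recipe,
        "container": container,
        "decisions": decisions,
        "input_hashes": {str(p): file_hash(p) for p in input_paths},
        "output_hash": file_hash(output_path),
    }
    # Hypothetical naming: r2.json -> r2.json.manifest.json
    sidecar = output_path.with_name(output_path.name + ".manifest.json")
    sidecar.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return sidecar
```

Because the record stores hashes rather than copies of the data, a later check only needs to re-hash the files on disk and compare.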
If your laptop dies tomorrow and you git clone the repo on a fresh machine
and run lc run, you'll get bit-identical results.
Where to next
- The Agentic Workflow — what each slash command does in detail.
- Running on a Cluster — take the same project to SLURM.
- Troubleshooting — when something goes sideways.
- Glossary — terms like universe, decision, and manifest in plain language.