createOperator: turning a layout into a frontend operator
createOperator is the factory that wraps a low-level layout node-builder (Spread, Scatter, Table, Frame) and produces the frontend operator (spread, scatter, table, group, plus stack as a thin wrapper over spread) used inside chart(...).flow(...) and as a combinator inside .mark(...).
It lives at src/ast/marks/createOperator.ts.
The design is inspired by Krist Wongsuphasawat's Encodable ("Encodable: Configurable Grammar for Visualization Components", IEEE VIS 2020 — arxiv:2009.00722). createOperator extends Encodable's per-component channel-grammar pattern to layout operators: the channel system carries over verbatim, with a split step added in front and a combine (low-level layout) step added behind. See "Prior art" at the bottom of this doc for the mapping.
This doc explains what the factory does, why it has two call shapes, and how to add a new operator. It assumes you've read The Mark Factory — this is the same idea applied to layout containers instead of leaf shapes.
1. The two call shapes every operator has
Every layout operator is a single function that you can call in two ways:
// (A) Combinator form — pass marks directly:
spread({ dir: "x" }, [m1, m2, m3]);
// (B) Operator form — used inside .flow():
chart(data)
  .flow(spread({ by: "category", dir: "x" }))
  .mark(rect({ h: "value" }));

| form | what varies | what's shared | meaning |
|---|---|---|---|
| combinator | n marks | one datum | "arrange these n marks horizontally" |
| operator | n data slices | one mark, one outer datum | "for each group of data, build a mark; arrange those" |
In combinator form, the user provides the array (of marks). In operator form, split produces the array (of data slices) from by. Either way, the factory ends up with N children to hand to the same low-level layout. The two forms aren't strictly category-theoretic duals — they're two ways of getting to the same N-children-then-layout shape, with different sources of the multiplicity.
createOperator produces both forms from one config. Disambiguation is by arg shape: a second positional argument means combinator form; no second arg means operator form.
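As a rough illustration of arg-shape disambiguation, the sketch below builds a toy dual-mode function with TypeScript overloads. Every name in it (toySpread, ToyMark, ToyNode, groupRows) is a hypothetical stand-in, not GoFish's real implementation; the only thing taken from the text is the rule that a second positional argument selects combinator form.

```typescript
// Hypothetical toy types; not GoFish's real Mark or GoFishNode.
type Row = Record<string, unknown>;
type ToyNode = { kind: string; children: ToyNode[] };
type ToyMark = (d: Row[]) => ToyNode;
type ToyOperator = (mark: ToyMark, data: Row[]) => ToyNode;

// Group rows by a field, keeping first-seen key order.
function groupRows(data: Row[], field: string): Map<unknown, Row[]> {
  const groups = new Map<unknown, Row[]>();
  data.forEach((row) => {
    const bucket = groups.get(row[field]);
    if (bucket) bucket.push(row);
    else groups.set(row[field], [row]);
  });
  return groups;
}

function toySpread(opts: { by?: string }): ToyOperator;
function toySpread(opts: { by?: string }, marks: ToyMark[]): ToyMark;
function toySpread(opts: { by?: string }, marks?: ToyMark[]): ToyOperator | ToyMark {
  const combine = (children: ToyNode[]): ToyNode => ({ kind: "spread", children });
  if (marks !== undefined) {
    // Combinator form: n marks share one datum.
    return (d: Row[]) => combine(marks.map((m) => m(d)));
  }
  // Operator form: one mark is applied to n data slices.
  const by = opts.by;
  return (mark: ToyMark, data: Row[]) => {
    const pieces =
      by !== undefined ? Array.from(groupRows(data, by).values()) : data.map((r) => [r]);
    return combine(pieces.map((piece) => mark(piece)));
  };
}

const countLeaf: ToyMark = (d) => ({ kind: `leaf(${d.length})`, children: [] });
// Three marks, one datum: three children.
const asCombinator = toySpread({}, [countLeaf, countLeaf, countLeaf])([{ c: "a" }]);
// One mark, two groups ("a" and "b"): two children.
const asOperator = toySpread({ by: "c" })(countLeaf, [{ c: "a" }, { c: "b" }, { c: "a" }]);
```

Both calls end in the same combine step; only the source of the N children differs.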
Both forms also get the standard structural .translate({ x?, y? }) modifier. It wraps the operator's produced node instead of merging x/y into the operator's own options. That distinction matters for operators like scatter: scatter({ by: "lake", x: "lake" }).translate({ y: 50 }) keeps x: "lake" as scatter's discrete placement encoding, while y: 50 belongs to the outer translation wrapper.
2. The split → fmap → combine shape
Pick any layout operator and you'll find the same three steps — a fan-out into N pieces, followed by a fan-in back to a single node:
- Split. Partition the data into pieces. For spread, this is "groupBy by-field"; for table, it's the cartesian product of two fields; for group, it's groupBy. For scatter with no by, it's "one piece per item".
- fmap. Apply the user's mark to each piece, producing one GoFishNode per piece.
- Combine. Hand the array of nodes to the low-level layout function (Spread, Table, Scatter, Frame), which positions them.
The combinator form skips split entirely — the user already supplied the array of marks. The factory loops over them, applies each to the shared datum, and hands the resulting nodes to the same combine step.
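The three steps can be sketched as one small function that takes the split strategy as a parameter. This is a toy model under stated assumptions: nodes are plain strings, and Datum, Piece, Sketch, splitByCategory, splitPerItem, and splitFmapCombine are illustrative names that do not appear in GoFish.

```typescript
// Toy types for illustration only; GoFish's real nodes and marks are richer.
type Datum = { category: string; value: number };
type Piece = Datum[];
type Sketch = string; // stand-in for a GoFishNode

// Split strategy 1: group by a field, keeping first-seen key order.
function splitByCategory(data: Datum[]): Map<string, Piece> {
  const out = new Map<string, Piece>();
  data.forEach((row) => {
    const bucket = out.get(row.category);
    if (bucket) bucket.push(row);
    else out.set(row.category, [row]);
  });
  return out;
}

// Split strategy 2: one piece per item.
function splitPerItem(data: Datum[]): Map<string, Piece> {
  const out = new Map<string, Piece>();
  data.forEach((row, i) => out.set(String(i), [row]));
  return out;
}

// The shared pipeline: split, then fmap the mark over pieces, then combine.
function splitFmapCombine(
  data: Datum[],
  split: (d: Datum[]) => Map<string, Piece>,
  mark: (piece: Piece) => Sketch,
  combine: (children: Sketch[]) => Sketch,
): Sketch {
  const pieces = Array.from(split(data).values()); // split (fan-out)
  const children = pieces.map((piece) => mark(piece)); // fmap
  return combine(children); // combine (fan-in)
}

const rows: Datum[] = [
  { category: "a", value: 1 },
  { category: "b", value: 2 },
  { category: "a", value: 3 },
];
const sumBar = (piece: Piece): Sketch => `bar(${piece.reduce((s, r) => s + r.value, 0)})`;
const sideBySide = (children: Sketch[]): Sketch => `row[${children.join(", ")}]`;

const grouped = splitFmapCombine(rows, splitByCategory, sumBar, sideBySide);
// grouped === "row[bar(4), bar(2)]"
const perItem = splitFmapCombine(rows, splitPerItem, sumBar, sideBySide);
// perItem === "row[bar(1), bar(2), bar(3)]"
```

Swapping the split function changes how many children exist; the fmap and combine steps stay the same.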
3. Anatomy of a createOperator call
From src/ast/graphicalOperators/spread.tsx:430:
export const spread = createOperator<any, SpreadOptions>(Spread, {
  split: ({ by }, d) =>
    by ? Map.groupBy(d, (r: any) => r[by]) : new Map(d.map((r, i) => [i, r])),
  channels: { w: "size", h: "size" },
});

Three pieces:
- The low-level layout function: Spread, the existing createNodeOperator-built node builder that already knows how to position children along an axis. This is the combine step.
- split(opts, d): partition d into an ordered Map<key, subdata>. Insertion order matters (it determines layout order). When by is omitted, each item becomes its own one-element group.
- channels (optional): per-opt data-aware encodings. Same idea as createMark's channels: w: "size" means the user can pass a field name, and the factory will apply inferSize before handing opts to Spread.
That's it. Both call shapes (operator and combinator) fall out of the factory.
4. What happens at render time
Operator form (spread({ by, dir }) inside .flow(...))
Walking createOperator.ts:391-415:
- Split. cfg.split(opts, d) partitions the input into a Map<key, subdata>. (Some operators, like table, also return keys: row/column labels that get merged into the layout opts.)
- fmap. For each (key, subdata) entry, call the user's mark with that subdata and a parent-prefixed key (${key}-${i}). The result is resolved to a GoFishNode, and node.setKey(...) lets downstream coordinators look it back up. When by is a string, each produced leaf is also stamped with __splitBy recording that field; the innermost grouping wins (a ??=-style guard means an already-stamped node keeps its value). This is what lets a later resolve(cols, { from }) infer its match key for free: it reads __splitBy off the resolved node to learn which field that node was grouped by (scatter({ by: "id" }) ⇒ join on id), so the user need not restate the key. A function by has no field name to record, so resolve errors there unless given an explicit key.
- Apply channels. applyChannels runs inferSize/inferPos/inferColor on annotated opts. For an entry-flagged channel ({type, entry: true}), the inference runs once per split entry, producing an array of values (one per child); otherwise it aggregates over all of d and produces one value.
- Strip factory keys. The by and debug keys never reach the low-level layout; remove them from opts.
- Inject the grouping measure. The by key is stripped, but a grouping operator needs its field to name the ORDINAL axis it builds. So the resolved per-axis grouping field (cfg.axisFields?.(opts), e.g. { x: "lake" }) is passed through to the low-level layout in opts (as __axisFields), where the node builder stamps it onto the ORDINAL space's measure. This is the discrete analogue of a continuous channel's field becoming its space's measure. (axisFields is also the source the chart-builder uses as a fallback hint for axis titles when a space carries no measure; see layout passes.)
- Combine. Call the low-level layout with the encoded opts and the array of child nodes.
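The "innermost grouping wins" rule from the fmap step can be illustrated with a small sketch. The node shape, stampSplitBy, and inferMatchKey are hypothetical; only the __splitBy name and the ??=-style guard come from the description above, and the real resolve logic may differ.

```typescript
// Hypothetical node shape; only the __splitBy name is taken from the text above.
type StampedNode = { key: string; __splitBy?: string; children: StampedNode[] };

// Stamp every leaf under `node` with the grouping field, unless an inner
// operator already stamped it (so the innermost grouping wins).
function stampSplitBy(node: StampedNode, by: string | ((row: unknown) => unknown)): void {
  if (typeof by !== "string") return; // a function `by` has no field name to record
  if (node.children.length === 0) {
    node.__splitBy ??= by;
    return;
  }
  node.children.forEach((child) => stampSplitBy(child, by));
}

// A resolver could then read the stamp as its default join key.
function inferMatchKey(node: StampedNode, explicitKey?: string): string {
  const inferred = explicitKey ?? node.__splitBy;
  if (inferred === undefined) {
    throw new Error("no __splitBy stamp found; pass an explicit key");
  }
  return inferred;
}

const leafNode: StampedNode = { key: "a-0", children: [] };
stampSplitBy(leafNode, "id"); // inner operator groups by "id"
stampSplitBy(leafNode, "region"); // outer operator runs later; the stamp is kept
const matchKey = inferMatchKey(leafNode); // "id"
```

The inner operator stamps first because the outer operator only sees the node after the user's mark (including any nested operator) has been resolved.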
Combinator form (spread({ dir }, [m1, m2, m3]))
Same machinery, simpler:
- Apply each mark in marks to the same d. Marks may be any of: Mark<T> functions, already-resolved GoFishNodes (e.g. ref(...)), or a Promise<Mark<T>[]> (e.g. when produced by SolidJS For(...)).
- Apply channels (no per-entry inference, since there's no split).
- Strip factory keys.
- Combine.
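Both forms strip factory keys before the combine step, and the operator form additionally passes the grouping field through. A minimal sketch of that opts preparation follows, assuming a plain-object opts bag; prepareLayoutOpts and FACTORY_KEYS are hypothetical names, while by, debug, and __axisFields are the keys named above.

```typescript
type LayoutOpts = Record<string, unknown>;

// Keys the factory consumes itself; per the text, `by` and `debug`.
const FACTORY_KEYS = ["by", "debug"];

function prepareLayoutOpts(
  opts: LayoutOpts,
  axisFields?: Record<string, string>, // e.g. { x: "lake" }; operator form only
): LayoutOpts {
  const out: LayoutOpts = {};
  Object.keys(opts).forEach((k) => {
    if (FACTORY_KEYS.indexOf(k) === -1) out[k] = opts[k];
  });
  // The grouping field survives the strip under a separate key.
  if (axisFields !== undefined) out["__axisFields"] = axisFields;
  return out;
}

// Operator form: `by` is removed, but the axis field is carried along.
const operatorOpts = prepareLayoutOpts({ by: "lake", dir: "x", debug: true }, { x: "lake" });
// Combinator form: nothing to inject.
const combinatorOpts = prepareLayoutOpts({ dir: "x", debug: false });
```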
5. Channels in operator opts
The factory's channel system mirrors createMark's, with one extra spec shape — entry-flagged channels:
channels: {
  w: "size", // aggregate over all data, one value
  x: { type: "pos", entry: true }, // per-entry, produces array of values
}

| spec | what it does |
|---|---|
| "size" / "pos" / "color" | aggregate over all of d, produce one value (single number/string) |
| { type: "size", entry: true } | run once per split entry, collect into array (one value per child) |
| { type: "pos", entry: true, discrete: true } | for nonnumeric categorical fields, emit evenly spaced discrete placement coordinates |
| user passed an array | already final form; pass through unchanged |
scatter uses entry: true for x/y/xMin/xMax/yMin/yMax so a field name like x: "miles" becomes a per-group mean position (src/ast/graphicalOperators/scatter.tsx:336). Its point channels also set discrete: true, so a grouped nonnumeric field such as x: "lake" becomes a slot coordinate instead of an invalid numeric mean.
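To make the per-entry behavior concrete, here is a hedged sketch of how an entry-flagged position channel might resolve a field name. resolveEntryPos, LakeRow, and ChannelValue are hypothetical, and the slot-index scheme for nonnumeric fields is an assumed stand-in for "evenly spaced discrete placement"; the real inferPos may scale or space values differently.

```typescript
type LakeRow = Record<string, string | number>;

// What a channel option may hold: a field name, or an array that is already final.
type ChannelValue = string | number[];

function resolveEntryPos(value: ChannelValue, entries: Map<string, LakeRow[]>): number[] {
  if (Array.isArray(value)) return value; // already final form: pass through
  const field = value;
  const groups = Array.from(entries.values());
  const numeric = groups.every((g) => g.every((r) => typeof r[field] === "number"));
  if (numeric) {
    // Continuous field: one mean per split entry.
    return groups.map((g) => g.reduce((sum, r) => sum + Number(r[field]), 0) / g.length);
  }
  // Nonnumeric field: give each distinct label the next slot index (assumption).
  const slots = new Map<string, number>();
  return groups.map((g) => {
    const first = g[0];
    const label = first ? String(first[field]) : "";
    const known = slots.get(label);
    if (known !== undefined) return known;
    const next = slots.size;
    slots.set(label, next);
    return next;
  });
}

const byLake = new Map<string, LakeRow[]>([
  ["Erie", [{ lake: "Erie", miles: 10 }, { lake: "Erie", miles: 30 }]],
  ["Huron", [{ lake: "Huron", miles: 50 }]],
]);
const meanMiles = resolveEntryPos("miles", byLake); // [20, 50]
const lakeSlots = resolveEntryPos("lake", byLake); // [0, 1]
const passedThrough = resolveEntryPos([5, 7], byLake); // [5, 7]
```

The numeric branch matches the x: "miles" case described above; the nonnumeric branch is what keeps x: "lake" from being averaged.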
6. Adding a new operator: a worked example
Suppose you want a wrap operator that lays children out left-to-right with line wrapping at a max width. (This isn't a real GoFish operator today — it's an example.)
You already have the low-level node builder, Wrap, written with createNodeOperator. Then:
export type WrapOptions = {
  by?: string;
  maxWidth: number;
  spacing?: number;
};
export const wrap = createOperator<any, WrapOptions>(Wrap, {
  split: ({ by }, d) =>
    by ? Map.groupBy(d, (r) => r[by]) : new Map(d.map((r, i) => [i, r])),
});

Both forms now work without further code:
// Operator form:
chart(items)
  .flow(wrap({ by: "category", maxWidth: 400 }))
  .mark(rect({ w: "size" }));
// Combinator form:
wrap({ maxWidth: 400 }, [m1, m2, m3, m4]);

If Wrap accepts a width-per-child, you'd add channels: { width: "size" } so consumers can pass a field name there.
If your operator needs to feed extra data (like colKeys/rowKeys) into the layout opts, return the wrapped {entries, keys} form from split instead of a bare Map — see table.tsx:228 for an example.
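A sketch of what a wrapped split result could look like for a two-field, table-like operator is shown below. The WrappedSplit type, the splitGrid and distinct helpers, the "row/col" key format, and the choice to include empty cells are all assumptions made for illustration; the authoritative shape is whatever table.tsx actually returns.

```typescript
type CellRow = Record<string, string | number>;

// Hypothetical wrapped split result; the real shape lives in table.tsx.
type WrappedSplit = {
  entries: Map<string, CellRow[]>;
  keys: { rowKeys: string[]; colKeys: string[] };
};

// Distinct values in first-seen order.
function distinct(values: string[]): string[] {
  return values.filter((v, i) => values.indexOf(v) === i);
}

function splitGrid(data: CellRow[], rowField: string, colField: string): WrappedSplit {
  const rowKeys = distinct(data.map((r) => String(r[rowField])));
  const colKeys = distinct(data.map((r) => String(r[colField])));
  const entries = new Map<string, CellRow[]>();
  // Cartesian product of the two key lists, in row-major insertion order.
  rowKeys.forEach((rk) =>
    colKeys.forEach((ck) => {
      const cell = data.filter(
        (r) => String(r[rowField]) === rk && String(r[colField]) === ck,
      );
      entries.set(`${rk}/${ck}`, cell);
    }),
  );
  return { entries, keys: { rowKeys, colKeys } };
}

const grid = splitGrid(
  [
    { lake: "Erie", year: 2020 },
    { lake: "Erie", year: 2021 },
    { lake: "Huron", year: 2020 },
  ],
  "lake",
  "year",
);
// grid.keys.rowKeys is ["Erie", "Huron"]; grid.keys.colKeys is ["2020", "2021"].
```

The entries still drive the fmap step, while keys carries the extra labels that get merged into the layout opts.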
Operators created with createOperator automatically support .translate({ x?, y? }). You do not implement this per operator; the factory composes the ordinary split/channel/combine pipeline with a structural translation wrapper around the produced node.
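The wrap-rather-than-merge behavior of .translate can be sketched with a toy builder. PlainNode, Translatable, and withTranslate are hypothetical names, and the "translate" node type is an assumed representation of the structural wrapper, not GoFish's actual node.

```typescript
type PlainNode = { type: string; opts: Record<string, unknown>; children: PlainNode[] };

type Translatable = {
  build: () => PlainNode;
  translate: (offset: { x?: number; y?: number }) => Translatable;
};

function withTranslate(build: () => PlainNode): Translatable {
  return {
    build,
    // Wrap the built node in an outer translation node; the operator's own
    // opts (including any x/y encodings) are left untouched.
    translate: (offset) =>
      withTranslate(() => ({ type: "translate", opts: { ...offset }, children: [build()] })),
  };
}

const scatterLike = withTranslate(() => ({
  type: "scatter",
  opts: { by: "lake", x: "lake" },
  children: [],
}));

// The outer node carries y: 50; the inner scatter keeps x: "lake" as its encoding.
const moved = scatterLike.translate({ y: 50 }).build();
```

Because the offset lives on the wrapper, an operator that already uses x or y as an encoding never sees a conflicting value.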
7. The relationship with createMark
The two factories are siblings:
| | wraps | output |
|---|---|---|
| createMark | a leaf shape (Rect, Ellipse, …) | a Mark<T> (one node from one datum) |
| createOperator | a layout (Spread, Scatter, …) | a dual-mode operator (one node from many) |
Both use channel annotations to encode opts; both produce mark types supporting .name(...) and .label(...) chaining. That chaining is wired by the modifier factory that also lives in this file — createModifier + attachModifiers — a single config-driven system shared by nameableMark (combinator marks), createMark (leaf marks), and makeConstrainableMark (layer / Porter-Duff marks, which add .constrain()). .name(...) also stashes the passed name on the returned mark function via stashLayerName (defined in chartBuilder.ts, called by the name modifier's tag hook), so ChartBuilder.connect() can detect a user-chained name without parsing the __serialize tag.
A second flavor, attachTransformModifiers, handles methods that map a mark to a different mark rather than mutating its nodes — e.g. image(...).cut(opts) maps the image to an expand-kind cut mark (which slices the source into N nodes 1:1 with data, built on the pure cut(source, opts) array primitive). Because the transform replaces the mark before any node exists, it wraps the existing .name()/.label() methods to re-apply itself, keeping .cut available across a naming/labeling chain.
Expand marks consume a whole group at once, so the operator (traversal) form hands them a single leaf containing all rows regardless of its own split config. An expand mark therefore turns each group's rows into an array of nodes, whereas a by-grouped operator needs exactly one child node per group — so an expand mark can't hang directly under a by-operator, and that case throws. The fix is to interpose a layout operator between the grouping and the expand mark (.flow(spread({ by }), stack({ dir }))): the inner operator consumes the expand mark and collapses each group's slices into one node, which the outer by-operator then arranges.
Naming-wise: createOperator is the frontend factory; the low-level helper that produces Spread, Scatter, etc. is createNodeOperator (withGoFish.ts:297). The "node" prefix reflects that it returns a function whose output is a single GoFishNode, not the dual-mode shape that createOperator returns.
8. Prior art
createOperator extends the per-component channel-grammar pattern from Encodable (Wongsuphasawat, IEEE VIS 2020 — paper, code) to layout operators. The channel system maps onto Encodable's directly — see The Mark Factory's "Prior art" section for the mark-level table. createOperator adds two pieces Encodable doesn't have:
| step | what it does | Encodable analogue |
|---|---|---|
| split | partition the input data into an ordered Map | none; Encodable encodes a single component, not a layout |
| channels | parse user opts into rendering parameters | Encoder / ChannelEncoder |
| layout | combine partitioned children into one node | none; Encodable's encoders feed a renderer outside the grammar layer |
| entry: true channel flag | per-partition aggregation (e.g. mean x of each group) | extension; closest analogue is Encodable's per-channel scale resolution against grouped data |
The two-call-shapes design (combinator and operator/traversal) — where the multiplicity comes from the marks array vs. the data partitions — is novel to createOperator; Encodable doesn't address layout multiplicity.
9. Pointers
- The factory: src/ast/marks/createOperator.ts.
- Existing operators (each colocated with their low-level layout):
  - spread and stack: graphicalOperators/spread.tsx.
  - scatter: graphicalOperators/scatter.tsx.
  - table: graphicalOperators/table.tsx.
  - group: graphicalOperators/group.ts (sibling of frame.tsx, extracted to keep the chartBuilder ↔ createOperator import graph acyclic).
- The companion mark factory: The Mark Factory.
- The serialize config field tags the produced operator with __serialize metadata the frontend-IR emitter reads; see Frontend IR (Serialization).
- Encodable: paper arxiv:2009.00722, source github.com/kristw/encodable.
