Flex Watt: AI Data Centers That Flex With the Grid
Live demo: Flex Watt
Tools used: Codex, Python, Next.js, React, TypeScript, Tailwind, Recharts, Deck.gl, Vercel
TL;DR: The SCSP AI+ Expo Hackathon was part of SCSP’s 2026 National Security Technology Hackathon programming. The prototype weekend ran April 25-26, 2026, across San Francisco, Washington, DC, and Boston, with final demos tied to the AI+ Expo in Washington, DC.
Project
I built Flex Watt for the Electric Grid Optimization track. The challenge was about the pressure AI data centers are putting on the electrical grid: build agents or systems that forecast demand, coordinate supply, or keep the grid stable under rising compute load.
The idea I chose was narrow on purpose:
AI data centers do not have to behave like rigid 24/7 peak loads.
If large GPU clusters can reduce, defer, or move non-critical work during grid stress, they can behave more like conditional load. Real-time inference stays protected, but training, batch inference, and low-priority jobs can flex.
The project is an interactive grid dashboard that compares rigid AI data centers against flexible AI data centers during a replay of a grid stress event.
Workflow and Design
The core story is:
- The grid has off-peak headroom most of the time.
- Rigid AI data centers make peak stress worse.
- Flexible AI data centers can reduce demand during scarcity.
- The reduction happens by shifting non-critical workloads, not by turning off everything.
The live app lets you choose added AI load scenarios of 1 GW, 3 GW, or 5 GW, then run a 72-hour stress replay. In rigid mode, the data centers continue drawing power through the peak. In flexible mode, the fleet curtails training and batch work, transfers some deferrable load, and keeps priority inference protected.
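To make the rigid versus flexible comparison concrete, here is a minimal Python sketch of the arithmetic behind a replay. It is an illustration, not the app's actual code: the names `ReplaySummary` and `replay` are hypothetical, the per-hour curtailment fractions are taken as given, and the numbers in the usage note are made up rather than ERCOT values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplaySummary:
    """Peak grid load over a replay window in each mode (hypothetical type)."""
    rigid_peak_mw: float
    flexible_peak_mw: float

    @property
    def peak_reduction_mw(self) -> float:
        return self.rigid_peak_mw - self.flexible_peak_mw


def replay(base_load_mw, added_ai_mw, curtail_fraction_by_hour):
    """Compare rigid and flexible AI load over the same hours.

    base_load_mw: hourly grid load without the added AI fleet.
    added_ai_mw: nameplate AI load added to every hour (e.g. 1000, 3000, 5000).
    curtail_fraction_by_hour: share of the AI load shed each hour, 0.0 to 1.0.
    """
    if len(base_load_mw) != len(curtail_fraction_by_hour):
        raise ValueError("one curtailment fraction is needed per hour")
    # Rigid mode: the fleet draws its full load every hour.
    rigid = [base + added_ai_mw for base in base_load_mw]
    # Flexible mode: the fleet sheds the given fraction in each hour.
    flexible = [
        base + added_ai_mw * (1.0 - fraction)
        for base, fraction in zip(base_load_mw, curtail_fraction_by_hour)
    ]
    return ReplaySummary(max(rigid), max(flexible))
```

With illustrative inputs `replay([60000.0, 80000.0, 70000.0], 3000.0, [0.0, 0.5, 0.0])`, the rigid peak is 83,000 MW and the flexible peak is 81,500 MW, a 1,500 MW reduction during the stressed hour.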
I wanted the demo to feel like a grid operations dashboard, not a slideshow. It shows grid load, headroom, stress level, an ERCOT-style fleet map, event signals, workload states inside each data center, and a result card at the end of the replay.
Backend
The backend and data layer are intentionally simple. The flow is:
ERCOT data -> Python pipeline -> replay data -> simulation -> dashboard
The project uses ERCOT 2023 hourly native load data. I built a small Python data pipeline that parses the source workbook from a zip file, validates the full 8,760-hour year, preserves ERCOT local time labels, and emits frontend-ready JSON.
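The full-year validation step can be sketched roughly as follows. This is a simplified illustration under assumptions: it starts from records that have already been parsed out of the workbook, and the record keys `hour_label` and `load_mw` and the function name `validate_hourly_records` are hypothetical, not the pipeline's real schema.

```python
import math

EXPECTED_HOURS = 365 * 24  # 8,760 hours in a non-leap year such as 2023


def validate_hourly_records(records):
    """Raise ValueError if the year is incomplete or a load value is unusable.

    Each record is assumed to be a dict with an 'hour_label' string (the
    source's local time label, kept as-is) and a numeric 'load_mw'.
    """
    if len(records) != EXPECTED_HOURS:
        raise ValueError(
            f"expected {EXPECTED_HOURS} hourly records, got {len(records)}"
        )
    for index, record in enumerate(records):
        load = record["load_mw"]
        is_number = isinstance(load, (int, float))
        if not is_number or math.isnan(load) or load <= 0:
            raise ValueError(
                f"record {index} ({record['hour_label']!r}) "
                f"has invalid load {load!r}"
            )
```

Failing loudly on a short or dirty year keeps a bad source file from silently producing a plausible-looking chart.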
The processed dataset includes:
- 8,760 hourly records for 2023
- a 72-hour September 2023 stress replay
- annual peak and average load metrics
- zone-level load values
- event metadata for the dashboard
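A rough sketch of how part of such a dataset could be assembled and serialized is shown below. The JSON keys, the `build_dataset` and `to_json` names, and the idea of passing the replay start as an index are all assumptions for illustration; zone-level values and event metadata are left out, and how the actual September window was chosen is not covered here.

```python
import json


def build_dataset(records, replay_start, replay_hours=72):
    """Build a frontend-ready dict from validated hourly records.

    records: list of dicts with 'hour_label' and 'load_mw' (assumed schema).
    replay_start: index of the first hour of the replay window.
    """
    if replay_start < 0:
        raise ValueError("replay_start must not be negative")
    window = records[replay_start:replay_start + replay_hours]
    if len(window) != replay_hours:
        raise ValueError("replay window runs past the end of the records")
    loads = [record["load_mw"] for record in records]
    return {
        "metrics": {
            "annual_peak_mw": max(loads),
            "annual_average_mw": sum(loads) / len(loads),
        },
        "replay": window,
    }


def to_json(dataset):
    """Serialize the dataset for the dashboard to load."""
    return json.dumps(dataset, indent=2)
```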
One important caveat: the stress labels in the demo are modeled labels for replay. They are not real ERCOT Physical Responsive Capability (PRC) reserve data. I kept that caveat visible because grid demos can become misleading fast if the assumptions are hidden.
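For readers wondering what a modeled label can look like, here is one hypothetical way to derive it from load alone. This is not the rule the demo uses: the label names, the thresholds, and the `stress_label` function are all assumptions chosen only to show the idea of a label computed from data rather than taken from reserve measurements.

```python
def stress_label(load_mw, annual_peak_mw, watch_at=0.85, emergency_at=0.95):
    """Classify an hour by its load as a share of the annual peak.

    The thresholds are illustrative assumptions, not ERCOT definitions.
    """
    if annual_peak_mw <= 0:
        raise ValueError("annual_peak_mw must be positive")
    ratio = load_mw / annual_peak_mw
    if ratio >= emergency_at:
        return "emergency"
    if ratio >= watch_at:
        return "watch"
    return "normal"
```

A label like this says nothing about actual operating reserves, which is exactly why the caveat needs to stay visible.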
The frontend simulation is deterministic. That was intentional. I did not want an LLM deciding grid operations in the core loop.
The app models a fleet of AI data centers across ERCOT regions. Each data center has workload categories:
- real-time inference
- priority inference
- batch inference
- training / fine-tuning
- low-priority jobs
When grid stress rises, Flex Watt computes a curtailment intensity. Low-priority work drops first, training and batch jobs reduce next, priority inference is mostly protected, and real-time inference is never curtailed in the demo model.
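The priority ordering can be sketched as a tiered waterfall. This is a minimal sketch in Python under stated assumptions, not the app's frontend code: the tier caps (100%, 80%, 10%), the category keys, and the `curtail` function are hypothetical, and the intensity is taken as an input between 0 and 1 rather than computed from grid stress.

```python
# Tiers are shed in order. Each cap is the largest share of that category
# that may be curtailed. All caps are illustrative assumptions.
TIERS = [
    {"low_priority": 1.0},
    {"training": 0.8, "batch_inference": 0.8},
    {"priority_inference": 0.1},
]
# Real-time inference appears in no tier, so it is never curtailed.


def curtail(workload_mw, intensity):
    """Return per-category load after shedding, lowest priority first."""
    intensity = min(max(intensity, 0.0), 1.0)
    sheddable_by_tier = [
        {name: workload_mw.get(name, 0.0) * cap for name, cap in tier.items()}
        for tier in TIERS
    ]
    # Total MW to shed, as a share of everything that is allowed to flex.
    remaining = intensity * sum(sum(t.values()) for t in sheddable_by_tier)
    result = dict(workload_mw)
    for tier in sheddable_by_tier:
        tier_total = sum(tier.values())
        if tier_total <= 0:
            continue
        # Shed this tier fully before touching the next one.
        share = min(1.0, remaining / tier_total)
        for name, sheddable in tier.items():
            result[name] = workload_mw.get(name, 0.0) - sheddable * share
        remaining -= tier_total * share
    return result
```

With a hypothetical 100 MW site split as 30 MW real-time inference, 20 MW priority inference, 10 MW batch inference, 30 MW training, and 10 MW low-priority jobs, an intensity of 0.25 sheds 11 MW: all 10 MW of low-priority work plus 1 MW spread across training and batch. Both inference tiers above them are untouched at that level.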
What Worked
The strongest part of the prototype is the single-lever comparison. A viewer can run the replay and immediately see the difference between rigid load and flexible load.
The important part is not just reducing megawatts. It is showing graceful degradation. A training run might take longer, but the user-facing AI service stays online and the grid gets some headroom back.
Codex helped me move quickly across the stack: data processing, UI composition, simulation logic, and deployment wiring. But the project still needed a human design decision: keep one clear story and avoid pretending the prototype solved the entire grid.
Limitations
- The stress labels are modeled demo labels, not real PRC reserve data.
- The simulation is a prototype, not production grid-control software.
- A real deployment would need interconnection constraints, live market and reserve data, facility-level workload telemetry, and operator policy controls.
The hardest part was deciding what not to build. The problem statement was broad enough to invite forecasting, optimization, grid modeling, agent planning, dispatch, renewable integration, and demand response. In a hackathon, trying to build all of that would have produced a weak demo.
So I pulled one lever: flexible AI load.
Why It Matters
The real-world question is not just “can we add more load?” It is “can the load behave better when the system is stressed?”
Even as a prototype, the point is useful: AI infrastructure should not only consume power from the grid. It should become responsive to grid conditions.
That is the technical direction I wanted to demonstrate: AI data centers as flexible grid participants, not just bigger demand spikes.
Thanks
Thanks to the Special Competitive Studies Project for organizing the hackathon as part of the AI+ Expo, and to the organizers for creating the track and problem space.