Part 6 of 10: Finding the Rhythm That Actually Works
Several weeks into building my Pomodoro timer, I’d accumulated a lot of practices:
Session retrospectives. Code review. Verification checklists. Quick reference guides. Gap tracking.
But I was about to discover that the most important practice was also the simplest:
Stop when the timer goes off.
The Meeting That Never Ends
Before I instituted two-hour sessions, my work with Claude looked like this:
9:00 PM: “Let’s implement the settings panel.”
9:47 PM: “This is going great!”
11:23 PM: “Wait, why is the timer state getting reset?”
1:08 AM: “I’m just going to fix this one last thing.”
2:34 AM: “How did we get here?”
The next morning, I’d look at what we’d built. Some of it was good. Some of it was nonsense. Most of it required rework.
The problem wasn’t the code Claude wrote. The problem was that I’d let us wander for five and a half hours without stopping to assess direction.
It was like staying in a meeting that should have ended hours ago, but everyone keeps saying “just one more thing.”
The Forcing Function
Two-hour sessions seemed arbitrary when I first tried them. Why two hours? Why not “until we finish the feature”?
But that’s exactly why it worked: it was arbitrary.
The timer going off wasn’t a suggestion. It was a forcing function.
At the two-hour mark, we stopped. Didn’t matter if we were mid-feature. Didn’t matter if we were “almost done.” In fact, as we got close, I’d watch the clock and look for a good stopping point. I don’t like having unverified, untested code sitting in my repository.
Stop. Document current state. Plan next session.
The first few times, this felt wasteful. We’d have momentum! Why stop?
But here’s what I learned: momentum isn’t always good.
Sometimes momentum is just speed in a random direction.
The Todo List That Tells the Truth
At the start of each two-hour session, Claude builds a todo list.
Not a generic list. A specific one:
- Items assigned to Claude or me explicitly. “Claude: Implement timer state machine.” “Human: Test timer in incognito mode with cleared storage.”
- Expected outcomes. Not “work on the settings panel” but “settings panel allows theme selection, persists to localStorage, applies theme immediately.”
- Validation steps (done by me). How will I know if Claude’s work succeeded?
- Expected todos for next session. What do we think comes after this?
That last part is the genius piece.
Forcing Claude to project the next session makes problems visible immediately. If Claude lists five sessions’ worth of work for a feature I wanted done in two, we have a scoping problem. Better to discover that now. In fact, wanting to keep going because there’s more to do, or because you have a “good idea” you could just sneak in with the current work, is a huge red flag. We keep a backlog for a reason, and if I want to “sneak something in,” it’s probably because I wouldn’t prioritize it as the next thing to do. What should that tell me?
And forcing me to review that list means I can’t claim ignorance later. I signed off on the plan.
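For concreteness, here’s a rough sketch of what one of those session plans captures, written as TypeScript types. The field names and example tasks are my illustration of the idea, not files lifted from the actual project:

```typescript
// Illustrative sketch of the session plan structure described above.
// Field names and example tasks are hypothetical.
interface SessionTask {
  owner: "claude" | "human"; // explicit assignment, never implied
  description: string;       // e.g. "Implement timer state machine"
  expectedOutcome: string;   // concrete and checkable, not "work on X"
  validation: string;        // how the human will know it succeeded
}

interface SessionPlan {
  tasks: SessionTask[];
  nextSessionForecast: string[]; // projected todos for the following session
}

const tonightsPlan: SessionPlan = {
  tasks: [
    {
      owner: "claude",
      description: "Implement settings panel storage logic",
      expectedOutcome:
        "Theme selection persists to localStorage and applies immediately",
      validation: "Human reloads the page and confirms the theme survives",
    },
    {
      owner: "human",
      description: "Test timer reset behavior in incognito mode with cleared storage",
      expectedOutcome: "Timer state survives a refresh with no silent resets",
      validation: "Manual check in three browsers",
    },
  ],
  nextSessionForecast: [
    "Wire the settings panel into the timer UI",
    "Review theme handling against the coding standards",
  ],
};
```

The types aren’t the point. The point is that every task has an owner, a checkable outcome, and a forecast attached before any code gets written.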
The Parallel Work Pattern
With clear role separation, I could work in parallel with Claude.
Claude: “I’m implementing the settings panel storage logic.”
Me: “I’m testing the timer reset behavior manually in three different browsers.”
Both happening simultaneously.
This sounds obvious, but it wasn’t how I’d been working. I’d been watching Claude work, like watching someone else type on a shared screen.
Total waste of human time.
With the todo list showing my tasks alongside Claude’s, I had work to do. Real work. Validation that mattered.
The sessions became genuinely collaborative instead of me being a spectator.
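For a sense of what “parallel” meant in practice: while I was testing across browsers, the storage logic Claude was writing on its side was roughly on the order of this sketch (the storage key and function names here are my illustration, not the app’s actual code):

```typescript
// Illustrative sketch of settings-panel storage logic: theme selection that
// persists to localStorage and applies immediately. Key and function names
// are hypothetical.
type Theme = "light" | "dark";

const THEME_KEY = "pomodoro.theme"; // hypothetical storage key

function applyTheme(theme: Theme): void {
  // The stylesheet keys off this attribute, so the change shows up immediately.
  document.documentElement.dataset.theme = theme;
}

function saveTheme(theme: Theme): void {
  localStorage.setItem(THEME_KEY, theme);
  applyTheme(theme);
}

function loadTheme(): Theme {
  // Default to light if nothing has been stored yet.
  return localStorage.getItem(THEME_KEY) === "dark" ? "dark" : "light";
}

// On startup, restore whatever the user picked last time.
applyTheme(loadTheme());
```

Small, self-contained, and easy for a human to verify by hand, which is exactly what makes it safe to hand off while I do something else.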
Reasons That Made Me Want “Just a Little More Time”
The first was a bug I’d found. When it takes me more than two iterations with Claude to fix a bug, that’s a red flag. Something is going on in the context, the agents, the tech stack, or the architecture; something more than the code. Bugs like that can eat up my days, or at least they could.
The second was an insight into the process. I’d think, “I know how to save some tokens,” or “I know how to reduce bugs,” or “I know how to capture metrics.” The siren song is sweet, and it always seems so easy to make just one change.
What I do now is pay attention to the time. If it looks like I’m going to run over, I wrap up and put the leftover work in a new GitHub issue. If I have a “great idea,” I have Claude capture the kernel of that idea as a GitHub issue to prioritize and handle later.
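Making that capture cheap matters. At the end of a session it can be a one-line `gh issue create`, or, if I want Claude to do the filing, something on the order of this hypothetical sketch against GitHub’s create-an-issue endpoint (OWNER/REPO and the label are placeholders):

```typescript
// Hypothetical sketch: filing leftover work or a "great idea" as a GitHub issue
// instead of sneaking it into the current session. OWNER/REPO and the "backlog"
// label are placeholders for real values.
async function fileForLater(
  token: string,
  title: string,
  body: string
): Promise<number> {
  const res = await fetch("https://api.github.com/repos/OWNER/REPO/issues", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      Accept: "application/vnd.github+json",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ title, body, labels: ["backlog"] }),
  });
  if (!res.ok) throw new Error(`GitHub API error: ${res.status}`);
  const issue = await res.json();
  return issue.number; // goes into the session notes for next time
}
```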
Compare this to my previous approach: multiple days of Claude claiming fixes that didn’t work, no systematic reproduction, no verification checklist.
The difference wasn’t Claude’s ability. The difference was the system constraining both of us to work verifiably.
The Cadence That Creates Confidence
After a few weeks of two-hour sessions, I noticed something:
I trusted the process more than I trusted the output.
That sounds bad, but it’s actually good.
With human developers, you trust specific people. “Jamie writes solid code, but Pat’s work needs careful review.”
With AI, that kind of trust is dangerous. Claude doesn’t have good days and bad days. It has good prompts and bad prompts, good context and bad context.
But I could trust the process. If we’d followed the two-hour session structure, done the verification steps, tracked our progress in retrospectives—the output was reliable.
Not because Claude was reliable. Because the system was reliable.
The Retrospective Pattern That Emerged
I’d been doing session retrospectives after each completed story. Quick conversations:
- What went well?
- What was frustrating?
- What would we do differently next time?
- Any patterns we’re noticing?
I logged these in monthly files. Brief notes. Honest observations.
Then, every few sessions, I’d ask Claude to review those retrospective notes and pull out something actionable for the current work. How often do I act on them? When I feel the need for a win, I’ll look for something from our retros to try out. And when something just feels broken, I’ll look back through the retro history to see whether we can come up with something new to try.
The changes we tried weren’t insights Claude gave me. They were insights the retrospectives revealed when I stopped to look.
Lessons for Leaders (From Someone Who Finally Found Rhythm)
Lesson 1: Time-boxed work is even more critical with AI.
Human developers get tired. They naturally take breaks, go to meetings, switch contexts.
AI doesn’t. Claude will happily work with you until 3 AM generating code you’ll regret.
Your teams need mandatory work boundaries. Not just “end of day” but “end of session.” Stop, assess, plan next steps.
This feels like overhead. It’s actually discipline.
Lesson 2: Parallel work requires explicit task assignment.
“Pairing” with AI isn’t like pairing with a human. You can’t both look at the same screen and collaborate smoothly.
But you can work in parallel if roles are clear. AI does implementation, human does verification.
Your teams need explicit task assignment at session start. Otherwise they’ll default to watching AI work, which is waste.
Lesson 3: Retrospectives create accountability for both human and AI.
Session retrospectives made me face facts: some of the churn was Claude’s bugs, but a lot of it was my unclear requirements.
The retrospectives didn’t let me blame AI for problems I was creating.
Your teams need similar accountability. If you’re only measuring AI performance (tokens used, features shipped), you’re missing the human contribution to success or failure.
The Practices That Actually Matter
By this point, my workflow had stabilized:
- Roadmap tracks work ahead
- Two-hour sessions with clear scope
- Claude builds session plan from roadmap
- Todo list with explicit Claude/human tasks
- Parallel work during session
- End-of-day code review against standards
- Session retrospectives after completed stories
- Periodic retrospective review for actionable insights
It sounds like a lot. But it’s actually less overhead than my previous approach of “ask Claude to build things and hope for the best.”
That approach required constant rework, debugging sessions that went nowhere, and features that looked done but had invisible problems.
This approach front-loads clarity and back-loads verification. Total time is lower, even counting overhead.
What I Still Didn’t Know
The practices worked for me, solo, building a small app.
But I still had questions:
Would this scale to larger apps? To teams? To domains where I had less expertise?
I’d deliberately chosen a Pomodoro timer because I understood the problem space. What happens when you’re building something you’re learning as you go?
I’d been working in areas where I could verify AI’s output. What happens in domains where you can’t easily verify—security, performance, accessibility?
And the biggest question: was I actually building something valuable, or just practicing a process?
My Pomodoro timer was almost done. Time to find out if anyone besides me would actually use it.
That’s where theory meets market reality.
This is part 6 of a 10-part series. Part 1, Part 2, Part 3, Part 4, Part 5 covered the journey from chaos to system. Part 7 explores what happened when the system faced real-world pressures.
About the Author: I’m an Agile coach and professor who’s been documenting the messy reality of learning AI-assisted development. This series shares the practices that actually worked—and the ones that failed expensively.