Hey everyone,
I wanted to open a discussion on handling one of the most notorious bottlenecks in Apps Script development: the 6-minute execution wall (and the dreaded “Resource Exhausted” errors).
The Context:
My team recently built and launched a sheet-based Workspace management add-on called AdminSheet Pro. Our biggest technical hurdle wasn’t the Directory API logic itself, but dealing with massive enterprise and Higher-Ed domains. When you are trying to bulk update 10,000+ users or migrate massive nested groups, hitting that 6-minute wall is almost guaranteed.
We noticed that a lot of other sheet-based tools in the ecosystem just slap a warning label on their UI saying something like:
“Organizations with 10k+ users may experience timeouts due to Apps Script runtime limits.” We felt that defeated the purpose of building a “bulk” tool, so we decided to try and engineer a native workaround.
Our Approach: “Intelligent Pacing”
Instead of trying to brute-force the API or migrating the entire execution engine off Apps Script to GCP, we engineered a state-management architecture we call Intelligent Pacing.
Essentially:
- We track row-level execution state continuously.
- We anticipate the execution wall around the 5.5-minute mark.
- We gracefully pause the script, save the state, and use triggers to spin up the next batch seamlessly in the background.
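For anyone curious what that loop looks like in practice, here is a minimal sketch of a time-budgeted batch runner. The function names, the 5.5-minute constant, and the plain-object state are illustrative, not AdminSheet Pro's actual code; in real Apps Script, a `done: false` result would be serialized to `PropertiesService` and the next run scheduled with `ScriptApp.newTrigger(...).timeBased()`. The core loop is kept pure here so the pacing logic is easy to reason about:

```javascript
// Sketch of a time-budgeted batch runner (illustrative names, not the
// add-on's real code). In Apps Script, the "save and hand off" branch
// would write state to PropertiesService and create a one-off
// time-based trigger via ScriptApp.newTrigger().
const TIME_BUDGET_MS = 5.5 * 60 * 1000; // pause well before the 6-min wall

function runBatch(rows, state, processRow, now = Date.now) {
  const start = now();
  let i = state.resumeIndex || 0;
  while (i < rows.length) {
    if (now() - start >= TIME_BUDGET_MS) {
      // Out of budget: record where we stopped so a trigger can resume.
      return { done: false, resumeIndex: i };
    }
    processRow(rows[i]); // e.g. one Directory API update per row
    i++;
  }
  return { done: true, resumeIndex: i };
}
```

The design choice is that the loop checks the clock before each row rather than after a fixed count, so a single slow API call near the deadline can't push the run over the wall.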
We made a conscious design choice to prioritise guaranteed completion over raw speed. It might take a little longer to run safely, but it doesn’t crash halfway through. We also had to build real-time, row-by-row visual feedback in the Sheet to prevent “terminal anxiety”—so admins wouldn’t panic and kill the script while it was processing in the background.
My Questions for the Community:
Since we are always looking to optimise, I’m curious how other devs here are tackling this:
- Architecture: Are you using a similar trigger-based batching system to stay within Apps Script, or have you offloaded the heavy lifting entirely to Cloud Run/Cloud Functions?
- UX: How do you keep users informed when background tasks take 15+ minutes, without exceeding quota limits on UI updates?
- Batching: Have you found a "sweet spot" for API batch sizes that maximises throughput before you hit that 6-minute mark?
Would love to hear your thoughts and strategies!
Good morning!
Over time, here is what I learned. If you talk to Googlers or Google Support, they will tell you to break your script up as much as possible, create triggers, and keep each process as small as possible. If you talk to third-party vendors like Folgo, they will tell you that when developing their toolset, you have to know the right people at Google. The 6-minute timer is not written in stone, and that specific quota can be increased…if you know the right folks.

What I personally do is a combination of Apps Script and Cloud Functions/Tasks, for a couple of reasons. Tools that require a quick response work very well in this setup; for example, Google Chat requires a chatbot to reply within 30 seconds or it times out. With this combination, you can use Apps Script for the quick reply while offloading the dirty work to a Cloud Function/Task. In your scenario, that might be a good fit. To Travis's point, this is where I would lean on Gemini for some help.

Honestly, you will probably get a bunch of different answers here because everyone has a favorite toolset. Personally, I would suggest starting with Gemini to get a list of options and then doing what you feel most comfortable with, while also keeping security in mind.
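For readers unfamiliar with that Chat pattern, here is a rough sketch of the split being described: acknowledge immediately, offload the slow job. The event shape, payload fields, and `enqueue` callback are all hypothetical; in Apps Script the offload would typically be a `UrlFetchApp.fetch` POST to your own Cloud Function or Cloud Tasks endpoint:

```javascript
// Sketch of the "quick ack, offload the heavy work" split for a Chat bot.
// Chat expects a reply within ~30s, so we answer right away and hand the
// slow job to a Cloud Function/Task. Names and payload shape are
// illustrative, not a real API contract.
function onChatMessage(event, enqueue) {
  // Hand the slow work off. In Apps Script this would be something like:
  //   UrlFetchApp.fetch(CLOUD_FUNCTION_URL, { method: 'post',
  //     contentType: 'application/json', payload: JSON.stringify(job) });
  const job = { user: event.user, command: event.text, requestedAt: Date.now() };
  enqueue(job);
  // Return the immediate reply so Chat doesn't time out.
  return { text: 'Working on it — I will post the results here shortly.' };
}
```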
I hope my thoughts help! Take care!
Travis, you hit the nail on the head regarding the ‘silent trigger’ issue. That was exactly the flaw we ran into early on during prototyping! If the extra trigger runs silently in the background, the user assumes the add-on froze or broke. They inevitably try to run it again, and suddenly you have overlapping executions creating a massive mess in the directory.
That’s exactly why we had to build the real-time, row-by-row visual feedback directly into the Sheet. Getting the initial 5.5-minute pause scaffolded with AI is a great starting point, but getting the recursion to loop perfectly while providing a UI update without hitting the Apps Script PropertiesService read/write quotas is definitely where the real work begins. Glad to hear we aren’t the only ones wrestling with that sweet spot!
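On the PropertiesService quota point, one pattern that helps is buffering row statuses in memory and flushing them as a single JSON write per execution, rather than one write per row. A hedged sketch (the `store` interface stands in for `PropertiesService.getScriptProperties()`, which exposes the same `getProperty`/`setProperty` shape; the key name and status values are illustrative):

```javascript
// Sketch: batch many row-status updates into one property write.
// `store` stands in for PropertiesService.getScriptProperties().
function flushRowStatuses(store, updates, key = 'rowStatus') {
  // Merge the in-memory updates into whatever was saved previously.
  const current = JSON.parse(store.getProperty(key) || '{}');
  for (const [row, status] of Object.entries(updates)) {
    current[row] = status; // e.g. 'done', 'error', 'pending'
  }
  store.setProperty(key, JSON.stringify(current)); // one write, not N
  return current;
}
```

Collapsing N per-row writes into one keeps the run well under the read/write quota while still letting the Sheet-side UI poll a single property for its row-by-row display.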
Casper, that is fascinating insight regarding the ‘undocumented’ quota increases for larger vendors if you know the right people at Google! That definitely makes sense.
Your architecture of combining Apps Script with Cloud Functions/Tasks is absolutely the gold standard for pure performance and elegantly bypassing those timeouts. We actually heavily debated going that exact route!
The main reason we chose the painful path of keeping the entire pacing engine strictly native inside Apps Script was to satisfy ‘Zero Data Egress’ requirements from K-12 and Enterprise InfoSec teams. They flat-out refuse to let their Workspace directory data (PII, org units, etc.) leave their Google perimeter to be processed on third-party external Cloud Functions.
We essentially had to trade raw processing speed for an automatic ‘pass’ on security audits. But man, for non-restricted internal tools, your Cloud Tasks approach is 100% the way to go. Really appreciate you sharing your stack and insights here!