The UK government ran structured Copilot pilots across multiple departments and published what they found. The productivity gains were real in some areas and absent in others. The lessons are directly applicable to Irish organisations — if they are willing to read the findings honestly.
In 2024, the UK government's Central Digital and Data Office published findings from structured Microsoft 365 Copilot pilots conducted across multiple government departments. This was not a vendor case study. It was an attempt by a large, complex organisation to measure what Copilot actually does to productivity in a real working environment — with controls, with measurement, and with an honest account of where it worked and where it did not.
The findings are worth reading carefully, both because they are more nuanced than vendor marketing suggests and because the organisational conditions in UK government departments are not entirely unlike those in large Irish public sector bodies, professional services firms, and regulated private sector organisations.
What the UK trials found
The headline result was positive: civil servants using Copilot reported time savings averaging around 1.2 hours per week. Drafting assistance, email summarisation, and meeting note generation were the most frequently cited use cases. Satisfaction among users was generally high, with a majority reporting that Copilot helped them do their jobs more effectively.
The more interesting findings were in the detail.
Time savings were not evenly distributed. Users with high volumes of routine written communication — those drafting a lot of correspondence, processing a large number of similar documents, or managing substantial email load — saw materially greater time savings than those whose work was primarily analytical, relational, or judgement-intensive. For some roles, Copilot use had no measurable impact on the time required to complete work.
Adoption was uneven even where licences were provided. A significant portion of licensed users were not regular Copilot users several months into the pilot. The barrier was not primarily technical literacy — it was finding use cases that fit established working patterns. People adopt tools when the tools solve problems they are already experiencing. Copilot adoption stalled in areas where the friction it reduced was not the friction users cared about most.
Quality concerns were real. A minority of users reported that Copilot-generated drafts required significant editing before use, and that in some cases the editing time exceeded the time that would have been required to write the document from scratch. This was not a failure of the tool — it was a deployment lesson about which tasks are suited to AI-assisted drafting and which are not.
What the governance lessons were
The UK government findings highlighted several governance issues that organisations planning Copilot deployments need to consider.
Data classification matters. Copilot's access to content is determined by the user's access permissions within Microsoft 365. In organisations where document permissions are poorly governed — where sensitive documents are widely accessible because nobody ever restricted them — Copilot will surface content that should not be easily discoverable. The trials surfaced this as a risk that required pre-deployment remediation.
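The mechanics of this risk are worth making concrete. Copilot does not bypass permissions; it inherits them, so any document a user could theoretically open becomes easily discoverable through a prompt. A minimal sketch of a pre-deployment oversharing audit is below. The data model and function are hypothetical illustrations, not a real Microsoft 365 API; in practice the input would come from SharePoint or Purview permission reports.

```python
from dataclasses import dataclass

# Hypothetical permissions export. In a real tenant these records would
# come from SharePoint / Microsoft Purview reporting, not be hand-built.
@dataclass
class DocumentRecord:
    path: str
    classification: str   # e.g. "official", "official-sensitive"
    accessible_to: int    # number of users who can open the document

def flag_oversharing(records, sensitive_labels, max_audience=25):
    """Flag sensitive documents readable by more users than expected.

    Copilot surfaces anything the signed-in user can already open, so
    documents flagged here become far more discoverable once Copilot
    is enabled.
    """
    return [
        r.path
        for r in records
        if r.classification in sensitive_labels and r.accessible_to > max_audience
    ]

docs = [
    DocumentRecord("hr/salary-review.docx", "official-sensitive", 4200),
    DocumentRecord("comms/press-lines.docx", "official", 4200),
    DocumentRecord("legal/settlement-draft.docx", "official-sensitive", 12),
]

print(flag_oversharing(docs, {"official-sensitive"}))  # → ['hr/salary-review.docx']
```

The point of the sketch is the filter condition: it is the combination of sensitivity and audience size that matters, not either alone. A widely shared non-sensitive document is fine; a narrowly shared sensitive one is fine; the overlap is the remediation backlog.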
Output quality is not self-evident. Users without domain expertise in an area tended to overestimate the accuracy of Copilot outputs. This is a known risk with LLM-generated content and is particularly significant in contexts where errors have material consequences — legal, regulatory, financial.
Measurement requires a baseline. Departments that had not established baseline productivity measures before deployment could not credibly measure the impact of Copilot after deployment. The 1.2 hours per week figure came from departments that had done the pre-work to establish what pre-Copilot productivity looked like. Many had not.
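The arithmetic here is trivial but the discipline is not. A time-saving figure is a difference between two measurements, and the first one has to exist before deployment. A sketch, with illustrative numbers only:

```python
from statistics import mean

def weekly_saving(baseline_hours, post_hours):
    """Estimated hours saved per user per week, relative to a measured baseline.

    Without baseline_hours this calculation is impossible: post-deployment
    numbers alone describe a level, not a change.
    """
    return mean(baseline_hours) - mean(post_hours)

# Illustrative figures: hours per week spent on routine drafting,
# measured for the same cohort before and after Copilot rollout.
baseline = [6.5, 7.0, 5.5, 6.0]
post = [5.2, 5.8, 4.6, 5.0]

print(round(weekly_saving(baseline, post), 2))  # → 1.1
```

A departmental average like the 1.2 hours per week headline figure is only credible when the cohort and the task mix are held constant across both measurements; otherwise the difference reflects who was measured, not what changed.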
What Irish organisations can apply
The UK findings are directly relevant to Irish organisations — in the public sector, in professional services, in financial services — that have deployed or are planning to deploy Copilot.
The deployment gap in Irish organisations — the gap between licences purchased and measurable productivity return — is well documented. The causes are consistent with what the UK trials surfaced: deployment without a diagnostic, measurement without a baseline, and use case identification based on vendor suggestion rather than organisational analysis.
The specific lessons are practical. Before deployment: assess the data governance environment and remediate obvious access control gaps. Define which roles and workflows are genuinely suited to Copilot assistance and which are not. Establish baseline productivity measures that will allow you to assess impact.
After deployment: measure actual usage patterns alongside productivity outcomes, not usage as a proxy for productivity. Monitor for quality failures in AI-assisted outputs. Track adoption differentials across roles and investigate why some user groups are not finding value.
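The distinction between usage and productivity can be made operational by keeping the two signals separate in the analysis rather than collapsing them. A minimal sketch follows; the roles, action counts, and reported savings are hypothetical, standing in for whatever telemetry and survey data an organisation actually collects.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (role, copilot_actions_per_week, self_reported_hours_saved)
usage = [
    ("correspondence", 45, 1.8),
    ("correspondence", 38, 1.5),
    ("policy-analysis", 6, 0.1),
    ("policy-analysis", 4, 0.0),
    ("casework", 22, 0.9),
]

def adoption_by_role(rows):
    """Average usage and reported saving per role, kept as separate signals.

    Usage alone is a poor proxy: a role can log many Copilot actions while
    reporting little time saved, and that divergence is exactly the pattern
    worth investigating.
    """
    grouped = defaultdict(list)
    for role, actions, saved in rows:
        grouped[role].append((actions, saved))
    return {
        role: (mean(a for a, _ in vals), mean(s for _, s in vals))
        for role, vals in grouped.items()
    }

for role, (actions, saved) in sorted(adoption_by_role(usage).items()):
    print(f"{role}: {actions:.1f} actions/week, {saved:.2f} h saved")
```

In these illustrative numbers, policy analysis shows both low usage and negligible saving, which matches the UK finding that judgement-intensive roles gained least: the follow-up question is whether the role lacks suitable use cases or the users lack support, and only investigation answers that.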
The organisations getting measurable return from Copilot are doing these things. They are also, consistently, those that ran a diagnostic of how work actually flows before deploying AI on top of it. That is not a coincidence.
If your organisation has Copilot licences but cannot demonstrate a productivity return, a diagnostic of your working patterns and deployment approach is the most efficient starting point. Contact Acuity AI Advisory to discuss.