How to estimate R&D hours without a timesheet nightmare

Labour cost is the largest part of most software R&D claims, and it comes down to a number almost no engineering team has: how many hours each person spent on eligible R&D. The gold standard is contemporaneous timesheets, but almost nobody keeps them, at least not split by experiment. So the practical question is how to produce a defensible hours figure without inventing a year of time records after the fact.

The answer the ATO accepts is a reasonable, documented estimation methodology, as long as it is defensible and linked to the specific registered activity. Here is a method that meets that bar, anchored to evidence you already have.

This is general information, not tax advice. Your registered tax adviser determines and lodges your actual claim.

Start from the artefacts, not from memory

Asking an engineer "what percentage of your year was R&D?" produces a number with nothing behind it. A reviewer cannot test it, and it usually drifts high.

The better anchor is your pull requests. Each one is a dated record of real work by a known person. The trick is not to ask someone to recall a whole year, but to estimate effort on a small, representative sample of PRs, then extrapolate in a way you can explain and defend. The recorded number is still a human-attested time allocation; the PRs are the evidence and the scaffold that make it quick and traceable.

The method, step by step

1. Take a stratified sample

Pull requests vary wildly in effort, so do not rely on a simple random sample, which can easily miss a whole size class. Group each contributor's PRs by size (small, medium, large, by code churn) and sample across the classes, taking at least one from each size class and at least one per experiment. Stratifying this way measures how effort changes with size instead of assuming one flat rate.
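
A minimal Python sketch of one way to draw such a sample. The churn thresholds, the `churn` and `experiment` field names, and the `per_class` default are all assumptions for illustration, not values the method prescribes; `experiment` is None for routine work.

```python
import random
from collections import defaultdict

# Hypothetical churn thresholds (lines added + deleted); tune to your codebase.
SMALL_MAX = 100
MEDIUM_MAX = 500

def size_class(churn):
    if churn <= SMALL_MAX:
        return "small"
    if churn <= MEDIUM_MAX:
        return "medium"
    return "large"

def stratified_sample(prs, per_class=3, seed=0):
    """Pick up to per_class PRs from each size class, then top up so that
    every experiment has at least one sampled PR."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    by_class = defaultdict(list)
    for pr in prs:
        by_class[size_class(pr["churn"])].append(pr)

    sample = []
    for members in by_class.values():
        sample.extend(rng.sample(members, min(per_class, len(members))))

    # Top up: add one PR for any experiment the class-based draw missed.
    covered = {pr["experiment"] for pr in sample if pr["experiment"]}
    for pr in prs:
        exp = pr["experiment"]
        if exp and exp not in covered:
            sample.append(pr)
            covered.add(exp)
    return sample

# Hypothetical PRs for one contributor.
prs = [
    {"id": 1, "churn": 40, "experiment": "EXP-1"},
    {"id": 2, "churn": 80, "experiment": None},
    {"id": 3, "churn": 320, "experiment": "EXP-2"},
    {"id": 4, "churn": 1500, "experiment": None},
]
sample = stratified_sample(prs, per_class=1)
```

Fixing the seed is a design choice: it lets a reviewer re-run the draw and get the same PRs, which makes it easier to show the sample was not hand-picked.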

2. Average within each size bucket

From the sample, compute the average hours for each size class. For example:

small PRs:  avg 6h
medium PRs: avg 16h
large PRs:  avg 40h

Bucket averages beat a single fitted rate because they capture the fixed effort per PR (review, context-switching, setup) that a strictly proportional model misses, and because "we sampled each size class and applied its measured average" is easy to explain to a reviewer.
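
The averaging itself is simple enough to sketch in a few lines of Python. The nine hour values below are hypothetical, chosen only so that the results match the illustrative averages above; in practice each value would be a contributor's attested estimate for one sampled PR.

```python
from collections import defaultdict
from statistics import mean

def bucket_averages(sampled):
    """Average attested hours per size class from (size_class, hours) pairs."""
    hours_by_class = defaultdict(list)
    for size, hours in sampled:
        hours_by_class[size].append(hours)
    return {size: mean(hours) for size, hours in hours_by_class.items()}

# Hypothetical sample: three PRs per size class.
sample = [
    ("small", 5), ("small", 6), ("small", 7),
    ("medium", 14), ("medium", 16), ("medium", 18),
    ("large", 36), ("large", 40), ("large", 44),
]
averages = bucket_averages(sample)
# averages maps small to 6, medium to 16, large to 40
```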

3. Extrapolate across all PRs, split by assignment

Apply each size class's average to every PR that contributor authored in the year. Then split the total: hours on PRs assigned to a confirmed core experiment are experiment hours; everything else is routine and excluded.
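
As a rough Python sketch of this step: the `size` and `experiment` fields and the function name are assumptions for illustration. Here `experiment` is None for routine work and holds an experiment ID only once that experiment is a confirmed core activity.

```python
def extrapolate(prs, averages):
    """Apply each size class's average to every PR, then split by assignment."""
    experiment_hours = 0
    routine_hours = 0
    for pr in prs:
        hours = averages[pr["size"]]
        if pr["experiment"] is not None:
            experiment_hours += hours
        else:
            routine_hours += hours
    return experiment_hours, routine_hours

averages = {"small": 6, "medium": 16, "large": 40}  # from the sampled PRs
# Hypothetical PRs for one contributor.
prs = [
    {"size": "small", "experiment": "EXP-1"},
    {"size": "large", "experiment": "EXP-1"},
    {"size": "medium", "experiment": None},
]
experiment_hours, routine_hours = extrapolate(prs, averages)
# experiment_hours is 46 (6 + 40), routine_hours is 16
```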

4. Deduct non-PR overhead

Take the contributor's actual working hours for the year (from their FTE or salaried hours). Whatever is not represented by any PR (meetings, planning, operations, design that left no artefact) is overhead and is excluded:

non_pr_overhead = working_hours - total_pr_hours

This residual doubles as a sanity check. If your extrapolated PR hours exceed the person's actual working hours, so that the residual is negative, the estimate has over-counted and should be flagged and re-sampled, not quietly accepted.
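
One way to make the check hard to skip is to build it into the calculation. This is a sketch only: the function name is an assumption, and raising an error is just one possible way of flagging the problem.

```python
def non_pr_overhead(working_hours, total_pr_hours):
    """Residual hours not represented by any PR.

    Refuses to return a negative residual, because that means the
    extrapolation has counted more hours than the person worked.
    """
    if total_pr_hours > working_hours:
        raise ValueError(
            f"PR hours ({total_pr_hours}) exceed working hours "
            f"({working_hours}); re-sample before using this estimate"
        )
    return working_hours - total_pr_hours

overhead = non_pr_overhead(working_hours=1720, total_pr_hours=1200)
# overhead is 520
```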

5. Eligible hours and expenditure

Only experiment hours on confirmed core activities are eligible. Multiply by the fully-loaded rate (salary plus on-costs, divided by working hours):

eligible_fraction    = eligible_hours / working_hours
eligible_expenditure = eligible_hours x fully_loaded_rate

Eligible hours are a subset of PR hours, and the step 4 check rejects any estimate in which PR hours exceed working hours. Once that check passes, the eligible fraction cannot exceed 100%, which closes off the over-apportionment failure mode.
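
The two formulas, together with the ordering of hours that the 100% cap relies on, can be sketched in Python as follows. The function names and the salary, on-cost and hours figures are hypothetical round numbers, not taken from any real claim.

```python
def fully_loaded_rate(salary, on_costs, working_hours):
    """Hourly cost: salary plus on-costs, divided by working hours."""
    return (salary + on_costs) / working_hours

def eligible_figures(eligible_hours, total_pr_hours, working_hours, rate):
    # Guard the chain the cap depends on: eligible <= PR hours <= working hours.
    if not 0 <= eligible_hours <= total_pr_hours <= working_hours:
        raise ValueError("hours are inconsistent; re-check the estimate")
    fraction = eligible_hours / working_hours
    expenditure = eligible_hours * rate
    return fraction, expenditure

# Hypothetical inputs for illustration only.
rate = fully_loaded_rate(salary=100_000, on_costs=20_000, working_hours=1_600)
fraction, expenditure = eligible_figures(400, 1_000, 1_600, rate)
# rate is 75.0, fraction is 0.25, expenditure is 30000.0
```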

A worked example

Alice, FY2026, worked 1,720 hours (about 0.9 FTE) at a fully-loaded rate of $95/h. She authored 48 PRs; a 9-PR stratified sample gives the bucket averages above. Extrapolated across all 48:

experiment PRs (core, confirmed)   520 h    eligible
routine PRs                        680 h    excluded
non-PR overhead (residual)         520 h    excluded
-------------------------------------------
working hours                    1,720 h

eligible fraction    = 520 / 1,720 = 30%
eligible expenditure = 520 x $95   = $49,400

Every one of those 520 hours traces back to specific pull requests assigned to a specific experiment.
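
The figures in this example can be re-checked mechanically. The short Python sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Figures from the worked example (Alice, FY2026).
working_hours = 1720
rate = 95  # fully-loaded, dollars per hour
breakdown = {
    "experiment": 520,  # eligible
    "routine": 680,     # excluded
    "overhead": 520,    # excluded residual
}

# The three rows must account for every working hour, no more and no less.
assert sum(breakdown.values()) == working_hours

eligible_hours = breakdown["experiment"]
eligible_fraction = eligible_hours / working_hours
eligible_expenditure = eligible_hours * rate

print(f"eligible fraction    = {eligible_fraction:.1%}")   # 30.2%
print(f"eligible expenditure = ${eligible_expenditure:,}")  # $49,400
```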

Why this holds up

Three things make this defensible rather than a guess:

It is anchored to contemporaneous artefacts. Every eligible hour links to dated pull requests and the experiment they belong to, satisfying the requirement that a reviewer can tie cost to a specific registered activity.
It is conservative by construction. Routine PRs and non-PR overhead are shown but excluded, and the eligible fraction cannot exceed 100%. R&D effort that left no PR, such as whiteboard design or hand-run experiments, is not counted, so the method tends to undercount rather than over-reach.
It is attested. A named reviewer (a founder or engineering manager) checks each contributor's sample, hours, rate and breakdown and signs off, and the basis of each figure (estimate, manager-attested, or timesheet) is recorded.
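
The attestation is easier to audit if it is stored as a structured record, not as free text. The Python sketch below shows one possible shape; every field name, the reviewer placeholder, the date and the PR IDs are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class Basis(Enum):
    """How a figure was arrived at, recorded alongside the figure itself."""
    ESTIMATE = "estimate"
    MANAGER_ATTESTED = "manager-attested"
    TIMESHEET = "timesheet"

@dataclass(frozen=True)  # frozen: a signed-off record should not change
class Attestation:
    contributor: str
    reviewer: str
    signed_on: date
    eligible_hours: float
    rate: float
    basis: Basis
    sampled_pr_ids: tuple

# Hypothetical record for illustration only.
record = Attestation(
    contributor="Alice",
    reviewer="(engineering manager)",
    signed_on=date(2026, 7, 1),
    eligible_hours=520,
    rate=95,
    basis=Basis.MANAGER_ATTESTED,
    sampled_pr_ids=(101, 117, 142),
)
```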

A note on best practice

This is a defensible reconstruction, not a replacement for real-time records. If you do keep contemporaneous timesheets split by activity, use them; they are stronger. But for the large majority of teams that do not, a documented, artefact-anchored estimate is the reasonable methodology the rules contemplate, and it is far more defensible than a percentage someone recalled at year end.

This is exactly the model our tool implements: stratified sampling against your PRs, extrapolation, overhead deduction, and per-contributor attestation, with every eligible hour tied back to the evidence for your adviser to review.

This article is general information, not tax advice, and not a determination of eligibility. Speak to a registered tax adviser before lodging an R&D Tax Incentive claim.