— Data privacy & AI

Does AI Takeoff Software Train on My Uploaded Plans?

One of the biggest data fears estimators have about AI takeoff is that their confidential plans and pricing quietly become training fuel for a model competitors also use. Here is how to tell whether a tool trains on your data, what good vendor answers look like, and how PILARS handles it.

Why this question matters

Your uploaded plans contain far more than geometry. Embedded in a set of construction documents are your unit costs, labor rates, scope notes, and the specific way you read a job — information that took years to calibrate and that you would never share with a competitor. If an AI takeoff platform trains its models on customer-uploaded data, fragments of that pricing intelligence can theoretically surface in responses to other users asking similar questions about similar project types.

The risk is not purely hypothetical. Public and federal project work frequently includes contractual provisions that prohibit design data from being reused, retained beyond the transaction, or processed for any secondary purpose. If you are bidding publicly funded work, a vendor whose terms permit training on customer data could put you in breach of your contract with the owner.

This is not a reason to avoid AI takeoff tools — it is a reason to read the policy before you upload your first set, and to ask the right questions in writing before you sign.

The three policy stances vendors take

When you read through privacy policies and data processing agreements in this space, vendors generally fall into one of three categories. Understanding which bucket a tool sits in before you upload is the fastest way to protect yourself.

| Stance | What it means | Vendor example |
| --- | --- | --- |
| No training on customer data, ever | The vendor never uses uploaded plans, pricing, or outputs to train, fine-tune, or evaluate any model. This is the strongest protection and the stance PILARS takes. | PILARS |
| Training only on opt-in, de-identified data | The vendor may train on customer data but only with explicit consent and after de-identification. Check what “de-identified” means in their DPA: scope notes and BOQ line items can re-identify a project type even without a project name. | Varies |
| Training by default unless you opt out | The weakest stance. Unless you find and toggle an opt-out setting, your data is in the training pool. Read the fine print in the Terms of Service carefully; this is often buried under “service improvement” language. | Varies |

Third-party LLM pass-through risk

Many AI takeoff tools do not run their own models. They route your uploaded plans — or extracted text and images from them — to an external LLM API such as OpenAI, Anthropic, or Google. The vendor's own privacy policy may be perfectly clean, but whether your data is used for training depends equally on which API tier that vendor has purchased.

Enterprise API tiers from the major providers typically offer zero-data-retention and explicit no-training contractual terms. Consumer-facing chat products (ChatGPT, Gemini chat) generally do not offer these guarantees by default, and using them directly for construction estimating carries meaningful data risk. The gap between tiers matters: the same underlying model can carry very different data handling obligations depending on how the vendor accesses it.

The practical question to ask any vendor is: which model provider do you use, are you on an enterprise API agreement, and are no-training and zero-retention terms contractually enabled for my data? A vendor who cannot answer that question clearly is unlikely to have enterprise-grade terms in place.

  • Enterprise API tiers: typically zero-data-retention + no-training contractually available
  • Consumer chat products: generally no such guarantees by default
  • Ask for written confirmation of which tier and which provider handles your files
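
If you track several vendors, the pass-through questions above can be kept as a small checklist. The following is only a sketch in Python: the `PassThroughAnswers` structure, its field names, and the `open_gaps` helper are hypothetical names invented for illustration, not part of any vendor's API or of PILARS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PassThroughAnswers:
    """A vendor's answers to the pass-through questions (hypothetical structure)."""
    provider: Optional[str]          # named LLM provider, or None if undisclosed
    enterprise_api_agreement: bool   # vendor is on an enterprise API agreement
    no_training_terms: bool          # no-training terms contractually enabled
    zero_retention_terms: bool       # zero-data-retention contractually enabled
    confirmed_in_writing: bool       # answers were given in writing


def open_gaps(answers: PassThroughAnswers) -> List[str]:
    """Return the items that still need a clear answer from the vendor."""
    gaps = []
    if not answers.provider:
        gaps.append("model provider not named")
    if not answers.enterprise_api_agreement:
        gaps.append("no enterprise API agreement")
    if not answers.no_training_terms:
        gaps.append("no-training terms not enabled")
    if not answers.zero_retention_terms:
        gaps.append("zero-retention terms not enabled")
    if not answers.confirmed_in_writing:
        gaps.append("answers not confirmed in writing")
    return gaps


# Made-up example: a vendor that answered only part of the question.
vague_vendor = PassThroughAnswers(
    provider=None,
    enterprise_api_agreement=True,
    no_training_terms=True,
    zero_retention_terms=False,
    confirmed_in_writing=False,
)
for gap in open_gaps(vague_vendor):
    print("Follow up:", gap)
```

An empty list does not prove the vendor is safe; it only means every item on this particular checklist has a written answer you can attach to the agreement.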

Where to find the answer

Most estimators are not lawyers, and vendor privacy documents are not written to be readable. Here is where to look and what to search for to get a fast answer without reading every paragraph.

Start with the Privacy Policy and Terms of Service. Use your browser's find function to search for the words training, model improvement, machine learning, and derivative works. Any of those phrases appearing in a section about how the vendor uses customer data is a signal worth reading carefully. If none appear, the policy may simply be silent — which is not the same as a prohibition.
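
If you have the policy saved as plain text, the same search can be scripted so that every match is shown with its surrounding sentence. This is a minimal sketch, assuming you have already copied the policy into a string; the `find_policy_mentions` helper and the sample excerpt are made up for illustration.

```python
import re

# Search terms taken from the paragraph above.
TERMS = ["training", "model improvement", "machine learning", "derivative works"]


def find_policy_mentions(policy_text, terms=TERMS, context=60):
    """Return (term, snippet) pairs for each case-insensitive whole-phrase match."""
    hits = []
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(policy_text):
            start = max(0, match.start() - context)
            end = min(len(policy_text), match.end() + context)
            # Collapse line breaks and runs of spaces so the snippet reads cleanly.
            snippet = " ".join(policy_text[start:end].split())
            hits.append((term, snippet))
    return hits


# Hypothetical policy excerpt, invented for this example only.
sample = (
    "We may use Customer Content for service improvement, including "
    "machine learning and model improvement. We do not sell personal data."
)

for term, snippet in find_policy_mentions(sample):
    print(f"[{term}] ...{snippet}...")
```

The script matches exact phrases only, so variants such as "train" or "trained" need to be added to the list if you want them caught. As with the manual search, an empty result means the policy is silent, not that training is prohibited.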

The Data Processing Agreement (DPA) is often a separate document and is more legally precise. It typically states the lawful basis for processing, the permitted purposes, and whether secondary use (such as training) is prohibited. If the vendor does not publish a DPA or will not provide one on request, treat that as a significant red flag for enterprise or public-sector work.

Finally, check the vendor's security page or trust center. A SOC 2 Type II report means an independent auditor has examined the vendor's data handling controls over a period of time; SOC 2 is an attestation rather than a certification, so ask for the report itself. The trust center usually also lists sub-processors: the third-party services, including LLM providers, that touch your data.

Questions to send the vendor in writing

Email is better than a demo call for this purpose. A written answer creates a record you can attach to a vendor agreement, and it requires the vendor to commit. If they are reluctant to answer in writing, that tells you something too. Here are the four questions that cover the full surface area of the risk.

  • Do you train, fine-tune, or evaluate models on my uploaded plans or pricing? This should be a simple yes or no. If the answer is no, ask them to confirm it in the DPA.
  • Do you send my files to a third-party LLM, and is no-training/zero-retention enabled? Ask them to name the provider and the tier. Enterprise API agreements with zero-data-retention are contractually enforceable; verbal assurances are not.
  • Can I opt out of any data use, and is opt-out the default? The safest answer is that there is nothing to opt out of because no secondary use occurs. If there is an opt-out process, ask whether it applies retroactively to already-uploaded data.
  • Will you sign a DPA confirming no secondary use of my data? Most reputable enterprise vendors will. If a vendor refuses or cannot turn one around, that is a practical signal about their data maturity.

PILARS will answer all four of these in writing and will sign a DPA for any customer who requests one. The short answer to question one is no — we do not train or fine-tune on uploaded plans or pricing.

Questions estimators actually ask

How do I know if my takeoff data is used for training?

Read the privacy policy and DPA for terms like model improvement or training, and ask the vendor directly in writing. A transparent vendor states plainly whether training occurs and whether you can opt out.

Does PILARS train on my uploaded plans?

No. PILARS does not train or fine-tune models on your uploaded plans or pricing data.

Is it safe if the tool uses ChatGPT or Claude under the hood?

It can be, if the vendor uses the enterprise API tier with zero-data-retention and no-training terms enabled. Ask the vendor to confirm which provider and tier they use.

What is zero data retention?

It is an LLM API arrangement under which the provider does not store your prompts or outputs after processing. Enterprise API tiers from major providers can enable it; consumer chat products generally do not offer it.

Can I require a no-training clause in my contract?

Yes. Many vendors will sign a DPA or addendum confirming they do not use your data for training or any secondary purpose.
