Vendor RFP scoring without referral fees: independent technical evaluation

Most vendor RFP scoring is theater

The standard RFP scoring rubric weighs categories — features, price, vendor stability, references — and produces a weighted score. The category structure looks rigorous. The scoring within each category is mostly judgment by reviewers who are responding to vendor sales materials, not to architecture. The vendor with the better salesperson and the slicker deck typically scores highest, regardless of whether the underlying architecture is what the buyer needs.
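
To make the mechanics concrete, here is a minimal sketch of that standard weighted rubric in Python. The categories, weights, and per-vendor scores are hypothetical, but the structure is the one most RFP rubrics use: the math is sound, and the inputs are still subjective reviewer judgments.

```python
# A minimal sketch of a standard weighted RFP rubric. Categories,
# weights, and scores are hypothetical; the weighted structure looks
# rigorous while the inputs stay subjective.

WEIGHTS = {"features": 0.35, "price": 0.25, "stability": 0.20, "references": 0.20}

def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-category scores (0-10) into one weighted total."""
    return sum(WEIGHTS[cat] * scores[cat] for cat in WEIGHTS)

# Reviewer judgments shaped by the sales deck, not the architecture:
vendor_a = {"features": 9, "price": 7, "stability": 8, "references": 9}  # slicker deck
vendor_b = {"features": 7, "price": 8, "stability": 8, "references": 7}  # better fit

print(f"vendor A: {weighted_score(vendor_a):.2f}")  # 8.30, wins on salesmanship
print(f"vendor B: {weighted_score(vendor_b):.2f}")  # 7.45, better architecture loses
```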

We see this consistently across audit engagements: a buyer signed with the vendor that scored 4 points higher in the RFP, only to discover at month 9 that the integration cost is double what the proposal estimated, the model is white-labeled from a frontier vendor with surprise rate-limit constraints, and the data residency claim doesn't hold up in the customer's regulatory geography. None of this surfaced in the original RFP scoring because the scoring didn't ask the right questions.

Architecture comparison surfaces what marketing materials hide

An independent technical scoring evaluates the underlying architecture: where is inference actually run, what model is actually used (and is it the vendor's or a re-licensed frontier model), what is the retrieval architecture, how is access control enforced, where is data stored at rest and in transit, what is the observability stack, what is the vendor's roadmap for model upgrades, and how does versioning work across the customer's tenant.

We evaluate each of these explicitly and require vendor responses with specifics, not slogans. "We use a state-of-the-art model" is rejected; "we use Claude 3.5 Sonnet via the Anthropic API in us-east-2, with a fallback to GPT-4o via Azure OpenAI in eastus" is acceptable. Vendor responses that can't get specific flag architectural ambiguity that will surface as integration cost later.
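
As a sketch of how that specificity requirement can be enforced mechanically, the check below rejects slogan answers and flags responses that never name a model or a region. The slogan list and patterns are illustrative assumptions, not our actual review tooling.

```python
# Sketch of a specificity gate for vendor responses (slogan list and
# patterns are illustrative assumptions, not our review tooling).

import re

SLOGANS = ("state-of-the-art", "world-class", "best-in-class", "industry-leading")
REQUIRED = {
    "model":  r"\b(claude|gpt|gemini|llama|mistral)[\w.\- ]*",  # a named model
    "region": r"\b(us|eu|ap)[a-z]*-?[a-z]+-?\d\b",              # a cloud region
}

def review(answer: str) -> str:
    text = answer.lower()
    if any(slogan in text for slogan in SLOGANS):
        return "REJECTED: slogan; re-ask with specifics required"
    missing = [name for name, pat in REQUIRED.items() if not re.search(pat, text)]
    return f"FLAGGED: no {', '.join(missing)} named" if missing else "ACCEPTED"

print(review("We use a state-of-the-art model"))               # REJECTED
print(review("Claude 3.5 Sonnet via the Anthropic API in us-east-2, "
             "fallback to GPT-4o via Azure OpenAI in eastus"))  # ACCEPTED
```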

Categories scored: 12–18, architecture-led, not feature-led
TCO over 5 years: modeled with sensitivity ranges
Hidden cost surfacing: ~6–8 items per vendor, on average
Referral fees taken: 0, from any vendor evaluated

Total cost of ownership is more than the listed price

Vendor list pricing covers the explicit subscription. Total cost of ownership includes integration engineering, ongoing operations (the customer's SRE time supporting the vendor's stack), data egress fees, model upgrade migrations, eval harness construction, change-management for prompt or behavior shifts, exit costs (data extraction, re-training, contract termination), and the reputational risk of vendor lock-in. We model TCO over five years with sensitivity ranges, because the headline price is rarely the largest cost component.
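
A minimal sketch of what that five-year model looks like, assuming hypothetical line items and dollar figures. The point is that the subscription is one row among many, and the sensitivity range is carried per line item rather than applied to a single total.

```python
# A minimal five-year TCO sketch with sensitivity ranges. Line items
# and dollar figures are hypothetical placeholders; the headline
# subscription is deliberately one row among many.

LOW, BASE, HIGH = 0, 1, 2  # sensitivity scenarios

line_items = {  # (low, base, high) annual USD, hypothetical
    "subscription":             (240_000, 240_000, 240_000),  # the listed price
    "integration (yr 1 only)":  (190_000, 280_000, 360_000),  # customer-side work
    "ops / SRE time":           ( 60_000, 100_000, 150_000),
    "data egress":              ( 30_000,  90_000, 260_000),  # high-volume risk
    "model upgrade migrations": ( 20_000,  60_000, 120_000),
    "eval harness":             ( 40_000,  80_000, 140_000),
    "exit reserve":             ( 10_000,  30_000,  80_000),
}

def five_year_tco(scenario: int) -> int:
    total = 0
    for item, costs in line_items.items():
        years = 1 if "yr 1" in item else 5  # one-time vs. recurring
        total += costs[scenario] * years
    return total

for name, s in [("low", LOW), ("base", BASE), ("high", HIGH)]:
    print(f"{name:>4}: ${five_year_tco(s):,}")
```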

Common surprises: a vendor's data egress fees on a high-volume conversation platform exceeding the subscription cost; a vendor's model-upgrade cycle requiring quarterly customer-side prompt re-engineering; a vendor's eval-harness gap leaving the customer to build their own from scratch. Each of these is foreseeable in the RFP if the right questions are asked. Independent scoring asks them.

References are the most over-weighted RFP signal

Vendor-supplied references are by definition self-selected. The references are happy customers who agreed to take the call. They tell the buyer the system works for their use case. They do not tell the buyer about the customers who churned, the deployments that stalled, or the integration projects that ran 200% over budget. Reference checks have value but should be weighted accordingly: as confirmation of best-case outcomes, not as a representative sample.

We supplement vendor-provided references with independent diligence — public outage records, GitHub issue history if the vendor has any open-source presence, regulatory complaints, and back-channel conversations with engineers we know at peer organizations. The picture from triangulation is meaningfully different from the vendor's curated reference set.

Hidden integration costs are where most projects exceed budget

Every vendor proposal underestimates customer-side integration cost. The integration always involves identity provider configuration, network and security review, data pipeline construction, eval harness adaptation, monitoring instrumentation, change management with affected teams, and the migration of any pre-existing system. We estimate customer-side integration at 80–150% of the vendor's stated implementation cost — the vendor's number covers the vendor's work, not the customer's.
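
A sketch of that estimate as arithmetic, assuming a hypothetical allocation of the 80–150% range across the line items the vendor's proposal omits:

```python
# Sketch of the customer-side integration estimate, assuming a
# hypothetical split of the 80-150% range across omitted line items.

def customer_side_estimate(vendor_impl_cost: float) -> dict[str, tuple[float, float]]:
    # (low, high) fractions of the vendor's stated implementation cost;
    # the fractions sum to 0.80 and 1.50 respectively
    items = {
        "identity provider configuration": (0.10, 0.15),
        "network and security review":     (0.10, 0.20),
        "data pipeline construction":      (0.25, 0.45),
        "eval harness adaptation":         (0.15, 0.30),
        "monitoring instrumentation":      (0.10, 0.20),
        "change management and migration": (0.10, 0.20),
    }
    return {k: (lo * vendor_impl_cost, hi * vendor_impl_cost)
            for k, (lo, hi) in items.items()}

est = customer_side_estimate(400_000)       # vendor states $400k implementation
low = sum(lo for lo, _ in est.values())     # 320,000  (80% of the vendor's number)
high = sum(hi for _, hi in est.values())    # 600,000 (150% of the vendor's number)
```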

We surface this in the scoring: each vendor's proposal includes our independent estimate of customer-side integration cost, with the line items the vendor omitted. This is often the most actionable artifact for the buyer's procurement team because it sets a realistic budget envelope rather than the vendor's optimistic one.

Independence from vendor referral fees is the credibility test

We do not take referral fees, kickbacks, or commissions from any vendor we evaluate. The buyer pays for the engagement; the vendors do not. This is the structural property that makes the scoring independent. Most consulting firms doing RFP work today have undisclosed referral relationships with multiple vendors; the scoring inevitably tilts toward the relationships that pay better.

We disclose this on every engagement, in writing, in the engagement letter. The cost is that we make less money per engagement than firms that take both the buyer's fee and the vendor's referral. The benefit is that buyers get evaluations they can trust — which is what they engaged us for in the first place.

The scoring artifact is the deliverable, not the recommendation alone

We deliver RFP scoring as a structured artifact: the rubric used, the per-vendor responses with our notes, the scored categories with the rationale per score, the TCO model with sensitivity ranges, the integration cost estimates with line items, and the recommendation with explicit risk factors. The buyer can present this artifact internally to procurement, finance, security, and the board, and the rationale survives scrutiny.
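
As a sketch of the artifact's shape (field names here are illustrative, not our actual schema), each vendor's evaluation can be expressed as a structured record in which every score carries its rationale and every cost carries its range:

```python
# Illustrative shape of the scoring artifact as a structured record
# (field names are hypothetical): every score carries a rationale,
# every cost carries a sensitivity range.

from dataclasses import dataclass, field

@dataclass
class CategoryScore:
    category: str
    score: float      # 0-10
    rationale: str    # why this score, citing the vendor's actual response

@dataclass
class VendorEvaluation:
    vendor: str
    responses: dict[str, str]           # question -> vendor answer, with our notes
    scores: list[CategoryScore]
    tco_5yr_usd: tuple[int, int, int]   # (low, base, high) over five years
    integration_estimate_usd: dict[str, tuple[int, int]]  # omitted line items
    risk_factors: list[str] = field(default_factory=list)
```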

When the buyer disagrees with our recommendation — which happens, and is fine — they can at least disagree with documented analysis. The artifact itself becomes the basis for negotiation with the chosen vendor, surfacing the gaps that need contractual treatment. The artifact has continuing value beyond the immediate procurement decision.

"We had three vendors who all looked similar in their pitches. The independent scoring surfaced that one had a 40% higher five-year TCO because of data egress fees nobody had asked about, another had churned out of two peer deployments we hadn't heard about, and the third was the right answer despite having a less polished demo. We saved $2.4M over the contract life by paying for an evaluation that didn't take vendor money."

— CIO, regional bank

Frequently asked

What's wrong with standard RFP scoring?

It rewards salesmanship over architecture. Standard rubrics weight categories like features, price, and references, but the scoring within each category responds to vendor sales materials rather than to underlying architecture. The vendor with the better salesperson and the slicker deck typically scores highest, regardless of whether the architecture fits. The buyer signs and discovers the gaps in month 9, after the budget is committed.

How is independent technical scoring different?

It evaluates architecture, total cost of ownership, hidden integration costs, and failure modes the salesperson didn't mention. Vendor responses that can't get specific ("state-of-the-art model", "robust scaling", "world-class security") are rejected and re-asked with specificity required. The scoring asks the questions the marketing materials are designed to skip.

What is total cost of ownership in vendor evaluation?

Vendor subscription, plus integration engineering, ongoing operations (customer SRE time), data egress fees, model upgrade migrations, eval harness construction, change management, exit costs, and vendor lock-in risk, modeled over five years with sensitivity ranges. The headline price is rarely the largest component. Many surprises, such as egress fees exceeding the subscription or quarterly prompt re-engineering on model upgrades, are foreseeable when the right TCO questions are asked.

How are references handled?

Vendor-supplied references are self-selected — happy customers who agreed to take the call. They tell the best-case story, not the representative one. We supplement with independent diligence: public outage records, regulatory complaints, back-channel conversations with engineers at peer organizations. The triangulated picture is meaningfully different from the vendor's curated reference set.

Why does it matter that you don't take vendor referral fees?

Because referral fees create undisclosed financial incentives that tilt scoring toward the relationships that pay better. Most consulting firms doing RFP evaluation today have such relationships, often undisclosed. We do not take referral fees, kickbacks, or commissions from any vendor we evaluate, disclose this in writing on every engagement, and earn less per engagement as a result. The structural independence is what makes the scoring credible.

What does the deliverable look like?

A structured artifact: rubric, per-vendor responses with our notes, scored categories with rationale, TCO model with sensitivity ranges, integration cost estimates with line items, and recommendation with explicit risk factors. The buyer can present internally to procurement, finance, security, and the board, and the rationale survives scrutiny. The artifact also drives negotiation with the chosen vendor, surfacing the gaps that need contractual treatment.