Job Evaluation AI Agents: Why Hay and Mercer Need an Overhaul

Estimated Reading Time: 12 minutes

Your marketing director manages a team of twelve people. Every traditional job rating system on the market says that span of leadership drives significant job value. Now picture a different marketing director across town who manages zero people but runs fifteen AI agents. Those agents handle prospecting, content creation, market analysis, and campaign tuning. Under Hay, the second director scores lower on Know-How and Accountability (Hay Guide Chart overview). Mercer IPE’s Knowledge factor penalizes the leaner team (Mercer IPE). WTW’s Global Grading System weights Job Functional Knowledge and Leadership against the agent-led model (WTW Global Grading). Yet the second director may produce twice the revenue impact while spending more time on cross-functional influence and strategic alliances.

Job evaluation AI agents are exposing a core flaw that cuts across the entire job evaluation landscape, not just one or two methods. The systems that have guided pay choices for decades all share one critical assumption: that knowledge must be held by a person to create job value. That assumption is falling apart.

The Knowledge Gap That Spans Every Major Pay Rating Method

This problem does not stop at one system. It runs through every major pay rating method that compensation pros depend on today. The Hay Guide Chart-Profile Method, built in the 1950s, scores roles on Know-How, Problem Solving, and Accountability. Know-How forms the base that caps the other two. Mercer’s IPE system rates five factors — Impact, Innovation, Knowledge, Communication, and Risk — with Knowledge as a direct scoring element. WTW’s Global Grading System uses seven factors, two of them clearly knowledge-based: Job Functional Knowledge and Business Expertise. Likewise, Radford McLagan’s pay database levels roles across Professional and Management tracks, with “Knowledge and Skills” serving as a primary lens for setting autonomy and impact. Even the BLS National Compensation Survey assigns Factor 1 as “Knowledge,” with points ranging from 50 to 1,850 based on depth and use.

The pattern is clear. Every system treats built-up knowledge as the core dividing line between job levels and pay grades. WorldatWork reports that 16 to 21 percent of organizations use point-factor methods, and nearly all of those methods include knowledge or skills as rated factors (WorldatWork). The ILO’s Scheme of Geneva from 1950 set “Mental Requirements” as an original pillar, and that pillar has lasted through seventy-five years of method changes across every major consulting firm (ILO framework).

MorganHR is far from alone in raising this alarm. In 2025, HBR writers Catalini, Wu, and Zhang argued that AI automates any task you can measure. This directly threatens systems built on scored knowledge (HBR 2025). Similarly, MIT’s Erik Brynjolfsson described how AI exposes “fossilized” experience in traditional ratings (NBER). Even WTW’s own Russ Wakelin has flagged the risks of over-relying on knowledge factors (WTW insights).

When Vendors Flag Their Own Methods

When the vendors themselves flag the problem, HR Directors must pay attention. Job evaluation AI agents do not challenge one method alone. They challenge the shared belief beneath all of them: that knowledge depth creates lasting gaps in job worth.

Algorithm-Based Platforms Carry the Same Dependency

Switching from Hay or Mercer to a data-driven platform does not escape this problem. Salary.com levels roles by scope, complexity, and competencies, and knowledge depth ties directly to progression from Level I through Level III. Payscale (which merged with PayFactors) blends market data with leveling guides that use “knowledge and skills” as core criteria. Pave sets job levels through scope, seniority, and required competencies; higher levels demand deeper expertise for broader impact. ERI scores knowledge as a primary factor, with sub-criteria for formal education and work experience. Every one of these platforms treats personally held knowledge as the backbone of job leveling. As AI agents compress knowledge gaps across levels, these tools hit the same wall as the legacy frameworks they were built to replace.

Why Job Evaluation AI Agents Break the Know-How Chain

Consider the logic chain that Hay built and that most other methods copy in some form. You cannot solve problems beyond what you know. You cannot hold accountability beyond what you can solve. For seventy years, this logic held because deep knowledge required years of schooling, practice, and hands-on work. That scarcity created real pay gaps between a senior analyst and a junior one. Job evaluation AI agents break that chain outright. Today, any team member can prompt an AI agent to run market analysis or model financial outcomes. That same agent can draft compliance documents or pull research across fields that once required an entire team.
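To see why compression in one factor cascades through the whole chain, consider a minimal sketch of the capping logic in Python. In the Hay method, Problem Solving is expressed as a percentage of Know-How, so the numbers below illustrate the mechanics only; the point values and percentage are invented, and the real Hay Guide Charts use proprietary scoring tables.

```python
# Illustrative sketch of the Hay-style capping chain described above.
# Point values and the Problem Solving percentage are invented for
# demonstration; the actual Hay Guide Charts use proprietary tables.

def hay_style_points(know_how: int, ps_percent: float, accountability: int) -> int:
    """Total points when Problem Solving is scored as a % of Know-How."""
    problem_solving = round(know_how * ps_percent)
    return know_how + problem_solving + accountability

# A senior analyst whose Know-How premium rested on scarce expertise.
before = hay_style_points(know_how=400, ps_percent=0.50, accountability=350)

# The same role after AI agents make much of that knowledge agent-accessible:
# a lower Know-How score mechanically drags Problem Solving down with it.
after = hay_style_points(know_how=264, ps_percent=0.50, accountability=350)

print(before, after)  # 950 746 -- the whole chain compresses, not one factor
```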

Stanford’s Digital Economy Lab studied 1,500 workers across 104 occupations. Their findings confirmed what many pay professionals have sensed but held back from saying. Valued workplace skills are shifting from data tasks toward people-focused skills (Stanford Digital Economy Lab). High-wage skills like data analysis are losing their premium. Meanwhile, skills tied to planning, teaching, and clear communication are gaining ground. Furthermore, 80 percent of U.S. workers may see AI tools affect at least 10 percent of their tasks. About 19 percent face changes to more than half their duties.

PwC’s 2025 Global AI Jobs Barometer adds a key data point: workers with AI skills now command a 56 percent wage premium over peers in the same roles, up from 25 percent one year prior (PwC 2025). Employer skill demands are changing 66 percent faster in AI-exposed roles. At the same time, demand for formal degrees fell seven percentage points in AI-augmented roles between 2019 and 2024.

The Market Is Already Repricing Roles

The market is devaluing the credentials that knowledge-based rating systems reward. These trends confirm that job evaluation AI agents are not a future concern. The market is already repricing roles based on AI skill rather than knowledge depth.

How Job Evaluation AI Agents Disrupt Mercer IPE and WTW Differently

Mercer IPE starts from a stronger position than Hay. MorganHR has always noted that Mercer begins with the type of company and its value chain. This grounding in business reality adds genuine depth that pure factor-scoring methods lack. However, job evaluation AI agents still challenge Mercer IPE on the Knowledge factor. When an AI agent can pull research and deliver strategic advice in minutes, the premium that personally held knowledge once earned shrinks fast. The gap between what a person knows and what an agent can access has narrowed to almost nothing.

WTW’s Global Grading System faces a similar but broader challenge. Its seven factors include Job Functional Knowledge, Business Expertise, Leadership, Problem Solving, Nature of Impact, Area of Impact, and Interpersonal Skills. Two of the seven factors are clearly knowledge-based. Problem Solving and Leadership both lean on assumed knowledge buildup. In effect, four of seven WTW factors face direct compression from AI agents. Radford McLagan’s leveling framework creates another weak spot. Its Professional track (P1 through P4 and above) ties advancement to expertise depth. P1 represents base knowledge. P4 represents industry-leading mastery built over ten or more years. When AI agents can deliver P4-level analytical output to a P2-level professional, the entire leveling structure becomes suspect.

HR Director Decision Framework: Ask one question of every role evaluation completed this year — Would this score change if we removed the assumption that the knowledge involved must be held by a person rather than accessed through an agent? If the answer is yes for more than a quarter of rated roles, the framework needs immediate recalibration.
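A minimal sketch of how that audit might be scripted, assuming your evaluation system can export a per-role flag. The records and the `score_would_change` field name here are hypothetical stand-ins for whatever your system actually produces.

```python
# Sketch of the one-question audit above. Records and field names are
# hypothetical; substitute your evaluation system's actual export.

evaluations = [
    {"role": "Marketing Director", "score_would_change": True},
    {"role": "Financial Analyst",  "score_would_change": True},
    {"role": "Plant Safety Lead",  "score_would_change": False},
    {"role": "Outside Sales Rep",  "score_would_change": False},
]

flagged_share = sum(e["score_would_change"] for e in evaluations) / len(evaluations)

if flagged_share > 0.25:
    print(f"{flagged_share:.0%} of rated roles flagged: recalibrate the framework.")
else:
    print(f"{flagged_share:.0%} flagged: framework holds for now.")
```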

Influence Replaces Knowledge in a World Where Job Evaluation AI Agents Level the Field

Think about what happens when every employee has access to the same AI tools. An experienced senior expert who once spent weeks preparing data analysis now competes with a lower-level employee who can produce the same analysis overnight using AI agents and present it to the C-suite with equal backing. The gap is no longer what each person knows; it is how well each person influences decisions and builds ties across org lines. Gokul Rajaram, board member at Marathon MP, has called AI a “discontinuity in talent evaluation” that makes old credentials irrelevant. He observes that young builders now outperform seasoned executives in AI-enabled settings. Amee Parekh, CEO of Stello AI, advocates tying pay to outcomes rather than knowledge-based tiers. According to a Gartner survey of more than 3,300 employees, 57 percent believe humans are more biased than AI when making pay decisions (HR Executive).

AI Elevates Outside Sales Rather Than Replacing It

This shift transforms sales roles just as much as analytical ones. The traditional outside sales rep earned higher pay partly through years of product, market, and client knowledge. Yet over time, many reps drifted toward email sequences and phone calls as their primary engagement tools. Face-to-face meetings became a later step rather than the core skill. AI agents now handle that entire digital front end — lead scoring, targeted messaging, and content delivery that pulls prospects deeper into the funnel. Rather than shrinking the outside sales role, this change actually elevates it. The reps who thrive will be those who show up in person, build trust across a table, and close through genuine relationships that no AI agent can replicate.

AI is forcing outside sales back to what “outside” was always supposed to mean. The FLSA exemption question alone demands attention as role duties shift. Yet the broader challenge from job evaluation AI agents runs deeper. When digital prospecting and process tasks migrate to agents, the human contribution concentrates on persuasion, judgment, and in-person relationship building. Current methods from Hay through WTW assign too little weight to these skills because they were long seen as “soft” rather than as primary value drivers.

Team Leadership Gets Complex When AI-Driven Role Evaluation Reshapes Org Charts

Traditional methods weigh people leadership heavily. Managing fifteen direct reports scores far higher than managing zero under any system, and market data backs this because survey benchmarks have always captured leadership span as a key pay factor. However, job evaluation AI agents introduce a critical twist that current benchmarks cannot handle. A marketing director who leads twelve people now competes with one who leads three people and twelve AI agents, and the second director may have more time for cross-functional strategy.

This creates a cascading problem for market data integrity. Pay surveys collect data based on traditional job matching that weights headcount as a primary leveling factor. When two very different role profiles carry the same title but vastly different levels of AI adoption, the benchmark becomes unreliable. AI-first organizations structure roles around agent management; traditional organizations maintain people-centric structures. Blending their pay data produces market rates that accurately represent neither model. Gartner projects that by 2026, 20 percent of organizations will use AI to flatten their structures, and more than half of current middle management roles could be cut. This flattening amplifies the pay challenge, as remaining roles shift to strategic oversight (Gartner IT Predictions). Consequently, pay surveys will mix data from very different org designs, and pay professionals who rely on published data without knowing whether participants are AI-first or traditional face growing risk.

Furthermore, the idea of “leader of people” itself must expand. In an AI-augmented firm, every employee becomes a talent agent of sorts. Each person manages AI tools, directs agent workflows across old lines, and decides when human input adds value. The gap between “individual contributor” and “people leader” blurs when the contributor runs agents that match a team of five.

Growth Through Agent Adoption Is the New Leadership Metric

People who grow their impact through agent adoption are more valuable than those who maintain old human-only processes. As a result, communication, influence, agent management, and drive for growth — not knowledge buildup or headcount — should become the primary rated factors.

What Job Evaluation AI Agents Mean for Pay Frameworks Going Forward

The path forward requires pay professionals to confront three hard realities at once. First, the knowledge-based pillar of every major job evaluation method is eroding, and AI agent capabilities will keep advancing. This goes beyond Hay and Mercer: WTW, Radford, BLS point-factor systems, and even the ILO framework share this weakness. Organizations that delay updating will build what MorganHR calls “compensation design debt,” where deferred method updates create compounding gaps between pay and actual job value.

Second, market data from pay surveys will grow less reliable as AI-first and traditional firms diverge within the same sectors. Third, the factors that drive real pay gaps now are communication, influence, agent management skills, and capacity for growth. These factors exist in current frameworks as secondary concerns. They need to become primary.

Tools like SimplyMerit can streamline compensation administration by collecting manager input on restructured criteria and modeling impacts on pay programs — while your compensation consultants provide the core assessment expertise. HR Directors can test new factor weighting before committing to firm-wide changes. Gartner predicts that by 2026, skills atrophy from GenAI use will push 50 percent of global organizations to require “AI-free” skills assessments (Gartner 2026 Predictions). This tension between AI skill premiums and critical-thinking preservation makes evaluation framework updates even more urgent. Rather than dropping old methods entirely, forward-thinking teams should overlay AI-augmentation modifiers as a bridge step.
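One way such an overlay could work is to discount only the slice of each factor that an agent can now supply, leaving the rest of the method intact. The sketch below is an illustration under assumed factor names, point values, and modifier ratings, not a published MorganHR formula.

```python
# One possible shape for an AI-augmentation modifier overlaid on an
# existing point-factor score. Factor names, points, and the 0-to-1
# "agent accessibility" ratings are illustrative assumptions.

factors = {
    # factor: (current points, share of the factor now agent-accessible)
    "Knowledge":       (400, 0.40),
    "Problem Solving": (230, 0.25),
    "Communication":   (180, 0.00),
    "Impact":          (260, 0.10),
}

def overlay(points: int, agent_accessibility: float) -> float:
    """Discount only the slice of a factor that an AI agent can supply."""
    return points * (1 - agent_accessibility)

adjusted = {f: overlay(p, a) for f, (p, a) in factors.items()}
print(sum(p for p, _ in factors.values()), "->", sum(adjusted.values()))
# 1070 -> 826.5 under these assumed modifiers
```

Because the overlay leaves the untouched slice of every factor alone, the existing framework keeps working as a bridge while the recalibrated weights are piloted.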

Scaling by Company Size

The approach differs in meaningful ways by company size. Small organizations under 250 employees can move fastest: they have fewer roles to re-evaluate and can pilot new criteria with less red tape. Mid-size companies should focus first on their highest-impact job families, typically revenue and technology roles, then expand over twelve to eighteen months. Large enterprises face the greatest challenge because existing structures represent a major sunk cost. However, large organizations also gain the most, because their scale amplifies the cost of misaligned pay.


Key Takeaways

  • Job evaluation AI agents challenge every major pay method — Hay, Mercer IPE, WTW, Radford, and BLS point-factor systems — because all weight knowledge as a primary determinant of job value.
  • HBR, MIT’s Brynjolfsson, and WTW’s own product leadership acknowledge that AI commoditizes scored expertise, exposing “fossilized” experience in traditional ratings.
  • Stanford research across 1,500 workers confirms that valued human skills are shifting from data processing toward interpersonal, communication, and organizational abilities.
  • PwC’s analysis of nearly a billion job ads shows workers with AI skills command a 56 percent wage premium while degree requirements drop seven points in AI-augmented roles.
  • Market pay data is becoming unreliable as AI-first and traditional organizations report benchmarks for very different role profiles under the same job titles.
  • Communication, influence, agent management, and growth capacity should replace knowledge depth as the primary rated factors in updated frameworks.
  • Gartner predicts 50 percent of global organizations will require “AI-free” skills assessments by 2026, creating a tension between AI proficiency premiums and critical-thinking preservation that evaluation frameworks must address.

Quick Implementation Checklist

  1. Audit your current job evaluation method and identify which factors depend on held knowledge versus agent-accessible knowledge.
  2. Map your evaluated roles against a four-quadrant grid (see the sketch after this list): high knowledge dependency with high AI agent potential (urgent), high knowledge with low AI potential (stable), low knowledge with high AI potential (watch), low knowledge with low AI potential (no action).
  3. Flag roles where AI agent adoption has already shifted the primary value from analytical output to influence, communication, or agent management.
  4. Assess your pay survey participation list for AI-first versus traditional organizations and adjust benchmark weighting.
  5. Pilot an AI-augmentation modifier on your highest-impact job family using your existing framework before expanding organization-wide.
  6. Brief your executive team on the gap between current evaluation scores and actual role value in AI-augmented positions.
  7. Establish a quarterly review cycle for evaluation criteria as AI agent capabilities continue to advance.
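A minimal sketch of the step-2 quadrant mapping, assuming each role from the step-1 audit carries two 0-to-1 ratings. The roles, ratings, and 0.5 thresholds shown are hypothetical.

```python
# Sketch of the four-quadrant grid from step 2. Roles, ratings, and
# thresholds are hypothetical; real inputs come from the step-1 audit.

def quadrant(knowledge_dependency: float, ai_agent_potential: float) -> str:
    high_k = knowledge_dependency >= 0.5
    high_ai = ai_agent_potential >= 0.5
    if high_k and high_ai:
        return "urgent"
    if high_k:
        return "stable"
    if high_ai:
        return "watch"
    return "no action"

roles = {
    "Financial Analyst": (0.8, 0.9),
    "Plant Safety Lead": (0.7, 0.2),
    "Sales Coordinator": (0.3, 0.8),
    "Facilities Porter": (0.2, 0.1),
}

for role, (k, ai) in roles.items():
    print(f"{role}: {quadrant(k, ai)}")
```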

Contact MorganHR

The organizations that win the compensation strategy race are those that stop measuring what people know and start measuring what people do with tools everyone now shares. Knowledge was a differentiator when it was scarce. Influence is the differentiator now that knowledge is abundant. Contact MorganHR for a compensation framework assessment that maps your current evaluation method against the realities of an AI-augmented workforce. Your pay structures should reflect where your organization is heading, not where the job evaluation industry was seventy years ago.

For a deeper look at how AI is reshaping job families and organizational design, read our guide on Job Architecture in the AI Era.


Frequently Asked Questions

Are Hay and Mercer’s IPE evaluation methods completely obsolete because of AI agents?

Not completely, but they require significant recalibration. The structural frameworks remain useful for organizing role comparisons. However, the factor weightings — particularly Knowledge and Know-How — need updating to reflect how job evaluation AI agents have compressed knowledge-based differentiation across roles.

How many major job evaluation methodologies are affected by AI agents?

Every major system is affected. Hay, Mercer IPE, WTW Global Grading System, Radford McLagan, BLS National Compensation Survey, and the foundational ILO Scheme of Geneva all weigh knowledge as a primary differentiator. The disruption is systemic, not isolated to one or two methods.

How do I adjust compensation benchmarks when AI-first and traditional companies report different role profiles?

Segment your peer group analysis by AI maturity level. Ask survey vendors which participants have adopted AI agents at scale and weight the data accordingly. If your vendor cannot provide this segmentation, your benchmark data carries an increasing risk of misalignment with your actual talent market.
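A minimal sketch of what that segmentation looks like in practice. The salaries and AI-first flags below are invented; real inputs would come from your vendor’s participant list.

```python
# Sketch of segmenting survey cuts by AI maturity before blending.
# Salaries and flags are invented for illustration only.

from statistics import median

cuts = [  # (participant base salary for the benchmark role, AI-first?)
    (168_000, True), (172_000, True), (175_000, True),
    (138_000, False), (142_000, False), (145_000, False),
]

pooled = median(s for s, _ in cuts)
ai_first = median(s for s, f in cuts if f)
traditional = median(s for s, f in cuts if not f)

print(f"pooled {pooled:,.0f} | AI-first {ai_first:,.0f} | traditional {traditional:,.0f}")
# The pooled figure lands between two real markets and represents neither;
# weight toward the segment that matches your own operating model.
```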

Should we pay more for employees who adopt AI agents effectively versus those who maintain traditional workflows?

Yes, when the adoption drives measurable business impact. PwC’s 2025 Global AI Jobs Barometer found that workers with AI skills command a 56 percent wage premium. Growth-oriented employees who expand capabilities through agent management deliver compounding value that maintenance-oriented employees do not.

What evaluation factors should replace knowledge depth as AI levels the playing field?

Prioritize communication effectiveness, cross-functional influence, agent management capability, and demonstrated capacity for continuous skill growth. These factors predict sustained value creation in environments where analytical and knowledge tasks migrate to AI agents.

How does disruption from job evaluation AI agents affect FLSA exemption classifications?

It may actually strengthen outside sales exemptions rather than weaken them. Many reps had drifted toward email, phone prospecting, and digital content delivery — tasks that look more like inside sales activity and could put the outside sales exemption at risk. When AI agents absorb that digital front end, the human rep’s duties concentrate on face-to-face client engagement, relationship building, and in-person closing. That profile aligns more clearly with the Department of Labor’s outside sales duties test than the blended digital-and-field role many organizations have today. Organizations should still review classifications with employment counsel, but AI adoption may resolve existing compliance gaps rather than create new ones (DOL FLSA guidelines).

About the Author: Michelle Henderson

Michelle Henderson’s lifelong love of puzzles and problem solving has been an incredible asset in her role as Compensation Consultant for MorganHR, Inc. Michelle advises clients on market pricing, employee engagement, job analysis and evaluation, and much more.