Glossary
- Adjusted Pay Gap
The difference in pay between groups after statistically controlling for variables that influence compensation such as job title, seniority, performance ratings, education, geography, and experience. This metric isolates pay disparities potentially caused by bias or discrimination rather than structural factors.
- Algorithmic Pay
The use of algorithms or automated systems to determine or influence compensation decisions, typically through data-driven analysis and calculations intended to make pay outcomes more consistent and objective.
- At-Risk Compensation
Compensation components that are paid only if performance conditions are met, such as bonuses, commissions, or equity-based rewards that depend on achieving specific targets or goals.
- Bias
A systematic error in a machine learning model that causes certain patterns or groups to be consistently misrepresented or favored in predictions. In pay-equity contexts, algorithmic bias can arise when models are trained on historical compensation data that reflect prior imbalances, leading to the reinforcement of existing disparities.
- Bonus Gap
The difference in bonus payments awarded to employees of different groups, typically reported separately from base pay. This measure highlights disparities in additional compensation elements such as bonuses or incentive pay.
- Broad Base Compensation
Broad base compensation is a compensation strategy designed to deliver consistent, market-aligned pay across diverse roles and employee groups within an organization. It aims to support internal alignment, transparency, and recognition of employee contributions at all levels.
- Classification
A supervised learning task where a model assigns items into predefined categories. For example, classifying employees as "high-potential" vs. "low-potential" based on past performance data.
- Clustering
An unsupervised technique that groups similar data points together. HR might use clustering to segment employees by engagement survey responses or performance metrics without pre-assigned labels.
- Compa-Ratio
Compa-ratio is a measure used to compare an employee's current salary to the midpoint or target salary within their assigned pay range, expressed as a ratio or percentage. It helps assess how an employee's pay aligns with internal salary structures.
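As a minimal sketch of the calculation described above, the snippet below divides a salary by the midpoint of its pay range. The function name `compa_ratio` and the salary figures are hypothetical, chosen only for illustration.

```python
def compa_ratio(salary: float, range_midpoint: float) -> float:
    """Return salary divided by the midpoint of the assigned pay range."""
    if range_midpoint <= 0:
        raise ValueError("range_midpoint must be positive")
    return salary / range_midpoint


# Hypothetical employee paid 54,000 in a range whose midpoint is 60,000.
ratio = compa_ratio(54_000, 60_000)
print(f"{ratio:.2f}")  # 0.90
```

A value below 1.0 means the employee is paid below the midpoint of the range, and a value above 1.0 means above it.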
- Comparable Worth
A principle stating that different jobs with equivalent value to the organization (measured in skills, effort, responsibility, and working conditions) should be compensated similarly, even if the job duties differ significantly.
- Control Variable
An attribute held constant in statistical models to isolate the effect of other factors. In equal-pay regression, common controls include job level, location, and tenure to see if gender still predicts salary differences.
- Cost of Labor
Cost of labor refers to the expenses associated with employing workers, including wages, benefits, taxes, and other direct and indirect costs related to labor.
- Cross-Validation
A method for evaluating model performance by splitting data into multiple training/test folds. It helps ensure that a compensation-prediction model generalizes well beyond the original dataset.
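The fold-splitting idea can be sketched in a few lines. The helper `k_fold_indices` below is a hypothetical illustration, not a library function. It assumes the rows have already been shuffled and simply cuts them into contiguous folds.

```python
def k_fold_indices(n_rows: int, k: int):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


# Ten hypothetical employee records split into three folds.
folds = list(k_fold_indices(n_rows=10, k=3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each record appears in exactly one test fold, so every row is used for evaluation once and for training in the remaining rounds.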
- Data-centric AI
An approach to AI & ML development that focuses on improving the quality, consistency, and labeling of the data used to train models. According to Andrew Ng, this method emphasizes systematically engineering better datasets to enhance model performance, even when using standard algorithms.
- Deep Learning
A subset of machine learning using multi-layer neural networks to learn complex patterns. In HR, deep learning can power everything from resume-screening tools that parse unstructured text to pay recommendations based on market and internal data.
- Equal Pay
An area of law in many countries (e.g., the U.S., the UK, Australia, and EU member states) requiring that men and women in the same employment who perform equal work receive equal pay, unless the difference can be justified.
- Equal Pay Audit
A systematic analysis conducted by organizations to assess compensation practices, identify unjustifiable pay disparities, and develop corrective actions to ensure compliance with pay equity laws or standards.
- Feature
An input variable used by an algorithm to make predictions—e.g., years of experience, education level, performance rating.
- Feature Importance
A metric indicating how much each feature contributes to a model's predictions. Helps compensation analysts understand which factors (e.g., department, tenure) drive pay disparities.
- Gender Pay Reporting
Disclosure of aggregated pay data by gender, either voluntary or mandated by governments or regulatory bodies. Examples include the annual reporting mandated by the UK's Gender Pay Gap regulations and reporting to Australia's Workplace Gender Equality Agency (WGEA).
- Intersectional Pay Gap
Disparities in pay observed when accounting for overlapping identities such as gender, race, ethnicity, sexual orientation, disability, or age. Intersectional analyses highlight how multiple dimensions of identity compound disadvantages in pay.
- Label
The "ground truth" outcome a model learns to predict—e.g., actual salary or bonus awarded.
- Model
A mathematical representation trained on data to forecast outcomes. Examples include linear regression models for pay prediction or decision-tree models for turnover risk.
- Model-centric AI
An approach to AI development that prioritizes improving model architectures, hyperparameters, and training techniques. As described by Andrew Ng, this traditional paradigm focuses on refining algorithms while holding the dataset mostly fixed.
- Occupational Segregation
The uneven distribution of demographic groups across occupations or job levels. It is often a significant driver of unadjusted pay gaps because certain groups are systematically clustered in lower-paid roles.
- Overfitting
When a model learns noise in the training data instead of the underlying pattern, performing poorly on new data. In compensation analytics, an overfit model might "explain" pay gaps by trivial quirks rather than true drivers.
- P-Value
In hypothesis testing, the probability of observing an effect at least as extreme as the one in your data if the null hypothesis were true. A low p-value (e.g., < 0.05) suggests a statistically significant pay disparity after controls.
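One way to make the definition concrete is an exact permutation test, sketched below on a small hypothetical dataset. This is a simplification: it compares raw group means with no controls, whereas a real audit would typically test a coefficient in a regression. The function name `permutation_p_value` and the salary figures are assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of relabelling n_a of the pooled salaries as group A.
    for idx in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in idx)
        diff = abs(a_sum / n_a - (total - a_sum) / n_b)
        if diff >= observed - 1e-9:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical salaries in thousands.
men = [62, 65, 68, 70]
women = [55, 58, 60, 61]
p = permutation_p_value(men, women)
print(f"{p:.4f}")  # 0.0286
```

Under the null hypothesis that group labels are unrelated to pay, only 2 of the 70 possible relabellings produce a gap at least as large as the observed one, which gives the p-value of about 0.029.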
- Pay Equity or Equal Pay
A comparison of what women are paid with what their direct male peers are paid, statistically adjusted for factors such as job, seniority, and geography. Often discussed in the context of "equal pay for equal work."
- Pay Gap or "Median Pay Gap"
The difference between the median earnings of women and men working full-time, expressed as a percentage of men's median pay. This unadjusted figure does not account for differences in roles, experience, or other influencing factors. For instance, in the U.S., women earned approximately 85¢ for every dollar earned by men in 2024 according to OECD standards.
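The calculation in this definition can be sketched as follows. The function name `median_pay_gap_pct` and the salary lists are hypothetical.

```python
from statistics import median


def median_pay_gap_pct(men_pay, women_pay):
    """Median gap as a percentage of men's median pay."""
    men_median = median(men_pay)
    women_median = median(women_pay)
    return (men_median - women_median) / men_median * 100


# Hypothetical full-time annual salaries.
men = [48_000, 52_000, 60_000, 75_000, 90_000]
women = [45_000, 50_000, 54_000, 66_000, 80_000]
gap = median_pay_gap_pct(men, women)
print(f"{gap:.1f}%")  # 10.0%
```

Here the medians are 60,000 and 54,000, so women's median pay is 90% of men's and the gap is 10%.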
- Pay Gap Quartiles
A reporting method that divides an organization's workforce into four equal-sized groups ranked by pay. Quartile analyses help identify whether certain demographics (e.g., women) are disproportionately represented in lower-paid quartiles.
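A minimal sketch of a quartile analysis, assuming each employee is recorded as a `(pay, group)` tuple and the headcount divides evenly enough that every quartile is non-empty. The helper `quartile_shares` and the data are hypothetical.

```python
def quartile_shares(employees, group="F"):
    """Share of `group` in each pay quartile, lowest-paid quartile first."""
    if len(employees) < 4:
        raise ValueError("need at least four employees")
    ranked = sorted(employees, key=lambda e: e[0])
    n = len(ranked)
    shares = []
    for q in range(4):
        # Integer slicing keeps the four bands as equal in size as possible.
        band = ranked[q * n // 4:(q + 1) * n // 4]
        members = sum(1 for _, g in band if g == group)
        shares.append(members / len(band))
    return shares


# Hypothetical workforce: (pay in thousands, gender).
staff = [(30, "F"), (32, "F"), (35, "F"), (38, "M"),
         (42, "F"), (47, "M"), (55, "M"), (70, "M")]
shares = quartile_shares(staff)
print(shares)  # [1.0, 0.5, 0.5, 0.0]
```

In this example women make up the entire lowest quartile and none of the highest, which is the kind of pattern a quartile report is meant to surface.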
- Pay Parity
The condition in which employees performing similar work receive equivalent compensation, typically measured after adjustments for role, level, geography, and relevant experience. Pay parity is often an organizational goal in achieving true pay equity.
- Pay Transparency
Practices or policies aimed at openly sharing salary information within organizations and promoting accountability. This may include disclosing salary bands, providing salary ranges in job postings, or internally publishing pay scales.
- R-Squared
The proportion of variance in the dependent variable explained by the model. In pay models, a higher R² means your chosen factors account for more of the salary variation.
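As a short illustration, R² can be computed as one minus the ratio of residual variation to total variation. The function name `r_squared` and the numbers below are hypothetical.

```python
def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical salaries (in thousands) and a pay model's predictions.
actual = [50, 60, 70, 80]
predicted = [52, 58, 71, 79]
r2 = r_squared(actual, predicted)
print(f"{r2:.2f}")  # 0.98
```

The residual sum of squares is 10 and the total sum of squares is 500, so the model accounts for 98% of the salary variation in this toy dataset.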
- Regression
A statistical method estimating the relationship between a dependent variable (e.g., salary) and one or more independent variables (e.g., job level, performance score). Core to adjusted pay-gap analysis.
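The sketch below shows how regression supports an adjusted pay-gap estimate: salary is regressed on job level plus a gender indicator, and the gender coefficient is read as the gap after controlling for level. The `ols` helper is written here for illustration only (it solves the normal equations directly), the data are hypothetical, and a real analysis would normally use a statistics package, more controls, and often log salary.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical records: (job_level, is_female, salary in thousands).
records = [(1, 1, 48), (1, 1, 48), (1, 0, 50), (2, 1, 58),
           (2, 0, 60), (3, 0, 70), (3, 0, 70), (3, 1, 68)]

X = [[1, level, female] for level, female, _ in records]  # 1 = intercept
y = [salary for _, _, salary in records]
intercept, level_coef, female_coef = ols(X, y)

men = [s for _, f, s in records if f == 0]
women = [s for _, f, s in records if f == 1]
raw_gap = sum(men) / len(men) - sum(women) / len(women)

print(f"raw gap: {raw_gap:.1f}, adjusted gap: {-female_coef:.1f}")
# raw gap: 7.0, adjusted gap: 2.0
```

In this toy data the raw difference in mean pay is 7 (thousand), but most of it comes from women being concentrated in lower job levels. Once level is held constant, the remaining difference is 2.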
- Simpson's Paradox
A phenomenon in which a trend that holds within multiple subgroups of data reverses when those groups are combined into an aggregate. In a pay-equity audit, failing to stratify by department (or other key controls) can mask true within-group equity—or falsely suggest a disparity—if Simpson's Paradox is at play.
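A small worked example may make the reversal easier to see. The departments, headcounts, and salaries below are hypothetical, constructed so that women earn more than men within each department while men earn more in the aggregate.

```python
from statistics import mean

# Hypothetical salaries (in thousands) keyed by (department, gender).
pay = {
    ("Sales", "F"): [52, 52, 52, 52, 52],
    ("Sales", "M"): [50],
    ("Engineering", "F"): [102],
    ("Engineering", "M"): [100, 100, 100, 100, 100],
}


def gap(men, women):
    """Mean pay of men minus mean pay of women."""
    return mean(men) - mean(women)


# Gap within each department.
within = {dept: gap(pay[(dept, "M")], pay[(dept, "F")])
          for dept in ("Sales", "Engineering")}

# Gap after pooling both departments.
all_men = pay[("Sales", "M")] + pay[("Engineering", "M")]
all_women = pay[("Sales", "F")] + pay[("Engineering", "F")]
overall = gap(all_men, all_women)

print(within)             # {'Sales': -2, 'Engineering': -2}
print(f"{overall:.1f}")   # 31.3
```

Within each department women are paid 2 more than men, yet the pooled figures show men paid about 31 more, because most women in this example work in the lower-paid department.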
- Supervised Learning
ML techniques that train models on labeled data (features + known outcomes). Used for tasks like predicting which employees are likely to leave or what bonus level to assign.
- Test Data
A subset of your dataset held back from training to evaluate model accuracy. Ensures your compensation-prediction model works on unseen data.
- Training Data
The dataset on which a model is trained. In pay modeling, this would include historical salary and associated features.
- Unadjusted Pay Gap
The difference in pay between groups before statistically controlling for variables that influence compensation, such as job title, seniority, performance ratings, education, geography, and experience. Because it applies no controls, this metric reflects structural factors (such as occupational segregation) together with any disparities potentially caused by bias or discrimination; it does not isolate either.
- Unsupervised Learning
ML methods that infer structure from unlabeled data, like clustering. Useful for exploratory analyses, such as discovering natural employee cohorts.
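- Weights
Numeric coefficients that a model learns for its features, indicating their impact on the prediction. In a regression of salary, a larger weight on "years of experience" means each additional year shifts the pay estimate more; weights are directly comparable across features only when the features are measured on similar scales.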