Women remain underrepresented in the workplace, partly due to stereotypes associating competence traits with men rather than women. Efforts to change such stereotypes often yield mixed results. As language models become integrated into daily life, AI writing assistants offer an opportunity to shift gender images.
In a preregistered experiment (N = 672), participants evaluated resumes for a female ("Jennifer") and a male ("John") candidate applying to a financial analyst role. They wrote evaluations using AI-generated suggestions in one of three conditions: suggestions for Jennifer integrated stereotypically male, female, or neutral traits. Suggestions for John remained neutral.
We found that participants exposed to stereotypically male AI suggestions evaluated Jennifer as more competent, selected her as the leader more often, and offered her higher salaries. However, we also observed signs of backlash: participants were less willing to work with the competent Jennifer. Our findings suggest that AI writing assistants can serve as scalable stereotype interventions by directly influencing the language people use to describe others, but they also highlight the complex social dynamics of stereotype change.
Despite the increasing representation of women in political, financial, and academic sectors, women continue to face challenges at every stage of their careers. Psychologists have long argued that gender stereotypes — associating women with warmth/communality and men with competence/agency — play a key role in sustaining workplace disparities.
Existing interventions — such as bias training and counter-stereotypical role models — often have limited or inconsistent effects. They operate indirectly, relying on reflection or exposure to change deeply ingrained cognitive associations. AI writing assistants offer a novel, more direct alternative: intervening at the precise moment of language production.
We conducted a preregistered online experiment (N = 672) in which participants reviewed résumés for a male ("John") and a female ("Jennifer") candidate applying to an entry-level financial analyst role. They wrote short evaluations using an AI autocomplete tool powered by GPT-4o.
We prompt-engineered the AI autocomplete suggestions shown for Jennifer into three conditions: (1) counter-stereotypical, integrating stereotypically male (agentic) traits; (2) stereotypical, integrating stereotypically female (communal) traits; and (3) control, using neutral language. Suggestions for John were neutral in all conditions.
We used CoAuthor, an AI writing assistant that provides autocomplete suggestions. The assistant was powered by GPT-4o (temperature = 1). Participants could press TAB to view short phrase completions and were required to view suggestions at least eight times, though they could accept, modify, or ignore any suggestion.
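The sketch below illustrates how such condition-dependent suggestions could be generated with the OpenAI API; the prompt wording, trait lists, and the suggest_continuation helper are illustrative assumptions rather than the exact CoAuthor configuration used in the study.

```python
# Minimal sketch of condition-dependent autocomplete suggestions.
# Trait lists and prompt wording are assumptions, not the study's exact prompts.
from openai import OpenAI

client = OpenAI()

CONDITION_TRAITS = {
    "counter_stereotypical": "agentic traits such as decisive, analytical, and assertive",
    "stereotypical": "communal traits such as warm, supportive, and collaborative",
    "control": "neutral, trait-free language",
}

def suggest_continuation(draft: str, condition: str) -> str:
    """Return a short phrase completion for the participant's draft evaluation."""
    system = (
        "You are an autocomplete assistant for a hiring evaluation. "
        f"Continue the user's sentence in a few words, weaving in {CONDITION_TRAITS[condition]}."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=1,      # matches the reported sampling temperature
        max_tokens=20,      # short phrase completions shown on TAB
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": draft},
        ],
    )
    return response.choices[0].message.content

# Example: a participant drafting Jennifer's evaluation presses TAB.
print(suggest_continuation("Jennifer handled the case study", "counter_stereotypical"))
```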
We collected three levels of outcomes: cognitive (written evaluations, quantified via dictionary-based analysis, cosine similarity gender bias scores, and LLM-as-judge ratings), attitudinal (trait impressions, affiliative decisions), and behavioral (binary hiring choice, salary recommendation).
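As an illustration of the dictionary-based analysis of the written evaluations, the sketch below counts agentic and communal unigrams; the word lists are hypothetical stand-ins for the dictionaries actually used.

```python
# Toy dictionary-based unigram analysis of a written evaluation.
# AGENTIC and COMMUNAL are illustrative word lists, not the study's dictionaries.
import re
from collections import Counter

AGENTIC = {"competent", "analytical", "decisive", "confident", "independent"}
COMMUNAL = {"warm", "friendly", "supportive", "kind", "collaborative"}

def trait_counts(evaluation: str) -> dict:
    """Count agentic vs. communal unigrams in one evaluation text."""
    tokens = re.findall(r"[a-z']+", evaluation.lower())
    counts = Counter(tokens)
    return {
        "agentic": sum(counts[w] for w in AGENTIC),
        "communal": sum(counts[w] for w in COMMUNAL),
    }

print(trait_counts("Jennifer is analytical, decisive, and confident under pressure."))
# {'agentic': 3, 'communal': 0}
```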
Three complementary measures — dictionary-based unigram analysis, cosine similarity gender bias scores, and LLM-as-judge ratings — all confirmed that the writing assistant successfully shifted participants' use of gendered language in line with the intended manipulation.
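A cosine similarity gender bias score can be approximated along the following lines, where an evaluation's embedding is compared against a masculine-minus-feminine direction; the embedding model and anchor phrases here are assumptions, not necessarily those used in the study.

```python
# Illustrative cosine-similarity gender bias score for an evaluation text.
# The embedding model and anchor phrases are assumptions for this sketch.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def gender_bias_score(evaluation: str) -> float:
    """Positive = closer to stereotypically male language; negative = female."""
    male_anchor, female_anchor, text = model.encode(
        [
            "assertive, competent, and analytical",
            "warm, nurturing, and supportive",
            evaluation,
        ]
    )
    direction = male_anchor - female_anchor
    return float(
        np.dot(text, direction)
        / (np.linalg.norm(text) * np.linalg.norm(direction))
    )

print(gender_bias_score("Jennifer is a decisive and confident analyst."))
```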
When the AI writing assistant suggested masculine traits, participants were significantly more likely to describe Jennifer as competent and less likely to describe her as warm, and vice versa for feminine suggestions. Participants were largely unaware of the manipulation, instead praising the writing assistant as "helpful and easy to use."
Counter-stereotypical suggestions improved Jennifer's standing as a trusted leader but simultaneously reduced affiliative judgments, making her appear personally less likeable. This divergence between competence-related and warmth-related impressions is consistent with a backlash-like pattern.
Salary recommendations showed the clearest treatment effect. In control and stereotypical conditions, participants offered Jennifer significantly lower salaries than John. In the counter-stereotypical condition, the salary gap disappeared entirely. Hiring decisions, while showing the expected directional pattern, did not reach statistical significance.
The intervention was more effective for continuous, independent judgments of Jennifer and John (salary) than for zero-sum binary choices (hiring). This is consistent with prior work on the attitude-behavior gap: counter-stereotypical suggestions may help people acknowledge Jennifer's competence and offer her a higher salary without necessarily breaking gender role congruity barriers in direct comparisons.
Our prototype system demonstrates that AI writing assistants can serve as low-cost, scalable stereotype interventions. Unlike traditional approaches that raise awareness or expose people to role models, our intervention directly manipulates the mediator: language. By changing the words people used to evaluate Jennifer, we temporarily altered their cognitive representations — without requiring deliberate reflection or bias suppression.
This approach avoids the reactance and fatigue associated with explicit diversity training. Rather than demanding users monitor or suppress stereotypic thoughts, the system provides subtle shifts in language that operate at the same level of processing as the stereotypes themselves.
A critical finding is the evidence of gender stereotype backlash: counter-stereotypical suggestions increased perceived competence but decreased likeability. This is consistent with role congruity theory — when women violate prescriptive gender norms by displaying agentic traits, they may face social penalties. This double bind means that even successful linguistic interventions can have complex downstream effects on social evaluations.
Our intervention raises questions of transparency, user agency, and appropriate deployment. We discuss three deployment contexts: (1) training programs with full transparency, (2) organizational opt-in where users know the system may counteract stereotypes, and (3) individual use with full transparency and visualization of linguistic habit changes over time. The intervention preserves user agency — all suggestions can be accepted, modified, or ignored at any time.
@inproceedings{liu2026writing,
title = {Writing with {AI} Can Reduce Gender Bias in Hiring Evaluations},
author = {Liu, Alicia T.H. and Lee, Mina and Bai, Xuechunzi},
booktitle = {Proceedings of the 2026 {CHI} Conference on Human Factors in Computing Systems},
series = {{CHI} '26},
year = {2026},
address = {Barcelona, Spain},
publisher = {ACM},
doi = {10.1145/3772318.3791136},
url = {https://doi.org/10.1145/3772318.3791136}
}