Can ChatGPT handle HR? Here’s what happened when we put it to the test
A team of HR experts wanted to know if AI would enhance their work—or take their jobs. They tested several iterations of ChatGPT and then built their own version.
In the rapidly evolving landscape of artificial intelligence, businesses are continually faced with the question: Will AI enhance our capabilities or threaten our existence (or both!)? When I first encountered ChatGPT, my reaction was a blend of skepticism and curiosity. The stakes were high—could AI truly match or even surpass the nuanced understanding and problem-solving abilities of a seasoned HR professional?
Driven by a blend of fear and fascination, my team—composed of HR experts with an average of 18+ years of experience—decided to put ChatGPT to the test. Our objective was clear: evaluate whether AI could handle the complex and often sensitive issues inherent in human resources, from legal compliance to employee relations. The implications of this exploration were significant—if AI could indeed match the prowess of an HR expert, it could transform how HR operates moving forward.
PUTTING CHATGPT TO THE TEST
We started this work by conducting a six-week experiment in 2023. We tested three versions of ChatGPT (GPT-3, GPT-3.5, and GPT-4) on four evergreen HR compliance areas: the Fair Labor Standards Act (FLSA), the Family and Medical Leave Act (FMLA), the Americans with Disabilities Act (ADA), and various immigration topics. We designed test scenarios ranging from salary transparency to complex employee termination and leave policies.
Each AI version was evaluated by our HR experts on accuracy, relevance, consistency, brevity, bias, and applicability. The initial findings were both enlightening and mixed: GPT-3, the earliest version in our test, struggled substantially; it lacked the detail and nuance essential in HR, particularly in legally sensitive scenarios. This highlighted a key limitation of AI: the lack of deep understanding and contextual awareness that human HR professionals bring to complex situations.
However, as we tested the subsequent versions, improvements were evident. GPT-4, in particular, demonstrated a significant leap in performance. It handled all 10 questions across our evaluation categories with a marked increase in accuracy and relevance. This progression from GPT-3 to GPT-4 underscored the rapid advancements in AI capabilities, suggesting a promising future where AI could effectively augment human expertise.
BUILDING CUSTOM AI
Encouraged by these advancements, earlier this year we developed a custom version of ChatGPT tailored to our needs. This bespoke AI was designed to support our HR experts in managing the more than 3,000 HR compliance inquiries we receive each week. Queries range from state-specific labor laws to federal regulations on independent contractors and layoff procedures.
During a three-week proof-of-concept (POC) phase, our advisors used the custom ChatGPT in real-time scenarios. They evaluated the AI’s responses using a detailed rubric, and we gathered both quantitative and qualitative feedback from our testers through weekly sentiment surveys.
The results from this phase were surprisingly positive. While there was a slight decrease in efficiency, the quality of responses from our custom ChatGPT improved significantly over the test period. More importantly, employee satisfaction with the AI integration was high, and client feedback on cases handled during the POC was overwhelmingly positive.
KEEPING HUMANS IN HR
These experiments underscored an essential insight: the importance of keeping humans in the loop. The conversation around AI often centers on the fear of job displacement, especially for those in customer-service-focused roles, but our experience showed that AI is best used as an enhancer, not a replacement. Our HR experts were able to use AI to handle routine inquiries, freeing them up to focus on more complex and high-value interactions that require human empathy and judgment.
Our journey with ChatGPT also taught us valuable lessons in change management as well as prompt engineering (the art of designing AI queries to yield the most useful information). In designing the way we interacted with the tool, we learned that how we structure questions matters as much as the questions we ask. For example, we use detailed inputs to steer the interaction and provide specific instructions on how we want the generated outputs to look.
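To make the idea of structured input concrete, here is a minimal sketch in Python of how a question, its supporting detail, and output instructions could be assembled into one prompt. It is an illustration only, not the prompts used in the experiment: the names `PromptSpec` and `build_prompt`, the section labels, and the sample question are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the pieces of a structured HR prompt."""
    question: str
    context: list[str] = field(default_factory=list)
    output_instructions: list[str] = field(default_factory=list)


def build_prompt(spec: PromptSpec) -> str:
    """Assemble one prompt string with clearly labeled sections."""
    lines = ["Question:", spec.question.strip()]
    if spec.context:
        # Detailed inputs that steer the interaction.
        lines.append("")
        lines.append("Context:")
        lines.extend(f"- {item}" for item in spec.context)
    if spec.output_instructions:
        # Specific instructions on how the answer should look.
        lines.append("")
        lines.append("Output format:")
        lines.extend(f"- {item}" for item in spec.output_instructions)
    return "\n".join(lines)


# Illustrative values only; not taken from the experiment.
example = PromptSpec(
    question="Is an hourly employee entitled to overtime pay under the FLSA?",
    context=["Audience: a small-business owner without legal training"],
    output_instructions=[
        "Answer in under 150 words",
        "Note where state law may differ",
    ],
)
prompt_text = build_prompt(example)
print(prompt_text)
```

Keeping the question, the context, and the output instructions in separate fields makes it easy to change one without rewriting the others, which is the point of structuring the input in the first place.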
Additionally, our initial defensive stance toward AI evolved into a proactive strategy focused on organizational readiness rather than just technological adoption. When ChatGPT first came on the scene, many of my team members (myself included!) were nervous it would put us out of work. But what we learned through this process is that AI can be a powerful tool to enhance our work. It provided inspiration, direction, and practical support, freeing us from routine tasks so we could focus on more strategic and creative activities, such as researching more complex questions and spending more time one-on-one with customers.
Our experience has solidified our belief in the transformative potential of AI in enhancing business operations. The future of HR is not about choosing between AI and humans but about integrating both to achieve the best outcomes for our organizations and teams.