Why Underwriting is the Perfect Application for Reinforcement Learning from Human Feedback (RLHF)

Will Ross

February 15, 2024

In this article, Will Ross, CEO and Co-Founder of Federato, argues that AI should be a tool to enhance human capabilities, not replace them, and explains how underwriting organizations stand to benefit from recent advances in Reinforcement Learning from Human Feedback (RLHF) by harnessing the uniquely human insights of their underwriters and teams.

There’s no question that insurance is having its “AI moment.” AI dominated the discussion at InsureTech Connect (ITC) Vegas 2023, the world’s largest gathering of insurance and insurtech professionals. As Kristin Applegate, CIO of Berkley Construction Services, told a standing-room-only crowd at ITC:

“We’re at a tipping point where technology can start to take over parts of the workflow. It’s critical to get experts in the industry on board with technology – people are more likely to support tech if there is something in it for them. We have to make technology more approachable and add legacy knowledge from underwriters who will be retiring to build in that gut feel.”

But why now? What makes the present generation of AI so compelling, when earlier technologies that we also called AI drew less interest and made less of an impact? The answer lies in a type of AI known as reinforcement learning from human feedback, or RLHF, which powers household-name tools like ChatGPT.

RLHF is a machine learning technique that uses human feedback to help ML models learn more efficiently. Reinforcement learning (RL) techniques train software to make decisions that maximize a reward, so that its outcomes improve over time. In contrast to other ML categories like supervised and unsupervised learning, RL can generate new data to learn from by interacting with its environment.

In RLHF, an AI algorithm offers multiple responses to a prompt, and a human picks which response is better, thereby reinforcing patterns that the algorithm learns to follow, improving over time. The novelty of RLHF is that it doesn’t try to produce the perfect response to a prompt; it just makes its best guess – just as a human underwriter does. This type of computing is a fantastic complement to the type of pattern recognition and best-guess prediction-making that underwriters do on a daily basis.
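To make the "human picks the better response" step concrete, here is a minimal sketch of learning from pairwise preferences. It assumes a Bradley-Terry style model in which each response carries a single learned score. Production RLHF systems train a neural reward model and then optimize a policy against it; the lookup table, the function names, and the learning rate below are illustrative assumptions, not a description of any particular product.

```python
import math

def preference_probability(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that a human prefers A over B."""
    return 1.0 / (1.0 + math.exp(score_b - score_a))

def update_scores(scores: dict, preferred: str, rejected: str,
                  learning_rate: float = 0.5) -> dict:
    """One gradient-ascent step on the log-likelihood of the human's choice."""
    p = preference_probability(scores[preferred], scores[rejected])
    step = learning_rate * (1.0 - p)  # large step when the model was "surprised"
    scores[preferred] += step
    scores[rejected] -= step
    return scores

# Hypothetical example: a reviewer repeatedly prefers response_a.
scores = {"response_a": 0.0, "response_b": 0.0}
for _ in range(20):
    update_scores(scores, preferred="response_a", rejected="response_b")

learned = preference_probability(scores["response_a"], scores["response_b"])
print(f"Model now expects response_a to be preferred with p = {learned:.2f}")
```

Each comparison nudges the scores only as far as the choice was unexpected, so the model settles on a best guess rather than chasing a single perfect answer.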

For example, say an underwriter has the goal of winning $10M in new business in Texas with a low hurricane risk, so we show them accounts in Texas with low hurricane scores. Some are winnable and others are not; the underwriter instinctively knows which is which and pursues the submissions that they believe are winnable. If they click on a recommended submission, the algorithm gets positive feedback and learns to show more accounts with similar criteria. If the underwriter clicks it, wins it, and it makes a positive impact on the actual portfolio goals, the algorithm gets even more positive feedback. Over time, the recommendation algorithm learns to present the best of the best: accounts the underwriter might previously have spent hours poring over hundreds of inbound emails to find.
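The escalating feedback in this example (click, then win, then portfolio impact) can be sketched as a graded reward feeding a simple online ranker. This is an illustration only: the reward values, the `preferred_broker` feature, the class names, and the linear least-mean-squares update are all assumptions made for the sketch, not Federato's actual model.

```python
from dataclasses import dataclass

# Hypothetical reward sizes; the only point is that later stages earn more.
REWARD_CLICK = 1.0
REWARD_WIN = 2.0
REWARD_PORTFOLIO_IMPACT = 3.0

@dataclass
class Outcome:
    clicked: bool = False
    won: bool = False
    helped_portfolio: bool = False

def reward(outcome: Outcome) -> float:
    """Accumulate reward stage by stage; a win only counts after a click."""
    total = 0.0
    if outcome.clicked:
        total += REWARD_CLICK
        if outcome.won:
            total += REWARD_WIN
            if outcome.helped_portfolio:
                total += REWARD_PORTFOLIO_IMPACT
    return total

class SubmissionRanker:
    """Linear scorer whose weights are nudged toward the observed reward."""

    def __init__(self, feature_names, learning_rate=0.1):
        self.weights = {name: 0.0 for name in feature_names}
        self.learning_rate = learning_rate

    def score(self, features):
        return sum(w * features.get(n, 0.0) for n, w in self.weights.items())

    def learn(self, features, outcome):
        error = reward(outcome) - self.score(features)
        for name in self.weights:
            self.weights[name] += self.learning_rate * error * features.get(name, 0.0)

    def rank(self, submissions):
        return sorted(submissions, key=lambda s: self.score(s["features"]), reverse=True)

ranker = SubmissionRanker(["in_texas", "low_hurricane_score", "preferred_broker"])
winnable = {"in_texas": 1.0, "low_hurricane_score": 1.0, "preferred_broker": 1.0}
ignored = {"in_texas": 1.0, "low_hurricane_score": 1.0, "preferred_broker": 0.0}

for _ in range(50):
    ranker.learn(winnable, Outcome(clicked=True, won=True, helped_portfolio=True))
    ranker.learn(ignored, Outcome())  # shown but never clicked

ranked = ranker.rank([{"id": "B", "features": ignored},
                      {"id": "A", "features": winnable}])
print([s["id"] for s in ranked])
```

Both accounts match the stated goal (Texas, low hurricane score), yet the ranker learns from the underwriter's behavior that a third signal separates the winnable account from the one that gets ignored.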

Federato’s roots go deep in the realm of reinforcement learning, and RLHF has been critical to our strategy since the company’s inception. My co-founder William Steenbergen and I formed Federato on the idea that AI working together with human underwriters could make an incredible positive impact on an insurance company’s decision-making and its top- and bottom-line growth. William’s graduate research at Stanford’s Human Computer Interaction Group centered around the same algorithms that are at the core of ChatGPT. In Federato’s case, the focus is on using RLHF to empower a group of underwriters to work towards the complex and coordinated end goal of a balanced, growing risk portfolio.

The AI-powered Federato RiskOps Platform allows insurance executives and portfolio analysts to define what a good portfolio looks like versus a bad portfolio based on their unique criteria, and then the platform automatically recommends to underwriters that they should look at certain opportunities to achieve the desired portfolio. And it does all this as part of the underwriter’s core workflow – no need to hop back and forth between tools or spend time in a policy admin system that isn’t core to the underwriter’s workflow.

Frankly, I can’t think of a more perfect use case for RLHF than insurance underwriting. Insurance is a “people business,” and, as my colleague Megan Bock, COO, always says, “one that is built on trust and reliant on human interaction and judgment. Regardless of line of business or size of business, there still needs to be a human touch.”

Underwriting at its core is a process by which a person processes a large volume of disparate types of information and makes a highly educated guess about the value of a risk. Over the course of decades-long careers, great underwriters develop the ability to recognize patterns and learn to look for data points and correlations that are often far from obvious.

But what happens when that highly trained and experienced underwriter leaves the company or retires? How can insurers keep that institutional knowledge and hard-earned pattern recognition from being lost? Can it be harnessed to more rapidly onboard and upskill new underwriting talent? These questions are top of mind for insurance leaders today. Over the next fifteen years, 50% of the current insurance workforce will retire, and the industry faces significant challenges around attracting talent and retooling for a new era of work.

This is where RLHF-based underwriting solutions can shine. We effectively embed the institutional knowledge and expertise of your underwriters as rules, goals, and guidelines within the core underwriting system and workflow, leading to faster, more consistent, and profitable decision-making across the entire organization.

Take, for instance, the process by which an experienced underwriter digs into a submission with follow-up questions. We can present within the submission environment a set of suggested questions generated based on the organization’s underwriting guidelines. The underwriter can either decline to answer or, if it is actually a good question, use that space to populate the information once they get it from the broker. The algorithm receives positive feedback when a suggested question gets answered, and learns to suggest that question for other submissions with similar characteristics. In doing so, the algorithm can help junior underwriters more quickly pick up the experienced underwriter’s understanding of what makes a good follow-up question that can help illuminate a complex submission.
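As a rough sketch of that feedback loop, the snippet below tracks how often each suggested question gets answered for a given type of submission and surfaces the questions with the best track record. The segment label, the sample questions, and the smoothed answer-rate scoring are hypothetical choices made for illustration; a real system would likely use richer submission features than a single segment name.

```python
from collections import defaultdict

class QuestionSuggester:
    """Ranks follow-up questions by how often underwriters answer them."""

    def __init__(self):
        # (segment, question) -> [times answered, times shown]
        self.stats = defaultdict(lambda: [0, 0])

    def record(self, segment: str, question: str, answered: bool) -> None:
        entry = self.stats[(segment, question)]
        if answered:
            entry[0] += 1  # positive feedback: the question was worth asking
        entry[1] += 1

    def usefulness(self, segment: str, question: str) -> float:
        answered, shown = self.stats.get((segment, question), (0, 0))
        # Laplace smoothing: unseen questions start at 0.5 rather than 0.
        return (answered + 1) / (shown + 2)

    def suggest(self, segment: str, candidates: list, k: int = 2) -> list:
        ordered = sorted(candidates,
                         key=lambda q: self.usefulness(segment, q),
                         reverse=True)
        return ordered[:k]

suggester = QuestionSuggester()
questions = ["Roof age?", "Sprinkler coverage?", "Prior losses?"]

# Hypothetical history from an experienced underwriter.
for _ in range(5):
    suggester.record("commercial_property", "Roof age?", answered=True)
    suggester.record("commercial_property", "Sprinkler coverage?", answered=False)

top = suggester.suggest("commercial_property", questions, k=2)
print(top)
```

Questions the experienced underwriter keeps answering rise to the top for similar submissions, untested questions sit in the middle, and questions that are routinely skipped fall away.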

Rather than losing all that legacy knowledge when an experienced underwriter leaves or retires, it is retained as a learned element of the core underwriting system your team interacts with and uses every day to transact business. Instead of averaging underwriting performance, as basic risk scores typically do, we can codify the underwriting prowess of your best underwriters on their best day into prompts and guidelines that the rest of your underwriters can learn from.

MGA Codifies Its Underwriting ‘Secret Sauce’ with RiskOps

The strategic, forward-looking goal to preserve the underwriting knowledge that lives in people’s heads was behind MGA Propeller’s decision to invest in an RLHF-driven underwriting platform:

“We didn’t have a single place for all of our documentation – it was all in our heads or buried in emails, and it wasn’t integrated as part of our workflow. We were stepping on each other’s toes, and as we continued to grow, we realized that if we didn’t fix the problem now, it would snowball into something major later.”

– Konae C. Mignott, CPCU, AFSB, Chief Operating Officer, Propeller

Much of the interest in AI today focuses on its potential to automate away busywork and manual tasks. In insurance, automating low-hanging fruit like the identification of relevant underwriting guidelines for a policy will absolutely yield operational benefits. But as an insurer, attempting to oversimplify the underwriting process or automate away the human creativity and knowledge of your underwriters will ultimately result in negative outcomes. We see an opportunity for AI to serve as a trusted guide that can elevate critical thinking and tap into the innately human qualities and skills that help underwriters perform – deal analysis, underwriting and portfolio strategy, pricing and negotiation, distribution partnerships, and team-building. By focusing on the human dimension of AI, insurance companies can embed the unique institutional knowledge and ‘secret sauce’ that makes them stand out, and make it more consistent and repeatable.
