Salesforce Introduces World's First LLM Benchmark for CRM

Salesforce has unveiled the world’s first large language model (LLM) benchmark for customer relationship management (CRM) systems, offering businesses a comprehensive tool to evaluate the rapidly expanding array of generative AI models. This new benchmark provides an evaluation framework that measures LLM performance based on accuracy, cost, speed, and trust and safety, specifically tailored to common sales and service use cases such as prospecting, lead nurturing, and service case summaries.

Salesforce's benchmark includes a public leaderboard to help professionals select the best LLM for their CRM needs. The company plans to update the benchmark continually with new use-case scenarios, refine its evaluation criteria, and soon add fine-tuned LLMs.

“As AI continues to evolve, enterprise leaders are saying it’s important to find the right mix of performance, accuracy, responsibility, and cost to unlock the full potential of generative AI to drive business growth,” said Silvio Savarese, EVP & Chief Scientist at Salesforce AI Research. “Salesforce’s new LLM Benchmark for CRM is a significant step forward in the way businesses assess their AI strategy within the industry. It not only provides clarity on next-generation AI deployment but also can accelerate time to value for CRM-specific use cases. Our commitment is to continuously evolve this benchmark to keep pace with technological advancements, ensuring it remains relevant and valuable.”

Importance of the Benchmark

Existing LLM benchmarks have been primarily academic and consumer-focused, lacking relevance to business applications. They often fail to address crucial aspects like accuracy, speed, cost, and trust, leaving CRM customers without a reliable way to evaluate generative AI-powered CRM solutions. Salesforce's benchmark addresses these gaps by using real-world CRM data and expert human evaluations, offering businesses strategic insights for incorporating generative AI into their CRM systems.

Key Evaluation Metrics

  1. Accuracy: Assessed through factuality, completeness, conciseness, and instruction-following. Accurate models provide valuable results, improving customer experience.
  2. Cost: Categorized as high, medium, or low based on percentiles, allowing businesses to evaluate the cost-effectiveness of different LLMs.
  3. Speed: Measures the responsiveness and efficiency of LLMs in processing and delivering information, enhancing user experience and reducing customer wait times.
  4. Trust and Safety: Evaluates the LLM’s ability to protect sensitive customer data, comply with privacy regulations, and avoid bias and toxicity.
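To make the percentile-based cost tiers concrete, here is a minimal Python sketch. The article does not say which percentiles Salesforce uses, so this example assumes an even three-way split (tertiles). The function name, model names, and cost figures are all hypothetical and are not taken from the benchmark.

```python
from statistics import quantiles


def cost_tiers(cost_by_model):
    """Label each model 'low', 'medium', or 'high' by where its cost
    falls among all models being compared.

    Assumption: tiers are cut at the tertiles (roughly the 33rd and 67th
    percentiles). The benchmark's actual cut points are not stated.
    """
    # quantiles(..., n=3) returns the two cut points that split the
    # sorted costs into three equally sized groups.
    low_cut, high_cut = quantiles(cost_by_model.values(), n=3)

    tiers = {}
    for model, cost in cost_by_model.items():
        if cost <= low_cut:
            tiers[model] = "low"
        elif cost <= high_cut:
            tiers[model] = "medium"
        else:
            tiers[model] = "high"
    return tiers


# Hypothetical costs per request, in arbitrary units.
example_costs = {
    "model_a": 1.0,
    "model_b": 2.0,
    "model_c": 3.0,
    "model_d": 4.0,
    "model_e": 5.0,
    "model_f": 6.0,
}
example_tiers = cost_tiers(example_costs)
```

Because the tiers are relative, a model's label depends on which other models are in the comparison. Adding a much cheaper model to the set can move an existing model from "low" to "medium" without its own cost changing.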

Businesses can use this benchmark to compare LLMs, identify the best solutions, and make informed decisions to enhance customer success and drive business growth. With Salesforce’s Einstein 1 Platform, customers can choose from existing LLMs or bring their own models to meet their unique needs, deploying more effective generative AI solutions.

Clara Shih, CEO of Salesforce AI, emphasized the practical focus of this benchmark: “Business organizations are looking to utilize AI to drive growth, cut costs, and deliver personalized customer experiences, not to plan a kid’s birthday party or summarize Othello. Our customers have been asking for a purpose-built way to evaluate and select from among the proliferation of new AI models, and we are thrilled to introduce the world’s first LLM benchmark for CRM to help them navigate the complex landscape of models. This benchmark is not just a measure; it’s a comprehensive, dynamically evolving framework that empowers companies to make informed decisions, balancing accuracy, cost, speed, and trust.”

For more information, visit Salesforce's website and view the LLM Leaderboard for CRM on Hugging Face.

Disclaimer: The information provided in this article does not constitute an endorsement of any particular LLM; it is for general informational purposes only. Readers should make their own determinations based on their needs. Opinions of the referenced presenters and/or authors are their own and do not necessarily reflect the official position of Salesforce.

 
