
Salesforce Introduces World's First LLM Benchmark for CRM

Salesforce has unveiled the world’s first large language model (LLM) benchmark for customer relationship management (CRM) systems, offering businesses a comprehensive tool to evaluate the rapidly expanding array of generative AI models. This new benchmark provides an evaluation framework that measures LLM performance based on accuracy, cost, speed, and trust and safety, specifically tailored to common sales and service use cases such as prospecting, lead nurturing, and service case summaries.

Salesforce's benchmark includes a public leaderboard to help professionals select the best LLM for their CRM needs. The company plans to update the benchmark continually with new use-case scenarios, refine its evaluation criteria, and soon add fine-tuned LLMs.

“As AI continues to evolve, enterprise leaders are saying it’s important to find the right mix of performance, accuracy, responsibility, and cost to unlock the full potential of generative AI to drive business growth,” said Silvio Savarese, EVP & Chief Scientist at Salesforce AI Research. “Salesforce’s new LLM Benchmark for CRM is a significant step forward in the way businesses assess their AI strategy within the industry. It not only provides clarity on next-generation AI deployment but also can accelerate time to value for CRM-specific use cases. Our commitment is to continuously evolve this benchmark to keep pace with technological advancements, ensuring it remains relevant and valuable.”

Importance of the Benchmark

Existing LLM benchmarks have been primarily academic and consumer-focused, lacking relevance to business applications. They often fail to address crucial aspects like accuracy, speed, cost, and trust, leaving CRM customers without a reliable way to evaluate generative AI-powered CRM solutions. Salesforce's benchmark addresses these gaps by using real-world CRM data and expert human evaluations, offering businesses strategic insights for incorporating generative AI into their CRM systems.

Key Evaluation Metrics

  1. Accuracy: Assessed through factuality, completeness, conciseness, and instruction-following. Accurate models provide valuable results, improving customer experience.
  2. Cost: Categorized as high, medium, or low based on percentiles, allowing businesses to evaluate the cost-effectiveness of different LLMs.
  3. Speed: Measures the responsiveness and efficiency of LLMs in processing and delivering information, enhancing user experience and reducing customer wait times.
  4. Trust and Safety: Evaluates the LLM’s ability to protect sensitive customer data, comply with privacy regulations, and avoid bias and toxicity.
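
To illustrate the percentile-based cost categories described above, here is a minimal Python sketch. The article does not say which percentiles Salesforce uses, so the tercile cut points (roughly the 33rd and 67th percentiles) are an assumption, and the function name, model names, and cost figures are hypothetical.

```python
from statistics import quantiles


def cost_tiers(costs):
    """Label each model 'low', 'medium', or 'high' by its cost percentile."""
    # Assumption: tiers are cut at the terciles; the benchmark's actual
    # percentile thresholds are not stated in the article.
    low_cut, high_cut = quantiles(list(costs.values()), n=3, method="inclusive")
    tiers = {}
    for model, cost in costs.items():
        if cost <= low_cut:
            tiers[model] = "low"
        elif cost <= high_cut:
            tiers[model] = "medium"
        else:
            tiers[model] = "high"
    return tiers


# Hypothetical per-request costs in arbitrary units; not real benchmark data.
example_costs = {
    "model_a": 1.0, "model_b": 2.0, "model_c": 4.0,
    "model_d": 8.0, "model_e": 16.0, "model_f": 32.0,
}
print(cost_tiers(example_costs))
```

Because the tiers are relative to the set of models being compared, a model's category can change as new models join the leaderboard.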

Businesses can use this benchmark to compare LLMs, identify the best solutions, and make informed decisions to enhance customer success and drive business growth. With Salesforce’s Einstein 1 Platform, customers can choose from existing LLMs or bring their own models to meet their unique needs, deploying more effective generative AI solutions.

Clara Shih, CEO of Salesforce AI, emphasized the practical focus of this benchmark: “Business organizations are looking to utilize AI to drive growth, cut costs, and deliver personalized customer experiences, not to plan a kid’s birthday party or summarize Othello. Our customers have been asking for a purpose-built way to evaluate and select from among the proliferation of new AI models, and we are thrilled to introduce the world’s first LLM benchmark for CRM to help them navigate the complex landscape of models. This benchmark is not just a measure; it’s a comprehensive, dynamically evolving framework that empowers companies to make informed decisions, balancing accuracy, cost, speed, and trust.”

For more information, visit Salesforce's website and view the LLM Leaderboard for CRM on Hugging Face.

Disclaimer: The information provided in this article does not constitute an endorsement of any particular LLM; it is for general informational purposes only. Readers should make their own determinations based on their needs. Opinions of the referenced presenters and/or authors are their own and do not necessarily reflect the official position of Salesforce.

 
