Mistral's AI Coder, OpenAI & Vox Media, Perplexity Pages, and Scale Introduces Leaderboard (2024)

Hello and welcome! Shouldn't we discuss AI?

Of course, we should. The race for technology trends continues to gain momentum, and no one wants to fall behind (except Google; what's wrong with you guys?). So today, we will talk about the future: the biggest AI plans, deals, and releases. Yes, as always.

And speaking of Google: have you seen how AI Overview is ruining Google Search? If not, we have a post just for you:

Google's AI Search is a Disaster (Anvar Abrarov, May 28)

Now, let's get to the news.

This Creators’ AI Edition:

  1. Featured Materials 🎟️

  2. News of the week 🌍

  3. Useful tools ⚒️

  4. Weekly Guides 📕

  5. AI Meme of the Week 🤡

  6. AI Tweet of the Week 🐦

  7. (Bonus) Materials 🎁

Mistral Releases Its First AI Model for Code

Remember Devin? He's the first AI software engineer introduced by Cognition a couple of months ago. We wrote about it in this post. Well, Devin has a competitor. Its name is Codestral, developed by Mistral, a startup recently valued at $6 billion.

Codestral promises to help developers write and interact with code.


It has 22B parameters and has been trained on over 80 programming languages, including Python, Java, C++, JavaScript, and many others. Codestral writes functions and tests, “completes” partial code, and answers questions about a codebase. The company says that working with the model will help developers improve their skills and reduce the risk of bugs.

Codestral’s Code Generation Performance

Mistral has already compared its new model with other code-generation models. And (who would have thought) Codestral came out as the best in its class. The company used four benchmarks: HumanEval pass@1 and MBPP sanitized pass@1 to evaluate Python code generation, CruxEval to evaluate Python output prediction, and RepoBench EM to evaluate long-range, repository-level code completion.
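For anyone wondering what pass@1 actually measures: the model samples n solutions per problem, c of them pass the unit tests, and pass@k is the chance that at least one of k samples passes. Below is a minimal Python sketch of the commonly used unbiased estimator from the HumanEval paper. Whether Mistral computed its scores exactly this way is an assumption on our part, and the function names and sample counts are made up for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of solutions sampled for the problem
    c: number of those solutions that passed the unit tests
    k: how many samples we pretend to draw
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # If there are fewer than k failing samples, any draw of k contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over a benchmark; results holds (n, c) per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical numbers: four problems, one sample each, two of them solved.
    print(benchmark_pass_at_k([(1, 1), (1, 0), (1, 1), (1, 0)], k=1))
```

With k = 1 the formula collapses to c / n, so pass@1 is simply the share of first attempts that pass the tests, averaged over all problems in the benchmark.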


Codestral is a 22B open-weight model licensed under the new Mistral AI Non-Production License, which means that you can use it for research and testing purposes. You can try Codestral out right now by downloading it on Hugging Face.

Sure, it's great that Codestral topped its class. However, these benchmarks don't mean much yet. If Mistral wants its model to replace humans or greatly simplify their work, we need to compare its results with those of real engineers.

In any case, it won't be long before we see independent comparisons.

Quick Blitz on OpenAI News

OpenAI continues to inundate us with updates of various sizes, so let's start with a quick blitz (and come back to the startup again a little later):

  • Custom GPTs, data analytics, vision, and memory features are now open to free ChatGPT users. One of the best chatbots just got even more convenient.

  • OpenAI forms a safety and security committee. The new committee will make recommendations on critical safety and security decisions for all projects.

  • The company introduced "OpenAI for Nonprofits". This new initiative will make AI tools more affordable for nonprofit organizations, including discounts on ChatGPT Team and Enterprise.

  • OpenAI announced ChatGPT Edu, a new offering that helps universities and colleges deploy AI responsibly on campus.

  • OpenAI is rebooting its robotics team. According to Forbes, with investment in AI-powered robotics heating up, OpenAI is formally relaunching its previously abandoned robotics team.

Hm, maybe we should compile all of OpenAI's updates over the past months in one post, so we can track the company's progress? What do you think?

Sharing is caring! Refer someone who recently started a learning journey in AI. Make them more productive and earn rewards!

Scale AI Introduces Its New Leaderboard

Scale AI published its first leaderboards, ranking AI model performance in specific domains. It’s a new ranking system for advanced LLMs based on private, curated, and unused datasets that attempt to evaluate their capabilities in common use cases. Specifically, the leaderboards cover coding, instruction following, math, and multilingual ability.

It's always interesting to look at the numbers and compare model performance (especially when your favorite is in the running; then it's almost like a sport). But I'll remind you that dry tests don't always reflect real-world benefits. And, for now, I'd treat the new leaderboards from Scale AI the same way.

Palantir Wins $480M AI Computer Vision Deal With US Army

The Pentagon announced that the Army awarded Palantir a $480 million deal for its Maven Smart System prototype. The U.S. Army wants to use AI tools such as Maven to implement its Combined Joint All-Domain Command and Control (CJADC2) warfighting construct, which aims to better connect the platforms, sensors, and data streams of the U.S. military and key international partners in a more collaborative single network.

Defense Department officials also intend to use AI to help commanders make faster and more effective decisions and improve operational efficiency and effectiveness.

Well, that sounds pretty serious. The military always gets the most advanced technology, so integration of AI into its operations was only a matter of time.

PwC Will Become OpenAI’s Largest ChatGPT Enterprise Customer

Yes, let's talk a little more about OpenAI. This time, the company's partner is PricewaterhouseCoopers, a Big Four accounting firm. PwC said it will deploy ChatGPT Enterprise, a version of ChatGPT designed for large companies, to its 75,000 employees in the U.S. and 26,000 employees in the U.K., totaling more than 100,000 licenses for the AI product.

The partners declined to disclose the financial terms of the deal.

I've been working a lot with analytics lately, so I've been actively following PwC, Deloitte, EY, and KPMG. And you know what's funny? Every second report (and sometimes the first one, too) is about generative AI at work, in business, in the economy, or somewhere else. And now AI is going to help analysts tell us about AI. Okay.

Perplexity AI’s New Feature Turns Your Searches Into Shareable Pages

A small but very nice update for my favorite chatbot. The new Perplexity Pages feature lets us create reports, articles, or guides in a visually appealing format. You can find all the necessary tools in the library section. By the way, before generating the text, you can also choose the target audience so the content fits it. When the page is ready, it can be shared via a link or even found through Google Search.


I could go on endlessly about how much I like Perplexity, but I won't take up your time and will only recommend trying it. The new feature seems useful for both work and personal matters.

Vox Media and The Atlantic Sign Content Deals with OpenAI

And once again. First, OpenAI gained access to Reddit's and the Financial Times' data. Next, the startup partnered with News Corp. And now OpenAI has signed new deals with Vox Media and The Atlantic. The company will soon start using licensed content from these media companies to train GPT and to share it with its users.

As a writer and editor, the first thing I want to do when I see another news story about OpenAI's partnership with a media holding is to send Sam Altman the “It's time to stop” meme. Because... Colleagues, we still need work, don't we? But seriously, ChatGPT has a growing opportunity to become the most comprehensive and powerful information resource in the world.

And that's a very strong competitive advantage.

Share this post with friends, especially those interested in AI stories!

ResumeDive - A resume-boosting service using AI

OH, a potato! – AI-powered zero-waste meal planner

AI Notebook App – AI-powered second brain

Tonic Textual – Get your unstructured data AI-ready in minutes

VIVA – AI-powered creative visual design platform. Viva is a good option if you want to try generating images and videos with AI. It's completely free and lets you create stuff through a website or their Discord server. You can describe what you want in words, and they'll turn it into an image or video. Plus, you can even edit the images afterward to fine-tune them. Remember, it's a new tool, so the results might be a bit wonky sometimes, but it's a new, fun way to play around with AI art.


Weekly Guides 📕

Make Videos 3X Faster With AI (Anvar Abrarov, May 30)

How to Start Affiliate Marketing Using AI - This Makes Me $30,000/Month

Create Consistent Characters Using Leonardo AI | Mastering Leonardo AI

How To Create AI News Channel & Make Money Online 2024

How to Use Elicit AI, Literature Reviews + More: Beginner Tutorial and Research Tips!

10 Powerful Shortcuts with ChatGPT-4o


AI Tweet Of The Week

Let's add some seriousness to our Tweet of the Week column. Yesterday, YC co-founder Paul Graham criticized rumors that Sam Altman was forced to step down as president of Y Combinator due to a potential conflict of interest. He also responded to several replies on the topic below the post. You can read them here.


Chatbots Are Entering Their Stone Age

2024 Is the Year of the Generative AI Election

Satya Nadella on Microsoft’s New CoPilot+ and the Future of AI PCs

Gen AI at work has surged 66% in the U.K., but bosses aren’t behind it

What We Learned from a Year of Building with LLMs (Part I)

How was this week in AI? Share your content and ideas in the comments to this post so we can discuss or include them in the next edition!


Author: The Hon. Margery Christiansen
