Season 2 Ep 21 Clement Delangue of Hugging Face on building the GitHub of machine learning
Summary

In season 2, episode 21 of the Dr. Pawd podcast, Clement Delangue, the CEO of Hugging Face, discusses the growth of his company, its machine learning platform, and its vision for the future of machine learning. Hugging Face is a natural language processing (NLP) startup whose platform is used by over 1,000 companies, including big names like Bing, Apple, and Banjo. Delangue sees machine learning as the new way of building technology and believes the platform is becoming the GitHub of machine learning.

The company's revenue comes from users who depend on the platform heavily enough to pay for additional enterprise features or who have specific needs. Hugging Face is used for a variety of use cases, the most significant being search, autocomplete, newsfeed ranking, information extraction, and computer vision. Transformer models started to beat the previous state of the art on every NLP task, which led companies to put them into production. This created a positive cycle that turned NLP into one of the biggest machine learning domains today.

Hugging Face started as a chatbot company but shifted its focus to building a machine learning platform. The company has attracted a lot of interest from open-source contributors and from the teams behind products like Siri and Alexa. Researchers increasingly see transformers as a more general-purpose architecture than earlier neural network models. Transfer learning is the basis of transformer models, and Delangue considers it the most exciting development in machine learning. Multimodal models, such as those combining text with images or audio with text, are becoming more prevalent. Companies can also use the same abstractions for different features and workflows without building entirely different systems.

Hugging Face has seen strong adoption of speech and vision models on the platform, with over 300,000 monthly downloads of speech models and over 200,000 monthly downloads of vision models. The plan is to invest heavily in computer vision, speech, reinforcement learning, biology, and chemistry. Hiring at Hugging Face is not based on specific job positions but on finding smart people who share the company's values and culture. The team is careful about ethics and about building a very open, value-driven organization.

The company aims to scale up the training of large models through the "Big Science" project and has already completed a few small runs with the help of Ijivia and the Jean Zay supercomputer. Hugging Face's commitment to open source and open science is meant to help it build technology faster than others and stay ahead in the fast-moving domain of machine learning. The company's name and logo both come from the hugging face emoji, which reflects its approach of being serious about the work without taking itself too seriously.

Delangue and his team at Hugging Face believe that AI and machine learning have a positive impact on the world by improving search, translation, moderation for social platforms, and other use cases that are yet to be discovered. However, the team also recognizes the ethical challenges posed by AI, including biases in models, the use of personally identifiable information (PII), and the energy consumption of these models. Dr. Margaret Mitchell, one of the most recognized researchers in ML ethics, has joined Hugging Face to work on measuring and analyzing bias in datasets, creating model cards that communicate the limitations and biases of models, and steering machine learning towards positive impacts for humanity and the world.

In summary, Hugging Face's open-source community is a hub for deep learning and has expanded beyond NLP to cover machine learning more generally. The company's popularity stems from its collaboration with the open-source community and the rapid adoption of its platform, and its revenue comes from heavily dependent users who pay for enterprise features or have specific needs. Delangue and his team believe in the positive impact of AI and machine learning on the world, while recognizing the ethical challenges of model bias, the use of personally identifiable information (PII), and energy consumption.