FOD#17: The Week of Impressive Numbers and the GPU-Rich vs. GPU-Poor Narrative

Today’s edition is largely about numbers. After the Q2 earnings results and some new, impressive funding rounds, the question that percolates is: where are we in the AI cycle? AI companies, unfazed, continue to deliver increasingly sophisticated versions and work on research that propels the industry forward.

In terms of numbers, the standout performers of the past week are Hugging Face and Nvidia.

Let’s dive in and see what the week has brought us.

Now, on to last week's news.

Some of the articles might be behind a paywall. If you are a paid subscriber, let us know and we will send you a PDF.

The Week of Impressive Numbers and the GPU-Rich vs. GPU-Poor Narrative

Nvidia's Week of Triumph: More Than Just Silicon

$13.5 B in revenue with a projected $16 B in the next quarter: Last week Nvidia wasn't merely riding high on a wave; it was shaping the tide itself. Its Q2 earnings revealed a stratospheric $13.5 billion in revenue, mainly driven by AI chip sales. What's staggering is the seismic shift this implies: over 50% of the company's revenue now stems from data center/AI sectors, dwarfing its erstwhile gaming focus.

Heavyweight AI investor: Nvidia's savvy isn't confined to its balance sheets. It has also emerged as a venture capitalist, participating in some of this year's largest AI rounds for companies like Inflection AI ($1.3B Series B), Adept ($350M Series B), Cohere ($270M Series C), CoreWeave ($221M Series B), and, recently, Hugging Face's Salesforce Ventures-led Series D round ($235M). These investments mark Nvidia not just as a hardware juggernaut but as an axis in the AI cosmos.

Strong moat: The company’s technological moats further its stranglehold. Proprietary software ecosystems and specialized GPU orchestration set it apart from competitors. It's worth noting that Nvidia has a monopolistic manufacturing arrangement with TSMC, further securing its position.

Strategic Partnerships: Amidst this, a notable development is Nvidia's partnership with VMware. This alliance, aimed at extending Nvidia's portfolio to cloud-based AI services, adds another dimension to its competitive edge. It's a proactive measure against Big Tech’s upcoming in-house AI chips. Also noteworthy is Nvidia's collaboration with Stability AI to improve the speed and efficiency of Stable Diffusion XL by incorporating NVIDIA TensorRT.

In an exhaustive but helpful list of achievements, Nvidia names all of its Q2 partnerships and the objectives behind them.

Nevertheless, questions loom. Will the demand for GPUs justify the hyped-up stock valuations? Nvidia's CEO hints at GPUs' potential beyond AI/ML, but will companies shift from existing CPU ecosystems? Meanwhile, the competitor IBM is working on an analog chip that mimics the human brain and is up to 14 times more efficient than leading GPUs.

Hugging Face on the Spree of Good News but…

Fresh off a $235 million Series D round that skyrocketed its valuation to $4.5 billion (we just published a comprehensive profile of Hugging Face), the company released a series of compelling updates.

AutoGPTQ, a new library that allows quantization of large language models, reducing computational demands without sacrificing much in terms of accuracy. The GPTQ algorithm, part of the AutoGPTQ library, enables models to operate at 8, 4, 3, or 2-bit precision with minimal accuracy degradation. The feature is supported on both Nvidia and AMD GPUs. Quantization aids in democratizing access to advanced NLP models by making them more resource-efficient →read more 
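To make the idea of lower-bit precision concrete, here is a minimal sketch of plain round-to-nearest quantization in Python. It is not the GPTQ algorithm and does not use the AutoGPTQ library; the function names and the sample weights are hypothetical, chosen only to show how reconstruction error tends to grow as the bit width shrinks from 8 to 2.

```python
# Illustrative only: naive symmetric round-to-nearest quantization.
# This is NOT GPTQ, which uses a more careful, error-compensating procedure.

def quantize(weights, bits):
    """Map floats to signed integers in [-qmax, qmax] with one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 127 for 8-bit, 7 for 4-bit, 1 for 2-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]

def mean_abs_error(weights, bits):
    """Average absolute difference between original and restored weights."""
    q, scale = quantize(weights, bits)
    restored = dequantize(q, scale)
    return sum(abs(a - b) for a, b in zip(weights, restored)) / len(weights)

# Hypothetical weights, made up for the example.
weights = [0.12, -0.53, 0.98, -0.07, 0.31, -0.88, 0.45, 0.02]

for bits in (8, 4, 3, 2):
    print(f"{bits}-bit mean abs error: {mean_abs_error(weights, bits):.4f}")
```

The integers take far less memory than 16- or 32-bit floats, which is where the resource savings come from; methods like GPTQ exist to keep the accuracy loss smaller than this naive rounding would.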

IDEFICS, an 80 billion-parameter open-access visual language model. An open reproduction of DeepMind's unreleased Flamingo, this model is trained on everything from Wikipedia to a gargantuan 115B token dataset known as OBELICS. Not just a showcase of raw power, IDEFICS has undergone ethical evaluation through red teaming pre-release →read more

Last but far from least, meet SafeCoder. Built for the enterprise, this code assistant aims to alleviate costly coding errors and enhance productivity. It's more than just a fancy autocomplete; it’s a tangible step towards mitigating risks in code deployment →read more. Also very important: Carbon Footprint →

But! The internet would not be 'internet' if there were no one to throw a bone and introduce a new term for feverish discussions. SemiAnalysis called companies like Hugging Face and Databricks 'GPU-poor,' in comparison with Google and OpenAI, which are 'GPU-rich.'

The article argues that Google's historical missteps in the language model race, epitomized by the transient glory of its MEENA model, illustrate its underutilization of existing assets and talent like Noam Shazeer (a co-author of the famous paper ‘Attention Is All You Need’). In their opinion, despite past fumbles, Google is now ramping up, with the potential to outpace even OpenAI's GPT-4 in terms of FLOPS. They also think that ‘GPU-Poor’ companies are struggling to innovate meaningfully. The article poses the question of whether Google, with its new focus and massive resources, can break Nvidia's stronghold and democratize access to high-end AI computation.

AI Twitter resented the statement, while Hugging Face’s CTO noted: “I for one am unapologetically GPU-middle-class.”

News from The Usual Suspects

Cool Meta keeps being open-sourced

Meta has unveiled Code Llama, a suite of large language models specialized for coding tasks. A fine-tuned variant of the previously released Llama 2, Code Llama is available in three sizes—7B, 13B, and 34B parameters—and two specialized versions for Python and instruction-based coding. The models display robust performance in coding benchmarks, notably closing the efficiency gap with larger models like GPT-4. Unlike Llama 2, Code Llama offers no 70B parameter version, possibly due to scaling laws and computational constraints. The release is in line with Meta's open-source approach, augmenting the democratization of AI and serving as a potential game-changer in the developer ecosystem. With AI stalwarts like Yann LeCun, Meta is positioned as a key player in the evolving landscape of generative AI →read more / request access to download the model / GitHub / research paper

Additionally, for those interested in multimodal AI, Meta has also recently released SeamlessM4T, a comprehensive model for translating and transcribing speech and text across nearly 100 languages →read more

Meta also sets a flow of rumors:

OpenAI is also keeping up

ChatGPT Enterprise

Scale AI: Aligning with its business-centric strategy, OpenAI partners with Scale AI to allow fine-tuning of its GPT-3.5 model using custom data, providing added expertise and services →read more

Fine-Tuning GPT-3.5 Turbo: The introduction of fine-tuning capabilities in GPT-3.5 Turbo enables developers to create models that match GPT-4 on certain narrow use cases. Training costs $0.008 per 1,000 tokens →read more
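For a rough feel of what that training price means, here is a back-of-the-envelope sketch in Python. Only the $0.008 per 1,000 tokens rate comes from the text above; the assumption that billed tokens equal the tokens in the training file multiplied by the number of epochs, the function name, and the example file size are all ours, so check OpenAI's pricing page before budgeting.

```python
from decimal import Decimal

# Rate quoted above, in USD per 1,000 training tokens.
TRAINING_PRICE_PER_1K_TOKENS = Decimal("0.008")

def estimate_training_cost(tokens_in_file: int, epochs: int) -> Decimal:
    """Estimate training cost in USD.

    Assumption (not from the article): every epoch re-bills the whole file,
    so billed tokens = tokens_in_file * epochs.
    """
    billed_tokens = tokens_in_file * epochs
    return Decimal(billed_tokens) / Decimal(1000) * TRAINING_PRICE_PER_1K_TOKENS

# Hypothetical job: a 100,000-token training file trained for 3 epochs.
cost = estimate_training_cost(100_000, 3)
print(f"Estimated training cost: ${cost:.2f}")
```

Using `Decimal` rather than floats keeps the currency arithmetic exact; inference on the fine-tuned model is priced separately and is not covered by this sketch.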

Additional info: The MLOps newsletter gives a holistic explanation of fine-tuning and how OpenAI builds this capability into its base models →read more

The Freshest Research News, categorized for your convenience:

Synthetic Data Generation

Amazon's Hands-Off: Amazon revealed a technique for reducing manual annotation in synthetic image data generation. Named "HandsOff," this technique relies on a small dataset of pre-labeled images and a Generative Adversarial Network (GAN). It employs GAN inversion to map pre-labeled images to the GAN's latent space and then uses a third model for labeling newly generated synthetic images. It's a breakthrough for tasks like semantic segmentation and depth estimation →read more

Multimodal Language Models

As an update to the Turing Post's China AI report:

Speech and Audio Processing

ElevenLabs Multilingual v2: ElevenLabs announced the launch of Eleven Multilingual v2, an AI speech model that supports 28 languages. Designed to produce emotionally rich AI audio, this model offers voice cloning and is versatile enough to be used in gaming, education, and assistive technologies →read more

Reinforcement Learning

DeepMind's Reinforced Self-Training (ReST): DeepMind has introduced ReST, a method that speeds up the AI development cycle. It involves two main steps: a 'grow' phase for dataset collection and an 'improve' phase for fine-tuning the model. Despite its promise, it may suffer from overfitting if multiple iterations are conducted →read more
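The grow/improve loop can be pictured with a toy sketch in Python. This is not DeepMind's implementation: the "policy" here is just a probability table over four made-up outputs, the reward values and thresholds are hypothetical, and "fine-tuning" is replaced by refitting the table to the samples whose reward clears a threshold (the reward filter is our assumption about how the improve phase selects data).

```python
import random

def grow(policy, num_samples, rng):
    """Grow phase: collect a dataset by sampling outputs from the current policy."""
    outputs = list(policy)
    weights = [policy[o] for o in outputs]
    return rng.choices(outputs, weights=weights, k=num_samples)

def improve(policy, samples, reward, threshold):
    """Improve phase (toy stand-in for fine-tuning): keep samples whose reward
    meets the threshold and refit the policy to their empirical frequencies."""
    kept = [s for s in samples if reward(s) >= threshold]
    if not kept:  # nothing passed the filter, so leave the policy unchanged
        return dict(policy)
    return {o: kept.count(o) / len(kept) for o in policy}

def rest_toy(policy, reward, thresholds, num_samples=1000, seed=0):
    """Run one grow + improve round per threshold."""
    rng = random.Random(seed)
    for threshold in thresholds:
        samples = grow(policy, num_samples, rng)
        policy = improve(policy, samples, reward, threshold)
    return policy

# Hypothetical outputs and rewards, made up for the example.
REWARDS = {"a": 0.1, "b": 0.4, "c": 0.7, "d": 0.9}
initial_policy = {o: 0.25 for o in REWARDS}

final_policy = rest_toy(initial_policy, REWARDS.get, thresholds=[0.3, 0.6])
print(final_policy)
```

Even in this toy, each round concentrates probability on fewer outputs, which hints at why repeating the loop many times can narrow a model too much.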

Model Efficiency and Performance

Prompt2Model: This technique trains models that can outperform gpt-3.5-turbo by an average of 20%, despite being up to 700 times smaller. This could signify a significant shift in balancing model performance and size →read more

3D Scene Reconstruction

Generalizable NeRF Transformer: A research paper on a technique that generalizes Neural Radiance Fields (NeRF) using mixture-of-view experts. This approach seeks to broaden the applicability and performance of NeRF in generating 3D scenes →read more

In other newsletters:

We are reading:

Getting back to the AI Hype Cycle:

That’s it for today! Thank you for reading. Please feel free to share with your friends and colleagues 🤍 You can also leave a comment!

Another week with fascinating innovations! We call this overview “Froth on the Daydream” - or simply, FOD. It’s a reference to the surrealistic and experimental novel by Boris Vian – after all, AI is experimental and feels quite surrealistic, and a lot of writing on this topic is just a froth on the daydream.
