Turing Post
FOD#25: What's Happening in Self-Driving Universe?

plus the curated list of the most relevant developments in the AI world

Ksenia Se
October 23, 2023

Since the Tesla I was driving, equipped with Autopilot and Full Self-Driving, nearly drove me under a truck, my interest in self-driving cars has intensified. As if to serve my curiosity, last week was filled with news about self-driving, driverless, autonomous vehicles, and robotaxis – so, here's my Monday take.

Let’s be clear: I don’t blame Tesla, in the same way that I can’t fault ChatGPT for making things up. It’s not truly intelligent. ChatGPT doesn’t chat with full understanding and responsibility for its word choices, just as self-driving cars are still largely experimental. If you're a good driver, you can somewhat rely on the car – but with heightened awareness. Similarly, if you're a sharp lawyer, you can somewhat rely on ChatGPT, as long as you verify the info it generates.

But the trend is clear: there will be more driverless cars on city streets soon. The dream of self-driving cars, once confined to sci-fi, is on the brink of reality. However, like all emerging tech, there are challenges and debates to address.

The Giants in the Race

Nvidia and Foxconn: At the forefront, Nvidia and Foxconn are setting up "AI factories," supercomputing hubs meant to speed up self-driving tech. Their collaboration aims to craft cars with AI brains, capable of interacting with drivers and passengers, and even learning from real-world experiences. This partnership leans on Nvidia's GPU expertise and Foxconn's EV manufacturing skill. Tesla and its supercomputer, Dojo, spring to mind as another contender.

Waymo and Geely: Alphabet's Waymo, which started as Google's self-driving project, is partnering with Chinese auto giant Geely's brand, Zeekr. Their ambition? A jointly-developed robotaxi for the U.S. market. A noteworthy move, given the current frosty relations between the U.S. and China! But Waymo is clearly aiming for gold in this race.

General Motors and Cruise: GM's self-driving unit, Cruise, backed by Honda, is expanding its geography, with plans to launch a robotaxi service in Japan by 2026, following its deployment in Dubai. But there are a lot of challenges when changing location, for example:

Navigating the Urban Jungle

Waymo and Cruise expanding to new cities: After decent results in San Francisco and Phoenix, they're setting sights on Dallas, L.A., Houston, and beyond. The hurdles? Beyond the tech's nascent stage, each city offers distinct traffic patterns, driving cultures, and rules. Speaking from an ML perspective, this is prime territory for few-shot and zero-shot learning.

To ease navigation and communication, Waymo is pioneering "tertiary communication," using LED displays on its vehicles. Unlike human drivers, who can gesture or make eye contact, autonomous cars need other ways to signal their intent. Yet if each company adopts its own symbols, there’s a risk of confusion.

Regulatory Oversight

As these autonomous vehicles hit the roads, they're also navigating intricate regulatory landscapes. The U.S. National Highway Traffic Safety Administration (NHTSA) is closely watching Cruise, especially after reports of pedestrian incidents. This attention underscores the balance these companies must strike between tech innovation and public safety.

China, meanwhile, is charting its own course. Guangzhou's government is teaming up with Chinese ride-hailing giant Didi, pouring in up to $149 million. Their focus on transforming from a service-based model to a platform solutions entity, especially in the domains of smart EVs, cities, and manufacturing, is worth watching.

That Tesla I mentioned in the beginning? Courtesy of my partner, who's behind TezLab, an app tailored for electric vehicles (EVs). They are data-driven, aiding drivers in trip planning, understanding range, and adapting to the EV lifestyle in general. A stellar app by die-hard software developers. But I keep wondering: when will they add LLMs/foundation models and other ML ‘magic’? It feels like most of my readers are in the same boat, still figuring out how to really judge these models.

There was an insightful paper in June about Autonomous Driving's challenges and future. I believe we should delve more into car-human interaction and ML's role. Share your thoughts on this topic.

Wrapping up this piece, I came across a tweet. Oh, how I wish I could experience it!

This is mesmerising! 😎 We're seeing some incredible robust behaviour emerge from our AI Driver.
This video shows @wayve_ai's autonomous vehicle driving through London. With no lidar, no HD-maps, no hand-coded rules. Just pure on-board intelligence from our end-to-end neural… twitter.com/i/web/status/1…
— Alex Kendall (@alexgkendall)
5:19 PM • Oct 23, 2023

News from The Usual Suspects ©

Brain-Inspired IBM Chip

IBM unveils NorthPole, an AI chip that redefines the boundaries of computing. Drawing inspiration from the human brain, NorthPole marries compute and memory, boasting a remarkable 25-fold increase in energy efficiency over existing GPUs. This revolutionary silicon chip, packing 22 billion transistors in 800 square millimeters, holds the potential to transform industries, from autonomous vehicles to robotics. Beyond its prowess, NorthPole's design suggests we're only scratching the surface of brain-inspired computing possibilities.

Stanford’s Transparency Index

Stanford's Center for Research on Foundation Models (CRFM) introduces the Foundation Model Transparency Index (FMTI), evaluating the transparency of ten major foundation model companies. Developed by a multidisciplinary team, the FMTI scores based on 100 transparency indicators. Findings show most companies score poorly, with the top score being an unremarkable 54 out of 100. The index aims to guide effective AI regulation and address transparency issues, essential for AI policy, consumer protection, and industry advancements. The goal is broader transparency in the AI sector.

Image Source: Stanford’s CRFM

Trustworthiness in GPT Models

Researchers from the University of Illinois Urbana-Champaign, Stanford University, University of California, Berkeley, Center for AI Safety, and Microsoft Research released "DecodingTrust," a study evaluating the trustworthiness of GPT models, specifically GPT-4 and GPT-3.5. The study assessed perspectives like toxicity, bias, adversarial robustness, and privacy. Key findings revealed vulnerabilities in GPT models, including a susceptibility to generating toxic or biased outputs and to leaking private data. While GPT-4 typically outperformed GPT-3.5, it was more vulnerable to certain adversarial tactics. The research aims to improve LLM trustworthiness, and the authors have shared their findings with OpenAI.

Twitter Library

Unlocking Data Mastery: 3 Free Books for Every Learner

From Basics to Advanced: Navigate the World of Data Visualization with Our Top Picks

www.turingpost.com/p/3-free-books

Professor Yann LeCun recently gave a talk titled "From Machine Learning to Autonomous Intelligence"
The presentation covers an important topic:
AI systems that can learn, remember, reason, plan, have common sense, yet are steerable and safe.
Here is a short summary:
— TuringPost (@TheTuringPost)
12:54 PM • Oct 3, 2023

Tech news, categorized for your convenience:

Foundation Models (FMs)

Fuyu-8B: Adept open-sourced Fuyu-8B, a compact version of their multimodal AI model. Fuyu-8B stands out due to its simplified architecture, rapid response time, and design tailored for digital agents. The model excels in tasks like understanding images, graphs, UI queries, and image localization. It is available on HuggingFace →read more
ERNIE 4.0 by Baidu: Claimed to rival GPT-4, ERNIE 4.0 enhances AI capabilities, allowing improved performance in understanding, generation, reasoning, and memory. Baidu has integrated this technology into various applications, revolutionizing products like Baidu Search, Baidu Maps, and Baidu Drive →read more
SALMONN: By integrating a pre-trained text-based large language model with speech and audio encoders, SALMONN allows direct processing and comprehension of diverse audio inputs. It excels in tasks such as automatic speech recognition, emotion recognition, and music captioning. It also demonstrates emergent abilities in untrained areas like speech translation to unfamiliar languages →read more
SILC: Sets a new state-of-the-art in Vision-Language Foundational Models. It enhances CLIP's contrastive pre-training by incorporating local-to-global correspondence learning. Achieves new benchmarks in zero-shot and open vocabulary segmentation →read more

Enhancements to FMs

Self-Reflective Retrieval-Augmented Generation (SELF-RAG): Augments LLMs with on-demand retrieval and self-reflection features. Outperforms other LLMs and traditional RAG models in various tasks →read more
AutoMix: A system that optimizes the use of LLMs by routing queries based on their complexity. Demonstrates up to an 89% increase in incremental benefit per cost →read more
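
For readers who want a feel for what "routing queries based on their complexity" can look like, here is a minimal sketch. It is not AutoMix's actual algorithm. The two model callables, the `confidence` scorer, and the 0.7 threshold are all hypothetical stand-ins, chosen only to show the general shape: try the cheap model first and escalate when its answer looks shaky.

```python
from typing import Callable, Tuple


def route_query(
    query: str,
    small_model: Callable[[str], str],
    large_model: Callable[[str], str],
    confidence: Callable[[str, str], float],
    threshold: float = 0.7,  # hypothetical cut-off, not a value from the paper
) -> Tuple[str, str]:
    """Answer with the cheap model first; escalate only if confidence is low."""
    draft = small_model(query)
    if confidence(query, draft) >= threshold:
        return draft, "small"
    return large_model(query), "large"


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_small(query: str) -> str:
    return f"small-answer({query})"


def toy_large(query: str) -> str:
    return f"large-answer({query})"


def toy_confidence(query: str, answer: str) -> float:
    # Pretend that short questions are easy and long ones are hard.
    return 0.9 if len(query.split()) <= 5 else 0.3


if __name__ == "__main__":
    print(route_query("What is 2+2?", toy_small, toy_large, toy_confidence))
    print(route_query(
        "Explain why this proof fails in the general case",
        toy_small, toy_large, toy_confidence,
    ))
```

In a real setup the confidence score would come from some form of verification of the draft answer, and the interesting question is how much quality you buy per extra dollar spent on the larger model.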

Reinforcement Learning

EUREKA Algorithm: Utilizes Large Language Models (LLMs) like GPT-4 for evolutionary optimization of reward functions in reinforcement learning environments. Achieves better performance than human experts in 83% of 29 tested RL environments →read more
Vision-Language Models as Zero-Shot Reward Models (VLM-RMs): Proposes using pre-trained vision-language models, specifically CLIP-based ones, as reward models for reinforcement learning. Demonstrates its effectiveness in teaching complex tasks like kneeling to a MuJoCo humanoid →read more
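
The core idea behind using a CLIP-style model as a reward can be pictured as a similarity score between what the agent's camera sees and a text description of the goal. The sketch below is a simplification under loud assumptions: the embeddings are hand-written placeholder vectors, whereas a real setup would get them from a vision-language model's image and text encoders, and the function names are mine, not the paper's.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def vlm_reward(image_embedding: Sequence[float],
               goal_embedding: Sequence[float]) -> float:
    """Reward = how closely the rendered frame matches the goal text."""
    return cosine_similarity(image_embedding, goal_embedding)


if __name__ == "__main__":
    # Placeholder embeddings (hypothetical); imagine the goal text is
    # something like "a humanoid robot kneeling".
    goal = [1.0, 0.0]
    frame_far_from_goal = [0.0, 1.0]
    frame_near_goal = [1.0, 0.0]
    print(vlm_reward(frame_far_from_goal, goal))
    print(vlm_reward(frame_near_goal, goal))
```

The appeal is that no one has to hand-write a reward function: the text prompt plays that role, and the RL agent is trained to make its frames score higher.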

Platforms

OpenAgents: An open platform designed for hosting and using language agents in real-world scenarios. Developed by xlang-ai, the platform includes Data Agent for data analysis, Plugins Agent that integrates over 200 daily tools, and Web Agent for autonomous web browsing. OpenAgents emphasizes user interaction through a web UI and offers developers easy deployment options. The project aims to enhance the trustworthiness and applicability of language agents, bridging gaps in current frameworks →read more

In other newsletters

Strange Ways AI Disrupts Business Models, What’s Next For Creativity & Marketing, Some Provocative Data by Implications
AI and Open Source in 2023: The Highs and Lows: A Year in Review by Ahead of AI

Thank you for reading. Please feel free to share this with your friends and colleagues. In the next couple of weeks, we are announcing our referral program 🤍

Another week with fascinating innovations! We call this overview “Froth on the Daydream” – or simply, FOD. It’s a reference to the surrealistic and experimental novel by Boris Vian – after all, AI is experimental and feels quite surrealistic, and a lot of writing on this topic is just froth on the daydream.
