This Week in Turing Post:

Wednesday, AI 101: a new portion of your favorite ML flashcards!
Friday, Agentic Workflows series: The History of Agents (Who's that JARVIS?!)
The main topic

Not that I intended the wordplay in the title of this newsletter. But when Tim Cook announced on Twitter that Apple Intelligence was here, I got excited: finally, buying the new iPhone was justified. I immediately downloaded the necessary update (iOS 18.1), joined the waitlist (which was a bit annoying), and got approved in about an hour.

I can try Apple Intelligence now! As you can see, I'm a fan.

I wasn't prepared for how much it sucks. Believe me, I still think Apple Intelligence could become a powerful AI promoter among the masses. But it's embarrassing how undercooked it was on launch day.

Here's what I tried to do with my iPhone today.

Starting with the disappointments, here's where Apple Intelligence falls short (so far!):

- It doesn't understand half of my commands and certainly doesn't converse like ChatGPT.
- It barely answers questions, usually providing links from the internet instead, often not even related to the question.
- It struggles to understand whom I ask it to call.
- It couldn't write an email to the correct person.
- I couldn't figure out how to make the camera recognize what's in front of it (supposedly, it should be able to do that).
- I couldn't find how to create a fancy new emoji (most likely it's coming in the next update, but why announce it on the website then?!).
On the bright side, here are some features that worked well:

- Finally, call recording is here. Apple Intelligence saves a call into Notes, provides a transcription, and can summarize, rewrite, and offer other text options. Definitely convenient!
- It provides "smart tools" in the Apple email app. I never use that app, since the Gmail app is much more convenient, but now that Apple's app offers these new options for email summarization and reply rewriting, I might give it another try.
- When you call Siri, the whole screen glows. That's very beautiful (not sure how useful).
- It can summarize all the notifications you choose, but so far this feature isn't that helpful. Not exactly "intelligent," I would say.
AI needs an optimistic approach, but it also requires honesty. We've been through too many AI winters to allow ourselves to overpromise and underdeliver. And if ChatGPT was a true moment of magic, Apple Intelligence reminds me of that Friends episode where Phoebe tries to cover for Joey at a French-speaking audition, insisting he's fluent, while he actually babbles nonsense. As she tells the director, "C'est mon petit frère. Il est un peu retardé." ("He's my little brother. He's a bit slow.")

Alas.

But I still like all my Apple devices and hope that, with the final updates, Apple Intelligence will bring at least a tiny bit of excitement.
Tomorrow, 10,000+ AI professionals will learn how NVIDIA, Databricks, Twilio, and more get their GenAI apps into production!
Don't miss GenAI Productionize 2.0 for best practices, including:

- How to design an enterprise GenAI stack
- Techniques for AI governance, evaluation, and observability
- Proven strategies for getting GenAI apps into production
6+ Free Sources to Study Diffusion Models

The ongoing growth of multimodal models makes understanding the basics of diffusion models important for AI developers and users alike.

www.turingpost.com/p/6-sources-to-study-diffusion-models
Speaking of diffusion models: Ideogram introduced the creative board Canvas; Midjourney announced an external image editor, image retexturing, and next-gen AI moderation systems; and Stability AI open-sourced Stable Diffusion 3.5.

Weekly recommendation from an AI practitioner 👍🏼

We don't usually spotlight big players here, but Anthropic's new "computer use" feature for Claude 3.5 stands out. With this feature, Claude can view your screen, click, type, and complete tasks autonomously. While promising, limitations like API speed and costs remain hurdles. But you should try it anyway (see the API sketch after the reading list below)! These are surely the first steps toward highly capable AIs that make us all a four-armed Ganesha (the famous Hindu god widely worshipped as the remover of obstacles and the god of new beginnings).

We are reading

- There was a huge number of blog posts dedicated to Anthropic's "computer use." We liked this post a lot: it explores how the feature can be exploited via prompt injection to run commands and potentially turn AI-powered systems into "ZombAIs," posing serious cybersecurity risks (by Embrace The Red).
- This is a super interesting article – Prompting Considered Harmful – it questions our heavy reliance on prompts (which don't work consistently, so please don't pay for "prompt collections." Experiment yourself!) and pushes for more intuitive AI interfaces (by Meredith Ringel Morris).
- Fun interview with Marc Benioff of Salesforce (by Stratechery).
- And this article is very promising: Carbon dioxide capture from open air using covalent organic frameworks (published in Nature).
- For all my readers in software development: State of the software engineering job market in 2024 (by Pragmatic Engineer).
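If you want to poke at "computer use" from code, here is a minimal sketch of what a request looks like with Anthropic's Python SDK, based on the launch-day beta identifiers; treat the exact model name, tool type, and beta flag as assumptions and check the current docs before running.

```python
# Minimal computer-use request sketch. Assumes: `pip install anthropic`,
# ANTHROPIC_API_KEY set, and the launch-day beta identifiers below
# (verify against the current documentation).
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[{
        "type": "computer_20241022",   # the screen/keyboard/mouse tool
        "name": "computer",
        "display_width_px": 1024,
        "display_height_px": 768,
    }],
    messages=[{"role": "user", "content": "Open a browser and check the weather."}],
    betas=["computer-use-2024-10-22"],
)

# Claude answers with tool_use blocks (take a screenshot, click here, type
# this); your agent loop executes each action and sends the result back.
print(response.content)
```

Note that the model never touches your machine directly: it requests actions and your loop decides what to execute, which is exactly where the prompt-injection risk described in the ZombAIs post creeps in.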
News from The Usual Suspects ©

Meta drops NotebookLlama
NotebookLM from Google got a lot of attention; now Meta wants to steal some of it. Meta released "NotebookLlama," an open-source workflow on GitHub offering a complete guide for transforming PDFs into podcasts using Llama-3 models. Covering PDF processing, transcript writing, and dramatic TTS, this recipe lets the podcast-curious dive deep with customizable settings and experiment with Llama-3 models, Parler TTS, and more. And yes, community contributions are encouraged to take it further.
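The recipe itself is a chain of notebooks, but the overall flow fits in a few lines. Here is a rough sketch of the same PDF-to-podcast idea, not Meta's actual code: the pypdf and transformers calls are real APIs, while the prompt, truncation, and model id are placeholder assumptions.

```python
# Rough PDF-to-podcast flow in the spirit of NotebookLlama (not Meta's code).
# Assumes: pip install pypdf transformers; the model id is illustrative.
from pypdf import PdfReader
from transformers import pipeline

# 1. PDF processing: pull raw text out of the document.
text = "\n".join(page.extract_text() or "" for page in PdfReader("paper.pdf").pages)

# 2. Transcript writing: ask a Llama-3 model to turn it into a dialogue.
writer = pipeline("text-generation", model="meta-llama/Llama-3.2-3B-Instruct")
prompt = (
    "Rewrite the following text as a lively two-host podcast transcript:\n\n"
    + text[:8000]  # naive truncation; the real recipe chunks the text properly
)
transcript = writer(prompt, max_new_tokens=1024)[0]["generated_text"]
print(transcript[:500])

# 3. Dramatic TTS: the recipe then hands each transcript line to a TTS model
# (Parler TTS) to synthesize the voices; see the repo for those steps.
```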
They also step on OpenAI's toes – Meta Turns to Reuters for AI News Savvy
In a new multi-year agreement, Meta has tapped Reuters as its go-to news source for real-time updates via the Meta AI chatbot. This move, Meta's first AI-era news deal, lets U.S. users access Reuters' real-time reporting across Meta's platforms. The deal provides Reuters with compensation, though it's unclear whether its journalism will also be used to train Meta's language models.
OpenAI and Microsoft Bet $10M on Local News Innovation
The two have teamed up with the Lenfest Institute, contributing $10 million to a pioneering AI initiative supporting local journalism. Starting with grants for five U.S. metro outlets, the partnership enables these newsrooms to experiment with AI tools like conversational archives and ad analytics, aiming to boost local news sustainability and open-source innovation across communities.
Hugging Face pushes boundaries with AutoTrain Advanced
With a few clicks, users can train state-of-the-art models on Hugging Face Spaces or locally – no coding or heavy lifting required. Plus, you only pay for what you use. Simple setup, sophisticated outcomes; HF likes to deliver.
But there was a bunch of interesting research papers last week (categorized for your convenience)

Models

- Aya by Cohere advances multilingual AI by supporting 101 languages, especially those underrepresented in AI, through open-access datasets and models optimized for natural language tasks →read the paper
- Ferret-UI by Apple enables precise, cross-platform user interface understanding, enhancing multimodal interactions across various Apple devices through adaptive scaling and spatial reasoning →read the paper
- Granite 3.0 by IBM focuses on enterprise optimization with built-in safety and efficiency, offering models like the 8B Instruct for performance across languages and domains, supported by open-source principles →read the paper
- Quantized Llama Models by Meta AI speed up performance by 4x and reduce the memory footprint by 56%, using quantization techniques like QLoRA to optimize Llama models for on-device mobile efficiency (see the 4-bit loading sketch after this list) →read the paper
- PANGEA by Carnegie Mellon University bridges linguistic and cultural gaps with multimodal reasoning for 39 languages, evaluated via PANGEABENCH to address underrepresentation in cross-lingual and cultural understanding →read the paper
- WAFFLE by Purdue University improves UI-to-HTML generation by integrating visual and HTML representations, achieving higher accuracy in code conversion tasks →read the paper
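To feel the memory savings quantization buys on your own machine, here is a minimal sketch of loading a Llama model in 4-bit NF4, the QLoRA-style setup, with Hugging Face transformers and bitsandbytes. This is the generic recipe, not Meta's specific on-device pipeline, and the model id is just an example.

```python
# Generic 4-bit (QLoRA-style NF4) loading sketch, not Meta's exact pipeline.
# Assumes: pip install transformers accelerate bitsandbytes, a CUDA GPU,
# and an illustrative model id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4, as used by QLoRA
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for quality
)

model_id = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Quantization matters on-device because", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0]))
```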
Our TOP is all Canadian today

- In-Context Learning and Occam's Razor. Researchers at Mila connect in-context learning (ICL) with Occam's razor, showing that ICL's prediction loss aligns with prequential coding, a data compression method (a toy sketch follows this list). This approach improves generalization by balancing training error and model complexity. Tests indicate ICL outperforms traditional optimizers in data-efficient settings but faces limits in task generalization, suggesting areas for future innovation →read the paper
- Hallucination Detox: Sensitive Neuron Dropout (SeND). Mila and McGill University researchers propose Sensitive Neuron Dropout (SeND), a training method that drops high-variability neurons to reduce hallucinations, boosting factual reliability by up to 40%. An efficient hallucination metric, Efficient EigenScore (EES), approximates the traditional EigenScore 2x faster, enhancing LLM accuracy across domains like Wikipedia and medical texts without post-training adjustments →read the paper
- Asynchronous RLHF: Efficient Off-Policy Training for LLMs. The Quebec AI Institute introduces an asynchronous approach to reinforcement learning from human feedback (RLHF) for LLMs, cutting training time by 40% for the 8-billion-parameter LLaMA 3.1 model. Off-policy training allows for efficient data use, with Direct Preference Optimization (DPO) showing high resilience. This method improves scalability and compute efficiency →read the paper
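For intuition about the prequential-coding view in the first paper: a model's online code length is the accumulated log-loss it pays while predicting each point before learning from it. Here is a toy sketch; `predict_proba` and `update` are hypothetical hooks standing in for whatever learner (or in-context predictor) you measure.

```python
import math

def prequential_code_length(model, stream):
    """Toy prequential (online) code length, in bits.

    For each (x, y): first pay -log2 p(y | x) under the model's current
    predictive distribution, then let the model learn from the point.
    A low total means the model compresses the data well, i.e. it is a
    simple hypothesis that still fits (Occam's razor). `predict_proba`
    and `update` are hypothetical hooks, not a real library API.
    """
    total_bits = 0.0
    for x, y in stream:
        p = model.predict_proba(x).get(y, 0.0)   # prob of the true label
        total_bits += -math.log2(max(p, 1e-12))  # clamp to avoid log(0)
        model.update(x, y)                       # learn only after predicting
    return total_bits
```

Loosely, the paper's observation is that the next-token loss an in-context learner accumulates over a sequence of examples has exactly this form.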
The rest of last week's paper categories:

- Language Model Optimization, Alignment & Distillation
- Efficient Model Scaling & Training Techniques
- Multimodal & Vision-Language Processing
- Theorem Proving & Mathematical Reasoning
- Attention Mechanisms & Memory Optimization

Leave a review!

Please send this newsletter to your colleagues if it can help them enhance their understanding of AI and stay ahead of the curve. You will get a 1-month subscription!