This Week in Turing Post:
Wednesday, AI 101: a new batch of cards with ML concepts!
Friday, AI Unicorns: ElevenLabs

If you like Turing Post, consider sharing it with a friend. It helps us keep Monday digests free.
The main topic – lucky 70

On May 7, 2023, we published our first article on Turing Post. Since then, it’s been quite the adventure. Today marks our 70th edition of Froth on the Daydream – a nod to Boris Vian’s surreal novel L'Écume des jours, where dreams and reality blur, much like the current state of AI. Fittingly, not long ago we also reached another milestone: 70,000 readers.

We’ve covered major ML and AI events, showcasing the practical side of these technologies while cutting through both the hype and the gloom. We spoke with OpenAI’s General Counsel, met with Microsoft’s Kevin Scott, got predictions from Yoshua Bengio, talked to Allen AI’s Oren Etzioni, uncovered TimeGPT, and haven’t heard back from DeepMind (yet!). We launched a fascinating series on the History of LLMs (soon to be a book!), and explored Foundation Models Operations and the emerging trend of task-centric ML. We also turned our focus globally, with dedicated coverage of Global AI affairs.

I’ve been immersed in AI for over five years now, first in business development and then as a publisher. During this time, I’ve often been asked if AI will "take over" (cue Terminator references), and my answer is always: no. But to understand why, you have to learn how machine learning really works. It’s not rocket science – and even when it is, I break it down so it’s understandable for both of us (I’m a constant learner myself).

To each of you, thank you for your support and feedback. Just as Hugging Face champions open-source ML, we at Turing Post focus on understanding the history of machine learning to better grasp where we are today and where we might be headed. It’s just a technology, after all, and it’s up to us to decide how to make the most of it.

On a personal note: when I decided to start Turing Post, I also realized I was pregnant. So my 8-month-old daughter, Reason, has been with me every step of the way – along with her four brothers! Thankfully, they’re already more independent. To all the women reading Turing Post – don’t let anyone tell you what you can or cannot do. We’re capable of amazing things, and I’m right here with you.

I’m enormously flattered that top people from companies such as Arm, Nvidia, Hugging Face, Microsoft, Google, Zilliz, CoreWeave, Ernst & Young, and many more are Turing Post’s premium subscribers. Thanks to all of you for your trust.

If you want to join them, I’m offering a 70% discount on the annual subscription ($21 per YEAR). Why such a big discount? Firstly, to honor all 70,000 of you. And secondly, money is good and all, but what’s really important to me is that people truly understand machine learning and AI – the most important technology of the 21st century.

Currently, we’re in the middle of two amazingly interesting series: AI Agents and Agentic Workflows, and AI 101 – the best guide you can find out there.

The offer ends in 7 days, on Oct 14.
News from The Usual Suspects ©

Cerebras Takes on Nvidia with IPO Gambit
Cerebras Systems, the AI chipmaker known for its massive wafer-scale chips, is going public at a $7-8B valuation. With 87% of its revenue coming from a single client, G42, Cerebras is gunning for Nvidia’s dominance in AI hardware. Despite a $66.6M net loss in H1 2024, CEO Andrew Feldman is bullish, claiming Cerebras could take "all" of Nvidia's market share. Check our profile on Cerebras!

Meanwhile, two reports say AI investments are surging and not going to slow down:

CB Insights: Mega-rounds for Safe Superintelligence ($1B), Baichuan AI ($688M), and Helsing ($488M) highlight investor confidence in AI's transformative potential. Focus areas included LLMs, GenAI, and enterprise solutions, and AI integration across finance, healthcare, and defense attracted significant funding. Despite economic uncertainties, AI remains a hotbed for venture capital, reflecting its perceived long-term value and cross-industry impact.

Bain: The AI market is set to hit $780-990B by 2027, with early adopters enjoying 20% earnings boosts. But scaling up AI is straining data centers, electricity, and labor. Meanwhile, 75% of software companies struggle with declining net revenue retention despite increased spending on customer success. Tech firms investing heavily in automation and AI outperform their peers, with leaders planning to invest over 3x more in generative AI than laggards.

Bain Dives Deep into AI Waters
Bain Capital Ventures leads the $500 million Series B round for poolside, the AI-first startup helmed by former GitHub CTO Jason Warner. With its proprietary tech, Reinforcement Learning from Code Execution Feedback (RLCEF), poolside aims to push AI beyond human-level coding.

Black Forest Labs Unveils FLUX1.1 [pro] and New API
The model tops the Artificial Analysis leaderboard for text-to-image models, with ultra-high resolution features coming soon. Paired with the launch of the BFL API, developers can now integrate FLUX’s capabilities into their apps, offering customization, scalability, and competitive pricing at 4 cents per image.

Liquid AI pioneers liquid neural networks and foundation models
Their research spans state-space models, neural operators, and DNA foundation models. Liquid AI has contributed to advancements in generative modeling, graph neural networks, and open-source LLM finetunes, driving innovation in AI scalability and performance.

Google Search Goes Visual and Vocal with AI
Lens now supports video and voice input, letting users search by recording video or speaking while snapping photos. With 20 billion monthly visual searches, AI Overviews and shopping tools are getting smarter too. Plus, AI-organized search results are debuting in the U.S., offering more diverse content and perspectives.

Last week's OpenAI dominance:

OpenAI has secured $6.6B in funding – the largest VC round ever – pushing its valuation to $157B. It also secured a new $4 billion revolving credit line.

OpenAI introduced "Canvas," a fresh tool designed to elevate collaboration with ChatGPT for writing and coding projects. Moving beyond mere chat, this beta feature lets users work on documents in real time with their AI sidekick – highlighting, editing, and getting feedback on the go. People say they like the UI better than Claude’s.

OpenAI introduced the Realtime API to enable low-latency speech-to-speech applications, perfect for natural conversations in language learning and customer service. Audio is priced at $0.06/min for input and $0.24/min for output, and the API is available in public beta for paid developers.

OpenAI introduced Prompt Caching with 50% discounts on reused tokens. Ideal for long or repeated conversations, it reduces costs and latency, priced at $1.25 per million cached tokens.

OpenAI introduced vision fine-tuning, boosting applications in visual search and medical analysis.

OpenAI introduced Model Distillation. Developers can fine-tune smaller models using outputs from GPT-4o, cutting costs while maintaining performance.

OpenAI highlighted an interesting use case: Altera, led by Dr. Robert Yang, is pioneering “digital humans” that go beyond assisting. These AI agents, powered by OpenAI’s GPT-4o, collaborate with users and even exhibit emotional responses. From Minecraft pals to digital coworkers, Altera’s agents tackle long-term autonomy by solving data degradation, aiming to mimic human cognitive functions with striking realism. Mustafa Suleyman is biting his nails.

Sam Altman expects AI agents to be a game-changer by 2025, potentially completing month-long human tasks in an hour.
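
For readers who want to see what the Prompt Caching numbers mean in practice, here is a rough back-of-the-envelope sketch. It assumes the uncached input price is $2.50 per million tokens, which is only inferred from the 50% discount and the $1.25 cached price quoted above, not stated in the announcement summary. The function name and the token counts are hypothetical.

```python
# Back-of-the-envelope input cost with prompt caching (illustrative only).
CACHED_PRICE_PER_M = 1.25   # USD per million cached tokens (from the digest)
CACHE_DISCOUNT = 0.50       # 50% discount on reused tokens (from the digest)

# Assumption: the uncached price is implied by the two figures above.
UNCACHED_PRICE_PER_M = CACHED_PRICE_PER_M / (1 - CACHE_DISCOUNT)


def input_cost_usd(total_tokens: int, cached_tokens: int) -> float:
    """Estimate input cost when `cached_tokens` of the prompt are cache hits."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    fresh_tokens = total_tokens - cached_tokens
    return (
        fresh_tokens * UNCACHED_PRICE_PER_M
        + cached_tokens * CACHED_PRICE_PER_M
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens, 8,000 of them reused.
    without_cache = input_cost_usd(10_000, 0)
    with_cache = input_cost_usd(10_000, 8_000)
    print(f"no cache: ${without_cache:.4f}, with cache: ${with_cache:.4f}")
```

The saving scales with the share of the prompt that is reused, so long system prompts and multi-turn conversations are where the discount is most likely to matter.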

AI Ethics Loses a Leading Voice: Abhishek Gupta
Abhishek Gupta, Founder of the Montreal AI Ethics Institute and Director for Responsible AI at BCG, passed away on September 30, 2024. His work played a key role in shaping discussions around responsible AI practices. Our thoughts are with his family.

11 Types of LoRA
Explore different ways to enhance fine-tuning: www.turingpost.com/p/11-types-of-lora

Weekly recommendation from AI practitioner👍🏼:
SerpApi – scraping the scrapers has never been easier. It means you can use Google Maps, YouTube, and Google Search itself as a model tool.

We are reading
The Exponential View argues that the future of AI lies in an ecosystem of domain-specific foundation models, which will excel in specialized tasks. We certainly think so as well, and have written about it extensively in our FMOps - task-centric ML series.
Terence Tao, a renowned mathematician, explores the potential of AI in transforming mathematical reasoning.
SemiAnalysis published a fascinating article about AI Neoclouds – cloud providers focused on GPU compute rental.

The freshest research papers, categorized for your convenience

Our TOP

TuringPost (@TheTuringPost), 6:21 PM • Oct 4, 2024:
There are several surprising and notable insights from this paper on Movie Gen by @AIatMeta: Scale and simplicity are key: The authors found that scaling up a simple Transformer-based model with Flow Matching yielded high quality results across multiple media generation tasks.… x.com/i/web/status/1…

Quoting AI at Meta (@AIatMeta):
🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in… x.com/i/web/status/1…

RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning
Researchers from Meta AI developed RLEF to improve LLMs' iterative code synthesis using execution feedback. They demonstrated that RLEF enables LLMs to use real-time feedback to refine code over multiple steps. Tested on CodeContests, RLEF reduced sample requirements tenfold while surpassing prior state-of-the-art models.

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Apple researchers introduce MM1.5, an improved family of multimodal large language models (MLLMs) with 1B to 30B parameters, designed to excel in tasks like text-rich image comprehension, visual referring, and multi-image reasoning. MM1.5 uses a data-centric training approach, with high-quality OCR data and synthetic captions, emphasizing continual pre-training and supervised fine-tuning. Specialized variants include MM1.5-Video for video understanding and MM1.5-UI for mobile UI comprehension, achieving strong performance across diverse benchmarks.

Enhancing Language Models and General AI Reasoning

RATIONALYST: Pre-training Process-Supervision for Improving Reasoning improves reasoning tasks through process supervision, specifically boosting LLM performance on mathematical and commonsense tasks.
Quantifying Generalization Complexity for Large Language Models develops SCYLLA, a framework to measure how well LLMs generalize across various task complexities, highlighting a "generalization valley" in model performance.
VinePPO: Unlocking RL Potential for LLM Reasoning Through Refined Credit Assignment enhances reinforcement learning for language models by improving credit assignment for more accurate step-wise reasoning.
Not All LLM Reasoners Are Created Equal explores how LLMs perform differently on compositional reasoning tasks, especially in math.
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations investigates how LLMs encode truthfulness and proposes methods for detecting hallucinations in outputs.

Task-Specific and Compositional Learning in AI

Can Models Learn Skill Composition from Examples? examines whether smaller models can generalize complex skills from examples, showing that fine-tuning improves unseen task performance.
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis demonstrates how training on synthetic edit sequences can enhance code generation, improving diversity and quality of output.
Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning shows how tree search and reflection techniques improve the performance of autonomous agents in complex, real-world tasks.

Model Efficiency, Deployment, and Performance Scaling

TPI-LLM: Serving 70B-Scale LLMs Efficiently on Low-Resource Edge Devices introduces a system for deploying massive LLMs on edge devices with optimized memory and latency performance.
InfiniPot: Infinite Context Processing on Memory-Constrained LLMs offers a method to process long contexts efficiently, using limited memory resources while maintaining performance.
SageAttention: Accurate 8-Bit Attention for Plug-and-Play Inference Acceleration proposes an attention mechanism to accelerate transformer models using quantization techniques while retaining accuracy.

Federated and Distributed Learning

Multimodal Models and Vision-Language Integration

ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation presents a system that adapts workflows based on user prompts, improving the quality of generated images.
Contrastive Localized Language-Image Pre-training (CLOC) enhances vision-language models by introducing fine-grained region-based understanding, improving tasks that require precise localization.

Optimization Techniques for Neural Networks

Old Optimizer, New Norm: An Anthology reimagines classic optimizers like Adam and Shampoo under different norms, offering new ways to improve model training efficiency.
Cottention: Linear Transformers with Cosine Attention introduces a novel attention mechanism using cosine similarity, improving efficiency in long-sequence tasks.
Hyper-Connections proposes an alternative to residual connections, dynamically adjusting the strength of connections between network layers to improve training speed and accuracy.
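
To give a feel for why cosine similarity can help with long sequences, as in the Cottention summary above, here is a minimal NumPy sketch. It is not the paper's implementation: the division by sequence length is our own simplifying assumption for normalization, and the function names are hypothetical. The point it illustrates is that once softmax is removed, the matrix products can be reordered so the n-by-n score matrix is never built.

```python
import numpy as np


def l2_normalize(x: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def cosine_attention_quadratic(q, k, v):
    """Reference form: builds the full (n, n) cosine-similarity matrix."""
    qn, kn = l2_normalize(q), l2_normalize(k)
    scores = qn @ kn.T                  # (n, n) cosine similarities
    return scores @ v / k.shape[0]      # assumption: normalize by sequence length


def cosine_attention_linear(q, k, v):
    """Same result, reordered: memory grows with n * d instead of n * n."""
    qn, kn = l2_normalize(q), l2_normalize(k)
    kv = kn.T @ v                       # (d, d_v) summary of keys and values
    return qn @ kv / k.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 128, 16                      # hypothetical sequence length and head size
    q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
    same = np.allclose(cosine_attention_quadratic(q, k, v),
                       cosine_attention_linear(q, k, v))
    print("both orderings agree:", same)
```

This sketch covers only the non-causal case; a causal variant would need running sums over the keys and values rather than a single matrix product.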

Cross-Lingual and Multilingual Learning

Specialized AI Environments and Evaluation

Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code introduces a reinforcement learning environment designed to evaluate and improve feedback on code-editing models.
HelpSteer2-Preference: Complementing Ratings with Preferences proposes a hybrid model to improve instruction-following in LLMs by combining preference annotations with rating systems.

Foundations of Intelligence and Learning Systems

Intelligence at the Edge of Chaos explores how models trained on moderately complex systems can outperform those trained on simpler or more chaotic data, suggesting intelligence emerges at the balance between order and chaos.
Were RNNs All We Needed? revisits RNNs and proposes simplified versions that perform as well as modern architectures, challenging the dominance of transformer-based models.

Leave a review!

Please send this newsletter to your colleagues if it can help them enhance their understanding of AI and stay ahead of the curve. You will get a 1-month subscription!