The main topic |
A few FODs ago, we asked which topics you would most like to read about. One of the common requests was open-endedness in AI, which refers to a system's ability to explore and generate novel outcomes without predefined constraints or fixed goals. Coincidentally, we are now seeing remarkable developments in this field from all directions. Today, we look at four papers from the last week that push it even further.
The AI Scientist: Open-Ended Scientific Discovery |
The concept of open-endedness is central to "The AI Scientist," which autonomously generates research ideas, executes experiments, and writes papers. Although the current implementation is limited and the quality of its outputs is middling, the framework embodies open-ended discovery by letting the system explore novel research directions without a predetermined path. This aligns with the broader goal of open-endedness in AI: systems whose capacity for innovation and discovery is not restricted by predefined rules or boundaries.
Image Credit: The original paper
|
Cosine Genie: Automated Design in Software Engineering |
Cosine's Genie model, while not explicitly framed as an open-ended system, demonstrates characteristics that contribute to open-ended discovery in software engineering: |
Autonomous task completion: Genie's ability to perform a wide range of programming tasks autonomously suggests it can explore solution spaces without constant human guidance.
Human-like reasoning: By training on datasets that capture the decision-making processes of real software engineers, Genie may approach problems with a more open-ended, creative mindset.
Collaborative potential: The model's ability to work alongside human developers opens up possibilities for human-AI collaborative open-ended discovery in software development.
|
This autonomy in problem-solving and design is a key aspect of open-endedness, where the model’s output isn’t just a repetition of learned patterns but rather a product of creative exploration within the constraints of software engineering. |
Automated Design of Agentic Systems (ADAS): Evolving Agentic Systems |
ADAS introduces a new dimension to open-endedness by focusing on the automated design and evolution of agentic systems. The use of a meta agent to iteratively design, test, and refine agents within a code-defined space exemplifies open-endedness in a dynamic, evolving context. The meta agent’s ability to discover novel building blocks and combine them in innovative ways aligns with the broader goal of open-ended AI research – creating systems that can autonomously evolve and adapt to new challenges and environments. |
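The ADAS loop described above can be caricatured in a few lines. In this sketch every name and the scoring stub are hypothetical: in the real system, the proposer is a foundation model that emits runnable agent code, and evaluation executes that code on held-out tasks. Only the archive-conditioned propose/evaluate/keep control flow is illustrated.

```python
# Hedged sketch of the Meta Agent Search loop: a meta agent repeatedly
# writes a candidate agent as code, scores it, and keeps an archive of
# discoveries that conditions the next proposal. Both steps are stubs.

def propose_agent(archive):
    """Stub for the FM call: emit a new candidate agent as a code string,
    conditioned on everything discovered so far."""
    return f"def agent_v{len(archive)}(task): ..."

def evaluate(agent_code):
    """Stub: ADAS would run the agent on real tasks. Here, later
    (more informed) proposals simply score higher."""
    version = int(agent_code.split("_v")[1].split("(")[0])
    return 0.5 + 0.1 * version

def meta_agent_search(iterations=5):
    archive = []  # (agent_code, score) for every discovered agent
    for _ in range(iterations):
        code = propose_agent(archive)
        archive.append((code, evaluate(code)))
    return max(archive, key=lambda pair: pair[1])

best_code, best_score = meta_agent_search()
print(best_code)  # the highest-scoring discovered agent
```

The interesting design choice in ADAS is that the search space is code itself, so any building block expressible in a program is, in principle, discoverable.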
LONGWRITER: Ultra-Long Text Generation |
While seemingly less related, LongWriter addresses the open-endedness of language generation itself. By enabling the creation of coherent, ultra-long texts, it expands the potential for AI to assist in creative writing, technical documentation, and other applications where generating large amounts of text is necessary. The ability to generate 10,000+ words from long contexts pushes the boundaries of what language models can achieve, allowing them to create detailed and nuanced narratives or documents without strict adherence to a predefined structure. |
Potential and Challenges |
While these systems showcase significant advancements in open-endedness, they also highlight the challenges, such as the risk of generating low-quality or unjustified conclusions (as seen in The AI Scientist) or the lack of transparency in the development process (as with Genie). |
Conclusion |
The open-ended capabilities of these systems offer the potential to advance scientific discovery, improve software development, and extend AI-generated content. However, they also bring up critical questions about the necessity of human oversight, the extent to which AI can make independent discoveries, and the broader implications for various fields as these technologies progress. Despite the challenges, we are living in times when AI is increasingly contributing to the expansion of human knowledge and creativity. We are becoming AI-augmented. And that’s thrilling. |
If you like Turing Post, consider becoming a paid subscriber. You'll immediately get full access to all our articles, investigations, and tech series →
|
|
Related to Superintelligence/AGI (new rubric!) |
One of the main roadblocks on the path to human-level intelligence or superintelligence is machines' limited ability to reason. Last week, researchers from Microsoft Research Asia and Harvard University introduced rStar, a self-play mutual reasoning method that significantly enhances the problem-solving abilities of small language models (SLMs) without fine-tuning. By using Monte Carlo Tree Search (MCTS) to generate reasoning trajectories and a second SLM to verify these paths, rStar raises GSM8K accuracy from 12.51% to 63.91% for LLaMA2-7B and from 36.46% to 81.88% for Mistral-7B.
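The rStar recipe above boils down to generate-then-verify: one small model proposes several reasoning trajectories, a second model independently checks them, and the final answer is a majority vote over the mutually agreed paths. The sketch below stubs out both model calls and replaces MCTS with a fixed candidate list, so it only illustrates the control flow, not the method itself.

```python
# Toy sketch of rStar's generate-then-verify pattern. Both "models" are
# stubs: real rStar uses one SLM exploring reasoning paths with MCTS and
# a second SLM acting as the verifier/discriminator.

def generate_trajectories(question):
    """Stub generator: hard-coded (reasoning, answer) candidates."""
    return [("rollout A", 4), ("rollout B", 4), ("rollout C", 7),
            ("rollout D", 4), ("rollout E", 9)]

def verifier_agrees(question, trajectory):
    """Stub verifier: a second model re-derives the answer and must match."""
    _reasoning, answer = trajectory
    return answer == 4  # stand-in for the second SLM's independent check

def rstar_answer(question):
    candidates = generate_trajectories(question)
    verified = [t for t in candidates if verifier_agrees(question, t)]
    answers = [a for _, a in (verified or candidates)]
    return max(set(answers), key=answers.count)  # majority vote

print(rstar_answer("What is 2 + 2?"))  # 4 with these stubs
```

The key point is that neither model is fine-tuned: agreement between two weak reasoners acts as a cheap proxy for correctness.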
|
11 Options for Image Generation – Image generation models are on the rise. Check out our list of both free and paid options to play with image creation: www.turingpost.com/p/11-options-for-image-generation
|
|
|
News from The Usual Suspects © |
Stack Overflow Developer Survey 2024: AI
|
Google Bets on Reality with Gemini |
At its "Made by Google" event, Google shifted gears from AI hype to practical applications, showcasing its Gemini model integrated deeply into Android. Senior VP Rick Osterloh made it clear: no more empty promises—it's time for AI to deliver. Yet, despite this, Pixel's market impact remains minimal. Meanwhile, the DOJ's antitrust scrutiny looms, potentially threatening Google's integrated AI strategy. Read more
|
Snowflake vs. Databricks: The AI Showdown |
Snowflake and Databricks are locked in a fierce battle for AI supremacy, with Databricks outbidding Snowflake for Tabular. This rivalry has heated up with aggressive moves like Databricks' "SnowMelt" campaign. But with tech giants like Microsoft entering the fray, both companies might face an even tougher fight ahead. Read more
|
Anthropic's Claude Gets Clever with Caching |
Anthropic's latest innovation, prompt caching for its Claude models, slashes costs by up to 90% and cuts latency by 85%. Available in public beta, this feature is a game-changer for extended AI conversations and complex tasks, with Notion already onboard to optimize its AI assistant. Read more
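In practice, caching works by marking the large, reusable part of a prompt so later requests can reuse the processed prefix. As a rough, non-authoritative sketch (the model id and field names follow Anthropic's public-beta announcement; treat them as assumptions), a cache-marked request payload might look like this:

```python
# Sketch of a Messages API payload with prompt caching (public beta).
# The large, stable system prompt is tagged with "cache_control" so
# later requests can reuse the cached prefix instead of reprocessing it.

def build_cached_request(system_doc: str, user_question: str) -> dict:
    """Build a request dict whose system prompt is marked for caching."""
    return {
        "model": "claude-3-5-sonnet-20240620",  # example model id (assumption)
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_doc,                      # big reusable context
                "cache_control": {"type": "ephemeral"},  # cache this prefix
            }
        ],
        "messages": [{"role": "user", "content": user_question}],
    }

payload = build_cached_request("<long document or codebase>",
                               "Summarize the key points.")
print(payload["system"][0]["cache_control"])  # {'type': 'ephemeral'}
```

A real call would send this payload with the `anthropic-beta: prompt-caching-2024-07-31` header; only the payload shape is shown here.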
|
MIT's AI Risk Repository: Navigating the Unknown |
MIT has launched an AI Risk Repository, a detailed catalog of over 700 risks associated with AI. With categories ranging from causal to domain-specific risks, this tool is invaluable for developers, researchers, and policymakers navigating the increasingly complex AI landscape. Read more
|
Midjourney's All-in-One Image Editor |
Midjourney has rolled out a unified AI image editor, bringing together inpainting, outpainting, and more under one roof. Despite facing a class-action lawsuit, Midjourney pushes forward with innovations like a virtual "brush" tool and seamless message mirroring between web and Discord platforms.
|
Hugging Face |
Thomas Wolf (@Thom_Wolf), Aug 18, 2024: "Sharing one demo which is blowing my mind, the new instant SmolLM 360M running live in your browser: huggingface.co/spaces/Hugging… And find more info here: huggingface.co/collections/Hu… And here: huggingface.co/blog/smollm"
|
|
In other newsletters: |
|
The freshest research papers, categorized for your convenience |
Models and Their Enhancements |
Falcon Mamba 7B – an open-source State Space Language Model (SSLM) that outperforms traditional transformer models, offering efficient processing for long text generation →read TII blog
Hermes 3 – a versatile open-source model that excels in multi-turn conversations and roleplaying, available in multiple sizes, and sets a new benchmark in its class →read the paper
Grok-2 – excels in code, math, and reasoning tasks, outperforming major competitors and improving instruction following and factuality →read the paper
Imagen 3 – a text-to-image model that surpasses competitors in quality and accuracy, with robust safety measures to prevent misuse →read the paper
xGen-MM (BLIP-3) – an advanced multimodal model framework that excels in visual-language tasks and supports both single and multi-image inputs →read the paper
JPEG-LM – an LLM that generates images as compressed JPEG files, simplifying visual generation and improving image quality, especially for complex elements →read the paper
Aquila2 Technical Report – introduces the Aquila2 series, bilingual models that outperform competitors with efficient training and strong performance, even after quantization →read the paper
|
Our top picks from other research papers
Towards Flexible Perception with Visual Memory – Researchers from Google DeepMind propose a new visual memory model combining deep neural networks with a flexible database to enhance image classification. This model allows for easy addition and removal of data, enabling scalability from individual samples to billion-scale datasets without retraining. It introduces RankVoting, which outperforms previous aggregation methods, achieving 88.5% top-1 accuracy on ImageNet. The system demonstrates capabilities in lifelong learning, machine unlearning, and interpretable decision-making, showcasing the benefits of an explicit visual memory in deep learning →read the paper
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability – Researchers from Google DeepMind studied hallucinations in LLMs by training models on knowledge graphs, where factual content is fully controlled. They found that larger and longer-trained models hallucinate less on seen data but still struggle with unseen data, requiring significantly more compute than previously thought optimal. Despite this, detecting hallucinations becomes harder as models scale up, showing a trade-off between model size, training duration, and hallucination detectability →read the paper
Automated Design of Agentic Systems – Researchers from the University of British Columbia and the Vector Institute propose the Automated Design of Agentic Systems (ADAS) to autonomously create and improve agentic systems using Foundation Models (FMs). Their method, Meta Agent Search, allows a "meta" agent to program new agents iteratively in code. Experiments show these automatically discovered agents outperform state-of-the-art, manually designed systems in diverse domains, including math and reading comprehension, and demonstrate strong cross-domain generalization and robustness →read the paper
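To make the visual-memory idea concrete: with an explicit memory, "learning" is inserting rows, "unlearning" is deleting them, and prediction is a vote over retrieved neighbours. Below is a toy sketch in that spirit; the 1/rank vote weight is an illustrative stand-in, and the paper's actual RankVoting scheme is defined precisely there.

```python
from collections import defaultdict

# Toy classification with an explicit visual memory: embeddings live in
# a plain database, and prediction is a rank-weighted vote over nearest
# neighbours. Adding or removing knowledge is just inserting or deleting
# rows -- no retraining.

memory = [  # (embedding, label) rows; 2-D toy embeddings
    ((0.0, 0.1), "cat"), ((0.1, 0.0), "cat"),
    ((0.9, 1.0), "dog"), ((1.0, 0.9), "dog"), ((0.2, 0.2), "cat"),
]

def classify(query, k=3):
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    neighbours = sorted(memory, key=lambda row: dist(query, row[0]))[:k]
    votes = defaultdict(float)
    for rank, (_, label) in enumerate(neighbours, start=1):
        votes[label] += 1.0 / rank  # illustrative rank-weighted vote
    return max(votes, key=votes.get)

print(classify((0.05, 0.05)))        # "cat"
memory.append(((0.05, 0.06), "fox"))  # "lifelong learning": add one row
```

After the single appended row, the same query region is classified as "fox" without touching any model weights, which is the property the paper exploits at billion-row scale.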
|
Innovative Techniques in Model Design and Application |
Layerwise Recurrent Router for Mixture-of-Experts – introduces a new approach to enhance routing in large models by sharing routing information across layers, improving both efficiency and performance →read the paper
Solving a Rubik's Cube Using Its Local Graph Structure – proposes a novel method for solving the Rubik's Cube by modeling it as a graph, enhancing search efficiency while reducing the solution length →read the paper
|
Innovations in Model Training and Efficiency |
How to Prune and Distill Llama-3.1 8B to an NVIDIA Llama-3.1-Minitron 4B Model – demonstrates a method to effectively reduce LLM sizes through pruning and distillation, improving performance while cutting compute costs →read the paper
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm – introduces a novel iterative self-enhancement approach for LLMs, enabling continuous self-alignment and significant performance improvements using minimal external signals →read the paper
|
Understanding Model Training and Tuning |
Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models – investigates the interaction between pre-training and fine-tuning in LLMs, revealing insights into how these processes impact performance and task retention →read the paper
Can Large Language Models Understand Symbolic Graphics Programs? – evaluates LLMs' ability to understand symbolic graphics programs, introducing a new benchmark and technique to improve comprehension of these programs →read the paper
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents – proposes a framework that integrates the diverse strengths of software engineering agents, significantly improving problem-solving capabilities →read the paper
|
Leave a review! |
|
Please send this newsletter to your colleagues if it can help them enhance their understanding of AI and stay ahead of the curve. You will get a 1-month subscription! |
|
|
|