Next Week in Turing Post:
- Wednesday, AI 101: What is LongRAG?
- Friday, Interview with Innovators: we discuss the impact of AI on search engines with ML experts from Yandex Search
If you like Turing Post, consider becoming a paid subscriber. You’ll immediately get full access to all our articles, investigations, and tech series →
The last week was marked by two very interesting research papers on synthetic data in AI, offering thought-provoking insights into the future of this technology. The first, "LLM See, LLM Do: Guiding Data Generation to Target Non-Differentiable Objectives," by Cohere researchers, explores how synthetic data can be used to fine-tune AI models. The second, "Scaling Synthetic Data Creation with 1,000,000,000 Personas," by Tencent AI Lab, unveils a colossal persona-driven framework for generating diverse and realistic synthetic data.

What if we could combine these approaches? Active inheritance lets us guide AI models toward desirable attributes, such as reduced toxicity and increased lexical diversity. Imagine layering this with the vast, varied personas from Persona Hub. One billion personas is no joke! Could we then create a new generation of AI – a new AI Nation – trained on data that's diverse, ethically sound, and highly functional?

The potential here is immense. Together, these papers suggest a future where AI models are not just trained but finely sculpted through sophisticated data generation techniques.

There are, of course, a few questions to consider. As we steer AI behavior through targeted data, how do we ensure we’re not embedding unintended biases? Data from one billion personas is massive – how do we manage it ethically and effectively? And how can we make sure this AI Nation doesn't simply inherit the biases of us humans?

Synthetic data is on the rise, and we still don’t know all the answers – or even the right questions to ask. The conversation around synthetic data in AI is just beginning; the promise is truly fascinating, and it's one we must approach with both enthusiasm and caution.
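To make the "combine these approaches" idea concrete, here is a minimal sketch of what persona-conditioned generation plus active-inheritance-style selection could look like. Everything here is an assumption for illustration: the personas, the `generate` placeholder (standing in for any LLM call), and the use of type-token ratio as the non-differentiable objective. The real papers use far richer personas and metrics.

```python
import random

# Illustrative personas in the spirit of Persona Hub (invented examples).
PERSONAS = [
    "a marine biologist who explains things with ocean analogies",
    "a retired air-traffic controller who values precision",
    "a high-school poetry teacher",
]

def lexical_diversity(text: str) -> float:
    """Type-token ratio: unique words / total words (a simple proxy metric)."""
    words = text.lower().split()
    return len(set(words)) / len(words) if words else 0.0

def generate(prompt: str, seed: int) -> str:
    """Placeholder for an LLM completion call; returns random filler text."""
    rng = random.Random(seed)
    vocab = ["data", "model", "ocean", "signal", "pattern", "tide", "learning"]
    return " ".join(rng.choice(vocab) for _ in range(20))

def persona_sample(instruction: str, n_candidates: int = 4) -> str:
    """Generate candidates under random personas, keep the most diverse one.
    Selecting by a non-differentiable metric is the 'active inheritance' step."""
    best, best_score = "", -1.0
    for i in range(n_candidates):
        persona = random.choice(PERSONAS)
        prompt = f"You are {persona}. {instruction}"
        candidate = generate(prompt, seed=i)
        score = lexical_diversity(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best

print(persona_sample("Explain overfitting in two sentences."))
```

Swapping the diversity metric for a toxicity classifier (keeping the *least* toxic candidate) would steer the synthetic dataset the other way, which is exactly the kind of targeted sculpting the Cohere paper describes.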
Click the link below so we can make some money on this ad 🙂 You might also like what they offer →

Ship AI projects faster, cleaner, and better with AE Studio

Don’t let limited talent resources or skill gaps be the reason your projects don't make it to the finish line. AE Studio's expert team of developers, data scientists, and designers can help accelerate your projects – without compromising clean code and infrastructure.

We've helped companies like Berkshire Hathaway, EVGo, and Ritual build and ship big ideas that changed the way they do business.

Have a project in the works or want to learn more about our work? Let's talk
10+ Research Papers to Learn More About Vision Language Models (VLMs)
A list of research papers for a better understanding of how VLMs work
www.turingpost.com/p/vlms-rp
News from The Usual Suspects ©

AI’s Financial Situation

Anthropic’s Safety Dance

Character.AI’s Love Triangle
Character.AI, the chatbot trendsetter, is flirting with Google and Meta as competition heats up. Once the darling of quirky AI interactions, it's now navigating partnerships and content controversies to stay in the game.
Apple's AI Adventure
Apple is joining forces with OpenAI, gaining an observer seat on its board. Phil Schiller will oversee this AI alliance, aiming to integrate ChatGPT into Apple devices and boost Siri’s smarts – all without spending a dime.
Stability AI’s Generous Diffusion

World Artificial Intelligence Conference (WAIC) in Shanghai
Despite U.S. restrictions, China’s AI firms continue to rival market leaders. As often happens, sanctions fuel innovation, and Chinese companies successfully develop workarounds to remain competitive. At WAIC, SenseTime unveiled SenseNova 5.5, claiming it outperforms GPT-4 in key metrics. Alibaba highlighted user growth for its Tongyi Qianwen models, which have over 20 million downloads. Both companies emphasize their commitment to open-source development amidst intense domestic competition in the AI sector.

Elon Musk is a frequent visitor to China. Tesla's Optimus humanoid robot made a splash at WAIC, though safely behind glass. Alongside it, 18 Chinese robotics firms showcased their bots, tackling high costs and U.S. tech restrictions with creative solutions. Discussions centered on how Chinese companies can innovate despite U.S. technology restrictions, focusing on areas like cloud computing and AI application development.
Kyutai’s Voice Revolution
Kyutai introduced Moshi, the first openly accessible voice-enabled AI, created by an 8-member team in just six months. Demonstrated in Paris, Moshi's code and model weights are free to all, pushing for open collaboration in AI. I liked the reaction of Hugging Face’s CTO Julien Chaumond the most:
In other newsletters/posts (a lot of thought-provoking pieces!):

The freshest research papers, categorized for your convenience

Optimization and Performance Enhancements

- MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention – Utilizes dynamic sparse attention patterns to speed up the pre-filling stage of long-context LLMs, significantly reducing inference latency while maintaining accuracy. Read the paper
- AGENTLESS: Demystifying LLM-based Software Engineering Agents – Simplifies LLM-based software development using a two-step process of localization and repair without autonomous tool usage, achieving high performance at low cost. Read the paper
- RouteLLM: Learning to Route LLMs with Preference Data – Optimizes cost and performance by dynamically selecting between strong and weak LLMs, reducing costs while maintaining response quality through data augmentation and human preference data. Read the paper
- LiteSearch: Efficacious Tree Search for LLM – Develops a novel tree search algorithm to improve LLMs' performance on mathematical reasoning tasks, reducing computational costs while maintaining competitive performance. Read the paper
- Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models – Proposes Expert-Specialized Fine-Tuning (ESFT) for sparse Mixture-of-Experts (MoE) architectures, tuning only the most relevant experts for a task, improving tuning efficiency and performance. Read the paper
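The routing idea behind RouteLLM is easy to picture with a toy sketch: score each query for difficulty, then send easy queries to a cheap model and hard ones to a strong one. Everything below is an assumption for illustration – the keyword-based scorer and the model-tier names are invented; the actual paper trains a router on human preference data.

```python
def difficulty(query: str) -> float:
    """Toy difficulty proxy: longer, proof-like queries score higher.
    A real router would use a classifier trained on preference data."""
    score = min(len(query.split()) / 50.0, 1.0)
    if any(k in query.lower() for k in ("prove", "derive", "multi-step")):
        score += 0.5
    return min(score, 1.0)

def route(query: str, threshold: float = 0.4) -> str:
    """Return which model tier should answer the query."""
    return "strong-llm" if difficulty(query) >= threshold else "weak-llm"

assert route("What is 2+2?") == "weak-llm"
assert route("Prove the convergence of gradient descent on convex functions") == "strong-llm"
```

The threshold is the cost/quality dial: raising it sends more traffic to the cheap model, which is exactly the trade-off the paper quantifies.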
Benchmarks and Evaluation

- TabReD: A Benchmark of Tabular Machine Learning in-the-Wild – Presents a benchmark collection of industry-grade tabular datasets with temporal splits, highlighting the performance of different architectures and the impact of time-based splits. Read the paper
- Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems – Proposes the SummHay task to evaluate LLMs and RAG systems on long-context summarization, highlighting models' challenges in precise citation and comprehensive coverage. Read the paper
- MIRAI: Evaluating LLM Agents for Event Forecasting – Develops a benchmark for assessing LLM agents' capabilities in predicting international events using the GDELT event database, highlighting the need for advanced temporal reasoning. Read the paper
- WE-MATH: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? – Introduces a benchmark for evaluating visual mathematical reasoning in LMMs, revealing significant struggles with insufficient knowledge despite advancements in generalization. Read the paper
Content Regulation, Alignment, and Safety

- UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI – Highlights that unlearning fails to prevent reintroduction of removed knowledge through in-context learning, emphasizing the need for robust content filtering mechanisms. Read the paper
- ProgressGym: Alignment with a Millennium of Moral Progress – Introduces a framework to align LLMs with human moral progress using historical texts and LLMs, offering benchmarks to track evolving values and address value lock-in risks in AI. Read the paper
- Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks – Proposes a method to defend against jailbreak attacks by unlearning harmful knowledge, significantly reducing attack success rates and demonstrating remarkable generalizability. Read the paper
- A False Sense of Safety: Unsafe Information Leakage in ‘Safe’ AI Responses – Explores limitations of current AI safety measures, introducing "inferential adversaries" to exploit seemingly safe outputs, emphasizing the need for new defense mechanisms. Read the paper
- Self-Evaluation as a Defense Against Adversarial Attacks on LLMs – Develops a defense mechanism using self-evaluation to reduce attack success rates, outperforming existing defenses and remaining robust even under adaptive attacks. Read the paper
Multimodal Models and Applications

- 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities – Trains a vision model on over twenty diverse modalities, enabling it to perform a wide range of tasks without performance loss, enhancing multimodal generation and retrieval. Read the paper
- Understanding alignment in multimodal LLMs: a comprehensive study – Explores alignment of responses in multimodal LLMs with image content, proposing Bias-Driven Hallucination Sampling (BDHS) and highlighting the benefits of combined offline and online methods. Read the paper
- ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning – Integrates LLMs with the Robot Operating System (ROS) to facilitate intuitive robot programming, incorporating feedback to refine tasks, demonstrating robustness and scalability. Read the paper
- STARK: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge – Introduces a large-scale multi-modal conversation dataset featuring diverse social personas and images, enabling the creation of advanced conversation models with superior visual imagination abilities. Read the paper
Advanced Techniques and New Models

- Chain-of-knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs – Enhances LLMs with knowledge reasoning abilities using knowledge graphs and a trial-and-error mechanism, improving general reasoning capabilities and addressing rule overfitting. Read the paper
- Learning to (Learn at Test Time): RNNs with Expressive Hidden States – Proposes Test-Time Training (TTT) layers, which update hidden states even during test sequences, demonstrating superior performance to Transformer and modern RNN baselines in long context scenarios. Read the paper
- E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS – Introduces a non-autoregressive zero-shot text-to-speech system with a simple architecture, achieving human-level naturalness and state-of-the-art speaker similarity and intelligibility. Read the paper
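The Test-Time Training idea is the most unusual of these: the hidden state is itself the parameter of a tiny model, updated by a gradient step of a self-supervised loss at every token, even at test time. Here is a deliberately scalar toy sketch of that mechanism (all of it an illustrative assumption – the paper's TTT layers use much richer inner models and losses):

```python
def ttt_scan(tokens, w=0.0, lr=0.1):
    """Process a sequence; the state w learns to predict each token from
    the previous one via online gradient descent on (w*prev - x)^2."""
    outputs, prev = [], 0.0
    for x in tokens:
        pred = w * prev                 # "read" from the learned state
        outputs.append(pred)
        grad = 2 * (pred - x) * prev    # d/dw of (w*prev - x)^2
        w -= lr * grad                  # test-time update of the state
        prev = x
    return outputs, w

# On a constant sequence, predictions improve as w adapts toward 1.0:
outs, w = ttt_scan([1.0, 1.0, 1.0, 1.0, 1.0])
```

Because the state update is a learning step rather than a fixed recurrence, the layer keeps adapting however long the sequence runs, which is the intuition behind its long-context gains.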
Long-Context and Retrieval Capabilities

- Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP – Argues that defining long-context NLP tasks by input length is insufficient, proposing a taxonomy to better evaluate and develop LLM capabilities in genuinely difficult long-context scenarios. Read the paper
- Show Less, Instruct More: Enriching Prompts with Definitions and Guidelines for Zero-Shot NER – Employs instruction-tuning with enriched prompts containing definitions and guidelines, significantly improving the model's ability to generalize to unseen entity types in NER tasks. Read the paper
Novel Architectures and Techniques

- Consistency Flow Matching: Defining Straight Flows with Velocity Consistency – Enhances flow matching in generative models by enforcing self-consistency in the velocity field, improving training efficiency and sample quality. Read the paper
- DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning – Improves LLM performance on complex math tasks by decomposing problems into logical subtasks and incorporating self-correction, demonstrating robust generalization capabilities. Read the paper
Please send this newsletter to your colleagues if it can help them enhance their understanding of AI and stay ahead of the curve. You will get a 1-month subscription!
Leave a review!