Last week was eventful, with several developments that are likely to significantly change how we communicate with computers and what we build with their help. But before we dive into all the news, let's watch this charming ‘gentleman’ sporting a small mustache and wearing a hat. I believe it exemplifies an excellent way to integrate robots and AI (in this case Boston Dynamics and ChatGPT) – serving as a thoughtful and extremely knowledgeable guide:
Making Chat (ro)Bots
|
|
Turing Post is a reader-supported publication. To receive new posts, get access to the archive, and support our work, consider becoming a paid subscriber →
|
|
To the main news: Paris on the AI map
Some time ago, anyone following the Twitter feeds of Yann LeCun (Chief AI Scientist at Meta), Clément Delangue (CEO of Hugging Face), Mistral AI, and other French-born AI leaders might have noticed increased posting and cross-posting highlighting the triumphs of French tech companies. Something was cooking there!
Indeed, the partnership between Meta, Hugging Face, and Scaleway at Paris's Station F, announced last week, symbolizes a significant shift in the tech landscape, challenging the Silicon Valley narrative. France is geared up to become the open-source AI capital. |
Yann LeCun's advocacy for open AI models at Meta echoes a broader commitment to ethical and transparent AI development. Hugging Face, with its billion-dollar presence, forges a unique path in the AI domain, championing open-source alternatives in a field dominated by tech behemoths. |
France's academic prowess, with institutions like Polytechnique and ENS, has nurtured a generation of AI talent, attracting global tech giants and enriching the French AI ecosystem. |
The French government's strategic role in fostering this sector is a deliberate move to propel France to the forefront of AI development. The "Make it iconic. Choose France" campaign captures this shift, marking France's transition from a historical and artistic powerhouse to a modern tech beacon. Yann LeCun is one of the icons, of course.
France’s approach to AI, with its emphasis on open-source development, presents a unique chapter in the global tech narrative. Yann LeCun and all three founders of Hugging Face – with their accomplishments and authority in the industry – are such a perfect combination that they might send France to the AI stratosphere. They bring a European attitude and awareness – a leaning towards ethics and collaboration – that provides a solid foundation for robust open-source development. Keep an eye on the French startup scene!
Additional read: Since we are talking geopolitics here, this blog post might be interesting: ‘Why China may stay permanently behind the US in generative AI’ by Interconnects (spoiler: Hugging Face is involved in this one too!)
News from The Usual Suspects © |
OpenAI’s GPTs |
Surprisingly, OpenAI also made a move towards open source and released large-v3, the latest version of its automatic speech recognition model Whisper, on GitHub. Whisper excels at transcribing varied content and supports subtitles through timestamps. Targeted at researchers, it processes audio in 30-second chunks for accurate text prediction (a short usage sketch follows below). OpenAI also open-sourced the Consistency Decoder, an improved drop-in decoder for Stable Diffusion's VAE that produces more stable image outputs, and announced the OpenAI Data Partnerships program, with the goal of creating open-source and private datasets for AI training.
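If you want to try the new checkpoint yourself, here is a minimal sketch, assuming the open-source openai-whisper package and an illustrative local audio file name:

```python
# A minimal sketch, assuming the open-source `openai-whisper` package
# (pip install -U openai-whisper) and an illustrative file "meeting.mp3".
import whisper

model = whisper.load_model("large-v3")   # the newly released checkpoint

# Whisper internally splits audio into 30-second chunks for prediction
result = model.transcribe("meeting.mp3")
print(result["text"])

# Segment-level timestamps are what make subtitle generation straightforward
for seg in result["segments"]:
    print(f"[{seg['start']:7.2f} -> {seg['end']:7.2f}] {seg['text'].strip()}")
```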
But these two releases, which happened during the historic DevDay, were outshone by what is now called ‘the app store moment for AI’: OpenAI introduced GPTs – customizable versions of ChatGPT – along with the new Assistants API, which makes it easier for developers to build their own assistive AI apps that have goals and can call models and tools.
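To make that concrete, here is a minimal sketch of the Assistants API flow; the assistant's name, instructions, and the exact model string are illustrative assumptions rather than details from the announcement:

```python
# A minimal sketch of the Assistants API flow (assistant name, instructions,
# and model string are illustrative assumptions).
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# An assistant has a goal (instructions) and tools it is allowed to call
assistant = client.beta.assistants.create(
    name="Data Helper",
    instructions="You help users with quick data questions and show your work.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview",
)

# Conversations live in threads; add a user message and start a run
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="What is the average of 3, 14 and 15?"
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)

# Runs execute asynchronously, so poll until the run finishes
while run.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# The assistant's reply is appended to the thread
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```

The key design change is that conversation state (the thread) lives on OpenAI's side, so developers no longer have to resend the full history on every call.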
The excitement became palpable even for those who weren't at DevDay and didn't see the keynote (some people highly praise the keynote itself: you can watch it here), and all AI newsletters and AI media rushed to publish. Mostly there were two types of related content: whether OpenAI's API will kill startups (and why it won't), and all sorts of guides on how to build your own personal or public GPT that eventually – when OpenAI introduces a dedicated marketplace – could make you a millionaire, or at least a star.
Investors’ side:
1. Madrona's take on the perennial question: how many AI startups has OpenAI killed? The opportunities they see:
Micro-Entrepreneurship: OpenAI's new features enable individuals to start businesses with minimal funding.
Open Source Advantage: Startups may benefit from building with open-source models, given GPT's closed-source limitations.
Security Opportunities: The rise of custom AI models creates a need for security and orchestration services.
Neutral Role for AI Startups: Startups can mix open and closed-source models, playing a critical middleware role.
OpenAI's Influence: OpenAI's advancements suggest its potential to be a generational company, enabling rather than hindering new startups.
|
2. Investors at a Reuters NEXT conference remain optimistic about AI startups despite OpenAI's expansion. They believe there's still significant room for innovation, particularly in developing consumer apps and addressing deep tech issues like brain-computer interfaces. These investors see the current phase as part of a long-term AI revolution, suggesting ample opportunities for novel products and formats in the AI space. |
Articles around building GPT apps: |
|
Runway's "Motion Brush" Upgrade |
Runway introduced the ability to animate specific areas of an image with a digital brush and enhanced video resolution to 2,816 x 1,536 pixels, surpassing Full-HD standards.
|
xAI's Latest Development: PromptIDE |
|
Samsung’s Gauss |
Samsung Research introduced a new model: Gauss. “The model consists of Samsung Gauss Language, Samsung Gauss Code and Samsung Gauss Image, and is named after Carl Friedrich Gauss, the legendary mathematician who established normal distribution theory, the backbone of machine learning and AI.”
|
Twitter Library |
Insights from Sasha Luccioni's TED Talk on the Impact of AI: Ethical AI, Environmental Impact, and Bias in Machine Learning → www.turingpost.com/p/ai-impact-ted-talk
|
|
|
Other news, categorized for your convenience |
Here is a categorized list of the research papers, each with a brief explanation: |
AGI Development and Evaluation: |
Levels of AGI: Proposes a framework by Google DeepMind for classifying AGI progress, focusing on capabilities and autonomy, and providing a common language for AGI assessment → the paper
Additional read: “If we’re going somewhere strange, we should guess at the map and the local flora and fauna” by Jack Clark (Anthropic)
|
Hallucination in LLMs: |
|
Efficient Model Serving and Acceleration: |
S-LoRA: Introduces a system for efficiently serving thousands of LoRA adapters, optimizing GPU memory and enhancing throughput for scalable model serving → the paper
LCM-LoRA: Combines Latent Consistency Models with LoRA for efficient text-to-image generation, significantly reducing memory requirements and inference steps → the paper (see the sketch below)
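As a rough illustration of the LCM-LoRA recipe, here is a minimal sketch using Hugging Face diffusers; the base model and adapter repository names are assumptions chosen for illustration:

```python
# A rough sketch of the LCM-LoRA recipe with Hugging Face diffusers; the base
# model and adapter repository names are assumptions chosen for illustration.
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swap in the latent-consistency scheduler and attach the LCM-LoRA adapter
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

# A handful of inference steps instead of the usual 25-50, with low guidance
image = pipe(
    "a watercolor sketch of the Eiffel Tower at dawn",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
image.save("eiffel_tower.png")
```

The appeal of the approach is that the distilled acceleration lives entirely in a small LoRA adapter, so it can be attached to an existing Stable Diffusion checkpoint without retraining the base model.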
|
Multimodal AI Systems, Models and Assistants: |
LLaVA-Plus: An end-to-end trained multimodal assistant with a skill repository, outperforming its predecessor in various tasks by effectively utilizing multimodal inputs → the paper
u-LLaVA: A unified multimodal framework to minimize hallucinations and interference in LLMs, integrating multiple expert models for task-specific performance → the paper
|
Large Multi-Modal Models: |
mPLUG-Owl2: Introduces a modularized network for multi-modal LLMs, focusing on modality collaboration and adaptability for diverse tasks → the paper
OtterHD: Evolved from Fuyu-8B, this model interprets high-resolution visuals and outperforms leading models in detecting small details and spatial relationships → the paper
GLaMM: A unique approach for generating natural language responses with object segmentation masks, suitable for grounded conversation generation → the paper
|
Innovations in Large Language Models: |
LUMOS: A novel framework for training language agents using open-source LLMs, excelling in complex question answering and interactive tasks → the paper
JARVIS-1: An open-world agent for dynamic environments, using a memory-augmented MLM for multimodal input processing and sophisticated task execution → the paper
Everything of Thoughts (XOT): Enhances LLMs' thought generation for problem-solving by integrating Monte Carlo Tree Search with pretrained reinforcement learning → the paper
Ziya2: A 13-billion-parameter LLM that highlights the importance of data-centric approaches in improving LLM performance without increasing model size → the paper
|
3D Model Generation and Reconstruction: |
|
Image-to-Video Synthesis: |
|
|
In other newsletters |
|
|
|
Thank you for reading; please feel free to share with your friends and colleagues. In the next couple of weeks, we will be announcing our referral program 🤍
|
Another week with fascinating innovations! We call this overview “Froth on the Daydream” – or simply, FOD. It’s a reference to the surrealistic and experimental novel by Boris Vian – after all, AI is experimental and feels quite surrealistic, and a lot of writing on this topic is just a froth on the daydream.
How was today's FOD? Please give us some constructive feedback.
|
|
|