Good morning. Nvidia will close out Big Tech’s earnings season on Wednesday, a report that, as always, promises to be highly consequential to the AI trade.

Wall Street hasn’t treated Nvidia’s AI-related peers too kindly following their own earnings reports; we’ll see if Nvidia will get the same treatment.

This is where the fun begins.

— Ian Krietzberg, Editor-in-Chief, The Deep View

In today’s newsletter:

⚕️ AI for Good: Generating DNA
💻 DeepSeek doubles down on open source
🏛️ Unredacted documents show Meta’s thought process around training AI on copyrighted materials
🤖 Humanoid robots for the home
AI for Good: Generating DNA

Source: Unsplash
The news: A team of scientists last week released Evo 2, an open-source generative AI model that they say is the largest-ever AI model for biology.

The details: The model was trained on data that, according to Stanford, “includes all known living species,” ranging from humans to plants, animals, bacteria and extinct species. The dataset includes around nine trillion nucleotides (nucleotides are the building blocks of DNA and RNA).

The first version of Evo was released last year, though its training set only included simpler life forms (no animals or plants) and a total of around 300 billion nucleotides. Stanford’s Brian Hie, one of the co-leaders of the effort, said that Evo 2 essentially acts like ChatGPT, but for DNA: “if you want to design a new gene, you prompt the model with the beginning of a gene sequence of base pairs, and Evo 2 will autocomplete the gene.”
Why it matters: The potential significance of Evo 2 is that it can be prompted to write new genetic code “that has never existed before.” Scientists can use it to study and extrapolate the impacts of specific genetic mutations.

“Essentially, Evo 2 is speeding up evolution, providing promising new genetic paths for us to explore,” Hie said, adding that he hopes the model will someday have clinical significance. “Evo 2 could help predict which mutations lead to pathogenicity and disease. Everyone has random mutations in their DNA and, mostly, they’re harmless. But on rare occasions, they’ll cause cancer or other disease. The model is actually very good at distinguishing which mutations are just random, harmless variations and which cause disease.”
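Hie’s “autocomplete for DNA” framing maps directly onto ordinary language-model decoding, just over a four-letter alphabet. As a toy illustration only (this is not Evo 2’s actual architecture, and the training strings, function names and parameters below are invented for the sketch), here is a minimal k-mer Markov model that extends a nucleotide prompt one base at a time, the same way a genomic language model autocompletes a gene sequence:

```python
import random
from collections import defaultdict, Counter

def train_markov(sequences, k=3):
    """Count which nucleotide follows each k-mer in the training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[seq[i:i + k]][seq[i + k]] += 1
    return counts

def autocomplete(counts, prompt, length, k=3, rng=None):
    """Extend a prompt one base at a time, sampling from the learned counts."""
    rng = rng or random.Random(0)
    seq = prompt
    for _ in range(length):
        options = counts.get(seq[-k:])
        if not options:
            break  # unseen context: stop generating
        bases, weights = zip(*options.items())
        seq += rng.choices(bases, weights=weights)[0]
    return seq

# Tiny made-up "genome" fragments stand in for Evo 2's nine trillion nucleotides.
training = ["ATGCGTACGTTAGCATGCGTACGTA", "ATGCGTTAGCATGCGTACGT"]
model = train_markov(training, k=3)
print(autocomplete(model, "ATGCG", length=10))
```

The real model replaces the k-mer counts with a large neural network, but the interaction pattern the article describes — prompt with the beginning of a sequence, sample a plausible continuation — is the same.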
Outperform every benchmark

Turing specializes in post-training solutions for AI labs and leading companies, including model evaluation, SFT, RLHF, and DPO. Their expertise improves LLM performance in reasoning, coding, and multimodal tasks.

Use their free, 5-minute assessment to:

Pinpoint your goals and priorities in GenAI development
Receive tailored insights to refine your model strategy
Learn how Turing supports benchmarking and optimization

Start your assessment now
DeepSeek doubles down on open source

Source: Unsplash
The news: Just a few weeks after massively disrupting the AI industry with a high-performing ‘reasoning’ model that was freely accessible under an MIT license, DeepSeek has promised to crack its models open a little wider.

The details: The Chinese firm wrote on social media last week that it plans to release five open-source repos, “sharing our small but sincere progress with full transparency.”

“These humble building blocks in our online service have been documented, deployed and battle-tested in production,” DeepSeek wrote. “As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey.”

Though DeepSeek’s R1 — like a litany of major models, including Meta’s Llama and Google’s Gemma — was released with open weights (which means developers can easily fine-tune the model), the model’s training data and source code were not released. For it to be truly open source, DeepSeek would have to release both. It’s not clear how “open” the company plans to get; it has said on a GitHub page only that it plans to release the “code that moved our tiny moonshot forward.”
“Daily unlocks are coming soon,” DeepSeek wrote. “No ivory towers — just pure garage energy and community-driven innovation.”

The landscape: The pending release is seemingly being presented as a purposeful means of distinguishing DeepSeek from the entrenched competition, chiefly OpenAI, which has made a name for itself by being the very opposite of “open” — it has never revealed the weights, source code or training data for any of the models it has released in the past two years.

Sam Altman said on Reddit earlier this month that OpenAI might have been “on the wrong side of history here,” and that the company needs “to figure out a different open source strategy.”

Fully open-source models enable better scientific research and broader scientific access, a boon to the advancement of the underlying technology.
Apple has refused to comply with the UK government’s order to build a backdoor to access user data. Instead, Apple is removing its end-to-end encryption option for British users.

Artificial intelligence, due to its main capability of recognizing patterns from large training sets, functions much like a mirror, something that makes the design of ‘fair’ AI an intensely difficult challenge.
Hackers steal $1.5 billion from exchange Bybit in biggest-ever crypto heist (CNBC).
Tesla recalling more than 375,000 vehicles due to power steering issue (AP).
Corporations dig deeper: using bunkers to secure data — and their CEOs (Semafor).
Musk threatens FBI agents and air traffic controllers with forced resignation if they don’t respond to an email (Wired).
The man trying to capture the internet before it disappears (Vox).
|
Unredacted documents show Meta’s thought process around training AI on copyrighted materials

Source: Meta
The news: Newly unredacted court documents in an ongoing copyright lawsuit against Meta revealed a series of internal conversations showing Meta’s knowing use of pirated, copyrighted books in training Llama.

A prior release of documents in January showed that CEO Mark Zuckerberg personally gave approval to Meta’s GenAI team to train on LibGen, a self-described ‘shadow library’ that provides access to pirated material. LibGen has been sued, fined and ordered to shut down in numerous cases.

The new set of documents was submitted as the latest entry in Kadrey v. Meta, a copyright lawsuit that includes author Sarah Silverman as a plaintiff. Meta, like its developer peers, has argued that its use of copyrighted material to train generative AI is protected by the ‘fair use’ defense, a position that seems to be weakening by the day.

One of the new exhibits showcases a conversation in which Meta engineers discussed their frustrations with pursuing licensing deals for training material, with one researcher — Xavier Martinet — writing in a chat from February of 2023: “my opinion would be (in the line of ‘ask forgiveness, not for permission’): we try to acquire the books and escalate it to execs so they make the call.”
In a separate exhibit, one engineer wrote that it would not be “trivial to download LibGen if everything is in torrents,” which refers to a means of distributing copyrighted material among people who didn’t pay for it. That engineer linked to a Quora article titled: “what is the probability of getting arrested for using torrents in the USA.”

In an email making the case for why LibGen was needed to produce state-of-the-art results in Llama, Sony Theakanath, director of product management at Meta, offered up some “mitigations” to reduce Meta’s legal exposure. These included scrapping data “clearly marked as pirated/stolen” and tuning Llama to avoid “IP risky prompts … refuse to answer queries like: reproduce the first three pages of Harry Potter and the Sorcerer’s Stone.”

“It’s time to call it,” Jason Kint, the CEO of Digital Content Next, wrote. “AI is built on a house of cards of intellectual property violations starting with Facebook which is starting to look a lot like a crime scene as held back discovery documents begin to be compelled and unsealed in court.”
Humanoid robots for the home

Source: 1x
The news: Robotics startup 1x last week unveiled its latest humanoid robot — the NEO Gamma — which is designed specifically as a sort of robot butler for the home.

The details: According to the company, the Gamma model comes loaded with a visual manipulation model that enables it to pick up a wide variety of objects, in addition to a language model that enables natural language communication. It can also walk around and “sit in chairs,” though I’m not sure why a robot butler would need to take a load off.

Little is known about the models that power the robot, including training data, architectural approaches, safety mitigations, power requirements, energy sources and data protection measures. None of 1x’s claims have been independently verified.

An accompanying video shows the robot walking around a home, bringing a pot of coffee to its human masters, adjusting a painting on a wall, cleaning the windows and vacuuming (instead of a robot vacuum, like the Roomba, this is a robot using a vacuum. Mind-blowing).
The startup is currently continuing a private, early-access program that is bringing NEO robots to people’s homes. It’s not clear when 1x will begin mass production and distribution, or what the robot will cost when it does.

“There is a not-so-distant future where we all have our own robot helper at home, like Rosey the Robot or Baymax. But for humanoid robots to truly integrate into everyday life, they must be developed alongside humans, not in isolation,” CEO Bernt Børnich said, adding that deployments of the robot in and among the complexities of homes will enable the systems to “grow in intelligence and autonomy.”

But Eric Jang, a researcher at 1x, noted on Twitter that the video doesn’t actually showcase the robot’s autonomy. Jang said that the “shots in the video were done with upper body (teleoperation and) a lower body RL controller (with some variations in control authority for things like sitting, bending).” The company does not explicitly mention this in any of its official materials.

He added that, for the past several weeks, NEO robots have been “doing chores” around employees’ houses. Acknowledging the enormous privacy implications of having remote operators watching people through the robot’s cameras, Jang said that “we will share more on security & privacy as the NEO product becomes available to non-employee users.”
1x first began deliveries of an earlier, wheeled robot model to factory customers in 2022, but shifted its focus in 2023 to developing a robot for the home. Between March of 2023 and January of 2024, 1x raised a total of $125 million across two funding rounds from investors ranging from venture capitalists to OpenAI.

Teleoperated robots are not useless, especially if the hardware is highly functional. I can think of a number of dangerous scenarios where sending in a teleoperated robot makes plenty of sense — bomb disposal, fires, mines, repairs in the midst of natural disasters, etc.

What is frustrating is the recurring pitch of full autonomy, paired with futuristic videos of robot butler companions, where the details of these highly produced videos (and the systems they feature) remain opaque at best.

The hardware seems somewhat impressive, but a clear disclosure that the robot currently functions primarily through teleoperation would be a welcome step away from hype and toward reality.

It is really, really hard to build autonomous robots; since they interact with the physical world, they need a series of powerful sensors that enable them to operate in unstructured environments. Probabilistic models have been some help on this front, but we’re nowhere near the point where they are reliable or safe enough.

I get that it’s become attractive to sell science fiction, especially when there’s so much money riding on its success. But selling a highly marketed illusion doesn’t represent real progress.
💭 A poll before you go

Thanks for reading today’s edition of The Deep View!

We’ll see you in the next one.

Would you buy a NEO?

If you want to get in front of an audience of 200,000+ developers, business leaders and tech enthusiasts, get in touch with us here.
|
|
|