Good morning. I saw The Killers for the first time this weekend. Wow. Just wow.

If we’ve got any Killers fans in the room, let me know your favorite songs/albums. I’ll go first: A Dustland Fairytale.

Anyway, Gavin Newsom vetoed SB 1047.

— Ian Krietzberg, Editor-in-Chief, The Deep View

In today’s newsletter:
MBZUAI Research: Detecting questionable (funny) content online

Source: Created with AI by The Deep View
The world of online content moderation seems ripe for AI-based assistance. But the nature of some types of questionable content makes it difficult for AI to be of much help.

The content in question here is “comic mischief content”: potentially objectionable content that is combined with humor, something that throws off an algorithm’s ability to accurately differentiate what might be objectionable from what might not be.

Researchers at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) recently designed an approach that works to mitigate the problem.

The details: The researchers presented HICCAP, a system that takes a multimodal approach to the detection and classification of “comic mischief” content.

The researchers also introduced a dataset designed to train models on the task, collected from a mix of publicly available videos, including from YouTube. The team found that their multimodal approach outperformed existing methods of comic mischief labeling.
Why it matters: “Detecting comic mischief in videos is a feasible task for multimodal learning,” the researchers said, adding that there is still plenty of room for improvement on this specific task.

To learn more about MBZUAI’s research, visit their website.
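This piece doesn’t detail HICCAP’s internals, so as a rough illustration only, here is a minimal, hypothetical late-fusion classifier in PyTorch: precomputed embeddings for a clip’s video, audio and dialogue text are projected into a shared space, concatenated and scored against a set of comic-mischief labels. The dimensions, label count and fusion strategy here are all assumptions for the sketch, not details from the paper.

```python
import torch
import torch.nn as nn

class ComicMischiefClassifier(nn.Module):
    """Hypothetical late-fusion sketch; not the paper's actual HICCAP model."""

    def __init__(self, video_dim=512, audio_dim=128, text_dim=768,
                 hidden_dim=256, num_labels=4):
        super().__init__()
        # Project each modality's embedding into a shared hidden space.
        self.video_proj = nn.Linear(video_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        # Fuse by concatenation, then score each mischief category.
        self.classifier = nn.Sequential(
            nn.Linear(3 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_labels),
        )

    def forward(self, video_emb, audio_emb, text_emb):
        fused = torch.cat(
            [self.video_proj(video_emb),
             self.audio_proj(audio_emb),
             self.text_proj(text_emb)],
            dim=-1,
        )
        return self.classifier(fused)  # one logit per label

# One clip's (randomly stubbed) embeddings -> multi-label scores.
model = ComicMischiefClassifier()
logits = model(torch.randn(1, 512), torch.randn(1, 128), torch.randn(1, 768))
probs = torch.sigmoid(logits)  # multi-label: a clip can trigger several categories
```

Even in this toy version, the intuition behind going multimodal is visible: a joke’s audio track or subtitles can flip how the same visual frame should be labeled, which is exactly the signal a single-modality filter misses.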
Boost your software development skills with generative AI. Learn to write code faster, improve quality and join the 77% of learners who have reported career benefits, including new skills, increased pay and new job opportunities. Perfect for developers at all levels. Enroll now.*
The U.S. wants to triple nuclear power by 2050. America’s coal communities could provide a pathway (CNBC).

A stalled Waymo brought Kamala Harris’ motorcade to a halt in SF (SF Standard).

If your AI seems smarter, it's thanks to smarter human trainers (Reuters).

OpenAI reportedly plans to increase ChatGPT's price to $44 within five years (Engadget).

What I found on the secretive tropical island they don't want you to see (BBC).
If you want to get in front of an audience of 200,000+ developers, business leaders and tech enthusiasts, get in touch with us here.
University gets $6 million for AI research center

Source: NIST
The United States Department of Commerce’s National Institute of Standards and Technology (NIST) announced last week that it has awarded Carnegie Mellon University (CMU) a $6 million grant to establish an AI research center.

The details: The center, housed on CMU’s campus in Pittsburgh, is specifically designed for the research and evaluation of AI capabilities and tools.

The center’s goal is to advance the science of risk management within AI models and tools; it will focus on developing metrics, evaluation procedures and best practices to help developers build safe systems. The center will also advance NIST’s broader AI priorities, which include improving explainability, safety and accountability.
“This new cooperative research center will expand NIST’s knowledge base and fundamental research capacity in AI,” Under Secretary of Commerce for Standards and Technology and NIST Director Laurie E. Locascio said in a statement. “Through this partnership, we will strengthen our understanding of foundation models and support new research — and new researchers — in this rapidly evolving field.”
California Gov. signs AI transparency bill

Source: Gavin Newsom
California Gov. Gavin Newsom signed a roundup of around a hundred bills this weekend. Buried in the midst of that list was AB 2013, a bill that ensures a rather significant level of algorithmic transparency for generative artificial intelligence systems.

The details: The bill, introduced at the beginning of this year, requires that AI developers make documentation available, on or before Jan. 1, 2026, regarding the data used to train a given generative AI system.

It applies to all models released on or after Jan. 1, 2022, regardless of the terms of use of any specific system. If a system is available for use by Californians, it is covered by this law.

The documentation, which must be posted to the developer’s website, must include a high-level summary of the training data, covering:

- the sources or owners of the datasets
- a description of how each dataset furthers the model
- the number of data points included
- the types of data included
- whether the datasets include copyrighted or otherwise protected content
- whether the datasets include personal or consumer information
- the time period during which the data was collected
- how the datasets were cleaned and processed
- whether the system uses synthetic data
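To make that list concrete, here’s a purely hypothetical sketch of the kind of structured summary those requirements describe. The field names and example values are our own shorthand for illustration; the bill mandates what must be disclosed, not any particular format or schema.

```python
# Hypothetical AB 2013-style training-data summary as structured data.
# Field names and values are illustrative only; the bill prescribes
# the disclosure's content, not a schema.
training_data_summary = {
    "sources_or_owners": ["public web crawl", "licensed news archive"],
    "how_datasets_further_model": "broad text corpus for language modeling",
    "num_data_points": 1_000_000_000,  # approximate
    "data_types": ["text", "images"],
    "includes_copyrighted_content": True,
    "includes_personal_or_consumer_info": True,
    "collection_period": "2019-2023",
    "cleaning_and_processing": "deduplication and quality filtering",
    "uses_synthetic_data": False,
}
```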
You can read the full text of the bill here.

The requirements above won’t apply to generative AI models whose sole purpose involves security; models designed specifically for aircraft operation or military use will not be required to share their training data.

This, cognitive scientist and AI expert Gary Marcus said, means that AI developers can “no longer hide behind the BS phrase ‘publicly available.’”

At the same time, Newsom officially vetoed the highly contentious SB 1047, despite the public’s broad approval of the bill. Daniel Colson, the director of the Artificial Intelligence Policy Institute, called the move “misguided, reckless and out of step with the people he’s tasked with governing.”

Marcus is right: AB 2013 would certainly bring the obvious reality of copyright infringement into the open. But that reality is, in a sense, already out there. In trickles and floods, reports have surfaced over the past few months (and we’ve reported on many of them) that training data includes all sorts of copyrighted intellectual property: YouTube videos, art, articles, books, music and more. Though the firms themselves have declined to answer specific questions about the details and scope of their training data, it’s the worst-kept secret of the Valley that these models are trained on everything they can get their hands on.

As the issue is battled out in court across a number of contentious cases, the reality is that human learning and AI model training are two very, very different things. Generative AI models are statistical probability generators that cannot function without their training data; humans are complex creatures who blend inspiration, at a far, far smaller scale, with personal spirit and strife, out of an ancient need for connection. GenAI is no more than its training data; the unpaid creators of that training data are far more than their inspirations.

This law will not change the debate, and it will likely not change the course of the legal battles. But it will make it far easier for more creators to bring additional lawsuits, since it will become blatantly obvious which works have been scraped by which developers, something that could very well open the proverbial floodgates, especially if the cases currently in litigation start to go well for the artists.

As for SB 1047, the bill was flawed, though many of the talking points espoused by its opponents were simply untrue. Largely, I would say targeted approaches like AB 2013 are better; but 1047 sought to challenge the twisted incentives that have become enshrined in the Valley. At least at this stage, those incentives will continue unchecked and unabated.

Which image is real?
💭 A poll before you go

Here’s your view on AI text detectors:

42% of you don’t mess around with AI text detectors, and for good reason. 26% use detectors and haven’t noticed an issue yet; 15% said it’s hard to figure out when their students/employees are using generative AI; and 10% of you have been wrongly accused of using generative AI due to a faulty text detector.

No issues (yet …):

That’s the idea. Let us know if you’d be interested in a text version of our Real or AI. Practice, after all, makes perfect here.

Do you think you would use a system like HICCAP?

Thanks for reading today’s edition of The Deep View!

We’ll see you in the next one.
|
|
|