🎙 Doug Downey/Semantic Scholar: Applying Cutting-Edge NLP at Scale
It’s so inspiring to learn from practitioners. The experience of researchers, engineers, and entrepreneurs doing real ML work is an excellent source of insight and inspiration. Share this interview if you find it enriching. No subscription is needed.

👤 Quick bio / Doug Downey
Doug Downey (DD): I received my Ph.D. in Computer Science and Engineering from the University of Washington in 2008, focusing on AI – specifically, information extraction from large-scale text. Since then, I have been a professor at Northwestern University but have been away from the university since 2019, working full-time at the Allen Institute for AI (AI2). I manage the research unit of AI2's Semantic Scholar group focused on machine learning, natural language processing, and human-computer interaction in service of Semantic Scholar's mission of accelerating scientific breakthroughs with AI.

🛠 ML Work
DD: Semantic Scholar aims to radically improve people's ability to identify and understand relevant research. In the past couple of years, we've rolled out new capabilities like automatically generated "TLDRs" for papers, adaptive recommendation feeds for staying up to date with recent research, and improvements to core capabilities like search and author disambiguation. The tool we're most excited about today is the Semantic Reader, which aims to revolutionize reading by making it more accessible and contextual. The Reader already provides a seamless way to look up references while reading without losing your place, and it is available for thousands of papers. We'll soon make it available for hundreds of thousands of papers, and we are exploring new features like assisted skimming, on-demand symbol and term definitions, and more.
DD: For TLDRs, interestingly, the long-form input doesn't change the problem too much. Our production model only uses the abstract, intro, and conclusion as input, which is not too long and tends to be sufficient for generating good TLDRs. But TLDRs only scratch the surface of what we might want to summarize from scientific papers. For example, say you're reading a paper, and it says, "we use the same experimental setup as reference 17." Wouldn't it be great if you could click on that statement and immediately get a concise summary of the relevant part of reference 17's experimental design? This is something we're working on. For that, we may need to model whole documents using a tool like Longformer, and even model multiple documents at a time, as in the cross-document language models that we introduced this past year.
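The input-construction step Doug describes – feeding only the abstract, intro, and conclusion to the summarizer – can be sketched in a few lines. This is a minimal illustration, assuming a simple word budget and section markers of my own choosing; it is not Semantic Scholar's actual preprocessing code or format.

```python
def build_tldr_input(abstract, intro, conclusion, max_words=1024):
    """Concatenate the sections the interview says the production TLDR
    model consumes (abstract, intro, conclusion), truncated to a word
    budget so the result fits a typical encoder's context window.
    The <section> markers and the budget are illustrative assumptions."""
    parts = []
    for name, text in [("abstract", abstract),
                       ("introduction", intro),
                       ("conclusion", conclusion)]:
        if text:  # skip sections the PDF extractor failed to find
            parts.append(f"<{name}> {text.strip()}")
    words = " ".join(parts).split()
    return " ".join(words[:max_words])
```

The truncated string would then go to a seq2seq summarization model; the point of the sketch is just that the "long-form input" problem largely disappears once only these three sections are kept.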
DD: The Semantic Scholar website and the Reader face a difficult and expensive extraction challenge as a first step. Given a PDF, we have to pull out all of the basic paper elements – title, authors, citations, equations, figures, and so on. We recently introduced a new technique for this task called VILA (for VIsual LAyout), which uses a simple intuition that in typical scientific paper layouts, semantically-related elements tend to appear in the same visual block on the page. Fairly simple models that encode this intuition are more efficient than previous work and still get high accuracy. We've released a version of those and plan to improve them further in 2022.
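VILA's core intuition – tokens in the same visual block share a semantic category – can be shown with a small sketch. This is an illustrative toy, assuming a token schema and a pluggable block classifier of my own invention, not VILA's actual interface or model.

```python
from collections import defaultdict

def classify_by_block(tokens, classify_block):
    """Group-level prediction in the spirit of VILA: every token in a
    visual layout block receives one shared label, so the classifier
    runs once per block rather than once per token.

    `tokens`: list of dicts with 'text' and 'block' keys (assumed schema).
    `classify_block`: maps a block's joined text to a label such as
    'title' or 'author' (a stand-in for the learned model)."""
    blocks = defaultdict(list)
    for i, tok in enumerate(tokens):
        blocks[tok["block"]].append(i)

    labels = [None] * len(tokens)
    for block_id, idxs in blocks.items():
        block_text = " ".join(tokens[i]["text"] for i in idxs)
        label = classify_block(block_text)
        for i in idxs:          # broadcast the block label to its tokens
            labels[i] = label
    return labels
```

Beyond efficiency, the block constraint also enforces consistency: a title block can never end up half-labeled as author text, which per-token models sometimes produce.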
DD: Few-shot learning techniques are very relevant to Semantic Scholar in settings like feeds, where users might give just a handful of example papers and want to get high-quality recommendations right away, or in domain transfer, where we might have a model built for one scientific domain like computer science and want to quickly adapt it to bioinformatics. To help support additional few-shot research by the community, we recently established a new benchmark, FLEX, that provides a standard and realistic challenge workload for few-shot NLP. One limitation of current few-shot learning work, including FLEX, is that it tends to focus on classification, and other settings like few-shot text generation and summarization are far less studied. We've recently demonstrated a few-shot summarization technique (PRIMER), but more work is needed in this direction.
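The feed scenario – a handful of example papers, immediate recommendations – can be sketched as a nearest-centroid ranker over paper embeddings. This is a deliberately simple sketch under my own assumptions (plain vectors standing in for embeddings from a paper encoder); it is not Semantic Scholar's actual feed algorithm.

```python
import math

def recommend(example_vecs, candidates, k=3):
    """Few-shot feed sketch: average the user's few example-paper
    embeddings into a centroid, then rank candidate papers by cosine
    similarity to it. `candidates` maps paper id -> embedding; both the
    data shapes and the scoring rule are illustrative assumptions."""
    dim = len(example_vecs[0])
    centroid = [sum(v[i] for v in example_vecs) / len(example_vecs)
                for i in range(dim)]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    ranked = sorted(candidates.items(),
                    key=lambda kv: cosine(centroid, kv[1]),
                    reverse=True)
    return [paper_id for paper_id, _ in ranked[:k]]
```

With only a handful of examples there is nothing to fine-tune, which is exactly why few-shot benchmarks like FLEX matter: the interesting research question is how far beyond this kind of similarity baseline a method can get from so little signal.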
DD: Great question. We'd like to get to a point where our systems understand science well enough to verify a claim in one paper by reading other papers, or suggest the best tools for a given task, or recommend new hypotheses for human scientists to investigate. Reaching that level of reasoning requires not just scientific knowledge but also vast commonsense knowledge. Semantic Scholar has collaborated with the MOSAIC team here at AI2 on a variety of commonsense challenge tasks, including ones that are critical for science like abductive reasoning (reasoning to the best explanation). Recent years have shown that large-scale language modeling approaches, especially when trained on large and varied commonsense datasets, can perform fairly well on the commonsense question-answering benchmarks that we devise. But it's not obvious how to convert performance on those constructed QA tasks into a successful real-life application, like a scientific assistant that suggests hypotheses. I hope we can work more with MOSAIC in this direction.

💥 Miscellaneous – a set of rapid-fire questions
The "surprise test" paradox.
Weapons of Math Destruction by Cathy O'Neil, although my own views on predictive modeling are more optimistic.
It has flaws, but yes it's still relevant. In particular, the value of an interactive test is underappreciated by today's benchmark-focused AI research. To really evaluate an AI system, you have to interrogate it – not just test it on a fixed data set.
Seems like not, and it would be so great to understand exactly why.