In this issue:
- Meta's Open-Source AI Economics, Re-Revisited—It's easy to articulate why Meta can build a powerful open-weight model. It's harder to explain why that's what they've chosen to do, though the company has taken a stab at it.
- The Quantum Gambit—If the core business isn't doing well, spin off something more exciting.
- Monetizing Volatility—The return of convertible bond issuance.
- Banking and Credit—Buy Now, Pay Later—but don't Pay Later with a Chase credit card.
- How Long is the Long Game?—Why Twitch is a tough business.
- Sticky Expectations—Japan got inflation, at last. It's not working as expected.
Today's issue of The Diff brought to you by our sponsor, Warp.
We talked a few weeks ago about Meta's overdetermined AI economics ($): they bought more GPUs than they turned out to need for targeting Reels, and put those GPUs to work training models instead. Now, Meta, too, is talking about the economics of mostly open-source AI.
A lot of what Meta emphasizes in their paper on the Llama 3.1 release is that 1) training gets more operationally challenging at scale, and 2) this is not because there's some single specific problem that scales with total FLOPs, just a whole range of narrow special cases. At one point in the paper they provide a statistical breakdown of everything that caused training to be interrupted: eighteen separate root causes, with about 60% of the problems coming from GPU-specific hardware failures, ~20% from miscellaneous hardware problems, and software and host maintenance making up the rest.
Meta has some interesting advantages that help explain why they're able to ship an open-weights model whose performance is in the same ballpark as the biggest closed models. They and Google have both made plenty of preexisting investments in evaluating the quality of content for ranking purposes, and both companies have spent a long time identifying the kinds of content that they don't want showing up at all, regardless of how high its production values are. But Meta probably has more experience than any other entity in the world in the specific task of taking a chunk of user-generated content and instantly determining whether it should be at the top of a feed or should never be visible to anyone who isn't actively looking for it. That means Meta's relative advantage is strongest in scooping up huge volumes of long-tail material, filtering out everything irrelevant, and keeping the valuable training tokens.
They also have some interesting remarks about synthetic data. There's been some academic research circulating recently arguing that training models on the outputs of other models eventually leads to nonsense. But demonstrating that this can happen is hardly enough to demonstrate that synthetic data can never work, in the same way that pushing the Wright Flyer off a cliff doesn't tell you much about the possibility of heavier-than-air flight. The Meta approach was to use it in particular contexts where there's both a narrow scope and room for quality control, like transpiling code from a popular language with a larger corpus to a newer language, or writing new code in general. They also wrote tests for the code, which created a feedback loop: start with a programming question, generate an answer in code, run the code through a parser and linter, generate tests, and then run those tests and see if they fail. If so, GOTO 10 is a reasonable next guess—the randomness of LLM outputs means that you can converge on some good answers by brute force, especially when your tests are giving detailed feedback, and human intervention is expensive.
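A minimal sketch of that loop, in Python. Every name here (the `llm()` wrapper, the helper functions, the retry cap) is invented for illustration and is not taken from Meta's paper, but the structure is the one described above: generate code, check it statically, generate tests, run them, and feed failures back into the next attempt.

```python
import ast
import subprocess
import sys
import tempfile

def llm(prompt: str) -> str:
    """Stand-in for a real model call; plug in whatever client you use."""
    raise NotImplementedError("swap in your model endpoint here")

def passes_static_checks(code: str) -> bool:
    """Parse the candidate; a fuller pipeline would also run a linter here."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def run_tests(code: str, tests: str) -> subprocess.CompletedProcess:
    """Execute candidate plus generated tests in a subprocess, capturing output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + tests)
        path = f.name
    return subprocess.run([sys.executable, path],
                          capture_output=True, text=True, timeout=30)

def solve(question: str, max_attempts: int = 5) -> str | None:
    feedback = ""
    for _ in range(max_attempts):
        code = llm(f"Write Python code to answer:\n{question}\n{feedback}")
        if not passes_static_checks(code):
            feedback = "Previous attempt did not parse; fix the syntax."
            continue
        tests = llm(f"Write unit tests (plain asserts) for this code:\n{code}")
        result = run_tests(code, tests)
        if result.returncode == 0:
            return code  # tests pass: keep this sample as training data
        feedback = f"Tests failed with:\n{result.stderr}"  # the 'GOTO 10' step
    return None  # give up on this question; a human can take a look
```

The useful property is that a failing test run produces concrete error output to feed into the next attempt, which is cheap feedback compared to having a human review each sample.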
And on the topic of expense, Meta's latest entrant is another datapoint in favor of continued high spending on model training. As models scale, there are more questions about how long AI scaling laws will hold, how dependent they are on an ultimately finite pool of human-generated tokens, whether or not it's really plausible that throwing more compute at the same data will keep leading to better results, etc. Llama 3.1 is not quite on the line of best fit for performance versus FLOP input—it's a little bit better.
As AI gets more mature, inference grows faster than training. This doesn't necessarily mean that the costs scale the same way, since that partly depends on the pace of hardware and software efficiencies, but in terms of total useful computing operations, it's what we should expect. There are a few different forces driving this:
- An obvious one is that the absolute cost of training a new model is rising (Meta trained this model on 16,000 H100s; at roughly $30,000 per GPU, that's about $480m worth of hardware alone). The higher that cost, the more carefully companies have to underwrite the investment in new models instead of tweaks to older ones.
- Meanwhile, each model has broader use cases, so there are more reasons to be satisfied with them. For some purposes—answering customer support questions that are contained in documentation, and detecting when to switch back to a human—legacy models do just fine.
- A subtle consequence of higher model skill is that the organizational cost of replacing models goes up as the wages of the workers being replaced rise. It's pretty straightforward to replace a job that could be done by an intern with a job that can be done by Claude; you just change which window you type your natural language instructions into, and expect to get results a few orders of magnitude faster. But it's harder to say which parts of the job of a lawyer, investment banker, marketer, etc., get changed, and to figure out what their new responsibilities are. A corollary to that is that, even holding the pace of new model launches constant, the dollar retention rate of models rises over time, and that effect gets stronger if the pace of new model releases slows down.
One way to conceptualize the open source bet is that there's more leverage for reducing costs on the software side than the hardware side. Specialized inference hardware is an investment that makes sense for either a) someone who already has a model they're charging for and wants to use this to increase margins and offer more competitive pricing, or b) for a very large-scale cloud host who can capture that upside themselves. But if the constraint is software, and if there are many execution tricks that work best in different domains, then an open source model will tend to be cheaper. (It also doesn't have to amortize the cost of that hardware R&D. As in so much of tech, your incentive is to spend as fast as humanly possible if you're likely to be the winner, and to pivot to cutting costs if you expect to lose, or think that winning will take longer than the rest of the industry recognizes.) That seems to be the meta-bet that Meta is making: if the big cost savings show up in software, then open source models win.
So there are good reasons to think Meta has a cost advantage in building a model, and debatable reasons to think that openness will extend those advantages. Which leaves the question: why bother? However much Meta can save by using GPUs it already paid for and moderation and translation data it's already collected, an even cheaper option is to not train a model at all.
They present this as a strategic decision, rather than one with a directly measurable financial payoff:
One of my formative experiences has been building our services constrained by what Apple will let us build on their platforms. Between the way they tax developers, the arbitrary rules they apply, and all the product innovations they block from shipping, it’s clear that Meta and many other companies would be freed up to build much better services for people if we could build the best versions of our products and competitors were not able to constrain what we could build. On a philosophical level, this is a major reason why I believe so strongly in building open ecosystems in AI and AR/VR for the next generation of computing.
One strategic angle for Meta, worth considering but not dwelling on, is spite. Or, more charitably: there might be a fairly finite amount of room for strategic maneuvering among big tech companies, and if they're doing something defensive in AI, they're not doing something offensive somewhere else. Meta has been on the receiving end of this before; some of Apple's privacy moves make a lot more sense as a way to impair the economics of one of the companies that earns the most on iOS.
There's also a political angle, somewhat related to competition. Lina Khan has nice things to say about open models, since they're harder to monopolize. Of course, a classic feature of tech economics is the race to commoditize one's complement, i.e. destroy the monopoly adjacent to your business in order to give your own monopoly that much more pricing power. It might be a healthy equilibrium for tech if each company is as likely to be the aggressor as defender in these situations, but it's important to realize that these are not purely selfless decisions. Meta has classic reasons to release a new open-source model—benefiting from external code contributions, getting developers to standardize on a platform they're familiar with, and recruiting talent. But, as this newsletter has argued before, they're the business with the most existential focus on sorting through arbitrary text, image, and video content and presenting users with a ranked list of exactly the kinds of things they want to see. A world where cranking out content is much cheaper is a world where good feed curation is that much more valuable. And that's a good world for Meta.
Disclosure: Long META
Still running payroll the old way? Warp is built for founders and startups who have less time on their hands. Warp bundles payroll, compliance, and benefits into a single dashboard, automating annoying registrations that come with hiring across states or offshore. Unlike bloated platforms pushing unnecessary features like applicant tracking systems and performance reviews for your employees, Warp is laser-focused on the essentials. Get in, get it done, get out. Say goodbye to endless payroll reminders and cryptic notices from tax departments. Reclaim your time and focus on what truly matters – growing your startup. Get started now at joinwarp.com/thediff.
Elsewhere
The Quantum Gambit
A few weeks ago, The Diff covered the odd story of Northern Data, which was rumored to be spinning off a US subsidiary worth $10-16bn while the parent company itself has a market value of under $900m and many red flags besides ($). This story about Honeywell planning an IPO for their quantum computing unit, at a $10bn valuation, isn't of the same degree (it's <10% of market cap rather than >1000%), but there are some similarities. Honeywell is not doing especially well, and the story came out the day after they lowered earnings guidance for the year. A sufficiently diversified company will always have some assets that are going through more of a hype cycle than others, and will often have opportunities to monetize that hype. But the long process of an IPO puts a lot of pressure on them to get the timing right, and Bloomberg characterizes their planned timing with "as soon as next year." If a company made a similar decision in 2021, they'd be IPOing their crypto mining operation or SaaS tool right into the bear market of 2022.
Monetizing Volatility
A roughly accurate way to think about financial structure is that bonds and loans monetize stability, converting some predictable future stream of cash flows from the company into liquidity that's available now. Equity monetizes growth, converting hypothetically higher earnings in the future into cash today. You might think that convertible bonds split the difference, by functioning as bonds when the company does poorly and stock when it does well, but given their embedded option, they're actually closer to a way for companies to monetize their volatility. And they are eager to do this right now ($, WSJ); year-to-date convertible bond issuance is just 20% below where it was in 2021, while US IPO issuance for the first half of this year was ~$18bn, down from $84bn in 2021. The most pleasant way to think about this is that stocks are cheaper than they were at the peak of the excesses a few years ago, and that companies don't want to give up their equity that easily. The less cheerful possibility is that whether or not valuations are high, companies recognize that uncertainty about their own future can turn into more certainty about their cash position, and they'll take that opportunity when it's offered.
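To make the "embedded option" point concrete, here's a toy decomposition in Python: treat the convertible as a straight bond floor plus a call option on the issuer's stock, and notice that only the option leg responds to volatility. All of the numbers and the Black-Scholes shortcut below are illustrative assumptions, not how convertible desks actually mark these instruments.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(spot, strike, vol, rate, years):
    """Black-Scholes value of a European call, a rough stand-in for the conversion option."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol**2) * years) / (vol * sqrt(years))
    d2 = d1 - vol * sqrt(years)
    return spot * norm_cdf(d1) - strike * exp(-rate * years) * norm_cdf(d2)

def toy_convertible_value(bond_floor, conversion_ratio, spot, conv_price, vol, rate, years):
    # Straight-debt component plus the value of the right to convert into shares.
    return bond_floor + conversion_ratio * bs_call(spot, conv_price, vol, rate, years)

# Same bond terms, same stock price, two volatility assumptions: the higher-vol
# issuer raises the same cash up front while selling an option worth more,
# which is the sense in which issuance monetizes volatility.
for vol in (0.30, 0.60):
    v = toy_convertible_value(bond_floor=850.0, conversion_ratio=10.0,
                              spot=100.0, conv_price=130.0,
                              vol=vol, rate=0.05, years=5.0)
    print(f"vol={vol:.0%}: toy convertible value ≈ ${v:,.0f}")
```

In this toy example, doubling the volatility assumption adds a few hundred dollars of value per bond without the issuer promising any larger cash flows, ignoring the real-world offset that higher volatility usually also means weaker credit and a lower bond floor.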
Banking and Credit
Hyman Minsky's model (discussed a while back in Capital Gains) holds that one driver of economic cycles is the self-referential nature of credit: when the amount of lending goes up, some borrowers will use their loans to pay off other loans, while other borrowers' spending will have the same effect indirectly. Banks have some sense of this, and like to know when they're granting credit in order to make purchases, and when they're implicitly becoming the junior lender because the money they lend goes to paying back some other debt. But the range of consumer debt options is always changing, and rules have to change with it: Chase will no longer allow customers to use their credit cards to pay down buy-now-pay-later loans. Borrowing to pay down other borrowings has a U-shaped relationship with financial standing: it's ubiquitous in corporate finance, and there are plenty of high net-worth people who use a line of credit or margin loans as a regular source of liquidity so they don't have to either reduce the scope of their investments or realize capital gains. But at the lower end of the credit spectrum, individuals and companies are engaging in the same behavior for a very different reason: running out of cash is an existential risk, but sometimes it becomes inevitable, and in that case the last lender can end up taking the biggest loss.
How Long is the Long Game?
Amazon acquired video game streaming site Twitch for $1bn in 2014, and Twitch is still losing money ($, WSJ). Twitch fits two categories that are notorious for their uneven results: first, it's a marketing company that targets the otherwise hard-to-reach demographic of young men—a group that takes an increasingly long time to start earning and spending money. On one hand, that can make them an undervalued asset that eventually produces high returns, especially if the ads help cement their brand preferences. But it also means that it's a bet that those brand preferences will form when they aren't consuming much and then stick around at a different life stage. Second, it's the kind of business that compensates executives partly through consumption, specifically the opportunity to lavishly entertain celebrities. That's a necessary feature of most talent-management businesses, but it means there's a unique morale challenge—Twitch is cutting costs, laying off employees, and sending its CEO to jet around Europe meeting with professional gamers. Each of those makes sense on its own, but together they make it hard to keep the rest of the workforce motivated. (And that, too, has explanatory power for the economics of the entertainment business: one way to compensate workers for the fact that they're tightening their belts while their boss's boss's boss is selecting a nice wine to go with an expense-account dinner is to just pay them more. And there go a few more points of margin.)
Disclosure: Long AMZN.
Sticky Expectations
Japan has been trying to increase its inflation for decades, and it's finally working, but less well than expected. Most Japanese consumers have experienced flat prices for most of their lives, so any meaningful inflation means that everyone experiences sticker shock. Consumer responses to inflation are hard to model, and there are symmetric dangers: get people used to an inflation rate that's higher than they'd like, and you lock in that same inflation rate because they shift some of their savings to buying consumer durables and bulk goods, at which point you've requisitioned huge amounts of household real estate and balance sheets to warehouse goods that aren't needed yet. But at the other end, if they treat every price increase as implicitly unfair, they keep their nominal spending constant and either trade down or consume less, defeating the whole purpose of getting inflation up to a mild but persistently positive level.
Diff Jobs
Companies in the Diff network are actively looking for talent. See a sampling of current open roles below:
- A company building ML-powered tools to accelerate developer productivity is looking for ML researchers with a knack for converting research into Github repos. (Washington DC area)
- Growing team at a multi-strategy firm is looking for senior quantitative researchers with direct experience or exposure to a relevant field (statistics, optimization, ML). $400k+. (NYC)
- A company reinventing the way Americans build wealth for the long run by enabling them to access "Universal Basic Capital" is looking for fullstack engineers with prior experience in fintech. (NYC)
- A well-funded early-stage startup founded by two SpaceX engineers is building the software stack for hardware companies, and is looking for an experienced hire who can head their marketing operations. 3+ years of experience in marketing for a sales-focused company ideal, especially with experience in hardware. (LA)
- A well-funded company building AI-powered compliance tools is looking for legal analysts who can help build and refine AI models. 2+ years of experience in a highly regulated industry, with compliance exposure. (NYC)
Even if you don't see an exact match for your skills and interests right now, we're happy to talk early so we can let you know if a good opportunity comes up. If you’re at a company that's looking for talent, we should talk! Diff Jobs works with companies across fintech, hard tech, consumer software, enterprise software, and other areas—any company where finding unusually effective people is a top priority.