Data Science Weekly - Issue 409

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #409

September 23, 2021

Editor Picks
 
  • Tree Thinking
    Trees have long served as models of intellectual inquiry and as sites of religious and civic deliberation. Now, as we learn more about plant intelligence, they are inspiring deeper forms of ecological investigation...
 
 

A Message from this week's Sponsor:

 

 
Kickstart Your New Career with a Data Science & Analytics Bootcamp

Join an Online Flex Data Science & Analytics Bootcamp and work on your own schedule with on-demand lectures, while still getting dedicated 1:1 instructor support. You’ll also get focused career support until you’re hired. Ready to start your journey? Learn more about the Metis Online Flex Data Science & Analytics Bootcamps...

 

 

Data Science Articles & Videos

 
  • Robots Must Be Ephemeralized
    In this blog post, I outline why it is tempting for roboticists to iterate directly on real life, and how the difficulty of evaluating general-purpose robots will eventually force us to increasingly rely on offline evaluation techniques such as simulation...
  • Interview of Erik Bernhardsson - Former CTO @ Better.com
    Up until quite recently, I was the CTO of Better.com for six years, taking the eng team from 1 person to 300, and doing all sorts of “CTO stuff” – mostly recruiting, but also lots of technical stuff, occasionally writing code. Before Better, I was at Spotify for 6.5 years, initially running the (very nascent) data/BI team, then later managing the music recommendation team. I built the first version of the music rec system at Spotify...
  • A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning
    The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models...This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML) that explains these recent findings through a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain...
  • Image Encoders: BigTransfer vs CLIP
    I've been mucking around with building a meme search engine...To do so I’m testing a couple of different image encoders: a) Big Transfer encoder from Google and b) CLIP image encoder...In essence, these use a neural network to turn an image file into vector embeddings that can be compared for a similarity (“nearest neighbor”) search. Which one is best (at least for memes)? Let’s put them to the test. We’ll index 10,000 memes and compare...
  • An End-to-End Guide to Photogrammetry with Mobile Devices
    Constructing 3D models with photogrammetry allows journalists to share objects and environments with their audiences in a comprehensive, immersive way that can’t be achieved with photography or videography alone...Over the past several years, the R&D team at The Times has worked to simplify the production of photogrammetry-driven stories...This resource compiles what we've learned into a series of guides, demos and open-source software tools that we hope will aid anyone seeking to capture, process and deliver high-quality 3D models...
  • Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods
    Weak supervision is a popular method for building machine learning models without relying on ground truth annotations. Instead, it generates probabilistic training labels by estimating the accuracies of multiple noisy labeling sources (e.g., heuristics, crowd workers). Existing approaches use latent variable estimation to model the noisy sources, but these methods can be computationally expensive, scaling superlinearly in the data. In this work, we show that, for a class of latent variable models highly applicable to weak supervision, we can find a closed-form solution to model parameters, obviating the need for iterative solutions like stochastic gradient descent (SGD)...
  • Machine Learning Hyperparameter Optimization with Argo
    How the hyperparameters of our machine learning models are tuned at Canva...Canva uses a variety of machine learning (ML) models, such as recommender systems, information retrieval, attribution models, and natural language processing for various applications. A typical problem is the amount of time and engineering effort in choosing a set of optimal hyperparameters and configurations used to optimize a learning algorithm’s performance...
  • Is BI dead? On dismantling data's ship of Theseus
    Over the last decade, many of the early BI functions have been stripped out of BI and relaunched as independent products...The splinter of the modern data stack that we call BI is diminished, but mostly unchanged. It’s as though we took our definition of BI from twenty years ago and started crossing off clauses, until we’re left with “visualization and reporting.”...BI tools should aspire to do one thing, and do it completely: They should be the universal tool for people to consume and make sense of data. If you—an analyst, an executive, or any person in between—have a question about data, your BI tool should have the answer...
  • Scaling TensorFlow to 300 million predictions per second
    We present the process of transitioning machine learning models to the TensorFlow framework at a large scale in an online advertising ecosystem. In this talk we address the key challenges we faced and describe how we successfully tackled them; notably, implementing the models in TF and serving them efficiently with low latency using various optimization techniques...
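
For readers new to the embedding comparison mentioned in the "Image Encoders: BigTransfer vs CLIP" item above, here is a minimal sketch of a nearest-neighbor search over vectors. It is not the linked author's code: the function names, the meme names and the tiny 3-dimensional vectors are all made up for illustration, and a real encoder would produce much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=2):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical "embeddings" standing in for encoder output.
index = {
    "cat_meme": [0.9, 0.1, 0.0],
    "dog_meme": [0.1, 0.9, 0.0],
    "grumpy_cat": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

results = nearest_neighbors(query, index, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

This brute-force scan compares the query against every stored vector, which is fine for a few thousand items; larger collections usually rely on an approximate nearest-neighbor index instead.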
 
 

Summit*

 

 
Join Impact 2021 on November 3, 2021: The First-Ever Data Observability Summit. Join Today's Leading Data Pioneers.

Hear from data leaders pioneering the technologies & processes shaping data engineering. Featuring the first Chief Data Scientist of the U.S., the founder of Data Mesh, and many more! Get Your Free Ticket ...

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
 

 

Jobs

 
  • Senior Data Scientist - TikTok - LA

    TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy by offering a home for creative expression and an experience that is genuine, joyful, and positive.
    • Generate useful features from large amounts of data
    • Apply supervised and unsupervised machine learning techniques, such as linear and logistic regression, decision trees, and k-means clustering
    • Develop segmentation models, classification models, propensity models, LTV models, experimental design, optimization models
    • Perform statistical analysis such as KPI deep dives, performance marketing efficiency, behavioral clustering, and user journey analytics
    • Curate audiences and inform engagement tactics to enable differentiated, relevant marketing touches across channels (social, email, in app, push)
    • Synthesize analytics and statistical approaches into easy-to-consume storylines, both visually and verbally, and provide indicated actions for executive audiences
    • Capture business requirements for data and analytic solutions and collaborate cross-functionally to ensure business requirements align with business needs
    • Analyze creatives and surface insights that will help drive engagement and retention
    • Support day-to-day collaboration with performance marketing to communicate insights and recommend data informed strategies

        Want to post a job here? Email us for details >> team@datascienceweekly.org
 

 

Training & Resources

 
  • River: Online machine learning in Python
    River is a Python library for online machine learning. It is the result of a merger between creme and scikit-multiflow. River's ambition is to be the go-to library for doing machine learning on streaming data...
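
To give a feel for what "machine learning on streaming data" means, here is a small standard-library sketch of online learning, where the model is updated one observation at a time instead of being fit on a full dataset. This is not River's code: the class below is a hypothetical illustration, its method names are chosen only to suggest a one-sample-at-a-time style, and the data stream is made up.

```python
import math


class OnlineLogisticRegression:
    """Hypothetical logistic regression trained by one SGD step per sample."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight, created lazily
        self.bias = 0.0

    def predict_proba_one(self, x):
        """Probability that the label is 1 for a single feature dict."""
        z = self.bias + sum(self.weights.get(k, 0.0) * v for k, v in x.items())
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        """Update the weights from a single (features, label) pair."""
        error = self.predict_proba_one(x) - y
        for k, v in x.items():
            self.weights[k] = self.weights.get(k, 0.0) - self.learning_rate * error * v
        self.bias -= self.learning_rate * error
        return self


# Made-up stream: positive "x" means label 1, negative "x" means label 0.
stream = [({"x": 2.0}, 1), ({"x": -2.0}, 0)] * 50

model = OnlineLogisticRegression()
correct = 0
for features, label in stream:
    # Predict first, then learn, so every sample is scored before it is seen.
    predicted = model.predict_proba_one(features) >= 0.5
    correct += int(predicted == bool(label))
    model.learn_one(features, label)

print(f"accuracy so far: {correct / len(stream):.2f}")
```

Scoring each sample before learning from it gives a running accuracy estimate without a separate held-out set, which suits data that arrives continuously.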
 
 

Books

 

  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them on board :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
