Data Science Weekly - Issue 474

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #474

December 22 2022

Editor's Picks

 
  • Everything I learned about accidentally running a successful tech conference
    On December 15, 2022, the first and only Normconf, the tech conference about all the stuff that matters in data and machine learning but doesn’t get the spotlight, happened...I do want to offer some takeaways for people who are thinking about how to run online conferences and communities and create vibrant and happy online spaces...
  • What did you screw up this year? [Reddit Discussion]
    I will go first. Yesterday I deployed my spark script and it overwrote the table I been loading for a while...Just started a new project this week and this project use older version of Spark. It was supposed to only overwrite the partition but instead, the whole table was wiped...Who knows that dynamic partition overwrite only work after spark 2.3...Now I am spending part of the weekend rebuilding the table...
  • What Building "Copilot for X" Really Takes
    This comes from the Codeium team, who I’ve had the good fortune of getting to know in the past year. This small team blew away my expectations by creating a complete Copilot clone in one month (not exaggerating - I have disabled GitHub Copilot completely to use them, in part to beta-test Hey Github, but also Codeium seems a bit faster!). Since “Copilot for X” is at the top of every AI product thinker’s wishlist, I invited them to share their learnings!...To build a “Copilot for X”, you must...
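For the Spark mishap in the Reddit thread above, one cheap safeguard is a version check before an overwrite job runs. The following is a minimal sketch, assuming (as the thread says) that dynamic partition overwrite is only available from Spark 2.3 onward. The helper names are hypothetical and not part of Spark.

```python
import re

# Assumption: dynamic partition overwrite needs Spark 2.3 or later.
MIN_DYNAMIC_OVERWRITE = (2, 3)


def parse_major_minor(version):
    """Extract (major, minor) from strings such as '2.4.8' or '3.3.1-amzn-0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised Spark version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_dynamic_overwrite(version):
    """Return True when the given Spark version can overwrite single partitions."""
    return parse_major_minor(version) >= MIN_DYNAMIC_OVERWRITE


def check_overwrite_is_safe(version):
    """Hypothetical pre-deploy guard: fail loudly instead of wiping the table."""
    if not supports_dynamic_overwrite(version):
        raise RuntimeError(
            f"Spark {version} cannot overwrite partitions dynamically; "
            "an overwrite would replace the whole table."
        )
```

In a real job the version string would typically come from `spark.version`, and on a supported version the session would still need `spark.sql.sources.partitionOverwriteMode` set to `dynamic` before writing.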


 

A Message from this week's Sponsor:

 



Ilum the Spark cluster manager and monitoring tool

With Ilum's solution, everyone can now quickly and easily deploy Apache Spark on any Kubernetes cluster. Our software eliminates the need for tedious configuration and reduces the time needed for deployment from days to minutes. By leveraging the power of container orchestration and Apache Spark's scalability and reliability, we are making it easier than ever to stay ahead of the curve and explore the future of Big Data.

Ilum provides an all-in-one solution for:
  • Data Science on Kubernetes
  • Hadoop replacement
  • Apache Livy alternative
  • Integration with Jupyter and Apache Zeppelin
It's free! Unlock the power of Big Data today with Ilum.

Learn more about Ilum!



 

Data Science Articles & Videos

 
  • Why Business Data Science Irritates Me
    Data Science is a pretty solid career compared to most others where you can earn a lot of money. This post is about my gripes as an industry insider, not intended to discourage anyone from pursuing it as a career...I resonated with ryx's post so much that I felt compelled to write my reasons for being frustrated with industry data science, and what I think you can practically do about it if you feel the same way...
  • Build a GPT-3 app: How I used GPT-3 to make gifting easier
    I built Gifthub...Gifthub is a gift ideas generator powered by GPT-3. Now, instead of asking someone for advice, I can answer a few questions regarding the person's interests, hobbies, gift type, and get a bunch of unique gift recommendations...
  • ImageNet-X - Understanding model mistakes with human annotations of ImageNet
    ImageNet-X is a set of human annotations pinpointing failure types for the popular ImageNet dataset. ImageNet-X labels distinguishing object factors such as pose, size, color, lighting, occlusions, co-occurrences, etc. for each image in the validation set and a random subset of 12,000 training samples...
  • What I learned from NormConf 2022
    For busy people who didn't yet get the chance to watch all the excellent @normconf talks, here's a written summary of my lessons learned...
  • Univariate Analysis — Intro and Implementation
    As a data scientist, what is the first step you do when you receive a new and unfamiliar set of data? Well, we start familiarizing ourselves with the data. This post focuses on answering that question by analyzing only one variable at a time, which is called a univariate analysis...
  • Dimensionality Reduction for Linearly Inseparable Data
    Standard PCA works well with linearly separable data in which the different classes can be clearly separated by drawing a straight line (in the case of 2D data) or a hyperplane (in the case of 3D and higher dimensional data)...Standard PCA will not work well with linearly inseparable data in which the different classes...can only be separated by using a curved decision boundary...For nonlinear dimensionality reduction, we can use the kernel PCA which is the non-linear form of the standard PCA...
  • Agree not to Disagree
    About a year ago, I stumbled upon the GoEmotions dataset. It's a dataset that contains user texts from Reddit with emotion annotations. Google attached its name to the dataset and published a paper about how the dataset got created...All of this effort might give you the impression that this is a dataset that's ready for a machine learning model. You'd be wrong, though...GoEmotions has been a great motivation for me to dive into techniques that find mislabeled examples...It's one thing to find bad annotations in ML, but it'd be better to learn how we might get good annotations instead...So in this blog post, I’ll explore this by diving into the dataset some more...
  • PufferLib
    You have an environment, a PyTorch model, and an RL framework that are designed to work together but don’t. PufferLib is a wrapper layer that provides better compatibility between Gym / PettingZoo environments and standard reinforcement learning frameworks. You write a native PyTorch network and a short binding for your environment; PufferLib takes care of the rest...
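The point about linearly inseparable data in the dimensionality reduction article above can be illustrated with a toy example. This is a minimal sketch, not kernel PCA itself: it applies an explicit nonlinear feature (the squared radius) to two concentric circles, which is the kind of mapping a kernel method applies implicitly. The data and function names are made up for illustration.

```python
import math


def circle(radius, n=12):
    """Return n evenly spaced points on a circle centred at the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def ranges_overlap(a, b):
    """True when the value ranges of two lists intersect."""
    return min(a) <= max(b) and min(b) <= max(a)


def squared_radius(point):
    """Nonlinear feature: squared distance from the origin."""
    x, y = point
    return x * x + y * y


inner = circle(1.0)  # class A
outer = circle(3.0)  # class B, surrounding class A

# On the raw x coordinate the two classes overlap; no straight line
# can split a ring from the points it encloses.
x_overlap = ranges_overlap([p[0] for p in inner], [p[0] for p in outer])

# After the nonlinear mapping, a single threshold separates the classes.
inner_r2 = [squared_radius(p) for p in inner]
outer_r2 = [squared_radius(p) for p in outer]
r2_overlap = ranges_overlap(inner_r2, outer_r2)

print("overlap on raw x:", x_overlap)
print("overlap on squared radius:", r2_overlap)
```

Kernel PCA avoids choosing such a feature by hand: it works from pairwise kernel values between samples, so the mapping never has to be written out explicitly.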


 

Tool*

 



Build powerful ML visualizations with Comet

With just 2 lines of code, Comet automatically logs metrics, hyperparameters, libraries, and more. This means automatic chart generation so you can easily manage training runs in real time. When you combine that with:
  • built-in visualizations (like the image panel),
  • custom project views, and
  • your own python panels,
Comet is a powerful tool for optimizing your ML workflow. All for free! Less friction, more ML.

Create your free account.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!



 

Tool*

 



Now where can I find that query... 🔍🔍

Did you put it in a doc? Slack? Teams? Notes?? Make searching for a query a thing of the past with Sherloq. Sherloq helps data analysts save, organize, and share their metrics, most used or complex queries for seamless collaboration within their organization. It’s a secure add-on (no integrations or permissions necessary) that works on the popular query editors. Start organizing your query repository in a shared workspace with Sherloq beta.

Try Sherloq For Free


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!




 

Jobs

 
  • Data Scientist / Machine Learning Engineer - Epsilon - NYC

    The Data Sciences team at Epsilon Strategy and Insights is looking for a talented team player in a Data Scientist/Machine Learning Engineer role. You are an expert, mentor and advocate. You have a strong machine learning and deep learning background and are passionate about transforming data into ML models. You welcome the challenge of data science and are proficient in Python, Spark MLlib, TensorFlow, Keras, ML algorithms and Deep Neural Networks, Big Data. You must be self-driven, take initiative and want to work in a dynamic, busy and innovative group...
     
Want to post a job here? Email us for details --> team@datascienceweekly.org



 

Training & Resources

 
  • sts-jax: Structural Time Series (STS) in JAX
    This library has a similar design to tfp.sts, but is built entirely in JAX, and uses the Dynamax library for state-space models. We also include an implementation of the causal impact method. This has a similar design to tfcausalimpact, but is built entirely in JAX...


P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
