A peek into the world of data strategy #4

This week's post includes articles from LinkedIn, Google, Pinterest, Uber, Khan Academy and others.

May 26, 2020

Who is this post for?

Product Managers and Engineers passionate about building insights using data.


LinkedIn published a blog post explaining their research on members' dwell time on the LinkedIn feed, which they use to improve how content is ranked. They explain why a dwell-time heuristic works better than modelling behavior on click and viral actions alone, and how it enabled them to build a logistic regression model that improves the quality of the content shown in the feed.

Article: https://engineering.linkedin.com/blog/2020/understanding-feed-dwell-time
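To make the idea concrete, here is a minimal sketch of how a dwell-time heuristic could feed a logistic regression model. It is not LinkedIn's implementation: the 2-second threshold, the single "relevance score" feature and the toy data are all made up for illustration.

```python
import math

# Hypothetical threshold: impressions viewed for less time count as "skipped".
SKIP_THRESHOLD_SECONDS = 2.0


def label_from_dwell(dwell_seconds, threshold=SKIP_THRESHOLD_SECONDS):
    """Return 1 if the impression looks engaged (dwell >= threshold), else 0."""
    return 1 if dwell_seconds >= threshold else 0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit a weight and a bias for one feature with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(w * x + b) - y  # gradient of the log loss w.r.t. z
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Made-up toy data: (relevance score of a post, observed dwell time in seconds).
impressions = [(0.1, 0.5), (0.2, 1.0), (0.3, 0.8), (0.4, 1.5),
               (0.6, 3.0), (0.7, 4.2), (0.8, 2.5), (0.9, 6.0)]

features = [score for score, _ in impressions]
labels = [label_from_dwell(dwell) for _, dwell in impressions]
w, b = train_logistic(features, labels)

# Predicted probability that a post with a given score is not skipped.
p_low = sigmoid(w * 0.1 + b)
p_high = sigmoid(w * 0.9 + b)
```

The point of the sketch is the labelling step: dwell time turns every impression into a training example, even when the member never clicks, likes or shares.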

Uber writes about how they solved the problem of unified access across multiple datastores. They recently integrated their unified data access connector with athenadriver, so that their business intelligence tools can query data stored in the AWS cloud.

Article: https://eng.uber.com/introducing-athenadriver/

Google AI launched its COVID-19 Research Explorer to help scientists and researchers efficiently analyze the 50,000+ journal articles on COVID-19 research. Under the hood, the tool uses Google’s BERT language model, which also powers its search.

Article: https://www.infoq.com/news/2020/05/google-nlu-tool-covid-19/

Google talks about its new TensorFlow Runtime (TFRT), which executes kernels more efficiently and improved average inference time by 28% in an early benchmark. It will initially be available to users via an opt-in flag.

Article: https://blog.tensorflow.org/2020/04/tfrt-new-tensorflow-runtime.html

Pinterest writes about its Pin-clustering solution, which helps auto-organize boards for its users. Using a two-step approach, first featurizing a user's Pins with PinSage embeddings and then grouping them with Ward clustering, Pinterest is able to group similar Pins together and suggest new board sections with appropriate names.

Article: https://medium.com/pinterest-engineering/using-machine-learning-to-auto-organize-boards-13a12b22bf5
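For readers who have not met Ward clustering before, here is a small, plain-Python sketch of the idea: start with every item in its own cluster and repeatedly merge the pair whose merge increases within-cluster variance the least. The 2-D vectors below are made-up stand-ins for PinSage embeddings, and the function names are my own, not Pinterest's.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length points."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_cluster(embeddings, n_clusters):
    """Agglomerative clustering with Ward's criterion; returns index groups."""
    clusters = [[i] for i in range(len(embeddings))]

    def cost(pair):
        i, j = pair
        return ward_cost([embeddings[k] for k in clusters[i]],
                         [embeddings[k] for k in clusters[j]])

    while len(clusters) > n_clusters:
        # combinations yields i < j, so deleting index j leaves i valid.
        i, j = min(combinations(range(len(clusters)), 2), key=cost)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up 2-D "embeddings" for six Pins.
pins = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (5.0, 5.1), (5.1, 5.0), (9.0, 0.0)]
groups = ward_cluster(pins, n_clusters=3)
```

In this toy example the first three Pins end up in one group, the next two in another, and the last Pin stays on its own. A production system would more likely rely on a library implementation than on this quadratic search.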

Khan Academy shares their story of scaling to 2.5x their usual traffic by leveraging a serverless architecture on Google App Engine and serving static content through providers like Fastly and YouTube. Khan Academy uses YouTube as its primary video service and has a fallback that serves video from Amazon S3 via Fastly in places where YouTube is blocked.

Article: https://engineering.khanacademy.org/posts/handling-2x-traffic-in-a-week.htm
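The primary-plus-fallback idea for video can be sketched in a few lines. This is only an illustration of the pattern: the function name, the fallback host and the `.mp4` path layout are hypothetical, and how the client detects that YouTube is unreachable is left out.

```python
def pick_video_url(video_id, youtube_reachable,
                   fallback_base="https://videos.example.com"):
    """Prefer YouTube; fall back to a CDN-fronted copy when it is blocked.

    fallback_base is a placeholder for a CDN host in front of object storage.
    """
    if youtube_reachable:
        return f"https://www.youtube.com/watch?v={video_id}"
    return f"{fallback_base}/{video_id}.mp4"


primary = pick_video_url("VIDEO_ID", youtube_reachable=True)
fallback = pick_video_url("VIDEO_ID", youtube_reachable=False)
```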

In the following post, Bernd Rücker writes about orchestrating AWS Lambda functions efficiently using Business Process Model and Notation (BPMN) and Camunda Cloud, describing how the Saga pattern makes it easy to embed business transactions and logic.

Article: https://blog.bernd-ruecker.com/orchestrate-aws-lambda-using-camunda-cloud-8d27dc640f69
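If the Saga pattern is new to you, the core idea is that every step of a business transaction comes with a compensating action, and when a step fails the already completed steps are undone in reverse order. Below is a minimal sketch in plain Python. It does not use Camunda Cloud, BPMN or AWS Lambda, and the step names are made up for illustration.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; undo completed ones on failure.

    Returns (succeeded, log) where log records what happened in order.
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"done:{name}")
            completed.append((name, compensate))
        except Exception:
            log.append(f"failed:{name}")
            # Compensate in reverse order of completion.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
    return True, log


def charge_card():
    # Hypothetical failing step, standing in for a Lambda that errors out.
    raise RuntimeError("payment declined")


def noop():
    pass


steps = [
    ("reserve_seat", noop, noop),
    ("charge_card", charge_card, noop),
    ("issue_ticket", noop, noop),
]
succeeded, log = run_saga(steps)
```

Here the second step fails, so the log ends with the seat reservation being compensated and the ticket is never issued. A workflow engine adds what this sketch lacks: persistence, retries and visibility into where each transaction currently stands.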

That’s all for this week. Your feedback is welcome!

Note: This blog series is for informational purposes only. All views are my own and do not represent those of my employers.

