It shouldn’t be a surprise that data science will define the survival of companies and organizations in the next decade. Deborah Leff, the CTO for data science and AI at IBM, puts it succinctly: “If your competitors are applying AI, and they’re finding insights that allow them to accelerate, they’re going to peel away really, really quickly.” But Leff notes that only 13% of data science projects actually make it into production.
Last week, we dove into the Victoria’s Secret dataset and discovered a huge variety not only of bra types, but also of colors, sizes, and materials. Whenever you start working with a new dataset, always remember to explore it first so that you understand how the data is distributed. The exploration may even raise new questions. Now that we have a better understanding of Victoria’s Secret’s inventory, we can ask more questions about their pricing strategies, such as:
- What’s the price distribution for each product category?
- Which cup size is the cheapest? And which cup size is the most expensive?
- If you want to save money, which color is the cheapest?
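Questions like these usually come down to grouping prices by a column and comparing the groups. Here is a minimal sketch in Python with pandas, under some loud assumptions: the column names `category`, `cup_size`, `color`, and `price` are hypothetical and may not match the Kaggle file, the rows are placeholders we made up for illustration, and comparing groups by median price is our choice rather than the only reasonable one.

```python
import pandas as pd

# Placeholder rows for illustration only; they are NOT taken from the real dataset.
# Column names are assumptions and may differ from the Kaggle file.
df = pd.DataFrame(
    {
        "category": ["bra", "bra", "panty", "panty", "bra", "sleepwear"],
        "cup_size": ["B", "D", None, None, "B", None],
        "color": ["black", "red", "black", "white", "white", "red"],
        "price": [30.0, 50.0, 10.0, 12.0, 40.0, 60.0],
    }
)


def price_distribution_by(frame, column):
    """Summary statistics (count, mean, quartiles, ...) of price per group."""
    return frame.groupby(column)["price"].describe()


def cheapest_and_priciest(frame, column):
    """Return (cheapest, most expensive) values of `column` by median price.

    Rows where `column` is missing are skipped, because groupby drops
    missing keys by default.
    """
    medians = frame.groupby(column)["price"].median()
    return medians.idxmin(), medians.idxmax()


# Question 1: price distribution for each product category.
category_summary = price_distribution_by(df, "category")

# Question 2: cheapest and most expensive cup size.
cup_cheapest, cup_priciest = cheapest_and_priciest(df, "cup_size")

# Question 3: cheapest color.
color_cheapest, _ = cheapest_and_priciest(df, "color")

print(category_summary)
print("Cup sizes (cheapest, priciest):", cup_cheapest, cup_priciest)
print("Cheapest color:", color_cheapest)
```

We compare medians rather than means here because a handful of premium items can pull a mean upward. With the real data, you would replace the placeholder frame with something like `pd.read_csv(...)` and adjust the column names to match the file.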
Victoria’s Secret is one of the best-known brands for lingerie, pajamas, and bathing suits. Thanks to Kaggle, we have the opportunity to dive into their best-selling products and see how their pricing reflects their inventory. In this first part of our series, we’ll explore their inventory and unique products to get a better understanding of the overall dataset. This is always a best practice when you work with your own data: when you invest the time up front to get an overview of your dataset, you’ll sharpen your analysis and save a lot of time in the long run.
And how we can fix it.
A few weeks ago, we wrote about why we were building and distributing our Data Science Communicator Toolkit. Part of our initiative included collecting information from people who work with data so we could shape the toolkit to help bridge the communication gaps between them and their colleagues. We found some interesting results and are excited to share them with you here, along with some recommendations for how you can remove the roadblocks that your organization faces on its way to becoming more data-driven.
This post was originally published on RStudio’s blog, R Views.
“I’m not a coder” or “I was never good at math” is a frequent refrain I hear when I ask professionals about their data analysis skills. Influenced by popular culture and stereotypes, most people who don’t have a background in programming underestimate their ability to create amazing things with code. However, Data Society’s training program has shown that this narrative is false: with students in over 20 countries and many government and enterprise clients, we’ve seen so-called “non-coders” proficiently put together automated data cleaning scripts and analyses within a few weeks. So how do we do it? We’ve singled out three key steps to get someone started on their journey to an amazing skill set and more powerful data analytics:
A core tenet of our mission at Data Society is to equip employees and teams with strong data science skills and with the tools to implement analytics, automate processes, and find new insights.
Data is the new oil.
DJ Patil, the first Chief Data Scientist of the United States, says that integrating data into government “enable[s] transparency; you create efficiency, you provide security, you use it to foster innovation.” Integrating data analysis into your operations can significantly reduce costs and improve efficiency without increasing your operating budget. California reduced its vehicle fleet by 15 percent once the state released its budget data on vehicle spending. The Centers for Medicare & Medicaid Services’ “Big Data” tools have saved the government over $1.5 billion by preventing fraud and identifying waste and abuse. But these reductions in cost and improvements in efficiency can only occur when data is leveraged regularly to provide insights.
A picture is worth a thousand words, or at least that’s how the old saying goes. But how many numbers is a picture worth? Perhaps that’s a more interesting question given the number of charts and graphs created every day. We need to distill all the data available to us through charts and visualizations to convey a clear message. It’s the message that inspires action to change the world, or your organization.