How to Start Thinking Like a Data Scientist


Slowly but steadily, data are forcing their way into every nook and cranny of every industry, company, and job. Managers who aren’t data savvy, who can’t conduct basic analyses, interpret more complex ones, and interact with data scientists are already at a disadvantage. Companies without a large and growing cadre of data-savvy managers are similarly disadvantaged.

Fortunately, you don’t have to be a data scientist or a Bayesian statistician to tease useful insights from data. This post explores an exercise I’ve used for 20 years to help those with an open mind (and a pencil, paper, and calculator) get started. One post won’t make you data savvy, but it will help you become data literate, open your eyes to the millions of small data opportunities, and enable you to work a bit more effectively with data scientists, analytics, and all things quantitative.

While the exercise is very much a how-to, each step also illustrates an important concept in analytics — from understanding variation to visualization.

First, start with something that interests, even bothers, you at work, like consistently late-starting meetings. Whatever it is, form it up as a question and write it down: “Meetings always seem to start late. Is that really true?”

Next, think through the data that can help answer your question, and develop a plan for creating them. Write down all the relevant definitions and your protocol for collecting the data. For this particular example, you have to define when the meeting actually begins. Is it the moment someone says, “OK, let’s begin”? Or the moment the real business of the meeting starts? Does kibitzing count?

Now collect the data. It is critical that you trust the data. And, as you go, you’re almost certain to find gaps in data collection. You may find that even though a meeting has started, it starts anew when a more senior person joins in. Modify your definition and protocol as you go along.
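If you prefer a keyboard to a pencil, the written protocol can be sketched as a small record type. This is a minimal sketch, not a prescribed method: the `MeetingRecord` name, its fields, and the sample entry are all hypothetical, and the definition of "actual start" remains yours to choose and revise.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MeetingRecord:
    """One observed meeting, recorded under a written protocol (hypothetical schema)."""
    scheduled_start: datetime
    actual_start: datetime   # per your own definition, e.g. when real business begins
    organizer: str
    note: str = ""           # free text, e.g. "restarted when a senior person joined"

    def minutes_late(self) -> float:
        """Minutes between the scheduled and the actual start (negative if early)."""
        delta = self.actual_start - self.scheduled_start
        return delta.total_seconds() / 60


# Illustrative entry only, not data from the article.
record = MeetingRecord(
    scheduled_start=datetime(2013, 11, 4, 9, 0),
    actual_start=datetime(2013, 11, 4, 9, 12),
    organizer="Team lead",
    note="kibitzing until 9:12; counted real business as the start",
)
print(record.minutes_late())  # 12.0
```

Keeping a note field makes it easier to revisit entries when the definition changes partway through collection.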

Sooner than you think, you’ll be ready to start drawing some pictures. Good pictures make it easier for you to both understand the data and communicate main points to others. There are plenty of good tools to help, but I like to draw my first picture by hand. My go-to plot is a time-series plot, where the horizontal axis has the date and time and the vertical axis has the variable of interest. Thus, a point on the graph below is the date and time of a meeting versus the number of minutes late.

[Figure: time-series plot of minutes late against the date and time of each meeting]
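A rough text version of such a picture can be produced with nothing but the standard library. The sketch below is an assumption-laden stand-in for a hand-drawn plot: the observations are invented for illustration, and the chart is turned on its side, so time runs down the page and each bar shows minutes late.

```python
from datetime import datetime

# Hypothetical observations: (scheduled start, minutes late). Not the article's data.
observations = [
    (datetime(2013, 11, 4, 9, 0), 12),
    (datetime(2013, 11, 4, 14, 0), 0),
    (datetime(2013, 11, 5, 10, 30), 18),
    (datetime(2013, 11, 6, 13, 0), 8),
    (datetime(2013, 11, 7, 16, 0), 27),
]


def text_time_series(points):
    """Return one line per meeting, in time order, with one '#' per minute late."""
    lines = []
    for when, late in sorted(points):
        label = when.strftime("%m-%d %H:%M")
        lines.append(f"{label} | {'#' * late} {late}")
    return lines


for line in text_time_series(observations):
    print(line)
```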

Now return to the question that you started with and develop summary statistics. Have you discovered an answer? In this case, “Over a two-week period, 10% of the meetings I attended started on time. And on average, they started 12 minutes late.”
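The two summary statistics in that sentence are simple to compute. The sketch below uses ten made-up values, chosen only so that the result matches the quoted 10% and 12 minutes; they are not the author's observations.

```python
from statistics import mean

# Hypothetical minutes-late values for ten meetings; not the article's data.
minutes_late = [0, 9, 12, 15, 8, 20, 11, 14, 18, 13]

# Share of meetings that started exactly on time, and the average delay.
on_time_share = sum(1 for m in minutes_late if m == 0) / len(minutes_late)
average_late = mean(minutes_late)

print(f"{on_time_share:.0%} of meetings started on time")
print(f"Average start: {average_late:.0f} minutes late")
```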

But don’t stop there. Answer the “so what?” question. In this case, “If those two weeks are typical, I waste an hour a day. And that costs the company $X/year.”
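The “so what?” arithmetic can be written out explicitly. In this sketch only the 12-minute average comes from the text; the number of meetings per day, the working days per year, and the hourly cost are assumed placeholders, and the function name is hypothetical. Substitute your own figures to replace $X.

```python
def annual_cost_of_late_starts(meetings_per_day, avg_minutes_late,
                               working_days_per_year, hourly_cost):
    """Return (hours wasted per day, cost per year) for one attendee."""
    hours_per_day = meetings_per_day * avg_minutes_late / 60
    return hours_per_day, hours_per_day * working_days_per_year * hourly_cost


# All inputs except the 12-minute average are placeholders to replace with your own.
hours, cost = annual_cost_of_late_starts(
    meetings_per_day=5,          # assumed
    avg_minutes_late=12,         # the average quoted in the text
    working_days_per_year=230,   # assumed
    hourly_cost=50.0,            # assumed, in your own currency
)
print(hours, cost)  # 1.0 11500.0
```

With five meetings a day, a 12-minute average delay adds up to the hour a day mentioned above.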

Many analyses end because there is no “so what?” Certainly if 80% of meetings start within a few minutes of their scheduled start times, the answer to the original question is, “No, meetings start pretty much on time,” and there is no need to go further.

But this case demands more, as some analyses do. Get a feel for variation. Understanding variation leads to a better feel for the overall problem, deeper insights, and novel ideas for improvement. Note on the picture that 8–20 minutes late is typical. A few meetings start right on time, others nearly a full 30 minutes late. It might be better if one could judge, “I can get to meetings 10 minutes late, just in time for them to start,” but the variation is too great.
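A few numbers can back up what the picture shows about variation. This is a sketch on invented values, not the article's data; it reports the spread and then checks how the “arrive 10 minutes late” rule would have fared.

```python
from statistics import median, pstdev

# Hypothetical minutes-late values; not the article's data.
minutes_late = [0, 0, 8, 10, 12, 14, 15, 18, 20, 29]

spread = {
    "min": min(minutes_late),
    "median": median(minutes_late),
    "max": max(minutes_late),
    "range": max(minutes_late) - min(minutes_late),
    "std_dev": round(pstdev(minutes_late), 1),
}
print(spread)

# If you always arrived 10 minutes late, which share of meetings
# would already have started without you?
missed_start = sum(m < 10 for m in minutes_late) / len(minutes_late)
print(f"{missed_start:.0%} of meetings would have started before you arrived")
```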

Now ask, “What else do the data reveal?” It strikes me that five meetings began exactly on time, while every other meeting began at least seven minutes late. In this case, bringing meeting notes to bear reveals that all five meetings were called by the Vice President of Finance. Evidently, she starts all her meetings on time.
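Once the organizer is recorded alongside each meeting, a pattern like this one can be checked by grouping. The sketch below uses illustrative names and values (only the VP of Finance role comes from the text) and lists the organizers whose meetings never started late.

```python
from collections import defaultdict

# Hypothetical (organizer, minutes late) pairs; names and values are illustrative.
meetings = [
    ("VP Finance", 0), ("Team lead", 12), ("VP Finance", 0),
    ("Project manager", 9), ("Team lead", 18), ("Project manager", 7),
]

# Collect every observed delay under the person who called the meeting.
late_by_organizer = defaultdict(list)
for organizer, late in meetings:
    late_by_organizer[organizer].append(late)

# An organizer is "always on time" if their worst delay is zero minutes.
always_on_time = sorted(
    name for name, lates in late_by_organizer.items() if max(lates) == 0
)
print(always_on_time)  # ['VP Finance']
```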

So where do you go from here? Are there important next steps? This example illustrates a common dichotomy. On a personal level, results pass both the “interesting” and “important” test. Most of us would give anything to get back an hour a day. And you may not be able to make all meetings start on time, but if the VP can, you can certainly start the meetings you control promptly.

On the company level, results so far only pass the interesting test. You don’t know whether your results are typical, nor whether others can be as hard-nosed as the VP when it comes to starting meetings. But a deeper look is surely in order: Are your results consistent with others’ experiences in the company? Are some days worse than others? Which starts later: conference calls or face-to-face meetings? Is there a relationship between meeting start time and most senior attendee? Return to step one, pose the next group of questions, and repeat the process. Keep the focus narrow — two or three questions at most.
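Two of those follow-up questions, worse days and calls versus face-to-face meetings, come down to comparing averages across groups. A minimal sketch, assuming each record also notes the meeting format; the records and the `average_late_by` helper are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical records: (scheduled start, format, minutes late). Illustrative only.
records = [
    (datetime(2013, 11, 4, 9, 0), "call", 10),
    (datetime(2013, 11, 4, 15, 0), "in person", 14),
    (datetime(2013, 11, 8, 9, 0), "call", 6),
    (datetime(2013, 11, 8, 15, 0), "in person", 22),
]


def average_late_by(records, key):
    """Group records by key(record) and return the mean minutes late per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec[2])
    return {name: mean(values) for name, values in groups.items()}


by_format = average_late_by(records, key=lambda r: r[1])
by_weekday = average_late_by(records, key=lambda r: r[0].weekday())  # 0 = Monday
print(by_format)
print(by_weekday)
```

With only a handful of meetings per group, treat any difference as a prompt for the next round of data collection rather than a conclusion.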

I hope you’ll have fun with this exercise. Many find a primal joy in data. Hooked once, hooked for life. But whether you experience that primal joy or not, do not take this exercise lightly. There are fewer and fewer places for the “data illiterate” and, in my humble opinion, no more excuses.

Original link: http://blogs.hbr.org/2013/11/how-to-start-thinking-like-a-data-scientist/
