Adonis Diaries

Posts Tagged ‘Sebastian Wernicke’

Damned lies and statistics?

Sebastian Wernicke turns the tools of statistical analysis on TEDTalks, to come up with a metric for creating “the optimum TEDTalk” based on user ratings. How do you rate it? “Jaw-dropping”? “Unconvincing”? Or just plain “Funny”?

Sebastian Wernicke. Data scientist

After making a splash in the field of bioinformatics, Sebastian Wernicke moved on to the corporate sphere, where he motivates and manages multidimensional projects.

Filmed in Feb 2010

If you go on the TED website, you can currently find there over a full week of TEDTalk videos, over 1.3 million words of transcripts and millions of user ratings.

And that’s a huge amount of data. And it got me wondering: If you took all this data and put it through statistical analysis, could you reverse engineer a TEDTalk?

Could you create the ultimate TEDTalk? (Laughter) (Applause) And also, could you create the worst possible TEDTalk that they would still let you get away with?

To find this out, I looked at three things:

I looked at the topic that you should choose,

I looked at how you should deliver it and the visuals onstage.

Now, with the topic: There’s a whole range of topics you can choose, but you should choose wisely, because your topic strongly correlates with how users will react to your talk.

To make this more concrete, let’s look at the list of top 10 words that statistically stick out in the most favorite TEDTalks and in the least favorite TEDTalks. So if you came here to talk about how French coffee will spread happiness in our brains, that’s a go. (Laughter) (Applause) Whereas, if you wanted to talk about your project involving oxygen, girls, aircraft... Actually, I would like to hear that talk, (Laughter) but statistics say it’s not so good.

If you generalize this, the most favorite TEDTalks are those that feature topics we can connect with, both easily and deeply, such as happiness, our own body, food, emotions. And the more technical topics, such as architecture, materials and, strangely enough, men, those are not good topics to talk about.

How should you deliver your talk?

TED is famous for keeping a very sharp eye on the clock, so they’re going to hate me for revealing this, because, actually, you should talk as long as they will let you. (Laughter) Because the most favorite TEDTalks are, on average, over 50% longer than the least favorite ones.

And this holds true for all ranking lists, except if you want to have a talk that’s beautiful, inspiring or funny. Then, you should be brief. (Laughter) But other than that, talk until they drag you off the stage.

While you’re pushing the clock, there’s a few rules to obey. I found these rules out by comparing the statistics of four-word phrases that appear more often in the most favorite TEDTalks as opposed to the least favorite TEDTalks.
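A comparison of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the talk does not say how the statistics were actually computed, the sample sentences are made up, and the function names (four_word_phrases, phrase_counts, overrepresented) are hypothetical.

```python
from collections import Counter
import re


def four_word_phrases(text):
    """Return every run of four consecutive words in the text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + 4]) for i in range(len(words) - 3)]


def phrase_counts(transcripts):
    """Count four-word phrases across a list of transcripts."""
    counts = Counter()
    for transcript in transcripts:
        counts.update(four_word_phrases(transcript))
    return counts


def overrepresented(favorite, least_favorite, top=3):
    """Rank phrases by how much more often they occur in the favorite set."""
    fav = phrase_counts(favorite)
    least = phrase_counts(least_favorite)
    # A Counter returns 0 for phrases it has never seen.
    diff = {phrase: n - least[phrase] for phrase, n in fav.items()}
    ranked = sorted(diff.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top]


# Hypothetical stand-ins for real transcripts.
favorite_talks = [
    "today i will give you a tool",
    "i will give you an example",
]
least_favorite_talks = ["this is something i can't have"]

print(overrepresented(favorite_talks, least_favorite_talks, top=1))
```

The sketch compares raw counts only. With real transcripts you would probably want to normalize by the size of each group and check whether a difference is statistically meaningful before calling it a rule.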

I’ll give you 3 examples. First of all, I must, as a speaker, provide a service to the audience and talk about what I will give you, instead of saying what I can’t have.

Secondly, it’s imperative that you do not cite The New York Times. (Laughter)

And finally, it’s okay for the speaker (that’s the good news) to fake intellectual capacity. If I don’t understand something, I can just say, “etc., etc.” You’ll all stay with me. It’s perfectly fine. (Applause)

Let’s go to the visuals. The most obvious visual thing on stage is the speaker.

And analysis shows if you want to be among the most favorite TED speakers, you should let your hair grow a little bit longer than average, make sure you wear your glasses and be slightly more dressed-up than the average TED speaker.

Slides are okay, though you might consider going for props. And now the most important thing, that is the mood onstage. Color plays a very important role.

Color closely correlates with the ratings that talks get on the website. (Applause) For example, fascinating talks contain a statistically high amount of exactly this blue color, (Laughter) much more than the average TEDTalk.

Ingenious TEDTalks, much more this green color, etc., etc. (Laughter) (Applause) Now, personally, I think I’m not the first one who has done this analysis, but I’ll leave this to your good judgment.

It’s time to put it all together and design the ultimate TEDTalk.

Now, since this is TEDActive, and I learned from my analysis that I should actually give you something, I will not impose the ultimate or worst TEDTalk on you, but rather give you a tool to create your own. And I call this tool the TEDPad. (Laughter)

And the TEDPad is a matrix of 100 specifically selected, highly curated sentences that you can easily piece together to get your own TEDTalk.

You only have to make one decision, and that is: Are you going to use the white version for very good TEDTalks, about creativity, human genius? Or are you going to go with a black version, which will allow you to create really bad TEDTalks, mostly about blogs, politics and stuff? So, download it and have fun with it.

I hope you enjoy the session. I hope you enjoy designing your own ultimate and worst possible TEDTalks. And I hope some of you will be inspired for next year to create this, which I really want to see.

Patsy Z and TEDxSKE shared a link.

Collecting more data is excellent: Up to a point for optimum decision-making?

Does collecting more data lead to better decision-making?

Competitive, data-savvy companies like Amazon, Google and Netflix have learned that data analysis alone doesn’t always produce optimum results.

In this talk, data scientist Sebastian Wernicke breaks down what goes wrong when we make decisions based purely on data — and suggests a brainier way to use it.

Roy Price is a man that most of you have probably never heard about, even though he may have been responsible for 22 somewhat mediocre minutes of your life on April 19, 2013. He may have also been responsible for 22 very entertaining minutes, but not very many of you. And all of that goes back to a decision that Roy had to make about three years ago.

So you see, Roy Price is a senior executive with Amazon Studios. That’s the TV production company of Amazon. He’s 47 years old, slim, spiky hair, describes himself on Twitter as “movies, TV, technology, tacos.”

And Roy Price has a very responsible job, because it’s his responsibility to pick the shows, the original content that Amazon is going to make. That’s a highly competitive space. I mean, there are so many TV shows already out there, that Roy can’t just choose any show. He has to find shows that are really great. So in other words, he has to find shows that are on the very right end of this curve here.


This curve here is the rating distribution of about 2,500 TV shows on the website IMDb, and the rating goes from one to 10, and the height here shows you how many shows get that rating.

So if your show gets a rating of nine points or higher, that’s a winner. Then you have a top two percent show. That’s shows like “Breaking Bad,” “Game of Thrones,” “The Wire,” so all of these shows that are addictive, where, after you’ve watched a season, your brain is basically like, “Where can I get more of these episodes?” That kind of show.
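The "top two percent" figure is simply the share of shows on the curve that reach a given rating. A minimal sketch of that calculation follows; the ratings list is invented for illustration and is not real IMDb data, and the function name share_at_or_above is hypothetical.

```python
def share_at_or_above(ratings, threshold):
    """Fraction of ratings that are greater than or equal to the threshold."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(1 for r in ratings if r >= threshold) / len(ratings)


# Hypothetical ratings on a 1-to-10 scale, not real IMDb data.
sample_ratings = [3.1, 6.8, 7.2, 7.4, 7.5, 7.9, 8.3, 9.1]

share = share_at_or_above(sample_ratings, 9.0)
print(f"{share:.1%} of the sample rates 9.0 or higher")
```

With a full list of roughly 2,500 ratings, the same function would show how rare a score of nine or higher is on the actual distribution.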

On the left side, just for clarity, here on that end, you have a show called “Toddlers and Tiaras” — which should tell you enough about what’s going on on that end of the curve.

Roy Price is not worried about getting on the left end of the curve, because I think you would have to have some serious brainpower to undercut “Toddlers and Tiaras.” So what he’s worried about is this middle bulge here, the bulge of average TV, you know, those shows that aren’t really good or really bad, they don’t really get you excited. So he needs to make sure that he’s really on the right end of this.

The pressure is on, and of course it’s also the first time that Amazon is even doing something like this, so Roy Price does not want to take any chances. He wants to engineer success. He needs a guaranteed success, and so what he does is, he holds a competition.

He takes a bunch of ideas for TV shows, and from those ideas, through an evaluation, they select 8 candidates for TV shows, and then he just makes the first episode of each one of these shows and puts them online for free for everyone to watch. And so when Amazon is giving out free stuff, you’re going to take it, right? So millions of viewers are watching those episodes.

What the viewers don’t realize is that, while they’re watching their shows, actually, they are being watched. They are being watched by Roy Price and his team, who record everything.

They record when somebody presses play, when somebody presses pause, what parts they skip, what parts they watch again. So they collect millions of data points, because they want to have those data points to then decide which show they should make.

And sure enough, so they collect all the data, they do all the data crunching, and an answer emerges, and the answer is, “Amazon should do a sitcom about four Republican US Senators.” They did that show.

Does anyone know the name of the show? (Audience: “Alpha House.”) Yes, “Alpha House,” but it seems like not too many of you here remember that show, actually, because it didn’t turn out that great.

It’s actually just an average show, actually — literally, in fact, because the average of this curve here is at 7.4, and “Alpha House” lands at 7.5, so a slightly above average show, but certainly not what Roy Price and his team were aiming for.

At about the same time, at another company, another executive did manage to land a top show using data analysis, and his name is Ted, Ted Sarandos, who is the Chief Content Officer of Netflix, and just like Roy, he’s on a constant mission to find that great TV show, and he uses data as well to do that, except he does it a little bit differently.

So instead of holding a competition, what his team did was they looked at all the data they already had about Netflix viewers, you know, the ratings they give their shows, the viewing histories, what shows people like, and so on. And then they use that data to discover all of these little bits and pieces about the audience: what kinds of shows they like, what kind of producers, what kind of actors.

And once they had all of these pieces together, they took a leap of faith, and they decided to license not a sitcom about four Senators but a drama series about a single Senator. You guys know the show?

Yes, “House of Cards,” and Netflix, of course, nailed it with that show, at least for the first two seasons. “House of Cards” gets a 9.1 rating on this curve, so it’s exactly where they wanted it to be.

The question, of course, is: what happened here?

So you have two very competitive, data-savvy companies. They connect all of these millions of data points, and then it works beautifully for one of them, and it doesn’t work for the other one. So why?

Because logic kind of tells you that this should be working all the time. I mean, if you’re collecting millions of data points on a decision you’re going to make, then you should be able to make a pretty good decision.

You have 200 years of statistics to rely on. You’re amplifying it with very powerful computers. The least you could expect is good TV, right?

And if data analysis does not work that way, then it actually gets a little scary, because we live in a time where we’re turning to data more and more to make very serious decisions that go far beyond TV.

Does anyone here know the company Multi-Health Systems? No one. OK, that’s good actually. Multi-Health Systems is a software company, and I hope that nobody here in this room ever comes into contact with that software, because if you do, it means you’re in prison.

If someone here in the US is in prison, and they apply for parole, then it’s very likely that data analysis software from that company will be used in determining whether to grant that parole.

So it’s the same principle as Amazon and Netflix, but now instead of deciding whether a TV show is going to be good or bad, you’re deciding whether a person is going to be good or bad. And mediocre TV, 22 minutes, that can be pretty bad, but more years in prison, I guess, even worse.

Unfortunately, there is actually some evidence that this data analysis, despite having lots of data, does not always produce optimum results. And that’s not because a company like Multi-Health Systems doesn’t know what to do with data.

Even the most data-savvy companies get it wrong. Yes, even Google gets it wrong sometimes.

In 2009, Google announced that they were able, with data analysis, to predict outbreaks of influenza, the nasty kind of flu, by doing data analysis on their Google searches. And it worked beautifully, and it made a big splash in the news, including the pinnacle of scientific success: a publication in the journal “Nature.”

It worked beautifully for year after year after year, until one year it failed. And nobody could even tell exactly why. It just didn’t work that year, and of course that again made big news, including now a retraction of a publication from the journal “Nature.”

So even the most data-savvy companies, Amazon and Google, they sometimes get it wrong. And despite all those failures, data is moving rapidly into real-life decision-making — into the workplace, law enforcement, medicine. So we should better make sure that data is helping.

Personally, I’ve seen a lot of this struggle with data myself, because I work in computational genetics, which is also a field where lots of very smart people are using unimaginable amounts of data to make pretty serious decisions like deciding on a cancer therapy or developing a drug.

And over the years, I’ve noticed a sort of pattern or kind of rule, if you will, about the difference between successful decision-making with data and unsuccessful decision-making, and I find this a pattern worth sharing, and it goes something like this.

Whenever you’re solving a complex problem, you’re doing essentially two things.

The first one is, you take that problem apart into its bits and pieces so that you can deeply analyze those bits and pieces, and then of course you do the second part. You put all of these bits and pieces back together again to come to your conclusion. And sometimes you have to do it over again, but it’s always those two things: taking apart and putting back together again.

The crucial thing is that data and data analysis is only good for the first part. Data and data analysis, no matter how powerful, can only help you take a problem apart and understand its pieces. It’s not suited to put those pieces back together again and then to come to a conclusion.

There’s another tool that can do that, and we all have it, and that tool is the brain. If there’s one thing a brain is good at, it’s taking bits and pieces back together again, even when you have incomplete information, and coming to a good conclusion, especially if it’s the brain of an expert.

And that’s why I believe that Netflix was so successful, because they used data and brains where they belong in the process. They use data to first understand lots of pieces about their audience that they otherwise wouldn’t have been able to understand at that depth, but then the decision to take all these bits and pieces and put them back together again and make a show like “House of Cards,” that was nowhere in the data.

Ted Sarandos and his team made that decision to license that show, which also meant, by the way, that they were taking a pretty big personal risk with that decision.

And Amazon, on the other hand, they did it the wrong way around. They used data all the way to drive their decision-making, first when they held their competition of TV ideas, then when they selected “Alpha House” to make as a show. Which of course was a very safe decision for them, because they could always point at the data, saying, “This is what the data tells us.” But it didn’t lead to the exceptional results that they were hoping for.

Data is of course a massively useful tool to make better decisions, but I believe that things go wrong when data is starting to drive those decisions. No matter how powerful, data is just a tool, and to keep that in mind, I find this device here quite useful. Many of you will … Before there was data, this was the decision-making device to use.

(Laughter)

Many of you will know this. This toy here is called the Magic 8 Ball, and it’s really amazing, because if you have a decision to make, a yes or no question, all you have to do is you shake the ball, and then you get an answer — “Most Likely” — right here in this window in real time. I’ll have it out later for tech demos.

The thing is, of course, I’ve made some decisions in my life where, in hindsight, I should have just listened to the ball. But if you have the data available, you want to replace this with something much more sophisticated, like data analysis to come to a better decision.

But that does not change the basic setup. So the ball may get smarter and smarter and smarter, but I believe it’s still on us to make the decisions if we want to achieve something extraordinary, on the right end of the curve.

And I find that a very encouraging message, in fact, that even in the face of huge amounts of data, it still pays off to make decisions, to be an expert in what you’re doing and take risks. Because in the end, it’s not data, it’s risks that will land you on the right end of the curve.



