Sunday, 25 March 2012

Cinema by the Numbers: Who is the Greatest Director in the World? (Part 1)


I started working on this little experiment months ago, but because everything takes so damn long, I still haven't got around to finishing it. Well, today is the day. It may not be on the scale I wanted it to be, but for the sake of getting it done, this will have to suffice. Plus, I can always expand later. What the hell am I talking about? Basically, a simplified statistical analysis in order to determine who exactly is the best director working today. I know, it's hardly possible to work that out, but it's a fun exercise and the results are remarkably appropriate (according to my tastes of course). Hit the jump to see more.


Picking the List of Directors

From the start, I thought the list should focus on directors who are still alive and working. Taking into account legends like Hitchcock, Kubrick, Leone and Kurosawa would make things too complicated. Besides, for reasons that will soon be obvious, my methodology doesn't work as well in the case of older films that haven't been watched by as many people.

In order to have a manageable figure, I looked for 50 of the best directors working today. With the help of the most powerful force in the universe, Google, it was relatively easy to settle on 50 names. Don't hate me if I missed anyone; it's simple enough to add names to the model at any stage. Feel free to suggest additions. Of course, there need to be a few rules and assumptions. I will be highlighting them throughout this post, but here are a few of the basic ones:

  1. The director must still be alive and working. Actually, being alive is enough because it isn't really possible to say whether a director is still working or not. If there hasn't been a movie in a long time, maybe he or she is simply taking a break. 
  2. The director must have directed at least 5 feature films. The reason for this is twofold. First, part of the exercise is to examine some trends in the careers of each director, and a few films are needed in order to do that. Second, recognizing a director who has directed only a movie or two could lead to strange results. The director must have an established track record, to show consistency over the years. One fluke success won't cut it. Case in point: Spike Jonze. He has directed 3 films only, all of which have been successful, and for that reason he ranks extremely highly. One bad film from him will make him drop a long way down the rankings. So, until he has directed 5 films, he is not eligible. 

So, based on sources like the IMDB, Entertainment Weekly, The Guardian, AMC, Total Film and of course, Rope of Silicon, here are the 50 eligible directors (in alphabetical order):

1. Alfonso Cuaron
2. Ang Lee
3. Brian De Palma
4. Bryan Singer
5. Cameron Crowe
6. Christopher Nolan
7. Clint Eastwood
8. Danny Boyle
9. Darren Aronofsky
10. David Cronenberg
11. David Fincher
12. David Lynch
13. David O. Russell
14. Ed Zwick
15. Francis Ford Coppola
16. George Lucas
17. Guillermo Del Toro
18. James Cameron
19. James L Brooks
20. Jim Jarmusch
21. Joel Coen
22. Lars Von Trier
23. Martin Scorsese
24. Michael Haneke
25. Michael Mann
26. Michel Gondry
27. Milos Forman
28. Nicolas Winding Refn
29. Oliver Stone
30. Paul Thomas Anderson
31. Pedro Almodovar
32. Peter Jackson
33. Quentin Tarantino
34. Richard Linklater
35. Ridley Scott
36. Robert Zemeckis
37. Roman Polanski
38. Ron Howard
39. Sam Mendes
40. Sam Raimi
41. Spike Lee
42. Steven Soderbergh
43. Steven Spielberg
44. Terrence Malick
45. Terry Gilliam
46. Tim Burton
47. Tony Scott
48. Werner Herzog
49. Wes Anderson
50. Woody Allen

How the Model Works

I've alluded to this before, but there are three little websites that are pretty well known film databases around the world:

The Internet Movie Database (IMDB): an online database that has been going since 1990 and records, inter alia, movie ratings based on user votes. As such, the IMDB is a sound indicator of what the general public thinks of a movie. For the purposes of this model, a film's IMDB score is only counted if the film has received at least 1000 votes (which is almost always the case). With fewer votes, the data becomes unreliable and too much weight would be attributed to that film.

Rotten Tomatoes: a critics' review aggregator. According to Wikipedia, Rotten Tomatoes staff first collect online reviews from authors who are certified members of various writing guilds or film critic associations. To become a critic at the site, a critic's original reviews must garner a specific amount of "likes". Top Critics are generally ones that write for a notable newspaper. The staff then determine for each review whether it is positive ("fresh", marked by a small icon of a red tomato) or negative ("rotten", marked by a small icon of a green splattered tomato). In essence, Rotten Tomatoes provides a good indicator of the general critical appeal of a film. Some movies are widely loved, and could receive even a 100% "Fresh" rating. In the same sense, a generally panned film could get a Rotten Tomatoes score that is very low - lower than both IMDB and Metacritic.

Metacritic: similar to Rotten Tomatoes, but for each movie, a numerical score from each review is obtained and the scores are averaged. Many review websites give a review grade out of five, out of ten, out of a hundred, or even an alphabetical score. Metacritic converts such a grade into a percentage. For reviews with no explicit scores, Metacritic manually assesses the tone of the review before assigning a relevant grade. Metacritic's scoring system is therefore closer to IMDB's because it tries to quantify just how good a movie is rather than taking the yes/no approach of Rotten Tomatoes, but it uses critics' scores rather than user votes.

Data from the above three websites is what forms the basis of this model. The rationale is explained in detail throughout this post.

Source Data

In a massive data capture exercise, I recorded the following information for every director in the list above:
  • Age of the Director
  • Film Count
  • Best Director Oscar Nominations and Wins
  • For every feature film (provided it has at least 1000 IMDB votes, and excluding short films, made-for-TV movies and part-directed films):
    • Title
    • Release Year
    • Running Time
    • IMDB Score
    • Rotten Tomatoes Score
    • Metacritic Score
The above data was captured for a total of 677 films. It took a while!
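
For anyone who wants to rebuild the spreadsheet outside Excel, the fields listed above could be held in something like the sketch below. To be clear, this is not how my data is actually stored: the names `Film`, `Director`, `eligible_films` and `is_eligible` are made up for illustration, and counting the 5-film minimum after the vote filter has been applied is an assumption on my part.

```python
from dataclasses import dataclass, field
from typing import List

MIN_IMDB_VOTES = 1000  # vote threshold mentioned in the post
MIN_FILM_COUNT = 5     # eligibility rule 2


@dataclass
class Film:
    title: str
    release_year: int
    running_time_min: int
    imdb_score: float        # 0 to 10
    imdb_votes: int
    rt_score: float          # 0 to 100
    mc_score: float          # 0 to 100
    is_feature: bool = True  # False for shorts, TV movies, part-directed films


@dataclass
class Director:
    name: str
    age: int
    oscar_nominations: int = 0
    oscar_wins: int = 0
    films: List[Film] = field(default_factory=list)

    def eligible_films(self) -> List[Film]:
        """Feature films with enough IMDB votes to be counted."""
        return [
            f for f in self.films
            if f.is_feature and f.imdb_votes >= MIN_IMDB_VOTES
        ]

    def is_eligible(self) -> bool:
        # Assumption: the 5-film minimum is checked after filtering.
        return len(self.eligible_films()) >= MIN_FILM_COUNT
```

The "Film Count" field then falls out as `len(director.eligible_films())` rather than being captured separately.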

What is Not Catered for in the Model

Let's face it, the model is a very simple one and probably a load of crap. It could be greatly refined, but (1) I don't have all the time in the world, and (2) I got only 51% for Statistics at University. Apart from those two issues, it's possible to identify a number of issues associated with the model itself:
  • The mere fact that it takes into account scores from the above three websites only is a fundamental problem. Directorial greatness comes from more than just these scores. The model does not consider the director's contribution to the development of film and the film industry, and it does not recognize the fact that a director may have a compelling vision and groundbreaking style that has helped to define cinema as we know it today (see The FilmSite).  These aspects require a subjective analysis which is not the purpose of this model. That being said, surely they are all somehow reflected in the ratings attributed to each movie by critics and audiences alike? I certainly think so, and the results show it. The point is, if people don't like your movies, then no matter how unique you are, you cannot be considered a truly great filmmaker.
  • The model does not take box office performance into account. It's not called the movie business for nothing, and as much as I would love to be all sentimental about true artistic greatness, the amount of money a movie makes must count for something. If you are getting people off their asses and into cinemas, you must be doing something right. The model does not account for this, but as with the issue mentioned above, surely it must be somehow indirectly reflected? I thought of using Box Office Mojo stats, but with older films, inflation and all of that, it becomes a mess. 
  • An outlier can do considerable damage to a director's ranking. Take for example James Cameron's first movie, Piranha Part 2: The Spawning. It was a shocker, scoring a whopping 3.4 on IMDB and 8% on Rotten Tomatoes. That movie alone, which is his first and only really poor showing, causes his ranking to drop by quite a bit. Of course, at the end of the day, he did make a shit movie and should pay for it! The problem could be mitigated by the next issue...
  • The model does not weight scores at all, whether it be the three websites relative to each other or giving a film a lower weighting if it has fewer votes. This could improve the accuracy of the model somewhat, but (1) it gets too complicated, (2) I would have no idea what weightings to assign and (3) the law of averages means that at the end of the day, these issues are automatically ironed out to some extent. I think?

Outputs

This is where it gets a little more interesting. Because I have a shitload of data in an Excel spreadsheet now, the number of outputs I could come up with is potentially very large. However, for now I will look at some basics only:

  • Age and "Years in the Business". Who are the oldest and youngest directors? Is there an optimal age? Do directors get better with age?
  • Prolificity. Some directors take forever to make movies (think Terrence Malick) and others are extremely prolific. Does this have an impact on ratings?
  • Running Time. Who directs the longest movies? How long is the average movie? Is there a correlation between the running time of a film and how it is regarded by critics and audiences?
  • Awards. This is not all that relevant, but it is interesting to see who has been shafted most often by the Academy.
  • IMDB Scores. Here I look at which directors have the best and worst IMDB scores, what the grand average is, what the best and worst films are and of course, who can be said to be the true "People's Director"?
  • Rotten Tomatoes Scores. Basically the same exercise as in the case of IMDB, but which director is generally the most appealing to critics? And the worst? When read together with other scores, Rotten Tomatoes scores can also give an idea of divisiveness, or can they?
  • Metacritic Scores. In essence, which director is the true Critics' Darling? The best average Metacritic score is surely a good indicator of critical acclaim. Also, who is the best of the rest?
  • Combinations. The grand average between the three scores tells us who the Best Director in the World is. Will we agree with the results? Averaging the Rotten Tomatoes and Metacritic Scores can give us another critics award. How do these two websites compare? Will we have the same champion across the board?
  • Deviations. Seeing how the scores differ can shed some further light I think. For instance:
    • deducting the IMDB score from the Rotten Tomatoes score could give an indication of how much general critical appeal differs from public appeal for a particular director. A high score means he's a critics' favourite, a score close to zero indicates balance, whereas a negative score means he's more of a people's director;
    • deducting the IMDB score from the Metacritic score gives an even better indication of critics vs audiences, because IMDB and Metacritic have the same quantitative basis; and
    • of course, I will also deduct the IMDB score from the average between Rotten Tomatoes and Metacritic to see what difference that makes.
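
The combinations and deviations above boil down to a few averages and subtractions per director. Here is a rough sketch of that arithmetic. It rests on two assumptions that I should flag: IMDB scores (out of 10) are multiplied by 10 so that they sit on the same 0 to 100 scale as the other two sites, and every film has all three scores. The function names `to_percent` and `director_summary` are hypothetical.

```python
from statistics import mean


def to_percent(imdb_score):
    """Rescale an IMDB score (0-10) to 0-100. This scaling is an assumption."""
    return imdb_score * 10.0


def director_summary(films):
    """films: list of (imdb, rt, mc) tuples; imdb on 0-10, rt and mc on 0-100."""
    imdb = mean(to_percent(f[0]) for f in films)
    rt = mean(f[1] for f in films)
    mc = mean(f[2] for f in films)
    critics = (rt + mc) / 2
    return {
        "imdb": imdb,
        "rt": rt,
        "mc": mc,
        "grand_average": (imdb + rt + mc) / 3,
        "critics_average": critics,
        # Positive = critics' favourite, negative = people's director.
        "rt_minus_imdb": rt - imdb,
        "mc_minus_imdb": mc - imdb,
        "critics_minus_imdb": critics - imdb,
    }
```

As a made-up example, a director whose only film scored 7.0 on IMDB, 90% on Rotten Tomatoes and 80 on Metacritic would get a grand average of 80 and a Rotten Tomatoes minus IMDB deviation of 20, which would mark him as more of a critics' favourite.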

Results

The best way to see whether a model works is by looking at the results. Do they make sense? Let's have a look...

Oh, I've just decided that this post would be way too long if I included everything. I will start with looking at Age today, and do one category every day for the next while. I don't want you skipping to the good bits!

Age and Years in the Business

Average Age: 59.59 years... pretty damn old! There's hope for me yet!
Youngest Directors: Christopher Nolan, Nicolas Winding Refn and Paul Thomas Anderson are all 42 years old (roughly, at the time of entering the data).
Oldest Director: Clint Eastwood at 82.

Average Years in the Business (being the difference between latest film and first film): 25.59.
Most Years in the Business: Roman Polanski has been directing films for 49 years!
Fewest Years in the Business: Sam Mendes has only been doing it for 10 years.

Do directors get better with age?

The graph below plots the average IMDB, RT and MC score against age. It's all a bit scattered, so I added trend lines for each to get an idea of what's happening. The result? Quentin Tarantino was right, directors do get worse with age! For each category, there is a clear decline in scores with age.


That may be the case for directors on average, but surely it doesn't apply to all of them? 

Number of directors getting better with age: 17 out of 50 (34%)

Number of directors getting worse with age: 33 out of 50 (66%)

Director with the steepest improvement (highest gradient): Alfonso Cuaron. His career really has gone from strength to strength. His early films like Solo Con Tu Pareja and Great Expectations didn't receive much love, but his later films (Harry Potter and the Prisoner of Azkaban and Children of Men) have been hugely successful with critics and audiences alike. He has Gravity coming out this year, which looks totally awesome, so it doesn't look like Mr Cuaron is slowing down either!

Director with the steepest decline (lowest gradient): Sam Mendes. Shame, but that's what happens when you kick off a career with American Beauty and fail to match it again. Let's hold thumbs for Skyfall.

Steadiest director (gradient closest to zero): Nicolas Winding Refn. After his success with Drive, I reckon he is only going to get better.
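
For the curious, the "gradient" used above is the slope of a straight trend line through a director's film scores. A linear trend line in Excel is a least-squares fit, so a slope like this can be reproduced with the sketch below. I am assuming release year on the x-axis and a film's combined score on the y-axis; the function names `gradient` and `trend` are hypothetical, and the numbers in the example are invented rather than taken from the spreadsheet.

```python
def gradient(xs, ys):
    """Least-squares slope of ys against xs (the slope of a linear trend line)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more points, with matching lengths")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def trend(slope):
    """Label a director by the sign of the slope."""
    return "better with age" if slope > 0 else "worse with age"


# Invented data: three films, each scoring 10 points more than the last.
years = [2000, 2005, 2010]
scores = [60.0, 70.0, 80.0]
slope = gradient(years, scores)  # 2.0 points per year
print(slope, trend(slope))
```

The steepest improvement is then the largest slope across all 50 directors, the steepest decline is the smallest, and the steadiest director is the one whose slope has the smallest absolute value.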

Okay, that's it for now. Tomorrow I will have a look at prolificity...
