It Figures
September 29, 2010
The numbers behind team performances
Posted by Madhusudhan Ramakrishnan in Test cricket

Glenn McGrath: a huge factor behind Australia's success © Getty Images

In a recent post, the Test performance of all teams across the ages was analysed. Australia proved to be the most consistent team, with an outstanding win-loss record throughout. In this piece, however, I take a more detailed look at the batting, bowling and fielding records of all teams over the years, which should help to analyse team performance better. This analysis does not break the records into periods; it looks at the records across all years, which is a fair indicator of team strength and performance. A period-wise analysis provides a more detailed performance evaluation and will be taken up in a later post.

The first table lists the number of batsmen in each team with an average greater than 40, with a minimum qualification of 3000 runs. England have played the most Tests and also have the most batsmen averaging over 40, followed closely by Australia. West Indies have been ordinary over the last decade, but had dominated world cricket earlier for almost three decades. The fact that they have 19 batsmen averaging over 40 clearly indicates the quality of batting they possessed in those years. India’s batting has been at its best since the mid-1990s, with five batsmen in the period averaging greater than 40. South Africa also have an impressive number of batsmen averaging over 40 since their return to international cricket. Andy Flower has an excellent Test record and is the only batsman from Zimbabwe to make the list.

Number of batsmen averaging over 40 (min qualification 3000 runs)
Team | No of batsmen | Best batsman (by average) | Highest average
England | 26 | Herbert Sutcliffe | 60.73
Australia | 25 | Don Bradman | 99.94
West Indies | 19 | Everton Weekes | 58.61
India | 12 | Sachin Tendulkar | 56.02
South Africa | 9 | Jacques Kallis | 54.94
Pakistan | 8 | Javed Miandad | 52.57
Sri Lanka | 7 | Kumar Sangakkara | 56.85
New Zealand | 2 | Martin Crowe | 45.36
Zimbabwe | 1 | Andy Flower | 51.54

Dominant teams over the years have produced outstanding bowling attacks. West Indies in their heyday won comfortably in all conditions thanks to their top-class fast bowlers, and the combination of Glenn McGrath and Shane Warne enabled Australia to rule world cricket in the late 1990s and 2000s. Australia have consistently produced the finest bowlers over the years, and their presence at the top of the table bears this out. Alan Davidson had a fantastic average of 20.53; among fast bowlers, McGrath and Dennis Lillee come closest. Among England bowlers, Sydney Barnes averaged a scarcely believable 16.43, picking up 189 wickets in just 27 Tests. But among bowlers who made their debut after 1990, only Darren Gough and Andy Caddick make the list.

West Indies, from the 1970s to the 1990s, had a superb array of fast bowlers, each of them averaging below 30. Malcolm Marshall was the finest of them all, with a haul of 376 wickets at under 21. Pakistan’s fast bowling reserves have never run dry and they have continued to churn out quality pace bowlers. Imran Khan was one of the world’s best bowlers in the early 1980s, while Wasim Akram and Waqar Younis spearheaded the attack through the 1990s. India have just four bowlers in the list, with Kapil Dev being the only fast bowler. Traditionally batting-friendly tracks have undoubtedly been the reason behind the high averages of Indian bowlers. Allan Donald and Richard Hadlee have been the best bowlers for their respective teams. Muttiah Muralitharan holds virtually every record in the bowling department and it is no surprise that he figures in the list as Sri Lanka’s best ever.

Number of bowlers averaging less than 30 (min qualification 150 wickets)
Team | No of bowlers | Best bowler (by average) | Best average
Australia | 17 | Alan Davidson | 20.53
England | 14 | Sydney Barnes | 16.43
West Indies | 10 | Malcolm Marshall | 20.94
Pakistan | 6 | Imran Khan | 22.81
South Africa | 5 | Allan Donald | 22.25
India | 4 | Bishan Singh Bedi | 28.71
New Zealand | 2 | Richard Hadlee | 22.29
Sri Lanka | 2 | Muttiah Muralitharan | 22.67
Zimbabwe | 1 | Heath Streak | 28.14

The table below looks at the number of batsmen in each team who have more than ten Test centuries. England have 28 batsmen with over 10 hundreds, but the highest number of centuries is just 22, scored by Geoff Boycott, Wally Hammond and Colin Cowdrey. Australia are next with 24, but have three batsmen with over 30 centuries, with Ricky Ponting leading the way with 39. West Indies are next, with Brian Lara on top with 34 centuries including 9 scores over 200. India and Pakistan each have 12 players with over 10 centuries, and Sachin Tendulkar and Inzamam-ul-Haq top their respective hundreds tallies.

Number of batsmen with over 10 centuries in Tests
Team | No of batsmen | Batsman with most 100s | No of 100s
England | 28 | Geoff Boycott, Wally Hammond, Colin Cowdrey | 22
Australia | 24 | Ricky Ponting | 39
West Indies | 16 | Brian Lara | 34
India | 12 | Sachin Tendulkar | 48
Pakistan | 12 | Inzamam-ul-Haq | 25
South Africa | 8 | Jacques Kallis | 35
Sri Lanka | 8 | Mahela Jayawardene | 28
New Zealand | 3 | Martin Crowe | 17
Zimbabwe | 1 | Andy Flower | 12

Shane Warne, with 37 five wicket hauls leads the list of 15 Australian bowlers with over ten five fors. Australia are followed by England, Pakistan and India. West Indies have three bowlers on top with 22 five fors which is further indication of how powerful their bowling attack was. Richard Hadlee is by far the finest New Zealand bowler with 36 five wicket hauls while Muttiah Muralitharan with 67 five fors is light years ahead of the next best by a Sri Lankan which is Chaminda Vaas with 12.

Number of bowlers with over 10 five fors in Tests
Team | No of bowlers | Bowler with most five fors | Most five fors
Australia | 15 | Shane Warne | 37
England | 10 | Ian Botham | 27
Pakistan | 9 | Wasim Akram | 25
India | 8 | Anil Kumble | 35
West Indies | 7 | Curtly Ambrose, Malcolm Marshall, Courtney Walsh | 22
South Africa | 5 | Allan Donald | 20
New Zealand | 4 | Richard Hadlee | 36
Sri Lanka | 2 | Muttiah Muralitharan | 67

The next two tables are related to wicket-keeping and fielding dismissals. England have the most keepers with 100 plus dismissals and the list is led by Alan Knott. Australia have had three of the finest keepers over the last three decades and Adam Gilchrist tops the list with 416 dismissals. Bert Oldfield of Australia, with 52 stumpings still holds the record for the most stumpings. Mark Boucher surpassed Gilchrist and is the world record holder with over 500 dismissals.

Number of wicket keepers with more than 100 dismissals
Team | No of wicket keepers | Keeper with most dismissals | No of dismissals
England | 7 | Alan Knott | 269
Australia | 6 | Adam Gilchrist | 416
West Indies | 5 | Jeff Dujon | 270
Pakistan | 5 | Wasim Bari | 228
India | 4 | Syed Kirmani | 198
New Zealand | 3 | Adam Parore | 201
South Africa | 3 | Mark Boucher | 502
Sri Lanka | 2 | Kumar Sangakkara | 144
Zimbabwe | 1 | Andy Flower | 151

Australia have had a tradition of producing high-class fielders, especially in the slip cordon. Bob Simpson, Greg Chappell, Mark Taylor and Mark Waugh all have over 100 catches, with Mark Waugh leading the list. Ian Botham and Colin Cowdrey lead the list for England with 120 catches each. Rahul Dravid overtook Mark Waugh’s tally and is closing in on 200 catches. Stephen Fleming and Mahela Jayawardene top the table for their respective teams.

Number of fielders with over 100 catches
Team | No of fielders | Fielder with most catches | Most catches
Australia | 10 | Mark Waugh | 181
England | 5 | Ian Botham, Colin Cowdrey | 120
India | 5 | Rahul Dravid | 195
West Indies | 4 | Brian Lara | 164
South Africa | 2 | Jacques Kallis | 155
New Zealand | 1 | Stephen Fleming | 171
Sri Lanka | 1 | Mahela Jayawardene | 161

* The highest number of catches by a Pakistani is 94 by Javed Miandad

Another factor that determines a team’s dominance is the number of innings per hundred. Australia lead the way in this regard too, with a century every 17 innings, and have an excellent away record as well, with a century every 18.45 innings. Sri Lanka, surprisingly, are second with a century every 18 innings, but this is mainly due to their extraordinary home record. They have a century every 14.5 innings in home Tests and an even more incredible hundred every nine innings against Bangladesh and Zimbabwe.

West Indies, between 1960 and 1990, had an outstanding record of a century every 17 innings, but have fallen away since then. India’s away performance has consistently improved over the years and they have scored a century every 16.2 innings since 2000, which is far better than their overall away record of 20.7 innings per century. Bangladesh’s predicament in Tests can be clearly seen from the fact that their batsmen score a hundred only every 66 innings, which is far too infrequent for them to be able to compete.

Innings/hundred for teams
Team | Innings | 100s | Inns per 100 | Highest score | Batsman
Australia | 12605 | 734 | 17.17 | 380 | Matthew Hayden
Sri Lanka | 3359 | 184 | 18.25 | 374 | Mahela Jayawardene
West Indies | 8200 | 432 | 18.98 | 400* | Brian Lara
India | 7565 | 396 | 19.10 | 319 | Virender Sehwag
Pakistan | 6045 | 312 | 19.37 | 337 | Hanif Mohammad
England | 15652 | 766 | 20.43 | 364 | Len Hutton
South Africa | 6319 | 289 | 21.86 | 277 | Graeme Smith
New Zealand | 6569 | 218 | 30.13 | 299 | Martin Crowe
Zimbabwe | 1601 | 42 | 38.11 | 266 | Dave Houghton
Bangladesh | 1449 | 22 | 65.86 | 158* | Mohammed Ashraful
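
The "Inns per 100" column above is simply innings divided by centuries. The following is a minimal Python sketch of that arithmetic using three rows from the table; the function name is illustrative and not part of the original analysis. Some published figures look truncated rather than rounded, so the last decimal can differ slightly.

```python
def innings_per_hundred(innings: int, hundreds: int) -> float:
    """Average number of team innings played for every century scored."""
    if hundreds == 0:
        raise ValueError("no centuries scored, so the ratio is undefined")
    return innings / hundreds


# (innings, hundreds) copied from the table above
teams = {
    "Australia": (12605, 734),
    "England": (15652, 766),
    "Bangladesh": (1449, 22),
}

# Lower is better: fewer innings are needed for each hundred.
for team, (inns, tons) in sorted(teams.items(), key=lambda kv: kv[1][0] / kv[1][1]):
    print(f"{team:<12}{innings_per_hundred(inns, tons):6.2f}")
```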

September 24, 2010
Test teams: an analysis of results across ages
Posted by Anantha Narayanan in Test cricket

Australia: The most consistent Test team ever © Getty Images

This is a simple analysis of the results of teams across ages. I have split the 133 year period into the following 8 ages.

PreWW1: 1877-1914
PreWW2: 1921-1939
1950s:  1946-1959
1960s:  1960-1969
1970s:  1970-1979
1980s:  1980-1989
1990s:  1990-1999
2000s:  2000-2010

A simple formula is used. Readers might find this a little simplistic but I am working with limited parameters to do justice to such a macroscopic analysis. My idea is to bring to light the teams which performed well during each period and then see how each team performed over the years since they made their debut in international cricket. Many of these insights might be obvious to some of the readers but this article is a single place compendium of team performances across the years. And the normal complaints of comparing players/teams across the ages do not arise in this analysis.

A win carries 2 points. A draw/tie will carry 1 point. The total points will be compiled during the concerned period. This is evaluated against the maximum points available for the team and a Performance % arrived at. There is also a need to recognize away performances. This is especially needed to break deadlocks. Take two teams which have played 10 matches each. Both win 5 matches and draw the remaining 5 matches. Both teams will have a 75% performance index. If team A won 3 away and 2 at home and the other team 2 away and 3 at home, Team A should be considered to have done slightly better. Hence I have provided 25% additional weight for away performances, that too, only for wins and draws. The actual weight given is less consequential than the fact that the away performances are recognized.
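
The scoring described above can be sketched in Python roughly as follows. The basic index (2 points per win, 1 per draw or tie, against a maximum of 2 per match) follows the text directly. The weighted version is an assumption: the post does not say how the maximum is adjusted for away matches, so this sketch keeps the denominator at 2 points per match, which means a team winning everything away would exceed 100%. The `Record` class and function names are hypothetical.

```python
from dataclasses import dataclass

AWAY_WEIGHT = 1.25  # 25% additional weight for away wins and draws, as stated in the text


@dataclass
class Record:
    home_wins: int = 0
    home_draws: int = 0  # draws and ties both score 1 point
    home_losses: int = 0
    away_wins: int = 0
    away_draws: int = 0
    away_losses: int = 0

    @property
    def matches(self) -> int:
        return (self.home_wins + self.home_draws + self.home_losses
                + self.away_wins + self.away_draws + self.away_losses)


def basic_index(r: Record) -> float:
    """Points earned as a percentage of the maximum available (2 per match)."""
    points = 2 * (r.home_wins + r.away_wins) + (r.home_draws + r.away_draws)
    return 100.0 * points / (2 * r.matches)


def weighted_index(r: Record) -> float:
    """Basic index with away wins and draws scaled by AWAY_WEIGHT.

    ASSUMPTION: the denominator stays at 2 points per match; the article
    does not spell out how the maximum is treated.
    """
    home = 2 * r.home_wins + r.home_draws
    away = AWAY_WEIGHT * (2 * r.away_wins + r.away_draws)
    return 100.0 * (home + away) / (2 * r.matches)


# The deadlock example from the text; the draws are placed at home for simplicity.
team_a = Record(home_wins=2, home_draws=5, away_wins=3)
team_b = Record(home_wins=3, home_draws=5, away_wins=2)

print(basic_index(team_a), basic_index(team_b))        # both 75.0
print(weighted_index(team_a), weighted_index(team_b))  # Team A edges ahead
```

Whatever the exact treatment of the maximum, the point made in the text holds in this sketch: the two teams tie on the basic index, and the away weighting separates them in favour of Team A.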

This is a simple analysis based on results. The relative team strengths or the series position or the win margins are not considered. That is a totally different type of analysis of Team Ratings.

Regarding the graphs: the first graph covers all 1971 Tests, spanning 133 years, and can be used to lead on to the other graphs. The period graphs have been drawn in the order of the teams' performances. I have also included the summary table for the period on the right of each graph for easier viewing and identification. The graphs for the teams follow the 8 period graphs.

Summary of Test results across ages
© Anantha Narayanan

Australia leads the all-time table comfortably with a Performance value of 66.0%. England come next with 59.3%. It may be a surprise that Pakistan edges out West Indies for the third position. This has been a result of the recent fall from grace of the West Indian team. Again it is a surprise that Sri Lanka edges out India for the fifth place although it must be admitted that India has had a 50 year head start to put in some awful years earlier. This has also been made possible by Sri Lanka's strong showing during the 99 tests played during the 2000s. New Zealand is the only leading team to have an overall sub-50% index value.

Summary of Test results in the 2000s © Anantha Narayanan

The dominant team during the 2000s has been Australia, with a Performance Index (PIdx) value of 84.6%. Even their recent wobble has only brought them down a bit. They are still the team to beat. South Africa are next with 66.6% and then India with 63.9%. It is debatable whether India can maintain this ascendant graph over the next few years with the huge void which is going to be created. England, with their periodic high-level performances, come in next and Sri Lanka, buoyed by their strong home record, complete the top 5. Pakistan come in next despite their continuing problems and their inability to play at home. The next 4 teams each have significant daylight between themselves and the team ahead of them. One reason why the 2000s have seen a wider dispersion of the numbers is the increased number of decisive results (only 23.8% draws); another is the presence of two weak teams.

Summary of Test results in the 1990s
© Anantha Narayanan

The dominant team during the 1990s was again Australia, with a PIdx value of 69.7%. They did not dominate to the extent they have during the 2000s. South Africa are next, clocking in at a very close 68.2%. Pakistan, with their lethal bowling attack and great batsmen, are placed next with a PIdx value of 65.0%. The West Indies, not to be confused with today's hapless and dispirited team, were placed fourth at 56.8%. India completed the top 5, barely crossing 50%. Sri Lanka, England and New Zealand were closely bunched around the 45% mark and even Zimbabwe clocked in at a respectable 34%.

The interesting point in this period was the close bunching of the teams. The difference between the first and ninth team was a low 35% as compared to the 2000s where this difference is a whopping 74%. The other surprising feature is the low number of matches played by teams other than the Ashes rivals. There have also been a greater number of draws (35.7%).

Summary of Test results in the 1980s
© Anantha Narayanan

It would not be a surprise to read the 1980s charts. The dominant team, by a mile, was the great West Indian team, with their quintet of outstanding pace bowlers and feared batting attack. They clock in at 81.2%. Next comes the Imran Khan controlled Pakistan with 61.9%. Now comes the Hadlee-inspired New Zealand with 58.0%. Australia is the only other team to have a 50+%. India's lack of match-winning players kept them in the lower half. England comes in next and finally the new entrants, Sri Lanka. This period witnessed 46.1% draws.

Summary of Test results in the 1970s
© Anantha Narayanan

The 1970s was an interesting period. Packer and World Series happened. England, probably less affected by WSC than Australia and West Indies were the leading, if not dominant, team with PIdx value of 63.3%. They are followed by four teams with 50+ %, led by West Indies. India, no doubt bolstered by Gavaskar and the spinners, did not do too badly. New Zealand had only a 33.5% index value. A tweak had to be done for this decade. South Africa played 4 tests and won all these. The 100% index value is an anomaly and should be removed from the analysis. This has been done. No major impact, though.

Summary of Test results in the 1960s
© Anantha Narayanan

The 1960s was very much a defensive era, as evidenced by the single-digit column of wins for four of the six teams. The three leading teams, West Indies, Australia and England, were separated only at the decimal point level, and that only because West Indies had slightly better away results. The close bunching of teams during these two periods, the 1960s and 1970s, is a reflection of the parity which existed between the teams. It is also caused by fewer decisive results (42.6% and 47.8%).

Summary of Test results in the 1950s
© Anantha Narayanan

The post-war period of 40s/50s was probably much better than the later dreary period. Bradman was there to start with. His legacy was continued by strong players. Australia had an outstanding PIdx value of 78.3%. England also had a very good team and were second with 61.7%, very closely followed by the W-driven West Indies. Surprisingly, Pakistan the new entrants were the next team having a better than 50% record. This is probably the best entrance decade for any of the later entrants. One must also allow for the fact that the pitches were conducive to the great strength of Pakistan, their seam bowling. The draw % was around 35%.

Summary of Test results pre World War 2
© Anantha Narayanan

The in-between Wars period was a two team period with Australia comfortably ahead of England. That England, despite Bradman, were only 7% behind Australia indicates the effective manner in which their strategies, starting with body-line, worked. The newcomers, India, New Zealand and West Indies propped up the table. There was a spurt in the draw % compared to the previous era, 37.1%.

Summary of Test results pre World War 1
© Anantha Narayanan

There were only three teams before WW1. England were the comfortable leaders during this period, no doubt aided by their bowling attack, led by Barnes and Lohmann. Not to forget Hobbs and Sutcliffe. Only 17.9% of the matches were draws, no doubt a product of the types of pitches.

Now for the team performance graphs, presented in a different format. I have used line graphs instead of the bar graphs since it is easier to follow the changes. Also the graphs are shown in a chronological sequence. There is no graph for Bangladesh which has had one decade nor for Zimbabwe which has had two decades. It is not possible to derive anything sensible without three decades.

Summary of Test results for Australia
© Anantha Narayanan

Australia have maintained very steady performance levels throughout the 133 years. They are the only team never to have fallen below 50% in any of the periods. What is important is that Australia have topped 4 of the 8 periods: PreWW2, the 1950s (1946-1959), the 1990s and the 2000s.

Summary of Test results for England
© Anantha Narayanan

Barring the 1980s and 1990s, England have always maintained a 60+ % level. That is a consistency which is comparable to that of Australia. They have led the table in two of the eight periods, the 1970s and the Pre-WW1 periods.

Summary of Test results for West Indies
© Anantha Narayanan

West Indies led the table during two periods, the 1960s and 1980s but have since fallen off drastically, especially during the past decade. Their 80+% can be compared only to the Australians of the 2000s. Compared to the awful 2000s even the average 1990s looks good.

Summary of Test results for India
© Anantha Narayanan

India had a poor start, which is understandable, followed by a poor 1980s and a barely acceptable 1990s. They have recovered in the current decade, although a huge chasm lies in front of them. The day without the three gladiators at 3/4/5 is looming ahead. The bowling is another major concern. Where are the bowlers to take 20 wickets on good pitches?

Summary of Test results for South Africa
© Anantha Narayanan

Barring a slight dip in the current decade, South Africa have improved their figures every decade. Possibly the only team to do so. There is a caveat so far as South Africa are concerned. This has already been referred to in the period graph. They played 4 Tests during the 70s and won all. Since I did not want their graph to have an abrupt dip or spurt, I have allotted a notional % for this period. 1980s, of course, is excluded.

Summary of Test results for Pakistan
© Anantha Narayanan

Pakistan started very well, dropped off, picked up very well again but fell off once more during the current decade. Overall they have been quite good, which is very understandable in view of the circumstances. We must feel for the talented Pakistanis. As they seem to come out of one problem, another one crops up. Maybe it is time for Imran Khan to come forward and run Pakistan cricket the way he ran his team.

Summary of Test results for New Zealand
© Anantha Narayanan

New Zealand have been like the proverbial yo-yo. Down, up, down, up and so on. They had a golden 1980s when the kings were scattered around the tropical islands near Florida.

Summary of Test results for Sri Lanka
© Anantha Narayanan

Sri Lanka have had only three decades and have been steadily improving, a la South Africa. The impressive thing, however, is that their excellent performance of 61+% has come over the last 100 Tests, more than half of their tally.

To view the "Results Summary - By Periods" tables, please click here.

To view the "Results Summary - By Teams" tables, please click here.

An important announcement to the readers. I have created an open mailid to which the comments and suggestions, not meant for publication, can be submitted. The mail id is ananth.itfigures@gmail.com. Since the readers would have to use a mail route I give the readers my assurance that the mail id is safe and will never be used by me for anything other than communicating with the reader specifically. This will not be part of any group mail nor will mails be cc'd.

September 17, 2010
ODI Bowlers: a totally new look through BCG charts
Posted by Anantha Narayanan in Bowling

Glenn McGrath: in the right BCG quadrant almost all the time © Getty Images
This article is a completely different graphical look at the ODI bowlers and is a continuation of the similar article on ODI batsmen.

Just to recap, Bruce Henderson of BCG (Boston Consulting Group) created these charts in 1968 to study the Growth-Share aspects of products/business units. This is an excellent way to study two related variables together. These are plotted on a graph which is split into four equal (or unequal) size quadrants. The placement of a particular bowler gives excellent insight into his position in the galaxy of bowlers. However, please do not forget that this is clearly a two-dimensional graph between two related variables. Also, these are all career figures.

Bowling is a far cleaner and crisper playing aspect which lends itself to excellent analysis. There are only two independent variables: bowling strike rate and bowling accuracy, the latter in the form of Rpb (runs per ball). These two together can be used to generate the bowling average, which is a single measure incorporating the two constituent parameters very clearly, unlike the batting average, which has the dicey not-outs concept embedded within. The bowling strike rate is represented on the X-axis and the bowling accuracy (Rpo or Rpb, it does not matter which we take) on the Y-axis. The only special requirement is that, in bowling, both variables have reverse effectiveness in that the lower they are, the better the bowler is. Hence I have laid down the axes from the highest to the lowest values.
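
The relationship between the three measures is a single multiplication: bowling average (runs per wicket) equals strike rate (balls per wicket) times runs per ball. A small Python sketch, assuming six-ball overs when converting from Rpo; the function name is illustrative, and the inputs are one of the strike rate/Rpo pairs quoted later in this article.

```python
BALLS_PER_OVER = 6  # assumption: six-ball overs


def bowling_average(strike_rate: float, runs_per_over: float) -> float:
    """Runs per wicket = balls per wicket * runs per ball."""
    runs_per_ball = runs_per_over / BALLS_PER_OVER
    return strike_rate * runs_per_ball


# A strike rate of 30.5 balls per wicket at 4.69 runs per over
print(round(bowling_average(30.5, 4.69), 2))  # 23.84
```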


© Ananth Narayanan


The above represents a typical BCG chart. The bowlers in the top-right quadrant, the red one, are the "Top bowlers". They are to the right of the Bowling strike rate line and above the Rpo line. The ones in the bottom right quadrant, the green one, are the "Attackers". They capture wickets quite frequently but go for plenty of runs. Certainly an asset, but could do better. Similarly, the top left quadrant, the blue one, contains the "Defenders". They take more balls to capture a wicket but are miserly. They are equally valuable as the Attackers. The bottom left quadrant, the orange one, represents the "Also rans". They fall behind in both areas and lag behind the others.

As I explained in the batsmen article, the two central dividing lines can be drawn in two ways. One is to draw the same right in the middle. However this does not take into account the distribution of values. The alternative method is to draw the lines around the median value so that we get around half the bowlers on top of the mid line of the Rpo and around half the bowlers to the right of the mid line of the Bowling strike rate line. This leads to unequal quadrants but would make analysis of the bowlers far more meaningful. Let me add that the drawing of the asymmetrical central lines is my own idea and most of the BCG charts have only centrally located divider lines. However my idea of asymmetrical dividing lines ensures a fairer distribution of players across quadrants.
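
The median-based dividers and the four quadrants can be sketched in Python along these lines. The bowlers and their figures below are made up purely for illustration, and the tie rule (a bowler sitting exactly on a dividing line counts on the better side) is an assumption, since the article does not say how such cases are placed.

```python
from statistics import median


def classify(bowlers):
    """Map each bowler to a quadrant using median dividing lines.

    bowlers: dict of name -> (strike_rate, rpo); lower is better for both.
    ASSUMPTION: a value exactly on a dividing line counts as the better side.
    """
    sr_mid = median(sr for sr, _ in bowlers.values())
    rpo_mid = median(rpo for _, rpo in bowlers.values())
    labels = {}
    for name, (sr, rpo) in bowlers.items():
        strikes_often = sr <= sr_mid  # takes wickets frequently
        miserly = rpo <= rpo_mid      # concedes few runs
        if strikes_often and miserly:
            labels[name] = "Top bowlers"
        elif strikes_often:
            labels[name] = "Attackers"
        elif miserly:
            labels[name] = "Defenders"
        else:
            labels[name] = "Also rans"
    return labels


# Hypothetical figures, chosen so the medians come out at 38 and 4.3
sample = {
    "Bowler A": (31.0, 3.9),
    "Bowler B": (30.0, 4.7),
    "Bowler C": (44.0, 3.5),
    "Bowler D": (46.0, 4.8),
    "Bowler E": (38.0, 4.3),
}

for name, quadrant in classify(sample).items():
    print(name, quadrant)
```

Because the lines sit at the medians, roughly half the bowlers fall on each side of each line, which is the fairer distribution argued for above.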

Finally, the charts are drawn on two criteria: the top wicket-takers and the top bowlers based on bowling averages, the minimum requirement for the latter selection being 100 wickets.

I have also changed the graph presentation method. I have used Gaurav's suggestion and drawn fixed diameter circles supported by numbers. There is a legend at the right hand side to link the numbers to the bowler names. It has come out very well and the graph is now uncluttered.

The first chart is drawn with wickets captured as the criterion. The cut-off for selection is 200 wickets. Anything lower will clutter up the graph. Already I feel we are over-populated. The median Strike rate is 38 and the median Rpo is just short of 4.3. The dividing lines are drawn around these figures. These lines split the distribution approximately equally on either side. Let us now look at the chart.

Graph of runs scored
© Ananth Narayanan


The "Top performers" are led by Glenn McGrath and closely followed by Muttiah Muralitharan and Wasim Akram. Donald and Saqlain have excellent strike rates and quite good Rpo figures. Hence they are also in this quadrant. Warne and McDermott have taken more balls to capture a wicket but are also comfortably in the top bowlers quadrant. It is difficult to question the credentials of any of these ODI greats.

The "Attackers" group is led by Lee, Waqar Younis and Shoaib Akhtar. Incidentally, the two Pakistani greats are almost like Siamese twins with almost identical figures (30.5/4.69 and 30.7/4.69). Agarkar has an excellent strike rate but is a millionaire when it comes to conceding runs. Ntini has a similar strike rate but is far more economical. Srinath, Gough and Zaheer just about make it to this quadrant.

The "Defenders" group is led by Ambrose and Pollock. They are followed by Kapil Dev and Walsh. A few modern spinners, Harris, Vettori, Kumble and Harbhajan, are at the borderline.

The "Also rans" group has Afridi and Jayasuriya as prominent members. These are followed by Streak, Razzaq, Cairns and Kallis.

Lee is the outlier as far as the strike rate is concerned with a sub-30 Bpw figure. The three Pakistani greats have just over 30. Ambrose is the outlier as far as Rpo is concerned with a sub-3.5 figure. Pollock and Kapil Dev follow next.

The second chart is drawn with the Bowling average as the criterion. The cut-off is 26.00, with a minimum of 100 wickets. The median Strike rate is around 34 and the median Rpo is around 4.2. The dividing lines are drawn around these figures. These lines split the distribution approximately equally on either side. Let us now look at the chart. The selected bowlers are distributed around the whole graph quite well.

Graph of runs scored
© Ananth Narayanan


This selection is far more stringent because of the dual cut-offs. There are only two bowlers in the "Top performers" quadrant. Only McGrath and Donald make the cut. Their positioning is also intriguingly close to the middle lines. McGrath is more economical but takes a few extra deliveries per wicket. Donald is the other way around. Two of the greatest of ODI bowlers ever.

Lee leads in the "Attackers" group and is closely followed by the Siamese-twins, Waqar Younis and Shoaib Akhtar. Saqlain is also very close to the top performer quadrant. Bond is another exciting new entry who is in a similar position. Maharoof and Flintoff are also in this group.

Lillee and Muralitharan are the leaders in the "Defenders" group and these two are quite close to the top performers. Joel Garner is a well-deserved new entrant here and has an outstanding economy rate. Wasim Akram is well-placed here. This group also has other wonderful ODI bowlers like Hadlee, Pollock, Ambrose and Holding.

Warne is in the fourth quadrant but is very close to the Rpo dividing line. Fleming is close to the Strike rate dividing line. Because we have taken only bowlers of average 26 and below, there are very few poor performers.

Lee is the outlier as far as the strike rate is concerned with a sub-30 Bpw figure. Bond follows him. Garner is the outlier as far as Rpo is concerned with an amazing 3.1. Hadlee and Holding follow him.

There is a clear trend here. Where the bowlers are clustered together, the concentration is on the middle performance quadrants. There are very few bowlers in the high performance and low performance quadrants. This trend is different to the wickets-based graph, where the bowlers are scattered all over the graph. Hence more players are present there in the extreme performance quadrants.

I have also drawn the charts for the top players by Strike rate. Unlike the corresponding batting chart, this has got excellent distribution all over the graph. This is mainly because the Bowling strike rate values are closely bunched together. The top group is led by Donald, Saqlain and Bond. The chart can be seen below.

Graph of runs scored
© Ananth Narayanan


The last one is the chart of the top bowlers by Rpo. Here also there is good distribution. The leading bowlers are Lillee, Hadlee, Holding, McGrath, Wasim Akram and Muralitharan. What a collection of greats. The chart can be seen below.

Graph of runs scored
© Ananth Narayanan


I will attempt a similar analysis on Test Batsmen/Bowlers. I think I have got the two variables for batting identified. That will be the Batting average and Average career weighted bowling quality faced. The results seem to be coming out very well. Readers are welcome to give their suggestions.

An important announcement to the readers. In one of my comments I had mentioned that I would create an open mail id to which readers could send their suggestions. To start with I would appreciate if readers can send in their suggestions on batting and bowling performances in the third innings or the Test batting BCG charts. I will complete my work and depending on the reader responses will incorporate a few popular performances amongst these. Please note that this is a one-to-one communication and the contents will not be published. Please continue to use the blog posting method for the comments you want to be published. This is not my mail id and has been created only for this purpose. To separate the spam, it will be a nice idea if all readers can follow a simple idea of making their title as "It Figures Blog: ..............".

The mail id is ananth.itfigures@gmail.com

Since the readers would have to use a mail route I give the readers my assurance that the mail id is safe and will never be used by me for anything other than communicating with the reader specifically. This will not be part of any group mail nor will mails be cc'd.

September 14, 2010
Form is temporary ...
Posted by Gabriel Rogers in Batting

Alastair Cook: has he escaped his run of bad form? © Getty Images

Having written a couple of blogs unpicking the value of innings-to-innings consistency among batsmen and bowlers, I'm now turning my attention to variability of performance over longer periods. In these analyses, I look at how players' careers are made up of spells of relative success and failure. In other words, what I'm interested in is the statistical basis of what we often call form. Once again, I'm going to start with batsmen and, for reasons of space, I've concentrated on Test cricket only.

The key statistical technique I have used to look at this issue is the simple moving average. That is to say, I have cut up each player's career into a series of overlapping blocks of the same length, and calculated his average for each block in turn. In my base case, the length of block I have chosen is 20 innings. This means that we start with the individual's average over his first 20 innings, then we look at innings 2–21, then innings 3–22, and so on. (There are good arguments for using a slightly more sophisticated kind of moving average; if you're interested in why I didn't, please see the Technical Appendix at the foot of this blog.)
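For readers who think in code, the overlapping-blocks idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(runs, dismissed)` representation, the function names and the choice to return `None` for a block with no dismissals are all hypothetical, and the scores are made up rather than taken from any real career.

```python
from typing import List, Optional, Tuple

# Each innings is (runs, dismissed); dismissed is False for a not-out.
Innings = Tuple[int, bool]


def batting_average(block: List[Innings]) -> Optional[float]:
    """Runs divided by dismissals; None if the batsman was never out."""
    runs = sum(r for r, _ in block)
    outs = sum(1 for _, out in block if out)
    return runs / outs if outs else None


def moving_averages(career: List[Innings], window: int = 20) -> List[Optional[float]]:
    """Average for innings 1 to 20, then 2 to 21, then 3 to 22, and so on."""
    return [
        batting_average(career[start:start + window])
        for start in range(len(career) - window + 1)
    ]


# Made-up scores, purely for illustration (not any real player's record).
example = [(10, True), (50, False), (0, True), (30, True), (70, True)]
print(moving_averages(example, window=3))
```

A career of n innings yields n - 19 blocks of 20, so a 100-innings career produces 81 points on the moving-average curve.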

Later, I'm going to do some number-crunching on the results of my analysis but, to begin with, I want to do something a bit simpler. I want to draw pictures of the results. By and large, I think that cricket statisticians tend to be pretty poor at finding helpful ways of visually presenting the scads of data we often turn out, and we could all do with giving more thought to information graphics. There's a couple of visualisations we routinely see on telly (especially in limited-overs cricket, in which the so-called "worm" and "Manhattan" are used with some frequency), but I'm convinced it would be useful to have an awful lot more tricks of this kind up our sleeves. [Note: I drafted this paragraph before Anantha published his most recent It Figures blog, which I was really pleased to see.]

I find it particularly remarkable that there is no common way of depicting individual players' career records over time (what a statistician would call a longitudinal approach). We all know that, to one degree or another, all players go through peaks and troughs of performance, and that the career stats with which they end up iron out the kinks in their record, through the magic of aggregates and averages. I think it would be great to have a way of thinking about – and looking at – the information that gets lost.

So, in this column, I am introducing my stab at plugging this gap. Because I'm a statistician, I call it the Longitudinal Career Graph (LCG for short); if I were a telly producer, I'd probably call it an iceberg plot, or something like that. An example is shown in Figure 1, depicting Sachin Tendulkar's Test batting career. There are two key features:

* Firstly, the player's moving average throughout his career is given in the shaded area. It is shown relative to his long-run career average, which is pegged to the central axis: whenever the black area is above the axis, the player averaged more over the previous 20 innings than he did over his whole career and, whenever the black area is below the axis, his average for the last 20 innings was worse than he achieved in the long run. The advantage of presenting the data in this way is that it allows us immediately to see a given player's hot and cold streaks in relation to his overall level of performance (which is important because, of course, the kind of figures that constitute a purple patch for one player might represent a dry spell for another).

* Secondly, the evolution of the player's career average over time is indicated by the red line (this is a straightforward depiction of what Statsguru calls the cumulative average). Because the final career average is the point of reference for the moving average plot, the red line will always end at the exact point around which the black area pivots.
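The red line described in the second bullet is simply a running total of runs over a running total of dismissals. A minimal sketch, assuming the same hypothetical `(runs, dismissed)` representation and made-up scores, shows why the line must finish at the career average:

```python
from typing import List, Optional, Tuple

Innings = Tuple[int, bool]  # (runs, dismissed)


def cumulative_averages(career: List[Innings]) -> List[Optional[float]]:
    """Career average after each innings (None until the first dismissal)."""
    points: List[Optional[float]] = []
    runs = outs = 0
    for scored, dismissed in career:
        runs += scored
        outs += int(dismissed)
        points.append(runs / outs if outs else None)
    return points


# Made-up scores, purely for illustration.
example = [(10, True), (50, False), (0, True), (30, True), (70, True)]
red_line = cumulative_averages(example)
career_average = red_line[-1]  # the axis the shaded area is measured against
```

The last point of the cumulative series is, by construction, the final career average, which is the reference level for the moving-average plot.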

Fig 1 Longitudinal Career Graph showing Sachin Tendulkar's Test batting career (20-innings moving average) © Gabriel Rogers

(By the way, I'm not going to use them here, because I can't squish them into the 470 pixels Cricinfo give me to play with, but I've also developed a flashier version which gives more context about where and against whom runs were made – here's Tendulkar again, as an example.)

As you get used to reading these graphs, you'll come to recognise that Tendulkar's LCG shows a pretty constant level of achievement, without too much in the way of dramatic swings of form (that is to say, there's not a whole lot of black on his graph). Nevertheless, we can see relatively good and relatively bad streaks, perhaps most obviously over his last 50 or 60 knocks, with an apparent drop-off in form reaching a nadir at the turn of 2007, and then a distinct renaissance over the last two years (over his last 20 innings, he averages 78.22, with 7 hundreds, which isn't far behind his best-ever 20-knock streak of 81.17).

If you prefer a few more thrills on your rollercoaster ride, how about Mohammad Yousuf's Test career, shown in Figure 2? There's a lot more shaded area on his LCG, indicating that his career has been subject to more dramatic ups and downs. Most conspicuous of all is the amazing peak he reached at the end of 2006. In the 20 innings from the tail-end of 2005 to that point, he scored 2011 runs at an average of 105.84, reaching three figures in precisely half of those 20 knocks. There are troughs to go with the peaks, though, including one at the present moment (he averages 31.80, without a single century, in his last 20 Test innings).

Fig 2 Longitudinal Career Graph showing Mohammad Yousuf's Test batting career (20-innings moving average) © Gabriel Rogers

So much for pretty pictures; what about some numbers? The question I address, here, is which cricketers' careers appear to have been more (or less) streaky. In order to quantify streakiness, I use a measure that is directly related to the area of black on each batsman's LCG – the greater the area, the streakier the player. [Technically, the measure is the root mean squared deviation of the moving average relative to the long-run career average, which is then scaled by the overall average, to provide CV(RMSD).] Table 1 gives a list of the most and least streaky batsmen in Test history, sorted according to this measure.

Table 1: Streakiest batsmen in Test cricket, according to variation [CV(RMSD)] in 20-innings moving average
Rank  Name  M  I  R  Ave  20-Inns Min  20-Inns Max  20-Inns Rng  CV(RMSD)  p
1.  Gatting MW  79  138  4,409  35.56  19.94  86.92  66.98  0.505  0.002
2.  Vengsarkar DB  116  185  6,868  42.13  20.35  114.17  93.82  0.485  0.001
3.  Adams JC  54  90  3,012  41.26  19.11  91.79  72.68  0.482  0.038
4.  Shoaib Mohammad  45  68  2,705  44.34  27.26  86.69  59.42  0.432  0.020
5.  Hussey MEK  52  90  3,981  51.04  22.21  91.71  69.50  0.422  0.007
6.  Flower A  63  112  4,794  51.55  27.26  115.79  88.52  0.421  0.028
7.  de Silva PA  93  159  6,361  42.98  18.20  103.40  85.20  0.406  0.008
8.  Fletcher KWR  59  96  3,272  39.90  15.21  75.29  60.08  0.400  0.056
9.  Tillakaratne HP  83  131  4,545  42.88  21.06  101.00  79.94  0.397  0.049
10.  Macartney CG  35  55  2,131  41.78  15.84  73.00  57.16  0.396  0.008
...
13.  Gambhir G  32  57  2,800  52.83  32.32  91.17  58.85  0.392  0.004
14.  Chanderpaul S  126  215  8,969  49.28  24.16  122.09  97.93  0.392  0.019
15.  Imran Khan  88  126  3,807  37.69  19.17  82.50  63.33  0.385  0.043
...
26.  Mohammad Yousuf  90  156  7,530  52.29  26.70  105.84  79.14  0.347  0.025
...
35.  Waugh SR  168  260  10,927  51.06  21.74  104.69  82.96  0.331  0.178
36.  Sangakkara KC  91  152  8,016  56.85  34.42  110.00  75.58  0.328  0.079
...
39.  Sobers GS  93  160  8,032  57.78  28.00  103.94  75.94  0.316  0.186
...
41.  Hayden ML  102  182  8,437  50.22  25.80  94.00  68.20  0.310  0.051
...
43.  Kallis JH  139  235  11,043  54.94  24.35  95.60  71.25  0.310  0.092
...
46.  Ponting RT  145  245  11,926  54.71  29.72  94.47  64.75  0.305  0.073
...
68.  Dravid RS  141  243  11,467  53.33  23.84  88.81  64.97  0.280  0.170
...
79.  Richards IVA  121  182  8,540  50.24  27.68  89.60  61.92  0.268  0.241
...
94.  Gavaskar SM  125  214  10,122  51.12  24.26  87.84  63.58  0.256  0.394
...
129.  Lara BC  130  230  11,912  53.18  28.45  83.89  55.44  0.240  0.700
...
162.  Tendulkar SR  169  276  13,837  56.02  28.95  81.18  52.23  0.216  0.838
...
166.  Sehwag V  78  133  6,956  54.34  28.26  74.84  46.58  0.214  0.728
...
217.  Bradman DG  52  80  6,996  99.94  67.05  132.61  65.56  0.161  0.754
...
226.  Hobbs JB  60  102  5,410  56.95  39.71  73.22  33.52  0.152  0.686
...
229.  Pietersen KP  66  117  5,306  47.80  35.37  64.37  29.00  0.148  0.880
...
246.  Greig AW  58  93  3,599  40.44  31.20  56.00  24.80  0.126  0.883
247.  Imran Farhat  39  75  2,327  31.88  26.55  42.28  15.73  0.125  0.826
248.  Cowper RM  27  46  2,061  46.84  39.25  59.37  20.12  0.123  0.868
249.  Wessels KC  40  71  2,788  41.00  29.89  51.25  21.36  0.123  0.925
250.  Richardson MH  38  65  2,776  44.77  35.40  57.11  21.71  0.117  0.714
251.  Chauhan CPS  40  68  2,084  31.58  23.89  38.10  14.21  0.112  0.850
252.  D'Oliveira BL  44  70  2,484  40.06  31.50  49.47  17.97  0.104  0.968
253.  Cook AN  60  108  4,364  42.78  32.00  53.24  21.24  0.103  0.993
254.  Bravo DJ  37  68  2,175  32.46  26.85  39.68  12.83  0.100  0.897
255.  Rameez Raja  57  94  2,833  31.83  26.37  38.95  12.58  0.099  0.972
qual. 2,000 runs; stats correct at 30-Aug-2010; full list available here
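The CV(RMSD) column can be reproduced in outline as follows. This is a sketch of the measure as described in the text (root mean squared deviation of the moving average from the career average, divided by the career average), not the author's actual code; the `(runs, dismissed)` format, the function names and the treatment of a block with no dismissals are assumptions.

```python
import math
from typing import List, Tuple

Innings = Tuple[int, bool]  # (runs, dismissed)


def average(block: List[Innings]) -> float:
    runs = sum(r for r, _ in block)
    outs = sum(1 for _, out in block if out)
    # Assumption: a block with no dismissals is scored as its run total.
    return runs / max(outs, 1)


def cv_rmsd(career: List[Innings], window: int = 20) -> float:
    """RMSD of the moving average around the career average, scaled by it."""
    if len(career) < window:
        raise ValueError("career is shorter than the moving-average window")
    career_avg = average(career)
    moving = [average(career[i:i + window])
              for i in range(len(career) - window + 1)]
    mean_sq_dev = sum((m - career_avg) ** 2 for m in moving) / len(moving)
    return math.sqrt(mean_sq_dev) / career_avg


# Two made-up careers with the same overall average of 40.
steady = [(40, True)] * 40
streaky = [(10, True)] * 20 + [(70, True)] * 20
print(cv_rmsd(steady), cv_rmsd(streaky))
```

Both invented careers average 40, but only the second one has any deviation between its 20-innings blocks and the long-run figure, so only it scores above zero.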

Streakiest of the lot is Mike Gatting. His career consisted of three clear phases: to start with, he looked like he was going to fail to live up to the reputation he had gained in county cricket, with a moving average between 20 and 30 for his first fifty or so Test innings; then, he found his feet at Test level and, for the next fifty knocks, his moving average was over 40 (and, at its peak, rose to 86.92); that level of achievement couldn't last, however, and he sank back to 20–30 when he was recalled in the 1990s. The upshot of all this is that Gatting's career average of 35 is a terrible estimator of how he performed at any one time – he was either much better than that or much worse, depending on which phase you caught him in.

Fig 3 Longitudinal Career Graph showing Mike Gatting's Test batting career (20-innings moving average) © Gabriel Rogers

The best-ever 20-innings streaks are Bradman's, naturally (in fact, there are only nine batsmen who have achieved over 20 innings what Bradman managed to sustain over a whole career four times that length). Behind the Don, we find Shivnarine Chanderpaul, who, from the second innings of the Old Trafford Test of 2007 until the first innings in Napier the following year, averaged 122.09. That streak produces a dramatic peak in his LCG (Figure 4), one that is exaggerated by the notable dips in performance that are also evident – indeed, no-one's best and worst streaks encompass such a broad range as his.

Fig 4 Longitudinal Career Graph showing Shivnarine Chanderpaul's Test batting career (20-innings moving average) © Gabriel Rogers

Another remarkable case is that of Aravinda de Silva. There is a massive gap in average between his worst 20-knock streak (18.20) and his best (103.40), but what makes this gulf doubly notable is that the two streaks were almost directly consecutive (there was just one innings between them).

At the other end of the scale, the least streaky batsman in Test history was one of Gatting's opponents on the most infamous day of his career (and a fella who happens to be on the radio as I draft this), Rameez Raja. His LCG shows that he had almost no form-related deviations in his career. He averaged 33.37 over his first 20 Test innings, and scarcely deviated from that level at any stage in his career, ending with a long-run average of 31.83. In his best 20 innings, he averaged 38.95; in his worst 20, 26.37.

Fig 5 Longitudinal Career Graph showing Rameez Raja's Test batting career (20-innings moving average) © Gabriel Rogers

It's not a surprise that the ranks of the least streaky include several batsmen whom I previously identified as having consistent records on an innings-to-innings level. Mark Richardson is there, and it is further evidence of his consistency to see that his 20-innings moving average never dropped any lower than 35.40 (only 11 players have done better than that). Other players who feature in the most consistent 20 of both lists are Richardson's namesake, Peter, Alastair Cook (more about him in a minute), Ranatunga, Bravo, Rameez, Chauhan, Greig, and Stollmeyer. It stands to reason that the batsmen with least variability in their records would also be those whose average stayed pretty constant throughout.

The same isn't true at the other end of the list, however: the streakiest batters are not the same ones who appeared least consistent on an innings-to-innings level. To start with, this surprised me but, after a moment's thought, it makes perfect sense: if your performance in any given innings is unpredictable, then you're less likely to end up with extended phases of good and poor performance (and, if you were consistently poor, then you'd be dropped).

Unlike innings-to-innings consistency - which I showed to be weakly, but identifiably, correlated with both higher runscoring and likelihood of victory - there is absolutely no evidence of an association between streakiness (or the lack of it) and overall batting average or win-rate (r²=0.001, p=0.507 and r²<0.001, p=0.648, respectively). Some good players have up-and-down records; others are much more stable. There's no evidence of an overall advantage for either profile.

The analyses above are all well and good, but do they really help us to understand form? In order to answer that question, it is important to make a distinction between a run of good (or bad) form and a run of good (or bad) scores. Batsmen themselves sometimes make a very similar point, especially when it comes to streaks of low scores (how often did Michael Vaughan tell us he was in great nick; he just kept getting out?) It is central to this argument – and central to the science of statistics – that we should attempt to distinguish any real trend from the influence of chance. If you roll a pair of dice many times, you're bound to observe runs of high scores and runs of low ones, even though the probability of getting any particular result is the same every time you roll the dice and, in the long run, the overall average will be 7.

The way in which we tend to think of form in cricket is not like this at all, though: it is much more like imagining that there are series of rolls when the dice are weighted to make a high score more likely, and series of rolls when low ones are most probable. So how do we distinguish between the two models? The key to the answer is that, if you had a pair of non-constantly weighted dice, you would observe greater variation in your overall series of rolls than you would if there was nothing but plain old luck at play.
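The two dice models are easy to simulate. The sketch below is illustrative only: the face weightings, the 20-roll block length, the number of rolls and the random seed are arbitrary choices, not anything taken from the analysis. The weighted dice are set up so that the long-run average is still about 7, yet the averages of successive blocks of rolls swing much more widely than they do for fair dice.

```python
import random
import statistics


def roll_pair(rng: random.Random, weights=None) -> int:
    """Sum of two dice; weights optionally bias the six faces."""
    faces = [1, 2, 3, 4, 5, 6]
    return sum(rng.choices(faces, weights=weights, k=2))


def block_mean_spread(rolls, block=20):
    """Standard deviation of the means of consecutive blocks of rolls."""
    means = [statistics.mean(rolls[i:i + block])
             for i in range(0, len(rolls) - block + 1, block)]
    return statistics.pstdev(means)


rng = random.Random(2010)
HIGH = [1, 1, 1, 2, 2, 2]  # hypothetical "in form" weighting (mean face 4)
LOW = [2, 2, 2, 1, 1, 1]   # hypothetical "out of form" weighting (mean face 3)

fair = [roll_pair(rng) for _ in range(4000)]
# Weighted dice whose bias flips every 20 rolls.
streaky = [roll_pair(rng, HIGH if (i // 20) % 2 == 0 else LOW)
           for i in range(4000)]

fair_spread = block_mean_spread(fair)
streaky_spread = block_mean_spread(streaky)
print(fair_spread, streaky_spread)
```

The extra spread in the block averages is the signature of changing weights, and it is this kind of excess variation that separates real form from plain luck.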

To apply this principle to cricket data, I used a statistical technique called bootstrapping. I took each batsman's career and put the innings in a random order, to create a new virtual career, but one in which the sequence of knocks is based purely on chance, with no fundamental underlying trends (i.e. no form). For each batsman, I generated 10,000 form-free careers of this type. Then I compared the amount of variability in the random careers with what we see in the batsman's real record. In particular, I worked out the proportion of simulations showing at least as much streakiness – i.e. at least as high a RMSD based on the 20-innings moving average – as the batsman's actual career. This gives us an estimate of the probability that a career as streaky as (or more streaky than) the batsman's real one would have arisen even if there was no underlying variation in form. A statistician would call this estimate an empirical one-tailed p-value.

The p-value for each player is given in Table 1. It will be clear from the explanation above that small p-values (indicating a low likelihood that the player's career would have turned out at least as streaky as it did through chance variation alone) increase our confidence that there probably is evidence of form-related fluctuations in a player's career.
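The shuffling procedure described above can be sketched as follows. This is a minimal illustration under assumptions, not the code behind Table 1: the `(runs, dismissed)` format, the function names, the guard for blocks with no dismissals and the seed parameter are all hypothetical, and the example career is invented. Because shuffling leaves the career average unchanged, comparing RMSD gives the same ordering as comparing CV(RMSD).

```python
import math
import random
from typing import List, Tuple

Innings = Tuple[int, bool]  # (runs, dismissed)


def average(block: List[Innings]) -> float:
    runs = sum(r for r, _ in block)
    outs = sum(1 for _, out in block if out)
    return runs / max(outs, 1)  # assumption: guards blocks with no dismissals


def rmsd(career: List[Innings], window: int = 20) -> float:
    career_avg = average(career)
    devs = [average(career[i:i + window]) - career_avg
            for i in range(len(career) - window + 1)]
    return math.sqrt(sum(d * d for d in devs) / len(devs))


def streakiness_p_value(career, n_sims=10_000, window=20, seed=None) -> float:
    """Share of shuffled careers at least as streaky as the real one."""
    rng = random.Random(seed)
    observed = rmsd(career, window)
    shuffled = list(career)
    hits = 0
    for _ in range(n_sims):
        rng.shuffle(shuffled)  # same innings, random order: a form-free career
        if rmsd(shuffled, window) >= observed:
            hits += 1
    return hits / n_sims


# An invented career with an obvious change of level halfway through.
in_and_out = [(10, True)] * 30 + [(70, True)] * 30
p_streaky = streakiness_p_value(in_and_out, n_sims=500, seed=1)
print(p_streaky)
```

A career like the invented one, with all the low scores first and all the high scores last, is almost never matched by its own shuffles, so its estimated p-value comes out very small.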

To give one obvious example: it seems extremely unlikely (p=0.007) that a career with the profile of Mike Hussey's would have developed unless there was some kind of variation in his underlying run-scoring capacity (i.e. form). His LCG (Figure 6) gives a fairly dramatic depiction of the deterioration (and subsequent slight resurgence) in his scoring.

Fig 6 Longitudinal Career Graph showing Michael Hussey's Test batting career (20-innings moving average) © Gabriel Rogers

A few other players have careers that show the opposite profile; for instance, chance seems like an unlikely explanation of the clear upward trend to Daniel Vettori's Test batting career (p=0.018). Others have careers that are too up-and-down (Yousuf, Chanderpaul, de Silva), or too dominated by one atypical peak (Gatting, Vengsarkar) to be likely to have occurred without some underlying variability in form.

However, it turns out that cases like these are the exception rather than the rule. In a substantial majority of cases, the careers batsmen end up with are perfectly consistent with the hypothesis that an individual's long-run average provides a reasonable estimator of his run-scoring ability throughout his time in the game. This suggests pretty strongly that a lot of what we think of as form is really just random variation – the streakiness of the evenly weighted dice. Cricket fans are not alone in this: it is very well established that human beings – and perhaps especially sports fans – have a pretty poor appreciation of the play of chance (a phenomenon known as the clustering illusion).

A case in point is Alastair Cook. A couple of weeks ago, gallons of newsprint were spilled describing his supposed slump in form. However, it turns out that his is one of the least form-inflected careers of all, as his LCG (Figure 7) shows. Even before his recent Oval revival, he had averaged 39.16 in his last 20 Test innings – hardly setting the world on fire, but hardly the record of a lost cause, either. In fact, his best-ever 20-innings run in Test cricket is 53.24, and his worst is 32.00 and, in the grand scheme of things, this is not very much variation at all. This much can be inferred from the fact that the streaks overlap: there are 11 innings that appear in both!

Fig 7 Longitudinal Career Graph showing Alastair Cook's Test batting career (20-innings moving average) © Gabriel Rogers

When I took Cook's innings and put them in a random order 10,000 times, a huge majority – 9,925 – of those virtual careers showed greater streakiness than we see in his actual career. If you could see the LCGs of the form-free careers, they would almost all have conspicuously more black on them than we see on Cook's real-world graph (in the most extreme, "Cook" averaged 20.55 in one 20-innings streak and 91.19 in another). And just about all of them contained at least one cold streak that looks much worse than his recent slump.

In fact, Cook is just an extreme example of a phenomenon that is very widely observed in this dataset. Brian Lara was in an extraordinary run of good form when he averaged 83.89 in 20 consecutive innings in 2004–05, right? But shuffle his scores around at random and just over three quarters of the careers you produce will contain a streak just as hot. There's a greater weight of evidence to mark out Rahul Dravid's slump of a couple of years ago as "real" but, still, put his innings in any old order and, about 15% of the time, you'll end up with a trough at least as deep. That's a degree of uncertainty that would be very unlikely to convince statisticians in any other field that we were looking at anything other than a blip.

In this respect, I hope that, as the pressure mounted on Cook, he adopted an attitude similar to that advised by Greg Chappell (as quoted by Aakash Chopra in this column): "When not in form you should look back at your career stats. More often than not you'd find that you scored runs in every fourth or fifth innings, and hence every innings of low score is actually taking you closer to the innings in which you'd score runs." This is, doubtless, excellent advice from a psychological perspective and it's almost excellent advice from a statistical perspective, too (although we should be careful of the gambler's fallacy – that is, assuming that streaks are liable to correct themselves by some sort of "law of averages"). What we can say is that many apparent slumps like Cook's recent one are, mathematically speaking, entirely consistent with simple random variation around a constant mean that is well estimated by the batsman's career average. Or, in other words, form is temporary, but class... well, even if it isn't permanent, it seldom fluctuates much.


Technical appendix

1. To start with, an acknowledgement. The approach set out in this blog is heavily influenced by (and, in some places, directly pinched from) Curve Ball, an excellent book on baseball stats by two academic statisticians. (It's aimed at people who are fascinated by baseball and mildly interested in numbers, but I've found it works just as well for those of us who'd put that the other way around.)

2. It may be noted that, although I've presented some p-values, I haven't, at any stage, used the dread words statistically significant. Conventionally, we talk about a finding being significant if its p-value is lower than some threshold. That threshold is very often 0.05 – equivalent to saying we'll accept a 1-in-20 chance of considering our finding significant when, in fact, it's just a fluke. I'm wary of this approach, for a couple of reasons: firstly, the threshold is always arbitrary, and always involves a trade-off between type I and type II errors (in other words, the more cautious you are about interpreting something as significant, the greater the chance that you'll falsely classify something as non-significant). Secondly, there's a problem, here, with multiple testing. There are 255 batsmen in the dataset, so we'd expect to end up with 12 or 13 with p-values less than 0.05 just by chance. You could correct for this, using Bonferroni methods or similar, but I took the view that that would be complicated to explain, probably unnecessarily conservative, and would put too much stress on my approximated p-values (it would require p to be accurate to five or six decimal places, and you'd need a lot more than 10,000 samples to establish that). For these reasons, I present my p-values without correction and without (much) comment.
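The arithmetic behind the multiple-testing point is short enough to check directly. The numbers 255, 0.05 and 10,000 come from the text; the variable names are just labels chosen for this sketch.

```python
n_batsmen = 255        # batsmen in the dataset
alpha = 0.05           # conventional significance threshold
n_simulations = 10_000 # shuffled careers per batsman

# Batsmen expected to fall below the threshold by chance alone.
expected_by_chance = n_batsmen * alpha          # about 12.75

# Per-batsman threshold under a simple Bonferroni correction.
bonferroni_threshold = alpha / n_batsmen        # about 0.000196

# An empirical p-value from 10,000 shuffles moves in steps of this size.
p_value_step = 1 / n_simulations                # 0.0001

print(expected_by_chance, bonferroni_threshold, p_value_step)
```

With a corrected threshold of roughly 0.0002 and p-values that can only move in steps of 0.0001, one or two shuffled careers either way would decide the outcome, which is why far more samples would be needed to apply such a correction with any confidence.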

3. Whenever an analysis is dependent on a statistician's arbitrary choices, it is crucial to examine how much of an influence these decisions had on the results of the analysis. This is a process known as sensitivity analysis, because it analyses the extent to which the outputs of the process are sensitive to its underlying assumptions.

I did loads of these analyses. The most obvious place to start is with the size of the window over which the moving average was calculated. I looked at longer and shorter windows; here are the results for 10 innings and 30 innings. You'll see that neither list is terrifically different from the 20-innings analysis. It's interesting to see that there have been a few players who've managed 10-innings streaks with higher averages than Bradman's best; highest of all is Kumar Sangakkara's 2006–07 effort of 1,185 runs with 6 hundreds (5 of them 150+) at 197.50. No one other than Bradman has ever sustained an average of 100-plus for 30 innings, though.

Another obvious sensitivity analysis is to question the use of the simple moving average at all. The measure has some disadvantages, the most notable amongst which is that it can appear to be driven not by what's happening at a particular moment in time, but by what happened 20 innings before (take another look at Tendulkar's LCG: that sudden drop-off towards halfway through 2005 comes about because it's the point at which his 241* at the SCG in 2004 is more than 20 innings ago and, thus, falls out of the calculated moving average). An alternative approach that minimises this problem is the exponentially weighted moving average, in which innings are never completely discarded; they just receive ever-decreasing weight as they recede into the past. I chose not to use this method, in my base case, because it answers a slightly different question – something like: taking into account everything we know about a player's career to date, and placing more importance on his most recent outings, what kind of form was he in at any given instant? This is a valid question that might have its uses (perhaps if you were trying to predict how well you expect the player to do in his next innings – although it doesn't answer that question very well). However, it's not quite what I'm interested in, here, which is capturing how well a batsman did over a given phase (and, in that context, I think it's entirely appropriate that the measure should be influenced by notable scores falling out of the window of interest).

Nevertheless, to investigate how much difference the alternative approach makes, I redid all the analyses detailed above using EWMAs instead of the simple moving average. The weighting coefficient I used was 0.066967, which may sound like a weird number, but it's the one that dictates that the weight applied halves every ten innings (so ten innings ago is worth 50% as much, 20 innings ago 25%, and so on). The results table is here. By and large, there is very little difference between these results and those calculated according to the simple moving average. Maybe this mode of analysis gives very slightly more prominence to players who have a distinct trend to their careers (either worsening – a la Adams and Hussey – or improving – like Vettori and Imran). On the whole, though, I can't tell much difference between them.
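The weighting coefficient quoted above follows directly from the ten-innings half-life, and the sketch below derives it. How the EWMA is applied to a batting average is not spelled out in the text, so the approach here (decaying the runs and the dismissals separately and taking their ratio) is an assumption, as are the function and variable names.

```python
from typing import List, Optional, Tuple

Innings = Tuple[int, bool]  # (runs, dismissed)

HALF_LIFE = 10  # innings after which an innings' weight has halved
ALPHA = 1 - 0.5 ** (1 / HALF_LIFE)  # about 0.066967


def ewma_averages(career: List[Innings], alpha: float = ALPHA) -> List[Optional[float]]:
    """Exponentially weighted batting average after each innings.

    Assumption: runs and dismissals are decayed separately, so older
    innings never drop out but count for less and less.
    """
    decay = 1 - alpha
    weighted_runs = weighted_outs = 0.0
    series: List[Optional[float]] = []
    for runs, dismissed in career:
        weighted_runs = decay * weighted_runs + runs
        weighted_outs = decay * weighted_outs + (1.0 if dismissed else 0.0)
        series.append(weighted_runs / weighted_outs if weighted_outs else None)
    return series


print(round(ALPHA, 6), (1 - ALPHA) ** 10, (1 - ALPHA) ** 20)
```

Raising the decay factor to the 10th and 20th powers gives weights of one half and one quarter, matching the 50% and 25% figures in the text.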

4. If any statsheads read my methods and inferred (correctly) that I used bootstrap sampling without replacement, and thought that I really should have used a with-replacement approach, it's a fair cop. I just thought it'd be much easier to explain the process as shuffling the deck rather than sampling from a theoretical distribution approximated by the empirical dataset. I did some sensitivity analysis to show that it doesn't make a huge amount of difference, in this case, but I accept that with-replacement is theoretically the better approach (plus, of course, it allows you to do amusing things like estimate confidence intervals for the batting average) (another time).
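For the curious, the with-replacement idea mentioned in point 4 might look something like the sketch below, which estimates a percentile confidence interval for a batting average. None of this comes from the analysis itself: the percentile method, the number of resamples, the function name and the made-up career are all assumptions chosen for illustration.

```python
import random
from typing import List, Tuple

Innings = Tuple[int, bool]  # (runs, dismissed)


def bootstrap_average_ci(career: List[Innings], n_resamples: int = 2000,
                         level: float = 0.95, seed=None) -> Tuple[float, float]:
    """Percentile interval for the batting average, resampling innings."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(career, k=len(career))  # with replacement
        runs = sum(r for r, _ in sample)
        outs = sum(1 for _, out in sample if out)
        if outs:  # skip the rare resample with no dismissals at all
            estimates.append(runs / outs)
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * len(estimates))]
    upper = estimates[int((1 - tail) * len(estimates)) - 1]
    return lower, upper


# Made-up career of 40 innings (not any real player's record).
career = [(0, True), (10, True), (25, True), (40, False),
          (55, True), (80, True), (120, True), (5, True)] * 5
low, high = bootstrap_average_ci(career, seed=7)
print(low, high)
```

Unlike shuffling, each resampled career here can contain some innings more than once and omit others, which is what allows the average itself to vary from one resample to the next.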

September 8, 2010
ODI batsmen: a totally new look through BCG charts
Posted by Anantha Narayanan in ODIs

This article is a completely different graphical look at ODI batsmen and has been inspired by the work of my friend Arvind Iyengar, who did a similar analysis on a cricketing site to which we both contribute. I have made some significant changes and increased the scope of the analysis.

Bruce Henderson of BCG (Boston Consulting Group) created these charts in 1968 to study the growth-share aspects of products and business units. This is an excellent way to study two related variables together. The variables are plotted on a graph which is split into four equal (or unequal) quadrants. The placement of a particular product, or in this case batsman, gives excellent insight into the batsman's position in the galaxy of batsmen.

Arvind had drawn the chart between Batsman strike rate and Batting average. I felt that the Batting average was the wrong variable, since it is arrived at by multiplying Strike rate and Average balls per innings. Consequently the Strike rate is represented on both the X and Y axes. Hence I have changed the axis variables to Strike rate and Average balls per innings.
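The relationship between the three measures can be checked with a little arithmetic. The career totals below are invented round numbers, and the variable names are labels for this sketch only. The identity holds exactly when balls are counted per dismissal and the strike rate is divided by 100; with balls per played innings, the same product gives runs per innings instead.

```python
# Made-up career totals, chosen only to keep the arithmetic clean.
runs, balls, innings, dismissals = 5000, 6250, 150, 125

strike_rate = 100 * runs / balls          # 80.0 runs per 100 balls
batting_average = runs / dismissals       # 40.0
balls_per_dismissal = balls / dismissals  # 50.0
balls_per_innings = balls / innings       # balls per played innings
runs_per_innings = runs / innings

# Batting average is strike rate times balls per dismissal, over 100.
print(batting_average, strike_rate * balls_per_dismissal / 100)

# With balls per played innings the product is runs per innings.
print(runs_per_innings, strike_rate * balls_per_innings / 100)
```

Either way, the strike rate is baked into the second quantity, which is the overlap between the two axes that the paragraph above objects to.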


The above represents a typical BCG chart. The batsmen in the top-right quadrant, the red one, are the "Top performers". They are to the right of the Strike rate line and above the Average balls per innings line. The ones in the bottom right quadrant, the green one, are the "Dashers". They score quite fast but do not last for many balls. Certainly an asset, but could do better. Similarly, the top left quadrant, the blue one, contains the "Stayers". They last long but score relatively slowly. They are probably more valuable in the ODI game. However the dashers are likely to be more valuable in the T20 game. The bottom left quadrant, the orange one, represents the "Also rans". They fall behind in both areas.

A few things need to be made clear. I have used the Average balls per played innings rather than the balls per dismissed innings. This is to make the analysis fairer across all batsmen, since the latter measure would benefit middle-order batsmen a lot, possibly disproportionately.

The other thing is that the two central dividing lines can be drawn in two ways. One is to draw them right in the middle of each axis. However, this does not take into account the distribution of values. The alternative is to draw the lines at the median values, so that around half the batsmen lie above the mid line of Average balls per innings and around half lie to the right of the mid line of Strike rate. This leads to unequal quadrants but makes analysis of the batsmen far more meaningful. Let me add that the asymmetrical central lines are my own idea; most BCG charts have only centrally located dividing lines. However, asymmetrical dividing lines ensure a fairer distribution of players across quadrants.
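As a rough illustration of the median-based split, the sketch below places six hypothetical batsmen into the four quadrants. The figures and the labels A to F are invented, and the decision to count a batsman sitting exactly on a dividing line as being on the higher side is an assumption; nothing here reproduces the actual charts.

```python
from statistics import median

# Hypothetical (strike rate, average balls per innings) pairs, labelled A-F.
batsmen = {
    "A": (88.0, 52.0), "B": (92.0, 38.0), "C": (70.0, 55.0),
    "D": (68.0, 36.0), "E": (80.0, 47.0), "F": (74.0, 43.0),
}

# Dividing lines at the medians, so roughly half the players fall on each side.
sr_line = median(sr for sr, _ in batsmen.values())
balls_line = median(b for _, b in batsmen.values())


def quadrant(strike_rate: float, balls: float) -> str:
    """Quadrant name; values exactly on a line count as the higher side."""
    if strike_rate >= sr_line:
        return "Top performer" if balls >= balls_line else "Dasher"
    return "Stayer" if balls >= balls_line else "Also ran"


groups = {name: quadrant(*values) for name, values in batsmen.items()}
print(sr_line, balls_line, groups)
```

Swapping `median` for the midpoint of each axis range would give the equal-quadrant version of the chart, where a few extreme values can push most players into one or two quadrants.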

Finally, the chart is drawn on two criteria. The top run getters and the top batting averages are used as different criteria, the minimum runs requirement for the latter selection being 2500.

The first chart is drawn with runs scored as the criterion, with 6500 runs as the cut-off for selection. A lower cut-off would clutter up the graph; already I feel we are over-populated. The median Strike rate is around 76 and the median Average balls is just short of 45. The dividing lines are drawn around these figures. These lines split the distribution approximately equally on either side. Let us now look at the chart. At the end I have also shown the alternate graph in which the dividing lines are drawn right in the middle.

Graph of runs scored
© Ananth Narayanan


The "Top performers" are led by Tendulkar and include Ponting, Lara, Mark Waugh, Saeed Anwar and Richards. It is difficult to question the credentials of any of these ODI greats. Gayle just about falls short of breaking in. The "Dashers" group is led by Gilchrist, Sehwag, Jayasuriya, Gayle and Yuvraj. Quite a few attacking batsmen also fill this group. The "Stayers" group is led by Haynes and is followed by Kallis, Ganguly, Miandad and a few others. The strugglers group, the "Also rans", has Border, Fleming and Azharuddin as the prominent members. Andy Flower and Sangakkara are in this group but are quite close to the central point. Sangakkara could move out of this group by either increasing his scoring rate or average balls.

The second chart is drawn with the Batting average as the criterion. The cut-off is 40.00, with a minimum of 2500 runs. The median Strike rate is around 76 and the median Average balls is around 46. The dividing lines are drawn around these figures. These lines split the distribution approximately equally on either side. Let us now look at the chart. The selected batsmen are distributed around the whole graph quite well. Hence the mid-point graph is not necessary: it would be almost the same as this one.

Graph of batting average
© Ananth Narayanan


The "Top performers" are Tendulkar, Zaheer Abbas, Hayden and Ponting. No one else is even on the border. Richards leads in the "Dashers" group and is followed by de Villiers, Dhoni, Hussey and Pietersen. Greenidge and Haynes top the "Stayers" group which also has Jones, Ganguly and Bevan. The last "Also ran" group has very few members. Even amongst these, Sarwan, Clark and Martyn are very close to the centre line of the Strike rate. It is easy to conclude that once we select batsmen with averages of 40.00 it is difficult to find really average performers. They compensate for deficiency in one with the other. Lara, Sarwan, Clark and Mohd Yousuf are quite close to the central point.

I had also drawn the charts for the top players by Strike rate. This is quite a lop-sided graph since there is a huge gap between the strike rates of the top batsmen (113, 103, 96 ...). Surprisingly many of these players have fairly high Balls per innings. Hence the graph is heavy with players on the left side.

Graph of strike rate
© Ananth Narayanan


In addition, I had also drawn the charts for the top players by Average balls played. This is also quite a lop-sided graph since there is a huge gap between the balls played values of the top batsmen (67, 62, 57 ...). Surprisingly many of these players have decent strike rates. Hence the graph is heavy with players on the bottom.

Graph of average balls played
© Ananth Narayanan


To view/download the graph of top run-makers with an equal quadrant size split, please click/right-click here. The graph is self-explanatory. As I feared, this is a totally unacceptable presentation. Just one player, Tendulkar, makes it to the "Top performers" group.
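The difference between the two ways of placing a dividing line can be sketched as follows. A median line leaves roughly half the batsmen on either side, whereas a line at the mid-point of the range gives quadrants of equal size on the chart but can be dragged well away from the bulk of the players by one extreme value. The strike rates below are hypothetical and the helper names are assumptions made for illustration.

```python
from statistics import median


def midpoint(values):
    """Centre of the range: gives quadrants of equal size on the chart."""
    return (min(values) + max(values)) / 2


def count_above(values, line):
    """Number of values strictly above a dividing line."""
    return sum(1 for v in values if v > line)


# Hypothetical strike rates with one extreme value at the top.
strike_rates = [68.0, 70.0, 72.0, 74.0, 76.0, 78.0, 80.0, 82.0, 113.0]

median_line = median(strike_rates)
midpoint_line = midpoint(strike_rates)

above_median = count_above(strike_rates, median_line)
above_midpoint = count_above(strike_rates, midpoint_line)

print(f"Median line {median_line}: {above_median} batsmen above")
print(f"Mid-point line {midpoint_line}: {above_midpoint} batsmen above")
```

With these sample values the median line sits at 76.0 with four batsmen above it, while the mid-point line sits at 90.5 with only one batsman above it, which is the kind of near-empty quadrant described above.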

I will next do a similar analysis on ODI bowlers. The intriguing feature of that graph will be that for both Bowling strike rate and Bowling Rpo, the lower the value, the better the bowler. In other words, the quadrants will exchange their significance. Readers are welcome to give their suggestions.
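A minimal sketch of how the comparisons would flip for bowlers is given below. The group descriptions, the function name and the dividing-line values are placeholders assumed for illustration only; the bowling analysis itself has not been done here and no group names have been fixed for it.

```python
def bowler_quadrant(strike_rate, rpo, sr_line, rpo_line):
    """Describe where a bowler sits relative to the two dividing lines.

    For bowlers a lower strike rate (balls per wicket) and a lower Rpo
    (runs per over) are both better, so the comparisons are reversed
    relative to the batting charts.
    """
    strikes_often = strike_rate <= sr_line
    economical = rpo <= rpo_line
    if strikes_often and economical:
        return "better on both"
    if strikes_often:
        return "better on strike rate only"
    if economical:
        return "better on Rpo only"
    return "worse on both"


# Hypothetical dividing lines and bowler figures.
SR_LINE = 36.0
RPO_LINE = 4.5

print(bowler_quadrant(30.0, 4.0, SR_LINE, RPO_LINE))
print(bowler_quadrant(42.0, 5.1, SR_LINE, RPO_LINE))
```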

An important announcement to readers: in one of my comments I had mentioned that I would create an open mail id to which readers could send their suggestions. To start with, I would appreciate it if readers could send in their suggestions on which third-innings batting and bowling performances should be considered. I will complete my work and, depending on the reader responses, will incorporate a few of the popular performances amongst these. Please note that this is a one-to-one communication and the contents will not be published. Please continue to use the blog posting method for the comments you want published. This is not my personal mail id and has been created only for this purpose. To help separate the spam, it would be nice if all readers could give their mails a title of the form "It Figures Blog: ..............".

The mail id is ananth.itfigures@gmail.com

Since readers will have to use the mail route, I give my assurance that the mail id is safe and will never be used by me for anything other than communicating with the individual reader. It will not be part of any group mail, nor will mails be cc'd.

Y Anantha Narayanan has over 35 years of IT background. Over the past 15 years, he has been concentrating on Cricket analysis and software development. He has been involved with StumpVision, Wisden, Hallmark Software and his own site www.thirdslip.com during this period.
© ESPN EMEA Ltd