It Figures
June 25, 2010
Significant Test innings, and their architects: a follow-up
Posted by Anantha Narayanan in Batting

Brian Lara has a high Significant Innings percentage of 45.69 © AFP
A few days back I came out with an article on the significant innings in Test cricket. It received, arguably, the best responses I have had for any of my articles in this web log. The readers appreciated that there is a completely new measure to evaluate Test innings. The fact that it was off the beaten track of averages, centuries, strike rates et al was a very important factor. The comments and suggestions were some of the best I have received and I was determined to come out with the follow-up article sooner rather than later.

I will summarise the changes below.

1. As many readers have suggested, I have used the innings as the basis for determining the significant innings rather than the two team innings together. This takes care of the many Tests where the two innings by a team are as different as chalk and cheese. If we take the famous Calcutta Test of 2001, the two Indian innings were 171 and 657. The 59 in the first innings was an outstanding innings considering the 171 for 10 as the basis, probably not if we take 728 for 17 as the basis.

2. This is one lapse which was missed by all the readers, and for that matter by me. In the base analysis, I had taken the wickets as the base for determining the runs and balls cut-offs. This is quite wrong. I should have taken the number of batsmen who batted as the basis. Take the West Indian innings of 790 for 3. The base should be 5 (which includes Sobers) and not 3 (the number of wickets). If a team is all out, the base will be 11. Of course batsmen who did not bat will be excluded, but batsmen who retired hurt will be included. This is absolutely the correct method.

3. I have raised the multiplier values for two reasons. One is the consideration of the innings as the base and the other is the taking of batsmen as the base rather than wickets. I have also introduced a graded multiplier. The multiplier is highest at 2.00 for low rpb/bpb (runs per batsman and balls per batsman) values for 1-7 batsmen and stays at 1.00 for high rpb/bpb values for 8-11 batsmen. The capping of the runs cut-off at 100 and the balls cut-off at 200 is also retained.

4. I will exclude from the total innings all not-out innings below 10, unless they have already become SIs. This is a very relevant suggestion. It is necessary since quite a few batsmen, especially in the late order and in later innings, remain unbeaten on low scores. Since they have not been given the opportunity to further their innings, these innings are excluded from the total.

5. Now that we have the single innings as the base and have raised the cut-off values, there is no need to have the one-third criteria. Even in the 26 by New Zealand, the 11 by Sutcliffe does not really warrant being considered as a SI. On the other hand, Hutton's 30 out of 52, Tancred's 26 out of 47 and Flintoff's 24 out of 51 must be included. This is done by keeping the lower limit for runs cut-off as 20.

6. Finally, one very important addition. I have done a weighting of the innings by determining a Significant Innings (SI) Index value. A 100 out of 200 and a 100 out of 500 are both significant innings. However the first innings is far more significant than the latter. This measure indicates the extent of significance. It is possible that this factor can very well be used to determine the influence of batsmen. So there is an additional table based on the average SI Index value. The SI Index value is a simple calculation. The innings measure, runs or balls, is divided by the runs cut-off or balls cut-off, as required. Thus the minimum value for this is 1. Where a player has crossed both cut-offs, the higher index value is taken.
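
The rules in points 2 to 6 can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the program used for the tables. The function names are hypothetical. The graded multiplier is passed in by the caller, because its exact schedule is not spelled out above. The balls cut-off is assumed to be built in the same way as the runs cut-off.

```python
def runs_cutoff(team_runs, batsmen_batted, multiplier):
    """Runs cut-off for one team innings (illustrative sketch).

    batsmen_batted counts everyone who batted, including retired hurt.
    multiplier is the graded factor (1.00 to 2.00 in the text); its exact
    schedule is not reproduced here, so the caller supplies it.
    """
    cutoff = team_runs / batsmen_batted * multiplier
    # Lower limit of 20 (point 5) and cap of 100 (point 3).
    return min(max(cutoff, 20.0), 100.0)


def balls_cutoff(team_balls, batsmen_batted, multiplier):
    """Balls cut-off, assumed analogous; only the cap of 200 is stated."""
    return min(team_balls / batsmen_batted * multiplier, 200.0)


def si_index(runs, balls, runs_cut, balls_cut):
    """Return the SI Index, or None if neither cut-off is reached."""
    candidates = []
    if runs >= runs_cut:
        candidates.append(runs / runs_cut)
    if balls >= balls_cut:
        candidates.append(balls / balls_cut)
    # Where both cut-offs are crossed, the higher value is taken.
    return max(candidates) if candidates else None


def counts_in_total(runs, not_out, is_si):
    """Point 4: drop not-out innings below 10 unless already significant."""
    return is_si or not (not_out and runs < 10)


# Hutton's 30 out of 52, assuming all eleven batted and a multiplier of 2.00.
# Balls faced are not given above, so 0 is used as a placeholder.
cut = runs_cutoff(52, 11, 2.0)          # the lower limit of 20 applies
print(cut, si_index(30, 0, cut, 200.0))  # prints 20.0 1.5
```

With the same lower limit, Sutcliffe's 11 out of 26 falls short of the cut-off and `si_index` returns `None`, which matches the intent of point 5.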

Let me conclude this section by saying that the user responses have been outstanding, revealing a very incisive way of thinking. Let us now look at the tables.

First the table of players, ordered by the % of SIs played. This is a reflection of the consistency of the players. Players such as Dravid, Border et al are likely to be at the top. They are likely to score two 75s in successive innings.

List of players, ordered by the % of SIs achieved

SNo Batsman              Cty Mats  Runs Inns  SIs % of SI

  1.Bradman D.G          Aus   52  6996   80   40  50.00
  2.EdeC Weekes          Win   48  4455   81   38  46.91
  3.Hobbs J.B            Eng   61  5410  101   47  46.53
  4.Barrington K.F       Eng   82  6806  129   59  45.74
  5.Lara B.C             Win  131 11953  232  106  45.69
  6.Dravid R             Ind  139 11395  236  106  44.92
  7.May P.B.H            Eng   66  4537  105   47  44.76
  8.Sutcliffe H          Eng   54  4555   83   37  44.58
  9.Hutton L             Eng   79  6971  137   61  44.53
 10.Chanderpaul S        Win  124  8710  210   93  44.29
 11.Hammond W.R          Eng   85  7249  137   60  43.80
 12.Younis Khan          Pak   63  5260  111   48  43.24
 13.Gavaskar S.M         Ind  125 10122  211   91  43.13
 14.Umrigar P.R          Ind   59  3631   93   39  41.94
 15.Flower A             Zim   63  4794  110   46  41.82
 16.Compton D.C.S        Eng   78  5807  129   53  41.09
 17.Kallis J.H           Saf  138 10911  232   95  40.95
 18.Javed Miandad        Pak  124  8832  187   76  40.64
 19.Richards I.V.A       Win  121  8540  180   73  40.56
 20.Tendulkar S.R        Ind  166 13447  269  107  39.78

The top three remain the same. A few minor changes down the table. Chanderpaul moves down a few places. Sutcliffe also moves down. Lara, Dravid and May move up. Andy Flower moves down a few places.

The most significant change is that of Tendulkar who moves up quite a few places into the top-20 table.

Now the table of players, ordered by the average SI Index value. This is a reflection of the extent of significance once the cut-off is reached. This is likely to have players like Sehwag, Lara et al at the top. They are likely to score a 150 and 0 in two successive innings.

List of players, ordered by the average SI index value

SNo Batsman              Cty Mats  Runs Inns  SIs SII   Avge
                                                  Pts   SII

  1.Bradman D.G          Aus   52  6996   80   40  92  2.320
  2.Sehwag V             Ind   76  6691  129   40  80  2.009
  3.Hanif Mohammad       Pak   55  3915   95   33  65  1.995
  4.Sangakkara K.C       Slk   88  7549  146   52 102  1.965
  5.Lara B.C             Win  131 11953  232  106 207  1.959
  6.Pietersen K.P        Eng   62  5166  111   35  66  1.914
  7.Amiss D.L            Eng   50  3612   86   27  51  1.897
  8.Flower A             Zim   63  4794  110   46  87  1.892
  9.Cullinan D.J         Saf   70  4554  111   33  61  1.877
 10.Crowe M.D            Nzl   77  5444  129   42  78  1.865
 11.Walcott C.L          Win   44  3798   73   29  54  1.863
 12.Atapattu M.S         Slk   90  5502  149   40  74  1.853
 13.Mitchell B           Saf   42  3471   79   30  55  1.844
 14.Ijaz Ahmed           Pak   60  3315   92   25  45  1.836
 15.Hill C               Aus   49  3412   89   30  54  1.833
 16.Saeed Anwar          Pak   55  4052   91   35  64  1.832
 17.Asif Iqbal           Pak   58  3575   97   28  51  1.832
 18.Gomes H.A            Win   60  3171   87   21  38  1.830
 19.Harvey R.N           Aus   79  6149  134   48  87  1.821
 20.Gooch G.A            Eng  118  8900  215   75 136  1.820

The batsman non pareil, Bradman, has an average SI Index value of 2.32. Then comes Sehwag, as expected. His string of high scores has propelled him to this second position. Now there is a surprise. Hanif Mohammad, the chalk to cheese (or vice versa) of Sehwag, closely follows Sehwag. His third position indicates how under-rated the great little master was. What he did for Pakistan cricket is unbelievable. That too on difficult pitches and often away. Now come two modern greats, Sangakkara and Lara. This confirms their penchant for out-performing often.

I have given below the best three innings as far as the SI Index is concerned. The first one is the Asif Iqbal classic. During 1967, Pakistan scored 216 in the first innings. England replied with 440. Then Pakistan slumped to 65 for 8. Asif Iqbal then played the greatest of all late order innings and one of the best ever. He added 190 with Intikhab Alam and took the total to 255. England won comfortably. Asif Iqbal’s innings has the highest SI index value ever of 5.41, based on a runs-cutoff value of 27.7 (255/11, multiplied by a factor of 1.333 (no 8-11) and adjusted downwards by 10% for being the second innings).

Pakistan 2nd innings
+Wasim Bari                              b Titmus              12
Mohammad Ilyas        c Cowdrey          b Higgs                1
Saeed Ahmed           c Knott            b Higgs                0
Majid Khan                               b Higgs                0
*Hanif Mohammad       c Knott            b Higgs               18
Ghulam Abbas          c Knott            b Higgs                0
Mushtaq Mohammad      c D'Oliveira       b Underwood           17
Javed Burki                              b Underwood            7
Asif Iqbal            st Knott           b Close              146
Intikhab Alam                            b Titmus              51
Saleem Altaf          not out                                   0
Extras                (b 1, lb 1, nb 1)                         3
Total                 (all out, 101.1 overs)                  255
FoW: 1-1, 2-5, 3-5, 4-26, 5-26, 6-41, 7-53, 8-65, 9-255, 10-255.
The next one is another all-time great innings by Dennis Amiss. During 1974, in Kingston, England started their second innings, 230 in arrears. Amiss opened the innings, remained unbeaten on 262 and guided England to a safe total of 432 for 9. This innings is reminiscent of the Laxman classic. Amiss' innings has the second highest SI index value ever of 4.52, based on a runs-cutoff value of 39.3 (432/11, multiplied by a factor of 1.667 (no 1-7) and adjusted downwards by 10% for being the second innings).
England 2nd innings
G Boycott             c Murray           b Boyce                5
DL Amiss              not out                                 262
JA Jameson            c Rowe             b Barrett             38
FC Hayes              run out                                   0
*MH Denness           c Rowe             b Barrett             28
AW Greig                                 b Gibbs               14
DL Underwood          c Murray           b Sobers              12
+APE Knott            run out                                   6
CM Old                                   b Barrett             19
PI Pocock             c sub              b Boyce                4
RGD Willis            not out                                   3
Extras                (b 10, lb 11, w 1, nb 19)                41
Total                 (9 wickets, 183 overs)                  432
FoW: 1-32, 2-102, 3-107, 4-176, 5-217, 6-258, 7-271, 8-343, 9-392.
Now a modern classic by Saeed Anwar. During 1999, in Calcutta, Pakistan started their second innings, 38 in arrears. Anwar opened the innings, remained unbeaten on 188 and guided Pakistan to a good total of 316, with the Pakistani bowlers dismissing India for 232. Anwar's innings has the third highest SI index value ever of 4.48, based on a runs-cutoff value of 28.7 (316/11, multiplied by a factor of 1.667 (no 1-7) and adjusted downwards by 10% for being the second innings).
Pakistan 2nd innings                                            R   M   B  4 6
Saeed Anwar           not out                                 188 452 259 23 1 4
Wajahatullah Wasti    c Mongia           b Srinath              9  54  33  2 0
Saqlain Mushtaq       c Mongia           b Harbhajan Singh     21 108  86  1 0
Ijaz Ahmed            c Mongia           b Srinath             11  55  47  1 0
Yousuf Youhana        c Dravid           b Srinath             56 139 123  7 1 2
Shahid Afridi         c Laxman           b Srinath              0   1   1  0 0
Saleem Malik          lbw                b Srinath              9  34  16  1 0
+Moin Khan            c Mongia           b Prasad               8  22  13  1 0
Azhar Mahmood         lbw                b Srinath              0   9   9  0 0
*Wasim Akram          c Mongia           b Srinath              1   7   3  0 0
Shoaib Akhtar                            b Srinath              1  14   8  0 0
Extras                (lb 3, w 5, nb 4)                        12
Total                 (all out, 99 overs)                     316
FoW: 1-26 (Wajahatullah Wasti, 10.5 ov), 2-94 (Saqlain Mushtaq, 35.3 ov), 3-147 
(Ijaz Ahmed, 49.1 ov), 4-262 (Yousuf Youhana, 82.3 ov), 5-262 (Shahid Afridi, 82.4 ov), 
6-284 (Saleem Malik, 88.4 ov), 7-301 (Moin Khan, 93.1 ov), 8-302 (Azhar Mahmood, 94.6 ov), 
9-304 (Wasim Akram, 96.2 ov), 10-316 (Shoaib Akhtar, 98.6 ov).
To view/down-load the complete player table, ordered by the % of SIs played, please click/right-click here.

To view/down-load the complete player table, ordered by the average values of SI Index, please click/right-click here.

I have also made available the complete list of significant performances for all the 159 qualifying batsmen.

To view/down-load the table for all the first 1960 tests, please click/right-click here.

Finally the grand-daddy of all tables. Let me warn you these tables are huge, 500kb each. These are the lists of all significant innings, all 14782 of them, covering all 1960 tests played.

To view/down-load the complete table for tests 1-999, please click/right-click here.

To view/down-load the complete table for tests 1000-1960, please click/right-click here.

A few readers have asked for some summarized figures based on criteria. I have given these, and more below. I have not done the %. I leave it for the readers.

Summary information
===================

TotInns:68988    TotInnsSel: 64964
Perfs: 14782
100+runs: 3374   50+runs: 6581     <50runs:4827
200+balls: 1233  100+balls: 2545   <100balls: 2062
BPos 1-7: 12835  BPos 8-11: 1947
Both: 3053       Rpw: 11141        Bpw: 588
1Inns: 8678      2Inns: 6104
Wins: 4433       Losses: 5678      Draws: 4671
SI1: 5.41        SI2: 4.52         SI3:4.48
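
For readers who want the percentages, here is a small Python sketch that derives them from the summary figures above. The labels are copied from the summary as they stand and no interpretation of them is implied. The balls row is left out because its three figures do not add up to the 14782 total, so it presumably has a different base. The helper name `percentages` is hypothetical.

```python
# Figures copied from the summary information above.
total_si = 14782
selected_innings = 64964

breakdowns = {
    "Runs": {"100+runs": 3374, "50+runs": 6581, "<50runs": 4827},
    "Batting position": {"BPos 1-7": 12835, "BPos 8-11": 1947},
    "Cut-off": {"Both": 3053, "Rpw": 11141, "Bpw": 588},
    "Innings": {"1Inns": 8678, "2Inns": 6104},
    "Result": {"Wins": 4433, "Losses": 5678, "Draws": 4671},
}


def percentages(counts, total):
    """Share of each category as a percentage of the total, 2 decimals."""
    return {label: round(100.0 * n / total, 2) for label, n in counts.items()}


for name, counts in breakdowns.items():
    print(name, percentages(counts, total_si))

# Significant innings as a share of the selected innings.
print("SI %:", round(100.0 * total_si / selected_innings, 2))
```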
I will attempt to do a significant innings analysis for ODIs later as also, possibly more complex, a significant innspell analysis for Tests.
List of selected players ordered by the average SI index value

Batsman          Cty Mats  Runs Inns SIs   % of   Avge
                                             SI    SII


Headley G.A      Win   22  2190   39  18  46.15  2.214
Pollock R.G      Saf   23  2256   41  15  36.59  2.033
Nurse S.M        Win   29  2523   54  16  29.63  2.026
Turner G.M       Nzl   41  2991   72  27  37.50  1.936
Hazare V.S       Ind   30  2192   52  18  34.62  1.912
Ponsford W.H     Aus   29  2122   47  11  23.40  1.896
Nourse A.D       Saf   34  2960   62  29  46.77  1.842
Gambhir G        Ind   31  2798   53  18  33.96  1.812
Mankad M.H       Ind   44  2109   70  16  22.86  1.812
Macartney C.G    Aus   35  2131   53  16  30.19  1.781
Taylor H.W       Saf   42  2936   76  34  44.74  1.774
McCabe S.J       Aus   39  2748   61  20  32.79  1.760
Rowe L.G         Win   30  2047   48  11  22.92  1.747
Richardson M.H   Nzl   38  2776   64  31  48.44  1.672
Rowan E.A.B      Saf   26  1965   49  21  42.86  1.665
O'Neill N.C      Aus   42  2779   67  23  34.33  1.565
Dhoni M.S        Ind   43  2428   66  18  27.27  1.415
This is a selected set of players whose career runs are between 1965 and 3000. This list was requested by John Clark. I have selected a few players including Mark Richardson, in view of Gabriel's recent article.

Headley almost touches Bradman. The other great, Graeme Pollock, also crosses 2.00.

On 29 June 2010

As requested by Abhi and Alex I have expanded the Player tables with the following information.

1. Add number of fifties and % of selected inns to enable comparison with SI %.
2. Runs per innings for significant innings.
3. Total of SI Runs and % of total career runs.

I have also corrected the format of the Selected players SI report to enable proper downloading into Excel files.

To view/down-load the complete revised player table, ordered by the % of SIs played, please click/right-click here.

To view/down-load the complete revised player table, ordered by the average values of SI Index, please click/right-click here.

June 21, 2010
Occupying the crease
Posted by Ric Finlay in Batting

Don Bradman has the fastest scoring rate among batsmen who have faced more than 100 balls per innings © Getty Images

The table below lists the 30 batsmen in Test history whose known “balls faced” innings numbers at least 20, and whose average balls faced per innings exceeds 100:
Players with average balls faced/innings greater than 100
Player Team Balls faced/innings Balls faced/run
Herbert Sutcliffe England 163.95 2.89
Don Bradman Australia 142.00 1.71
Walter Hammond England 129.16 2.63
Glenn Turner New Zealand 126.91 2.94
Bill Woodfull Australia 125.66 3.21
Maurice Leyland England 125.47 2.50
John Reid New Zealand 124.24 2.82
Len Hutton England 123.71 2.64
Geoff Boycott England 122.23 2.82
Bill Lawry Australia 118.65 2.50
Jack Hobbs England 115.94 2.15
John Edrich England 115.41 2.69
Ian Redpath Australia 113.46 2.58
Mark Richardson New Zealand 113.31 2.65
Rahul Dravid India 112.50 2.36
Bob Simpson Australia 111.95 2.20
Trevor Bailey England 111.73 4.05
Bill Ponsford Australia 111.36 2.23
Bill Brown Australia 110.63 2.57
Shoaib Mohammad Pakistan 107.49 2.56
Sunil Gavaskar India 105.70 2.25
Jacques Kallis South Africa 105.29 2.25
Ken Barrington England 104.54 2.36
Jack Fingleton Australia 103.67 3.24
Tom Graveney England 103.29 2.51
Allan Border Australia 103.29 2.43
Chris Tavare England 102.41 3.27
John Wright New Zealand 102.23 2.84
Andrew Jones New Zealand 102.03 2.58
Asanka Gurusinha Sri Lanka 101.82 2.73
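
The two measures in the table can be computed from per-innings records along the following lines. This is a minimal Python sketch, assuming each innings is available as a (runs, balls faced) pair and that balls faced per run is total balls divided by total runs over those innings. The function names and the sample data are hypothetical.

```python
def crease_occupation(innings):
    """innings: list of (runs, balls_faced) pairs with known balls faced.

    Returns (balls faced per innings, balls faced per run).
    """
    total_runs = sum(runs for runs, _ in innings)
    total_balls = sum(balls for _, balls in innings)
    return total_balls / len(innings), total_balls / total_runs


def qualifies(innings, min_innings=20, min_balls_per_innings=100.0):
    """Qualification used for the table: at least 20 known innings and an
    average of more than 100 balls faced per innings."""
    if len(innings) < min_innings:
        return False
    balls_per_innings, _ = crease_occupation(innings)
    return balls_per_innings > min_balls_per_innings


# Made-up innings, purely to show the calculation.
sample = [(50, 100), (0, 20), (100, 180)]
print(crease_occupation(sample))  # prints (100.0, 2.0)
```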

Three things stand out for me. The first is the over-representation of players from days gone by. One has to go to 14th place to find someone (Mark Richardson) who played this century, and in this list of 30, there are only two others, Dravid and Kallis. Test cricket was clearly more a battle of attrition in the past than it is now. But also, there were simply more balls available to be defended in those times than there are now.

Secondly, the obduracy of Herbert Sutcliffe is perhaps understated. His figure of nearly 164 balls per innings is more than 15% higher than the next most obdurate, Bradman. And at a run every 2.89 balls, he was hardly fluent, either. Another player whose high position deserves recognition is New Zealand’s Glenn Turner, a very major player in a struggling team.

Thirdly, the absence of any West Indians in this list confirms the impression of a carefree approach to batting. The preponderance of Australian and English batsmen is not significant. Many of the Test scorecards involving other countries simply don’t have the “balls faced” data available. The highest placed West Indians are Sobers and Chanderpaul, both just over 96 balls per innings. But in the three innings for which we have “balls faced” data, George Headley averaged 139 balls per innings.

Rearranging the table in order of scoring fluency, we have:

Best scoring rate among players with average balls faced/innings greater than 100
Player Team Balls faced/innings Balls faced/run
Don Bradman Australia 142.00 1.71
Jack Hobbs England 115.94 2.15
Bob Simpson Australia 111.95 2.20
Bill Ponsford Australia 111.36 2.23
Jacques Kallis South Africa+ 105.29 2.25
Sunil Gavaskar India 105.70 2.25
Ken Barrington England 104.54 2.36
Rahul Dravid India+ 112.50 2.36
Allan Border Australia 103.29 2.43
Maurice Leyland England 125.47 2.50
Bill Lawry Australia 118.65 2.50
Tom Graveney England 103.29 2.51
Shoaib Mohammad Pakistan 107.49 2.56
Bill Brown Australia 110.63 2.57
Ian Redpath Australia 113.46 2.58
Andrew Jones New Zealand 102.03 2.58
Walter Hammond England 129.16 2.63
Len Hutton England 123.71 2.64
Mark Richardson New Zealand 113.31 2.65
John Edrich England 115.41 2.69
Asanka Gurusinha Sri Lanka 101.82 2.73
John Reid New Zealand 124.24 2.82
Geoff Boycott England 122.23 2.82
John Wright New Zealand 102.23 2.84
Herbert Sutcliffe England 163.95 2.89
Glenn Turner New Zealand 126.91 2.94
Bill Woodfull Australia 125.66 3.21
Jack Fingleton Australia 103.67 3.24
Chris Tavare England 102.41 3.27
Trevor Bailey England 111.73 4.05

In this respect, Bradman (over 20% more fluent than anyone else) and Hobbs show their class, while who would have thought that Ponsford would have rated so highly here? Perhaps we need to re-assess some of these players! Barrington beats Border. Lawry beats Redpath. But Tavare and Bailey are where we expect!

The last table gives the same data for the top three most obdurate players at each position in the batting order. The qualification has been reduced to at least ten innings where “balls faced” data is known.

Players with highest average balls faced/innings by batting position
Position  1st                 Balls/inns  2nd               Balls/inns  3rd                     Balls/inns
Openers   Herbert Sutcliffe       163.49  Bill Woodfull         128.07  Herbie Collins              127.79
3         Walter Hammond          175.69  Don Bradman           144.50  Ken Barrington              135.82
4         Graeme Pollock          125.44  Lindsay Hassett       116.57  Mike Denness                115.10
5         Ian Redpath             122.91  Michael Hussey        114.53  Allan Border                110.57
6         Trevor Bailey           137.08  Garry Sobers          124.05  Shivnarine Chanderpaul      123.19
7         Thilan Samaraweera      111.91  Brian McMillan        100.78  Ravi Shastri                 92.00
8         Dion Nash                69.91  Manoj Prabhakar        69.77  Fred Titmus                  65.38
9         Graham Dilley            60.20  Kiran More             58.43  Ian Salisbury                55.60
10        John Bracewell           45.33  Tim May                38.85  Sarfraz Nawaz                38.00
11        Arthur Mailey            36.30  Danny Morrison         20.28  Ashley Mallett               19.83

Occupancy of the crease clearly declines as one descends through the batting order, although the figures at number 6 are interesting. It is not only the special character of Trevor Bailey causing this, because Sobers and Chanderpaul also are higher than many players above them in the batting order. I suspect it is a realisation by a number 6 that he is the last specialist batsman, and he sets himself to bat through the innings with the tail.

A study of players at the other end of the scale, those who survive least, is also interesting, but that can wait for another time.

June 16, 2010
Achieving the right consistency - I
Posted by Gabriel Rogers in Batting

Mark Richardson wasn't the most attractive batsman, but with him you knew, more than with any other player, what you were going to get © Getty Images

My first few analyses for It Figures are all going to be broadly about the same thing, and that thing could broadly be called consistency. I’ll bet that, at some time or other, everyone reading this post has criticised a cricketer for being inconsistent. I’ve done it myself but, whenever I have, I’ve had a nagging doubt: is performing brilliantly in one match and terribly in the next really any worse (or better) than being moderately good in two games on the trot? Maybe some stats can help us to unpick this issue.

I’m going to start by looking at batsmen. More specifically, my focus, in this first post, is batsmen’s innings-to-innings consistency. If Batsman A has scores of 0, 138, 11, 0, & 101, and Batsman B has scores of 52, 50, 45, 48, & 55, then they both have the same average (50.00). However, there’s a very obvious difference between the ways in which they’ve achieved the mark that we won’t appreciate, if we concentrate on the average alone.

There are two big questions here, for me: (i) is it possible and instructive to identify batsmen with more or less consistent careers, and to quantify how much variability their records show? and (ii) does it matter? Is there any way in which a run of scores like Batsman A’s is demonstrably better or worse – for himself and/or his team – than that of Batsman B?

Mister Hugely Reliable

S Rajesh comes close to answering the first of my questions in this It Figures avant la lettre column from 2006. He proposed a consistency index that is derived by dividing a batsman’s average by the standard deviation (SD) of runs scored in each of his innings. I think he’s on exactly the right lines, here, but I think the index can be improved in two ways. Firstly, I’m twitchy about combining one measure – the batting average – that makes an adjustment for not-out innings with another – the SD of the same dataset – that does not. For this reason, I’d rather rely on simple runs-per-innings (RPI), in this context. This way, both halves of the sum are quantifying the same thing and, although both may be affected by not-out innings, they are both affected equally. The second modification I have made is to turn the sum upside-down, so we have SD divided by RPI. Mathematically, this makes no difference to the ranking of results (although it means that low numbers, rather than high ones, indicate greater consistency).

The advantage of doing these two things is that the number you end up with has a solid interpretation: it is the percentage of deviation around the mean that is observed, on average, throughout the dataset. Dividing the SD by the mean is a trick statisticians use quite often; they call the result the coefficient of variation (CoV). As Rajesh pointed out, it’s important to perform this scaling, rather than concentrating on SDs on their own, otherwise the batsmen who score most runs will always appear to have more variability in their records. A batsman with scores of 5, 30, and 100 has the same CoV as one with scores of 10, 60, and 200, though they have very different SDs.
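
To make the calculation concrete, here is a minimal Python sketch of the coefficient of variation applied to the two hypothetical batsmen from the introduction. It assumes the population standard deviation (`statistics.pstdev`); the text does not say whether the population or sample form is used, so treat that choice as an assumption. The function name is hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(scores):
    """SD of the innings scores divided by runs per innings (RPI).

    Not-outs are ignored here, so the mean of the scores is simply RPI.
    """
    return pstdev(scores) / mean(scores)


batsman_a = [0, 138, 11, 0, 101]
batsman_b = [52, 50, 45, 48, 55]

# Both have 50 runs per innings, but very different variability.
print(mean(batsman_a), coefficient_of_variation(batsman_a))
print(mean(batsman_b), coefficient_of_variation(batsman_b))

# Scaling every score by the same factor leaves the CoV unchanged.
print(coefficient_of_variation([5, 30, 100]))
print(coefficient_of_variation([10, 60, 200]))
```

Batsman A comes out with a far higher CoV than Batsman B, and the last two lines print the same value up to floating-point rounding, which is the scaling property described above.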

So much for the theory; what about the results? Table 1 shows the batsmen who have been most and least consistent on an innings-to-innings basis throughout Test history, with a few notable figures picked out from the middle of the table.

Top of the lot is Kiwi opener Mark Richardson. He may not have set the world alight compared to some of his dashing contemporaries, but his solidity as an opening batsman can easily be overlooked: he reached double figures in 80% of his Test innings (a very high proportion, as noted in another Numbers Game a few years ago), and only ever registered one duck. What stopped him from threatening the real top rank of the game was that, though he’d seldom get out cheaply, he was also pretty unlikely to score very heavily, as a total of four centuries from 65 innings and a top score of 145 attests. These characteristics are perfect for a low CoV, because they imply that a large majority of his innings fell in a relatively tight range in the middle of possible scores. Cricket will always find a way of surprising you but, to a greater extent than with any other batsman, you knew what you were going to get from Richardson.

Table 1: Test batsmen sorted according to consistency (coefficient of variation) in score
Rank Name                 M    I       R    Ave    RPI     SD    CoV
  1. MH Richardson       38   65   2,776  44.77  42.71  35.16  0.823
  2. H Sutcliffe         54   84   4,555  60.73  54.23  45.24  0.834
  3. TL Goddard          41   78   2,516  34.47  32.26  27.11  0.840
  4. SM Katich           49   85   3,792  48.00  44.61  38.72  0.868
  5. MS Dhoni            43   66   2,428  42.60  36.79  32.36  0.880
  6. JB Hobbs            60  102   5,410  56.95  53.04  46.68  0.880
  7. IR Redpath          66  120   4,737  43.46  39.48  34.89  0.884
  8. JB Stollmeyer       32   56   2,159  42.33  38.55  34.20  0.887
  9. PE Richardson       34   56   2,061  37.47  36.80  32.86  0.893
 10. A Ranatunga         93  155   5,105  35.70  32.94  29.44  0.894
...
 32. JH Kallis          136  229  10,760  54.62  46.99  44.95  0.957
...
 35. AR Border          156  265  11,174  50.56  42.17  40.49  0.960
...
 47. KP Pietersen        62  111   5,166  49.20  46.54  45.54  0.978
...
 56. DG Bradman          50   80   6,996  99.94  87.45  86.65  0.991
...
 85. RS Dravid          138  238  11,372  54.15  47.78  48.91  1.024
...
 97. RT Ponting         143  241  11,828  55.27  49.08  50.89  1.037
...
107. Inzamam-ul-Haq     119  198   8,829  50.16  44.59  46.63  1.046
108. SR Tendulkar       166  271  13,447  55.57  49.62  51.92  1.046
...
113. IVA Richards       121  182   8,540  50.24  46.92  49.43  1.053
...
115. SM Gavaskar        124  214  10,122  51.12  47.30  49.96  1.056
116. SR Waugh           168  260  10,927  51.06  42.03  44.51  1.059
...
131. GS Sobers           93  160   8,032  57.78  50.20  54.02  1.076
...
226. BC Lara            130  230  11,912  53.18  51.79  62.43  1.205
...
238. V Sehwag            75  128   6,608  53.72  51.63  64.71  1.254
...
245. DW Randall          47   79   2,470  33.38  31.27  40.90  1.308
246. Zaheer Abbas        78  124   5,062  44.80  40.82  54.00  1.323
247. SE Gregory          58  100   2,282  24.54  22.82  30.33  1.329
248. LG Rowe             29   49   2,047  43.55  41.78  55.98  1.340
249. GJ Whittall         46   82   2,207  29.43  26.91  36.45  1.354
250. DL Amiss            50   88   3,612  46.31  41.05  55.74  1.358
251. MS Atapattu         90  156   5,502  39.02  35.27  49.93  1.416
252. Mohammad Ashraful   55  107   2,306  22.39  21.55  30.70  1.425
253. Wasim Akram        104  147   2,898  22.64  19.71  28.15  1.428
254. MH Mankad           44   72   2,109  31.48  29.29  46.06  1.572
qual. 2,000 Test runs; complete list available here

I was slightly surprised to see MS Dhoni riding high in this list. His reputation is for a more free-spirited kind of play than might be expected to generate a low CoV. But it turns out that any such assumptions do him a bit of a disservice: his Test record is that of a reliable runscorer, rather than a hit-or-miss gunslinger. Simon Katich’s presence next to him is perhaps more in keeping with his reputation.

It is intriguing to see both Herbert Sutcliffe and Sir Jack Hobbs in the top half-dozen of this list. There could surely be no firmer foundation for a partnership as successful as theirs than the kind of shared dependability this statistic suggests. If they both had more mercurial profiles then, though they each might have scored as many runs, they would have been unlikely to have shared so many significant partnerships.

The fact that Jacques Kallis has fallen down the list somewhat compared to Rajesh’s analysis is, to a small extent, a reflection of my slightly different methods, but it’s more to do with the fact that his record has become a wee bit more inconsistent in the 4 years since Rajesh wrote his column.

According to this analysis, the least consistent batsman in Test history is Vinoo Mankad. His career has the opposite profile to Mark Richardson’s: there is a very high proportion of low scores in his record (he only got into double figures 57% of the time) but, when he got in, he often went on to score big hundreds (including two doubles in one series against New Zealand in 1955/56). In contrast to Richardson’s reliable-but-unspectacular record, Mankad’s performances were an awful lot less predictable.

Wasim Akram’s position at the bottom of the list is very largely ascribable to the effect of one mammoth score of 257* in the midst of a dataset that characteristically reflects a much more modest level of achievement (there’s a good argument for calling this the most out-of-character innings in Test history, as discussed in a recent Ask Steven). If that one innings is excluded from his record, his CoV reverts to a much more run-of-the-mill 1.119.

Marvan Atapattu’s status is probably not surprising for a man who started his Test career with a famous string of failures, but ended up with 6 double-centuries under his belt.

So...?

The unanswered question I find most intriguing is whether, in the grand scheme of things, any of this matters. As cricket fans, we’re quite used to berating inconsistent batsmen (“you never know what you’re going to get: one day, he’s brilliant; the next, he couldn’t buy a run”) but, then again, we may have a paradoxical tendency to look down our noses at those with the least variable records (“he’s good at getting in, but he never goes on to register a matchwinning score”). Is either of these positions more justifiable than the other?

I’ve come up with two ways of answering this question. The first is to examine whether consistent batsmen, ultimately, score more runs than their more mercurial counterparts. It’s all very well to invent hypothetical 50-averaging batsmen with consistent and inconsistent records, like I did in my introduction, but it may be that, in the real cricketing world, batsmen with one profile or the other are more likely to achieve a decent average.

To explore this, I used a statistical technique called regression (to be more precise: univariate ordinary least squares linear regression), which enables us to assess the relationship between two variables. The results are shown in Figure 1. Each batsman’s CoV is plotted against his average, with the typical relationship between the two (the regression line) indicated by the red dotted line. You can see that, although there’s an awful lot of scatter around the trend, the datapoints generally appear to line up with a slight downwards slope. This suggests that there is a weak but identifiable association between the two variables, with more consistent batsmen tending to average slightly more (for any statsheads, that means that r² is a pretty dismal 0.065, but p<0.001 for the slope coefficient).
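
For anyone who wants to try this kind of fit, here is a minimal Python sketch of univariate ordinary least squares. It is not the analysis behind Figure 1: the choice of CoV as the x variable and average as the y variable is an assumption, the function name is hypothetical, and the handful of (CoV, average) pairs is taken from Table 1 purely for illustration, so the output will not reproduce the r² quoted above. No p-value is computed.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, r squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# (CoV, average) for a few batsmen from Table 1; illustration only.
sample = [
    (0.823, 44.77),  # MH Richardson
    (0.834, 60.73),  # H Sutcliffe
    (0.880, 42.60),  # MS Dhoni
    (0.991, 99.94),  # DG Bradman
    (1.205, 53.18),  # BC Lara
    (1.254, 53.72),  # V Sehwag
    (1.428, 22.64),  # Wasim Akram
    (1.572, 31.48),  # MH Mankad
]
covs = [cov for cov, _ in sample]
averages = [avg for _, avg in sample]
slope, intercept, r_squared = fit_line(covs, averages)
print(slope, intercept, r_squared)
```

A negative slope here would correspond to the downwards trend described above, with lower CoV (more consistency) going with a higher average.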

Fig 1 Association between consistency (coefficient of variation) and success (average) for Test batsmen © Gabriel Rogers
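For anyone who wants to try this kind of fit themselves, here is a minimal sketch of a univariate ordinary least squares fit in plain Python. It is not the code behind Figure 1: the function name `ols_fit` is made up for illustration, the sample numbers are invented rather than real career data, and the p-value for the slope is left out because it needs a t-distribution that the standard library does not provide.

```python
from math import fsum


def ols_fit(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, r squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r_squared


# Invented (average, CoV) pairs, purely to show the call; not real batsmen.
averages = [30.0, 35.0, 40.0, 45.0, 50.0]
covs = [1.10, 1.05, 1.02, 0.98, 0.95]
slope, intercept, r2 = ols_fit(averages, covs)
print(f"slope={slope:.4f} intercept={intercept:.3f} r^2={r2:.3f}")
```

A negative slope here corresponds to the downward trend described above: higher averages going with lower CoVs.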

Clearly, there are plenty of examples that do not fit the general trend too well, but it appears that, on average, consistency is associated with higher runscoring. Actually, a more pronounced correlation would have been surprising, because we didn’t see a very obvious hierarchy in the consistency list – no one is suggesting that Mark Richardson was, in any meaningful way, a better batsman than Brian Lara. Nevertheless, it does seem to be the case that consistency is, by and large, a positive thing for individual batsmen. This may seem like an obvious finding, but I don’t think it’s been demonstrated before.

My second way to assess the value of batting consistency was to see whether it has a positive effect for the team. So I looked to see if there’s any correlation between each batsman’s CoV and his record of winning matches. I did this in exactly the same way, plotting one variable against the other, and drawing a univariate regression line through the results. For Test match cricket, there was a very weak, but still detectable, association between CoV and percentage of matches won (r²=0.015; p=0.005); this vaguely suggests that, the more consistent a batsman is, the more likely he is to be on the winning side. It’s a pretty unsatisfactory analysis, though, with an awful lot of noise around the hint of a signal. What I was more interested to find is that the correlation gets quite a bit stronger when, instead of winning record, you look at each batsman’s not-losing record. The results of this analysis are shown in Figure 2. You can see a relatively shallow, but pretty obvious, upwards slope to the dataset, showing that, on average, the most consistent batsmen are also those who have lost the lowest proportion of the Test matches in which they have played.

Fig 2 Association between consistency (coefficient of variation) and losing record for Test batsmen © Gabriel Rogers

The fact that consistency is associated with not-losing more strongly than it is with winning suggests that consistent batsmen really come into their own when it comes to securing draws for their teams. (And, indeed, regressing CoV against draw-rate produces a strongly significant result [p=0.002].) So, if you’ve got a team packed with consistent batsmen, you might not win too many more games, but you might draw some that less consistent teams would lose. I’m not quite sure how to explain this finding in cricketing terms; if you’ve got any bright ideas, please feel free to comment!

Once again from the top in pyjamas

The remainder of this post repeats the above analysis for ODI cricket.

Table 2 lists the most and least consistent batsmen in ODIs. The list is topped by Australia’s two great “finishers” – Michaels Hussey and Bevan. We’re used to seeing them high on lists of ODI stats, but it’s worth remembering that – because CoV, as I have calculated it, relies on RPI rather than average – the high number of not-outs in each of their records has no direct influence on their excellent consistency ratings. Plenty of players have higher RPIs than these two; it’s only once not-outs are factored in that their averages rise so high (although that doesn’t necessarily mean the not-outs inflate their average, as is often assumed; Charles Davis has done good work on this). Accordingly, it is notable that consistency stats for these two players agree with their conventional records: I conclude that it was the dependability – as much as the volume – of their contributions that marked them out as matchwinners for their team.

Table 2: ODI batsmen sorted according to consistency (coefficient of variation) in scores
  No Name                 M    I       R    Ave    RPI     SD    CoV

  1. MG Bevan           232  196   6,912  53.58  35.27  25.60  0.726
  2. MEK Hussey         137  113   4,029  53.01  35.65  26.04  0.730
  3. RR Sarwan          156  146   5,098  43.95  34.92  27.98  0.801
  4. AH Jones            87   87   2,784  35.69  32.00  25.72  0.804
  5. NH Fairbrother      75   71   2,092  39.47  29.46  23.81  0.808
  6. IR Bell             78   76   2,483  35.47  32.67  26.45  0.810
  7. GP Thorpe           82   77   2,380  37.19  30.91  25.04  0.810
  8. CG Greenidge       128  127   5,134  45.04  40.43  33.01  0.817
  9. AJ Lamb            122  118   4,010  39.31  33.98  27.87  0.820
 10. RG Twose            87   81   2,717  38.81  33.54  27.54  0.821
 11. Javed Miandad      233  218   7,381  41.70  33.86  27.81  0.821
 12. DM Jones           164  161   6,068  44.62  37.69  31.03  0.823
...
 17. Zaheer Abbas        62   60   2,572  47.63  42.87  35.58  0.830
...
 20. ML Hayden          160  154   6,131  44.11  39.81  33.38  0.838
 21. GC Smith           153  151   5,732  40.37  37.96  31.86  0.839
...
 24. S Chanderpaul      261  245   8,648  41.78  35.30  29.96  0.849
...
 26. Inzamam-ul-Haq     374  348  11,701  39.53  33.62  28.58  0.850
...
 28. MS Dhoni           159  141   5,246  50.44  37.21  31.79  0.855
 29. JH Kallis          298  284  10,809  46.59  38.06  32.68  0.859
...
 31. Mohammad Yousuf    275  261   9,458  42.80  36.24  31.25  0.862
 32. RS Dravid          335  309  10,644  39.42  34.45  29.71  0.863
...
 40. KP Pietersen        96   86   3,202  45.10  37.23  32.83  0.882
...
 49. RT Ponting         341  332  12,623  42.79  38.02  34.11  0.897
...
 55. IVA Richards       187  167   6,721  47.00  40.25  36.45  0.906
...
 74. BC Lara            295  285  10,348  40.90  36.31  34.28  0.944
...
105. SR Tendulkar       442  431  17,598  45.12  40.83  40.14  0.983
...
147. BB McCullum        171  145   3,569  29.02  24.61  26.77  1.088
148. DI Gower           114  111   3,170  30.78  28.56  31.17  1.091
149. Mohammad Ashraful  157  151   3,298  23.90  21.84  24.09  1.103
150. Kapil Dev          225  198   3,783  23.79  19.11  21.10  1.105
151. WW Hinds           119  111   2,880  28.51  25.95  28.83  1.111
152. RS Kaluwitharana   189  181   3,711  22.22  20.50  22.79  1.112
153. VVS Laxman          86   83   2,338  30.76  28.17  31.66  1.124
154. L Vincent          102   99   2,413  27.11  24.37  28.05  1.151
155. H Masakadza         95   95   2,601  28.58  27.38  32.86  1.200
156. KO Otieno           89   87   2,016  23.44  23.17  28.16  1.215
qual. 2,000 ODI runs; complete list available here

It’s surprising to see Ian Bell so high in the list (actually, it’s kinda surprising to learn that Ian Bell has 2,000 ODI runs). His reputation may not suggest limited-overs strength but, if you look at his record, it’s clear that he has very seldom failed completely in ODIs. Of course, there’s a relative lack of dramatic successes, too, and we’ve already seen that these two characteristics tend to produce a low CoV.

Another unexpected finding is that Sachin Tendulkar’s ODI record is not an especially consistent one. To an extent, this is just because his scores encompass such a wide range. Obviously, Tendulkar has a much higher proportion of big scores in his record than someone like Ian Bell. Less predictably, however, he also has a higher proportion of cheap dismissals, and it is the swing from one extreme to the other that produces a higher CoV. Tendulkar’s customary position at the top of the ODI order may provide part of the explanation for both of these features: if he hits his stride in any given innings, he will have full opportunity to score plenty of runs, and this may not be true of those who bat lower in the order; on the other hand, he also has the challenge of facing the new ball with close catchers in place – something that may raise his probability of early dismissal. It is notable that there are relatively few opening batsmen amongst those with the lowest CoVs (although I checked, and this is not a systematic bias in the dataset: batting position is not, in itself, predictive of consistency).

In contrast, it is anything but a shock to see Tendulkar’s teammate, VVS Laxman, near the foot of the table: has any batsman more dramatically personified sublime-to-the-ridiculous swings of achievement in the ODI era?

I don’t know if you’ve noticed it, but I think there’s quite a difference between the ODI consistency list and the Test match equivalent, in Table 1, above. It seems to me that – the odd diminutive ginger anomaly aside – there is some sort of hierarchy going on in the ODI list. Players in the top part of the table are, by and large, better than those who come lower down. So we might expect to see a more pronounced correlation between ODI CoV and batting average than was the case in the analogous Test analysis. Any such expectation would be right on the money, as Figure 3 shows. There is a fairly strong association between the two variables (r²=0.196; p<0.001), with a clear downward trend, suggesting that lower CoVs – indicating greater consistency – are associated with higher averages. I’m pretty sure that there is no mathematical reason why consistency should appear to be more valuable in ODI cricket than it is in the five-day game. If anyone can think of a cricketing reason, please do put it in the comments.

Fig 3 Association between consistency (coefficient of variation) and success (average) for ODI batsmen © Gabriel Rogers

And, if it’s positive for individual batsmen to be more consistent in their ODI runscoring, you’d expect their teams to see some benefit. As Figure 4 shows, this would appear to be another reasonable inference: batsmen with lower CoVs are, on average, those who win most ODIs (r²=0.163; p<0.001).

Fig 4 Association between consistency (coefficient of variation) and winning record for ODI batsmen © Gabriel Rogers

Conclusions

So what if, playground style, you’re offered first pick between two batsmen with different records? Given a choice between a consistent batsman and an inconsistent one who averages more, you’d be a fool not to go for the one with the higher average. However, if your choice was between two batsmen with similar averages but different CoVs, I’d go for the more consistent one every time, as a result of what I’ve learned in this analysis. My expectation would be that he’d help me win – or at least not lose – a greater proportion of my matches. What is more, if I was provided with no information at all about the batsmen’s averages, but did know their CoVs, I’d favour whoever had the more consistent record, because it would be a reasonable – though far from infallible – guess that he’d also have the better average.

One corollary of this conclusion may be that “matchwinning” performers are a bit of a myth. In the future, I want to do some work on whether there is such a thing as a true matchwinner in cricket (what analysts in other sports sometimes refer to as clutch players) but, with this analysis as my starting point, my provisional view is that the kind of batsman who quietly gets on with contributing on a match-to-match basis may be of at least as much value as one who has an exceptional game once in a while.

I’ve got some related posts coming up about consistency among bowlers, and swings of form over longer periods.

All stats calculated Jun 10, 2010 (i.e. all Tests up to England v Bangladesh at Manchester, Jun 4-6, 2010 [Test # 1959] and all ODIs up to Zimbabwe v Sri Lanka at Harare, Jun 9, 2010 [ODI # 2990]).


Technical appendix

Anyone who isn’t incredibly fascinated by statistical methods doesn't need to read this bit, but I like to give a precise account of what I’ve done, in case anyone cares.

Technical note #1. My SDs are calculated as population SDs (i.e. if we’re going to be really geeky, I have not adopted Bessel’s correction). The reason for this is that, in batsmen’s complete careers, we’re dealing with all the observations that are available to us. This is unusual, for a statistician: normally, we have a limited sample of observations from which we want to draw inferences about a wider population (ask 1,000 people how they intend to vote, and you can predict what the entire electorate is going to do; give 500 people a drug, and you can tell how effective it’ll be for everyone... that sort of thing). Here, though, the data we have is all we’re going to get, so it’s appropriate not to use the tiny correction that’s normally strictly necessary. If anyone wants to replicate my analyses in Excel or Access, you need to use the StDevP function, not the normal StDev. If anyone else read the foregoing, didn’t understand a word of it, but wonders whether it makes a difference to my outputs, the answer is no: the effect is tiny, but I believe it’s more correct, so that’s what I’ve done.
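For readers who would rather check this in code than in Excel, the following is a small sketch using Python's standard `statistics` module, where `pstdev` is the population SD (the StDevP equivalent) and `stdev` is the sample SD with Bessel's correction. The function name `consistency` and the scores are invented for illustration. CoV is taken as SD divided by RPI, which matches the published columns (for example 25.60 / 35.27 ≈ 0.726 for Bevan in Table 2).

```python
from statistics import mean, pstdev, stdev


def consistency(scores):
    """Return (rpi, sd, cov) for a list of innings scores.

    rpi is runs per innings (the plain mean, so not-outs play no part),
    sd is the population standard deviation (no Bessel's correction),
    cov is the coefficient of variation, sd / rpi.
    """
    if not scores:
        raise ValueError("need at least one innings")
    rpi = mean(scores)
    if rpi == 0:
        raise ValueError("CoV is undefined when no runs were scored")
    sd = pstdev(scores)
    return rpi, sd, sd / rpi


# Invented scores, not a real career.
scores = [0, 10, 50, 100]
rpi, sd, cov = consistency(scores)
print("population CoV:", round(cov, 3))
# The same ratio using the sample SD, to show the size of the correction.
print("sample CoV:    ", round(stdev(scores) / rpi, 3))
```

With only four innings the two versions differ noticeably; over a career of a hundred or more innings the gap shrinks to almost nothing, which is the point made in the note above.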

Technical note #2. Any statisticians reading this analysis might have been slightly concerned that the regression I presented in Figure 1 is unduly influenced by the highest-averaging batsmen (a phenomenon statisticians refer to as leverage). I did some sensitivity analyses that established that this isn’t the case (p remains <0.001 when the dataset is restricted to batsmen averaging <60 or <50).

Technical note #3. Another concern statisticians might have with my regressions is that I’ve picked two covariates of consistency and analysed them separately (univariately). A more comprehensive model would be a multivariate one – that is, one that bundles everything up in the same analysis. So I did that. For Test cricket, I regressed CoV against average, losing percentage, and an interaction term. The only significant covariate was average (p<0.001). This suggests that the reason more consistent batsmen lose fewer Test matches is that they average more: there’s no independent effect of consistency on not-losing. For ODIs, the multivariate results – regressing CoV against average, winning percentage, and an interaction term – are more interesting: all three covariates come up p<0.05 (and r² rises to 0.415). This suggests that more consistent batsmen are likely to win more games even if they don’t average any higher, and the significant interaction term indicates that, the more games you win, the more consistency is of value in raising your average.

Technical note #4. The super eagle-eyed may have noticed that the scatterplots showing CoV -v- Average are rather bushier than those showing CoV -v- Results. This is because winning (or losing) percentages are constrained at both ends, and I found that the good number of players who have won or lost all of their games were skewing results around, somewhat. Accordingly, all the CoV -v- Results analyses are limited to players with 40 or more innings. As a rule, I don’t like doing this because, although tiny samples can produce weird results, their weirdness should balance out on either side of the average, so it’s my preference to use all the data that’s going. In this instance, though, I found I got much more sensible results by adopting an artificial constraint.

Technical note #5. I take the view that Australia v. ICC World XI, 14–17 October 2005, was not a Test match; similarly, these games are never included in my ODI stats.

June 11, 2010
Analysing wides and no-balls in Twenty20 internationals
Posted by Anantha Narayanan at in Bowling

Wides and no-balls are the bane of the bowlers in Twenty20 matches. Not to forget the additional (unrecorded) runs scored off possible free hits. This article analyses the wides and no-balls bowled by bowlers in Twenty20 internationals. I have specifically considered only Twenty20 internationals and excluded IPL matches, which I do not consider as true internationals. The basic criterion is that a bowler should have bowled a minimum of 120 balls, which works out to no fewer than 5 Twenty20 International matches.

1. Bowlers who have conceded the most number of wides and no-balls
No Bowler             Ctry Mat  Overs Wides NBs Total
                                                 W+Nb

 1 Malinga S.L         Slk  28   94.0   35   2    37
 2 Umar Gul            Pak  26   93.2   25  11    36
 3 Johnson M.G         Aus  21   77.1   32   4    36
 4 Sohail Tanvir       Pak  15   51.0   26  10    36
 5 Steyn D.W           Saf  21   78.0   29   2    31
 6 Tait S.W            Aus  15   55.4   30   1    31
 7 Anderson J.M        Eng  18   66.2   29   0    29
 8 Lee B               Aus  16   58.1   13  14    27
 9 Roach K.A.J         Win  10   35.0   20   5    25
10 Broad S.C.J         Eng  26   89.5   17   5    22
Lasith Malinga of Sri Lanka has bowled the maximum number of wides and no-balls, with 37. Umar Gul, Johnson and Sohail Tanvir come in next with 36 wides and no-balls each. In fifth place in this list is Steyn with 31.

It is not a surprise that all the bowlers in the table are the quicker bowlers. They are all attacking wicket-taking bowlers. The spinner who has conceded the most wides and no-balls is Shoaib Malik with 21.

Now a look at the best performing bowlers in this classification.

2. Bowlers who have conceded the least number of wides and no-balls

No Bowler             Ctry Mat  Overs Wides NBs Total
                                                 W+Nb
 1 Mudassar Bukhari    Hol   7   25.4    0   0     0
 2 Haq R.M             Sco   7   25.0    0   0     0
 3 Seelaar P.M         Hol   9   35.0    1   0     1
 4 Borren P.W          Hol   9   35.0    1   0     1
 5 Dhaniram S          Can  11   33.4    1   0     1
 6 Patel J.S           Nzl  11   33.1    0   1     1
 7 Vaas WPUJC          Slk   6   22.0    0   1     1
 8 McCallan W.K        Ire   8   21.5    1   0     1
 9 Collingwood P.D     Eng  30   32.0    2   0     2
Quite a few bowlers from the unfancied teams have conceded only one no-ball or wide. Vaas and Jeetan Patel have also bowled just a single no-ball each, and no wides.

Now for some qualitative assessments. First a table based on the number of wides and no-balls conceded per match.

3. Bowlers who have conceded most numbers of wides & no-balls per match

No Bowler             Ctry Mat  Overs  Total WNb/M
                                        W+Nb

 1 Roach K.A.J         Win  10   35.0    25   2.50
 2 Sohail Tanvir       Pak  15   51.0    36   2.40
 3 Rampaul R           Win   8   30.0    17   2.12
 4 Tait S.W            Aus  15   55.4    31   2.07
 5 Shoaib Akhtar       Pak   7   23.0    14   2.00
 6 Johnson M.G         Aus  21   77.1    36   1.71
 7 Lee B               Aus  16   58.1    27   1.69
 8 Langeveldt C.K      Saf   9   35.0    15   1.67
 9 Sreesanth S         Ind   9   34.0    15   1.67
10 Anderson J.M        Eng  18   66.2    29   1.61
Kemar Roach, who often bowls quite wildly, averages 2.5 wides and no-balls per match. That is something. If you add the runs scored off the free hits, if any, he is quite a liability.

Sohail Tanvir also clocks in at 2.4 wides and no-balls per match. However he is a great match-winner, at least in IPL matches. Rampaul follows next.

Ray Price is the leading (maybe the wrong term) spinner. He has conceded 1.43 wides and no-balls per match.

4. Bowlers who have conceded least numbers of wides & no-balls per match

No Bowler             Ctry Mat  Overs  Total WNb/M
                                        W+Nb

 1 Mudassar Bukhari    Hol   7   25.4     0   0.00
 2 Haq R.M             Sco   7   25.0     0   0.00
 3 Collingwood P.D     Eng  30   32.0     2   0.07
 4 Patel J.S           Nzl  11   33.1     1   0.09
 5 Dhaniram S          Can  11   33.4     1   0.09
 6 Dilshan T.M         Slk  31   20.0     3   0.10
 7 Yuvraj Singh        Ind  21   20.0     2   0.10
 8 Styris S.B          Nzl  28   46.3     3   0.11
 9 Seelaar P.M         Hol   9   35.0     1   0.11
Patel, Yuvraj Singh and Styris have quite low wides and noballs per match.

Now for the frequency of no-balls and wides.

5. Bowlers who have been most frequent with no-balls and wides

No Bowler             Ctry Mat  Overs  Total Balls/WNb
                                        W+Nb

 1 Roach K.A.J         Win  10   35.0    25     8.4
 2 Sohail Tanvir       Pak  15   51.0    36     8.5
 3 Shoaib Akhtar       Pak   7   23.0    14     9.9
 4 Rampaul R           Win   8   30.0    17    10.6
 5 Tait S.W            Aus  15   55.4    31    10.8
 6 Edwards F.H         Win  12   36.3    18    12.2
 7 Shoaib Malik        Pak  30   43.0    21    12.3
 8 Johnson M.G         Aus  21   77.1    36    12.9
 9 Lee B               Aus  16   58.1    27    12.9
10 Bresnan T.T         Eng  12   37.0    17    13.1
The usual fast bowling culprits lead this list. One wide or no-ball every 10 balls or so.

Ray Price leads the spinners with a wide or no-ball every 16.8 balls, quite frequent for a spinner.

6. Bowlers who have been least frequent with no-balls and wides

No Bowler             Ctry Mat  Overs  Total Balls/WNb
                                        W+Nb

 1 Seelaar P.M         Hol   9   35.0     1   210.0
 2 Borren P.W          Hol   9   35.0     1   210.0
 3 Dhaniram S          Can  11   33.4     1   202.0
 4 Patel J.S           Nzl  11   33.1     1   199.0
 5 Vaas WPUJC          Slk   6   22.0     1   132.0
 6 McCallan W.K        Ire   8   21.5     1   131.0
 7 Mascarenhas A.D     Eng  14   42.0     2   126.0
 8 Abdul Razzaq        Pak  17   39.3     2   118.5
 9 O'Brien K.J         Ire  16   35.3     2   106.5
10 Collingwood P.D     Eng  30   32.0     2    96.0
Three of the bowlers from the lesser teams have bowled a no-ball or wide once in 200 balls or so. That shows a level of accuracy not necessarily present in the more fancied bowlers.
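The per-match and balls-per-extra columns in tables 3 to 6 can be reproduced from the Overs and W+Nb columns. The sketch below shows one way to do it; the function names are invented for illustration, and it assumes the usual scorecard convention that "55.4" means 55 overs and 4 balls, with the Overs column counting legal deliveries only.

```python
def overs_to_balls(overs: str) -> int:
    """Convert scorecard overs notation ('55.4' = 55 overs, 4 balls) to balls."""
    whole, _, part = overs.partition(".")
    extra = int(part) if part else 0
    if not 0 <= extra <= 5:
        raise ValueError(f"invalid overs value: {overs!r}")
    return int(whole) * 6 + extra


def extras_rates(matches: int, overs: str, wides: int, no_balls: int):
    """Return (total, per_match, balls_per_extra) for one bowler.

    balls_per_extra is None when the bowler has conceded no wides or no-balls.
    """
    total = wides + no_balls
    balls = overs_to_balls(overs)
    per_match = total / matches
    balls_per_extra = balls / total if total else None
    return total, per_match, balls_per_extra


# Rows taken from the tables above: Roach (10 matches, 35.0 overs, 20 wides,
# 5 no-balls) and Tait (15 matches, 55.4 overs, 30 wides, 1 no-ball).
for name, row in [("Roach", (10, "35.0", 20, 5)), ("Tait", (15, "55.4", 30, 1))]:
    total, per_match, per_extra = extras_rates(*row)
    print(f"{name}: total={total} per match={per_match:.2f} balls/extra={per_extra:.1f}")
```

Overs are passed as text rather than as a decimal number because 55.4 overs is not 55.4 in the arithmetic sense: the digit after the point is a count of balls, not tenths.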

Conclusion:

This started as a simple article. However the results are extremely fascinating. So much so that the conclusions are quite speculative and I invite the enlightened readers to come in with their comments.

Tables 1, 3 and 5 are the "negative" tables in this analysis in that these show the bowlers who have bowled more wides/no-balls, more wides/no-balls per match and more frequent wides/no-balls. However these tables are dominated by the genuinely good fast bowlers from top teams who have won more matches for their teams than the other bowlers.

Tables 2, 4 and 6 are the "positive" tables in this analysis in that these show the bowlers who have bowled fewer wides/no-balls, fewer wides/no-balls per match and less frequent wides/no-balls. However these tables are dominated by the bowlers from lesser teams and some ordinary spinners. These are not necessarily match-winners. The top spinners, Muralitharan, Mendis, Harbhajan, Vettori, Botha, Swann et al are conspicuous by their absence.

What does one conclude?

- That the top fast bowlers go for broke at the cost of control.
- That the top spinners do similarly but exercise more control.
- That the lesser bowlers, especially from the weaker teams, handicapped by their own lack of skills and team strength, show greater discipline and exercise lower levels of variety.
- That wides and no-balls, especially wides (no-balls carry the unknown free-hit component), are not that negative a trait for a bowler to have.

Note: For some perceptive comments on the conclusion, I must thank Sriram (Ananthanarayanan) who has brought in welcome independent editing skills.

The bottom line is that the really attacking bowlers, especially the fast bowlers, necessarily go for pace and variation and this might lead to more wides and no-balls. Maybe such a measure should be looked into in conjunction with strike rates of bowlers. My gut feel is that it is not possible to derive any conclusion from looking only at wides and no-balls.

To view/down-load the complete table, please click/right-click here.

June 1, 2010
Significant Test innings, and their architects
Posted by Anantha Narayanan at in Batting

Shivnarine Chanderpaul has a significant innings percentage of 46.7%, which places him fourth in the all-time list © Getty Images
It is nice to be back after a valuable and recharging break. It is also wonderful to renew acquaintance with the valued readers. The break was necessary but I could not wait for the self-imposed sabbatical to be over.

In this article I have gone back to the readers' suggestions, specifically Xolile's. He had suggested a few months back that I should look at separating the significant Test innings based on runs scored and balls faced, wherever such information is available, and rating batsmen using this information. I have taken that suggestion and completed the analysis after significantly improving the basis.

He had suggested that I take 80 runs and 160 balls as the basis. I have instead worked on a dynamic fixing of the cut-off points based on the specific match conditions. The idea is that I should achieve the following inclusions and exclusions through this analysis.

The analysis should be done so that the following innings (just a few examples) are included.

- Gillespie's 9 (off 51) out of Aus total of 93 a.o (30 overs) at Mumbai
- Guptill's 30 (off 122) out of Nzl total of 157 a.o (59.1 overs) at Wellington
- Srinath's 76 (off 159) out of Ind total of 416 a.o (128.3 overs) at Hamilton
- Hutton's 30 (balls n/a) out of Eng total of 52 a.o. (42.1 overs) at Oval
- A.H.Kardar's 69 (balls n.a) out of Pak total of 199 a.o (91.3 overs) at Karachi
and so on.

and the following innings (just a few examples) are not included.

- Collingwood's 60 out of Eng total of 569 for 6 at Chester
- Clarke's 83 out of Aus total of 674 for 6 at Cardiff
- Ranatunga's 86 out of Slk total of 952 for 6 at Colombo
- Walcott's 88* out of Win total of 790 for 3 at Kingston
- Rae's 63* and Stollmeyer's 76* out of Win total of 142 for 0 at Trinidad
and so on.

I have taken one decision, slightly reluctantly. Any 100 would be considered to be significant. Although I do not consider a 100 by itself to be anything special, I think this is a correct decision since out of the 68,879 innings played to date only 3370 hundreds have been scored and this constitutes around 5%. It is not a bad premise to start with, banking one in twenty innings.

As far as the often quoted instances of batsmen scoring 100s in dead match situations, the following example will show the pitfalls.

Take a match where two days have been washed out. The match scores are

Team 1: 300 for 5. Team 2: 300 for 6. Team 1: 300 for 7 (Xyz 100+).

If the first two days are lost due to rain, the third innings century is a totally irrelevant one scored on the last day. On the other hand if the last two days have been washed out, the third innings century is a very relevant one made in a live match situation on the third day. If the rain had occurred on other days, the value of the 100 would oscillate significantly. Hence pre-conceived notions of the significance or non-significance of innings should not be used to come to conclusions. Also, incorporating the rain factor (when it happened, and on what day the runs were scored) is virtually impossible in any analysis because of the absence of dependable data.

Since 80 and 160 are arbitrary, I have worked on a dynamic determination of the cut-off for each match, separate for either team. This makes sense since I should include an innings of 9 and exclude an 88* innings. There cannot be common cut-off criteria.

The cut-off methodology is explained below. Based on the cut-off points 2 to 5, 12,529 innings below 100 have got selected.

An innings is considered to be significant if it satisfies any one of the following five conditions.

1. The runs scored is greater than or equal to 100 (already talked of).

2. The balls faced is greater than or equal to 200.

3. The runs scored is greater than or equal to the cut-off figure for the team, as explained below.
- For batsmen 1-7, 1.333 times the Runs per wkt value for the team for the two innings together.
- For batsmen 8-11, 1.167 times the Runs per wkt value for the team for the two innings together.

4. The balls faced is equal to or higher than the cut-off figure for the team, as explained below.
- For batsmen 1-7, 1.667 times the Balls per wkt value for the team for the two innings together.
- For batsmen 8-11, 1.333 times the Balls per wkt value for the team for the two innings together.

5. To take care of very low innings totals (see the Hutton example above), the runs scored is greater than or equal to one third of the team total. The team should have lost 5 wickets or more. Otherwise Stollmeyer-type innings would get through.
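The five conditions can be sketched as a single check, as below. This is only an illustration of the rules as listed, not the program used for the analysis. The names `TeamMatchTotals`, `satisfied_conditions` and `is_significant` are invented, balls faced is passed as `None` where it is not available, and skipping conditions 3 and 4 when the team has lost no wickets is an assumption (the per-wicket values are undefined in that case). The match aggregates in the examples are made up; only the single-innings figures come from the examples given earlier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TeamMatchTotals:
    """Team aggregates over both of its innings in the match."""
    runs: int
    balls: int
    wickets: int


def satisfied_conditions(runs: int, balls: Optional[int], position: int,
                         match: TeamMatchTotals,
                         innings_total: int, innings_wickets: int) -> List[int]:
    """Return the numbers (1 to 5) of the conditions this innings meets."""
    met = []
    if runs >= 100:                                  # condition 1
        met.append(1)
    if balls is not None and balls >= 200:           # condition 2
        met.append(2)
    top_order = position <= 7
    # Assumption: with no wickets lost the per-wicket cut-offs are undefined,
    # so conditions 3 and 4 are skipped.
    if match.wickets > 0:
        rpw = match.runs / match.wickets
        bpw = match.balls / match.wickets
        if runs >= (1.333 if top_order else 1.167) * rpw:        # condition 3
            met.append(3)
        if balls is not None and balls >= (1.667 if top_order else 1.333) * bpw:
            met.append(4)                                         # condition 4
    if innings_wickets >= 5 and 3 * runs >= innings_total:       # condition 5
        met.append(5)
    return met


def is_significant(*args, **kwargs) -> bool:
    """An innings is significant if it meets at least one condition."""
    return bool(satisfied_conditions(*args, **kwargs))


# 30 out of 52 all out, balls not available; match aggregates are made up.
print(satisfied_conditions(30, None, 1, TeamMatchTotals(240, 1000, 20), 52, 10))
# 88* out of 790 for 3; balls and batting position are assumed values.
print(is_significant(88, None, 4, TeamMatchTotals(790, 1500, 3), 790, 3))
```

Returning the list of conditions met, rather than a bare yes or no, makes it easy to produce a breakdown by criterion afterwards.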

This seems complicated, but the five conditions are logical once they are understood properly; remember that an innings has to satisfy only one of them in order to be seen as significant in this analysis. Of course, a cursory glance would be woefully inadequate. These cut-off numbers have also been determined after a lot of trial work during the past few days. A higher cut-off will mean missing out on some significant innings while a lower cut-off will mean inclusion of ordinary innings. Overall this method is slightly unfair to older batsmen since they have only the "Runs scored" criterion available to them. However nothing can be done about that.

I got a massive list of 15,899 innings, which is about 23% and this figure looks good. Then I posted these into the player database and got the player table. This table is sequenced on the % of significant innings since the number of innings played varies considerably. The cut-off for batsman selection is 3000 runs and above. 159 batsmen qualify.

The top 20 entries are listed below.

Table of batsman by % of significant innings

SNo Batsman           For Mats  Runs Inns   SI  % SI

  1.Bradman D.G       Aus   52  6996   80   43  53.8
  2.EdeC Weekes       Win   48  4455   81   39  48.1
  3.Hobbs J.B         Eng   61  5410  102   49  48.0
  4.Chanderpaul S     Win  123  8669  210   98  46.7
  5.Barrington K.F    Eng   82  6806  131   61  46.6
  6.Sutcliffe H       Eng   54  4555   84   39  46.4
  7.Lara B.C          Win  131 11953  232  106  45.7
  8.Dravid R          Ind  139 11395  240  108  45.0
  9.Hutton L          Eng   79  6971  138   62  44.9
 10.Flower A          Zim   63  4794  112   50  44.6
 11.May P.B.H         Eng   66  4537  106   47  44.3
 12.Viswanath G.R     Ind   91  6080  155   68  43.9
 13.Hammond W.R       Eng   85  7249  140   61  43.6
 14.Compton D.C.S     Eng   78  5807  131   57  43.5
 15.Umrigar P.R       Ind   59  3631   94   40  42.6
 16.Mitchell B        Saf   42  3471   80   34  42.5
 17.Sarwan R.R        Win   83  5759  146   62  42.5
 18.Manjrekar V.L     Ind   55  3208   92   39  42.4
 19.Javed Miandad     Pak  124  8832  189   80  42.3
 20.Gavaskar S.M      Ind  125 10122  214   89  41.6
How often do we see a table headed by Bradman! More than 1 out of 2 innings played by Bradman are significant. He is the only player to have exceeded 50%. Then come two giants, Weekes and Hobbs, who have figures around 48%; the one mitigating factor is that they are within 10% of Bradman.

Now the biggest surprise. The unheralded and unsung Chanderpaul clocks in at 46.7%, ahead of his more illustrious contemporaries. It shows the solidity and quality Chanderpaul brought to position No. 6. He could very well improve in the years to come. Barrington and Sutcliffe come in next, both great defensive batsmen. Hutton chips in at the 9th position.

Now we have two modern greats, Lara and Dravid. Lara's playing in a weaker team has helped a bit in this regard, but there can be few detractors to the claims of his greatness. Same applies to Dravid. What he has achieved for India has not been acknowledged, especially on the Test front. It is very pleasing to see some of the Indian greats of the past eras, viz., Viswanath, Umrigar, Manjrekar and Gavaskar appear in the top-20. They played in tough times and this has been recognised. Rounding off the top ten, in the 10th position, is Andy Flower, one of the greatest modern batsmen, slightly benefiting from playing for a weaker team.

To view/down-load the complete table, please click/right-click here.

I have also given below the top 10 batsmen in terms of number of significant innings.

Table of batsman by number of significant innings

SNo Batsman           For Mats  Runs Inns   SI  % SI

  1.Dravid R          Ind  139 11395  240  108  45.0
  2.Lara B.C          Win  131 11953  232  106  45.7
  3.Border A.R        Aus  156 11174  265  103  38.9
  4.Tendulkar S.R     Ind  166 13447  271  103  38.0
  5.Chanderpaul S     Win  123  8669  210   98  46.7
  6.Kallis J.H        Saf  137 10843  231   94  40.7
  7.Waugh S.R         Aus  168 10927  260   92  35.4
  8.Stewart A.J       Eng  133  8465  235   90  38.3
  9.Gavaskar S.M      Ind  125 10122  214   89  41.6
 10.Inzamam-ul-Haq    Pak  120  8830  200   82  41.0
This is a quantity table. Dravid is on top with 108 performances and is followed by Lara with 106. Both are placed in the top-10 of the main table. Then come the great fighter, Border, and the incomparable Tendulkar, each with 103 significant innings. These four are the only batsmen to exceed 100 significant innings. Chanderpaul and Kallis should soon breach this number.

To view/down-load the complete table, please click/right-click here.

I have also made available the complete list of significant performances for all the 159 qualifying batsmen.

To view/down-load the table for the first 999 tests, please click/right-click here.

To view/down-load the table for tests 1000-1957, please click/right-click here.

Finally, the grand-daddy of all tables. Let me warn you that these tables are huge, 500 KB each. These are the lists of all significant innings, all 15899 of them, covering all 1957 Tests played.

To view/down-load the complete table for tests 1-999, please click/right-click here.

To view/down-load the complete table for tests 1000-1957, please click/right-click here.

Finally, the usual note. This is a unique attempt to apply a common set of criteria across 1957 Tests spread over 133 years. There are bound to be anomalies. Readers are better off suggesting improvements rather than pointing out such stray instances.

A few readers have asked for some summarized figures based on the criteria. I have given these, and more, below. I have not worked out the percentages. I leave that to the readers.

Total: 15908
100s: 3372      
200 balls but < 100 runs:  312
Out of other 12224 innings,
Both rpw & bpw criteria: 2517  Rpw criteria: 9270    Bpw criteria: 410
50-99: 6944     Lt 50: 5592
BPos 1-7: 13932   BPos 8-11: 1976
1st inns: 8791      2nd inns: 7117
Wins: 4587     Draws: 4713     Losses: 6608
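
For readers who would rather not work out the percentages by hand, a minimal Python sketch follows. The counts are copied from the summary above and the base is the stated total of 15908; the names `summary` and `pct` are only illustrative, and the criteria breakdown of the other 12224 innings is left out.

```python
TOTAL = 15908  # total significant innings, as stated in the summary

# Counts copied from the summary figures above
summary = {
    "100s": 3372,
    "200 balls but < 100 runs": 312,
    "50-99": 6944,
    "Lt 50": 5592,
    "BPos 1-7": 13932,
    "BPos 8-11": 1976,
    "1st inns": 8791,
    "2nd inns": 7117,
    "Wins": 4587,
    "Draws": 4713,
    "Losses": 6608,
}


def pct(count, base=TOTAL):
    """Share of the base, as a percentage rounded to one decimal place."""
    return round(100.0 * count / base, 1)


for label, count in summary.items():
    print(f"{label:<26}{count:>6}{pct(count):>7.1f}%")
```

On these figures, hundreds make up about 21.2% of all significant innings, and about 41.5% of significant innings were played in losing causes.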

Y Anantha Narayanan has over 35 years of IT background. Over the past 15 years, he has been concentrating on Cricket analysis and software development. He has been involved with StumpVision, Wisden, Hallmark Software and his own site www.thirdslip.com during this period.
© ESPN EMEA Ltd