Accuracy and credibility
Accuracy and credibility are the two most important characteristics of any ranking system. They describe the relation between the:
- list of results
- ranking list
- performance of the competitors
- "real strength" of the competitors
The list of results and ranking list are well-defined subjects. Each consists of names and numbers. In the screenshot below the list of results is on your left and the ranking list is on your right.
The performance of the competitors is also a well-defined concept. Every match result represents the relative performance of the opponents. So a result is simply a numeric representation of the performances in one match and a ranking is a numeric representation of the participants' performances throughout the whole tournament.
Accuracy is a measure of the relationship between a tournament's results and a tournament's ranking. The higher the accuracy, the better the tournament's ranking represents the tournament's results. All you need to know about accuracy is that, as a user, you cannot affect it. It is a property of the mathematical model that we developed to measure performance, and that model is very accurate. The only way to improve accuracy is to use more digits in the ranking representation. For instance, in the screenshot above, the Sharks have 10,000 points and the Alphas have 9,594 points. A more accurate ranking would show the Sharks with 10,000.000 points and the Alphas with 9,594.442 points. Through many experiments, however, we found that 4 digits should be enough for almost all competitions known to us. More digits would be necessary in a tournament with a large number of participants of approximately the same level. If by the end of such a tournament several competitors had the same number of points, say 9,594, then it would be the proper time to use more digits in the ranking.
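The effect of the number of digits can be illustrated with a short sketch. The values for the Sharks and the Alphas are taken from the example above; the third team and its value are invented for illustration, and the helper name tied_groups is hypothetical. This is not the ranking model itself, only a demonstration of how rounding can create ties.

```python
from collections import defaultdict

def tied_groups(points, decimals):
    """Group competitors whose points coincide after rounding to `decimals` places."""
    groups = defaultdict(list)
    for name, value in points.items():
        groups[round(value, decimals)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Sharks and Alphas come from the text; "Hypothetical" is an invented team.
points = {"Sharks": 10000.000, "Alphas": 9594.442, "Hypothetical": 9594.127}

print(tied_groups(points, 0))  # [['Alphas', 'Hypothetical']]
print(tied_groups(points, 3))  # []
```

With whole points the two close teams are indistinguishable; with three more digits the tie disappears.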
In contrast to the concept of performance, "real strength" is not a well-defined concept. In the 29th match of the above tournament, the Alphas - the 5th team in the league - were beaten 3:4 by the Warriors, who are 9th in the ranking. But during the match the Alphas outshot the Warriors 40:13 and hit the post 5 times. In addition, the Warriors scored one fluke goal.
So, can we say that the Alphas are "really stronger" than the Warriors? Can we say that the Warriors won the match because they got "lucky"? Yes, we can, but only if we were present at the match and our ice-hockey experience allowed us to conclude that all 40 shots on the goalie and all 5 post hits were indeed a logical manifestation of the Alphas' superiority.
But what if the story was the complete opposite? What if the 3:4 loss was logical and their 40 shots on the goalie were "lucky"? What if in reality it was not the Warriors who got "lucky" but the Alphas who got "lucky" that they didn't lose by a larger difference in goals?
Finally, what if all of the above data was logical? Is it possible that the 3:4 loss was logical while the 40:13 shots on the goalie were logical as well? Yes, even such a scenario is possible: the Alphas' offense may be so much better than the Warriors' defense that it can produce 40 shots, while at the same time those shots are not good enough to beat the Warriors' goalie.
In order to determine who is "really stronger", it is not enough to know the results and the shots on the goalie; more data is required. So is it possible to collect match statistics such that, just by looking at them, one could always tell which team is "really stronger"? Even if it is possible, how could it be tested? All of the above questions are very interesting and important in certain professions, but they are also very complex. As a result, the concept of "real strength" still belongs to the category of weakly-defined concepts. The question - who is "really stronger"? - is left for discussion by the fans, media, bookmakers, sport agents and anyone else who has an opinion.
The next logical question is - why did we bring up the concept of "real strength" in the first place?
A properly organized tournament must provide an opportunity to all participants to demonstrate their "real strength" and minimize the "luck" factor in a ranking.
Here the notion of credibility arises. Credibility is a measure of how well a tournament is managed or, in other words, how wisely the opponents are paired in every round of the tournament so that the ranking - which represents the performances - can also be considered a good approximation of the real strengths. The practical issues of how to manage a tournament so that it produces a more credible ranking are explained below.
As the software takes care of the accuracy component, the credibility of a tournament is in the hands of the tournament officials. In order for a ranking to be credible, all of the specifics of a particular sport or competition should be taken into consideration. For example, in a 5000 meters speed skating event all of the participants have only one attempt to race, so they must do their best. But in ice-hockey, even at high-profile events, it is not so easy to figure out how "seriously" some top teams are taking their matches at the group stage of the competition.
However, the 'play a closest competitor' rule will make any tournament in any sport fairly credible. This rule provides the competitors with enough opportunity to show what they are capable of.
The more chances all players have to play against opponents who are close to them in the ranking, the more credible the resulting ranking will be.
For example, in the above tournament the Alphas played only one match against a team ahead of them in the ranking and five matches against teams below them. So in the next round it would be wise to put the Alphas against the Sharks, or the Supersonics, or the Penguins.
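One possible way to apply the rule when preparing the next round is sketched here. It is a simple greedy procedure written for illustration only, not the scheduling logic of any equTournament type. The function name pair_closest, the competitor labels A to F and the list of matches already played are all assumptions.

```python
def pair_closest(ranking, played):
    """Greedily pair each unpaired competitor with the nearest-ranked
    opponent it has not met yet. `ranking` is ordered best to worst;
    `played` is a set of frozensets, one per match already played."""
    unpaired = list(ranking)
    pairs = []
    while len(unpaired) > 1:
        top = unpaired.pop(0)
        for i, candidate in enumerate(unpaired):
            if frozenset((top, candidate)) not in played:
                pairs.append((top, unpaired.pop(i)))
                break
        # If `top` has already met everyone left, it sits this round out.
    return pairs

# Hypothetical current ranking and match history.
ranking = ["A", "B", "C", "D", "E", "F"]
played = {frozenset(("A", "B")), frozenset(("C", "D"))}

print(pair_closest(ranking, played))
# [('A', 'C'), ('B', 'D'), ('E', 'F')]
```

Because A has already met B, it is paired with C, the next closest competitor; the remaining pairs follow the same logic.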
In the final ranking of a perfectly managed tournament every team would have played against a certain number of teams which are slightly higher and slightly lower in the ranking. For example, if the above ranking were the final ranking of a tournament, then it would be great if the Alphas had played against the Penguins, United, eCentral and the Warriors. It would be even better if, in addition, the Alphas had played against the Supersonics and the Hornets.
Simply applying the above rule should make a tournament credible enough, assuming that the competitors play a sufficient number of matches. Once the initial ranking has been obtained, the following rounds should match opponents who are close to each other in the ranking. For an in-depth analysis of how to improve the credibility of a tournament, please read the next section.
Credibility and connectivity
It has been shown that connectivity is a sufficient condition for the existence of an equRanking. Under this condition it is possible to obtain the initial ranking after just two rounds of matches. But in most sports such a ranking would not be considered credible. The only exception would be if the rules stipulated that a competition consist of only two rounds. If a competition consists of more than two rounds, then another question arises - what strategy should be used to match the competitors in the following rounds?
Some effective matching strategies are implemented in MatchMaster-R and MatchMaster-A - types of equTournaments where the software itself makes all of the scheduling decisions. MatchMaster types remain blind to anything but the credibility of a ranking. For example, MatchMasters cannot and should not give preference to some pairings only because the fans want it.
By contrast, in ALPHA it is possible to take into account the fans' wishes as well as those of the competitors, because the matching of competitors is done by tournament officials. ALPHA - which can be associated with the freedom to play, to choose and to take into account the desires of all the main stakeholders in a tournament - has tremendous flexibility in scheduling. For example, in tennis tournaments of the ALPHA type, officials can literally guarantee a match between the top-listed players. Combining the flexibility of ALPHA with intelligent scheduling to produce a credible ranking is not so difficult. Connectivity is a major tool that can be used in creating an intelligent schedule. It is visual and easy to understand. Therefore, we use connectivity to measure the credibility of a ranking.
We say that a tournament is path-connected if the graphical representation of the tournament contains at least one path which connects all nodes. For example, in the graph below where the participants are represented by nodes and matches by arcs, such a path exists.
We say that two paths are independent if they do not possess a common arc. We say that the tournament is N-path-connected if the graphical representation of the tournament contains N independent paths and each of them connects all of the nodes. The graph below is 2-path-connected. The first path is dark-coloured and the second path is light-coloured.
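The definition can be checked mechanically once a candidate set of paths is known. The sketch below verifies such a set; it does not search for the paths. It assumes that a path "connects all nodes" when it visits every participant exactly once. The function name is_n_path_connected and the four-participant tournament are invented for illustration.

```python
def is_n_path_connected(nodes, matches, paths):
    """Check a candidate set of paths: every path visits each node exactly
    once, uses only played matches, and no two paths share an arc."""
    played = {frozenset(m) for m in matches}
    used = set()
    for path in paths:
        if sorted(path) != sorted(nodes):
            return False  # the path does not connect all nodes
        for a, b in zip(path, path[1:]):
            arc = frozenset((a, b))
            if arc not in played or arc in used:
                return False  # missing match, or arc shared with another path
            used.add(arc)
    return True

# Hypothetical tournament: four participants who have all played each other.
nodes = ["A", "B", "C", "D"]
matches = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]

independent = [["A", "B", "D", "C"], ["B", "C", "A", "D"]]
overlapping = [["A", "B", "C", "D"], ["A", "B", "D", "C"]]

print(is_n_path_connected(nodes, matches, independent))  # True
print(is_n_path_connected(nodes, matches, overlapping))  # False
```

The first pair of paths shows that this small tournament is 2-path-connected; the second pair fails because both paths use the arc between A and B.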
The next tournament has 16 participants and 57 matches. Competitors A, B, C, D, E, F, G and H are very well connected within their group. This group is 4-path-connected, which is the maximum possible for a group of 8 competitors. The group K, L, M, N, O, P, R and Q is also 4-path-connected, but the whole tournament is only 1-path-connected. Therefore, it is a poorly managed tournament.
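The claim that 4 is the maximum for a group of 8 can be checked with a short sketch. It uses the classical zigzag construction for splitting a full round robin into spanning paths; this is a standard graph-theory technique, not necessarily how the figure was drawn. It again assumes that each path visits every competitor exactly once. The function name zigzag_paths is hypothetical.

```python
from itertools import combinations

def zigzag_paths(players):
    """Split a full round robin among an even number of players into
    len(players) // 2 paths that share no match (zigzag construction)."""
    n = len(players)
    offsets = [0]
    for k in range(1, n // 2 + 1):
        offsets += [k, -k]
    offsets = offsets[:n]  # drop the final -n/2, which repeats +n/2 modulo n
    return [[players[(start + off) % n] for off in offsets]
            for start in range(n // 2)]

group = list("ABCDEFGH")
paths = zigzag_paths(group)
arcs = [frozenset(pair) for path in paths for pair in zip(path, path[1:])]

print(len(paths), len(arcs), len(set(arcs)))  # 4 28 28
print(set(arcs) == {frozenset(p) for p in combinations(group, 2)})  # True
print(2 * 28 + 1)  # 57
```

Four independent paths already use all 28 possible matches within a group of 8, so a fifth cannot exist. The last line is consistent with the 57 matches quoted above: two fully played groups of 28 matches each plus a single match between the groups.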
Usually a higher path-connectivity implies a wiser scheduling design, but this is not always true. There is a small chance that things might go wrong. For example the tournament below has 15 participants and is 4-path-connected. It is not a smart design though because both groups are measured in relation to each other through only one competitor - F.
The above tournament is too "dependent" on competitor F. Therefore, high path-connectivity is not a sufficient condition for a smart design. Here the definition of graph connectivity from graph theory comes into play. It is also referred to as node-connectivity or vertex-connectivity. A graph is N-node-connected if it cannot be disconnected by removing fewer than N nodes, that is, if at least N nodes must be removed to disconnect it. In the above tournament it is possible to disconnect the graph by removing just one participant, F. Therefore, the above tournament is 1-node-connected and 4-path-connected.
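For small tournaments the node-connectivity can be computed by brute force: try removing every set of participants, smallest sets first, and check whether the rest is still connected. The sketch below follows that idea. The function names and the five-participant example, in which two groups share the single competitor F, are invented for illustration and are much smaller than the tournament in the figure.

```python
from itertools import combinations

def is_connected(nodes, arcs):
    """Depth-first search over an undirected graph restricted to `nodes`."""
    nodes = set(nodes)
    if not nodes:
        return True
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for a, b in arcs:
            if a == node and b in nodes:
                stack.append(b)
            elif b == node and a in nodes:
                stack.append(a)
    return seen == nodes

def node_connectivity(nodes, arcs):
    """Smallest number of nodes whose removal disconnects the graph
    (brute force; for a complete graph this returns len(nodes) - 1)."""
    nodes = list(nodes)
    for k in range(len(nodes) - 1):
        for removed in combinations(nodes, k):
            rest = [n for n in nodes if n not in removed]
            if not is_connected(rest, arcs):
                return k
    return len(nodes) - 1

# Hypothetical tournament: groups {A, B, F} and {F, X, Y} share only F.
nodes = ["A", "B", "F", "X", "Y"]
arcs = [("A", "B"), ("A", "F"), ("B", "F"), ("F", "X"), ("F", "Y"), ("X", "Y")]

print(node_connectivity(nodes, arcs))  # 1
```

Removing F alone separates A and B from X and Y, so the example is 1-node-connected. The search is exponential in the number of participants and is only meant for small graphs.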
A properly managed tournament would require both: high path-connectivity and high node-connectivity. To measure the credibility of a tournament's ranking we rely on the following definition.
A tournament has credibility, N, if both the path-connectivity and node-connectivity of its graph representation are no less than N.
Below is an example of a well-managed tournament. It has 16 participants and 32 matches. It is 2-path-connected and 4-node-connected, which is the maximum possible for that number of matches. By the definition above, its credibility is therefore 2.
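The phrase "maximum possible" can be supported by a simple counting argument, sketched below. The bounds depend only on the number of participants and matches, so they say what a schedule could reach at best, not what a particular schedule achieves. The function name connectivity_upper_bounds is hypothetical, and a path is again assumed to visit every participant exactly once.

```python
def connectivity_upper_bounds(participants, matches):
    """Counting bounds only; a given schedule may fall short of them."""
    # Each path through all participants needs participants - 1 arcs,
    # and independent paths share no arcs.
    max_paths = matches // (participants - 1)
    # Removing every opponent of the least-active participant isolates it,
    # and the minimum number of opponents cannot exceed the average.
    max_node = min(participants - 1, (2 * matches) // participants)
    return max_paths, max_node

paths, node = connectivity_upper_bounds(16, 32)
print(paths, node)       # 2 4
print(min(paths, node))  # 2
```

With 16 participants and 32 matches, no schedule can exceed 2-path-connectivity or 4-node-connectivity, and the largest credibility it can have under the definition above is 2.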