The first study of judging - report

A preliminary version of this report was prepared in mid-February and made available to the study participants for comments. This published version does not differ substantially from the preliminary one.

Introduction

1. First, let me express my thanks to the 5 composer-judges who dared to participate in the first experimental study of judging helpmates. Looking at the problems critically and writing up opinions was surely time-consuming, and I am really grateful and happy that they contributed. Reportedly, a few more people got involved and reached some stage of the process, but some of them just did not hit the "Send" button in their e-mail clients...

2. The five contributors are:

Evgeny Bourd (EB), Israel
Ilja Ketris (IK), Latvia
Mario Parrinello (MP), Italy
Menachem Witztum (MW), Israel
Peter Wong (PW), Australia

3. They approached the task in very varied ways, from simply ordering the problems to writing a complete award and even more. This made the analysis of the results a bit difficult, but after some consideration I decided to take only their orderings as the main quantitative measure. Obviously, I had to make some decisions in the process. However, below you can find what I hope are their full commentaries, along with some summarizing findings of my own prompted by the particular problem. As it was not the aim of the study to evaluate the abilities of particular judges-participants, I decided not to publish individual judgements or to measure individual outcomes (except in a few reasonable cases). Further below there are a few issues that were uncovered and that, in my opinion, deserve explicit mention.

Comments to problems

4. Now, let's look at the comments received on all problems, in the order of their average placement by our jury. (As we had only 5 participants in the study, none of the numeric results is probably statistically significant; they do, however, indicate at least something.)

5. JT3 by Mirko Degenkolbe - average position: 3,3
EB: "Very good geometry in the solutions using the WQh8 "shooting" both corners, the idle white piece doesn't bother me at all. It is very interesting how the 3 pieces on the 1st rank are cleared."
IK: "Reversed phoenix achieved quite imaginatively. Remarkable economy."
MP: "An original interpretation of one of the oldest motifs, castling, exploited for battery mates and enriched here by Phoenix effects and pin mates."
PW: "As you say, very original. The White pieces unused in the mates don't bother me because they take a very active part in the play."
JL: "Well, now I can say it openly, without affecting the outcome: it was I who placed this helpmate almost at the top of the 12 awarded helpmates in the international match with the theme 'h#2 with royal battery'. I dared to do so despite the white B/S pair having no active role in the mates. I am quite relieved that the participants of this judging study have, in a sense, confirmed this slightly bold step of mine. Or was my remark about the originality of the concept too influential? See also the ideas about general feeling and the importance of comments below."

6. JT7 by Efren Petite - average position: 4
EB: "Amusing problem where all white moves are made by 3 pawns. The play is rather obvious."
IK: "Extremely attractive cycle of moves. Technical pieces seem to be unavoidable."
MP: "I have excluded JT7 (anticipated by A. Kuzovkov 1st Prize Goumondy-40 JT 1986)."
PW: "The two tempo checks on b3/d3 are a bit familiar, but the addition of a third tempo check to generate a White move cycle is excellent."
JL: "This is by no means meant as a criticism of the other participants; rather, it is very useful that they approached the study differently. But thanks to Mario we have uncovered two important things that I had felt before. Firstly, one should try to look for comparison problems whenever possible (see also paragraph 21 below). The other matter is remarks about construction: one should be very careful about these, as my own judging practice has taught me."

7. JT8 by Mario Parrinello - average position: 4,25
EB: "Zilahi with pin mate of the same piece on different lines, looks hard to achieve but the WQh3 seems out of the game. The large number of pieces used just to stop BSd4 from "jumping" is also not very pretty."
IK: "ASP involving bS with good ODT, but the Zilahi seems a bit artificial."
JL: "Of course, the white B/R were placed on the destination squares of the wS on purpose, which points to the fact that themes may be attained in varying fashions and not every embodiment of a named theme is equally valuable. One should not be fooled by a theme name! By the way, Mario did not judge this helpmate of his own, a gentlemanly approach that provokes the question of how far anyone is able to judge the qualities of his own works."

8. JT6 by Zoltan Labai - average position: 5,2
EB: "The "forced" BQ captures of the pawns are unharmonic, once to open BBb6 and once to allow c5 but overall nice pin mates."
IK: "Fresh productive design, ASP by bQ, every pawn clears 2 lines in each solution. Economy seems adequate."
MP: "Two good anticipatory selfpins by the BQ for line opening purpose but the twinning mechanism seems to me rather weak and the BQ has very limited space."
JL: "Rather than summarizing findings after reading the participants' comments, I would first like to express my own understanding of this helpmate. For me the key to the content is the twinning device. Rh4 frees the S for mating, provided that d4 and d3 are blocked or guarded. Rd1 allows d3#, provided that e3 is blocked or guarded. Here I see the source of the imbalance, which also shows in Evgeny's comment about the imbalanced motivation of the bQ's role, approaching the content from another point of view. The same thing might be felt differently. Also, as two of the comments say, the activity of pieces plays an important role; see the general comment below."

9. JT11 by Valerij Gurov - average position: 5,2
EB: "3 different pawn batteries but the move Sa1 in the 3rd solution ruins the harmony."
IK: "Attractive mechanism in a healthy setting, 2... Sa1 in 3) is quite unexpected, but 2nd and 3rd solutions are way too symmetrical."
MP: "This problem shows original anticritical moves by the rear piece of a battery. Though the motivation of B2 in one solution differs from the other two and on the other hand it seems unavoidable, the real merit of this difficult strategy is that the author has achieved a three-solutions setting not being satisfied with only two solutions."
JL: "On the surface there is a fact probably discovered by many helpmate composers: making 3 analogous solutions is usually much more difficult than making just two. I have experienced it many times myself, and even one of my early helpmates with 3 solutions (with grasshoppers, ack, sorry in this orthodox context) got quite a high distinction in its tourney although I did not like it very much! ... just because of the 3 solutions, I suppose. Mario, being an expert h#2 composer, put Gurov's h# at the top of his award, and the second experienced h#2 composer, Menachem, gave it the 2nd highest placing among all study participants. This might indicate that one's own composing experience in the judged field can affect one's preferences quite a lot. This is not to say that being a good composer is necessary for being a good judge or vice versa, just that the viewpoint of a composer active in the area might be very different from that of an occasional judge."

10. JT10 by Menachem Witztum - average position: 5,9
EB: "Very disharmonic but interesting line play, the moves Sd6 and Se3 that appear twice is a nice touch."
IK: "Two solutions correspond in an unusual fashion. The quality of rendering is exemplary."
MP: "Complete change of motivation of a similar move (Sd6) by both sides (unpin of both a black and a white piece in a) and check avoidance in b)); unfortunately the impure simultaneous unpin in a) spoils the clarity of the play."
JL: "For once, I have nothing special to add..."

11. JT4 by Paz Einat & Ofer Comay - average position: 6,3
EB: "Interesting royal batteries but the play itself is not harmonic, once WBd5 is captured, the piece that mates, while in the second solution WSf2 is captured, a piece that only guards d3."
IK: "King dances are entertaining, but I wish wRh4 were sacrificed thematically. Also pity about static wRa2."
MP: "Visually attractive is not only the creation of two different royal batteries but also the paradoxical double exchange of place between both Kings in the twinning mechanism and especially in the mates. A more economical version, which saves 3 units, can be achieved."
JL: "Here my own feeling did not quite correspond with that of the participants. One disturbing element is the disharmonic treatment of the rear pieces of the batteries (in a) the wB is captured, in b) Rh4 guards squares), which for me is acceptable as it was not the core of the problem. Further, I like the use of Rb1, as despite being away from the main action it proves useful in both positions. Finally, Mario has again proved that it is always necessary to question the construction of a problem, even if it is time-consuming."

12. JT5 by Jan Kovalic - average position: 7,6
EB: "Good problem but the idle WBg7 in the 3rd and 4th solutions doesn't allow a higher placing."
IK: "Nice mixture of strategic elements, albeit not very complex, achieved in a “helpmate of the future” setting. The symmetry is discouraging though."
MP: "I have excluded JT5 (too symmetrical)."
JL: "Interesting to find out that, despite the harsh comments, this problem did not finish near the very bottom of the averaged orderings. Moreover, as you can see below, all judges ordered it more or less similarly. This might indicate that verbal expression sometimes does not correspond to quantitative expression. Which one is then more important or reliable?"

13. JT2 by Anatolij Stjopotchkin - average position: 7,6
EB: "Good harmony between the 3 solutions, perfectly matched."
IK: "The only strategic element is changed block of d6, otherwise the solutions are too homological, and the twinning is not pretty."
MP: "Three pleasant echoes with pawn double-step mates and different selfblocks on d6."
JL: "On the other hand, the opinions of the study participants differed very much here, yet the average ordering is the same! Surprising, isn't it?"

14. JT9 by Lev Grolman - average position: 7,7
EB: "Battery creating combined with pin mates and black reversal, good tries on B2 (2.R~?) prove that the rook must be pinned."
IK: "Familiar sequence of pins and unpins, in rather crowded setting."
MP: "I have excluded JT9 (this strategy of reversal black moves is well-known, compare this with C.Jonsson Springaren 1997 - a more economical version can be achieved, which saves 6 (!) units, has two solutions setting and still shows the exchange of functions between the WR and the WS in guarding the BK's field)."
JL: "The general feeling about the heaviness of the construction is once again confirmed by Mario. In some fields it is nowadays becoming very difficult to find something original."

15. JT1 by Helmut Zajic - average position: 9,6
EB: "Zajic theme but very simple."
IK: "The setting is not very spectacular for the idea this simple. There is a feeling it could be done with lesser amount of plugs."
MP: "I have excluded JT1 (neat Zajic theme but several problems exist showing similar exchange of functions of the thematic white pieces and more complex play)."
JL: "The views of participants converged on this problem..."

16. JT12 by Eliahu Fasher - average position: 9,8
EB: "Poor twinning of the mating piece and the idle WBb2 ruin the problem."
IK: "While being valid, this problem has unused white pieces due to the deficient symmetry between solutions. Moves by bK in h#2 with no good motivation diminish its strategic value, in our opinion."
MP: "I have excluded JT12 (the WBb2 is idle in b))."
PW: "The problem with the most impressive strategy. I would have placed this 1st or 2nd if not for the unused White bishop in (b)."
JL: "... on the other hand diverged a bit here. Personally, I especially liked the a) position, but I too found the unused wB in b) disturbing. Clearly, this was seen as an important defect, as nothing in the strategy shown required it to sleep. Compare that with the problem ranked best on average, by M. Degenkolbe!"

Further comments by participants

17. Generally, all participants expressed their interest in the results; indeed, the very fact that they participated confirmed it more than anything else, as they were willing to invest their time. MW wondered: "Have you gotten a lot of responses? If not, maybe you should try referring directly to some composers/judges." Frankly, I did not know what to expect from this experiment. I could not even imagine how much work I would have to do after receiving the results. I think I did enough advertising, and any composer active on the Internet might have heard about the study one way or another. I just think that the time given (about 1 month) was too short, and not everyone generally willing to participate could find enough free time. If there is a next study, the time for "judging" will have to be longer.

18. MP wrote much more than just an award, and you might find his ideas inspiring (next 4 paragraphs):

19. "Your recent experience as FIDE-Album judge has surely provided you satisfaction but mainly a large amount of work and sense of responsibility since a given composer gains titles from the FIDE-Albums points and his expectations are, rightly or wrongly, very high. You say 'FIDE Album is not only about giving a handful of honours. You have to judge all the problems, mark them with points' and 'it surely does not look well if the points given by two judges differ too much'. I entirely agree with you but let me add some other points.

20. First, I have to say that what I will say is of course my personal point of view and you can simply disagree but I try to convince you and the interested readers that my thoughts are not only important but also a real problem.

21. In the specific case of the FIDE Albums, the three judges mark the problems with points according to their personal liking, experience and, as a consequence, the knowledge of what has been produced in that particular field; all that explains quite often the difference (in some cases very large) in the single score and thus I think it will be a miracle if even two judges give the same points and make the same order in an award. But to give points is not, in my view, the only task of the three judges: I do think that they have to intensively search for anticipations. The now available computerized databases (the PDB is an extremely useful gold-mine), not to mention many personal problem collections, easily provide the necessary information.

22. Another extremely important task for a FIDE-Album judge (or perhaps for the section director) should be to check the construction of a given problem (economy of force or tempo, impure effects, weasels and so on), in order to verify whether the proposed setting is the best possible. In fact we have to remember what the FIDE-Album is: it is not only the route to obtaining titles but also an anthology, and that means that the problems included are (or should be) the best in the world in that three-year period. But as we all know, it does not happen very often and we see anticipated or badly constructed problems, not to mention those which are not worthy of being included at all (but this is another story)."

23. Here I dare to interrupt Mario's words with my own view. While I entirely agree with the need to check for anticipations, and indeed I think it is more feasible now than ever before, I cannot imagine how this construction checking would work in practice. Let's count: in the fairy section of the FIDE Album I had to judge about 1150 fairies. We do not need to bother with 2/3 of them, because their quality settles the matter even without a construction check. However, about 400 problems still remain. A thorough construction check might take some days... this is unbearable. Of course, there are suspicious positions that hint at construction weaknesses at first sight, but some are more subtle. Rather, I would argue from the original idea of the FIDE Albums: to be an anthology of works for the given time period. For me, even somewhat less well constructed problems deserve their place in the Album, provided that they are worthy representatives of their field in other respects.

24. MP continues and suggests: "What I said is valid not only for the specific situation of the FIDE-Album but it counts also for all the composing tourneys. As I said, this is only my personal point of view and I look forward to hearing from other interested people. In this regard, I too should like to invite you and your readers to try another little special tourney which is of course for judges but also for all those composers who take care over the construction and purity of effects. Please, don't rate this tourney as a school examination (there will not be any rejected, but on the other hand the winner will not win prizes!) but only an absolutely partial (take this adjective as the opposite of unimpartial) demonstration for what I said. I propose two single line helpmates in two moves and you are invited to find what their drawback is."

25. The positions provided by Mario are given below as Q1 and Q2. Please send your opinions on their drawbacks to juraj.lorinc@bigfoot.com.

26. A further comment by MW emphasizes a point already made on the older page I wrote a long time ago: "The problems you chose were very varied and quite in the same level, so the personal taste of each judge is a major factor in his judgement."

27. Finally, MP observed something really interesting: "I do not know whether this discussion about judging can be of interest at all and at the same time I hope you will receive many reactions from your readers but what I firmly think (I hope someone one day can explain the reasons?) is that this topic is not one of the most popular argument in our chess problem community."

28. Yes, indeed, as far as I remember, all discussions about judging have arisen as reactions to judgements that displeased those who started the discussion. Perhaps we are lucky this time, and my starting the discussion was at last motivated differently: by a lot of recent judging experience (by the way, I have already done 2 more judgements after finishing the FIDE Album...) and a bit of positive feedback.

29. How much did the participants' views on the problems differ? One way to measure this is to compute, for each problem, the standard deviation of its placings around its average placing. This gives the following table:

Problem	Composer	Standard deviation of placings
JT1	Helmut Zajic	1,34
JT5	Jan Kovalic	1,67
JT8	Mario Parrinello	2,06
JT11	Valerij Gurov	2,49
JT4	Paz Einat & Ofer Comay	2,59
JT9	Lev Grolman	2,99
JT3	Mirko Degenkolbe	3,35
JT7	Efren Petite	3,46
JT12	Eliahu Fasher	3,49
JT10	Menachem Witztum	3,61
JT2	Anatolij Stjopotchkin	3,65
JT6	Zoltan Labai	3,96
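
As a rough illustration of the computation described in paragraph 29, here is a minimal Python sketch. The function name placement_spread and the example placings are hypothetical and are not the study's data (individual orderings are not published in this report). The report also does not say whether the population or the sample formula was used, so the choice of the population formula here is an assumption.

```python
from statistics import mean, pstdev


def placement_spread(places):
    """Return (average placing, standard deviation) for one problem.

    `places` holds the placings one problem received from the judges.
    pstdev is the population standard deviation; whether the report used
    this or the sample formula is not stated, so this is an assumption.
    """
    return mean(places), pstdev(places)


# Hypothetical placings for illustration only; NOT the study's data.
example_places = {
    "problem A": [1, 2, 3, 2, 2],    # judges largely agree
    "problem B": [1, 12, 3, 10, 4],  # judges disagree strongly
}

for name, places in example_places.items():
    avg, spread = placement_spread(places)
    print(f"{name}: average {avg:.2f}, standard deviation {spread:.2f}")
```

A small standard deviation, as for JT1 in the table above, means the judges placed the problem similarly; a large one, as for JT6, means their placings were far apart.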

Summary of issues uncovered

30. In this study we have discovered no America. All the conclusions below are well known; however, they seem to remain valid even in a quickly changing world...

31. The search for comparison problems and anticipations is an important part of the judging process, whether in one's memory or in available databases. In the helpmate area we have the PDB or more specific personal collections, for twomovers there is Albrecht's collection, and for fairies there is the WinChloe database, containing a good deal of the best works of past and present. And the late Alexander Pituk once said: "Stupid problems cannot be anticipated!".

32. Remarks about construction are sometimes dangerous when they are not backed by the judge's own attempts to improve the construction. However, this is time-consuming and perhaps worth the effort only for the best problems in the competition.

33. One should not be fooled by theme names alone; the content of a problem should, more often than not, speak for itself. There are exceptions to this rule, especially with new and very difficult themes: they mean either exploring a completely new area or proving one's abilities by digging something new and really valuable out of already well-dug fields.

34. It is important to give explanatory comments whenever possible. This might help readers to understand the judge's way of thinking and perhaps mitigate some unnecessary polemics over the judgement.

35. The activity of pieces on the diagram is a tricky point. In some instances, pieces with little possible activity might help to direct attention to the main idea, which would be too hidden if more activity were possible. On the other hand, as JT6 and JT8 have shown, limiting the activity of pieces by blocking them is more often seen as a negative.

36. The viewpoints of a composer and a non-composer in the field might differ considerably. In the past, experience with leading solvers judging composing tourneys has shown this clearly.

37. Judging one's own works is very difficult. Usually it is not allowed in competitions, for obvious reasons. Composers tend to take into account not only the qualities of the final product, but also all the difficulties they had to overcome to reach the position as it is. There are, however, some competitions in which you have to judge your own works (Problem-Echo, some matches with special rules, to mention just those I have participated in). No one should then be surprised... by a strange outcome.

38. Different viewpoints might lead to the same conclusions - or to very different ones!

39. What makes a judgement good? When is a judge considered good? Possibly there is some general feeling that he does his work competently and, although everyone would make the ordering a bit differently, the judge does his own consistently and his preferred ordering does not differ too much from the reader's.

40. Please comment on anything in this report if you have a different view. The best way is to give the number of the paragraph concerned, as this allows better understanding.

Juraj Lörinc

41. P.S. Mario Parrinello has already provided more extensive comments on the judging process. I would say that his thoughts deserve more space, perhaps even in the printed media.

Versions, comparison problems and quiz problems

Alexandr Kuzovkov 1st Prize Goumondy-40 JT 1986	1.Rg4 e4+ 2.Rxe4 g7# 1.Qh2 g7+ 2.Rg6 g4# 1.Bf3 g4+ 2.Bxg4 e4#

h#2 (8+8) 3 solutions

Christer Jonsson Springaren 1997	1.Qxd4 Rg7 2.Qxc4 Rg2# 1.Qxc4 Bc7 2.Qxd4 Bxf4#

h#2 (11+5) 2 solutions

Paz Einat Ofer Comay 5th Comm e.a The Problemist 2000 version Mario Parrinello	a) 1.Kxd5 Kf4 2.Kc4 Ke5# b) 1.Kf2 Ke4 2.Kg2 Ke3#

h#2 (5+11) b) e3 «-» e5

+++ Composition In the Spotlight (CIS) No. 26 +++

Spotlight comment by Juraj Lörinc: In 2004 I ran an experimental study of judging, concentrating on helpmates in two moves. The starting file (a kind of announcement) gives explanations and the 12 problems to be judged by the experiment participants. The present file is in fact a report on the experiment's results, with comments by participants and some conclusions. It may indeed be interesting even today; almost 10 years have not changed much, though perhaps something has changed. Among the experiment participants, the input of Mario Parrinello was exceptional. Not only did he comment on all the problems involved, he also improved the construction of some of them and provided some general remarks. This spotlight problem is one of Mario's improved versions. The play of the wK in short helpmates is usually an interesting element. Here the white king marches to the squares occupied by the bK in the starting positions, while the bK goes to the battery lines. Of course, there are some disharmonic elements, but the play of the kings alone is interesting enough.

Lev Grolman Sachmatnaja Poezija 2000 version Mario Parrinello	1.Rxc7 Sc5 2.Rxg7 Scb3# 1.Rxg7 Rf6 2.Rxc7 Rf2#

h#2 (12+3) 2.1.1.1

Mario Parrinello Original - quiz problem Q1	1.Rxc5+ Sg5 2.Rc4 Se6#

h#2 (6+2)

Mario Parrinello Original - quiz problem Q2	1.g5 Sa5+ 2.Ka4 Be8#

h#2 (5+5)

Comments to Juraj Lörinc.