The first study of judging - report
A preliminary version of this report was prepared in mid-February and made available to the study participants
for comments. This published version does not differ substantially from the preliminary one.
1. First, let me express my thanks to the 5 composers - judges who dared to participate in the first experimental
study about judging helpmates. It was surely time-consuming to look at the problems critically and to write down
their opinions; I am really grateful and happy that they have contributed. Reportedly, a few more people
got involved and reached some stage of the process,
but some of them just did not hit the "Send" button in their e-mail clients...
2. The five contributors are:
- Evgeny Bourd (EB), Israel
- Ilja Ketris (IK), Latvia
- Mario Parrinello (MP), Italy
- Menachem Witztum (MW), Israel
- Peter Wong (PW), Australia
3. They approached the task in very varied ways, from just ordering the problems to making a complete
judgement and even more. This made the analysis of the results a bit difficult, but after some consideration
I decided to take only their orderings as the main quantitative measure. Obviously, I had to make some decisions
in the process. However, below you can find, hopefully in full, their commentaries along with some of my own
remarks summarizing the findings provided by each particular problem. As it was not the aim of the study to evaluate
the abilities of particular judges - participants, I have decided not to publish individual judgements or to measure
their individual outcomes (except in a few reasonable cases).
Further below there are a few issues that were uncovered and that in my opinion deserve explicit mention.
Comments to problems
4. Now, let's look at the comments received on all problems, in the order of their average placement by our jury.
(As we had only 5 participants in the study, none of the numeric results is likely to be statistically
significant; they do, however, indicate at least something.)
5. JT3 by Mirko Degenkolbe - average position: 3,3
EB: "Very good geometry in the solutions using the WQh8 "shooting" both corners, the idle white
piece doesn't bother me at all. It is very interesting how the 3 pieces in the 1st rank are cleared."
IK: "Reversed phoenix achieved quite imaginatively. Remarkable economy."
MP: "An original interpretation of one of the oldest
motifs, castling, exploited for battery mates and enriched here by Phoenix
effects and pin mates."
PW: "As you say, very original. The White pieces unused in the mates don't bother me because
they take a very active part in the play."
JL: "Well, now I can say it openly, without affecting the outcome: it was I who placed this
helpmate almost at the top of the 12 awarded helpmates in the international match with the theme 'h#2 with royal battery'.
I dared to do so despite the white B/S pair having no active role in the mates. I am quite relieved that the participants
of this judging study have in a sense confirmed this slightly bold step of mine. Or was my remark about the originality
of the concept too influential? See also the ideas about general feeling and the importance of comments below."
6. JT7 by Efren Petite - average position: 4
EB: "Amusing problem where all white moves are made by 3 pawns. The play is rather obvious."
IK: "Extremely attractive cycle of moves. Technical pieces seem to be unavoidable."
MP: "I have excluded JT7 (anticipated by A. Kuzovkov 1st Prize Goumondy-40 JT 1986)."
PW: "The two tempo checks on b3/d3 are a bit familiar, but the addition of a third tempo
check to generate a White move cycle is excellent."
JL: "This is by no means meant as a criticism of the other
participants; rather, it is very useful that they approached the study differently. But thanks to Mario we have
uncovered two important things that I had felt before. Firstly, one should try to look for comparison problems
whenever possible (see also paragraph 21 below).
The other matter is remarks about construction: one should be very careful about these; my own
judging practice has taught me that lesson."
7. JT8 by Mario Parrinello - average position: 4,25
EB: "Zilahi with pin mate of the same piece on different lines, looks hard to achieve but the WQh3
seems out of the game. The big amount of pieces just not to allow BSd4 to "jump" is also not very pretty."
IK: "ASP involving bS with good ODT, but the Zilahi seems a bit artificial."
JL: "Of course, the white B/R were placed on the destination squares of the wS on purpose; that points to
the fact that themes might be attained in varying fashions and not every embodiment of a named theme is
equally valuable. One should not be fooled by a theme name! By the way, Mario has not judged his own helpmate,
a gentlemanly approach, provoking the question of how far anyone is able to judge the qualities of his own works."
8. JT6 by Zoltan Labai - average position: 5,2
EB: "The "forced" BQ captures of the pawns are unharmonic, once to open BBb6 and once to allow c5 but overall
nice pin mates."
IK: "Fresh productive design, ASP by bQ, every pawn clears 2 lines in each solution.
Economy seems adequate."
MP: "Two good anticipatory selfpins by the BQ
for line opening purpose but the twinning mechanism seems to me rather weak
and the BQ has very limited space."
JL: "Rather than summarizing findings after reading the comments of participants, I would first like to
express my way of understanding this helpmate. For me the key to the content is the twinning device. Rh4 frees the S for mating,
provided that d4 and d3 are blocked or guarded. Rd1 allows d3#, provided that e3 is blocked or guarded. Here I see
the source of imbalance - showing itself also in Evgeny's comment about the imbalanced motivation of the bQ's role,
which approaches the content from another point of view. The same thing might be felt differently. Also, note what is
said in two comments - the activity of a piece plays an important role - see the general comment below."
9. JT11 by Valerij Gurov - average position: 5,2
EB: "3 different pawn batteries but the move Sa1 in the 3rd solution ruins the harmony."
IK: "Attractive mechanism in a healthy setting, 2... Sa1 in 3) is quite unexpected, but 2nd and 3rd solutions
are way too symmetrical."
MP: "This problem shows original anticritical moves by
the rear piece of a battery. Though the motivation of B2 in one solution
differs from the other two and on the other hand it seems unavoidable, the
real merit of this difficult strategy is that the author has achieved a
three-solutions setting not being satisfied with only two solutions."
JL: "On the surface there is a fact probably discovered by many helpmate composers - that making 3 analogous
solutions is usually much more difficult than making just two. I have experienced it many times myself, and even
one of my early helpmates with 3 solutions
(with grasshoppers, ack, sorry in this orthodox context) got quite a high distinction in the tourney although
I did not like it very much! ... just because of the 3 solutions, I suppose. Mario, being an expert h#2 composer,
has put Gurov's h# at the top of his award, and the second experienced h#2 composer, Menachem, gave it the
2nd highest rank of all study participants. This might indicate that having one's own composing experience in the
judged field might affect one's preferences quite a lot. This is not to say that being a good composer is
necessary for being a good judge or vice versa - just to say that the viewpoint of a composer active in the area
might be very different from that of an occasional judge."
10. JT10 by Menachem Witztum - average position: 5,9
EB: "Very disharmonic but interesting line play, the moves Sd6 and Se3 that appear twice is a nice touch."
IK: "Two solutions correspond in an unusual fashion. The quality of rendering is exemplary."
MP: "Complete change of motivations of a similar
move (Sd6) by both sides (unpin both of a black and white piece in a) and
check avoidance in b); unfortunately the impure simultaneous unpin in a)
spoils the clarity of the play."
JL: "For once, I have nothing special to add..."
11. JT4 by Paz Einat & Ofer Comay - average position: 6,3
EB: "Interesting royal batteries but the play itself is not harmonic, once WBd5 is captured, the
piece that mates, while in the second solution WSf2 is captured, a piece that only guards d3."
IK: "King dances are entertaining, but I wish wRh4 were sacrificed thematically. Also pity about
MP: "What is visually attractive is not only
the creation of two different royal batteries but the paradoxical double
exchange of place between both Kings in the twinning mechanism and especially
in the mates. A more economical version, which saves 3 units, can be achieved."
JL: "Here my own feeling did not quite correspond with that of the participants. One disturbing
element is the disharmonic treatment of the rear pieces of the batteries - in a) the wB is captured,
in b) Rh4 guards squares - for me acceptable, as it was not the core of the problem. Further,
I like the use of Rb1: despite being away from the main action, it proves useful in both positions.
Finally, Mario has again proved that it is always necessary to question the construction of a problem,
even if it is time-consuming."
12. JT5 by Jan Kovalic - average position: 7,6
EB: "Good problem but the idle WBg7 in 3rd and 4th solution don't allow a higher placing."
IK: "Nice mixture of strategic elements, albeit not very complex, achieved in a “helpmate of the future”
setting. The symmetry is discouraging though."
MP: "I have excluded JT5 (too symmetrical)."
JL: "It is interesting to find that despite harsh comments this problem did not finish near the very bottom of
the averaged orderings. Moreover, as you might find below, all judges ordered it more or less similarly.
This might indicate that sometimes expressions in words do not correspond to the quantitative expression.
Which one is then more important or reliable?"
13. JT2 by Anatolij Stjopotchkin - average position: 7,6
EB: "Good harmony between the 3 solutions, perfectly matched."
IK: "The only strategic element is changed block of d6, otherwise the solutions are too homological, and the
twinning is not pretty."
MP: "Three pleasant echoes with pawn double-step mates and different selfblocks on d6."
JL: "Here, on the other hand, the opinions of the study participants differed very much, yet the average
position is the same! Surprising, isn't it?"
14. JT9 by Lev Grolman - average position: 7,7
EB: "Battery creating combined with pin mates and black reversal, good tries on B2 (2.R~?) prove that the
rook must be pinned."
IK: "Familiar sequence of pins and unpins, in rather crowded setting."
MP: "I have excluded JT9 (this strategy of reversal black moves is well-known, compare this with
C.Jonsson Springaren 1997 - a more economical version can be achieved,
which saves 6 (!) units, has two solutions setting and still shows the exchange of functions between the WR and the
WS in guarding the BK's field)."
JL: "The general feeling about heaviness of construction is once again proved by Mario. In some
fields it is nowadays becoming very difficult to find something original."
15. JT1 by Helmut Zajic - average position: 9,6
EB: "Zajic theme but very simple."
IK: "The setting is not very spectacular for the idea this simple. There is a feeling it could be done with
lesser amount of plugs."
MP: "I have excluded JT1 (neat Zajic theme but several problems exist showing
similar exchange of functions of the thematic white pieces and more complex play)."
JL: "The views of participants converged on this problem..."
16. JT12 by Eliahu Fasher - average position: 9,8
EB: "Poor twinning of the mating piece and the idle WBb2 ruin the problem."
IK: "While being valid, this problem has unused white pieces due to the deficient symmetry between solutions.
Moves by bK in h#2 with no good motivation diminishes its strategic value, in our opinion."
MP: "I have excluded JT12 (the BBb2 is idle in b)."
PW: "The problem with the most impressive strategy. I would
have placed this 1st or 2nd if not for the unused White bishop in (b)."
JL: "... but on the other hand diverged a bit here. Personally, I especially liked the a)
position, but I too found the unused wB in b) disturbing. Clearly, this was seen as an important defect, as
the strategy shown did not require it to stay idle. Compare that to the problem by M. Degenkolbe, ranked the best on
average!"
Further comments by participants
17. Generally, all participants have expressed their interest in the results - indeed, the fact that they participated
has confirmed it more than anything else, as they were willing to invest their time. MW wondered:
"Have you gotten a lot of responses? If not, maybe you should try referring directly to some composers\judges."
Frankly, I did not know what to expect from this experiment. I could not even imagine how much work I would have
to do after receiving the results. I think I did enough advertising, and any composer active on the Internet might have
heard about the study one way or another. I just think that the time given (about 1 month) was too short and not
everyone willing to participate in general might have found enough free time. If there is another study,
the time for "judging" will have to be longer.
18. MP has written much more than just an award and you might find his ideas inspiring (next 4 paragraphs):
19. "Your recent experience as FIDE-Album judge has surely
provided you satisfaction but mainly a large amount of work and sense of
responsibility since a given composer gains titles from the FIDE-Albums
points and his expectations are, rightly or wrongly, very high. You say
'FIDE Album is not only about giving a handful of honours. You have to judge
all the problems, mark them with points' and 'it surely does not look
well if the points given by two judges differ too much'. I entirely agree
with you but let me add some other points.
20. First, I have to say that what
I will say is of course my personal point of view and you can simply disagree
but I try to convince you and the interested readers
that my thoughts are not only important but also a real problem.
21. In the specific case of the FIDE Albums, the three judges mark the problems with
points according to their personal liking, experience and, as a consequence,
the knowledge of what has been produced in that particular field; all that
explains quite often the difference (in some cases very large) in the single
score and thus I think it will be a miracle if only two judges give the same
points and make the same order in an award. But to give points is not, in
my view, the only task of the three judges: I do think that they have to
intensively search for anticipations. The now available computerized databases
(the PDB is an extremely useful gold-mine), not to mention many personal
problem collections, easily provide the necessary information.
22. Another extremely important task for a FIDE-Album judge (or perhaps for section
director) should be to check the construction of a given problem (economy
of force or tempo, impure effects, weasels and so on), in order to verify
whether the proposed setting is the best possible. In fact we have to remember
what is FIDE-Album: it is not only the route to obtain titles but is also
an anthology and that means that the problems included are (or should be)
the best in the world in that three-year period. But as we all know, it
does not happen very often and we see anticipated or badly constructed problems,
not to mention those which are not worthy of being included at all (but
this is another story)."
23. Here I dare to interrupt Mario's words with my own view. While I entirely
agree with the need to check for anticipations, and indeed I think it is more feasible now than
ever before, I cannot imagine
how this construction checking would work in practice. Let's count: in the fairy
section of the FIDE Album I had to judge about 1150 fairies. We do not need to bother with
2/3 of them, as their quality rules them out even without a construction check. However, still about
400 problems remain. A thorough construction check might take some days... this is
unbearable. Of course, there are suspicious positions that hint at construction
weaknesses at first sight, but some are more subtle. Rather, I would argue
from the original idea of the FIDE Albums - to be an anthology of
works for the given time period. For me, even somewhat worse constructed problems deserve their
place in the Album provided that they are worthy representatives of their field
in other respects.
24. MP continues and suggests: "What I said is valid not only for the specific situation
of the FIDE-Album but it counts also for all the composing tourneys. As
I said, this is only my personal point of view and look forward to hearing
from other interested people. In this regard, I too should like to invite
you and your readers to try another little special tourney which is of course
for judges but also for all those composers who take care of the construction
and purity of effects. Please, don't rate this tourney as a school examination
(there will not be any rejected, but on the other hand the winner will not
win prizes!) but only an absolutely partial (take this adjective as the
opposite of unimpartial) demonstration for what I said. I propose two single
line helpmates in two moves and you are invited to find what is their drawback."
25. Find here the positions provided by Mario: Q1, Q2.
Please, send your opinion on drawbacks of these to
26. A further comment by MW emphasizes a point already made on
the older page I wrote a long time ago:
"The problems you chose were very varied and quite in the same level, so the personal taste
of each judge is a major factor in his judgement."
27. Finally, MP observed something really interesting:
"I do not know whether this discussion about judging can be of interest at
all and at the same time I hope you will receive many reactions from your
readers but what I firmly think (I hope someone one day can explain the
reasons?) is that this topic is not one of the most popular argument in
our chess problem community."
28. Yes, indeed, as far as I remember all discussions about judging appeared as
reactions to judgements that displeased the initiators of discussion. Perhaps we have
good luck now and the initiation of discussion by me was finally motivated differently -
by a lot of recent judging experience (by the way, I have already done 2 more judgements after
finishing FIDE Album...) and a bit of positive feedback.
29. How much did the views of participants on the problems differ? One way to measure this might be
to compute the standard deviation of the places given to each problem around its average placing. This gives the following table:
|JT4||Paz Einat & Ofer Comay||2,59|
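The computation described in paragraph 29 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the placements below are invented (the individual orderings were not published), the helper names are hypothetical, and the report does not say whether the population or the sample form of the standard deviation was used, so the population form is assumed here.

```python
from statistics import mean, pstdev


def placement_spread(places):
    """Return (average placing, standard deviation around that average).

    `places` holds the positions (1 = best) given to one problem by the
    judges who ranked it. pstdev is the population standard deviation;
    this choice is an assumption, statistics.stdev would give the
    (somewhat larger) sample version instead.
    """
    return mean(places), pstdev(places)


def with_decimal_comma(value, digits=2):
    """Format a number with a decimal comma, as in the report's tables."""
    return f"{value:.{digits}f}".replace(".", ",")


# Hypothetical placements by five judges, invented for illustration only.
example_places = [2, 4, 6, 8, 10]
avg, spread = placement_spread(example_places)
print(with_decimal_comma(avg), with_decimal_comma(spread))  # prints: 6,00 2,83
```

A judge who excluded a problem gave it no place, so his entry would simply be left out of the list for that problem; how such exclusions were actually handled in the study is not stated.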
Summary of issues uncovered
30. In this study we have discovered no America. All conclusions below are known; however, they seem to
remain valid even in a quickly changing world...
31. The search for comparison problems and anticipations is an important part of the judging process,
either in one's memory or in available databases. In the helpmate area we have the PDB or some
more specific personal collections, for twomovers there is Albrecht's collection, for
fairies there is the WinChloe database containing a good deal of the best works of past and present
- and the late Alexander Pituk once said: "Stupid problems cannot be anticipated!".
32. Remarks about construction are sometimes dangerous when they are not backed by the judge's own
attempts to improve the construction. However, this is time-consuming and perhaps worth the effort only for the best problems
in the competition.
33. One should not be fooled just by theme names; the content of a problem should more often than not
speak for itself. There are exceptions to this rule, especially with new and very difficult
themes: they mean either exploring a completely new area or proving one's abilities by digging something new
and really valuable out of already dug fields.
34. It is important to give explanatory comments whenever possible. This might help readers to
understand the judge's way of thinking and perhaps mitigate some unnecessary polemics over the judgement.
35. The activity of pieces on the diagram is a tricky point. In some instances pieces with little possible activity
might help to turn attention to the main idea, which would be too hidden with more possible activity.
On the other hand, as JT6 and JT8 have shown, limiting the activity of pieces by blocking them
is more often seen as negative.
36. Viewpoints of composer and noncomposer in the field might differ considerably. In the past,
experience with leading solvers judging composing tourneys has shown this clearly.
37. Judging one's own works is very difficult. Usually it is not allowed in competitions, for obvious reasons. Composers
tend to take into account not only the qualities of the final product, but also all the difficulties they had to overcome to
get the position as it is. There are, however, some competitions where you have to judge your own works (Problem-Echo,
some matches with special rules, to mention just those I participated in). No one should be surprised then...
by a strange outcome.
38. Different viewpoints might lead to the same conclusions - or very different!
39. What makes a judgement good? When is a judge considered good? Possibly there is some general feeling
that he does his work competently and, although everyone would make the ordering a bit different, the judge makes his own
consistently, and his preferred ordering does not differ too much from the reader's.
40. Please comment on anything in this report if you have a different view. The best way would be to
give the number of the paragraph concerned, as that allows better understanding.
41. P.S. Mario Parrinello has already provided more extensive comments on the judging process; I would say that his
thoughts deserve some more space, perhaps even in the printed media.
Versions, comparison problems and quiz problems