Notation

T true if an event had a hit in the tagger, false otherwise
R true if an event passed the recoil trigger, false otherwise
C true if an event had a hit in the CPV, false otherwise
[A] the total number of events satisfying logical condition A
!A the logical NOT of condition A
A == B statement that events satisfying condition A are the same as those satisfying condition B
A =~ B statement that events satisfying condition A are statistically typical of those satisfying condition B but the sets are not necessarily equal (this implies [A] = [B])
A.B logical AND of conditions A and B for the same event
A|B logical OR of conditions A and B for the same event
A&B true for some event a if A is true for that event and, within some well-defined time interval that contains a, there exists at least one event for which B is true
A*B true for some event a if A is true for that event and, within some well-defined time interval which excludes a, there exists at least one event for which B is true
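
The operators above can be made concrete with a small sketch.  The Python
below is illustrative only: the Event record, the coincidence() helper, the
window limits and the four toy events are all assumptions made for this
example and are not taken from the analysis code.  It treats & and * as the
same search over neighbouring events, differing only in whether the time
window contains the event itself.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float  # event time in arbitrary units (hypothetical field)
    R: bool      # passed the recoil trigger
    T: bool      # hit in the tagger
    C: bool      # hit in the CPV


def coincidence(events, a, b, lo, hi):
    """Count events satisfying a for which some event in [t+lo, t+hi] satisfies b.

    A window with lo <= 0 <= hi contains the event itself (the & operator);
    a window with lo > 0 or hi < 0 excludes it (the * operator).
    """
    n = 0
    for ev in events:
        if not a(ev):
            continue
        if any(b(other) and lo <= other.time - ev.time <= hi for other in events):
            n += 1
    return n


# Toy data, invented for illustration.
events = [
    Event(0.0, R=True, T=True, C=False),    # true R.T event
    Event(5.0, R=True, T=False, C=False),   # recoil with no tag of its own
    Event(5.5, R=False, T=True, C=False),   # unrelated tag close in time
    Event(15.0, R=False, T=True, C=False),  # unrelated tag far away in time
]

n_dot = sum(1 for ev in events if ev.R and ev.T)                              # [R.T]
n_amp = coincidence(events, lambda e: e.R, lambda e: e.T, lo=-1.0, hi=1.0)    # [R&T]
n_star = coincidence(events, lambda e: e.R, lambda e: e.T, lo=9.0, hi=11.0)   # [R*T]
print(n_dot, n_amp, n_star)
```

In this toy sample the in-time count [R&T] exceeds [R.T] by one random
coincidence, and the out-of-time window picks up one accidental as well.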

Logic

What do we want?
[R.T.!C]

What do we measure directly?
only coincidence counts, such as [R&T], [T*C] and the like

So what are these counters in the ntuple?
coint() = [R&T]
coina() = [R*T]
vetot() = [(R&T).!(R&C)]
vetoa() = [(R*T).!(R&C)]

What is the claim in the paper?
[R.T.!C] = [(R&T).!(R&C)] - [(R*T).!(R&C)]

Proof:
axiom 0: A.A == A
axiom 1: A.B == B.A
axiom 2: A*B =~ B*A
axiom 3: (A.B).C == A.(B.C) == (A.C).B

theorem 1: A&B =~ (A.B)|(A*B)
    This is a consequence of the independence of different events and the
    fact that they are randomly and uniformly distributed in time.  It states
    that the only way for an event to satisfy A&B is either for the event
    to satisfy A.B or for it to satisfy A and to be found in random 
    coincidence with a second event which satisfies B.  Any random interval
    which excludes the first event may be used to evaluate the condition A*B,
    with the caveat that if A or B themselves contain a * operator in their
    definition then each * found in the composite expression must be evaluated
    on a different random interval.
theorem 2: [A&B] = [A.B] + [A*B] - [(A.B).(A*B)]
    This is a general theorem from set theory.  The caveat explained above
    under theorem 1 regarding repeated use of the * operator in any logical
    expression applies here as well.
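
Theorem 2 is the inclusion-exclusion rule for the union of two sets.  As a
minimal sketch, the Python below checks it on two small sets of event
indices.  The sets are invented for illustration, and A&B is taken to be
exactly the union (A.B)|(A*B), whereas theorem 1 only asserts this in the
statistical sense of =~.

```python
def count_union(x, y):
    """Return the size of the union of x and y by inclusion-exclusion."""
    return len(x) + len(y) - len(x.intersection(y))


# Hypothetical event indices satisfying each condition.
a_dot_b = {0, 1, 2}   # events with A.B
a_star_b = {2, 3}     # events with A*B (event 2 satisfies both)

a_amp_b = a_dot_b.union(a_star_b)       # A&B, treated here as the exact union
print(len(a_amp_b), count_union(a_dot_b, a_star_b))
```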

[R.T.!C] = [(R.T).!C]
         = [(R.!C).T]
         = [(R.!C)&T] - [(R.!C)*T] + [((R.!C).T).((R.!C)*T)]
         = [(R.!C).(R&T)] - [(R.!C).(R*T)] + [((R.!C).T).((R.!C)*T)]

The first term in this expression [(R.!C).(R&T)]
         = [R&T] - [(R.C).(R&T)]

The last term in this expression [(R.C).(R&T)]
         = [(R&C).(R&T)] - [(R*C).(R&T)]
           + [(R.C).(R*C).(R&T)]

Putting these results together gives [R.T.!C]
         = [R&T] - [(R&C).(R&T)] + [(R*C).(R&T)]
           - [(R.C).(R*C).(R&T)] - [(R.!C).(R*T)] + [((R.!C).T).((R.!C)*T)]
         = [(R&T).!(R&C)] - [(R.!C).(R*T)]
           + [(R*C).(R&T)] - [(R.C).(R*C).(R&T)]
           + [((R.!C).T).((R.!C)*T)]

Now the second term in this expression [(R.!C).(R*T)]
         = [R*T] - [(R.C).(R*T)]

The last term in this expression [(R.C).(R*T)]
         = [(R&C).(R*T)] - [(R*C).(R*T)] + [(R.C).(R*C).(R*T)]

Putting these results together gives [R.T.!C]
         = [(R&T).!(R&C)] - ([R*T] - [(R&C).(R*T)])
           - [(R*C).(R*T)] + [(R.C).(R*C).(R*T)]
           + [(R*C).(R&T)] - [(R.C).(R*C).(R&T)]
           + [((R.!C).T).((R.!C)*T)]
         = [(R&T).!(R&C)] - [(R*T).!(R&C)]
           - [(R*C).(R*T)] + [(R*C).(R&T)]
           - [(R*C).(R.C).(R&T)] + [(R*C).(R.C).(R*T)]
           + [((R.!C).T).((R.!C)*T)]

Applying the same expansion of R.C repeatedly gives, for any expression A,
[A.(R.C).(R*C)]
         = [A.(R&C).(R*C)] - [A.(R*C).(R*C)] + [A.(R.C).(R*C).(R*C)]
         = [A.(R&C).(R*C)] - [A.(R*C)^2]
           + [A.(R&C).(R*C)^2] - [A.(R*C)^3]
           + [A.(R&C).(R*C)^3] - [A.(R*C)^4]
           + ...
         = [A.(R&C).(R*C)]
           - [A.!(R&C).(R*C)^2]
           - [A.!(R&C).(R*C)^3]
           - ...

Combining this with the above results gives [R.T.!C]
         = [(R&T).!(R&C)] - [(R*T).!(R&C)]
           - [(R*C).(R*T)] + [(R*C).(R&T)]
           - [(R&T).(R&C).(R*C)]
             + [(R&T).!(R&C).(R*C)^2]
             + [(R&T).!(R&C).(R*C)^3]
             + ...
           + [(R*T).(R&C).(R*C)]
             - [(R*T).!(R&C).(R*C)^2]
             - [(R*T).!(R&C).(R*C)^3]
             - ...
           + [((R.!C).T).((R.!C)*T)]
         = [(R&T).!(R&C)] - [(R*T).!(R&C)]
           + [(R&T).!(R&C).(R*C)]
             + [(R&T).!(R&C).(R*C)^2]
             + [(R&T).!(R&C).(R*C)^3]
             + ...
           - [(R*T).!(R&C).(R*C)]
             - [(R*T).!(R&C).(R*C)^2]
             - [(R*T).!(R&C).(R*C)^3]
             - ...
           + [((R.!C).T).((R.!C)*T)]
         = [(R&T).!(R&C)]
             + [(R&T).!(R&C).(R*C)^1]
             + [(R&T).!(R&C).(R*C)^2]
             + [(R&T).!(R&C).(R*C)^3]
             + ...
           - [(R*T).!(R&C)]
             - [(R*T).!(R&C).(R*C)^1]
             - [(R*T).!(R&C).(R*C)^2]
             - [(R*T).!(R&C).(R*C)^3]
             - ...
           + [((R.!C).T).((R.!C)*T)]

But for any condition A, [A.R.(R*C)^N] = [A.R]*fC^N, where N is any integer
and fC is a universal constant which depends on [C] and the gate width
implicit in the * operator.  On the right-hand side of this relation * denotes
ordinary multiplication.  Note that fC satisfies 0 <= fC <= 1.

Substitution into the above expression gives [R.T.!C]
         = ( [(R&T).!(R&C)] - [(R*T).!(R&C)] ) / (1-fC)
           + [(R.T).!(R.C).(R*T)]
So there are two corrections to the claim in the paper.  First, the leading
term acquires an overall normalization factor 1/(1-fC), which is of order
5/3.  Second, the additional term [(R.T).!(R.C).(R*T)] must be included.
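
The normalization factor comes from summing the series in powers of (R*C)
as a geometric series in fC.  The short Python sketch below compares the
partial sum with the closed form 1/(1-fC).  The value fC = 0.4 is an
assumption, chosen only because it reproduces the quoted factor of order
5/3; the function name is likewise hypothetical.

```python
def series_factor(f_c, n_terms):
    """Partial sum 1 + f_c + f_c^2 + ... with n_terms terms."""
    return sum(f_c ** n for n in range(n_terms))


f_c = 0.4                         # assumed value, gives 1/(1-fC) = 5/3
closed_form = 1.0 / (1.0 - f_c)   # sum of the full geometric series

for n_terms in (1, 2, 5, 20):
    print(n_terms, series_factor(f_c, n_terms), closed_form)
```

The partial sums approach the closed form quickly for this value of fC, so
truncating the series after the first term, as the claim in the paper
effectively does, underestimates the leading term by the factor 1-fC.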

For any expression A the following relation holds:
[A.(R.T)]
         = [A.(R&T)] - [A.(R*T)] + [A.(R.T).(R*T)]
         = [A.(R&T)] - [A.(R*T)]
           + [A.(R&T).(R*T)] - [A.(R*T)^2]
           + [A.(R&T).(R*T)^2] - [A.(R*T)^3]
           + ...
         = [A.(R&T)]
           - [A.!(R&T).(R*T)^1]
           - [A.!(R&T).(R*T)^2]
           - ...
         = [A]
           - [A.!(R&T).(R*T)^0]
           - [A.!(R&T).(R*T)^1]
           - ...
         = [A] - [A.!(R&T)] / (1-fT)

This implies for the second term above [(R.T).!(R.C).(R*T)]
         = [(R.T).(R*T)] - [(R.T).(R.C).(R*T)]
         = [R*T] - [(R*T).!(R&T)]/(1-fT)
           - [(R.C).(R*T)] + [(R.C).(R*T).!(R&T)]/(1-fT)
         = [R*T] - [(R*T).!(R&T)]/(1-fT)
           - [R*T] + [(R*T).!(R&C)]/(1-fC)
           + [(R*T).!(R&T)]/(1-fT)
           - [(R*T).!(R&T).!(R&C)]/((1-fC)*(1-fT))
         = [(R*T).!(R&C)]/(1-fC)
           - [(R*T).!(R&T).!(R&C)]/((1-fC)*(1-fT))

Finally this leads to [R.T.!C]
         = ( [(R&T).!(R&C)] - [(R*T).!(R&C)] ) / (1-fC)
           + [(R*T).!(R&C)]/(1-fC)
           - [(R*T).!(R&T).!(R&C)]/((1-fC)*(1-fT))
         = [(R&T).!(R&C)]/(1-fC)
           - [(R*T).!(R&T).!(R&C)]/((1-fC)*(1-fT))

This is the result we have been looking for.  The factor fT is only a few
percent for an individual tagger channel, and the condition !(R&T) is
approximately equivalent to a factor (1-fT), so the principal difference
between the last equation and the original result

         [(R&T).!(R&C)] - [(R*T).!(R&C)]

is only the overall scale factor 1/(1-fC).  This does not change the shape
of the spectra, so for most purposes it is irrelevant.
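
As a rough numerical illustration, the Python sketch below evaluates both the
original expression and the final result.  All function and argument names
are hypothetical, and the counts and the values of fC and fT are invented.
Note that the count [(R*T).!(R&T).!(R&C)] is not one of the four ntuple
counters listed earlier, so it appears here as an assumed extra input.

```python
def paper_estimate(vetot, vetoa):
    """Original claim: [(R&T).!(R&C)] - [(R*T).!(R&C)]."""
    return vetot - vetoa


def corrected_estimate(vetot, n_acc_no_intime, f_c, f_t):
    """Final result: vetot/(1-fC) - n_acc_no_intime/((1-fC)*(1-fT)).

    n_acc_no_intime stands for [(R*T).!(R&T).!(R&C)] (assumed input).
    """
    scale = 1.0 / (1.0 - f_c)
    return scale * vetot - scale * n_acc_no_intime / (1.0 - f_t)


# Invented numbers: fT of a few percent, and !(R&T) acting as a factor (1-fT).
vetot, vetoa = 1000.0, 100.0
f_c, f_t = 0.4, 0.04
n_acc_no_intime = vetoa * (1.0 - f_t)

original = paper_estimate(vetot, vetoa)
corrected = corrected_estimate(vetot, n_acc_no_intime, f_c, f_t)
print(original, corrected, corrected / original)
```

With these inputs the two estimates differ only by the scale factor
1/(1-fC), which is the behaviour described in the text.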