Sendmail/Cyrus issues and integration

This is a follow-up to the mail server upgrade from a few weeks ago. We wound up upgrading Sendmail and Cyrus to the FC3pre versions (8.13.1 and 2.2.6, respectively). These simple upgrades didn’t fix things, but I think the following did:

The mail server at work had a few intermittent issues, which I suspect were due to the way we had Sendmail and Cyrus communicating with each other to check for existing user accounts. We do that check so that old or fake addresses get an immediate REJECT, which prevents delayed bounce messages.

I’m not sure what the exact problem was, but I think we were seeing two things, though they may have the same underlying cause. The most obvious was messages stuck in the queue after processing by milter and before delivery to LMTP. It was obvious because it happened fairly frequently, so we could see it easily with a ps. Interestingly, the Sendmail queue serial numbers on the vast majority of these were consecutive: two or three messages in a row would get stuck.

The less obvious thing was hung SMTP connections that seemed to be stuck before milter, while Sendmail was checking for account existence by querying LDAP. This happened less frequently, and it was masked by our quick-and-dirty remedy for the first problem, which was a hard restart of all sendmail processes. It became apparent when we shifted to killing the individual from-queue sendmail processes: we could then observe the stuck SMTP processes, correlate them with netstat output, and see that they were stuck with open LDAP sessions, presumably just after the RCPT line in the SMTP conversation. Milter isn’t hit at this point, as the DATA portion of SMTP hasn’t started.

The fix was a configuration change in the way Sendmail and Cyrus communicate the existence of user mailboxes. Such communication is necessary to avoid delayed bounces for unknown users, which got us listed in Spamcop a few months ago. Until the most recent Sendmail upgrade last week (when we rolled 8.13 from FC3pre), we could not query Cyrus directly and had to rely on an indirect procedure, where Sendmail would query passwd/PAM (which then talked to LDAP). The assumption was that only legitimate accounts (those found in LDAP) should have deliverable mailboxes. Sendmail 8.13 allows SOCKETMAP as a compilation option, though we had to do our own build of sendmail to turn on the new feature and use hacked/contributed m4 maps to make it work. SOCKETMAP allows direct communication with non-standard mail spools such as Cyrus, so Sendmail no longer has to (implicitly) check with a third source for account information.
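For the curious, here is a rough Python sketch of what a socket map lookup looks like on the wire, as I understand the Sendmail documentation: the request is a netstring containing the map name and the key, and the reply is a netstring whose first word is a status such as OK or NOTFOUND. This is not what we run in production. The function names are mine, and the socket path and the map name "cyrus" used in the usage note are placeholders, not values taken from our configuration.

```python
import socket
from typing import Tuple


def encode_netstring(payload: bytes) -> bytes:
    """Wrap payload as a netstring: b'<decimal length>:<payload>,'."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(data: bytes) -> bytes:
    """Return the payload of one complete netstring, or raise ValueError."""
    length_text, sep, rest = data.partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(length_text)
    if len(rest) < length + 1 or rest[length:length + 1] != b",":
        raise ValueError("truncated netstring or missing trailing comma")
    return rest[:length]


def build_request(map_name: str, key: str) -> bytes:
    """A socket map request is the netstring '<mapname> <key>'."""
    return encode_netstring(f"{map_name} {key}".encode("utf-8"))


def parse_reply(data: bytes) -> Tuple[str, str]:
    """Split a reply into (status, detail); detail may be empty."""
    status, _, detail = decode_netstring(data).decode("utf-8").partition(" ")
    return status, detail


def query_socketmap(socket_path: str, map_name: str, key: str) -> Tuple[str, str]:
    """Send one lookup to a socket map daemon listening on a Unix socket."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as conn:
        conn.settimeout(5)
        conn.connect(socket_path)
        conn.sendall(build_request(map_name, key))
        buffer = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                raise ConnectionError("server closed the connection mid-reply")
            buffer += chunk
            try:
                return parse_reply(buffer)
            except ValueError:
                continue  # reply not complete yet, keep reading
```

With these helpers, build_request("cyrus", "user") produces b"10:cyrus user,". Something like query_socketmap("/path/to/smmapd/socket", "cyrus", "user") could then be pointed at the socket that smmapd listens on. Both arguments there are hypothetical, so check your own cyrus.conf for the real socket path.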

Sendmail had been shipping a Cyrus V2 mailer definition m4 map as part of the standard distribution since 8.12, but our old mailer definition (carried over from Cyrus 1.x days) didn’t have any problems until now. At the time, using the old mailer definition also allowed us to deal with the delayed bounce problem, because we could see where we had to put in the flag to have the mailer check passwd/PAM before accepting the message. I’m not sure how easily we could have implemented this check if we had been running the packaged m4 map with the older mail server: as usual, a one-character change in the *.mc file may result in all sorts of changes in the *.cf file after m4 processes everything. So, while we may have been able to change to the packaged mailer definition earlier, we may have opened ourselves up to the delayed bounce problem again.

Real-time integration was done according to http://anfi.homeunix.net/sendmail/rtcyrus2.html, which is referenced off the Cyrus wiki.

The following tasks were performed for the integration:

1. sendmail-8.13.1 was rebuilt to support SOCKETMAP: in the sendmail.spec file, the following was added: APPENDDEF(`confENVDEF', `-DSOCKETMAP')dnl. The form used was along the lines of how the spec file turned on and off other compile-time options.

2. /etc/cyrus.conf was modified to tell Cyrus master to start the socket map daemon, smmapd, and master was HUPed to reload the configuration file. FC2 (or maybe FC3pre) Cyrus 2.2 supports smmap out of the box.

3. The new sendmail-8.13 packages were installed.

4. /usr/share/sendmail-cf/m4/proto.m4 was patched according to http://anfi.homeunix.net/sendmail/mrs.html. Note that the patch file referenced on the page must be applied by hand, as it was written for the proto.m4 file distributed with sendmail, not the Fedora-modified one. I may get around to rolling a new sendmail RPM where these changes are applied at build time.

5. We followed the LUSER_RELAY option described in http://anfi.homeunix.net/sendmail/rtcyrus2.html, as it seemed the least disruptive. Note that, having followed these instructions, we are now running with a version of the Sendmail Cyrus V2 mailer definition, as opposed to the Cyrus mailer definition we carried over from 1.6.x days (which was removed from the sendmail.mc file). Note also that the mailer definition didn’t seem to be responding to DEFINE directives in sendmail.mc and was pointing sendmail to the wrong location for the LMTP socket. This was fixed by editing the /usr/share/sendmail-cf/mailer/cyrusv2.m4 file directly and rebuilding the *.cf file from the *.mc file afterwards.

6. sendmail was tested using “sendmail -C test.cf -d60.5 -bv user@the.domain”. We have a dummy user that exists in Cyrus and we observed the mailbox check sequence was proceeding correctly. After this was working, sendmail was started up.

tail -1000 /var/log/maillog|grep verify shows the verification messages for unknown mailboxes.
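The LMTP socket path from step 5 can also be checked without going through Sendmail at all. The following is a minimal sketch using Python's standard smtplib.LMTP class, which connects to a Unix socket when given an absolute path. The function name is mine, and the socket path and recipient in the usage note are placeholders. It issues RCPT TO and then resets the transaction, so nothing should be delivered.

```python
import smtplib
from typing import Optional, Tuple


def probe_lmtp(socket_path: str, recipient: str) -> Optional[Tuple[int, str]]:
    """Return the server's (code, text) reply to RCPT TO.

    Returns None if the socket cannot be reached or the LMTP
    conversation fails, e.g. because the path is wrong.
    """
    try:
        # An absolute path makes smtplib.LMTP use a Unix domain socket.
        with smtplib.LMTP(socket_path) as client:
            client.ehlo()        # smtplib.LMTP sends LHLO for this call
            client.mail("")      # null sender, as a bounce would use
            code, text = client.rcpt(recipient)
            client.rset()        # abandon the transaction before any DATA
            return code, text.decode("utf-8", "replace")
    except (OSError, smtplib.SMTPException):
        return None
```

Calling probe_lmtp("/path/to/lmtp/socket", "user@the.domain") with the path that the mailer definition is supposed to use should return a reply tuple if the socket is right, and None if Sendmail has been pointed at a location where nothing is listening.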

In any case, the system seems to be working much better now. We’re not observing stuck from-queue messages, and we don’t see any processes holding SMTP connections open at the RCPT stage.

7 Responses to “Sendmail/Cyrus issues and integration”

  1. Andrzej Filip Says:

    Feel free to send me your comments or suggest improvements.

  2. Andrzej Filip Says:

    Question to step 5:
    Have you used the line below?
    define(`CYRUSV2_MAILER_ARGS',`FILE _cyrus_lmtp_socket_path_')

    [ you wrote about “DEFINE” ]

  3. Cheng Says:

    Yes, I have the DEFINE for the CYRUSV2_MAILER_ARGS in my sendmail.mc, pointing to the LMTP socket file.

  4. Andrzej Filip Says:

    Some extra advice:
    1) Put the following line in /etc/imapd.conf
    lmtp_downcase_rcpt: 1
    2) Use cyrus-imap virtual domain WITHOUT upper case letters

  5. Cheng Says:

    Thanks! I just put the lmtp_downcase_rcpt in there. I don’t know if this has been an issue in the past, but this should help in the future.

  6. Andrzej Filip Says:

    http://anfi.homeunix.net was discontinued on 2006-07-31

  7. Andrzej Filip Says:

    0) http://anfi.homeunix.net/ is back online
    1) a new version (RTCyrus3) is under development, with release scheduled for February 2007
    http://anfi.homeunix.net/sendmail/rtcyrus3.html