Bits and thoughts

#!/bin/bash is not rude

Filtering spam emails with Spamassassin

Written by ⓘⓓⓔⓝⓣⓛⓤⓓ - -

Last time I explained how I set up my own email server. One possible improvement was to filter spam on the server side rather than relying on client-side configuration. I configured this using spamassassin.

The technical background is Debian Wheezy (the testing version, as it is not yet stable at the time of writing).

You will see in another article that we can go further by adding some filtering rules on the server as well... But for now let's see how to set this up.

Installing spamassassin

We're talking about Debian here, so:
apt-get install spamassassin
One configuration step is to enable spamassassin in its configuration file /etc/default/spamassassin
# sed -i "s/ENABLED=0/ENABLED=1/g" /etc/default/spamassassin
Then the spamd service must be started with:
# service spamassassin start
You can check that the spamd daemon is listening to inputs on loopback address :
# netstat -ntpl | grep spamd
tcp  0  0 127.0.0.1:783  0.0.0.0:*       LISTEN      20892/spamd.pid
The version installed is :
# spamassassin --version
SpamAssassin version 3.3.2  running on Perl version 5.14.2
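To check that the daemon really scores messages, one option is to feed it the GTUBE test string, which SpamAssassin is designed to flag as spam. The sketch below only builds such a message; the function name and the example.com addresses are placeholders of mine, not part of the setup described here.

```shell
# Print a minimal email whose body is the GTUBE test string.
# SpamAssassin treats any message containing this string as spam.
gtube_message() {
  printf 'From: sender@example.com\n'
  printf 'To: recipient@example.com\n'
  printf 'Subject: GTUBE test\n'
  printf '\n'
  printf '%s\n' 'XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X'
}
```

Piping it through the client, as in `gtube_message | spamc -c`, should print a score/threshold pair and exit with status 1 when the message is classified as spam.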

Filtering SMTP content through spamassassin

I'll have to configure a service for smtp in /etc/postfix/master.cf by adding a -o option :
smtp inet  n       -       -       -       -            smtpd
  -o content_filter=spamassassin
submission inet n       -       -       -       -       smtpd
   -o smtpd_tls_security_level=encrypt
   -o smtpd_sasl_auth_enable=yes
   -o smtpd_client_restrictions=permit_sasl_authenticated,reject
   -o content_filter=spamassassin
smtps     inet  n       -       -       -       -       smtpd
   -o smtpd_tls_wrappermode=yes
   -o smtpd_sasl_auth_enable=yes
   -o smtpd_client_restrictions=permit_sasl_authenticated,reject
   -o content_filter=spamassassin
And configure what spamassassin stands for in /etc/postfix/master.cf by adding it at the end of the file :
##  SPAMASSASSIN
spamassassin unix -     n       n       -       -       pipe
  user=debian-spamd argv=/usr/bin/spamc -f -e /usr/sbin/sendmail -oi -f ${sender} ${recipient}
NB: the line starting with "user=" and ending with "${recipient}" is kept as a one-liner. It must begin with whitespace so that Postfix reads it as the continuation of the "spamassassin" line.
debian-spamd is the user created by the apt-get install.
# getent passwd | grep debian-spamd
debian-spamd:x:112:116::/var/lib/spamassassin:/bin/sh

Result

You may find some interesting logs in /var/log/mail.log
Nov 17 19:30:44  postfix/pipe[32112]: 596F361DA4: to=<xxx@lebegue.org>, relay=spamassassin, delay=1.1, delays=0.76/0.02/0/0.29, dsn=2.0.0, status=sent (delivered via spamassassin service)
Nov 17 19:35:14  postfix/pipe[5098]: C8A6761D6F: to=<xxx@lebegue.org>, relay=spamassassin, delay=519, delays=518/0.01/0/0.63, dsn=2.0.0, status=sent (delivered via spamassassin service)
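A quick way to see how much mail actually goes through the filter is to count those log lines. This is a minimal sketch; the function name is mine and the pattern assumes the log format shown above.

```shell
# Count deliveries handed to the "spamassassin" pipe service.
# $1: path to a Postfix log file (defaults to /var/log/mail.log)
count_filtered() {
  grep -c 'relay=spamassassin.*status=sent' "${1:-/var/log/mail.log}"
}
```

Calling `count_filtered` with no argument reads /var/log/mail.log; a rotated log can be passed as the first argument.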

Updating filters

Once in a while, or through a cron entry, you can update filters with this command  (man pages are well written) :
sa-update && service spamassassin reload
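For a cron job it can help to tell "nothing new" apart from a real failure. According to the sa-update man page, exit status 0 means new rules were installed, 1 means no fresh updates were available, and higher values indicate problems. The wrapper below is only a sketch: the function name and the SA_UPDATE override (a hook for dry runs) are my own additions.

```shell
# Run sa-update and reload spamd only when new rules were installed.
# SA_UPDATE can be set to another command for a dry run (hypothetical hook).
refresh_rules() {
  status=0
  "${SA_UPDATE:-sa-update}" || status=$?
  case "$status" in
    0) service spamassassin reload ;;          # new rules installed
    1) echo "no fresh rules available" ;;      # nothing to do
    *) echo "sa-update failed with status $status" >&2
       return "$status" ;;
  esac
}
```

The function could be placed in a script under /etc/cron.daily and called once at the end of that script.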

Learning behaviour

Look at man pages for sa-learn to improve spamassassin bayesian filters efficiency.
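As a starting point, sa-learn can be pointed at a directory of known spam and a directory of known good mail. The sketch below assumes a Maildir layout in which spam is filed in a folder named .Junk (that folder name is my assumption), and the SA_LEARN override exists only so the commands can be previewed with echo. The Bayes database is per user, so training most likely has to be run as the user spamd scans mail as.

```shell
# Train the Bayesian filter from a Maildir.
# $1: path to the Maildir; the .Junk folder name is an assumption.
# SA_LEARN=echo prints the commands instead of running them.
train_bayes() {
  maildir=$1
  learn=${SA_LEARN:-sa-learn}
  "$learn" --spam "$maildir/.Junk/cur"   # messages known to be spam
  "$learn" --ham "$maildir/cur"          # messages known to be legitimate
}
```

For example, `SA_LEARN=echo train_bayes /home/<<user>>/Maildir` shows what would be run without touching the database.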

Self-hosting emails

Written by ⓘⓓⓔⓝⓣⓛⓤⓓ - -

Of the few services that were left in the hands of Google, some were finally migrated to my own server:
  • Emails
  • Calendars
  • Contacts
I had in mind to host them for autonomy and privacy reasons (and also because I find it fun to tinker with this ...).

One of my goals was to be able to access emails, contacts and calendar from my home computer (running Debian) and my smartphone (running Android).

The background OS is Debian Wheezy (testing .. not yet a 7.0 stable) as of November 2012.

Hosting emails

Somehow it's a tricky job to implement your own email server !! There are many steps, many tools to integrate and many configurations to tweak. The functionalities that I wanted to provide are :
  • Store my emails on my own server
  • Being able to securely read them from my home computer and my smart-phone
  • Avoid being a spam relay to the nasty bots while allowing to post emails without using my ISP gateway
  • Forward emails to my family when they want to receive their emails with my domain name : myrelatives@lebegue.org
Of course I wanted to be able to migrate all my emails from Google to this new infrastructure. Over the course of the years I had stored something like 1.7 GB of email.

In a more technical fashion I took these decisions :

Store my emails on my own server

Store my emails in an easily readable format without a database back-end. Maildir format is good for this. I can even read my emails with the simple mail command through an SSH connection ... postfix is my tool of the trade.
  • Incoming messages are delivered to /home/<<user>>/Maildir in a plain raw format. Each <<user>> has their own directory, with one file per message
  • Email folders are created as subdirectories of the Linux user's Maildir : /home/<<user>>/Maildir/<<subdirectories>>
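Because a Maildir is just directories and files, ordinary shell tools are enough to inspect it. A Maildir contains new/ (messages not yet seen by a client), cur/ (messages already seen) and tmp/ (deliveries in progress). A minimal sketch, with a function name of my own:

```shell
# Count the messages stored in a Maildir (one file per message).
# $1: path to the Maildir, e.g. /home/<<user>>/Maildir
maildir_count() {
  find "$1/new" "$1/cur" -type f | wc -l
}
```

The same function works on a folder inside the Maildir, since each folder has its own new/, cur/ and tmp/.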

Being able to securely read them from my home computer and my smart-phone

Reading from different devices requires that the messages are downloaded on said devices but must still be available on the server. Two protocols are available: IMAP and POP3. I chose IMAP (mainly because it first transfers the email headers so that you can sort them out before actually downloading the whole message when you want to read it) and implemented it using dovecot.
  • Access to the mailboxes is secured with the unix user account's credentials
  • The authentication is made under TLS so that the credentials are encrypted before being sent over the network. I had to uncomment disable_plaintext_auth = yes to force TLS authentication in /etc/dovecot/conf.d/10-auth.conf
## Authentication processes
# Disable LOGIN command and all other plaintext authentications unless
# SSL/TLS is used (LOGINDISABLED capability). Note that if the remote IP
# matches the local IP (ie. you're connecting from the same computer), the
# connection is considered secure and plaintext authentication is allowed.
disable_plaintext_auth = yes
I also added the login authentication mechanism in /etc/dovecot/conf.d/10-auth.conf :
# Space separated list of wanted authentication mechanisms:
#   plain login digest-md5 cram-md5 ntlm rpa apop anonymous gssapi
#   otp skey gss-spnego
# NOTE: See also disable_plaintext_auth setting.
auth_mechanisms = plain login
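Since both changes consist of uncommenting or editing a line, it is easy to check that the setting is really active and not still commented out. The helper below is a hypothetical sketch (the function name is mine); it only looks at the text of the file and does not ask dovecot itself.

```shell
# Print the active (uncommented) line for a setting in a config file.
# $1: setting name, $2: config file. Exit status is non-zero if not found.
active_setting() {
  grep -E "^[[:space:]]*$1[[:space:]]*=" "$2"
}
```

For instance, running `active_setting disable_plaintext_auth` against the 10-auth.conf file mentioned above should print the line if it has been uncommented.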

Avoid being a spam relay to the nasty bots while allowing to post emails without using my ISP gateway

This relies on :
  • Allowing any SMTP incoming emails that must be delivered to my domain. Delivering to something@lebegue.org is ok. Delivering to something@xxxx.xxx is forbidden.
  • Allowing any SMTP incoming emails to be forwarded to another domain ONLY IF the sender is authenticated.
  • Denying access when the message is received from an unqualified domain name server, an unknown domain name server, or if the sender's email address is not resolvable to an existing one. Examples :
    • if the server is 84.56.12.41 instead of agoodserver.fromdomain.xx it is rejected.
    • if the domain agoodserver.fromdomain.xx does not exist it is rejected
    • if the sender's email is unknown it's rejected
    I have changed this rule a bit because I could not receive or forward messages coming from addresses like noreply@xxxx.xx. It was the case for my ISP invoice notifications, which were rejected.
  • Denying access to well known spammers through RBL lists
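The first two rules boil down to a small decision, sketched below as a toy shell function. It only illustrates the reasoning and is not how Postfix is configured; the function name and its arguments are hypothetical.

```shell
# Toy model of the relay rules: mail for the local domain is accepted,
# mail for any other domain is accepted only from authenticated senders.
# $1: recipient domain, $2: "yes" if the sender authenticated,
# $3: local domain (defaults to lebegue.org)
relay_decision() {
  recipient_domain=$1
  authenticated=$2
  local_domain=${3:-lebegue.org}
  if [ "$authenticated" = "yes" ]; then
    echo accept
  elif [ "$recipient_domain" = "$local_domain" ]; then
    echo accept
  else
    echo reject
  fi
}
```

With these rules an anonymous bot cannot use the server to reach a third-party domain, while an authenticated user can send mail anywhere.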
Authentication is done over SMTP and relies on dovecot.

In /etc/postfix/master.cf :
smtp      inet  n       -       -       -       -       smtpd
#smtp inet n - - - 1 postscreen
#smtpd pass - - - - - smtpd
#dnsblog unix - - - - 0 dnsblog
#tlsproxy unix - - - - 0 tlsproxy
submission inet n - - - - smtpd
  -o syslog_name=postfix/submission
  -o smtpd_tls_security_level=encrypt
  -o smtpd_sasl_auth_enable=yes
  -o smtpd_client_restrictions=permit_sasl_authenticated,reject
#  -o milter_macro_daemon_name=ORIGINATING
smtps inet n - - - - smtpd
  -o syslog_name=postfix/smtps
  -o smtpd_tls_wrappermode=yes
  -o smtpd_sasl_auth_enable=yes
  -o smtpd_client_restrictions=permit_sasl_authenticated,reject

In /etc/postfix/main.cf :

The commented-out entries are the modifications I made after realizing that some legitimate emails were rejected.
#smtpd_sender_restrictions = reject_unknown_sender_domain, 
# reject_unverified_sender
smtpd_recipient_restrictions = reject_invalid_hostname,
        reject_unknown_recipient_domain,
        reject_rbl_client sbl.spamhaus.org,
        permit_sasl_authenticated,
        permit_mynetworks,
        reject_unauth_destination,
        permit
smtpd_helo_restrictions = reject_invalid_helo_hostname,
        reject_non_fqdn_helo_hostname       
#reject_unknown_helo_hostname
smtpd_client_restrictions = reject_rbl_client dnsbl.sorbs.net,
         permit_sasl_authenticated
Delegating authentication to dovecot is made by configuring some other entries in /etc/postfix/main.cf
#Activate SASL
smtpd_sasl_auth_enable = yes
#Use Dovecot
smtpd_sasl_type = dovecot
smtpd_sasl_path = private/auth
# Add in header SASL authentication information.
smtpd_sasl_authenticated_header = yes
smtpd_sasl_path = private/auth means that postfix is going to rely on information provided by dovecot through a socket located at /var/spool/postfix/private/auth and configured in /etc/dovecot/conf.d/10-master.conf
service auth {  
# auth_socket_path points to this userdb socket by default. It's typically 
# used by dovecot-lda, doveadm, possibly imap process, etc. Users that have 
# full permissions to this socket are able to get a list of all usernames and 
# get the results of everyone's userdb lookups. 

# The default 0666 mode allows anyone to connect to the socket, but the 
# userdb lookups will succeed only if the userdb returns an "uid" field that 
# matches the caller process's UID. Also if caller's uid or gid matches the 
# socket's uid or gid the lookup succeeds. Anything else causes a failure. 

# To give the caller full permissions to lookup all users, set the mode to 
# something else than 0666 and Dovecot lets the kernel enforce the 
# permissions (e.g. 0777 allows everyone full permissions). 
unix_listener auth-userdb {
    mode = 0666
    user = postfix
    group = postfix
  } 
# Postfix smtp-auth 
# 3 following lines uncommented on 11/11/2012 
  unix_listener /var/spool/postfix/private/auth {
    mode = 0666
  }
}

Forward emails to my family when they want to receive their emails with my domain name : myrelatives@lebegue.org

This is configured through /etc/aliases file. If I want my sister's email to be delivered to her Gmail account : mysister@lebegue.org goes to sistergmailaccount@gmail.com then I configure this entry :
mysister: sistergmailaccount@gmail.com
This goes live only after running this command :
postalias /etc/aliases
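Before rebuilding the alias database, it can be handy to check what an entry in the text file resolves to. A minimal sketch, with a helper name of my own, that reads the plain /etc/aliases syntax shown above:

```shell
# Print the target of an alias from an aliases-style text file.
# $1: alias name, $2: path to the file (e.g. /etc/aliases)
alias_target() {
  awk -v name="$1" '
    index($0, name ":") == 1 {      # line starts with "name:"
      sub(/^[^:]*:[ \t]*/, "")      # drop the name and the separator
      print
    }' "$2"
}
```

For example, `alias_target mysister /etc/aliases` prints the address the alias forwards to, or nothing if the alias is not defined.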

Results and improvements

As for now I have configured the Evolution email client and my Android phone with the default Android "E-mail" application.

What I'd like to improve is the filtering of incoming messages. There is still some spam coming in. I'd like to get rid of it on the server side to avoid configuring my Evolution client and my Android client with the same filters (though I haven't yet found how to write filter rules in the default Android app ...)

The next step is also to move my calendars and contacts out of Google's grasp ... OK, it's already done in fact ... but I'll explain it in another blog entry ...

Why lower Google dependency

Written by ⓘⓓⓔⓝⓣⓛⓤⓓ - -

Google is indeed a huge "service" provider. Through all its "services" it is able to host a good part of your digital life. I would say that the amount of digital data it can hold for you is 100% for any common usage. Let's make a short list of these services. I'll connect to gmail.com and go through the menu bar.

Google "Services"

        
  • +You : Google social network   
  • Search : Obviously their search engine is the de facto entry point to the internet
  • Images : Dedicated search engine for pictures  
  • Gmail : That's for your emails and managing your contacts   
  • Maps : Localization related tasks (directions, places)   
  • Drive : THE archetype of cloud container. That's the place where you can place any files
  • Calendar : organize your activities
  • Translate : Translation services
  • Mobile : I sincerely don't know because I have just clicked it to write this blog entry ...
  • Books : Looks like a book search engine
  • Offers : Some kind of auto-spam service. To me it looks like a place where you point yourself to what type of spam you WANT to receive. It sounds utterly crazy.
  • Wallet : I have never once clicked on this either. It looks like the list of all the payments you made through the different Google services. I can find there the Android apps I bought.
  • Shopping : I guess that I'm not a big shopper for, once again, I had absolutely never used this service. Once again it's some buying-compulsion related site. The name of the service included a clue, didn't it?
  • Blogger : Blogging hosting services
  • Reader : Looks like a news aggregator
  • Finance : A service for wanna-be traders and people who consider that finance and speculation are part of real life ... As with some of the other Google services, I had never clicked on this one before.
  • Photos : A place to share image galleries with your friends
  • Videos : A search engine specialized in videos.

Categories of "Services"

From my point of view there are two main categories of services provided by Google if we consider them from the perspective of your data :
  • The services for which Google holds your data
  • The services for which Google uses your data
Oh and of course, and that's just a wild guess, if Google holds your data then it can also use it.

Let's try to categorize the previously identified "services" :
  • +You : holds
  • Search : uses
  • Images : uses
  • Gmail : holds
  • Maps : holds
  • Drive : holds
  • Calendar : holds
  • Translate : uses
  • Mobile : ???
  • Books : uses
  • Offers : uses
  • Wallet : holds
  • Shopping : uses
  • Blogger : holds
  • Reader : uses
  • Finance : uses
  • Photos : holds
  • Videos : uses

Autonomy and Privacy concerns

You will have guessed that outsourcing your data to a third-party provider may be a problem for your autonomy and your privacy.

If somebody holds your data, what happens if this data is lost? If it is outsourced to a further level? If you are denied access to it?

If somebody uses your data, would it be a problem if data mining put you in a vaguely defined slice of the population that you don't think you belong to? Or if the slicing of the data put you in the category of population that a fascist government has decided to get rid of?

What are the solutions ?

If you feel that trusting a third party to hold your data is enough of a concern, then the answer is : host it yourself.

This is THE solution but it has some tradeoffs :
  • You need to be tech-savvy enough to implement a self-hosted solution
  • It costs a bit more than simply using a "free" service such as Google
I can't deny that implementing each and every Google service on your own is quite tedious. I find some of the skills difficult to acquire. I myself haven't yet found the courage to host my own email server.

As for the cost, the equation is simple to me : on one hand the cost is to give away some of my privacy and risk some of my autonomy, and on the other hand the cost is paying a small fee each year to a server provider (and this in itself is not real autonomy, but in France my upload bandwidth is not large enough to host a server at home ... notwithstanding that hosting a server in my little Parisian flat would be energy consuming and prone to create too much noise).

I chose a middle ground between relying on Google and self-hosting everything ...

Here is, service by service, what I did to reduce my dependency on a third-party provider (I'll keep in the list what is at stake regarding the possible usage of my data). The colours indicate how satisfied I am with my current solution : very well, fine or no use, bad
  • +You : holds : I don't use social networks much and I'm on the way to creating my own federated status.net instance (I use identi.ca right now). I have never been a fan of 'friend circles' and the like ...
  • Search : uses : I have switched my searches to Duckduckgo.com
  • Images : uses : I use the ghostery web extension in Firefox. It guarantees that at least no cookies are used to track my search habits on Google Images (on duckduckgo.com you can look up images by typing "!gi" with the reference you are looking for but it brings you to the Google Image result page)
  • Gmail : holds : Unfortunately this is the hardest service for me to get rid of, as hosting a mail server seems to be a little too much for me.
  • Maps : holds : I generally use it on my Android phone and I haven't found a good replacement yet. I have considered using openstreetmap.org but it lacks good support on Android
  • Drive : holds : I've used it in the past to share documents with some friends. This is not something that I do much but if I had to share some new documents and have a collaborative workflow I’d still use it I guess ...
  • Calendar : holds : This is the same as for Drive I don't use it much so that's no big deal for me.
  • Translate : uses : I don't use it
  • Mobile : ??? : I don't use it
  • Books : uses: I don't use it
  • Offers : uses: I don't use it
  • Wallet : holds : I don't use it
  • Shopping : uses : I don't use it
  • Blogger : holds : I don't use it and I host my own Pluxml instance
  • Reader : uses : I don't use it
  • Finance : uses: I don't use it
  • Photos : holds: This is the same as for Drive  and Calendar I don't use it much so that's no big deal for me.
  • Videos : uses : I don't use it

Last tip

One last thing that I did to lower my Google dependency is to enable the double authentication mechanism. This obliges you to type a randomly generated one-time password to use authenticated Google services. It's such a pain that I minimize my usage of Google, and it makes me think of alternatives before going to Google ...