E-Mail Date Parsing

I've done a lot of work automating various e-mail tasks: sending e-mail, reading e-mail, parsing e-mail, etc. (Some were just small routines inside a larger site; others are sites in their own right, like mypopchecker.com.)

I tend to carry code forward if it worked for me in the past. I had done this with a routine to parse e-mail dates, but apparently everyone implements the RFC (RFC 822 and its successor, RFC 2822) a little differently, so it would occasionally misfire. Even some commercial applications I've used miss some of the dates. The dates I'm talking about are in a format similar to:

Sun, 29 Aug 2004 10:01:02 -0700 (PDT)

This doesn't seem too hard to parse, but when you see the slight variations that exist on the net, it gets more complicated. I finally buckled down and rewrote my main parsing method, and it hasn't missed a date yet out of a few thousand messages. I figured I'd post it here in case anyone else has a need.

It's not 100% (nothing ever is, right?), so if you have improvements, feel free to post them. When parsing fails, it should return 1/1/2000 if it can't pull even a date out of the string, or at least the date if it can't parse the time. (In DEBUG builds the exception is rethrown instead, so failures are easy to spot.)
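To make that fallback behavior concrete, here's a minimal sketch of the idea on its own. It is not the routine itself: `FallbackSketch`, `ParseWithFallback`, and the split into `datePart`/`timePart` are hypothetical names for illustration, and it leans on `DateTime.TryParse` with the invariant culture instead of catching exceptions.

```csharp
using System;
using System.Globalization;

public static class FallbackSketch
{
    // Sentinel returned when not even a date can be recovered.
    public static readonly DateTime Sentinel = new DateTime(2000, 1, 1);

    // Hypothetical helper: try date + time, then the date alone, then give up.
    public static DateTime ParseWithFallback(string datePart, string timePart)
    {
        DateTime result;
        CultureInfo culture = CultureInfo.InvariantCulture;

        // Best case: both pieces parse together.
        if (DateTime.TryParse(datePart + " " + timePart, culture, DateTimeStyles.None, out result))
            return result;

        // The time was unusable, so keep at least the date (midnight).
        if (DateTime.TryParse(datePart, culture, DateTimeStyles.None, out result))
            return result.Date;

        // Nothing usable at all.
        return Sentinel;
    }
}
```

With these assumptions, a good date with an unreadable time still comes back as that date at midnight, and a completely unreadable string comes back as 1/1/2000.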


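// Requires: using System; using System.Text.RegularExpressions;
// Parses an e-mail Date header and returns the time adjusted to UTC.
public static DateTime GetMessageDateUTC(string DateText) {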

    string ParsedDate = DateText.Trim();
    DateTime BaseDate;
    Regex dateregex;
    Match match;

    //strip out Day of Week from beginning ie "Tue,"
    ParsedDate = Regex.Replace(ParsedDate, "^([A-Za-z]){1,7},?", "");
    //strip out the optional time zone information from the end, ie "(PDT)"
    ParsedDate = Regex.Replace(ParsedDate, "([(][A-Za-z]{1,4}[)])$", "");

    try
    {
        //search for the date, time, and offset characteristics in the string
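        //NOTE: the pattern was truncated when this post was published; the "date" part is
        //original, the "time" and "offset" parts are a best-guess reconstruction
        dateregex = new Regex("(?<date>\\d* [A-Za-z]{1,10} \\d{2,4})(?<time> \\d{1,2}:\\d{2}(:\\d{2})?)?(?<offset> [+-]\\d{4})?");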
        match = dateregex.Match(ParsedDate);

        BaseDate = Convert.ToDateTime(string.Format("{0} {1}",
                    match.Groups["date"].Value,
                    match.Groups["time"].Value));
    }
    catch
    {
#if DEBUG
        throw;
#endif
        return new DateTime(2000,1,1);
    }

    try
    {
        //Groups["offset"] is never null, so test Success instead
        Group OffsetGroup = match.Groups["offset"];
        if (OffsetGroup.Success && OffsetGroup.Value.Trim().Length == 5)
        {
            //take the offset, ie +0700, -1000, and adjust the BaseDate to UTC
            string OffsetText = OffsetGroup.Value.Trim();
            int OffsetHour = int.Parse(OffsetText.Substring(1,2));
            int OffsetMin = int.Parse(OffsetText.Substring(3,2));
            TimeSpan Offset = new TimeSpan(OffsetHour,OffsetMin,0);

            //UTC = local - offset; reading the sign from the text keeps
            //sub-hour offsets such as +0030 correct
            BaseDate = (OffsetText[0] == '-') ? BaseDate.Add(Offset) : BaseDate.Subtract(Offset);
        }

        return BaseDate;
    }
    catch
    {
#if DEBUG
        throw;
#endif

        return BaseDate;
    }
}
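
For anyone who wants the offset step in isolation, here's a minimal, self-contained sketch of turning a numeric zone like "-0700" into a UTC adjustment. The names `OffsetSketch`, `ParseOffset`, and `ToUtc` are hypothetical and not part of the routine above; the sketch assumes the offset is always a sign followed by exactly four digits.

```csharp
using System;
using System.Globalization;

public static class OffsetSketch
{
    // Hypothetical helper: converts "+0530" or "-0700" into a signed TimeSpan.
    public static TimeSpan ParseOffset(string offset)
    {
        string text = offset.Trim();
        if (text.Length != 5 || (text[0] != '+' && text[0] != '-'))
            throw new FormatException("Expected an offset like +0530 or -0700.");

        int hours = int.Parse(text.Substring(1, 2), CultureInfo.InvariantCulture);
        int minutes = int.Parse(text.Substring(3, 2), CultureInfo.InvariantCulture);
        TimeSpan span = new TimeSpan(hours, minutes, 0);

        // Apply the sign to the whole span so "+0030" and "-0030" both work.
        return text[0] == '-' ? span.Negate() : span;
    }

    // UTC = local time minus the zone offset.
    public static DateTime ToUtc(DateTime local, string offset)
    {
        return DateTime.SpecifyKind(local - ParseOffset(offset), DateTimeKind.Utc);
    }
}
```

For example, `OffsetSketch.ToUtc(new DateTime(2004, 8, 29, 10, 1, 2), "-0700")` gives 17:01:02 on the same day. Taking the sign from the first character, not from the hour value, is what keeps offsets with a zero hour part from being applied in the wrong direction.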