I have exported the logs from an SMTP appliance to CSV. Unfortunately, the format does not follow standard CSV conventions, so opening the file in Excel is useless, and Import-Csv is just as problematic.
Below is a sample log entry.
2012-11-20,21:29:53,log_id=0200002262,type=statistics,pri=information,session_id="qAL2Tpx1002260-qAL2Tpx3002260",client_name="smtp.domain.com [192.168.1.4]",dst_ip="192.3.1.25",endpoint="",from="user@domain.com",to="mailbox@test.com",subject="Test, Message",mailer="mta",resolved="OK",direction="in",virus="",disposition="0x01",classifier="0x00",message_length="2115"
The anomalies that are making this difficult are:
- Carriage returns in the middle of some log entries. They appear to occur after a consistent number of characters.
- Commas in the value portion. Because each column starts with the key name rather than a quote, the commas between the quotes are not treated as escaped and are interpreted as column borders. This is common in subjects.
I planned to loop through the columns and remove the text before each equals sign, but I cannot do that until I get around these two anomalies. There are too many log entries to fix by hand. Do you have any ideas or suggestions?
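For illustration, the kind of processing I have in mind can be sketched as follows. It is written in Python rather than PowerShell, and it rests on assumptions I have not confirmed against the full export: every entry starts with a date and time like the sample, a stray line break can be undone by simply concatenating the pieces, and quoted values never contain a double quote. The function names are made up for this sketch.

```python
import csv
import re

# Assumption: every log entry begins with "YYYY-MM-DD,HH:MM:SS,".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2},\d{2}:\d{2}:\d{2},")
# A key=value pair whose value is either a quoted string (commas allowed)
# or a bare token that runs up to the next comma.
PAIR = re.compile(r'(\w+)=("[^"]*"|[^,]*)')


def join_wrapped_lines(lines):
    """Merge physical lines into whole entries.

    A line that does not start with a timestamp is treated as the
    continuation of the previous entry and is appended with no separator.
    """
    record = ""
    for line in lines:
        line = line.rstrip("\r\n")
        if RECORD_START.match(line) and record:
            yield record
            record = line
        else:
            record += line
    if record:
        yield record


def parse_record(record):
    """Split one entry into a dict, dropping the 'key=' prefixes and quotes."""
    date, time, rest = record.split(",", 2)
    fields = {"date": date, "time": time}
    for key, value in PAIR.findall(rest):
        fields[key] = value.strip('"')
    return fields


def write_csv(lines, out):
    """Write the parsed entries to 'out' as a properly quoted CSV."""
    rows = [parse_record(r) for r in join_wrapped_lines(lines)]
    # Union of all keys in first-seen order, in case entry types differ.
    fieldnames = list(dict.fromkeys(k for row in rows for k in row))
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
```

With hypothetical file names, this would be called as `write_csv(open("export.log"), open("clean.csv", "w", newline=""))`. The `csv` module quotes values such as `Test, Message` on output, so the resulting file should open cleanly in Excel or with Import-Csv.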