Usually a blog post is in the format of either "tactical tech note" or "deep thoughts with Andy and a sandwich" but I'm not sure how this topic will quite fit into either of those categories. The topics are driven by my day-to-day interactions with customers using the SIEM + all the products that it touches so sometimes the topic is a corner case. This might be one of those.
There are many protocols that can be used to get logs to the SIEM, but they all rely on having direct connectivity to an IP address. In environments where networks have different classifications, or are tactical/portable, disconnected or proxied, it may be necessary to transfer files manually and then import them. The end result might be a directory with some large number of sub-directories, log files, zipped log files, gzipped log files and maybe some other stuff mixed in, and you're faced with the task of uploading them one by one with a process that takes 10 clicks per file. So then you call me.
Usually we can find a way to improve the efficiency of a process by using a feature differently or implementing a best practice, but this was really outside of what the product is designed to do. We also needed a lot of flexibility, since the requirements could change in ways that we don't anticipate, and that wouldn't mesh well with a concise feature design or developer schedules. The best option turned out to be a script that searches for log files and sends them via syslog.
The script, creatively named send_syslog, is written in Python and available at this GitHub Repository. Since my customer is running Windows, there was a need for cross-platform support. I was able to compile the script into an executable using PyInstaller and it's worked out well so far. To download the latest Windows exe, click the Releases link above the blue bar and green button on the GitHub site.
The script has proven useful in other cases as well, including dumping all the logs from a newly added host by pointing the script at /var/log, and repopulating SIEM data for testing or forensic reconstruction.
The usage for the script is pretty straightforward; however, there are a few options built in.
At a minimum, it's necessary to provide an IP address for the syslog server and the path to the files that you would like to be sent. Currently the script only uses TCP syslog, but this is preferable and best practice vs. UDP.
# send_syslog -s 10.0.1.2 -f /var/log
This will recursively search /var/log for files, determine the type (text, binary), and send any text-based data line by line to the specified syslog server. Since the definition of a log can vary widely, there isn't any restriction on the format. This means that nearly every text file is going to be sent, and you will want to be aware of that in case other non-log files are in the path.
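For a rough idea of what that walk-and-send loop involves, here is a minimal sketch. It is not the actual send_syslog source: the function names are hypothetical, the text check is a simple "no NUL byte in the first 1 KB" heuristic, the default port of 514 is an assumption, and each line is sent terminated by a newline on the assumption that the receiver accepts newline-delimited TCP syslog.

```python
import os
import socket


def looks_like_text(path, sample_size=1024):
    """Heuristic (an assumption): a file is text if its first bytes hold no NUL."""
    with open(path, "rb") as handle:
        return b"\x00" not in handle.read(sample_size)


def iter_text_files(root):
    """Recursively yield paths under root that pass the text heuristic."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and looks_like_text(path):
                yield path


def send_file(sock, path):
    """Send each non-empty line of path over sock and return the line count."""
    sent = 0
    with open(path, "rb") as handle:
        for raw in handle:
            line = raw.rstrip(b"\r\n")
            if line:
                # Newline-delimited framing is assumed here.
                sock.sendall(line + b"\n")
                sent += 1
    return sent


def send_tree(server, root, port=514):
    """Open one TCP connection and send every text file found under root."""
    total = 0
    with socket.create_connection((server, port)) as sock:
        for path in iter_text_files(root):
            total += send_file(sock, path)
    return total
```

With these hypothetical helpers, send_tree("10.0.1.2", "/var/log") would roughly correspond to the command above. Reading files in binary mode avoids decode errors on logs with mixed encodings.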
It's possible to add the '-z' flag to instruct the script to also examine zip and gzip files in the path. If there is a single text file in the archive, then that will be sent as a log. Archives with multiple files are not currently supported, though it's not common to see multiple log files zipped into a single archive anyway.
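The archive handling described above can be sketched with Python's standard gzip and zipfile modules. This is an illustration under assumptions rather than the script's real code: read_archive_lines is a hypothetical name, gzip files are detected by their two-byte magic number, and the whole member is read into memory for simplicity.

```python
import gzip
import zipfile


def read_archive_lines(path):
    """Return the lines of a gzip file or a single-member zip archive.

    Returns None for anything else, including zip archives that hold
    more than one file, mirroring the limitation described above.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            members = [m for m in archive.infolist() if not m.is_dir()]
            if len(members) != 1:
                return None
            data = archive.read(members[0])
    else:
        with open(path, "rb") as handle:
            # 0x1f 0x8b is the gzip magic number.
            if handle.read(2) != b"\x1f\x8b":
                return None
        with gzip.open(path, "rb") as handle:
            data = handle.read()
    # Undecodable bytes are replaced rather than raising an error.
    return data.decode("utf-8", errors="replace").splitlines()
```

For very large archives, iterating over the opened file object line by line would use less memory than reading everything at once.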
With the command:
# python send_syslog.py -s 10.0.1.2 -f /var/log -z
The default output for the script looks like this:
****Cut a few hundred lines****
INFO Sent file: errors.7.gz: Lines: 4
INFO Sending file: errors.11.gz
INFO Sent file: errors.11.gz: Lines: 7
INFO Sending file: errors.2.gz
INFO Sent file: errors.2.gz: Lines: 551
INFO Sending file: sendmail.22.gz
INFO Sent file: sendmail.22.gz: Lines: 6
INFO Sending file: sendmail.20.gz
INFO Sent file: sendmail.20.gz: Lines: 2
INFO Sending file: sendmail.15.gz
INFO Sent file: sendmail.15.gz: Lines: 2
INFO Sending file: errors
INFO Sent file: errors: Lines: 528
INFO Sending file: sendmail.2.gz
INFO Sent file: sendmail.2.gz: Lines: 1
INFO Sending file: errors.24.gz
INFO Sent file: errors.24.gz: Lines: 18
INFO Skipped empty file: sendmail
INFO Sending file: sendmail.18.gz
INFO Sent file: sendmail.18.gz: Lines: 7
INFO Sending file: sendmail.11.gz
INFO Sent file: sendmail.11.gz: Lines: 1
INFO Sending file: sendmail.7.gz
INFO Sent file: sendmail.7.gz: Lines: 1
INFO Sending file: sendmail.1
INFO Sent file: sendmail.1: Lines: 2
INFO Sending file: errors.22.gz
INFO Sent file: errors.22.gz: Lines: 2074
INFO Sending file: errors.20.gz
INFO Sent file: errors.20.gz: Lines: 18
INFO Sending file: errors.13.gz
INFO Sent file: errors.13.gz: Lines: 1
INFO Sending file: sendmail.12.gz
INFO Sent file: sendmail.12.gz: Lines: 3
INFO Sending file: errors.4.gz
INFO Sent file: errors.4.gz: Lines: 420
INFO Sending file: errors.25.gz
INFO Sent file: errors.25.gz: Lines: 20
INFO Sending file: sendmail.14.gz
INFO Sent file: sendmail.14.gz: Lines: 6
INFO Sending file: errors.10.gz
INFO Sent file: errors.10.gz: Lines: 7
INFO Sending file: sendmail.8.gz
INFO Sent file: sendmail.8.gz: Lines: 3
INFO Sending file: sendmail.5.gz
INFO Sent file: sendmail.5.gz: Lines: 1
INFO Sending file: errors.18.gz
INFO Sent file: errors.18.gz: Lines: 26
INFO Sending file: errors.14.gz
INFO Sent file: errors.14.gz: Lines: 3
INFO Sending file: errors.21.gz
INFO Sent file: errors.21.gz: Lines: 47
INFO Sending file: errors.15.gz
INFO Sent file: errors.15.gz: Lines: 2
INFO Sending file: sendmail.4.gz
INFO Sent file: sendmail.4.gz: Lines: 1
INFO Sending file: errors.19.gz
INFO Sent file: errors.19.gz: Lines: 57
INFO Sending file: errors.6.gz
INFO Sent file: errors.6.gz: Lines: 1
INFO Sending file: errors.1
INFO Sent file: errors.1: Lines: 342
INFO Sending file: sendmail.6.gz
INFO Sent file: sendmail.6.gz: Lines: 2
INFO Sending file: errors.9.gz
INFO Sent file: errors.9.gz: Lines: 1
INFO Sending file: sendmail.16.gz
INFO Sent file: sendmail.16.gz: Lines: 4
INFO Total lines Sent: 6173313
And it took just a couple of minutes.
One caveat to keep in mind: there may be a policy in place that limits the age of events inserted into the database. Inserting old events can impact performance, so it's not recommended. To that end, it's possible to configure a maximum age for events under System Properties | Database | Data Retention. At the bottom is the setting for Restrict Insertion of Historical Data. So if your data isn't showing up, this is something to consider.
There are a couple of ways to use the tool with the SIEM, depending on the variety of log data that needs to be sent. If logs for only a single data source need to be sent, make sure that the IP address configured for the data source on the ESM matches the IP address of the host that the tool is running on.
In the case where logs were generated by multiple data sources, assuming they are RFC 5424 formatted, you can add the IP address of the host the tool is running on as a new data source with the following settings:
In addition to the relay data source, the data sources themselves also need to exist on the ESM/Receiver. If they don't already exist, it is possible to add the data source for the relay IP as described above.
One method is to enable Auto-learn under Receiver Properties | Data Sources | Auto-learn. Run the send_syslog script once through; logs will be sent to the ESM but will not be received beyond the first packet representing each data source. Once the script has run, disable Auto-learn and finish adding the data sources. This method may not be practical if there is a large volume of logs to be sent. It's also possible and efficient to add data sources via file import.
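Since the multi-source approach above assumes RFC 5424 formatted logs, it can help to check a sample of lines before sending. The sketch below only tests the start of the RFC 5424 header (priority, version, timestamp, hostname) and collects the hostnames it finds. The names rfc5424_hostname and hostnames_in are hypothetical, and how the ESM itself matches events to data sources is not something this code models.

```python
import re

# Start of an RFC 5424 header: <PRI>VERSION TIMESTAMP HOSTNAME ...
RFC5424_HEADER = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<version>[1-9]\d{0,2}) "
    r"(?P<timestamp>\S+) "
    r"(?P<hostname>\S+) "
)


def rfc5424_hostname(line):
    """Return the HOSTNAME field if line looks like RFC 5424, else None."""
    match = RFC5424_HEADER.match(line)
    # Valid priority values run from 0 to 191 (facility * 8 + severity).
    if match is None or int(match.group("pri")) > 191:
        return None
    return match.group("hostname")


def hostnames_in(lines):
    """Return the sorted, distinct hostnames from RFC 5424 style lines."""
    found = {rfc5424_hostname(line) for line in lines}
    found.discard(None)
    return sorted(found)
```

A hostname of "-" means the sender left the field empty, which RFC 5424 allows. Lines in the older BSD syslog style return None because a month name, not a version digit, follows the priority.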
The full help file is available at the GitHub link above; however, I'm happy to answer any questions here.
As I mentioned, this is something that won't be applicable to most folks and was only created due to a customer corner case. I'm interested to hear any other use cases that it might be useful for. Thanks.