
    Data Source Organization

    habanero

      How do you have your data sources designed to accommodate systems with multiple roles? For example, a Linux system might host an Apache web server and an Oracle database. That would be three different types of data sources, right?

       

      A consultant told us to create a dummy parent data source for each type of data (e.g., Apache, Linux, Oracle, Windows, ...), then add all the actual system IPs as client data sources under the parent matching the type of data.

       

      Linux Parent with fake IP address

      Client 10.10.10.1

      Client 10.10.10.2

      Apache Parent with fake IP address

      Client 10.10.10.1

      Client 10.10.10.2

      Oracle Audit Parent with fake IP address

      Client 10.10.10.1

      Client 10.10.10.2

       

      Unfortunately, that doesn't appear to work when the data sources are fed from syslog messages. As the syslog messages come into the receiver, they are matched to the first data source with a matching IP address (the Linux data source, in this example), and the downstream data sources never see the log messages.
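
      To make the observed behavior concrete, here is a rough Python model of what the receiver appears to be doing. This is purely illustrative, not actual McAfee receiver code, and all of the names are made up:

      # Illustrative model of the observed first-match behavior; this is not
      # McAfee receiver code, just my guess at the routing logic.
      data_sources = [
          ("Linux client",        "10.10.10.1"),
          ("Apache client",       "10.10.10.1"),  # same IP, listed later: never matched
          ("Oracle Audit client", "10.10.10.1"),  # same IP: never matched either
      ]

      def route(source_ip):
          """Return the first data source whose IP matches, mimicking the receiver."""
          for name, ip in data_sources:
              if ip == source_ip:
                  return name  # the first match consumes the message; no fall-through
          return None

      print(route("10.10.10.1"))  # always "Linux client"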

       

      I understand why it works that way, but it means the design our consultant suggested will not work for us.

       

      If messages are forwarded by MEF, can it accommodate multiple client data sources with the same IP address?

       

      The other scenarios that come to mind:

       

      1) Create a standalone data source for each system, and modify the policy so the other data source parsers are enabled. For example, create the data source as a Linux system and modify the policy to enable the Apache and Oracle audit parsers. This seems labor- and time-intensive, and you can't tell by looking at the data source what it is actually doing - all you'll see is "linux". It also limits the number of data sources that can be created per receiver compared to the parent/client architecture.

       

      2) Create a parent for each system and create clients for each type of data. For example, for a given system, create a dummy parent and add client data sources for the Linux, Apache, and Oracle data. However, those clients will all have the same IP address, so this option doesn't solve the problem with the syslog stream.

       

      I was really hoping to stick with syslog. All our machines (e.g., Linux, Oracle, Windows with Snare) are configured to forward to a central syslog server, which in turn forwards to our receivers. We would like to keep the central syslog server and the syslog forwarding clients in play for several reasons:

       

      1) It uses a data storage format we're very familiar with.

      2) After ESM points us in the right direction, it is sometimes faster and easier to search the raw logs on the syslog server than to drill down using ESM or ELM, particularly when long time periods are involved.

      3) It can be expanded and designed with high availability in mind for a relatively small cost.

      4) The forwarding client can be used to forward to an ELK instance used for operations monitoring without installing yet another agent.

       

      The original plan was to use Snare, syslog-ng PE, or NXLog to forward syslog, Windows event log, and ASCII log information to the receivers through the syslog server. However, if a syslog stream can't be sent to multiple data sources with the same IP to accommodate the Linux-Apache-Oracle scenario, we'll need to do something else: either use MEF as a second forwarding agent (assuming it can handle the duplicate client IPs) or hand-modify the policies for each system's data source so they accommodate all the data types.
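
      One more "something else" that comes to mind: have the central syslog server itself split the stream by data type before it reaches the receiver. Below is a minimal Python sketch assuming UDP syslog; the receiver hostname, port mapping, and classification rules are all made up, and none of this is a feature of Snare, syslog-ng, NXLog, or MEF. Note that forwarding rewrites the packet's source IP, so the receiver would have to identify the original host from the syslog header:

      # Sketch of a central relay that demultiplexes syslog by content and
      # forwards each data type to its own receiver port. Everything here
      # (hostname, ports, classification) is hypothetical.
      import socket

      RECEIVER_HOST = "receiver.example.com"  # hypothetical receiver
      PORT_BY_TYPE = {"linux": 514, "apache": 10000, "oracle": 10001}

      def classify(message: bytes) -> str:
          """Crude content-based classification; real rules would be site-specific."""
          text = message.decode("utf-8", errors="replace").lower()
          if "httpd" in text or "apache" in text:
              return "apache"
          if "oracle" in text or "audit" in text:
              return "oracle"
          return "linux"

      listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      listener.bind(("0.0.0.0", 514))  # the port the clients already forward to

      sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      while True:
          msg, _addr = listener.recvfrom(65535)
          sender.sendto(msg, (RECEIVER_HOST, PORT_BY_TYPE[classify(msg)]))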

       

      Anything I've overlooked?

       

      Thanks for reading.

        • 1. Re: Data Source Organization
          danhnt

          "I understand why it works that way but now the design our consultant suggested will not work for us."

           

          Please explain how it works, as you mentioned above. Thanks.

          I have the same problem: I want to add two data sources with the same IP, one for the OS and one for the application running on it.

          • 2. Re: Data Source Organization
            xded

            Hi,

             

            In the Policy Manager you have the option to enable additional ASP rules for one data source.

            Select the data source in the left menu of the ESM console --> open the Policy Manager at the top left --> in this menu, click the Filter button on the right side --> select your vendor and model --> refresh at the top right --> enable the ASP rules for your second and third models.

             

            After this you have one data source with three or more ASP rule sets, rather than three data sources with one or two clients each.

            • 3. Re: Data Source Organization
              habanero

              **********edited content ***************

              The manual says that duplicate IP addresses must be differentiated by port number. I assume they mean the port number to which the logs are sent. Maybe something like this:

               

              LinuxOS -> port 514

              Apache ASP -> port 10000

              Oracle Audit ASP -> port 10001

               

              --or--

               

              WindowsEvent WMI MEF -> port 8002

              Microsoft IIS ASP MEF -> port 8003

               

              "You can add more than one client data source with the same IP address and use the port number to differentiate them. This allows you to segregate your data using a different port for each data type, then forward the data using the same port it came into."

               

              The manual goes on to make a statement suggesting that you can assign a data source a network range instead of a single host IP. That doesn't help in this case, but it's interesting.

               

              "Events go to the data source (parent or client) that is more specific. For example, you have two client data sources, one with an IP address of 1.1.1.1 and the second with an IP address of 1.1.1.0/24, which covers a range. Both are the same type. If an event matches 1.1.1.1, it goes to the first client because it is more specific."

               

               

              ************ original content ****************

              Based only on observed behavior, if there are two data sources with the same IP address, whichever data source the log message is routed to first consumes it and does not pass it along. To use my previous example:

               

              Linux Parent with fake IP address

              Client 10.10.10.1

              Client 10.10.10.2

              Apache Parent with fake IP address

              Client 10.10.10.1

              Client 10.10.10.2

              Oracle Audit Parent with fake IP address

              Client 10.10.10.1

              Client 10.10.10.2

               

              Only one of those receives events. If I disable that one, another one starts seeing events.

               

              The same thing happened when I created data sources using MEF:

               

              Windows Event Log WMI MEF Parent with fake IP address

              Client 10.10.10.1

              Client 10.10.10.2

              Windows IIS(ASP) MEF Parent with fake IP address

              Client 10.10.10.1

              Client 10.10.10.2

               

              It makes sense that you wouldn't want to route every log message to every data source, for efficiency's sake, but routing a message to all data sources sharing its IP address wouldn't seem to add much overhead. We only did it this way based on a recommendation from a consultant.

              • 4. Re: Data Source Organization
                habanero

                I have confirmed that this works, but it is labor-intensive. We were told by a consultant that the other method (multiple clients with the same IP address under different parents) would work. That model also used the parent/client architecture, which has the advantage of supporting many more data sources per receiver. It isn't a big game changer, but I thought I'd post to see if I missed something. I've also logged a support call to verify that the observed behavior means what I think it does.

                 

                A recent post here suggested that we need a detailed data flow diagram of a log message going through the receiver, and I agree 100%. Having that type of information would allow us to answer a lot of our own questions and better understand the product and the best ways to use it. The online SIEM documentation has improved immensely since around 2014, but the actual product documentation still sucks. It is little more than "press the Go button to go" or "enter an IP address to add an IP address". A SIEM product architect/user needs much more in-depth knowledge about how the product works to make the best use of it. I was told the classes offered by McAfee do not address this internal knowledge; otherwise I would have enrolled.

                 

                To end on a high note, Intel/McAfee should be congratulated on the SIEM Foundation series. It is much more along the lines of what is needed in SIEM documentation, particularly the parts that tell you why to do something rather than which buttons to push to do it.

                • 5. Re: Data Source Organization
                  habanero

                  Interesting.

                   

                  The Device Type ID filter in views seems to work if parsers are applied manually in policy, even though the actual data source Device Type ID does not match.

                   

                  I created a data source with the Linux ASP type, then modified the policy to add the Apache ASP parsers.

                   

                  If I select the data source and use a view filter for Device Type ID "Apache ASP", I see the Apache events even though the data source's Device Type ID is not Apache ASP. The signature ID must be tied to the Device Type ID somewhere, and the view is effectively saying "find events where the signature ID is associated with Device Type ID XXX". I'm not sure how it is doing this, but it is important that this works in views, reports, and correlation rules for this to be a practical way to handle devices with multiple functions.
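
                  If that guess is right, the view's filter would amount to something like the Python sketch below. This is purely my hypothesis about the logic; none of these names or IDs are real ESM internals:

                  # Hypothetical illustration of the view behavior described above;
                  # this is NOT ESM code, and the signature/device type IDs are made up.
                  signature_to_device_type = {
                      4001: 65,   # a Linux ASP rule
                      7002: 999,  # an Apache ASP rule (999 is a placeholder ID)
                  }

                  events = [
                      {"sig_id": 4001, "msg": "sshd: session opened"},
                      {"sig_id": 7002, "msg": "GET /index.html 200"},
                  ]

                  APACHE_ASP = 999

                  # "Find events whose signature ID is associated with device type ID X":
                  apache_events = [e for e in events
                                   if signature_to_device_type.get(e["sig_id"]) == APACHE_ASP]
                  print(apache_events)  # only the Apache event appears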

                  • 6. Re: Data Source Organization
                    danhnt

                    Thanks for your comment.

                    • 7. Re: Data Source Organization
                      danhnt

                      Hi Habanero,

                       

                      "

                      LinuxOS -> port 514

                      Apache ASP -> port 10000

                      Oracle Audit ASP -> port 10001

                      "

                       

                      For the data sources above, how do you add them with different ports? Thanks.

                      • 8. Re: Data Source Organization
                        trekkiecat

                        We are doing this successfully with LDAP/Linux data sources -- all on the normal syslog port of 514. We created a parent data source using Oracle/Directory Services for the vendor/model and the server's IP. Then we created a client data source for it, using the same server IP, telling it to use the parent's port, checking the box for "match by type", and changing the vendor/model to UNIX/Linux (ASP). We had to write a lot of custom parsers for the Oracle DSEE events, so we created a policy for them and added the LDAP data sources to it. We also enabled all the Linux ASP rules (device type ID 65) in the Oracle policy. The data sources log events for both Oracle DSEE and Linux. I assume this could be done with more than two data types (like Apache) for the same IP by creating more client data sources and changing the vendor/model accordingly. Hope this helps!