3 Replies Latest reply on Sep 28, 2017 4:02 PM by andy777

    Parsing vs Normalization


      Hi Everyone.


      Can anyone explain the difference between parsing and normalization? I am a little confused about these two concepts.

        • 1. Re: Parsing vs Normalization

          In the McAfee SIEM architecture, there are at least two pieces, the ESM and the Receiver. They might be combined into a combo appliance, but both pieces are there.


          The ESM cannot collect logs directly and the Receiver cannot be managed without an ESM.


          When a Receiver collects logs, it performs three functions (parsing, normalization and aggregation) to create the metadata that populates the ESM database. Parsing is matching logs to rules to determine which text strings should be mapped to database fields. These rules are defined in the Policy Manager, which is why policy needs to be rolled out to the Receiver when new data sources are added. For efficiency, only the rules required to parse the configured data sources are pushed to the Receiver.


          In addition to parsing the events, the Receiver enriches them with additional fields, including geo-location and a category/sub-category. Assigning the event to a category is normalization: groups of similar events can then be referenced by category instead of by their individual signature IDs.


          Just to round out the story, after the data has been groomed, the Receiver inserts it into its local database. If the Receiver and data source are associated with an ELM/ELS, it will also package up the raw logs and send them off for storage and search. The Receiver does not send logs to the ESM though; its job is done after inserting the logs into the local database. The ESM will then query the Receivers at the configured interval, which is also the aggregation window.


          This will all change in an upcoming release and become much closer to real time.

          • 2. Re: Parsing vs Normalization

            Thanks Andy. Can you please provide some more examples that focus directly on the difference between parsing and normalization? You could list separate points for each, showing what happens during parsing and what is performed during normalization.

            • 3. Re: Parsing vs Normalization

              Parsing = Mapping text into fields


              Given the line:

              Sep 28 16:39:03 app_server sshd[8677]: Failed password for invalid user icecast2 from port 57238 ssh2


              It would be parsed into:

                host = app_server

                process = sshd

                source_user = icecast2

                source_ip =

                source_port = 57238

              and inserted into the database.
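
              To make the idea concrete, here is a rough Python sketch of what "mapping text into fields" amounts to. It is not the actual McAfee rule syntax; the regex and the parse_sshd_failure name are my own illustration. The sample line as posted has no address after "from", so the pattern treats source_ip as optional and leaves it empty.

```python
import re

# Hypothetical stand-in for a parsing rule: a regex whose named groups
# play the role of the database fields. Not real McAfee rule syntax.
SSHD_FAILED = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>\w+)\[\d+\]:\s"
    r"Failed password for (?:invalid user )?(?P<source_user>\S+)\s"
    # The address is optional because the sample line does not contain one.
    r"from\s(?:(?P<source_ip>\d{1,3}(?:\.\d{1,3}){3})\s)?"
    r"port\s(?P<source_port>\d+)"
)


def parse_sshd_failure(line):
    """Return a dict of field name -> text, or None if the rule does not match."""
    match = SSHD_FAILED.match(line)
    if match is None:
        return None
    return match.groupdict()


line = ("Sep 28 16:39:03 app_server sshd[8677]: Failed password for "
        "invalid user icecast2 from port 57238 ssh2")
fields = parse_sshd_failure(line)
for name, value in fields.items():
    print(f"{name} = {value}")
```

              A line that does not match the rule returns None, which is the sketch's version of a log that no rule knows how to parse.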


              Normalization = Assign category


              For normalization, the event above would be assigned a normalization ID of 409075712, which is Authentication | Login | SSH Login in the normalization taxonomy.



              If I use the Normalized group, SSH Login, as a filter, it will show me all events categorized as SSH logins regardless of the originating device, OS or signature ID.
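
              A minimal sketch of that filtering idea, again in Python and again only an illustration: the normalization ID and its category come from the example above, but the signature IDs, device names and function names are invented for the sketch and do not correspond to real rules.

```python
# Normalization ID from the example above:
# Authentication | Login | SSH Login
SSH_LOGIN = 409075712

TAXONOMY = {SSH_LOGIN: ("Authentication", "Login", "SSH Login")}

# Hypothetical signature IDs from two unrelated devices; both describe
# an SSH login event, so both map to the same normalization ID.
SIGNATURE_TO_NORM = {
    "linux-sshd-failed-password": SSH_LOGIN,
    "router-ssh-auth-failure": SSH_LOGIN,
}


def normalize(event):
    """Return a copy of the event with its normalization ID attached."""
    norm_id = SIGNATURE_TO_NORM.get(event["signature_id"])
    return {**event, "norm_id": norm_id}


def filter_by_category(events, norm_id):
    """Keep events in one normalized category, whatever their signature ID."""
    return [e for e in events if e["norm_id"] == norm_id]


parsed_events = [
    {"device": "app_server", "signature_id": "linux-sshd-failed-password"},
    {"device": "edge_router", "signature_id": "router-ssh-auth-failure"},
    {"device": "app_server", "signature_id": "cron-job-started"},
]

normalized = [normalize(e) for e in parsed_events]
ssh_logins = filter_by_category(normalized, SSH_LOGIN)
for event in ssh_logins:
    print(event["device"], " | ".join(TAXONOMY[event["norm_id"]]))
```

              The filter never looks at the signature ID, which is the point: one category lookup covers both devices.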


              Does that help?