CEE™ Common Event Expression: A Standard Log Language for Event Interoperability in Electronic Systems
 

CEE Language

Introduction

MITRE has created a CEE Email Discussion List and a CEE Working Group to assist the community in developing the CEE Event Taxonomy, CEE Log Syntax Specification, CEE Log Transport Specification, and CEE Log Recommendations. Once finalized, these specifications will allow any CEE message to be received, parsed, and understood by any recipient or device.

Drafts of these specifications will be posted here as soon as they are available.

We encourage event producers, event consumers, and IT and security operations end users to participate in the development of these specifications on the CEE Email Discussion List. Brief descriptions of each task requiring community input are noted below.

Common Event Taxonomy

Common Event Expression Taxonomy (CEET) will be an unambiguous event language for classifying logged events. If an event is observed by multiple systems, the taxonomy descriptions of that event should be identical. A computer should be able to immediately determine whether two logs refer to the same type of event. To make this happen, there needs to be a collection of well-defined words that can be combined in a predictable fashion. Presumably these words would describe the type of activity, the actors involved, the outcome, and other relevant event data.

For example, take the simple event of the root user logging into a system. In the PAM framework, this event is expressed as "session opened for user root by LOGIN(uid=0)". In a typical Linux distribution it might be logged as "ROOT LOGIN ON tty1", while a Snort trigger reports "POLICY ROOT login attempt [Classification: Misc activity] [Priority: 3]". The goal of CEE is for each of these different products to identify the event using the same terminology. Imagine how straightforward log interpretation would be if the event were always reported in the same manner, with authentication events always represented by similar phrasing.

By defining and using a common taxonomy for recording events, CEET will be a scalable and universal way to convey the meaning of event messages to both human and computer recipients. Event producers can be constrained to recording one event per log entry, which supports a more consumer-focused model. By eliminating subjective information, such as the perceived impact or importance sometimes seen in current log messages, end users and event consumers can generate a more flexible, accurate, environment-focused overview that takes into account all available logs from all supporting devices.

CEET may follow one of several approaches. One way is to provide a vocabulary associated with categories or "buckets" for various event characteristics. The buckets might be something like "subject," "object," "action type," and so on. For each bucket, the event producer selects the appropriate term. Another, similar approach would be to define a pseudo-language with subjects, objects, verbs, etc., along with a finite set of words. In this case, the producer would build a parsable expression out of these elements that reflects the event details.
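The "bucket" approach can be illustrated with a minimal sketch. CEE has not published a vocabulary, so the bucket names, the terms in `BUCKETS`, and the `Classification` and `same_event_type` names below are all hypothetical, chosen only to show how a fixed vocabulary lets a computer compare events from different producers.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: these buckets and terms are assumptions,
# not part of any published CEE specification.
BUCKETS = {
    "subject": {"user", "process", "host"},
    "action": {"login", "logout", "modify"},
    "object": {"host", "file", "account"},
    "outcome": {"success", "failure", "unknown"},
}


@dataclass(frozen=True)
class Classification:
    """One term per bucket, chosen by the event producer."""
    subject: str
    action: str
    object: str
    outcome: str

    def __post_init__(self):
        # Reject any term that is not in the shared vocabulary.
        for bucket, allowed in BUCKETS.items():
            value = getattr(self, bucket)
            if value not in allowed:
                raise ValueError(f"{value!r} is not a valid {bucket} term")


def same_event_type(a: Classification, b: Classification) -> bool:
    """Treat two records as the same event type if subject, action, and object match."""
    return (a.subject, a.action, a.object) == (b.subject, b.action, b.object)


# The three root-login reports, as each producer might classify them.
pam = Classification("user", "login", "host", "success")
linux_login = Classification("user", "login", "host", "success")
snort = Classification("user", "login", "host", "unknown")  # only an attempt was seen
```

With these records, `same_event_type(pam, snort)` is true even though the sensor cannot report an outcome, while `pam == linux_login` holds because every bucket matches.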

Common Log Syntax

CEE makes a distinct separation between the log syntax and the transport. While the syntax is unique, it can be expressed and transmitted in a number of different ways. For example, the syntax may be expressed in XML and transported via SOAP or email (SMTP). Some syntax and transport options are complementary, but others do not work as well together, such as communicating XML over Syslog or SNMP. Whether the event is recorded locally in a flat file (to be transported in batch mode over FTP or SCP) or sent over the network using a known protocol, the choice should be left to the event producers and consumers. The sender and receiver must agree on which communication channel will be used.

Common Log Syntax (CLS) will define a dictionary of syntactic identifiers to be used for communicating details regarding a logged instance of an event. Since it is not possible to create a syntax that is appropriate for every situation, the dictionary will need to define a universal set of terms along with their data types and usage (e.g., source, destination, username, and domain, which may be reused from previous standards efforts). By using the same data dictionary, event consumers and end users can be assured that the expected event details are included and used consistently.
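A data dictionary of this kind could look roughly like the sketch below. The field names come from the examples in the text; the data types assigned to them, the `time` field, and the names `DICTIONARY` and `parse_event` are assumptions made for illustration, not part of any CLS draft.

```python
from datetime import datetime
from ipaddress import ip_address

# Hypothetical dictionary: each field name maps to a function that
# converts the raw string into the field's data type (types assumed).
DICTIONARY = {
    "time": datetime.fromisoformat,
    "source": ip_address,
    "destination": ip_address,
    "username": str,
    "domain": str,
}


def parse_event(raw: dict) -> dict:
    """Convert raw string fields to typed values, rejecting unknown names."""
    event = {}
    for name, value in raw.items():
        if name not in DICTIONARY:
            raise KeyError(f"unknown field: {name}")
        # Raises ValueError if the value does not fit the field's type.
        event[name] = DICTIONARY[name](value)
    return event


event = parse_event({
    "time": "2008-04-03T12:00:00",
    "source": "10.0.0.5",
    "username": "root",
})
```

Because every producer draws field names and types from the same table, a consumer can rely on `event["source"]` being an IP address regardless of which product wrote the log.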

Common Log Transport

A syntax should offer options, depending on the environment and objectives, as to how information is transmitted. An administrator should be able to choose the best transport, regardless of whether the syntax is an encoded binary one, name-value pairs, or an XML-based one. Common Log Transport (CLT) will define the potential media through which CLS can be expressed and transmitted. Below are three possibilities for CLT that address issues of speed, ease-of-use, and expressiveness:

Speed: A binary log format (and corresponding syntax of fixed-size fields in a binary file) can express comprehensive information and is the fastest way to log and exchange data. When the goal is to minimize size and network impact, compressed binary is the best option. However, binary syntaxes are not designed for human readability and require conversion libraries for encoding and decoding logs.

Ease-of-Use: Plaintext syntaxes include delimited and key-value formats such as CSV and CEF, which humans and machines can more easily read and understand. With a fairly basic syntax, this format is very practical and would most likely have the best overall acceptance by event producers and consumers. Additionally, this type of syntax is compatible with a majority of transports. At the same time, this format is not as speed-efficient as the binary format above.

Expressiveness: Syntaxes based on structures such as XML are comprehensive and capable of representing complex data structures, such as lists and nested object relationships. Like the ease-of-use syntax options, an XML-based syntax would be a desirable option for some event producers and consumers. Drawbacks include a limited choice of compatible transports, overhead that requires extra space for storage and transmission, and possible difficulties with human understanding of such logs. Since most event data is fairly straightforward, forcing it into an expressive syntax would be excessive.
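To make the trade-off concrete, the sketch below encodes one event in all three styles using only the Python standard library. The event fields, the fixed binary layout, and the function names are assumptions for illustration; none of them is taken from a CEE specification.

```python
import socket
import struct
import xml.etree.ElementTree as ET

# Hypothetical event; the field names are illustrative only.
event = {"source": "10.0.0.5", "destination": "10.0.0.9", "port": 22, "username": "root"}

# Assumed fixed layout: 4-byte source, 4-byte destination,
# 2-byte port, 16-byte NUL-padded username, in network byte order.
LAYOUT = "!4s4sH16s"


def to_binary(e: dict) -> bytes:
    return struct.pack(
        LAYOUT,
        socket.inet_aton(e["source"]),
        socket.inet_aton(e["destination"]),
        e["port"],
        e["username"].encode("ascii"),
    )


def from_binary(data: bytes) -> dict:
    # Decoding needs the same layout: this is the "conversion library" cost.
    src, dst, port, user = struct.unpack(LAYOUT, data)
    return {
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        "port": port,
        "username": user.rstrip(b"\0").decode("ascii"),
    }


def to_key_value(e: dict) -> str:
    return " ".join(f"{key}={value}" for key, value in e.items())


def to_xml(e: dict) -> str:
    root = ET.Element("event")
    for key, value in e.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


sizes = {
    "binary": len(to_binary(event)),
    "key-value": len(to_key_value(event)),
    "xml": len(to_xml(event)),
}
```

For this event the binary record is the smallest and the XML document the largest, while only the key-value and XML forms can be read without knowing the layout in advance.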

An important feature of CLT will be that many of the currently used log transport options can be adopted as a supplemental "standard." For example, Syslog over UDP port 514 is used by millions of UNIX-derived systems and thus can probably be considered a standard log transport mechanism.
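As a rough illustration of how lightweight this transport is, the sketch below builds a simplified BSD-style syslog packet and sends it as a single UDP datagram. The function names are hypothetical, and the packet omits the timestamp and hostname header that a full syslog message would normally carry.

```python
import socket


def format_syslog(facility: int, severity: int, tag: str, message: str) -> bytes:
    """Build a simplified syslog packet (no timestamp or hostname header)."""
    # The priority value combines facility and severity: facility * 8 + severity.
    pri = facility * 8 + severity
    return f"<{pri}>{tag}: {message}".encode("utf-8")


def send_syslog(packet: bytes, host: str = "127.0.0.1", port: int = 514) -> None:
    """Send one datagram; UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))


# Facility 4 (security/authorization) and severity 6 (informational).
packet = format_syslog(4, 6, "login", "ROOT LOGIN ON tty1")
# packet == b"<38>login: ROOT LOGIN ON tty1"
```

Calling `send_syslog(packet)` would deliver the message to a collector listening on the local host; the default address is an assumption for the example.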

Common Event Log Recommendations

With a common way of expressing events, it is possible to recommend which events products should log. Common Event Log Recommendations (CELR) will provide that logging guidance. For example, while firewalls log events such as blocked connection attempts, there is no guidance on which events they should log. Should firewalls log all rule change events? What about login and administration events? CELR will ensure that administrators receive a complete view of all auditable events.

CELR will also address the level of detail that should be logged. For example, CELR should offer recommendations as to what data should be included with various firewall-related events: source, destination, NAT’ed sources, ports, protocols, and the connection result (allowed, dropped, etc.). Further considerations could include how applications should log certain events: username, source, connection method, and results for authentication; configuration changes; and a plethora of other important event information.
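One way such recommendations could be applied is as a completeness check on each logged event. In the sketch below, the event categories, the field names (adapted from the lists in the text), and the names `RECOMMENDED_FIELDS` and `missing_fields` are all hypothetical; CELR has not defined them.

```python
# Hypothetical recommendation table; categories and field names are
# adapted from the examples in the text, not from a CELR draft.
RECOMMENDED_FIELDS = {
    "firewall_connection": {
        "source", "destination", "nat_source", "port", "protocol", "result",
    },
    "authentication": {
        "username", "source", "connection_method", "result",
    },
}


def missing_fields(event_type: str, event: dict) -> list:
    """Return the recommended fields that the event does not include."""
    return sorted(RECOMMENDED_FIELDS[event_type] - set(event))


firewall_event = {
    "source": "10.0.0.5",
    "destination": "192.0.2.10",
    "port": 443,
    "protocol": "tcp",
    "result": "dropped",
}
gaps = missing_fields("firewall_connection", firewall_event)
# gaps == ["nat_source"]
```

A product vendor could run a check like this over sample output to see where its logs fall short of the recommended level of detail.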

IDS and IPS systems need guidance on how to report potential attack events, including the source, the destination, what triggered the alert, and the attack to which the alert relates. One important outcome of CELR regarding network sensors will be a decision on whether sensors should report what they detected, what they think they observed, or both. For higher-level or automated analysis, packet details are facts and are generally more useful for correlation and analysis. However, an alert that infers a buffer overflow attack or brute-force login attempts from signatures may be better suited to a small-scale LAN and can add useful input for prioritization. This information can then be used as feedback to improve the CEE syntax and taxonomy.

More detailed information about CEE and each of these tasks is available in the Common Event Expression White Paper posted on the CEE Documents page.

Page Last Updated: April 3, 2008