CEE Board                                          W. Heinbockel
CEE Specification                          The MITRE Corporation
July 8, 2011
 


CEE Event Record and CLS Encoding Specification
draft-cee-cls-06-8

Abstract

This document describes the abstract format for Common Event Expression (CEE) event records, which is designed for maximum interoperability with existing event and interchange standards. To ensure compatibility with other encoding standards, CEE provides CEE Log Syntax (CLS) Encodings. Each CLS Encoding defines a mapping from the CLS abstracted format to an encoding syntax, such as XML or JSON.

Copyright Notice

Copyright (c) 2011 The MITRE Corporation. All rights reserved.

Document License

The MITRE Corporation (MITRE) hereby grants you a non-exclusive, royalty-free license to use CEE for research, development, and commercial purposes. Any copy you make for such purposes is authorized provided that you reproduce MITRE's copyright designation and this license in any such copy.

Disclaimers

THIS DOCUMENT IS PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, OR TITLE; THAT THE CONTENTS OF THE DOCUMENT ARE SUITABLE FOR ANY PURPOSE; NOR THAT THE IMPLEMENTATION OF SUCH CONTENTS WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS. COPYRIGHT HOLDERS WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THE DOCUMENT OR THE PERFORMANCE OR IMPLEMENTATION OF THE CONTENTS THEREOF.

The name and trademarks of copyright holders may NOT be used in advertising or publicity pertaining to this document or its contents without specific, written prior permission. Title to copyright in this document will at all times remain with copyright holders.



Table of Contents

1.  Introduction
2.  Conventions Used in This Document
3.  Definitions
4.  Event Recording and Encoding Process
5.  CEE Event Record
    5.1.  Field
        5.1.1.  Field Name
        5.1.2.  Field Value
        5.1.3.  Fields with No Values
        5.1.4.  Fields with More Than One Value
    5.2.  Value Types
        5.2.1.  string
        5.2.2.  binary
        5.2.3.  tag
        5.2.4.  integer
        5.2.5.  float
        5.2.6.  boolean
        5.2.7.  timestamp
        5.2.8.  duration
        5.2.9.  ipv4Address
        5.2.10.  ipv6Address
        5.2.11.  macAddress
    5.3.  Core Fields
        5.3.1.  id
        5.3.2.  time
        5.3.3.  action
        5.3.4.  status
        5.3.5.  p_sys_id
        5.3.6.  p_prod_id
6.  Event Extensions
    6.1.  Event Augmentation
7.  Event Logs
8.  Event Identification
9.  Event Transport
10.  Event Processing
    10.1.  Event, Field, and Value Ordering
    10.2.  Event Modification
    10.3.  Event Augmentation
    10.4.  Event and Field Limits
    10.5.  Fields with No Value
    10.6.  Multi-Value Rule
11.  Acknowledgments
12.  References
    12.1.  Normative References
    12.2.  Informative References
§  Author's Address





1.  Introduction

The goal of Common Event Expression (CEE) [CEE.ARCH] (CEE Board, “CEE Architecture Overview,” May 2010.) is to define a standardized representation for an event record. In order to encourage use and improve compatibility with existing standards, CEE event records are encoded using one or more Common Log Syntax (CLS) Encodings. Each CLS Encoding defines how a CEE event record is mapped to a specific syntax encoding.

CEE and CLS enable the efficient and lossless storage, exchange, and consumption of event records. The CEE event record format was designed for maximum compatibility and can be encoded to interoperate with standards. CLS Encodings are defined for representation of CEE event records using XML and JSON. Additional mappings may be defined to enable CEE support for other syntaxes.




2.  Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.) [RFC2119].

All ABNF [RFC5234] (Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” January 2008.) definitions assume that the CEE event record and field data are presented using Unicode characters and encoded according to UTF-8 [RFC3629] (Yergeau, F., “UTF-8, a transformation format of ISO 10646,” November 2003.).




3.  Definitions




4.  Event Recording and Encoding Process

CEE makes a distinction between the event recording and the event encoding processes. Once an event occurs that should be recorded, the relevant data from that event is captured as an event record. When the event record is to be recorded in a log or shared, the CEE record should be encoded according to one or more CLS Encodings.

In order for event consumers to consume CLS encoded event records, the consumer must simply decode the CLS encoded record into the original CEE event record. The encoding and decoding of CEE event records using a CLS Encoding MUST be able to be performed without data loss.



+---------+        +----------+  encode  +-----------+
|         |  data  |  CEE     | =======> |  CLS      |
|  Event  | =====> |  Event   |          |  Encoded  |
|         |        |  Record  | <======= |  Record   |
+---------+        +----------+  decode  +-----------+
 CLS Encoding Process 
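
The lossless round trip described above can be illustrated with a minimal sketch. The encoding used here is hypothetical: it is NOT the CLS JSON Encoding, and the record layout (a mapping of field names to lists of lexical values) and the function names are assumptions made only for this illustration.

```python
import json

def encode(record: dict) -> str:
    """Encode a record (field name -> list of lexical values) as JSON text.

    Hypothetical encoding used only to illustrate the round trip.
    """
    return json.dumps(record, ensure_ascii=False)

def decode(text: str) -> dict:
    """Recover the original record from its encoded form."""
    return json.loads(text)

# Field names and values below are illustrative assumptions.
record = {
    "time": ["2010-11-12T12:01:38-04:00"],
    "src_ipv4": ["127.0.0.1"],
    "dst_ipv4": ["10.0.0.1", "10.0.0.2"],
}

encoded = encode(record)
decoded = decode(encoded)

# Lossless: same fields, same values, same ordering.
assert decoded == record
assert list(decoded) == list(record)
```

A real CLS Encoding would additionally need to carry or imply the value type of each value (Section 5.2), which this sketch sidesteps by keeping every value in its lexical string form.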




5.  CEE Event Record

A CEE event record consists of an event body and zero or more event extensions. The event body is a sequence of fields. Each field is made up of an identifying field name and zero or more field values. The event extensions are optional and allow for things such as augmentation of the original event and digital signatures.

Each syntax declaration MUST identify how the body of a CEE event record is to be encoded and decoded. Additionally, any syntax declaration SHOULD support event extensions (Section 6, Event Extensions) and the representation of logs, or collections of CEE event records (Section 7, Event Logs).




5.1.  Field

Each field represents a single property of an event. CEE events have two (2) different types of fields: name-value fields (NV-FIELD) and core fields. Name-value fields are specified at the time the event record is created. Core fields are fields that are shared by all CEE event records and have been predefined within the scope of this CEE event specification.

Name-value fields MUST have exactly one field name, and zero to 255 field values. Each field name SHOULD correspond to exactly one field definition in a corresponding CEE Profile [CEE.Profile] (CEE Board, “CEE Profile Specification,” June 2011.).

Each CEE encoded event MUST include each of the core fields. The value of each core field MUST comply with that field's syntax encoding. See Section 5.3 (Core Fields) for a listing of core fields and their syntaxes.




5.1.1.  Field Name

The field name is an identifier to provide context to the field values. For name-value fields, the field name MUST be specified. Field names for core fields MUST be able to be determined and MAY be inferred through argument position or other encoding.

Each field name MUST start with a lower-case or upper-case ASCII letter (ABNF ALPHA) or underscore ('_', ABNF %x5F) character, followed by no more than 31 letter, digit (ABNF DIGIT), and underscore characters:

   FIELD-NAME    = CEE-NAME
   CEE-NAME      = CEE-STARTCHAR *31CEE-CHAR
                   ; [A-Za-z_][A-Za-z0-9_]{0,31}
   CEE-STARTCHAR = ALPHA / %x5F
   CEE-CHAR      = CEE-STARTCHAR / DIGIT
   ALPHA         = %x41-5A / %x61-7A  ; A-Za-z
   DIGIT         = %x30-39            ; 0-9
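
As an illustration, the CEE-NAME rule (CEE-STARTCHAR *31CEE-CHAR, that is, 1 to 32 characters in total) might be checked with a regular expression. This is a sketch only; the function name is an assumption and not part of this specification.

```python
import re

# CEE-NAME = CEE-STARTCHAR *31CEE-CHAR
# One start character (letter or underscore) followed by at most
# 31 letters, digits, or underscores.
CEE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")

def is_valid_field_name(name: str) -> bool:
    """Return True if `name` matches the CEE-NAME rule (hypothetical helper)."""
    return CEE_NAME.fullmatch(name) is not None

print(is_valid_field_name("p_sys_id"))   # core field name from Section 5.3
print(is_valid_field_name("9lives"))     # starts with a digit
```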



5.1.2.  Field Value

Each field value consists of a value type along with the value data. The value type for each value indicates how the associated value data MUST be encoded within each CLS syntax declaration.

Additionally, a syntax encoding MUST define how each VALUE-TYPE defined in Section 5.2 (Value Types) is indicated. If the value types are explicitly defined, the encoding MUST specify how each type is represented. Otherwise, if the value types are implicit, instructions MUST be provided on how an implementation is to infer the proper value type based on the encoding.




5.1.3.  Fields with No Values

Every field MUST have at least one value. Fields that do not have a value SHOULD be removed from the CEE event record when possible. Otherwise, the field value MUST be the nil value. There is no lexical representation of nil values defined in this specification. Each CLS syntax declaration MUST define how the nil value is encoded.




5.1.4.  Fields with More Than One Value

A CLS encoding MUST be able to associate more than one value with a single field. The CLS syntax declaration MUST specify how to encode a field that has more than one value.




5.2.  Value Types

CEE defines eleven (11) core value types: string, binary, integer, float, boolean, timestamp, duration, ipv4Address, ipv6Address, macAddress, and tag.

While the CLS component only recognizes these 11 value types, the CEE Profile specification [CEE.Profile] (CEE Board, “CEE Profile Specification,” June 2011.) allows for additional value types to be defined and associated with certain fields. These types can be used by event consumers to help with processing or to provide further validation for individual CEE Event field values.

Each syntax declaration MUST define a representation encoding for each value type. The value type encoding MUST specify how any value corresponding to that value type MUST be encoded. Additionally, the encoding SHOULD specify the value type of the encoded value.

This section defines the lexical representations for each CEE value type. If a syntax encoding does not use a value type's provided lexical representation, then the encoding MUST provide instructions on how to convert the encoded value to and from the CEE lexical representation.




5.2.1.  string

The string value type represents a sequence of zero or more Unicode characters [UNICODE] (The Unicode Consortium, “The Unicode Standard—Version 5.0,” 2007.). The NUL character (ABNF %x00) MUST NOT appear as any character within an encoded CEE string value.

The syntax encoding specification MUST be able to handle a string value containing any Unicode character (except NUL). This may be done with a Unicode encoding format, such as UTF-8 [RFC3629] (Yergeau, F., “UTF-8, a transformation format of ISO 10646,” November 2003.), or by using another encoding, such as Base64 [RFC4648] (Josefsson, S., “The Base16, Base32, and Base64 Data Encodings,” October 2006.). An encoding MAY declare reserved characters. If reserved characters are defined, the syntax mapping MUST provide an encoding or escaping method to represent the literal value of the reserved character.

The lexical representation of a CEE string value MUST adhere to the following ABNF [RFC5234] (Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” January 2008.) definition:

   string        = *UNICODE-CHAR
   UNICODE-CHAR  = *OCTET ; NUL MUST NOT appear




5.2.2.  binary

A binary value is a series of zero or more octet values and is intended to hold binary data such as files or byte streams. It can be used to hold values that contain non-Unicode data or NUL values.

Some syntax encodings may provide support for arbitrary sequences of octets. If the syntax encoding does not support native binary values, it MUST allow for the binary value to be encoded in Base64 as described in [RFC4648] (Josefsson, S., “The Base16, Base32, and Base64 Data Encodings,” October 2006.).

The lexical representation of a CEE binary value MUST adhere to the following ABNF definition:

   binary       = *OCTET

A CLS encoding may require that binary values be Base64 encoded. If so, the Base64 encoded value MUST meet the following ABNF definition:

   binary-b64   = *( 4B64 ) B64FINAL
   B64FINAL     = B64FINALQUAD / B64PADDED16 / B64PADDED8
   B64FINALQUAD = 3B64 B64CHAR
   B64PADDED16  = 2B64 B16 "="
   B64PADDED8   = B64 B04 "=" [SP] "="
   B64          = B64CHAR [SP]
   B64CHAR      = ALPHA / DIGIT / %x2B / %x2F ; [A-Za-z0-9+/]
   B16          = B16CHAR [SP]
   B16CHAR      = B04CHAR / "E" / "I" / "M" / "U" / "Y" / "c" / "k"
                  / "o" / "s" / "0" / "4" / "8"
   B04          = B04CHAR [SP]
   B04CHAR      = "A" / "Q" / "g" / "w"
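
A minimal sketch of converting a CEE binary value to and from Base64 using the Python standard library follows. The function names are assumptions; the decoder removes the optional space characters that the grammar above permits after each Base64 character before decoding.

```python
import base64

def encode_binary(data: bytes) -> str:
    """Encode arbitrary octets as Base64 text (RFC 4648 alphabet)."""
    return base64.b64encode(data).decode("ascii")

def decode_binary(text: str) -> bytes:
    """Decode Base64 text back to octets.

    Spaces are stripped first because the grammar allows an optional SP
    after each Base64 character; any other character outside the Base64
    alphabet causes an error (validate=True).
    """
    return base64.b64decode(text.replace(" ", ""), validate=True)

payload = b"\x00CEE\x00"          # binary values may contain NUL octets
encoded = encode_binary(payload)
assert decode_binary(encoded) == payload
```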




5.2.3.  tag

The tag field type is a special field type within the CEE that is used to include references to event tags, or keywords. Each tag value SHOULD correspond to a tag definition, whose format is defined within the CEE Profile specification [CEE.Profile] (CEE Board, “CEE Profile Specification,” June 2011.).

The format for a CEE tag value is identical to that of a CEE field name (see Section 5.1.1 (Field Name)). A CEE tag value MUST begin with an ASCII letter (ABNF ALPHA) or underscore ('_', ABNF %x5F), and be followed by no more than 31 letter, digit (ABNF DIGIT), and underscore characters. The lexical representation of a CEE tag value MUST adhere to the following ABNF definition:

   tag         = CEE-NAME




5.2.4.  integer

Integer values are a sequence of decimal digits (ABNF DIGIT) of values %x30 through %x39 ("0" through "9"), optionally preceded by a sign. An integer MUST be able to represent any signed or unsigned integer number within the range given below. If a leading positive ('+', ABNF %x2B) or negative ('-', ABNF %x2D) sign is not supplied, the integer MUST be assumed to represent a positive value.

The value of a CEE integer MUST be representable within a 64-bit signed integer. That is, the minimum integer that can be represented in a CEE integer type is -(2^63), or -9,223,372,036,854,775,808. The maximum CEE integer value is 2^63-1 or 9,223,372,036,854,775,807. If the numeric value exceeds these limitations, then it should be expressed in the record as a CEE string value.

The lexical representation of a CEE integer value MUST adhere to the following ABNF definition:

   integer     = [ "+" / "-" ] 1*19DIGIT
              ; MUST be in the range [-(2^63), 2^63-1]
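
The following sketch shows one way a consumer might parse the integer lexical form and enforce the 64-bit signed range. The function name and the choice to raise ValueError are assumptions made for illustration.

```python
import re

# integer = [ "+" / "-" ] 1*19DIGIT
INTEGER = re.compile(r"[+-]?[0-9]{1,19}")
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1

def parse_cee_integer(text: str) -> int:
    """Parse a CEE integer lexical value (hypothetical helper).

    Raises ValueError if the text does not match the grammar or the
    value does not fit in a 64-bit signed integer.
    """
    if INTEGER.fullmatch(text) is None:
        raise ValueError(f"not a CEE integer: {text!r}")
    value = int(text)  # a missing sign is treated as positive
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError(f"out of 64-bit signed range: {text!r}")
    return value
```

A producer holding a value outside this range would instead emit it as a CEE string, as described above.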




5.2.5.  float

A float value is a representation of an arbitrary precision, finite, real number of the form: i*10^n; where i and n are integers, -(2^53) < i < 2^53, and -1075 < n < 970. Due to the various precisions and encodings of floating point numerals, CEE float values MUST be able to be represented within a 64-bit binary IEEE 754 (Institute of Electrical and Electronics Engineers, “Standard for Binary Floating-Point Arithmetic,” August 1985.) [IEEE.754.1985] ("double precision") encoded number.

A syntax encoding MAY specify alternative floating point value encodings with other widths, precisions, or optimizations for better space or processing efficiency. If alternative floating-point numeral encodings are specified, a conforming system MAY alter the encoded value and precision into a 64-bit IEEE 754 format. When performing any comparison or arithmetic operations on CEE float values, each float value MAY be converted into a 64-bit IEEE 754 encoded floating-point numeral, and the floating-point arithmetic operations performed as defined by IEEE 754 (Institute of Electrical and Electronics Engineers, “Standard for Binary Floating-Point Arithmetic,” August 1985.) [IEEE.754.1985].

The lexical representation of a CEE float value MUST adhere to the following ABNF definition:

   float     = [ "+" / "-" ] DECIMAL-PT-NUM [DEC-SCI-NUM]
               ; MUST fit within an IEEE 754 64-bit floating point
   DECIMAL-PT-NUM = DEC-NUM-PT-NUM / DEC-PT-NUM
   DEC-NUM-PT-NUM = 1*DIGIT [ "." *DIGIT ]
   DEC-PT-NUM     = "." 1*DIGIT
   DEC-SCI-NUM    = ( "e" / "E" ) [ "+" / "-" ] 1*DIGIT




5.2.6.  boolean

A boolean value supports Boolean, or binary-valued, logic and can only represent the values of "true" and "false".

If a syntax encoding defines boolean value representations other than "true" (ABNF %x74 %x72 %x75 %x65) and "false" (ABNF %x66 %x61 %x6C %x73 %x65), the encoding specification MUST define how each alternative representation maps to either "true" or "false".

The lexical representation of a CEE boolean value MUST adhere to the following ABNF definition:

   boolean     = "true"     ; %x74 %x72 %x75 %x65
                 / "false"  ; %x66 %x61 %x6C %x73 %x65




5.2.7.  timestamp

The CEE timestamp value type allows for the specification of a date, time, and timezone information as defined in [ISO.8601.1988] (International Organization for Standardization, “Data elements and interchange formats - Information interchange - Representation of dates and times,” June 1988.).

Syntax encodings MUST support the specification of timestamp values up to nanosecond (1E-9) granularity. Timestamp values having precision greater than 1 nanosecond MAY be rounded or truncated to no more than nanosecond granularity.

The lexical representation of a CEE timestamp value MUST consist of no more than 36 characters and adhere to the following ABNF definition:

   timestamp   = DATE "T" TIME ; MUST be no more than 36 chars long

   DATE           = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
   DATE-FULLYEAR  = 4DIGIT
   DATE-MONTH     = 2DIGIT  ; 01-12
   DATE-MDAY      = 2DIGIT  ; 01-31

   TIME           = TIMESPEC-BASE [TIME-FRACTION] [TIME-ZONE]
   TIMESPEC-BASE  = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
   TIME-HOUR      = 2DIGIT  ; 00-23
   TIME-MINUTE    = 2DIGIT  ; 00-59
   TIME-SECOND    = 2DIGIT  ; 00-59, 00-60 based on leap seconds
   TIME-FRACTION  = "." 1*DIGIT
   TIME-NUMOFFSET = ( "+" / "-" ) TIME-HOUR ":" TIME-MINUTE
   TIME-ZONE      = "Z" / TIME-NUMOFFSET
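
As a sketch, the timestamp grammar can be approximated with a regular expression plus the range checks given in the ABNF comments. The function name is an assumption. For simplicity this sketch accepts only upper-case "T" and "Z", and it does not verify that the day is valid for the given month or that the numeric offset is within range.

```python
import re

TIMESTAMP = re.compile(
    r"(?P<year>[0-9]{4})-(?P<month>[0-9]{2})-(?P<day>[0-9]{2})"      # DATE
    r"T(?P<hour>[0-9]{2}):(?P<minute>[0-9]{2}):(?P<second>[0-9]{2})"  # TIMESPEC-BASE
    r"(?P<fraction>\.[0-9]+)?"                                        # TIME-FRACTION
    r"(?P<zone>Z|[+-][0-9]{2}:[0-9]{2})?"                             # TIME-ZONE
)

def is_valid_timestamp(text: str) -> bool:
    """Check a CEE timestamp lexical value (hypothetical helper)."""
    if len(text) > 36:                      # no more than 36 characters
        return False
    m = TIMESTAMP.fullmatch(text)
    if m is None:
        return False
    return (
        1 <= int(m["month"]) <= 12
        and 1 <= int(m["day"]) <= 31
        and int(m["hour"]) <= 23
        and int(m["minute"]) <= 59
        and int(m["second"]) <= 60          # 60 only for leap seconds
    )

print(is_valid_timestamp("2010-11-12T12:01:38-04:00"))
```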




5.2.8.  duration

The CEE duration value type supports values that represent a continuous period of time in a format compliant with [ISO.8601.1988] (International Organization for Standardization, “Data elements and interchange formats - Information interchange - Representation of dates and times,” June 1988.).

Additionally, a syntax encoding MUST support duration values having up to nanosecond (1E-9) granularity. Duration values with precision greater than 1 nanosecond MAY be rounded or truncated to no more than nanosecond granularity.

The lexical representation of a CEE duration value MUST consist of no more than 36 characters and adhere to the following ABNF definition:

   duration     = "P" DAYTIME-DUR ; MUST be no more than 36 chars long

   DAYTIME-DUR  = DUR-DATE / DUR-TIME
   DUR-SECOND   = 1*DIGIT [DUR-FRACTION] "S"
   DUR-FRACTION = "." 1*DIGIT
   DUR-MINUTE   = 1*DIGIT "M" [DUR-SECOND]
   DUR-HOUR     = 1*DIGIT "H" [DUR-MINUTE]
   DUR-TIME     = "T" (DUR-HOUR / DUR-MINUTE / DUR-SECOND)
   DUR-DAY      = 1*DIGIT "D"
   DUR-DATE     = DUR-DAY [DUR-TIME]
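
The nested structure of the duration grammar can be mirrored rule by rule in a regular expression, as in the following sketch. The names are assumptions made for illustration. Note that, following the grammar literally, an hour component may be followed by minutes but not directly by seconds.

```python
import re

# Each pattern mirrors the ABNF rule of the same name.
_DUR_SECOND = r"[0-9]+(?:\.[0-9]+)?S"
_DUR_MINUTE = rf"[0-9]+M(?:{_DUR_SECOND})?"
_DUR_HOUR = rf"[0-9]+H(?:{_DUR_MINUTE})?"
_DUR_TIME = rf"T(?:{_DUR_HOUR}|{_DUR_MINUTE}|{_DUR_SECOND})"
_DUR_DATE = rf"[0-9]+D(?:{_DUR_TIME})?"
DURATION = re.compile(rf"P(?:{_DUR_DATE}|{_DUR_TIME})")

def is_valid_duration(text: str) -> bool:
    """Check a CEE duration lexical value (hypothetical helper)."""
    return len(text) <= 36 and DURATION.fullmatch(text) is not None

print(is_valid_duration("P1DT2H3M4.5S"))
print(is_valid_duration("PT0.000000001S"))   # one nanosecond
```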




5.2.9.  ipv4Address

The ipv4Address field type holds a single network address within an IPv4 network. ipv4Address values are represented using dotted-decimal notation.

The lexical representation of a CEE ipv4Address value MUST adhere to the following ABNF definition:

   ipv4Address = 3( DEC-OCTET "." ) DEC-OCTET
   DEC-OCTET   = DIGIT                   ; 0-9
                 / %x31-39 DIGIT         ; 10-99
                 / "1" 2DIGIT            ; 100-199
                 / "2" %x30-34 DIGIT     ; 200-249
                 / "25" %x30-35          ; 250-255
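
For illustration, the DEC-OCTET alternatives translate directly into a regular expression. The sketch below (with an assumed function name) rejects octets above 255 and octets with leading zeros, as the grammar does.

```python
import re

# DEC-OCTET: 250-255 / 200-249 / 100-199 / 10-99 / 0-9
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
IPV4_ADDRESS = re.compile(rf"(?:{DEC_OCTET}\.){{3}}{DEC_OCTET}")

def is_valid_ipv4_address(text: str) -> bool:
    """Check a CEE ipv4Address lexical value (hypothetical helper)."""
    return IPV4_ADDRESS.fullmatch(text) is not None

print(is_valid_ipv4_address("127.0.0.1"))
print(is_valid_ipv4_address("256.0.0.1"))
```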




5.2.10.  ipv6Address

The ipv6Address field type holds a single network address within an IPv6 network.

The lexical representation of a CEE ipv6Address value MUST adhere to a valid IPv6 address representation format as defined in [RFC4007] (Deering, S., Haberman, B., Jinmei, T., Nordmark, E., and B. Zill, “IPv6 Scoped Address Architecture,” March 2005.) and [RFC4291] (Hinden, R. and S. Deering, “IP Version 6 Addressing Architecture,” February 2006.):

   ipv6Address = IPv6Literal [ IPv6ZoneID ]
   IPv6Literal = 6( H16 ":" ) LS32
                 /                       "::" 5( H16 ":" ) LS32
                 / [               H16 ] "::" 4( H16 ":" ) LS32
                 / [ *1( H16 ":" ) H16 ] "::" 3( H16 ":" ) LS32
                 / [ *2( H16 ":" ) H16 ] "::" 2( H16 ":" ) LS32
                 / [ *3( H16 ":" ) H16 ] "::"    H16 ":"   LS32
                 / [ *4( H16 ":" ) H16 ] "::"              LS32
                 / [ *5( H16 ":" ) H16 ] "::"              H16
                 / [ *6( H16 ":" ) H16 ] "::"
   H16         = 1*4HEXDIG
   LS32        = ( H16 ":" H16 ) / ipv4Address
   IPv6ZoneID  = "%" 1*ZoneIDChar  ; SHOULD be a non-negative number
   ZoneIDChar  = %x21-24 / %x26-7E ; MUST NOT contain '%'

Examples of valid CEE ipv6Address values include:




5.2.11.  macAddress

The CEE macAddress field type represents Media Access Control (MAC) address values that are the physical network address of a network interface.

The lexical representation of a CEE macAddress value MUST adhere to a valid MAC address representation format as defined in [IEEE.802.1990] (Institute of Electrical and Electronics Engineers, “Local and Metropolitan Area Networks: Overview and Architecture,” 1990.):

   macAddress  = 5( 2HEXDIG ":" ) 2HEXDIG
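
A short sketch of checking the macAddress lexical form follows; the function name is an assumption. Both upper-case and lower-case hexadecimal digits are accepted, since ABNF string literals are case-insensitive.

```python
import re

# macAddress = 5( 2HEXDIG ":" ) 2HEXDIG
MAC_ADDRESS = re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")

def is_valid_mac_address(text: str) -> bool:
    """Check a CEE macAddress lexical value (hypothetical helper)."""
    return MAC_ADDRESS.fullmatch(text) is not None

print(is_valid_mac_address("00:1A:2b:3C:4d:5E"))
print(is_valid_mac_address("00-1A-2B-3C-4D-5E"))  # hyphens are not allowed
```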




5.3.  Core Fields

Many event records contain a similar set of core fields. To make these fields easy to use and process, many of these fields have been predefined. Each core field MUST be specified within every CEE event record. There are six (6) core fields that MUST be supported by each syntax encoding: id, time, action, status, p_sys_id, and p_prod_id.

There may be cases where one or more core fields within an event record do not have a value. The syntax declaration MUST define how these situations are to be handled. The recommended method is to use the representation for the nil value from Section 5.1.3 (Fields with No Values).



   Field        Field Type
   -----------  ----------
   id           string
   time         timestamp
   action       tag
   status       tag
   p_sys_id     string
   p_prod_id    string

 Table 1: Core Fields 

Each syntax encoding MUST specify how each core field is encoded. A syntax encoding MUST support every core field.
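
As a sketch, a consumer might verify that a decoded record carries all six core fields. The in-memory record layout (a mapping of field names to lists of values), the function name, and the example values are assumptions made for illustration.

```python
# Core fields and their value types, as listed in Table 1.
CORE_FIELDS = {
    "id": "string",
    "time": "timestamp",
    "action": "tag",
    "status": "tag",
    "p_sys_id": "string",
    "p_prod_id": "string",
}

def missing_core_fields(record: dict) -> list:
    """Return the names of core fields absent from `record`, in table order."""
    return [name for name in CORE_FIELDS if name not in record]

# Illustrative record; all values are made up for this example.
record = {
    "id": ["example_event"],
    "time": ["2010-11-12T12:01:38-04:00"],
    "action": ["unknown"],
    "p_sys_id": ["127.0.0.1"],
    "p_prod_id": ["example_product"],
}
print(missing_core_fields(record))   # "status" is absent from this record
```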




5.3.1.  id

The "id" core field MUST contain one CEE string value that represents the event type. The event.id SHOULD be unique within the context of the producing source, application, or device.

The "id" value is intended to identify semantically similar event records. It MAY be used to find or filter events, or used by an event consumer to apply additional event-based processing or inferences.




5.3.2.  time

The "time" field MUST have a single CEE timestamp value that indicates the date, time, and timezone offset at which the event began.

If the event was not instantaneous, then additional fields should be used to indicate the duration of the event.




5.3.3.  action

The "action" field MUST contain one CEE tag value that indicates the primary action of the event. The tag value SHOULD correspond to a CEE action Tag definition within a CEE Profile document.

If the event action is unknown, then the "action" should be listed as "unknown".




5.3.4.  status

The "status" field MUST contain one CEE tag value that indicates the status associated with the "action" field. That is, the "status" field value is the current status of the event's primary action. The tag value SHOULD correspond to a CEE status Tag definition within a CEE Profile document.

If the event status is unknown, then the "status" should be listed as "unknown".




5.3.5.  p_sys_id

The "p_sys_id" field MUST have one value that provides an identifier for the host system that produced the event. The p_sys_id value MAY be an IP address or hostname.

The value of a p_sys_id field MUST be of a CEE string type.




5.3.6.  p_prod_id

The core "p_prod_id" field MUST have one value that is a CEE string type. The value MUST be an identifier for the product producing the event.

The p_prod_id value is a producer assigned value. Examples of p_prod_id values include a product or driver identifier.




6.  Event Extensions

Event extensions allow for supplementary information to be supplied in a CEE event record along with the original event body. Extensions provide a straightforward way of implementing features such as digital signatures and non-destructive event augmentation.

Extensions are optional within a CEE event record and MUST be provided after the event body. Zero or more extensions may be included with an event record.

Each extension is defined below, as part of this specification. Other extensions SHOULD NOT be used. A CLS Encoding declaration SHOULD support the concept of event extensions, but does not have to support every extension.




6.1.  Event Augmentation

Event augmentations allow for already encoded CEE event records to be modified while preserving the integrity of the original event. Instead of directly modifying the event data, an event processing device may choose to append an event augmentation section onto the CEE event record.

An event record may have zero or more augmentation sections. In order to provide proper traceability, each event augmentation section MUST indicate the order in which the augmentation was performed. Additionally, each augmentation section MUST provide the time when the augmentation was created ("time", Section 5.3.2 (time)), as well as an identifier for the product ("p_prod_id", Section 5.3.6 (p_prod_id)) and host ("p_sys_id", Section 5.3.5 (p_sys_id)) producing the augmentation. These fields are defined as part of the CEE core fields (Section 5.3, Core Fields).

The remainder of the augmentation section MUST consist of zero or more fields that indicate the added fields and field values.




7.  Event Logs

An event log is simply an ordered collection of CEE event records. Syntax encodings MAY define a specific encoding to represent a log of zero or more encoded event records.




8.  Event Identification

Each syntax encoding SHOULD define a MIME type ([RFC4288] (Freed, N. and J. Klensin, “Media Type Specifications and Registration Procedures,” December 2005.)) to allow consumers to correctly identify the CEE event record encoding syntax.

The MIME type SHOULD be an "application" type with a subtype of "cee+ENCODING", where "ENCODING" is the common short name for the underlying syntax encoding. For example, "application/cee+json" would identify a JSON encoded CEE event record, while "application/cee+xml" is an XML encoded record.




9.  Event Transport

This specification does not define or require the use of any specific transport or protocol for transferring CEE events. A syntax encoding MAY define or suggest compatible protocols, but MUST NOT mandate the use of any specific protocol.




10.  Event Processing

The following are processing rules that dictate requirements such as CLS encoded event size limits, event record preservation, and how to process fields with multiple values.




10.1.  Event, Field, and Value Ordering

In some cases, the ordering of the CEE events, fields, and values may be significant. Intermediate systems MUST be able to maintain the ordering of fields within an event record and MUST preserve the ordering of sequential event records when forwarding or re-sending those records.




10.2.  Event Modification

Some environments require that the original event record be preserved. To support such environments, event consumers of CEE event records MUST be able to preserve the entire contents of an event record. This means that no fields or values may be modified, added, removed, or reordered.




10.3.  Event Augmentation

When processing a CEE event record containing event augmentation sections, the event body MUST be processed first. Each augmentation section SHOULD be processed individually and in the order that the augmentations were added.

Each non-core field in an augmentation section SHOULD be added to the content of the event body and any previously processed augmentation sections. If the field already exists, the field values SHOULD be added to the existing field values.
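
The processing rules above might be sketched as follows. The in-memory layout is an assumption made for illustration: the event body is a mapping of field names to lists of values, and each augmentation section is a mapping with a hypothetical "order" key and a "fields" mapping. How the order is actually indicated is left to each CLS Encoding.

```python
CORE_FIELD_NAMES = ("id", "time", "action", "status", "p_sys_id", "p_prod_id")

def apply_augmentations(body: dict, augmentations: list) -> dict:
    """Merge augmentation fields into a copy of the event body.

    The original body is left untouched, so the original event record
    can still be preserved as required by Section 10.2.
    """
    merged = {name: list(values) for name, values in body.items()}
    # Process the body first, then each augmentation in the order added.
    for section in sorted(augmentations, key=lambda s: s["order"]):
        for name, values in section["fields"].items():
            if name in CORE_FIELD_NAMES:
                continue  # core fields describe the augmentation itself
            # New fields are added; existing fields gain additional values.
            merged.setdefault(name, []).extend(values)
    return merged

body = {"id": ["example_event"], "src_ipv4": ["127.0.0.1"]}
augmentations = [
    {"order": 2, "fields": {"p_prod_id": ["enricher_b"], "src_host": ["localhost"]}},
    {"order": 1, "fields": {"p_prod_id": ["enricher_a"], "src_ipv4": ["127.0.0.2"]}},
]
print(apply_augmentations(body, augmentations))
```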




10.4.  Event and Field Limits

In order to help with interoperability, CEE sets limitations on the sizes of encoded event record values. These limits are for the entire event record, including any event extensions. Event producers and consumers compatible with the CLS specifications must be able to handle encoded event messages within the following limits:



   Name                         Size
   ---------------------------  -----
   Event Record Size            64 KB
   Field Value Size             2 KB
   Number of Fields             255
   Number of Values per Field   255

 Table 2: Event Record Limits 

These limits were chosen to be large enough to handle most event records produced by modern systems. However, they are not large enough to allow arbitrary data to be added to an event. For example, large objects such as files and data streams SHOULD NOT be placed within an event record. Instead, such objects SHOULD be referenced within the event record.

Note that these limitations are independent of the encoding used and include any encoding overhead and all field and value delimiters. For example, an international character represented using UTF-8 can consume up to four times as much space as an ASCII character. If Unicode characters are encoded using UTF-32, then every character, including ASCII characters, will take up exactly 32 bits (4 octets).

In order to make efficient use of the allotted space, the event producer SHOULD consider both the CLS encoding properties as well as the efficiency of the Unicode character encoding for the event data.
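
The effect of the character encoding on these limits can be seen in a short sketch. UTF-8 as defined in [RFC3629] uses between one and four octets per character. The helper names are assumptions, as is the interpretation of "KB" as 1024 octets and the choice to measure each field value as UTF-8 octets.

```python
MAX_RECORD_OCTETS = 64 * 1024      # assumes 1 KB = 1024 octets
MAX_VALUE_OCTETS = 2 * 1024
MAX_FIELDS = 255
MAX_VALUES_PER_FIELD = 255

def utf8_len(text: str) -> int:
    """Number of octets `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

def check_limits(fields: dict, encoded_record: bytes) -> list:
    """Return a list of limit violations (empty if none); hypothetical helper."""
    problems = []
    if len(encoded_record) > MAX_RECORD_OCTETS:
        problems.append("event record exceeds 64 KB")
    if len(fields) > MAX_FIELDS:
        problems.append("more than 255 fields")
    for name, values in fields.items():
        if len(values) > MAX_VALUES_PER_FIELD:
            problems.append(f"{name}: more than 255 values")
        if any(utf8_len(v) > MAX_VALUE_OCTETS for v in values):
            problems.append(f"{name}: value exceeds 2 KB")
    return problems

# Octets per character: ASCII, Latin, euro sign, and a non-BMP character.
for ch in ("a", "\u00e9", "\u20ac", "\U0001F600"):
    print(repr(ch), utf8_len(ch), len(ch.encode("utf-32-le")))
```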




10.5.  Fields with No Value

This specification does not define a null value or field type to indicate when a field has no value. A CLS syntax mapping MAY define ways to encode a field with an empty or missing value.

For the cases when events are being recorded and the event has a field with no associated value, two potential solutions will work with every CEE-compatible device. In most cases, it is acceptable to just not include that field instance in the event record. The other option is to set the field to be an empty field value.




10.6.  Multi-Value Rule

An event record containing one or more fields with multiple values MUST be treated as the combination of all values of all fields within that event record.

Example 1 - One Multi-Valued Field

    Event := Field(time, 2010-11-12T12:01:38-04:00), ...
             Field(src_ipv4, 127.0.0.1),
             Field(dst_ipv4, [10.0.0.1, 10.0.0.2, 10.0.0.3]).

The CEE Multi-Value Rule states that this record reflects "time" * "src_ipv4" * "dst_ipv4". This means that this record represents an event that occurred on 12 November 2010 where something occurred between a source IPv4 address and three destination IPv4 addresses. For example, three packets were sent, one from src_ipv4 to each dst_ipv4 address.

Example 2 - Two Multi-Valued Fields

    Event := Field(time, 2010-11-12T12:01:38-04:00), ...
             Field(src_ipv4, [127.0.0.1, 127.0.0.2]),
             Field(dst_ipv4, [10.0.0.1, 10.0.0.2, 10.0.0.3]).

With this record, the CEE multi-value rule states that each source IPv4 address talked to each destination address. If this event reflects packets being sent, then six (6) packets would have been transmitted (|time| * |src_ipv4| * |dst_ipv4| = 6).

Example 3 - More Than Two Multi-Valued Fields

    Event := Field(time, 2010-11-12T12:01:38-04:00), ...
             Field(src_ipv4, [127.0.0.1, 127.0.0.2]),
             Field(dst_ipv4, [10.0.0.1, 10.0.0.2, 10.0.0.3]),
             Field(out_bytes, [4, 12]).

One more multi-valued field was added to record the sizes of the data transmitted (out_bytes): 4 and 12 bytes.

Now, the event record reflects 12 packets (1*2*3*2=12) transmitted: a packet of each length was transmitted by each source IPv4 address to each destination IPv4 address. If this is not what the event producer intended, then multiple event records MUST be used.

Event consumers MUST NOT attempt to map a single value from one multi-valued field to a single value from another multi-valued field. E.g., in Example 3, a consumer must not associate the 127.0.0.1 src_ipv4 address with only the 10.0.0.1 dst_ipv4 address, the 127.0.0.2 src_ipv4 address with the 10.0.0.2 dst_ipv4 address, etc.

In order to avoid problems resulting from inconsistent interpretation of the CEE multi-value rule, event producers MAY limit each record to have only one multi-valued field.
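
The multi-value rule amounts to taking the Cartesian product of the value lists of all fields. The following sketch expands Example 3 into its 12 single-valued combinations; the function name and the in-memory record layout are assumptions made for illustration.

```python
from itertools import product

def expand(record: dict) -> list:
    """Expand a multi-valued record into every combination of its values."""
    names = list(record)
    value_lists = (record[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

# Example 3, with each field held as a list of values.
example3 = {
    "time": ["2010-11-12T12:01:38-04:00"],
    "src_ipv4": ["127.0.0.1", "127.0.0.2"],
    "dst_ipv4": ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    "out_bytes": [4, 12],
}

combinations = expand(example3)
assert len(combinations) == 1 * 2 * 3 * 2   # 12 combinations
print(combinations[0])
```

Every source address is paired with every destination address and every size; no positional pairing between the value lists is implied.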




11.  Acknowledgments

This specification is the product of the CEE Board and community. The CEE Board would like to thank their colleagues who reviewed drafts of this and previous documents. The CEE Board also gratefully acknowledges and appreciates the many contributions from individuals representing the public and private sectors, especially Rainer Gerhards, Eric Fitzgerald, Raffael Marty, Steve Grubb, Dr. Anton Chuvakin, Dominique Karg, and Peter Czanik.




12.  References




12.1. Normative References

[IEEE.754.1985] Institute of Electrical and Electronics Engineers, “Standard for Binary Floating-Point Arithmetic,” IEEE Standard 754, August 1985.
[IEEE.802.1990] Institute of Electrical and Electronics Engineers, “Local and Metropolitan Area Networks: Overview and Architecture,” IEEE Standard 802, 1990.
[ISO.8601.1988] International Organization for Standardization, “Data elements and interchange formats - Information interchange - Representation of dates and times,” ISO Standard 8601, June 1988.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[RFC3629] Yergeau, F., “UTF-8, a transformation format of ISO 10646,” STD 63, RFC 3629, November 2003.
[RFC4007] Deering, S., Haberman, B., Jinmei, T., Nordmark, E., and B. Zill, “IPv6 Scoped Address Architecture,” RFC 4007, March 2005.
[RFC4291] Hinden, R. and S. Deering, “IP Version 6 Addressing Architecture,” RFC 4291, February 2006.
[RFC4648] Josefsson, S., “The Base16, Base32, and Base64 Data Encodings,” RFC 4648, October 2006.
[RFC5234] Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” STD 68, RFC 5234, January 2008.
[UNICODE] The Unicode Consortium, “The Unicode Standard—Version 5.0,” 2007.



12.2. Informative References

[CEE.ARCH] CEE Board, “CEE Architecture Overview,” May 2010.
[CEE.Profile] CEE Board, “CEE Profile Specification,” June 2011.
[RFC4288] Freed, N. and J. Klensin, “Media Type Specifications and Registration Procedures,” BCP 13, RFC 4288, December 2005.



Author's Address

  William Heinbockel
  The MITRE Corporation
  202 Burlington Rd
  Bedford, MA 01730
  United States
Phone:  +1 781-271-2615
Email:  heinbockel@mitre.org