CEE Board                                                  W. Heinbockel
CEE Specification                                  The MITRE Corporation
                                                            July 8, 2011


            CEE Event Record and CLS Encoding Specification
                           draft-cee-cls-06-8

Abstract

This document describes the abstract format for Common Event
Expression (CEE) event records, which is designed for maximum
interoperability with existing event and interchange standards.  To
ensure compatibility with other encoding standards, CEE provides CEE
Log Syntax (CLS) Encodings.  Each CLS Encoding defines a mapping from
the CLS abstracted format to an encoding syntax, such as XML or JSON.

Copyright Notice

Copyright (c) 2011 The MITRE Corporation.  All rights reserved.

Document License

The MITRE Corporation (MITRE) hereby grants you a non-exclusive,
royalty-free license to use CEE for research, development, and
commercial purposes.  Any copy you make for such purposes is
authorized provided that you reproduce MITRE's copyright designation
and this license in any such copy.

Disclaimers

THIS DOCUMENT IS PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE NO
REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE, NON-INFRINGEMENT, OR TITLE; THAT THE CONTENTS OF THE DOCUMENT
ARE SUITABLE FOR ANY PURPOSE; NOR THAT THE IMPLEMENTATION OF SUCH
CONTENTS WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS,
TRADEMARKS OR OTHER RIGHTS.

COPYRIGHT HOLDERS WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL
OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THE DOCUMENT OR THE
PERFORMANCE OR IMPLEMENTATION OF THE CONTENTS THEREOF.

The name and trademarks of copyright holders may NOT be used in
advertising or publicity pertaining to this document or its contents
without specific, written prior permission.  Title to copyright in
this document will at all times remain with copyright holders.

Heinbockel                                                      [Page 1]

CEE Specification         CEE Event and Encoding               July 2011

Table of Contents

   1.  Introduction
   2.  Conventions Used in This Document
   3.  Definitions
   4.  Event Recording and Encoding Process
   5.  CEE Event Record
     5.1.  Field
       5.1.1.  Field Name
       5.1.2.  Field Value
       5.1.3.  Fields with No Values
       5.1.4.  Fields with More Than One Value
     5.2.  Value Types
       5.2.1.  string
       5.2.2.  binary
       5.2.3.  tag
       5.2.4.  integer
       5.2.5.  float
       5.2.6.  boolean
       5.2.7.  timestamp
       5.2.8.  duration
       5.2.9.  ipv4Address
       5.2.10. ipv6Address
       5.2.11. macAddress
     5.3.  Core Fields
       5.3.1.  id
       5.3.2.  time
       5.3.3.  action
       5.3.4.  status
       5.3.5.  p_sys_id
       5.3.6.  p_prod_id
   6.  Event Extensions
     6.1.  Event Augmentation
   7.  Event Logs
   8.  Event Identification
   9.  Event Transport
   10. Event Processing
     10.1.  Event, Field, and Value Ordering
     10.2.  Event Modification
     10.3.  Event Augmentation
     10.4.  Event and Field Limits
     10.5.  Fields with No Value
     10.6.  Multi-Value Rule
   11. Acknowledgments
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Author's Address

1.  Introduction

The goal of Common Event Expression (CEE) [CEE.ARCH] is to define a
standardized representation for an event record.  In order to
encourage use and improve compatibility with existing standards, CEE
event records are encoded using one or more Common Log Syntax (CLS)
Encodings.  Each CLS Encoding defines how a CEE event record is mapped
to a specific syntax encoding.

CEE and CLS enable the efficient and lossless storage, exchange, and
consumption of event records.  The CEE event record format was
designed for maximum compatibility and can be encoded to interoperate
with existing standards.  CLS Encodings are defined for representation
of CEE event records using XML and JSON.  Additional mappings may be
defined to enable CEE support for other syntaxes.

2.  Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

All ABNF [RFC5234] definitions assume that the CEE event record and
field data are presented using Unicode characters and encoded
according to UTF-8 [RFC3629].

3.  Definitions

o  An "event" is an observable occurrence within an IT environment,
   often associated with a change in state.

o  An "event field" describes one characteristic of an event.
   Examples of an event field include date, time, source IP address,
   user identification, and host identification.

o  An "event record" is a collection of event fields that, together,
   describe a single event.  Terms synonymous to event record include
   "audit record" and "log entry".  In CEE, an event record is
   conceptually recorded using the CEE record format defined in
   Section 5.

o  An "event log" is a collection of event records.  Terms such as
   "log," "data log," "activity log," "audit log," "audit trail," and
   "log file" are often used to mean the same thing as event log.

o  A "CLS Encoding" is a process that describes how an event record
   is translated into ("encode") and out of ("decode") an encoded
   format.  CLS Encodings allow for event records to be more easily
   shared and processed.

o  A "syntax declaration" is a formal document that defines a CLS
   Encoding process.

4.  Event Recording and Encoding Process

CEE makes a distinction between the event recording and the event
encoding process.  Once an event occurs that should be recorded, the
relevant data from that event is captured as an event record.  When
the event record is to be recorded in a log or shared, the CEE record
should be encoded according to one or more CLS Encodings.
In order for event consumers to consume CLS encoded event records, the
consumer must simply decode the CLS encoded record into the original
CEE event record.  The encoding and decoding of CEE event records
using a CLS Encoding MUST be able to be performed without data loss.

   +---------+        +----------+  encode  +-----------+
   |         |  data  |   CEE    | =======> |    CLS    |
   |  Event  | =====> |  Event   |          |  Encoded  |
   |         |        |  Record  | <======= |  Record   |
   +---------+        +----------+  decode  +-----------+

                      CLS Encoding Process

5.  CEE Event Record

A CEE event record consists of an event body and zero or more event
extensions.  The event body is a sequence of fields.  Each field is
made up of an identifying field name and zero or more field values.
The event extensions are optional and allow for things such as
augmentation of the original event and digital signatures.

Each syntax declaration MUST identify how the body of a CEE event
record is to be encoded and decoded.  Additionally, any syntax
declaration SHOULD support event extensions (Section 6) and the
representation of logs (Section 7), that is, collections of CEE event
records.

5.1.  Field

Each field represents a single property of an event.  CEE events have
two (2) different types of fields: name-value fields (NV-FIELD) and
core fields.  Name-value fields are specified at the time the event
record is created.  Core fields are fields that are shared by all CEE
event records and have been predefined within the scope of this CEE
event specification.

Name-value fields MUST have exactly one field name, and zero to 255
field values.  Each field name SHOULD correspond to exactly one field
definition in a corresponding CEE Profile [CEE.Profile].

Each CEE encoded event MUST include each of the core fields.  The
value of each core field MUST comply with that field's syntax
encoding.  See Section 5.3 for a listing of core fields and their
syntaxes.

5.1.1.  Field Name

The field name is an identifier to provide context to the field
values.  For name-value fields, the field name MUST be specified.
Field names for core fields MUST be able to be determined and MAY be
inferred through argument position or other encoding.

Each field name MUST start with a lower-case or upper-case ASCII
letter (ABNF ALPHA) or underscore ('_', ABNF %x5F) character, followed
by no more than 31 letter, digit (ABNF DIGIT), and underscore
characters:

   FIELD-NAME    = CEE-NAME
   CEE-NAME      = CEE-STARTCHAR *31CEE-CHAR
                   ; [A-Za-z_][A-Za-z0-9_]{0,31}
   CEE-STARTCHAR = ALPHA / %x5F
   CEE-CHAR      = CEE-STARTCHAR / DIGIT
   ALPHA         = %x41-5A / %x61-7A ; A-Za-z
   DIGIT         = %x30-39 ; 0-9

5.1.2.  Field Value

Each field value consists of a value type along with the value data.
The value type for each value indicates how the associated value data
MUST be encoded within each CLS syntax declaration.

Additionally, a syntax encoding MUST define how each VALUE-TYPE
defined in Section 5.2 is indicated.  If the value types are
explicitly defined, the encoding MUST specify how each type is
represented.  Otherwise, if the value types are implicit, instructions
MUST be provided on how an implementation is supposed to infer the
proper value type based on the encoding.

5.1.3.  Fields with No Values

Every field MUST have at least one value.  Fields that do not have a
value SHOULD be removed from the CEE event record when possible.
Otherwise, the field value MUST be the nil value.

There is no lexical representation of nil values defined in this
specification.  Each CLS syntax declaration MUST define how the nil
value is encoded.

5.1.4.  Fields with More Than One Value

A CLS encoding MUST be able to associate more than one value with a
single field.  The CLS syntax declaration MUST specify how to encode a
field that has more than one value.

5.2.  Value Types

CEE defines eleven (11) core value types: string, binary, integer,
float, boolean, timestamp, duration, ipv4Address, ipv6Address,
macAddress, and tag.

While the CLS component only recognizes these 11 value types, the CEE
Profile specification [CEE.Profile] allows for additional value types
to be defined and associated with certain fields.  These types can be
used by event consumers to help with processing or to provide further
validation for individual CEE Event field values.

Each syntax declaration MUST define a representation encoding for each
value type.  The value type encoding MUST specify how any value
corresponding to that value type MUST be encoded.  Additionally, the
encoding SHOULD specify the value type of the encoded value.

This section defines the lexical representations for each CEE value
type.  If a syntax encoding does not use a value type's provided
lexical representation, then the encoding MUST provide instructions on
how to convert the encoded value to and from the CEE lexical
representation.

5.2.1.  string

The string value type represents a sequence of zero or more Unicode
characters [UNICODE].  The NUL character (ABNF %x00) MUST NOT appear
as any character within an encoded CEE string value.

The syntax encoding specification MUST be able to handle a string
value containing any Unicode character (except NUL).  This may be done
with a Unicode encoding format, such as UTF-8 [RFC3629], or by using
another encoding, such as Base64 [RFC4648].

An encoding MAY declare reserved characters.  If reserved characters
are defined, the syntax mapping MUST provide an encoding or escaping
method to represent the literal value of the reserved character.

The lexical representation of a CEE string value MUST adhere to the
following ABNF [RFC5234] definition:

   string       = *UNICODE-CHAR
   UNICODE-CHAR = *OCTET ; NUL MUST NOT appear

5.2.2.  binary

A binary value is a series of zero or more octet values and is
intended to hold binary data such as files or byte streams.  It can be
used to hold values that contain non-Unicode data or NUL values.

Some syntax encodings may provide support for arbitrary sequences of
octets.  If the syntax encoding does not support native binary values,
it MUST allow for the binary value to be encoded in Base64 as
described in [RFC4648].

The lexical representation of a CEE binary value MUST adhere to the
following ABNF definition:

   binary = *OCTET

A CLS encoding may require that binary values be Base64 encoded.  If
so, the Base64 encoded value MUST meet the following ABNF definition:

   binary-b64   = *( 4B64 ) B64FINAL
   B64FINAL     = B64FINALQUAD / B64PADDED16 / B64PADDED8
   B64FINALQUAD = 3B64 B64CHAR
   B64PADDED16  = 2B64 B16 "="
   B64PADDED8   = B64 B04 "=" [SP] "="
   B64          = B64CHAR [SP]
   B64CHAR      = ALPHA / DIGIT / %x2B / %x2F ; [A-Za-z0-9+/]
   B16          = B16CHAR [SP]
   B16CHAR      = B04CHAR / "E" / "I" / "M" / "U" / "Y" / "c" / "k"
                / "o" / "s" / "0" / "4" / "8"
   B04          = B04CHAR [SP]
   B04CHAR      = "A" / "Q" / "g" / "w"

5.2.3.  tag

The tag field type is a special field type within CEE that is used to
include references to event tags, or keywords.  Each tag value SHOULD
correspond to a tag definition, whose format is defined within the CEE
Profile specification [CEE.Profile].

The format for a CEE tag value is identical to that of a CEE field
name (see Section 5.1.1).  A CEE tag value MUST begin with an ASCII
letter (ABNF ALPHA) or underscore ('_', ABNF %x5F), and be followed by
no more than 31 letter, digit (ABNF DIGIT), and underscore characters.

The lexical representation of a CEE tag value MUST adhere to the
following ABNF definition:

   tag = CEE-NAME

5.2.4.  integer

Integer values are a sequence of decimal digits (ABNF DIGIT) of values
%x30 through %x39 ("0" through "9").
An integer MUST be able to represent any arbitrary signed or unsigned
integer number.  If a leading positive ('+', ABNF %x2B) or negative
('-', ABNF %x2D) sign is not supplied, the integer MUST be assumed to
represent a positive value.

The value of a CEE integer MUST be representable within a 64-bit
signed integer.  That is, the minimum integer that can be represented
in a CEE integer type is -(2^63), or -9,223,372,036,854,775,808.  The
maximum CEE integer value is 2^63-1, or 9,223,372,036,854,775,807.  If
the numeric value exceeds these limitations, then it should be
expressed in the record as a CEE string type.

The lexical representation of a CEE integer value MUST adhere to the
following ABNF definition:

   integer = [ "+" / "-" ] 1*19DIGIT
             ; MUST be in the range [-(2^63), 2^63-1]

5.2.5.  float

A float value is a representation of an arbitrary precision, finite,
real number of the form: i*10^n; where i and n are integers,
-(2^53) < i < 2^53, and -1075 < n < 970.

Due to the various precisions and encodings of floating point
numerals, CEE float values MUST be able to be represented within a
64-bit binary IEEE 754 [IEEE.754.1985] ("double precision") encoded
number.

A syntax encoding MAY specify alternative floating point value
encodings with other widths, precisions, or optimizations for better
space or processing efficiency.  If alternative floating-point numeral
encodings are specified, a conforming system MAY alter the encoded
value and precision into a 64-bit IEEE 754 format.

When performing any comparison or arithmetic operations on CEE float
values, each float value MAY be converted into a 64-bit IEEE 754
encoded floating-point numeral, and the operations performed as
defined by IEEE 754 [IEEE.754.1985].
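
As an illustration only, the numeric range rules of Sections 5.2.4 and
5.2.5 might be checked as in the following Python sketch.  The function
names are hypothetical and not part of CEE.  The float check assumes
that the Python float type is an IEEE 754 double, as it is on common
platforms, and it tests only for overflow and non-numeric input; it
does not verify conformance to the float lexical grammar.

```python
import math
import re

# Lexical form from Section 5.2.4: optional sign, then 1 to 19 decimal digits.
_INTEGER_RE = re.compile(r"[+-]?[0-9]{1,19}")

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def is_cee_integer(text: str) -> bool:
    """Return True if text matches the integer ABNF and fits in 64 signed bits."""
    if _INTEGER_RE.fullmatch(text) is None:
        return False
    return INT64_MIN <= int(text) <= INT64_MAX


def fits_in_double(text: str) -> bool:
    """Return True if text converts to a finite double (overflow check only)."""
    try:
        value = float(text)
    except ValueError:
        return False
    return math.isfinite(value)
```

Under these assumptions, is_cee_integer("9223372036854775808") is
False because the value equals 2^63, one above the maximum, and such a
value would be carried as a CEE string instead.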
The lexical representation of a CEE float value MUST adhere to the
following ABNF definition:

   float          = [ "+" / "-" ] DECIMAL-PT-NUM [DEC-SCI-NUM]
                    ; MUST fit within an IEEE 754 64-bit floating point
   DECIMAL-PT-NUM = DEC-NUM-PT-NUM / DEC-PT-NUM
   DEC-NUM-PT-NUM = 1*DIGIT [ "." *DIGIT ]
   DEC-PT-NUM     = "." 1*DIGIT
   DEC-SCI-NUM    = [ "e" / "E" ] 1*DIGIT

5.2.6.  boolean

A boolean value supports Boolean, or binary-valued, logic and can only
represent the values of "true" and "false".

If a syntax encoding defines boolean value representations other than
"true" (ABNF %x74 %x72 %x75 %x65) and "false" (ABNF %x66 %x61 %x6C %x73
%x65), the encoding specification MUST define how each alternative
representation maps to either "true" or "false".

The lexical representation of a CEE boolean value MUST adhere to the
following ABNF definition:

   boolean = "true"    ; %x74 %x72 %x75 %x65
           / "false"   ; %x66 %x61 %x6C %x73 %x65

5.2.7.  timestamp

The CEE timestamp value type allows for the specification of date,
time, and timezone information as defined in [ISO.8601.1988].

Syntax encodings MUST support the specification of timestamp values up
to nanosecond (1E-9) granularity.  Timestamp values having precision
finer than 1 nanosecond MAY be rounded or truncated to no more than
nanosecond granularity.

The lexical representation of a CEE timestamp value MUST consist of no
more than 36 characters and adhere to the following ABNF definition:

   timestamp      = DATE "T" TIME ; MUST be less than 36 chars long
   DATE           = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
   DATE-FULLYEAR  = 4DIGIT
   DATE-MONTH     = 2DIGIT ; 01-12
   DATE-MDAY      = 2DIGIT ; 01-31
   TIME           = TIMESPEC-BASE [TIME-FRACTION] [TIME-ZONE]
   TIMESPEC-BASE  = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
   TIME-HOUR      = 2DIGIT ; 00-23
   TIME-MINUTE    = 2DIGIT ; 00-59
   TIME-SECOND    = 2DIGIT ; 00-59, 00-60 based on leap seconds
   TIME-FRACTION  = "." 1*DIGIT
   TIME-NUMOFFSET = ( "+" / "-" ) TIME-HOUR ":" TIME-MINUTE
   TIME-ZONE      = "Z" / TIME-NUMOFFSET

5.2.8.  duration

The CEE duration value type supports values that represent a
consecutive period of time in a format compliant with [ISO.8601.1988].

Additionally, a syntax encoding MUST support duration values having up
to nanosecond (1E-9) granularity.  Duration values with precision
finer than 1 nanosecond MAY be rounded or truncated to no more than
nanosecond granularity.

The lexical representation of a CEE duration value MUST consist of no
more than 36 characters and adhere to the following ABNF definition:

   duration     = "P" DAYTIME-DUR ; MUST be less than 36 chars long
   DAYTIME-DUR  = DUR-DATE / DUR-TIME
   DUR-SECOND   = 1*DIGIT [DUR-FRACTION] "S"
   DUR-FRACTION = "." 1*DIGIT
   DUR-MINUTE   = 1*DIGIT "M" [DUR-SECOND]
   DUR-HOUR     = 1*DIGIT "H" [DUR-MINUTE]
   DUR-TIME     = "T" (DUR-HOUR / DUR-MINUTE / DUR-SECOND)
   DUR-DAY      = 1*DIGIT "D"
   DUR-DATE     = DUR-DAY [DUR-TIME]

5.2.9.  ipv4Address

The ipv4Address field type holds a single network address within an
IPv4 network.  ipv4Address values are represented using dotted-decimal
notation.

The lexical representation of a CEE ipv4Address value MUST adhere to
the following ABNF definition:

   ipv4Address = 3( DEC-OCTET "." ) DEC-OCTET
   DEC-OCTET   = DIGIT               ; 0-9
               / %x31-39 DIGIT       ; 10-99
               / "1" 2DIGIT          ; 100-199
               / "2" %x30-34 DIGIT   ; 200-249
               / "25" %x30-35        ; 250-255

5.2.10.  ipv6Address

The ipv6Address field type holds a single network address within an
IPv6 network.
The lexical representation of a CEE ipv6Address value MUST adhere to a
valid IPv6 address representation format as defined in [RFC4007] and
[RFC4291]:

   ipv6Address = IPv6Literal [ IPv6ZoneID ]
   IPv6Literal =                            6( H16 ":" ) LS32
               /                       "::" 5( H16 ":" ) LS32
               / [               H16 ] "::" 4( H16 ":" ) LS32
               / [ *1( H16 ":" ) H16 ] "::" 3( H16 ":" ) LS32
               / [ *2( H16 ":" ) H16 ] "::" 2( H16 ":" ) LS32
               / [ *3( H16 ":" ) H16 ] "::"    H16 ":"   LS32
               / [ *4( H16 ":" ) H16 ] "::"              LS32
               / [ *5( H16 ":" ) H16 ] "::"              H16
               / [ *6( H16 ":" ) H16 ] "::"
   H16         = 1*4HEXDIG
   LS32        = ( H16 ":" H16 ) / ipv4Address
   IPv6ZoneID  = "%" 1*ZoneIDChar ; SHOULD be a non-negative number
   ZoneIDChar  = %x21-24 / %x26-7E ; MUST NOT contain '%'

Examples of valid CEE ipv6Address values include:

o  2001:0db8:85a3:0000:0000:8a2e:0370:7334

o  FE80:0:0:0:202:B3FF:FE1E:8329

o  fe80::202:b3ff:fe1e:8329%1

o  ::ffff:10.0.0.1

o  ::1

5.2.11.  macAddress

The CEE macAddress field type represents Media Access Control (MAC)
address values that are the physical network address of a network
interface.

The lexical representation of a CEE macAddress value MUST adhere to a
valid MAC address representation format as defined in
[IEEE.802.1990]:

   macAddress = 5( 2HEXDIG ":" ) 2HEXDIG

5.3.  Core Fields

Many event records contain a similar set of core fields.  To make
these fields easy to use and process, many of these fields have been
predefined.  Each core field MUST be specified within every CEE event
record.

There are six (6) core fields that MUST be supported by each syntax
encoding: id, time, action, status, p_sys_id, and p_prod_id.

There may be cases where one or more core fields within an event
record do not have a value.  The syntax declaration MUST define how
these situations are to be handled.  The recommended method is to use
the representation for the nil value from Section 5.1.3.
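
A consumer might verify the presence of the six core fields named
above with a check such as the following Python sketch.  It assumes,
purely for illustration, that a decoded event record is held as a
dictionary mapping field names to lists of values; this in-memory
representation and the function name are hypothetical and are not
defined by CEE.

```python
# The six core fields named in Section 5.3.
CORE_FIELDS = ("id", "time", "action", "status", "p_sys_id", "p_prod_id")


def missing_core_fields(record: dict) -> list:
    """Return the core field names that are absent from the record."""
    return [name for name in CORE_FIELDS if name not in record]
```

In this sketch a core field that carries only a nil value still counts
as present, since the check looks at field names and not at values.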
   +-----------+------------+
   | Field     | Field Type |
   +-----------+------------+
   | id        | string     |
   | time      | timestamp  |
   | action    | tag        |
   | status    | tag        |
   | p_sys_id  | string     |
   | p_prod_id | string     |
   +-----------+------------+

            Table 1: Core Fields

Each syntax encoding MUST specify how each core field is encoded.  A
syntax encoding MUST support every core field.

5.3.1.  id

The "id" core field MUST contain one CEE string value that represents
the event type.  The "id" value SHOULD be unique within the context of
the producing source, application, or device.

The "id" value is intended to identify semantically similar event
records.  It MAY be used to find or filter events, or used by an event
consumer to apply additional event-based processing or inferences.

5.3.2.  time

The "time" field MUST have a single CEE timestamp value that indicates
the date, time, and timezone offset at which the event began.  If the
event was not instantaneous, then additional fields should be used to
indicate the duration of the event.

5.3.3.  action

The "action" field MUST contain one CEE tag value that indicates the
primary action of the event.  The tag value SHOULD correspond to a CEE
action Tag definition within a CEE Profile document.  If the event
action is unknown, then the "action" should be listed as "unknown".

5.3.4.  status

The "status" field MUST contain one CEE tag value that indicates the
status associated with the "action" field.  That is, the "status"
field value is the current status of the event's primary action.  The
tag value SHOULD correspond to a CEE status Tag definition within a
CEE Profile document.  If the event status is unknown, then the
"status" should be listed as "unknown".

5.3.5.  p_sys_id

The "p_sys_id" field MUST have one value that provides an identifier
for the host system that produced the event.  The p_sys_id value MAY
be an IP address or hostname.
The value of a p_sys_id field MUST be of a CEE string type.

5.3.6.  p_prod_id

The core "p_prod_id" field MUST have one value that is a CEE string
type.  The value MUST be an identifier for the product producing the
event.  The p_prod_id value is a producer-assigned value.  Examples of
p_prod_id values include a product or driver identifier.

6.  Event Extensions

Event extensions allow for supplementary information to be supplied in
a CEE event record along with the original event body.  Extensions
provide a straightforward way of implementing features such as digital
signatures and non-destructive event augmentation.

Extensions are optional within a CEE event record and MUST be provided
after the event body.  Zero or more extensions may be included with an
event record.

Each extension is defined below, as part of this specification.  Other
extensions SHOULD NOT be used.  A CLS Encoding declaration SHOULD
support the concept of event extensions, but does not have to support
every extension.

6.1.  Event Augmentation

Event augmentations allow for already encoded CEE event records to be
modified while preserving the integrity of the original event.
Instead of directly modifying the event data, an event processing
device may choose to append an event augmentation section onto the CEE
event record.  An event record may have zero or more augmentation
sections.

In order to provide proper traceability, each event augmentation
section MUST indicate the order in which the augmentation was
performed.  Additionally, each augmentation section MUST provide the
time when the augmentation was created ("time", Section 5.3.2), as
well as an identifier for the product ("p_prod_id", Section 5.3.6) and
host ("p_sys_id", Section 5.3.5) producing the augmentation.  These
fields are defined as part of the CEE core fields (Section 5.3).
The remainder of the augmentation section MUST consist of zero or more
fields that indicate the added fields and field values.

7.  Event Logs

An event log is simply an ordered collection of CEE event records.
Syntax encodings MAY define a specific encoding to represent a log of
zero or more encoded event records.

8.  Event Identification

Each syntax encoding SHOULD define a MIME type ([RFC4288]) to allow
consumers to correctly identify the CEE event record encoding syntax.

The MIME type SHOULD be an "application" type with a subtype of
"cee+ENCODING", where "ENCODING" is the common short name for the
underlying syntax encoding.  For example, "application/cee+json" would
identify a JSON encoded CEE event record, while "application/cee+xml"
is an XML encoded record.

9.  Event Transport

This specification does not define or require the use of any specific
transport or protocol for transferring CEE events.  A syntax encoding
MAY define or suggest compatible protocols, but MUST NOT mandate the
use of any specific protocol.

10.  Event Processing

The following are processing rules that dictate requirements such as
CLS encoded event size limits, event record preservation, and how to
process fields with multiple values.

10.1.  Event, Field, and Value Ordering

In some cases, the ordering of the CEE events, fields, and values may
be significant.  Intermediate systems MUST be able to maintain the
ordering of fields within an event record and MUST preserve the
ordering of sequential event records when forwarding or re-sending
those records.

10.2.  Event Modification

Some environments require that the original event record be preserved.
To support such environments, event consumers of CEE event records
MUST be able to preserve the entire contents of an event record.  This
requires that no fields or values be modified, added, removed, or
reordered.

10.3.  Event Augmentation

When processing a CEE event record containing event augmentation
sections, the event body MUST be processed first.  Each augmentation
section SHOULD be processed individually and in the order that the
augmentations were added.  Each non-core field in an augmentation
section SHOULD be added to the content of the event body and any
previously processed augmentation sections.  If the field already
exists, the field values SHOULD be added to the existing field values.

10.4.  Event and Field Limits

In order to help with interoperability, CEE sets limitations on the
sizes of encoded event record values.  These limits are for the entire
event record, including any event extensions.

Event producers and consumers compatible with the CLS specifications
must be able to handle encoded event messages within the following
limits:

   +----------------------------+--------+
   | Name                       | Size   |
   +----------------------------+--------+
   | Event Record Size          | 64 KB  |
   | Field Value Size           | 2 KB   |
   | Number of Fields           | 255    |
   | Number of Values per Field | 255    |
   +----------------------------+--------+

             Table 2: Event Record Limits

These limits were chosen to be large enough to handle most event
records produced by modern systems.  However, they are not large
enough to allow arbitrary data to be added to an event.  For example,
large objects such as files and data streams SHOULD NOT be placed
within an event record.  Instead, such objects SHOULD be referenced
within the event record.

Note that these limitations are independent of the encoding used and
include any encoding overhead and all field and value delimiters.  For
example, a non-ASCII character represented using UTF-8 [RFC3629] can
consume up to four times as much space as an ASCII character.  If
Unicode characters are encoded using UTF-32, then every character,
including ASCII characters, will take up exactly 32 bits (4 octets).
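
The effect of the character encoding on the field value limit can be
illustrated with a short Python sketch.  It assumes that "2 KB" means
2048 octets and that the limit applies to the encoded octet length of
a value; the function names are hypothetical, and the sketch ignores
the delimiter and encoding overhead that a real CLS Encoding would
add.

```python
FIELD_VALUE_LIMIT = 2 * 1024  # assumption: "2 KB" is read as 2048 octets


def encoded_value_size(value: str, encoding: str = "utf-8") -> int:
    """Return the number of octets the value occupies in the given encoding."""
    return len(value.encode(encoding))


def value_within_limit(value: str, encoding: str = "utf-8") -> bool:
    """Check one field value against the assumed 2048-octet limit."""
    return encoded_value_size(value, encoding) <= FIELD_VALUE_LIMIT
```

For instance, a value of 600 ASCII characters occupies 600 octets in
UTF-8 but 2400 octets in UTF-32 (using Python's "utf-32-be" codec,
which omits the byte order mark), so under these assumptions it fits
the limit in the first encoding and exceeds it in the second.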
In order to make efficient use of the allotted space, the event
producer SHOULD consider both the CLS encoding properties as well as
the efficiency of the Unicode character encoding for the event data.

10.5.  Fields with No Value

This specification does not define a null value or field type to
indicate when a field has no value.  A CLS syntax mapping MAY define
ways to encode a field with an empty or missing value.

For the cases when events are being recorded and the event has a field
with no associated value, two potential solutions will work with every
CEE compatible device.  In most cases, it is acceptable to just not
include that field instance in the event record.  The other option is
to set the field to be an empty field value.

10.6.  Multi-Value Rule

A field with multiple values MUST be treated as the combination of all
values of all fields within an event record.

Example 1 - One Multi-Valued Field

   Event := Field(time, 2010-11-12T12:01:38-04:00),
            ...
            Field(src_ipv4, 127.0.0.1),
            Field(dst_ipv4, [10.0.0.1, 10.0.0.2, 10.0.0.3]).

The CEE Multi-Value Rule states that this record reflects "time" *
"src_ipv4" * "dst_ipv4".  This means that this record represents an
event that occurred on 12 November 2010 where something occurred
between a source IPv4 address and three destination IPv4 addresses;
for example, three packets were sent, one from src_ipv4 to each
dst_ipv4 address.

Example 2 - Two Multi-Valued Fields

   Event := Field(time, 2010-11-12T12:01:38-04:00),
            ...
            Field(src_ipv4, [127.0.0.1, 127.0.0.2]),
            Field(dst_ipv4, [10.0.0.1, 10.0.0.2, 10.0.0.3]).

With this record, the CEE multi-value rule states that each source
IPv4 address talked to each destination address.  If this event
reflects packets being sent, then six (6) packets would have been
transmitted (|time| * |src_ipv4| * |dst_ipv4| = 1*2*3 = 6).

Example 3 - More Than Two Multi-Valued Fields

   Event := Field(time, 2010-11-12T12:01:38-04:00),
            ...
            Field(src_ipv4, [127.0.0.1, 127.0.0.2]),
            Field(dst_ipv4, [10.0.0.1, 10.0.0.2, 10.0.0.3]),
            Field(out_bytes, [4, 12]).

One more multi-valued field, out_bytes, was added to record the size
of the data transmitted: 4 and 12 bytes.  Now, the event record
reflects 12 packets (1*2*3*2=12) transmitted: a packet of each length
was transmitted by each source IPv4 address to each destination IPv4
address.  If this is not what the event producer intended, then
multiple event records MUST be used.

Event consumers MUST NOT attempt to map a single value from one multi-
valued field to a single value from another multi-valued field.  E.g.,
in Example 3, a consumer must not associate the 127.0.0.1 src_ipv4
address with only the 10.0.0.1 dst_ipv4 address, the 127.0.0.2
src_ipv4 address with the 10.0.0.2 dst_ipv4 address, etc.

In order to avoid problems resulting from inconsistent interpretation
of the CEE multi-value rule, event producers MAY limit each record to
have only one multi-valued field.

11.  Acknowledgments

This specification is the product of the CEE Board and community.  The
CEE Board would like to thank their colleagues who reviewed drafts of
this and previous documents.  The CEE Board also gratefully
acknowledges and appreciates the many contributions from individuals
representing the public and private sectors, especially Rainer
Gerhards, Eric Fitzgerald, Raffael Marty, Steve Grubb, Dr. Anton
Chuvakin, Dominique Karg, and Peter Czanik.

12.  References

12.1.  Normative References

[IEEE.754.1985]
           Institute of Electrical and Electronics Engineers,
           "Standard for Binary Floating-Point Arithmetic", IEEE
           Standard 754, August 1985.

[IEEE.802.1990]
           Institute of Electrical and Electronics Engineers, "Local
           and Metropolitan Area Networks: Overview and Architecture",
           IEEE Standard 802, 1990.

[ISO.8601.1988]
           International Organization for Standardization, "Data
           elements and interchange formats - Information interchange
           - Representation of dates and times", ISO Standard 8601,
           June 1988.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO 10646",
           STD 63, RFC 3629, November 2003.

[RFC4007]  Deering, S., Haberman, B., Jinmei, T., Nordmark, E., and B.
           Zill, "IPv6 Scoped Address Architecture", RFC 4007, March
           2005.

[RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
           Architecture", RFC 4291, February 2006.

[RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
           Encodings", RFC 4648, October 2006.

[RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
           Specifications: ABNF", STD 68, RFC 5234, January 2008.

[UNICODE]  The Unicode Consortium, "The Unicode Standard--Version
           5.0", 2007.

12.2.  Informative References

[CEE.ARCH]
           CEE Board, "CEE Architecture Overview", May 2010.

[CEE.Profile]
           CEE Board, "CEE Profile Specification", June 2011.

[RFC4288]  Freed, N. and J. Klensin, "Media Type Specifications and
           Registration Procedures", BCP 13, RFC 4288, December 2005.

Author's Address

William Heinbockel
The MITRE Corporation
202 Burlington Rd
Bedford, MA  01730
United States

Phone: +1 781-271-2615
Email: heinbockel@mitre.org