CEE Board                                                  W. Heinbockel
CEE Specification                                  The MITRE Corporation
                                                        October 18, 2011


                CEE Log Transport (CLT) Mapping: Syslog
                         draft-clt-syslog-06-1

Abstract

   This document defines a Common Event Expression (CEE) Log Transport
   (CLT) mapping for sending CEE Log Syntax (CLS) encoded events using
   Syslog. The mapping defined is generically applicable to all current
   versions of Syslog, supporting both the RFC3164 and RFC5424 Syslog
   formats.

Copyright Notice

   Copyright (c) 2011 The MITRE Corporation. All rights reserved.

Document License

   The MITRE Corporation (MITRE) hereby grants you a non-exclusive,
   royalty-free license to use CEE for research, development, and
   commercial purposes. Any copy you make for such purposes is
   authorized provided that you reproduce MITRE's copyright designation
   and this license in any such copy.

Disclaimers

   THIS DOCUMENT IS PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE NO
   REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
   LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
   PURPOSE, NON-INFRINGEMENT, OR TITLE; THAT THE CONTENTS OF THE
   DOCUMENT ARE SUITABLE FOR ANY PURPOSE; NOR THAT THE IMPLEMENTATION OF
   SUCH CONTENTS WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS,
   TRADEMARKS OR OTHER RIGHTS.

   COPYRIGHT HOLDERS WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT,
   SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THE
   DOCUMENT OR THE PERFORMANCE OR IMPLEMENTATION OF THE CONTENTS
   THEREOF.

   The name and trademarks of copyright holders may NOT be used in
   advertising or publicity pertaining to this document or its contents
   without specific, written prior permission. Title to copyright in
   this document will at all times remain with copyright holders.

Heinbockel                                                      [Page 1]

CEE Specification         CLT Mapping: Syslog               October 2011

Table of Contents

   1. Introduction
   2. Conventions Used in This Document
   3. Syslog
      3.1. Legacy Syslog
      3.2. RFC5424 Syslog
   4. CLT Mappings
   5. CLT Syslog Mapping
      5.1. Syslog Header
      5.2. Syslog Body
         5.2.1. Compatible CLS Encodings
         5.2.2. CEE Event Flag
         5.2.3. Character Encoding
   6. Transmission
   7. CEE-over-Syslog Examples
   8. Acknowledgements
   9. Normative References
   Appendix A. Full ABNF Grammar
   Author's Address

1. Introduction

   In order for events to be shared and collected, they need to be
   transported, typically using a network protocol. A majority of log
   management infrastructures and processes rely on the Syslog protocol
   to send event data. Since Syslog is the de facto standard in log
   transport protocols and it is supported by numerous products, CEE
   provides a way to send CEE Event data over Syslog.

   This document defines a standard process to encode CEE Events using a
   CEE Log Syntax (CLS) Encoding [CEE.CLS] and place the encoded event
   into a Syslog message.

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3. Syslog

   There exist several different formats for Syslog messages.
   Within this document, they will be separated into two groups: legacy
   and RFC5424. While the basic process for transmitting CEE Events is
   the same for both Syslog groups, there are some nuances surrounding
   the use of legacy Syslog of which implementers must be cognizant.

3.1. Legacy Syslog

   Most people are familiar with legacy Syslog. Legacy Syslog is
   characterized by its timestamp format missing the year and the event
   message consisting only of an unstructured line of text. The various
   formats of legacy Syslog messages are identified in [RFC3164].

   Besides the variance in message formats, not all legacy Syslog
   implementations can handle 8-bit character sets. That is, some
   implementations use only the lower 7 bits of each byte. This causes
   problems when trying to send binary data or extended character sets
   (e.g., extended ASCII, UTF-8) that rely on "8-bit clean" processing.

3.2. RFC5424 Syslog

   The Syslog specification was updated in March 2009 with the release
   of RFC 5424 [RFC5424]. This update brought many needed enhancements,
   including a standardized timestamp that includes the year, Unicode
   support (UTF-8), and a way of providing more structured content
   within Syslog messages.

   One of the lesser-recognized improvements was to make the entire
   Syslog protocol and RFC5424 implementations 8-bit clean. This step
   was necessary to enable full Unicode support.

   While RFC5424 Syslog is preferred over legacy versions, there are
   still many environments and platforms that have been built on top of
   legacy Syslog implementations. All Syslog implementations supporting
   CEE SHOULD be RFC5424-compatible.

4. CLT Mappings

   +---------+        +----------+          +-----------+
   |         |  data  |   CEE    |  encode  |   JSON    |
   |  Event  | -----> |  Event   | -------> |  Encoded  |
   |         |        |  Record  |          |   Event   |
   +---------+        +----------+          +-----------+
                                                  |
                 +-------------------+            |
                 |  Syslog message   |            |
                 | +---------------+ |            |
                 | |    Header     | |            |
                 | |~~~~~~~~~~~~~~~| |            |
                 | |     Body      | |            |
                 | | +-----------+ | |   embed    |
                 | | |   JSON    | | | <----------+
                 | | | CEE Event | | |
                 | | +-----------+ | |
                 | +---------------+ |
                 +-------------------+

                    Sending CEE Events over Syslog

5. CLT Syslog Mapping

   It is possible to use Syslog to transport CEE Events. To do this,
   the CEE Event must be encoded using a CLS Encoding compatible with
   the Syslog implementation (Section 3) and placed into the body of a
   new Syslog message.

   Each Syslog message MUST conform to the CEE CLS Specification
   [CEE.CLS] as well as the appropriate Syslog specification, such as
   [RFC3164] or [RFC5424]. This includes the complete Syslog header and
   content.

   This document defines a CLT mapping that is conformant to both CEE
   and Syslog specifications via the inclusion of a CLS JSON (JavaScript
   Object Notation) encoded event within a Syslog message. The
   resulting CLT Syslog message has the following ABNF (Augmented
   Backus-Naur Form) [RFC5234] definition:

      SYSLOG        = HEADER BODY
      HEADER        = PRI [VERSION SP] TIMESTAMP SP HOSTNAME SP
      BODY          = MSG [ CEE-EVENT ]

      PRI           = "<" PRIVAL ">"
      PRIVAL        = 1*3DIGIT ; range 0 .. 191
      VERSION       = NONZERO-DIGIT 0*2DIGIT

      MSG           = MSG-7BIT / MSG-8BIT
      MSG-7BIT      = *( SP / PRINTUSASCII )
      MSG-8BIT      = [BOM] UTF-8-STRING

      CEE-EVENT     = CEE-FLAG JSON-RECORD
                    ; CLS-JSON grammar defined in the
                    ; CLS JSON Encoding Specification
      CEE-FLAG      = "cee:" [SP] ; %x63.65.65.3A

      UTF-8-STRING  = *OCTET ; UTF-8 string, RFC 3629
                    ; MUST NOT contain the NUL %x00 character
      OCTET         = %x00-FF
      PRINTUSASCII  = %x21-7E
      ALPHANUM      = ALPHA / DIGIT
      SP            = %x20
      NONZERO-DIGIT = %x31-39
      DIGIT         = %x30 / NONZERO-DIGIT
      BOM           = %xEF.BB.BF

5.1. Syslog Header

   The standard Syslog header MUST be used. The actual formatting of
   the Syslog header is dependent on the Syslog protocol version and may
   vary based on the implementation. Regardless, these header values
   are for the Syslog protocol and are independent of CEE. The Syslog
   header values SHOULD NOT be used to add or modify any values within
   an enclosed CEE Event.

5.2. Syslog Body

   A CEE Event MUST be encoded using a Syslog-compatible CLS Encoding
   (Section 5.2.1). The encoded CEE Event is then placed into the
   content, or body, of a Syslog message.

5.2.1. Compatible CLS Encodings

   The CEE Event MUST be represented using a CLS Encoding. For
   compliance with this specification, the CEE Event MUST be represented
   using the CLS JSON Encoding, as defined in [CEE.CLS-JSON].

   In contrast to Section 2 of [RFC4627], the JSON encoding suitable for
   Syslog transport MUST NOT contain insignificant whitespace before or
   after any of the six JSON structural characters.

5.2.2. CEE Event Flag

   The beginning of the encoded CEE Event MUST be identified by the CEE
   Event Flag. Within Syslog, the CEE Event Flag is "cee:" (ABNF
   %x63.65.65.3A). The CEE Event Flag MAY be followed by one space
   (' ', ABNF %x20) character.

   A CLS JSON Encoded CEE Event MUST appear immediately following a CEE
   Event Flag. Other, non-CEE, non-JSON content MUST NOT appear in the
   Syslog body after a CEE Event Flag. A Syslog message MUST NOT
   contain more than one CEE Event Flag.

5.2.3. Character Encoding

   All CLS Encodings, including the CLS JSON Encoding, assume an 8-bit
   clean environment. Therefore, it is important that the event
   producer understand the Syslog format (Section 3) that will be used,
   especially whether the implementation is 7-bit or 8-bit clean.

   If the Syslog implementation is 8-bit clean, then the implementation
   supports UTF-8 [RFC3629] and no additional encodings are necessary
   beyond those required in [CEE.CLS-JSON].
   However, if the implementation is only 7-bit, then all characters not
   in the ASCII character set (ABNF %x00-7F) MUST be additionally
   escaped.

   When using a 7-bit Syslog implementation, or when 7-bit Syslog
   compatibility is required, all UTF-8 encoded characters that cannot
   be represented using the lower 7 bits of an 8-bit byte MUST be
   escaped. These characters SHOULD be escaped using the JSON escape
   sequences [RFC4627], especially the Unicode escape: '\u' followed by
   four (4) hexadecimal digits (ABNF %x5C.75 4HEXDIG).

6. Transmission

   It should be possible to transmit a Syslog message containing a CEE
   Event using any Syslog-based transport mechanism. Due to the
   potential priority or sensitivity of certain event records, it is
   recommended that the transmission protocol supply the necessary
   confidentiality and integrity measures for the event content and
   operating environment.

   The original Syslog protocol sent event records in plaintext over
   UDP. This does not provide any security controls. One option is to
   transmit the Syslog messages using Transport Layer Security, as
   specified in [RFC5425].

7. CEE-over-Syslog Examples

   Example 1 (RFC5424 Syslog)--Valid

   <165>1 2011-04-01T17:01:20Z 10.10.0.1 process - example-event-1
   cee:{"Event":{"id":"example-event-1",
   "time":"t|2011-04-01T17:00:00.123456789Z","action":
   "g|remove","status":"g|failed","p_sys_id":"host.example.com",
   "p_prod_id":"cpe:2.3:Vendor:Product:Version:*:*:*:*:*:*",
   "file_name":"example.txt","proc_dur":"d|PT.0014S","sess_id":
   "user1"}}

   A valid RFC5424 Syslog message that contains an embedded CLS JSON
   Encoded CEE Event. Notice that the field values contain a type
   designator and the JSON format complies with [CEE.CLS-JSON]. Also,
   note that the event message identifier "example-event-1" appears in
   both the Syslog header and the CEE Event content.
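
   As an illustration of Sections 5.2.1 through 5.2.3, the following
   Python sketch builds and reads a message laid out like Example 1. It
   is not part of the specification. The function names
   (encode_cee_body, rfc5424_message, extract_cee_event) are
   hypothetical, the header layout simply mirrors Example 1 with the
   process identifier fixed to "-", and no CEE field validation is
   attempted. It relies on Python's standard json module, whose
   ensure_ascii option produces the '\u' escapes that a 7-bit
   implementation needs.

```python
import json
from typing import Optional

CEE_FLAG = "cee:"  # CEE Event Flag (Section 5.2.2)


def encode_cee_body(event: dict, seven_bit: bool = True) -> str:
    """Return the CEE Event Flag followed by a minimal JSON record.

    Hypothetical helper: `event` holds the fields of one CEE Event.
    """
    record = json.dumps(
        {"Event": event},
        separators=(",", ":"),   # no whitespace around structural characters
        ensure_ascii=seven_bit,  # True: non-ASCII becomes \uXXXX escapes
    )
    return CEE_FLAG + record


def rfc5424_message(pri: int, timestamp: str, host: str, app: str,
                    msgid: str, event: dict, seven_bit: bool = True) -> str:
    """Assemble a message laid out like Example 1 (process id fixed to "-")."""
    if not 0 <= pri <= 191:
        raise ValueError("PRIVAL must be in the range 0..191")
    header = f"<{pri}>1 {timestamp} {host} {app} - {msgid}"
    return header + " " + encode_cee_body(event, seven_bit)


def extract_cee_event(message: str) -> Optional[dict]:
    """Return the fields of the embedded CEE Event, or None if no flag."""
    _, flag, rest = message.partition(CEE_FLAG)
    if not flag:
        return None
    if rest.startswith(" "):     # the flag MAY be followed by one space
        rest = rest[1:]
    return json.loads(rest)["Event"]


if __name__ == "__main__":
    msg = rfc5424_message(
        165, "2011-04-01T17:01:20Z", "10.10.0.1", "process",
        "example-event-1",
        {"id": "example-event-1", "file_name": "r\u00e9sum\u00e9.txt"},
    )
    print(msg)
    print(extract_cee_event(msg))
```

   Escaping by default keeps the output safe for legacy 7-bit relays;
   passing seven_bit=False leaves UTF-8 characters unescaped, which is
   only appropriate when every hop is known to be 8-bit clean.
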

   Example 2 (Legacy Syslog)--Valid

   <0>Apr 4 17:01:20 10.10.0.1 process[35]: cee:{"Event":{
   "id":"example-event-2","time":
   "2011-04-01T17:00:00.123456789Z","action":"download",
   "status":"success","p_sys_id":"host.example.com",
   "p_prod_id":"cpe:2.3:Vendor:Product:Version:*:*:*:*:*:*",
   "example_internal_id":10000,"proc_dur":"PT.0014S",
   "sess_id":12345,"file_name":"example.txt",
   "file_content":"b|RmlsZSBDb250ZW50Li4uAAo="}}

   A valid encoding of a CLS JSON Encoded CEE Event embedded within a
   legacy Syslog message. This event encoding does not use the JSON
   type designators.

   Example 3--Invalid

   <0>Apr 4 17:01:20 10.10.0.1 process[35]: cee: {
       "id" : "example-event-2",
       "time" : "2011-04-01T17:00:00.123456789Z",
       "action" : "download",
       "status" : "success",
       "p_sys_id" : "host.example.com",
       "p_prod_id" : "cpe:2.3:Vendor:Product:Version:*:*:*:*:*:*",
       "sess_id" : 12345,
       "file_name" : "example.txt",
       "ex_internal_id" : 10000,
       "proc_dur" : "PT.0014S",
       "file_content" : "RmlsZSBDb250ZW50Li4uAAo="
   }

   This event representation is invalid as it does not use the minimal
   JSON encoding; non-essential whitespace is left between JSON object
   structures. Additionally, the CEE event record is not encoded
   according to the CLS JSON Encoding declaration [CEE.CLS-JSON] as the
   top-level "Event" JSON object is absent.

   Example 4--Invalid

   <165>1 2011-04-01T17:01:20Z 10.10.0.1 process - example-event-4
   {"Event":{"action":"login",
   "time":"2011-04-01T17:00:00.123456789Z",
   "p_prod_id":"cpe:2.3:Vendor:Product:Version:*:*:*:*:*:*",
   "status":"success","p_sys_id":"10.10.1.2"}}

   This event representation is invalid as it is missing the CEE Event
   Flag, "cee:", to designate the start of the CEE content. Also, the
   JSON CEE Event encoding does not comply with the CLS JSON Encoding
   specification [CEE.CLS-JSON]: it is missing the "id" field.

8. Acknowledgements

   This specification is the product of the CEE Board and community.
   The CEE Board would like to thank their colleagues who reviewed
   drafts of this and previous documents. The CEE Board also gratefully
   acknowledges and appreciates the many contributions from individuals
   representing the public and private sectors, especially Rainer
   Gerhards, Eric Fitzgerald, Raffael Marty, Steve Grubb, Dr. Anton
   Chuvakin, Dominique Karg, Peter Czanik, and George Saylor.

9. Normative References

   [CEE.CLS]       CEE Board, "CEE Event Record and CLS Encoding
                   Specification", June 2011.

   [CEE.CLS-JSON]  CEE Board, "CEE Log Syntax (CLS) Encoding: JSON",
                   June 2011.

   [RFC2119]       Bradner, S., "Key words for use in RFCs to Indicate
                   Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3164]       Lonvick, C., "The BSD Syslog Protocol", RFC 3164,
                   August 2001.

   [RFC3629]       Yergeau, F., "UTF-8, a transformation format of
                   ISO 10646", STD 63, RFC 3629, November 2003.

   [RFC4627]       Crockford, D., "The application/json Media Type for
                   JavaScript Object Notation (JSON)", RFC 4627,
                   July 2006.

   [RFC5234]       Crocker, D. and P. Overell, "Augmented BNF for Syntax
                   Specifications: ABNF", STD 68, RFC 5234,
                   January 2008.

   [RFC5424]       Gerhards, R., "The Syslog Protocol", RFC 5424,
                   March 2009.

   [RFC5425]       Miao, F., Ma, Y., and J. Salowey, "Transport Layer
                   Security (TLS) Transport Mapping for Syslog",
                   RFC 5425, March 2009.

Appendix A. Full ABNF Grammar

   What follows is the full ABNF grammar for a Syslog message containing
   a CEE Event. Both legacy Syslog (based on [RFC3164]) and RFC5424
   Syslog ([RFC5424]) grammars are supported. The CEE Event MUST be
   encoded using the CLS JSON Encoding. The grammars for the CEE core
   types (e.g., string, integer, ipv4Address) are standardized and
   defined in [CEE.CLS].

   SYSLOG            = PRI ( LEGACY / RFC5424 ) CEE-EVENT

   PRI               = "<" PRIVAL ">"
   PRIVAL            = 1*3DIGIT ; range 0 .. 191

   CEE-EVENT         = CEE-FLAG CLS-JSON
   CEE-FLAG          = "cee:" [SP] ; %x63.65.65.3A

   LEGACY            = LEGACY-HEADER SP LEGACY-MSG
   LEGACY-HEADER     = LEGACY-TIMESTAMP SP SYSLOG-HOSTNAME
   LEGACY-TIMESTAMP  = LEGACY-MONTH LEGACY-DAY TIME
   LEGACY-MSG        = LEGACY-TAG MSG-7BIT
   LEGACY-TAG        = *32ALPHANUM

   RFC5424           = RFC5424-HEADER SP RFC5424-BODY
   RFC5424-HEADER    = RFC5424-VERSION SP RFC5424-TIMESTAMP SP
                       SYSLOG-HOSTNAME SP RFC5424-APPNAME SP
                       RFC5424-PROCID SP RFC5424-MSGID
   RFC5424-MSG       = MSG-8BIT
   RFC5424-TIMESTAMP = DATE "T" TIME [ MSEC ] [ TZOFFSET ]
   RFC5424-VERSION   = NONZERO-DIGIT 0*2DIGIT
   RFC5424-APPNAME   = EMPTY / 1*48PRINTUSASCII
   RFC5424-PROCID    = EMPTY / 1*128PRINTUSASCII
   RFC5424-MSGID     = EMPTY / 1*32PRINTUSASCII
   RFC5424-BODY      = RFC5424-STRDATA SP RFC5424-MSG
   RFC5424-STRDATA   = EMPTY / 1*RFC5424-SDELEM
   RFC5424-SDELEM    = "[" RFC5424-SDID *( SP RFC5424-SDPARAM ) "]"
   RFC5424-SDID      = SDNAME
   RFC5424-SDPARAM   = SDNAME "=" %x22 PARAM-VAL %x22
   PARAM-VAL         = UTF-8-STRING ; escape all '"', ']', '\'
   SDNAME            = 1*32PRINTUSASCII ; no '=', SP, ']', '"'

   EMPTY             = %x2D ; '-' hyphen

   MSG-7BIT          = *( SP / PRINTUSASCII )
   MSG-8BIT          = [BOM] UTF-8-STRING

   SYSLOG-HOSTNAME   = IPV4ADDR / IPV6ADDR / HOSTNAME
   HOSTNAME          = 1*255PRINTUSASCII
   IPV4ADDR          = 3( DEC-OCTET "." ) DEC-OCTET
   DEC-OCTET         = DIGIT                 ; 0-9
                     / %x31-39 DIGIT         ; 10-99
                     / "1" 2DIGIT            ; 100-199
                     / "2" %x30-34 DIGIT     ; 200-249
                     / "25" %x30-35          ; 250-255
   IPV6ADDR          =                            6( H16 ":" ) LS32
                     /                       "::" 5( H16 ":" ) LS32
                     / [               H16 ] "::" 4( H16 ":" ) LS32
                     / [ *1( H16 ":" ) H16 ] "::" 3( H16 ":" ) LS32
                     / [ *2( H16 ":" ) H16 ] "::" 2( H16 ":" ) LS32
                     / [ *3( H16 ":" ) H16 ] "::"    H16 ":"   LS32
                     / [ *4( H16 ":" ) H16 ] "::"              LS32
                     / [ *5( H16 ":" ) H16 ] "::"              H16
                     / [ *6( H16 ":" ) H16 ] "::"
   H16               = 1*4HEXDIG
   LS32              = ( H16 ":" H16 ) / ipv4Address

   DATE              = YEAR "-" MONTH "-" DAY
   YEAR              = 4DIGIT
   MONTH             = 2DIGIT
   DAY               = 2DIGIT
   TIME              = HOUR ":" MIN ":" SEC
   HOUR              = 2DIGIT
   MIN               = 2DIGIT
   SEC               = 2DIGIT
   MSEC              = "." 1*DIGIT
   TZOFFSET          = "Z" / ( ( "+" / "-" ) 2DIGIT ":" 2DIGIT )

   LEGACY-MONTH      = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun"
                     / "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec"
   LEGACY-DAY        = ( SP / DIGIT ) DIGIT

   CLS-JSON          = JSON-EVENT / JSON-LOG
   JSON-LOG          = arr-s JSON-EVENT *( nv-sep JSON-EVENT ) arr-e
   JSON-EVENT        = obj-s JSON-FIELDS obj-e

   ; JSON structures
   nv-sep            = "," ; JSON name-value separator
   n-sep             = ":" ; JSON name separator
   arr-s             = "[" ; JSON array begin
   arr-e             = "]" ; JSON array end
   obj-s             = "{" ; JSON object begin
   obj-e             = "}" ; JSON object end

   JSON-FIELDS       = JSON-FIELD *254( nv-sep JSON-FIELD )
   JSON-FIELD        = JSON-CORE-FIELD / JSON-NV-FIELD
   JSON-CORE-FIELD   = EVT-ID / EVT-ACTION / EVT-STATUS / EVT-TIME
                     / EVT-SYS / EVT-PROD
   EVT-ID            = %x22 "id" %x22 n-sep json-string
   EVT-TIME          = %x22 "time" %x22 n-sep json-timestamp
   EVT-SYS           = %x22 "p_sys_id" %x22 n-sep json-string
   EVT-PROD          = %x22 "p_prod_id" %x22 n-sep json-string
   EVT-ACTION        = %x22 "action" %x22 n-sep json-tag
   EVT-STATUS        = %x22 "status" %x22 n-sep json-tag

   JSON-NV-FIELD     = %x22 FIELD-NAME %x22 n-sep JSON-FIELD-VALUE
   JSON-FIELD-VALUE  = JSON-DATATYPE / arr-s [JSON-DATA-ARRAY] arr-e
   JSON-DATA-ARRAY   = JSON-DATATYPE *254( nv-sep JSON-DATATYPE )
   JSON-DATATYPE     = JSON-NATIVE-TYPE / JSON-STRING-TYPE
                     / JSON-NIL-VALUE

   ; string values cannot contain unescaped " \
   ; or ASCII control characters (%x00-1F)
   JSON-STRING-TYPE  = json-string / json-binary / json-tag
                     / json-timestamp / json-ipv4Address
                     / json-ipv6Address / json-macAddress
                     / json-duration
   JSON-NATIVE-TYPE  = json-boolean / json-integer / json-float
   JSON-NIL-VALUE    = arr-s arr-e ; []

   json-string       = %x22 ["s|"] string %x22
   json-binary       = %x22 ["b|"] BASE64-DATA %x22
   json-tag          = %x22 ["g|"] tag %x22
   json-timestamp    = %x22 ["t|"] timestamp %x22
   json-ipv4Address  = %x22 ["4|"] ipv4Address %x22
   json-ipv6Address  = %x22 ["6|"] ipv6Address %x22
   json-macAddress   = %x22 ["m|"] macAddress %x22
   json-duration     = %x22 ["d|"] duration %x22
   json-boolean      = boolean
   json-integer      = integer
   json-float        = float

   FIELD-NAME        = CEE-NAME
   CEE-NAME          = CEE-STARTCHAR *31CEE-CHAR
   CEE-STARTCHAR     = ALPHA / %x5F
   CEE-CHAR          = CEE-STARTCHAR / DIGIT

   UTF-8-STRING      = *OCTET ; UTF-8 string, RFC 3629
                     ; MUST NOT contain the NUL %x00 character
   OCTET             = %x00-FF
   PRINTUSASCII      = %x21-7E
   ALPHANUM          = ALPHA / DIGIT
   SP                = %x20
   NONZERO-DIGIT     = %x31-39
   DIGIT             = %x30 / NONZERO-DIGIT
   BOM               = %xEF.BB.BF

Author's Address

   William Heinbockel
   The MITRE Corporation
   202 Burlington Rd
   Bedford, MA 01730
   United States

   Phone: +1 781-271-2615
   Email: heinbockel@mitre.org