February 10, 2013 by
Defensive Construct Exchange Standard
This was originally posted on blogger here.
Why do we need something else? First, this is not a new SIEM. This is not ArcSight, it's not Splunk, it's not CIF or ELSA. This is not an information structure. It's not STIX, VERIS, or Mandiant IOCs. If anything, it's similar to TAXII or IDMEF. However, all of these approaches (and the many other existing approaches) have a primary flaw: they have structure. The fundamental issue is that no matter what tool we use, it will collect different data. We will have similar fields (URLs, IPs, etc.) from tool to tool, but each tool provides a slightly different construct with slightly different fields associated with it. This prevents all but the most general indexing tools (such as Splunk or ELSA) from importing data without an importer designed specifically for that data (such as an ArcSight connector).
Also, basically all tools (other than Paterva's Maltego) take a database approach to storing data. While this still allows searching data to match specific patterns (such as IP address), it is less efficient as linkages are implied only by the existence of the pattern in a row with other data. Passing data as records may hide linkages that could otherwise be uncovered.
- All discrete pieces of information within a construct will be given an individual node (in the graph theory sense). All nodes within the construct are a type of Attribute. The actual attribute value will be stored as a tuple within the node's "Metadata" attribute.
- All nodes will be linked to a node containing a construct ID generated by the construct originator. (It will be recommended that those linking constructs into their own graphs generate a local construct ID so as to avoid conflicts within their graph.)
- The construct ID will be a child node of all Attributes within the construct.
- The nodes and edges will be represented in JSON. They will be transmitted in accordance with the JSON format outlined by the Gephi graph streaming project. In practice, all constructs should be transmittable as a grouping of 'add node' and 'add edge' messages, with the recipient deciding how to actually handle the information.
- Attributes within the construct may have their own Attributes. (For example, a threat construct's location Attribute may have a 'confidence' Attribute. Attributes within the construct may also have a child Attribute representing a Classification such as "company proprietary", "PII", etc.) (Note that this is less a rule of the standard than an explicit flexibility.)
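As a rough illustration of the rules above, the following Python sketch builds a construct as a list of 'add node' and 'add edge' events. The helper names (`add_node`, `add_edge`, `build_construct`), the sample values, and the choice to join events with `\r` are assumptions made for illustration and are not part of the standard. The edge direction (construct node as source) simply follows the example JSON in this post.

```python
import json

def add_node(node_id, label, attr_type, value):
    """Build one 'add node' event for an Attribute node (hypothetical helper)."""
    return {"an": {node_id: {"label": label,
                             "Class": "Attribute",
                             "Metadata": {attr_type: value}}}}

def add_edge(edge_id, source, target):
    """Build one directed 'add edge' event (hypothetical helper)."""
    return {"ae": {edge_id: {"source": source, "target": target, "directed": True}}}

def build_construct(construct_id, attributes):
    """Return the events for a construct.

    attributes is a list of (node_id, label, value) tuples. Node "A" holds the
    construct ID, and every Attribute node is linked to it.
    """
    events = [add_node("A", "Construct", "ID", construct_id)]
    for n, (node_id, label, value) in enumerate(attributes, start=1):
        events.append(add_node(node_id, label, label, value))
        events.append(add_edge(str(n), "A", node_id))
    return events

# Sample values are made up for illustration.
events = build_construct("construct-001", [
    ("B", "URL", "http://example.com/phish"),
    ("C", "DOMAIN", "example.com"),
])

# Assumption: events are separated by a carriage return on the wire.
stream = "\r".join(json.dumps(event) for event in events)
```

The recipient is free to ignore the sender's node and edge IDs and store the information however it likes.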
What a great idea, Gabe! Can we see an example? The following construct is used as an example in the STIX format. In the STIX example, it represents a link within a phishing email. Using our new format, it could be represented as:
{"an":{"A":{"label":"Construct From X","Class":"Attribute","Metadata":{"ID":<value>}}}}
{"ae":{"1":{"source":"A","target":"B","directed":true}}}
{"ae":{"2":{"source":"A","target":"C","directed":true}}}
{"ae":{"3":{"source":"A","target":"D","directed":true}}}
{"ae":{"4":{"source":"D","target":"C","directed":true}}}
{"ae":{"5":{"source":"C","target":"B","directed":true}}}
{"ae":{"6":{"source":"A","target":"E","directed":true}}}
{"ae":{"7":{"source":"A","target":"F","directed":true}}}
{"ae":{"8":{"source":"A","target":"G","directed":true}}}
{"ae":{"9":{"source":"G","target":"F","directed":true}}}
{"ae":{"10":{"source":"F","target":"E","directed":true}}}
{"ae":{"11":{"source":"A","target":"H","directed":true}}}
{"ae":{"12":{"source":"A","target":"I","directed":true}}}
{"ae":{"13":{"source":"A","target":"J","directed":true}}}
{"ae":{"14":{"source":"J","target":"I","directed":true}}}
{"ae":{"15":{"source":"I","target":"H","directed":true}}}
{"an":{"B":{"label":"URL","Class":"Attribute","Metadata":{"URL":<value>}}}}
{"an":{"C":{"label":"DOMAIN","Class":"Attribute","Metadata":{"DOMAIN":<value>}}}}
{"an":{"D":{"label":"WHOIS","Class":"Attribute","Metadata":{"WHOIS":<value>}}}}
{"an":{"E":{"label":"DNS Query","Class":"Attribute","Metadata":{"DNS Query":<value>}}}}
{"an":{"F":{"label":"DNS Record","Class":"Attribute","Metadata":{"DNS Record":<value>}}}}
{"an":{"G":{"label":"DNS Record Type","Class":"Attribute","Metadata":{"Record Type":<value>}}}}
{"an":{"H":{"label":"DNS Query","Class":"Attribute","Metadata":{"DNS Query":<value2>}}}}
{"an":{"I":{"label":"DNS Record","Class":"Attribute","Metadata":{"DNS Record":<value2>}}}}
{"an":{"J":{"label":"DNS Record Type","Class":"Attribute","Metadata":{"Record Type":<value2>}}}}
How will this approach be used? In the most basic sense, two tools or groups exchanging information can simply use this to exchange standard formats (such as an IDMEF message). Alternatively, it could easily be stored in a database by tools such as Splunk or ELSA. However, neither of these approaches makes use of the strength of the format; they simply provide backwards compatibility with previous approaches and workflows.
10 comments captured from original post on Blogger
Gabe said on 2013-02-18
There is a concern that a graph may not know when the construct has ended. I think to deal with that, data which should be treated as a single construct can be sent together: {"an":{"A":{},"B":{}},"ae":{"1":{},"2":{}}} etc.
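A minimal sketch of that batching idea, assuming each individual event is a dictionary with a single operation key such as "an" or "ae". The function name `batch_events` and the sample events are hypothetical.

```python
import json

def batch_events(events):
    """Merge single events into one message so a construct arrives as a unit."""
    batch = {}
    for event in events:
        for operation, items in event.items():
            # Group all nodes under "an" and all edges under "ae".
            batch.setdefault(operation, {}).update(items)
    return batch

# Made-up events for illustration.
single_events = [
    {"an": {"A": {"label": "Construct From X"}}},
    {"an": {"B": {"label": "URL"}}},
    {"ae": {"1": {"source": "A", "target": "B", "directed": True}}},
]

message = batch_events(single_events)
wire_text = json.dumps(message)
```

Because the whole construct is one JSON object, the receiver knows the construct is complete as soon as the object parses.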
Gabe said on 2013-02-18
A good question is what should be returned upon adding. I think the whole construct should be returned. Some options are: 1. The entire construct as it is represented in the receiving data store. 2. A mapping of the sending node/edge IDs to the receiving ones. 3. The construct ID within the receiving data store.
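Option 2 could look roughly like the sketch below, assuming the construct arrives as one dictionary with "an" and "ae" keys and that the receiver assigns its own IDs. The function `import_construct`, the `local-N` ID scheme, and the sample data are all hypothetical.

```python
import itertools

def import_construct(batch, id_source):
    """Re-key a received construct with local IDs.

    Returns the locally keyed construct and a mapping of the sender's
    node/edge IDs to the receiver's IDs.
    """
    node_map = {sender_id: next(id_source) for sender_id in batch.get("an", {})}
    edge_map = {sender_id: next(id_source) for sender_id in batch.get("ae", {})}

    local = {"an": {}, "ae": {}}
    for sender_id, props in batch.get("an", {}).items():
        local["an"][node_map[sender_id]] = dict(props)
    for sender_id, props in batch.get("ae", {}).items():
        edge = dict(props)
        # Edge endpoints must point at the local node IDs, not the sender's.
        edge["source"] = node_map[edge["source"]]
        edge["target"] = node_map[edge["target"]]
        local["ae"][edge_map[sender_id]] = edge
    return local, {"nodes": node_map, "edges": edge_map}

# Simple counter-based ID generator, standing in for the data store's own IDs.
local_ids = (f"local-{n}" for n in itertools.count(1))

received = {
    "an": {"A": {"label": "Construct From X"}, "B": {"label": "URL"}},
    "ae": {"1": {"source": "A", "target": "B", "directed": True}},
}
stored, id_mapping = import_construct(received, local_ids)
```

Returning `id_mapping` lets the sender refer to the stored nodes later without the receiver having to keep the sender's IDs.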
Gabe said on 2013-02-18
I think I’m going to change it slightly to have the construct ID point to the attributes as the construct ID describes them, not the other way around.
Gabe said on 2013-02-25
I’m considering using the WAMP spec (http://wamp.ws/spec) to wrap the DCES events. WAMP should provide subscribe and publish capabilities as well as allowing extension to RPC capabilities.
Gabe said on 2013-02-27
To ensure the message is appropriately interpreted, I think the dictionary should include the following keys: "DCES_TYPE" and "DCES_VERSION". DCES_TYPE will predominantly be "GRAPH". The version will probably always be "1", but it never hurts to have it.
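One way this could look, as a sketch only: the two keys are added at the top level of the message dictionary and checked on receipt. The function names `wrap_message` and `check_message` and the placement of the keys at the top level are assumptions.

```python
import json

def wrap_message(payload, dces_type="GRAPH", dces_version="1"):
    """Return a copy of the payload with the two DCES keys added."""
    message = dict(payload)
    message["DCES_TYPE"] = dces_type
    message["DCES_VERSION"] = dces_version
    return message

def check_message(message, supported_versions=("1",)):
    """Raise ValueError if the DCES keys are missing or the version is unknown."""
    missing = [key for key in ("DCES_TYPE", "DCES_VERSION") if key not in message]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    if message["DCES_VERSION"] not in supported_versions:
        raise ValueError("unsupported version: " + str(message["DCES_VERSION"]))
    return message["DCES_TYPE"]

wrapped = wrap_message({"an": {"A": {"label": "Construct From X"}}})
received_type = check_message(json.loads(json.dumps(wrapped)))
```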
Gabe said on 2013-02-27
The DCES_TYPE may not be necessary, as the Cypher will be an RPC call rather than a pubsub call and the only other type is the graph update.
Gabe said on 2013-03-13
"Metadata" should probably not be a dictionary within the dictionary, as it will inherently be stored as a string in most cases. Consider making any metadata a property of the node/edge itself.
Gabe said on 2013-03-15
This doesn’t work as it makes it impossible to parse nodes with additional properties. "metadata" will be a required property for attributes and will contain a tuple of ("type", "value"). This should strike a balance between querying the graph for attributes and allowing flexibility in node properties.
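A sketch of that node layout, assuming the ("type", "value") tuple is serialized as a two-element JSON array, since JSON has no tuple type. The helper names `attribute_node` and `read_attribute` and the sample values are hypothetical.

```python
import json

def attribute_node(node_id, attr_type, value, **extra_properties):
    """Build an 'add node' event whose required "metadata" is (type, value)."""
    props = dict(extra_properties)
    # Required keys are set last so extra properties cannot overwrite them.
    props.update({"label": attr_type,
                  "Class": "Attribute",
                  "metadata": [attr_type, value]})
    return {"an": {node_id: props}}

def read_attribute(event):
    """Return (node_id, type, value) from a single 'add node' event."""
    ((node_id, props),) = event["an"].items()
    attr_type, value = props["metadata"]
    return node_id, attr_type, value

# Made-up example with one additional node property.
event = attribute_node("C", "DOMAIN", "example.com", confidence="high")
decoded = read_attribute(json.loads(json.dumps(event)))
```

Keeping the attribute in one fixed "metadata" property means a parser can always find it, while any other property on the node remains free-form.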