Thank you for posting the mock data emulation. Clearly the app developers did not implement self-evident semantics, and they should be cursed for it. (Not just for Splunk's sake, but for every other developer's sanity.) If you have any influence on the developers, demand that they change the JSON structure to something like

{
"browser_id": "0123456",
"browsers": {
"fullName": "blahblah",
"name": "blahblah",
"state": 0,
"lastResult": {
"success": 1,
"failed": 2,
"skipped": 3,
"total": 4,
"totalTime": 5,
"netTime": 6,
"error": true,
"disconnected": true
},
"launchId": 7
},
"result": [
{
"id": 8,
"description": "blahblah",
"suite": [
"blahblah",
"blahblah"
],
"fullName": "blahblah",
"success": true,
"skipped": true,
"time": 9,
"log": [
"blahblah",
"blahblah"
]
}
],
"summary": {
"success": 10,
"failed": 11,
"error": true,
"disconnected": true,
"exitCode": 12
}
}

That is, isolate browser_id into its own key that applies to browsers, result, and summary alike. The structure you shared cannot meaningfully express more than one browser_id per event anyway. But if for some bizarre reason the browser_id must also be passed along in result, because summary is not associated with browser_id in each event, say so expressly with a JSON key, like

{
"browsers": {
"id": "0123456",
"fullName": "blahblah",
"name": "blahblah",
"state": 0,
"lastResult": {
"success": 1,
"failed": 2,
"skipped": 3,
"total": 4,
"totalTime": 5,
"netTime": 6,
"error": true,
"disconnected": true
},
"launchId": 7
},
"result": {
"id": "0123456",
"output": [
{
"id": 8,
"description": "blahblah",
"suite": [
"blahblah",
"blahblah"
],
"fullName": "blahblah",
"success": true,
"skipped": true,
"time": 9,
"log": [
"blahblah",
"blahblah"
]
}
]
},
"summary": {
"success": 10,
"failed": 11,
"error": true,
"disconnected": true,
"exitCode": 12
}
}

Embedding data in a JSON key is the worst use of JSON, or of any structured data. (I mean, I recently lamented worse offenders, but imagine embedding data in a column name in SQL! The developer would be cursed by the entire world.)

This said, if your developer has a gun to your head, or they are a third party that you have no control over, you can SANitize their data, i.e., make the structure saner using SPL. But remember: A bad structure is bad not because a programming language has difficulty with it. A bad structure is bad because downstream developers cannot determine the actual semantics without reading the original manual. Do you have their manual to understand what each structure means? If not, you are very likely to misrepresent their intention and therefore get the wrong result.

Caveat: As we are speaking of semantics, I want to point out that your illustration uses the plural "browsers" as one key name and the singular "result" as another, yet the value of (plural) "browsers" is not an array, while the value of (singular) "result" is an array. If this is not the true structure, you have changed the semantics your developers intended, and the following may lead to wrong output.

Secondly, your illustrated data has a level 1 key of "0123456" in browsers, an identical level 1 key of "0123456" in result, a matching level 2 id of "0123456" in browsers, and a different level 2 id of "8" in result. I assume that all matching numbers are semantically identical and all non-matching numbers are semantically different.

Here, I will give you SPL that interprets their intention as in my first illustration, i.e., a single browser_id applies to the entire event. I will assume that you have Splunk 9 or above so that fromjson works. (This can also be solved using spath, with slightly more cumbersome quotation manipulation.)

Here is the code to detangle the semantic madness. This code does not require the first line, fields _raw, but including it can help eliminate distractions.

| fields _raw ``` to eliminate unusable fields from bad structure ```
| fromjson _raw
| eval browser_id = json_keys(browsers), result_id = json_keys(result)
| eval EVERYTHING_BAD = if(browser_id != result_id OR mvcount(json_array_to_mv(browser_id)) > 1, "baaaaad", null())
| foreach browser_id mode=json_array
[eval browsers = json_delete(json_extract(browsers, <<ITEM>>), "id"),
result = json_extract(result, <<ITEM>>)]
| spath input=browsers
| spath input=result path={} output=result
| mvexpand result
| spath input=result
| spath input=summary
| fields - _* result_id browsers result summary

This is the output based on your mock data; to illustrate the result[] array, I added one more mock element. (Multivalue cells are shown with their values separated by commas.)

browser_id | description | disconnected | error | exitCode | failed | fullName | id | lastResult.disconnected | lastResult.error | lastResult.failed | lastResult.netTime | lastResult.skipped | lastResult.success | lastResult.total | lastResult.totalTime | launchId | log{} | name | skipped | state | success | suite{} | time
["0123456"] | blahblah | true | true | 12 | 11 | blahblah, blahblah | 8 | true | true | 2 | 6 | 3 | 1 | 4 | 5 | 7 | blahblah, blahblah | blahblah | true | 0 | true, 10 | blahblah, blahblah | 9
["0123456"] | blahblah 9 | true | true | 12 | 11 | blahblah, blahblah9 | 9 | true | true | 2 | 6 | 3 | 1 | 4 | 5 | 7 | blahblah 9a, blahblah 9b | blahblah | true | 0 | true, 10 | blahblah9a, blahblah9b | 11

In the table, "id" is from result[].

This is the emulation of the expanded mock data. Here, I decided not to use format=json, both because a plain eval preserves the pretty-print format and because Splunk does not show fromjson-style fields automatically. (With real data, fromjson-style fields are not used in 9.x.)

| makeresults
| eval _raw ="
{
\"browsers\": {
\"0123456\": {
\"id\": \"0123456\",
\"fullName\": \"blahblah\",
\"name\": \"blahblah\",
\"state\": 0,
\"lastResult\": {
\"success\": 1,
\"failed\": 2,
\"skipped\": 3,
\"total\": 4,
\"totalTime\": 5,
\"netTime\": 6,
\"error\": true,
\"disconnected\": true
},
\"launchId\": 7
}
},
\"result\": {
\"0123456\": [
{
\"id\": 8,
\"description\": \"blahblah\",
\"suite\": [
\"blahblah\",
\"blahblah\"
],
\"fullName\": \"blahblah\",
\"success\": true,
\"skipped\": true,
\"time\": 9,
\"log\": [
\"blahblah\",
\"blahblah\"
]
},
{
\"id\": 9,
\"description\": \"blahblah 9\",
\"suite\": [
\"blahblah9a\",
\"blahblah9b\"
],
\"fullName\": \"blahblah9\",
\"success\": true,
\"skipped\": true,
\"time\": 11,
\"log\": [
\"blahblah 9a\",
\"blahblah 9b\"
]
}
]
},
\"summary\": {
\"success\": 10,
\"failed\": 11,
\"error\": true,
\"disconnected\": true,
\"exitCode\": 12
}
}
"
| spath
``` the above partially emulates
index="github_runners" sourcetype="testing" source="reports-tests"
```
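
A side note on the browser_id check: json_keys returns a JSON array encoded as a string, which is why browser_id shows up as ["0123456"] in the output. If you want to count the keys, here is a minimal sketch, assuming Splunk 9 or above so that json_array_to_mv is available. The two-browser object in mock_browsers and the field names string_count and key_count are made up purely for illustration; they are not from your data.

```spl
| makeresults
| eval mock_browsers = "{\"0123456\": {\"name\": \"blahblah\"}, \"0123457\": {\"name\": \"blahblah\"}}"
| eval browser_id = json_keys(mock_browsers)
| eval string_count = mvcount(browser_id)
| eval key_count = mvcount(json_array_to_mv(browser_id))
| table browser_id string_count key_count
```

Because browser_id is a single string, string_count should stay at 1 no matter how many keys there are, whereas key_count should reflect the two keys in the mock object.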