The goal of this section is to show how the lightweight and portable JData annotations can conveniently represent advanced data structures, including complex-valued arrays, sparse arrays, tables, trees, linked lists and graphs. The notation used in this section is the same as that used in the basic examples.
Complex numbers are supported in many scientific programming languages, such as MATLAB, Python and R, but there is no standardized format for storing complex-valued data and sharing it among these languages. The portable JData annotation system provides a convenient representation of complex-valued data that makes such sharing possible.
A complex-valued data record must be stored using the "annotated array format" shown in the basic examples. This is achieved via the presence of the _ArrayIsComplex_ keyword and the serialization of the complex values in the _ArrayData_ construct in the order of [[serialized real-part values], [serialized imag-part values]].
Native data | text-JData/JSON form | binary-JData(UBJSON) | |
---|---|---|---|
a=10.0+6.0j | => | { | [{] |
a=[ | => | { | [{] |
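The [[real], [imag]] ordering described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the helper names `encode_complex_array` and `decode_complex_array` are hypothetical, and the `_ArrayType_` and `_ArraySize_` keywords are assumed to follow the annotated array format of the basic examples, which are not reproduced in this section.

```python
import json

def encode_complex_array(values, array_type="double"):
    """Encode a flat list of complex numbers as an annotated array (sketch)."""
    return {
        "_ArrayType_": array_type,        # assumed keyword from the basic examples
        "_ArraySize_": [1, len(values)],  # assumed keyword; stored as a row vector
        "_ArrayIsComplex_": True,
        "_ArrayData_": [
            [v.real for v in values],     # serialized real-part values
            [v.imag for v in values],     # serialized imag-part values
        ],
    }

def decode_complex_array(record):
    """Rebuild the list of complex numbers from the two serialized rows."""
    real, imag = record["_ArrayData_"]
    return [complex(r, i) for r, i in zip(real, imag)]

record = encode_complex_array([10.0 + 6.0j, 1.0 - 2.0j])
text = json.dumps(record)                 # plain JSON, readable by any JSON parser
restored = decode_complex_array(json.loads(text))
```

Because the record contains only JSON numbers, arrays and booleans, it survives a round trip through any standard JSON library without a complex-number type.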
Sparse arrays are an important data type widely used in computational models. A small number of scientific programming languages, such as MATLAB, provide built-in support for sparse arrays; others support them via third-party libraries, such as scipy.sparse for Python, BLAS for C/C++, LAPACK for Fortran, etc.
As in the complex-valued case, a sparse array must be stored using the "annotated array format" shown in the basic examples. This is achieved via the presence of the _ArrayIsSparse_ keyword and the serialization of the non-zero elements and their indices in the _ArrayData_ construct in the order of [[first index i1], [second index i2], ... [last index iN], [serialized non-zero values]].
Native data | text-JData/JSON form | binary-JData(UBJSON) | |
---|---|---|---|
a=sparse(5,4); | => | { | [{] |
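The index-then-value ordering for sparse arrays can be sketched as below. The helper names are hypothetical, `_ArrayType_` and `_ArraySize_` are assumed from the basic examples, and the `index_base=1` default (MATLAB-style indices) is an assumption that should be checked against the specification.

```python
def encode_sparse(entries, shape, array_type="double", index_base=1):
    """entries maps zero-based (row, col) tuples to non-zero values (sketch)."""
    keys = sorted(entries)
    rows = [r + index_base for r, _ in keys]   # first index i1
    cols = [c + index_base for _, c in keys]   # second index i2
    vals = [entries[k] for k in keys]          # serialized non-zero values
    return {
        "_ArrayType_": array_type,             # assumed keyword
        "_ArraySize_": list(shape),            # assumed keyword
        "_ArrayIsSparse_": True,
        "_ArrayData_": [rows, cols, vals],
    }

def decode_sparse(record, index_base=1):
    """Return a dict of zero-based index tuples to values."""
    *indices, vals = record["_ArrayData_"]     # last row holds the values
    return {
        tuple(i - index_base for i in idx): v
        for idx, v in zip(zip(*indices), vals)
    }

# Hypothetical 5x4 array with three non-zero elements.
sparse_record = encode_sparse({(0, 0): 2.0, (1, 2): 9.0, (3, 1): 7.0}, (5, 4))
```

The decoder unpacks every row except the last as an index list, so the same function handles arrays with more than two dimensions.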
By combining the _ArrayIsComplex_ and _ArrayIsSparse_ tags, JData annotations can also represent complex-valued sparse arrays. The _ArrayData_ construct stores the non-zero values in the order of [[first index i1], [second index i2], ... [last index iN], [serialized real-part of non-zero values], [serialized imag-part of non-zero values]].
Native data | text-JData/JSON form | binary-JData(UBJSON) | |
---|---|---|---|
a=sparse(5,4); | => | { | [{] |
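When both tags are present, the last two rows of _ArrayData_ hold the real and imaginary parts. A hedged decoding sketch follows; the function name, the sample record and the 1-based index convention are illustrative assumptions, not values taken from the specification.

```python
def decode_complex_sparse(record, index_base=1):
    """Decode a record carrying both _ArrayIsSparse_ and _ArrayIsComplex_."""
    if not (record.get("_ArrayIsSparse_") and record.get("_ArrayIsComplex_")):
        raise ValueError("record is not a complex-valued sparse array")
    *indices, real, imag = record["_ArrayData_"]   # indices first, then re, im
    return {
        tuple(i - index_base for i in idx): complex(r, m)
        for idx, r, m in zip(zip(*indices), real, imag)
    }

# Hypothetical record: two non-zero complex entries in a 5x4 array.
cs_record = {
    "_ArrayType_": "double",      # assumed keyword from the basic examples
    "_ArraySize_": [5, 4],        # assumed keyword from the basic examples
    "_ArrayIsSparse_": True,
    "_ArrayIsComplex_": True,
    "_ArrayData_": [[1, 4], [1, 2], [2.0, 7.0], [-1.0, 0.5]],
}
cs_values = decode_complex_sparse(cs_record)
```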
The lightweight data annotation mechanism provided by the JData specification also enables efficient storage of special matrices, such as diagonal, triangular and Toeplitz matrices. This is enabled via the _ArrayShape_ keyword. For example:
Native data | text-JData/JSON form | binary-JData(UBJSON) | |
---|---|---|---|
diagonal matrix a=diag([1.0,1.0,2.1,2.2,5.3]) | => | { | [{] |
upper triangular matrix 11 12 13 14 15 | => | { | [{] |
symmetric matrix- store upper 11 12 13 14 15 | => | { | [{] |
real Toeplitz matrix 11 12 13 0 0 | => | { | [{] |
complex diagonal matrix a=diag([1+2j,4-j,5j,10,12+7j]) | => | { | [{] |
In the JData specification, we represent a table/spreadsheet/database data structure using three keywords: _TableRows_, _TableCols_ and _TableRecords_. One can also specify a data type for each column of the table.
Native data | text-JData/JSON form | binary-JData(UBJSON) | |
---|---|---|---|
a table without row-name
Name Age Degree Height ---- ------- ------ ------ Andy 21 BS 69.2 William 21 MS 71.0 Om 22 BS 67.1 | => |
{ "_TableCols_": ["Name", "Age", "Degree", "Height"], "_TableRows_": [], "_TableRecords_": [ ["Andy", 21, "BS", 69.2], ["William", 21, "MS", 71.0], ["Om", 22, "BS", 67.1] ] } | [{] [U][11][_TableCols_] [[] [S][U][4][Name] [S][U][3][Age] [S][U][6][Degree] [S][U][6][Height] []] [U][11][_TableRows_] [[] []] [U][14][_TableRecords_] [[] [[] [S][U][4][Andy] [U][21] [S][U][2][BS] [d][69.2] []] [[] [S][U][7][William] [U][21] [S][U][2][MS] [d][71.0] []] [[] [S][U][2][Om] [U][22] [S][U][2][BS] [d][67.1] []] []] [}] |
specifying column data types | => |
{ "_TableCols_": [ {"DataName":"Name", "DataType":"string" }, {"DataName":"Age", "DataType":"int32" }, {"DataName":"Degree", "DataType":"string" }, {"DataName":"Height", "DataType":"single" } ], "_TableRows_": [], "_TableRecords_": [ ["Andy", 21, "BS", 69.2], ["William", 21, "MS", 71.0], ["Om", 22, "BS", 67.1] ] } | [{] [U][11][_TableCols_] [[] [{] [S][U][8][DataName] [S][U][4][Name] [S][U][8][DataType] [S][U][6][string] [}] [{] [S][U][8][DataName] [S][U][3][Age] [S][U][8][DataType] [S][U][5][int32] [}] [{] [S][U][8][DataName] [S][U][6][Degree] [S][U][8][DataType] [S][U][6][string] [}] [{] [S][U][8][DataName] [S][U][6][Height] [S][U][8][DataType] [S][U][6][single] [}] []] [U][11][_TableRows_] [[] []] [U][14][_TableRecords_] [[] [[] [S][U][4][Andy] [U][21] [S][U][2][BS] [d][69.2] []] [[] [S][U][7][William] [U][21] [S][U][2][MS] [d][71.0] []] [[] [S][U][2][Om] [U][22] [S][U][2][BS] [d][67.1] []] []] [}] |
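The two table layouts above (plain column names, or column descriptors with `DataName` and `DataType`) can be produced and queried with a small helper. This is a sketch under the assumption that records are stored row by row as shown; `encode_table` and `column` are hypothetical names.

```python
def encode_table(columns, records, row_names=None, dtypes=None):
    """Build a JData-style table record from column names and row tuples."""
    if dtypes is None:
        cols = list(columns)
    else:
        cols = [{"DataName": n, "DataType": t} for n, t in zip(columns, dtypes)]
    return {
        "_TableCols_": cols,
        "_TableRows_": list(row_names or []),   # empty when rows are unnamed
        "_TableRecords_": [list(r) for r in records],
    }

def column(table, name):
    """Extract one column by name, accepting either column layout."""
    names = [c["DataName"] if isinstance(c, dict) else c
             for c in table["_TableCols_"]]
    j = names.index(name)
    return [rec[j] for rec in table["_TableRecords_"]]

table = encode_table(
    ["Name", "Age", "Degree", "Height"],
    [("Andy", 21, "BS", 69.2), ("William", 21, "MS", 71.0), ("Om", 22, "BS", 67.1)],
    dtypes=["string", "int32", "string", "single"],
)
ages = column(table, "Age")
```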
A tree contains a hierarchical dataset that is similar to a structure, except that each tree node may carry its own data payload in addition to its hierarchical information. In the JData specification, we use nested _TreeNode_ and _TreeChildren_ constructs to represent such a data structure. Alternatively, a tree-like data structure can be converted to a struct, with the nodal data stored in the metadata record of the struct node.
Native data | text-JData/JSON form | binary-JData(UBJSON) | |
---|---|---|---|
a tree data structure
root={id:0,data:10.1} ├── node1={id:1,data:2.5} ├── node2={id:2,data:100} │ ├── node2.1={id:3,data:9} │ └── node2.2={id:4,data:20.1} └── node3={id:5,data:-9.0} | => |
{ "_TreeNode_(root)": {"id":0,"data":10.1}, "_TreeChildren_": [ {"_TreeNode_(node1)": {"id":1,"data":2.5} }, { "_TreeNode_(node2)": {"id":2,"data":100}, "_TreeChildren_": [ {"_TreeNode_(node2.1)": {"id":3,"data":9} }, {"_TreeNode_(node2.2)": {"id":4,"data":20.1} } ] }, {"_TreeNode_(node3)": {"id":5,"data":-9.0} } ] } | [{] [U][16][_TreeNode_(root)] [{] [U][2][id] [l][0] [U][4][data] [d][10.1] [}] [U][14][_TreeChildren_] [[] [{] [U][17][_TreeNode_(node1)] [{] [U][2][id] [l][1] [U][4][data] [d][2.5] [}] [}] [{] [U][17][_TreeNode_(node2)] [{] [U][2][id] [l][2] [U][4][data] [d][100] [}] [U][14][_TreeChildren_] [[] [{] [U][19][_TreeNode_(node2.1)] [{] [U][2][id] [l][3] [U][4][data] [d][9] [}] [}] [{] [U][19][_TreeNode_(node2.2)] [{] [U][2][id] [l][4] [U][4][data] [d][20.1] [}] [}] []] [}] [{] [U][17][_TreeNode_(node3)] [{] [U][2][id] [l][5] [U][4][data] [d][-9.0] [}] [}] []] [}] |
converting the above tree into a struct with metadata (_DataInfo_), only needed when a tree node contains both data and children. | => |
{ "root":{ "_DataInfo_": { "_TreeNode_":{ "id":0,"data":10.1 } }, "node1": {"id":1,"data":2.5}, "node2": { "_DataInfo_":{ "_TreeNode_":{ "id":2,"data":100 } }, "node2.1": {"id":3,"data":9}, "node2.2": {"id":4,"data":20.1} }, "node3":{"id":5,"data":-9.0} } } | [{] [U][4][root] [{] [U][10][_DataInfo_] [{] [U][10][_TreeNode_] [{] [U][2][id] [l][0] [U][4][data] [d][10.1] [}] [}] [U][5][node1] [{] [U][2][id] [l][1] [U][4][data] [d][2.5] [}] [U][5][node2] [{] [U][10][_DataInfo_] [{] [U][10][_TreeNode_] [{] [U][2][id] [l][2] [U][4][data] [d][100] [}] [}] [U][7][node2.1] [{] [U][2][id] [l][3] [U][4][data] [d][9] [}] [U][7][node2.2] [{] [U][2][id] [l][4] [U][4][data] [d][20.1] [}] [}] [U][5][node3] [{] [U][2][id] [l][5] [U][4][data] [d][-9.0] [}] [}] [}] |
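A reader of the nested _TreeNode_/_TreeChildren_ form can recover the hierarchy with a short recursive walk. The sketch below assumes node names are carried in parentheses after `_TreeNode_`, as in the example rows; `walk_tree` is a hypothetical helper name.

```python
PREFIX = "_TreeNode_("

def walk_tree(node, depth=0):
    """Yield (depth, name, payload) for every node, in pre-order."""
    for key, payload in node.items():
        if key.startswith(PREFIX) and key.endswith(")"):
            yield depth, key[len(PREFIX):-1], payload
    for child in node.get("_TreeChildren_", []):
        yield from walk_tree(child, depth + 1)

tree = {
    "_TreeNode_(root)": {"id": 0, "data": 10.1},
    "_TreeChildren_": [
        {"_TreeNode_(node1)": {"id": 1, "data": 2.5}},
        {
            "_TreeNode_(node2)": {"id": 2, "data": 100},
            "_TreeChildren_": [
                {"_TreeNode_(node2.1)": {"id": 3, "data": 9}},
                {"_TreeNode_(node2.2)": {"id": 4, "data": 20.1}},
            ],
        },
        {"_TreeNode_(node3)": {"id": 5, "data": -9.0}},
    ],
}

nodes = list(walk_tree(tree))
for depth, name, payload in nodes:
    print("  " * depth + name, payload)
```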
A linked list contains a linear chain of singly or doubly connected data records. Each element along this chain contains its data payload, in addition to the topological data, namely its next and/or prior neighbor. In the JData specification, we use three dedicated annotations, _ListNode_, _ListNext_ and _ListPrior_, to represent a singly- or doubly-linked list data structure.
Native data | text-JData/JSON form | binary-JData(UBJSON) | |
---|---|---|---|
a doubly-linked list
head ={id:0,data:10.1} ⇵ node1={id:1,data:2.5} ⇵ node2={id:2,data:100} ⇵ node3={id:3,data:9} ⇵ node4={id:4,data:20.1} ⇵ tail ={id:5,data:-9.0} | => |
[ { "_ListNode_(head)": {"id":0,"data":10.1}, "_ListNext_": "node1", "_ListPrior_": null }, { "_ListNode_(node1)": {"id":1,"data":2.5}, "_ListNext_": "node2", "_ListPrior_": "head" }, { "_ListNode_(node2)": {"id":2,"data":100}, "_ListNext_": "node3", "_ListPrior_": "node1" }, { "_ListNode_(node3)": {"id":3,"data":9}, "_ListNext_": "node4", "_ListPrior_": "node2" }, { "_ListNode_(node4)": {"id":4,"data":20.1}, "_ListNext_": "tail", "_ListPrior_": "node3" }, { "_ListNode_(tail)": {"id":5,"data":-9.0}, "_ListNext_": null, "_ListPrior_": "node4" } ] | [[] [{] [U][16][_ListNode_(head)] [{] [U][2][id] [l][0] [U][4][data] [d][10.1] [}] [U][10][_ListNext_] [S][U][5][node1] [U][11][_ListPrior_] [Z] [}] [{] [U][17][_ListNode_(node1)] [{] [U][2][id] [l][1] [U][4][data] [d][2.5] [}] [U][10][_ListNext_] [S][U][5][node2] [U][11][_ListPrior_] [S][U][4][head] [}] [{] [U][17][_ListNode_(node2)] [{] [U][2][id] [l][2] [U][4][data] [d][100] [}] [U][10][_ListNext_] [S][U][5][node3] [U][11][_ListPrior_] [S][U][5][node1] [}] [{] [U][17][_ListNode_(node3)] [{] [U][2][id] [l][3] [U][4][data] [d][9] [}] [U][10][_ListNext_] [S][U][5][node4] [U][11][_ListPrior_] [S][U][5][node2] [}] [{] [U][17][_ListNode_(node4)] [{] [U][2][id] [l][4] [U][4][data] [d][20.1] [}] [U][10][_ListNext_] [S][U][4][tail] [U][11][_ListPrior_] [S][U][5][node3] [}] [{] [U][16][_ListNode_(tail)] [{] [U][2][id] [l][5] [U][4][data] [d][-9.0] [}] [U][10][_ListNext_] [Z] [U][11][_ListPrior_] [S][U][5][node4] [}] []] |
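Because each record names its neighbors explicitly, the stored array order need not match the chain order. The sketch below follows the _ListNext_ links from a starting node; `traverse` is a hypothetical helper, and treating a JSON null as the end of the chain is an assumption based on the example above.

```python
PREFIX = "_ListNode_("

def traverse(nodes, start="head"):
    """Follow _ListNext_ links from `start`; return [(name, payload), ...]."""
    by_name = {}
    for node in nodes:
        for key, payload in node.items():
            if key.startswith(PREFIX) and key.endswith(")"):
                by_name[key[len(PREFIX):-1]] = (payload, node.get("_ListNext_"))
    order, name, seen = [], start, set()
    while name is not None and name not in seen:   # stop at null or on a cycle
        seen.add(name)
        payload, next_name = by_name[name]
        order.append((name, payload))
        name = next_name
    return order

# Records deliberately stored out of order to show that links drive traversal.
linked = [
    {"_ListNode_(tail)": {"id": 2, "data": -9.0},
     "_ListNext_": None, "_ListPrior_": "node1"},
    {"_ListNode_(head)": {"id": 0, "data": 10.1},
     "_ListNext_": "node1", "_ListPrior_": None},
    {"_ListNode_(node1)": {"id": 1, "data": 2.5},
     "_ListNext_": "tail", "_ListPrior_": "head"},
]
chain = traverse(linked)
```

The `seen` set guards against malformed input in which the links form a loop.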
A graph is often considered the most general and flexible data structure, and is widely used in advanced data processing methods. A graph is made of an interconnected set of data records (or nodes), with data payloads attached to the nodes and/or edges.
In the JData specification, we use two special keywords, _GraphNodes_ and _GraphEdges_ (or _GraphEdges0_ to signify undirected edges) to encapsulate the topological data as well as the data payloads carried by a graph data structure.
Native data | text-JData/JSON form | binary-JData(UBJSON) | |
---|---|---|---|
a directed graph object
head ={id:0,data:10.1} ⇓ e1 ┌─node1={id:1,data:2.5} │ ⇓ e2 │ node2={id:2,data:100}─┐ │ ⇓ e3 │ └➝node3={id:3,data:9} e7│ e6 ⇓ e4 │ node4={id:4,data:20.1}↲ ⇓ e5 tail ={id:5,data:-9.0} | => |
{ "_GraphNodes_":{ "head": {"id":0,"data":10.1}, "node1":{"id":1,"data":2.5 }, "node2":{"id":2,"data":100 }, "node3":{"id":3,"data":9 }, "node4":{"id":4,"data":20.1}, "tail": {"id":5,"data":-9.0} }, "_GraphEdges_":[ ["head", "node1","e1"], ["node1","node2","e2"], ["node2","node3","e3"], ["node3","node4","e4"], ["node4","tail", "e5"], ["node1","node3","e6"], ["node2","node4","e7"] ] } | [{] [U][12][_GraphNodes_] [{] [U][4][head] [{] [U][2][id] [l][0] [U][4][data] [d][10.1] [}] [U][5][node1] [{] [U][2][id] [l][1] [U][4][data] [d][2.5] [}] [U][5][node2] [{] [U][2][id] [l][2] [U][4][data] [d][100] [}] [U][5][node3] [{] [U][2][id] [l][3] [U][4][data] [d][9] [}] [U][5][node4] [{] [U][2][id] [l][4] [U][4][data] [d][20.1] [}] [U][4][tail] [{] [U][2][id] [l][5] [U][4][data] [d][-9.0] [}] [}] [U][12][_GraphEdges_] [[] [[] [S][U][4][head] [S][U][5][node1] [S][U][2][e1] []] [[] [S][U][5][node1] [S][U][5][node2] [S][U][2][e2] []] [[] [S][U][5][node2] [S][U][5][node3] [S][U][2][e3] []] [[] [S][U][5][node3] [S][U][5][node4] [S][U][2][e4] []] [[] [S][U][5][node4] [S][U][4][tail] [S][U][2][e5] []] [[] [S][U][5][node1] [S][U][5][node3] [S][U][2][e6] []] [[] [S][U][5][node2] [S][U][5][node4] [S][U][2][e7] []] []] [}] |
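A graph record of this form converts readily into an adjacency list for processing. In the sketch below, `adjacency` is a hypothetical helper; it assumes each edge is stored as [source, target, payload] as in the example, and that edges under _GraphEdges0_ are undirected and therefore added in both directions.

```python
def adjacency(graph):
    """Map each node name to the list of node names reachable in one step."""
    adj = {name: [] for name in graph["_GraphNodes_"]}
    for src, dst, *_ in graph.get("_GraphEdges_", []):    # directed edges
        adj[src].append(dst)
    for a, b, *_ in graph.get("_GraphEdges0_", []):       # undirected edges
        adj[a].append(b)
        adj[b].append(a)
    return adj

graph = {
    "_GraphNodes_": {
        "head": {"id": 0, "data": 10.1},
        "node1": {"id": 1, "data": 2.5},
        "node2": {"id": 2, "data": 100},
        "node3": {"id": 3, "data": 9},
        "node4": {"id": 4, "data": 20.1},
        "tail": {"id": 5, "data": -9.0},
    },
    "_GraphEdges_": [
        ["head", "node1", "e1"], ["node1", "node2", "e2"],
        ["node2", "node3", "e3"], ["node3", "node4", "e4"],
        ["node4", "tail", "e5"], ["node1", "node3", "e6"],
        ["node2", "node4", "e7"],
    ],
}
adj = adjacency(graph)
```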