Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Data Structure

GazooResearch's data structure has 4 layers. 1) Diagnosis, 2) Tag, 3) Field, and 4) Value.

Let's illustrate why 4 layers are needed with an example. Say we are wanting to study the correlation of blood sugars, and kidney function in patients with diabetes. We want to collect serum glucose levels, and serum creatinine levels (Figure 1.)

Figure 1. Lab Result Lab Result

At first glance, you might think you'll just need to store one number for glucose and another creatinine: 79 for glucose, and 0.5 for creatinine. Unfortunately, this alone is not enough information for your analysis. In reality, Glucose actually has 3 fields: 1) Date, 2) Value, and 3) Units. Creatinine also has 3 fields.

Finally, since we are studying diabetes we also have a base layer which is the icd10 code for diabetes.

Figure 2. GazooResearch's Data Structure Example Data Structure

When defining data structures in GazooResearch, we allow for the 4 layers as shown in the video below.

Figure 3. Data Dictionary

Data Analysis

Within the data analysis python modules you will also see this exact 4 layer data structure.

{'icd10':str, 'tag': str, 'field':str, 'value': [str, str, ...]},