Background info on HealthFlow system (basic concept, external applications part 1)

This description was written for RetroGuide (the retrospective component of the HealthFlow system), but it remains valid for the overall HealthFlow system, which includes a retrospective mode (RetroGuide) as well as a prospective mode (FlowGuide).


For full info, see this book:

1 Basic introduction

In this section, the RetroGuide (RG) analytical suite (proposed in Aim 2) is described in detail. An introduction to RG is followed by a description of four phases of RG’s usage. Several systems which are similar to RG are listed and described. Finally, a comparison of RG to Structured Query Language (SQL) is provided.

1.1 Introduction

RG is a suite of applications which supports several analytical steps. The RG architecture and analytical approach were substantially inspired by workflow technology and are meant to be applied in a medical context using electronic health record (EHR) data. Some of the adopted workflow constructs were already mentioned in Chapters 1 and 2, but the following list presents them again in overview: (1) graphical executable process models; (2) the ability of process flowcharts to contain references to the execution of external applications; (3) an execution scheme where each process instance (or analyzed patient) is treated separately; and (4) built-in documentation of the process execution flow for later review or analysis.

1.1.1 Requirements

Reviewing the analytical limitations presented in Chapter 1, a set of requirements for a novel analytical approach was defined. These requirements are enumerated in [1] and also presented in the list below:

1. Provision of a set of user-friendly graphical tools, targeted at clinicians, in which a clinical process or an analysis can be modeled in a stepwise fashion. The model would resemble the flowchart format often used by clinical guidelines.

2. Extensibility of the flowchart format with elements, entered by analysts or programmers, which would enable linkage of the graphical model to real healthcare data in an enterprise data warehouse (EDW). Additionally, these extending elements would enable modeling complexity which cannot be expressed by the flowchart notation alone.

3. Direct executability of the flowchart-based format (combining the input from clinicians and programmers/analysts).

4. Support for gradual development from simple models to more complex ones, with a shared workbench used by both clinicians and EDW data analysts.

5. Expressive process modeling language, sufficient to represent healthcare specific problems and challenges (e.g., simple clinical guidelines, adverse drug events, basic temporal conditions).

6. Small development burden (i.e., reuse existing standards and available tools, and develop only institution specific or healthcare domain specific extensions).

7. Support for the generation of reports and ability to assess process variability.

8. Data export into other analytical or statistical packages.

9. Ability to reuse models created on retrospective data in prospective mode. Retrospective models must be extendable to support execution in real time, using real events as controlling triggers. In other words, models working with retrospective data must be extendable to become point of care decision support modules.

1.1.2 Basic concepts

Several key basic concepts are necessary to understand the RG analytical approach. They are structured into several subsections below.

An RG analytical question is posed in the form of a scenario, which has two layers: a graphical flowchart layer and an additional hidden code layer. The RG term “scenario” is equivalent to the concept of a process or process definition in workflow technology. However, RG adds several conventions (which are explained later in section 3.2.2) to the process concept, hence a different term (“scenario”) is used. A scenario is created or viewed in a workflow editor application. The workflow editor can output and save the final scenario using a process definition language as defined in workflow technology. RG is currently implemented using XML Process Definition Language (XPDL) [2] and can use any XPDL-compliant editor for creating new scenarios or modifying existing ones. The main editor used in this project is JaWE version 1.4 [3]; however, the CapeVision plugin [4] for Microsoft Visio, Tibco Business Studio [5], and JPEd [6] editors were also used experimentally.

The flowchart and code layers

The flowchart layer (“flowchart”) consists of nodes and arrows which connect the nodes. Whereas nodes represent individual steps in the analysis, the arrows represent transitions in the flow of the analysis’ logic.

Two special nodes designate the start and end of the scenario’s logic. An RG flowchart reflects a set of instructions for a stepwise, sequential analytical process, and, if it is necessary for the analysis, it can contain loops. The procedural, algorithmic nature of the RG flowchart is very different from node-and-arrows models used in some graphical dependency models.

Both nodes and transitions may contain additional attributes which comprise the code layer and which ultimately make the flowchart executable. The additional attributes of nodes are references to the execution of one or more external applications. The additional attributes of transitions are transition conditions. If a transition has no condition, the next step (node) is executed in every case. If a condition is present, the transition to the next node happens only if the condition is satisfied. More than one arrow may originate from any given node, which makes branching logic possible. An example of a reference to an external application is “ReverseFind_CodedValueCD(C-section_procedure_CD)”, which results in an EHR event search (backwards in time) for the C-section procedure. An example of a transition condition is “(value_D_dimer < 300) AND (value_antithrombin3 < 0.45)”, which means that the next node or scenario branch is executed only if the given threshold criteria are met.

Execution scheme

The scenario is executed on the data of a single patient. An analysis of a population is achieved by sequentially running the scenario on all members of the population, one at a time. Population results are abstracted at the end of the whole process, when the engine finishes the sequential execution of the scenario on all patients.

Navigation through the single patient EHR

RG uses a unique method of browsing through the EHR during the scenario execution. The approach resembles a human chart review process. RG operates strictly on a time-ordered patient chart, and this chronological assumption is crucial for its analytical functions. During the execution, RG manipulates the electronic chart, according to the stepwise sequence defined in the scenario, in a manner similar to a human browsing through a paper chart.

Most paper charts are also organized chronologically. For any chart review there is usually a clear set of step-by-step instructions of what the reviewer needs to look for in the paper record. The reviewer follows these steps one by one. The steps ask him or her to either look for certain events in the record or answer any analytical questions needed for the next steps in the review process. If the task is to find a certain event, the reviewer might often be asked to write down the result of this event search – a “yes/no” outcome. In the case of a “yes” outcome (i.e., the desired event in a particular step is found), the reviewer might be instructed to remember a pertinent numerical or temporal value about the found event. This remembered value can be used for comparisons in the next steps of the instructions.

During this manual review process, the reviewer browses the paper chart forwards or backwards, fulfilling each step in the instructions. At any point in time, the chart is open at the particular position where the reviewer finished the last step. For understanding the RG execution model, the notion of this current position in the EHR (either for a human reviewer or a computer algorithm) is crucial. RG implements the current position pointer as an integer giving the ordinal rank of the event at which execution stopped when it finished the last step. In the manual chart review process, this is similar to the notion of a “page number” in a chronologically ordered paper chart.

The review instructions may also contain steps where the reviewer is simply asked to browse to a particular absolute position, or to a position relative to the current one. For example: “Go back to the start of the record and look for evidence of a particular comorbidity (e.g., hypertension) at any point in time.” Another example would be: “Skip the rest of the current hospitalization where you found the PTCA operation and look for complications X occurring in a window of 4-12 months from the operation.”
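The idea of a current position in a time-ordered chart can be sketched in a few lines of Python. This is an illustration only, not RG's implementation: the `Event` and `Chart` names, the 1-based rank, the method names, and the choice to include the current event in a forward search are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    when: date  # timestamp of the EHR event
    code: str   # hypothetical event code, e.g. "PTCA"


class Chart:
    """A time-ordered patient chart with a current position (hypothetical model)."""

    def __init__(self, events):
        # The chart is assumed to be chronological, so sort once up front.
        self.events = sorted(events, key=lambda e: e.when)
        self.position = 1  # 1-based rank of the event the chart is "open" at

    def jump_to_first_event(self):
        """Absolute move: go back to the start of the record."""
        self.position = 1

    def find_forward(self, code):
        """Search from the current position onward; move there only if found."""
        for rank in range(self.position, len(self.events) + 1):
            if self.events[rank - 1].code == code:
                self.position = rank
                return True
        return False


chart = Chart([
    Event(date(2005, 3, 1), "hypertension"),
    Event(date(2006, 7, 9), "PTCA"),
    Event(date(2007, 1, 15), "chest_pain"),
])
found_ptca = chart.find_forward("PTCA")         # chart is now open at the PTCA event
position_after_ptca = chart.position
chart.jump_to_first_event()                     # "go back to the start of the record"
found_htn = chart.find_forward("hypertension")  # comorbidity at any point in time
```

The integer `position` plays the role of the "page number" in the paper chart analogy: every step either leaves it alone or moves it to another event.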

There is one additional key element in the behavior of the current position marker during RG execution. If the next searched-for event is not found, the marker stays where it was before the unsuccessful search. To explain this in more detail, imagine an EHR with 1457 events. If the current position marker after the last step is at a particular position (e.g., 453) and the next desired event is not found, the marker stays at position 453, even though during the unsuccessful search RG in fact browsed events 454, 455, 456, and so on, up to the last (1457th) event.

Variables

RG has the ability to remember certain facts via the use of variables. The concept of variables is part of workflow technology, where it is called workflow relevant data. The creation and use of variables technically belong to the code layer; however, variable names are often directly exposed and used in the flowchart node titles.

The manual chart review example presented previously hinted at the need for variables when it mentioned an instruction for the human reviewer to write down certain facts about found events (to be used later in the logic or as data output). The RG scenario, being based on a workflow process, can use variables for this purpose of remembering important facts. The number of variables used is unlimited, and various types are offered in RG. Certain naming conventions for variables are strongly recommended (e.g., a “time” suffix or “t” prefix for temporal facts, “value” or “v” for numerical values, and “count” or “c” for count variables).

For example, a scenario analyzing hypertension may search for a prescription of a certain antihypertensive drug; remember the drug’s prescription date as t_drug; search for a systolic blood pressure value (sBP) prior to this prescription and remember this value as v_sBP_prior_therapy; jump to time 6 months after t_drug; search for the next available systolic blood pressure value and remember it as v_sBP_after_therapy. Subsequent analytical nodes or conditions can use all these introduced variables to answer various clinical questions, or to restrict or split further analysis into subgroups of patients which satisfy one or multiple temporal or numerical conditions.

Logical constructs

The RG analysis is controlled through two main formalisms which correspond to the two layers mentioned above. The first formalism is offered by the flowcharting logic and exposed within the flowchart layer. This includes the use of various steps, use of conditions on transitions, and use of multiple flowchart branches to model the analytical problem at hand. This graphical formalism is meant to be understood by the analysis requestor. Any fundamental scenario modifications done by the analyst to the model are reflected by changes in the flowchart (e.g., adding an extra analytical step or adding scenario branches).
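As a rough illustration of the flowchart formalism, the sketch below models a scenario as nodes with conditional transitions and runs it on one patient at a time. It is not how RG or XPDL represents a scenario: the dictionary layout, the node names, the rule that the first satisfied transition is followed, and the patient values are all assumptions made for the example. Only the threshold condition is taken from the text.

```python
def low_markers(v):
    # Mirrors the example condition from the text:
    # (value_D_dimer < 300) AND (value_antithrombin3 < 0.45)
    return v["value_D_dimer"] < 300 and v["value_antithrombin3"] < 0.45


# Hypothetical in-memory scenario: node name -> list of (condition, target).
# A condition of None means the transition is always followed.
SCENARIO = {
    "start": [(None, "check_labs")],
    "check_labs": [(low_markers, "flag_patient"), (None, "end")],
    "flag_patient": [(None, "end")],
    "end": [],
}


def run_scenario(scenario, variables):
    """Walk one patient through the flowchart and return the visited nodes."""
    node, path = "start", ["start"]
    while scenario[node]:
        for condition, target in scenario[node]:
            # Assumption: follow the first transition whose condition holds.
            if condition is None or condition(variables):
                node = target
                path.append(node)
                break
        else:
            break  # no transition was satisfied, so execution stops here
    return path


# Made-up variable values for two patients.
patients = {
    "patient_A": {"value_D_dimer": 250, "value_antithrombin3": 0.40},
    "patient_B": {"value_D_dimer": 480, "value_antithrombin3": 0.40},
}

# Population analysis: the scenario runs on each patient separately,
# and results are summarized only after all runs have finished.
paths = {pid: run_scenario(SCENARIO, v) for pid, v in patients.items()}
flagged = [pid for pid, path in paths.items() if "flag_patient" in path]
```

Keeping the visited path per patient is one simple way to record the execution flow for later review.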

The second problem-modeling formalism is the use of external applications, which is represented in the scenario's code layer. RG external applications (RGEAs) can contain computer code covering reasoning which is outside the capabilities of the flowchart formalism.

External applications can serve several different purposes, from simple to complex. Simple external applications may, for example, retrieve data from an EHR (data-get applications). Another category of simple RGEAs consists of applications which manipulate the current position in the EHR to achieve effects useful for the analysis at hand (e.g., Jump_forward_X_Days (Desired_Days), Jump_After_Timestamp (Desired_absolute_time-stamp), or Jump_to_First_EHR_Event). Another category of analytical applications performs comparisons which cannot be expressed as transition conditions, for example temporal comparisons such as Temporal_Difference_Exceeding (Timestamp1, Timestamp2, Desired_Difference_Days).
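To make the simple categories more concrete, here is a minimal Python sketch of position-manipulating and temporal-comparison applications. Only the application names come from the text. Their exact semantics are not specified there, so the behavior shown (jumping to the first event strictly after a timestamp, leaving the position unchanged when no such event exists, and comparing the absolute difference in days) is assumed, as are the `Context` class and the sample events.

```python
from datetime import date, timedelta


class Context:
    """Hypothetical execution state for one patient: chart and position."""

    def __init__(self, events):
        # events: list of (date, code, value) tuples, kept in time order
        self.events = sorted(events, key=lambda e: e[0])
        self.position = 0  # 0-based index of the current event


def jump_to_first_ehr_event(ctx):
    ctx.position = 0


def jump_after_timestamp(ctx, timestamp):
    """Move to the first event strictly after `timestamp`, if there is one."""
    for index, (when, _code, _value) in enumerate(ctx.events):
        if when > timestamp:
            ctx.position = index
            return True
    # Assumed by analogy with unsuccessful searches: position stays unchanged.
    return False


def jump_forward_x_days(ctx, desired_days):
    """Relative move, measured from the time of the current event."""
    current_time = ctx.events[ctx.position][0]
    return jump_after_timestamp(ctx, current_time + timedelta(days=desired_days))


def temporal_difference_exceeding(timestamp1, timestamp2, desired_difference_days):
    """True if the two dates are more than the given number of days apart."""
    return abs((timestamp2 - timestamp1).days) > desired_difference_days


ctx = Context([
    (date(2010, 1, 10), "drug_rx", None),
    (date(2010, 2, 1), "sBP", 162),
    (date(2010, 8, 20), "sBP", 138),
])
moved = jump_forward_x_days(ctx, 180)        # lands on the August event
position_after_jump = ctx.position
exceeds = temporal_difference_exceeding(
    date(2010, 1, 10), ctx.events[ctx.position][0], 180
)
moved_again = jump_forward_x_days(ctx, 365)  # nothing that late in the chart
```

Returning a success flag from each jump lets a transition condition decide which branch to take when the requested position does not exist.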

Complex external applications may involve the use of external, complex reasoning engines. Examples include the calculation of advanced statistical indicators and the use of Bayesian belief networks, artificial neural networks, or other machine learning techniques for classification or prediction. An RG scenario would provide the necessary input for the calculations and would be able to use the output parameters to decide what to do next (branching).
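The interface between a scenario and such an engine might look like the following sketch. The "engine" here is a deliberately trivial stand-in (a logistic score with made-up weights), not a real model or any component of RG. The function names, variable names, and threshold are likewise hypothetical. The point is only the flow of data: scenario variables go in, an output variable comes back, and a branch is chosen from it.

```python
import math


def risk_model_app(inputs):
    """Stand-in for an external reasoning engine (hypothetical).

    A real application might call a Bayesian network or a neural network;
    this toy logistic score with made-up weights only marks the interface.
    """
    score = -4.0 + 0.03 * inputs["age"] + 0.02 * inputs["v_sBP"]
    return {"v_risk": 1.0 / (1.0 + math.exp(-score))}


def next_node(variables, threshold=0.5):
    """Branching step that uses the application's output parameter."""
    if variables["v_risk"] >= threshold:
        return "high_risk_branch"
    return "low_risk_branch"


# The scenario supplies the inputs from its variables...
variables = {"age": 70, "v_sBP": 170}
# ...stores the outputs back as variables...
variables.update(risk_model_app(variables))
# ...and uses them to decide which branch to execute next.
branch = next_node(variables)
```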

