February | 2010 | Healthcare Workflow

Archive for February, 2010

February 16, 2010

This is based on a book: Data Manipulation with R (Phil Spector).

page 136:
Datasets can be wide or long. When there are multiple occurrences of values for a single observation:

A dataset is said to be long if each occurrence is a separate row in the data frame (most IDR data, EAV design).
A dataset is said to be wide if all of the occurrences of values for a given observation are in the same row.

A dataset can also be “melted” and then cast into a desired shape using the reshape package (http://cran.r-project.org/web/packages/reshape/reshape.pdf).

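library(reshape)
# melt() stacks the measured columns into long form (id columns, variable, value)
melted_data <- melt(data)
# PARAMS is a placeholder for the casting arguments, such as a formula of the form rows ~ columns
desired_shape_data <- cast(PARAMS, data = melted_data)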

Very useful.
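As a minimal sketch of the melt/cast round trip, here is a tiny made-up example. The data frame and its column names (patient_id, weight, height) are hypothetical and do not come from the book. It assumes the reshape package is installed.

```r
library(reshape)

# Hypothetical wide data: one row per patient, one column per measurement.
wide <- data.frame(
  patient_id = c(1, 2),
  weight = c(70, 82),
  height = c(170, 180)
)

# Wide to long: one row per (patient, measurement) pair,
# with columns patient_id, variable and value.
long <- melt(wide, id.vars = "patient_id")

# Long back to wide: one row per patient, one column per measured variable.
wide_again <- cast(long, patient_id ~ variable)
```

The formula passed to cast() puts the row identifiers on the left of ~ and the column-forming variable on the right. If several values fall into the same cell, cast() needs an aggregation function (its fun.aggregate argument) to combine them.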

February 12, 2010

The red node is a subflow.
Major and minor are counts of major risk items and minor risk items.
E.g., a history of breast cancer under age 50 is a major risk item.