/************************************************************************************************* Statistical Reporting Tools for Posting Results on ClinicalTrials.gov This project was supported in part by the National Institute for Dental and Craniofacial Research cooperative agreement U54 DE019285 for the UCSF Center to Address Disparities in Children's Oral Health and in part by the National Center for Research Resources and the National Center for Advancing Translational Sciences, National Institutes of Health, through UCSF CTSI Grant Number UL1 RR024131. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH. Program: OMForm.sas Description: Compute values for Outcome Measures for clinical trials, and create the baseline measure table using the formats required by ClinicalTrials.gov Parameters: inds = input dataset with outcome variable(s) on participant-level arm = numeric variable (values: 1, 2, 3, 4, ..., n) to specify study arm, 1 for the first arm, and n for the n-th arm armTitle = specifies arm titles for arms, separated by *, within each title, use / to split it into multiple lines, e.g., Trt A /Only * Trt A/ + B var = specifies outcome variable, it should be numeric dataType = specifies data type either categorical or continuous for the outcome variable if the outcome variable is 2-category, i.e., binary, make sure 1 represents the event of interest, e.g., decay=1 means a participant had 1 or more decayed teeth. varTitle = specifies outcome variable title to be displayed in the table on the first column dataFmt = specify data format for outcome variable, set it "none" for continuous variables and binary variables, data formats for categorical variables with 3 or more categories can be defined using proc format, e.g., proc format; value tstatusf 1="decayed/filled" 2='sound' 3='missing' 4='sealant'; run; measureUnit=specifies unit of measure for the outcome variable measureType=specifies measure type for the outcome variable, and choose one from below: "Number" (e.g., number of participants) Measure of Central Tendency, if a continuous measure is reported "Mean" "Median" dispersionType=specifies measure of dispersion for the outcome variable, and choose one from below: "Not Applicable" (only when Measure Type is "Number") "Standard Deviation" "Inter-Quartile Range" "Full Range" "Standard Error" "95% Confidence Interval" "90% Confidence Interval" displayTotal = indicates whether to display total # participants affected / at risk by arm, "yes" or "no", default=no outFile= specify rtf filename to save the result table Author: Nancy Cheng (nancy.cheng@ucsf.edu) Created Date: 4/23/2012 Wrote macro %getStats, which is called by %OMForm, not called by users. Modified Date: 4/24/2012 Wrote macro %OMForm Modified Date: 5/15/2012 Modified code so the macro only computes one outcome variable. Added documentation Modified Date: 5/16/2012 Modified code to process special catergorical variables - binary outcome variables. Added documentation Modified Date: 10/11/2012 Added more documentation. Modified Date: 10/19/2012 Modified code to accept special characters in armTitles such as comma. Examples to call %OMForm: 1. for continuous outcome; %let armTitle=%str(Trt A /Only * Trt A /+ B); %OMForm(inds=outcomeMeasures, arm=arm, armTitle=&armTitle, var=dfs, dataType= continuous , varTitle=%str(Caries Increment), dataFmt=none, measureUnit=tooth surfaces, measureType=Mean, dispersionType=Standard Error, displayTotal=Yes, outFile='Outcome Measure1.rtf'); 2. for 2-category (binary) categorical outcome; %OMForm(inds=outcomeMeasures, arm=arm, armTitle=&armTitle, var=decay , dataType= categorical, varTitle=%str(Any Incident Caries), dataFmt=none, measureUnit=participants, measureType=Number, dispersionType=NA, displayTotal=Yes, outFile='Outcome Measure2.rtf'); 3. for 3 or more category categorical outcome; %OMForm(inds=outcomeMeasures, arm=arm, armTitle=&armTitle, var=status, dataType= categorical, varTitle=%str(Tooth Status), dataFmt=tstatusf, measureUnit=participants, measureType=Number, dispersionType=Not Applicable, displayTotal=Yes, outFile='Outcome Measure3.rtf'); ******************************************************************************************************/; /*------------------------------------------------------------------------------------------- Users with comments, suggestions, or who have identified problems should contact Nancy Cheng via email at nancy.cheng@ucsf.edu DISCLAIMER: --------------------------------------------------------------------------------------------- !!! This software is provided AS IS and No guarantee as to suitability or accuracy is !!! given or implied. User uses this code entirely at his/her own risk!!! --------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------- LEGAL NOTICES: The author and the University of California San Francisco (UCSF) disclaim all warranties, expressed or implied, with regard to this software, including without limitation all implied warranties of merchantability and fitness for a particular purpose, and in no event shall the authors, UCSF, or Regents of the University of California be liable for any special, indirect or consequential damages or any damages whatsoever resulting from loss of use, data, or profits, whether in an action of contract, tort (including negligence) or strict liability, arising out of or in connection with the use or performance of this software. ---------------------------------------------------------------------------------------------- REPORTING PROBLEMS AND MAKING SUGGESTIONS: The authors have checked the results of the computations with reported results and found no discrepancies. However, no software can be said to be totally free of problems. Therefore, the authors would appreciate knowing if you encounter difficulties or have suggestions for improvements. If you have problems with this macro or suggestions for improvement please feel free to email the author to share this information. Any corrections or suggestions that are implemented with be acknowledged in the macro documentation for the revised version. -------------------------------------------------------------------------------------------- */ %global nArm armTitle1 headerColor valueColBColor; %let headerColor=CXDFDEFF;%*background color for the header and first column of the Outcome Measure result table; %let valueColBColor=CXF3F3FC;%*background color for the value column (rest) of the Outcome Measure result table; %macro OMForm(inds=, arm=, armTitle=, var=, dataType=, varTitle=, dataFmt=, measureUnit=, measureType=, dispersionType=, displayTotal=No, outFile=); ods html close; ods rtf file=&outFile style=STATISTICAL bodytitle_aux; options orientation=landscape nodate nonumber; %*extract arm title for each arm and save them into macro variables; %let i=1; %let armTitle=%sysfunc(compbl(&armTitle)); %get1ArmTitle(armTitle=%bquote(&armTitle), i=1); %do %while(%bquote(&armTitle1) ne %str()); %global arm&i.Title; %let arm&i.Title=%bquote(&armTitle1); %let i=%eval(&i+1); %get1ArmTitle(armTitle=%bquote(&armTitle), i=&i); %end; proc sql noprint; select distinct count(distinct &arm) into: nArm from &inds; quit; %let nArm=%sysfunc(trim(&nArm)); %*call macro %getStats to generate final data for proc report to create result table; %getStats(inds=&inds, arm=&arm, var=&var, dataType=&dataType, varTitle=&varTitle, dataFmt=&dataFmt, measureUnit=&measureUnit, measureType=&measureType, dispersionType=&dispersionType, outds=sumstats); %*create result table; title j=left height=9pt bold bcolor=white "Measured Values"; %if %upcase(&displayTotal)=YES %then %do; %let nDataColDisplay=%eval(&nArm+1); %let arm&nDataColDisplay.Title=Total; %end; %else %let nDataColDisplay=&nArm; proc report data=sumstats nocenter nowd split="/" ls=256 style(header)=[background=&HeaderColor font_size=9pt] style(column)=[font_size=9pt]; column grpord roword varinfo value1- value&nDataColDisplay ; define grpord / order noprint; define roword / order noprint; define varinfo /display order=data " "; %do i=1 %to &nDataColDisplay; %str(define value&i / display "&&arm&i.Title" style(column)={background=&valueColBColor font_weight=medium just=center width=10%%}); %end; compute varinfo ; if roword = 1 then call define(_col_, "style", "style=[background=&headerColor fontweight=bold]"); else call define(_col_,"style", "style=[background=&headerColor fontweight=medium leftmargin=12pt]"); endcomp; run; title ' '; proc datasets library=work nolist; delete sumstats; run;quit; ods rtf close; ods html; %mend OMForm; ********************************************************************************************** macro to obtain key statistics for input variable: get count by arm for categorical variables get measureType and dispersion by arm for continuous variables. Parameters: inds = input dataset arm = specifies study arm var = variable of interest for computing its statistics dataType=indicate if the variable is categorical or continuous varTitle=variable title displayed in final result table measureUnit= unit of measure dataFmt = specifies the format for categorical variables, which is defined using proc format, default is none for continuous variables measureType= specifies one of these choices: Number, Mean, Median for continuous variable specifies Number for categorical variable dispersionType= specifies one of these choices: Not Applicable, Standard Deviation, Inter-quartile Range, and Full Range for continuous variable specify not applicable, leave it blank or ignore it for categorical variable outds= specifies output dataset to save summary statistics results ********************************************************************************************** *option mprint; %macro getStats(inds=, arm=, var=, dataType=, varTitle=, dataFmt=, measureUnit=, measureType=, dispersionType=, outds=); %let measureType=%sysfunc(propcase(%sysfunc(trim(&measureType)))); %let dispersionType=%sysfunc(propcase(%sysfunc(trim(%sysfunc(compbl(%bquote(&dispersionType))))))); %let nDataCol=%eval(&nArm+1); %if &dataType = categorical %then %do; proc freq data=&inds noprint; table &var*&arm/nocum out=_&var.ctf; run; %*get all possible combinations of &var by &arm, assign 0 to missing count; proc sql; create table _nVarbynArm as select distinct a.&var, b.&arm from &inds a, &inds b; quit; proc sql; create table _&var.ctfc as select a.&var, a.&arm, case b.count when . then 0 else b.count end as count from _nVarbynArm as a left join _&var.ctf as b on a.&var=b.&var and a.&arm=b.&arm; quit; proc summary data=_&var.ctfc noprint sum; class &arm; var count; output out=_&var.total sum=count; run; data _&var.total;set _&var.total; if &arm=. then &arm=100;run; proc sort data=_&var.total; by &arm;run; proc sql noprint; select count(distinct &var) into: ncat from _&var.ctfc; run; %let ncat=%sysfunc(trim(&ncat)); %if &ncat = 2 %then %do; %*binary outcome; *get the number of partcipants analyzed by arm; proc transpose data=_&var.total out=_&var (drop=_name_) prefix=n; var count; run; data _&var; length &var 8; set _&var; &var=.; run; *get the number of partcipants that &var=1; proc transpose data=_&var.ctfc out=_&var.1 (drop=_name_) prefix=n; var count; where &var=1; run; data _&var.1; length &var 8; set _&var.1; &var=1; n&nDataCol=sum(of n1-n&nArm);run; proc append base=_&var data=_&var.1;run; quit; data _&var.cnt(keep=grpord roword varinfo value1-value&nDataCol); length grpord roword 8 varinfo $35 value1-value&nDataCol $40; retain grpord 1; set _&var; array val value1-value&nDataCol; array num n1-n&nDataCol; if &var=. then do; do i=1 to 2; if i=1 then do; varinfo = "Number of Participants Analyzed"; do j=1 to &nDataCol; val(j) = ' '; end; end; else do; varinfo="[units: participants]"; do i=1 to &nDataCol; val(i) = put(num(i), 5.); end; end; roword + 1; output; end; end; else if &var=1 then do; grpord=2; roword=0; do i=1 to 2; if i=1 then do; varinfo = "&varTitle"; do j=1 to &nDataCol; val(j) = ' '; end; end; else do; varinfo="[units: &measureUnit]"; do i=1 to &nDataCol; val(i) = put(num(i), 5.); end; end; roword + 1; output; end; end; run; proc append base=&outds data=_&var.cnt; run; proc datasets library=work nolist; delete _&var.ctf _&var.ctfc _nVarbynArm _&var.total _&var _&var.1 _&var.cnt ; run;quit; %end; %else %do; *3 or more -category outcome; %*get the number of partcipants analyzed by arm and total; proc transpose data=_&var.total out=_&var.n (drop=_name_) prefix=n; var count; run; data npar(keep=grpord roword varinfo value1-value&nDataCol ); length grpord roword 8 varinfo $35 value1-value&nDataCol $40; retain grpord 1; set _&var.n; array val value1-value&nDataCol; array num n1-n&nDataCol; if _n_ =1 then do; do i=1 to 2; if i=1 then do; varinfo = "Number of Participants Analyzed"; do j=1 to &nDataCol; val(j) = ' '; end; end; else do; varinfo="[units: participants]"; do i=1 to &nDataCol; val(i) = put(num(i), 5.); end; end; roword + 1; output; end; end; run; proc append base=&outds data=npar; run; %*get the number of partcipants and total number in each category of the variable by arm and combined arms; proc transpose data=_&var.ctfc out=_&var (drop=_name_) prefix=n; var count; by &var; run; data _&var; set _&var; array num n1-n&nArm; do i=1 to &nArm; if num(i)=. then num(i)=0; end; n&nDataCol=sum(of n1-n&nArm); drop i; run; %*using above summarized results for a variable to generate variable info column and data value columns to generate final dataset for creating the result tables required by clinicalTrial.gov; data _&var.cnt(keep=grpord roword varinfo value1-value&nDataCol); length grpord roword 8 varinfo $35 value1-value&nDataCol $40; retain grpord 2; set _&var; array val value1-value&nDataCol; array num n1-n&nDataCol; n&nDataCol=sum(of n1-n&nArm); if _n_ =1 then do; do i=1 to 2; if i=1 then varinfo = "&varTitle"; else varinfo="[units: &measureUnit]"; do j=1 to &nDataCol; val(j) = ' '; end; roword + 1; output; end; end; %if &dataFmt = none %then %do; varinfo = trim(&var); %end; %else %do; %let fmt=%sysfunc(trim(&dataFmt..)); varinfo = trim(put(&var, &fmt)); %end; do i=1 to &nDataCol; val(i) = put(num(i), 5.); end; roword + 1; output; run; %* add result data to final data set; proc append base=&outds data=_&var.cnt; run; proc datasets library=work nolist; delete _&var.ctf _&var.ctfc _nVarbynArm _&var.total _&var _&var.n _&var.cnt ; run;quit; %end; %end; %*for continuous variable; %else %if &dataType = continuous %then %do; proc means data=&inds noprint; class &arm; var &var; output out=_&var.byArm n=&var._N mean=&var._Mean median=&var._Median std=&var._StdDev qrange=&var._QRange q1=&var._Q1 q3=&var._Q3 range=&var._Range min=&var._Min max=&var._Max stderr=&var._SE ; run; data _&var.byArm ; set _&var.byArm; if &arm=. then &arm=100; run; proc sort data=_&var.byArm; by &arm;run; %*get total Number of Participants Analyzed; proc transpose data=_&var.byArm out=_&var.n prefix=n; var &var._N ; run; data npar(keep=grpord roword varinfo value1-value&nDataCol); length grpord roword 8 varinfo $35 value1-value&nDataCol $40; retain grpord 1; set _&var.n; array val value1-value&nDataCol; array num n1-n&nDataCol; if _n_ =1 then do; do i=1 to 2; if i=1 then do; varinfo = "Number of Participants Analyzed"; do j=1 to &nDataCol; val(j) = ' '; end; end; else do; varinfo="[units: participants]"; do i=1 to &nDataCol; val(i) = put(num(i), 5.); end; end; roword + 1; output; end; end; run; %* add number of participants by arm and total to output data set; proc append base=&outds data=npar; run; %*transpose to get stat by arm as 1 record: measure1-meansure&nDataCol, measure1 is the stat for arm1 and meansure&nDataCol is the stat for total participants; proc transpose data=_&var.byArm out=_&var.measure prefix=measure; var &var._&measureType; run; %*get stats for total, use _id_ for merge; data _&var.measure; set _&var.measure; _id_=1; measureType="&measureType"; drop _name_ ; run; %*get dispersion data base on type; %if %bquote(&dispersionType)=%str(Standard Deviation) %then %do; proc transpose data=_&var.byArm out=_&var.dispersion prefix=StdDev; var &var._StdDev; run; data _&var.dispersion; set _&var.dispersion; _id_=1; dispersionType="&dispersionType"; drop _name_; run; %end; %else %if %bquote(&dispersionType) = %str(Inter-Quartile Range) or %bquote(&dispersionType) = %str(Interquartile Range) %then %do; proc transpose data=_&var.byArm out=_&var.QRange prefix=QRange; var &var._QRange; run; proc transpose data=_&var.byArm out=_&var.Q1 prefix=Q1; var &var._Q1; run; proc transpose data=_&var.byArm out=_&var.Q3 prefix=Q3; var &var._Q3; run; data _&var.QRange; set _&var.QRange; _id_=1; drop _name_;run; data _&var.Q1; set _&var.Q1; _id_=1; drop _name_;run; data _&var.Q3; set _&var.Q3; _id_=1; drop _name_;run; data _&var.dispersion; merge _&var.QRange _&var.Q1 _&var.Q3; by _id_; dispersionType="&dispersionType"; run; %end; %else %if &dispersionType=Range %then %do; proc transpose data=_&var.byArm out=_&var.Range prefix=Range; var &var._Range; run; proc transpose data=_&var.byArm out=_&var.Min prefix=Min; var &var._Min; run; proc transpose data=_&var.byArm out=_&var.Max prefix=Max; var &var._Max; run; data _&var.Range; set _&var.Range; _id_=1; drop _name_;run; data _&var.Min; set _&var.Min; _id_=1; drop _name_;run; data _&var.Max; set _&var.Max; _id_=1; drop _name_;run; data _&var.dispersion; merge _&var.Range _&var.Min _&var.Max; by _id_; dispersionType="&dispersionType"; run; %end; %else %if &dispersionType=%str(Standard Error) %then %do; proc transpose data=_&var.byArm out=_&var.se prefix=SE; var &var._SE; run; data _&var.dispersion; set _&var.se; _id_=1; dispersionType="&dispersionType"; drop _name_; run; %end; %else %if %bquote(&dispersionType) = %str(95%% Confidence Interval) or %bquote(&dispersionType) = %str(90%% Confidence Interval) %then %do; proc transpose data=_&var.byArm out=_&var.mean prefix=mean; var &var._mean; run; proc transpose data=_&var.byArm out=_&var.se prefix=se; var &var._se; run; data _&var.mean; set _&var.mean; _id_=1; drop _name_ ;run; data _&var.se; set _&var.se; _id_=1; drop _name_ ;run; data _&var.dispersion; merge _&var.mean _&var.se; by _id_; array mean mean1-mean&nDataCol; array se se1-se&nDataCol; array lclm lclm1-lclm&nDataCol; array uclm uclm1-uclm&nDataCol; %if %bquote(&dispersionType) = %str(95%% Confidence Interval) %then %do; alpha=.05; %end; %else %do; alpha=.1; %end; do i=1 to &nDataCol; z=probit(1-alpha/2); lclm(i)=mean(i)- z*se(i); uclm(i)=mean(i)+ z*se(i); end; dispersionType="&dispersionType"; drop i; %end; %*merge measureType and dispersionType by arm into one dataset; data _&var; merge _&var.measure _&var.dispersion; by _id_; drop _id_; run; %*create final dataset with specific data contents, which is used to generate result table; data _&var.msDpn(keep=grpord roword varinfo value1-value&nDataCol); length grpord roword 8 varinfo $35 value1-value&nDataCol $40; retain grpord 2; set _&var; array val value1-value&nDataCol; array measure measure1-measure&nDataCol; if dispersionType='Standard Deviation' then do; array stdd stdDev1-stdDev&nDataCol; end; else if dispersionType='Standard Error' then do; array se se1-se&nDataCol; end; else if dispersionType='Inter-Quartile Range' or dispersionType='Interquartile Range' then do; array qrg qrange1-qrange&nDataCol; array q1 q11-q1&nDataCol; array q3 q31-q3&nDataCol; end; else if dispersionType='Range' then do; array rg range1-range&nDataCol; array mn min1-min&nDataCol; array mx max1-max&nDataCol; end; else if dispersionType='95% Confidence Interval' or dispersionType='90% Confidence Interval' then do; array lower lclm1-lclm&nDataCol; array upper uclm1-uclm&nDataCol; end; do i=1 to 3; select (i); when (1) do; varinfo = "&varTitle"; do j=1 to &nDataCol; val(j) = ' '; end; end; when (2) do; varinfo="[units: &measureUnit]"; do j=1 to &nDataCol; val(j) = ' '; end; end; otherwise do; if /*measureType='Mean' and */ dispersionType='Standard Deviation' then do; varinfo=measureType||" ± Standard Deviation"; do j=1 to &nDataCol; *may replace 4.1 with a paramter for data format; val(j) = compbl(put(measure(j), 4.1)||' ±'||put(stdd(j), 4.1)); end; end; else if /*measureType='Mean' and */ dispersionType='Standard Error' then do; varinfo=measureType||" ± Standard Error"; do j=1 to &nDataCol; *may replace 4.1 with a parameter for data format; val(j) = compbl(put(measure(j), 4.1)||' ±'||put(se(j), 4.1)); end; end; else if /*measureType='Median' and */ (dispersionType='Inter-Quartile Range' or dispersionType='Interquartile Range') then do; varinfo=measureType||", IQR (Q1, Q3)"; do j=1 to &nDataCol; val(j) = compbl(put(measure(j), 4.1)||', '||put(qrg(j), 4.1)||' ('||put(q1(j), 4.1)||', '||put(q3(j), 4.1)||')'); end; end; else if /*measureType='Median' and */ dispersionType='Range' then do; varinfo=measureType||", Range (Min, Max)"; do j=1 to &nDataCol; val(j) = compbl(put(measure(j), 4.1)||', '||put(rg(j), 4.1)||' ('||put(mn(j), 4.1)||', '||put(mx(j), 4.1)||')'); end; end; else if /*measureType='Mean' and */ dispersionType="95% Confidence Interval" then do; varinfo=measureType||" (95% CI)"; do j=1 to &nDataCol; val(j) = compbl(put(measure(j), 4.1)||' ('||put(lower(j), 4.1)||', '||put(upper(j), 4.1)||')'); end; end; else if /*measureType='Mean' and */ dispersionType="90% Confidence Interval" then do; varinfo=measureType||" (90% CI)"; do j=1 to &nDataCol; val(j) = compbl(put(measure(j), 4.1)||' ('||put(lower(j), 4.1)||', '||put(upper(j), 4.1)||')'); end; end; end; end; roword + 1; output; end; run; %*add resuts to output data set; proc append base=&outds data=_&var.msDpn; run; proc datasets library=work nolist; delete _&var _&var.byarm _&var.n _&var.mean _&var.min _&var.max _&var.q1 _&var.q3 _&var.range _&var.qrange npar _&var.se _&var.measure _&var.dispersion _&var.msDpn ; run;quit; %end; %mend getStats; *macro to extract an arm title from input armTitle separated by * in a data step, and put it in a global macro variable called armTitle1. Parameters: armTitle=specifies all the arm titles to be displayed in the arm title headers of the final result table with each arm title separated by * i = i-th position (delimiter is *); %macro get1ArmTitle(armTitle=, i=); data _null_; armTitle1=compbl(scan("&armTitle", &i,'*')); call symput('armTitle1', trim(compbl(armTitle1))); run; %mend;