Part 1: Data Analysis

Pharmacokinetic Parameters

(This description is a couple of years old and will soon be updated)

The log GSD calculations are stored in Microsoft Excel 5 files. At present there are five files (Database files 1-5), with chemical agents listed alphabetically. Each clinical study is presented on one sheet of a file. Each sheet shows the name of the agent, the source of the data, any relevant information about the sample, and the calculation of log GSDs for the parameters presented in the original research article. The first column presents the pharmacokinetic parameter data for each subject. The adjacent column corrects for body weight where necessary, either by listing each patient's body weight and dividing that patient's response value by it, or by indicating that the dose was administered on a per-body-weight basis. The next column log-transforms these values; the bolded value at the bottom of this column is the standard deviation of the log values. This log GSD value is transferred to the database. For some agents, probability or z-score plots can also be found on the spreadsheets. Where more than one study reported the same response parameter for a given chemical, the data were combined on the sheet following the individual log GSD calculations, and the combined value was then transferred to the database.
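
The column-by-column calculation described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet itself: the function name `log_gsd` is hypothetical, base-10 logarithms are assumed (the text says only "log"), and the sample standard deviation (n - 1 denominator) is assumed.

```python
import math
import statistics


def log_gsd(values, body_weights=None):
    """Standard deviation of the log-transformed parameter values.

    values       : one pharmacokinetic parameter value per subject
    body_weights : optional per-subject body weights; when given, each
                   value is divided by that subject's weight first
    Assumptions (not stated in the source): log base 10, sample SD (n - 1).
    """
    if body_weights is not None:
        values = [v / w for v, w in zip(values, body_weights)]
    logs = [math.log10(v) for v in values]
    return statistics.stdev(logs)


# Hypothetical half-life values (hours) for five subjects
half_lives = [2.1, 3.4, 2.8, 4.0, 3.1]
print(round(log_gsd(half_lives), 3))
```

When the dose was already administered on a per-body-weight basis, `body_weights` would simply be omitted.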

Pharmacodynamic Parameters

Pharmacodynamic parameters were also analyzed in Microsoft Excel 5. To obtain the log GSD for the parameter, a maximum likelihood analysis was conducted using the optimization tools available through Excel. In cases where the data were available in the form of a dose response relationship (i.e., the number of people who showed a particular intensity of response as a function of dose and the number tested), we performed maximum likelihood fits to a log probit dose response model using an adaptation of the Excel spreadsheet system published by Haas.( ) The principal adaptation we made to the Haas system was to place the maximum likelihood optimization on one sheet (sheet 1) of an Excel workbook while including the set-ups for upper and lower confidence limits on two parallel sheets of the workbook (sheets 2 and 3 respectively). As a result, one file was used for the analysis of each chemical agent. This arrangement is more convenient for repeated runs of the likelihood optimizer when deriving a series of upper and lower confidence limits (e.g., 5% fractile, 15% fractile, etc.) to assess the likelihood distribution for the Log(GSD). The probit model is based on the assumptions that (1) individual people will show a specific response when their individual tolerances (or thresholds) are exceeded, and (2) the distribution of individual thresholds in the population is lognormal. The reciprocal of the probit slope in such models is the Log(GSD) for the distribution of individual thresholds. (29,32) This value was then transferred to the database. In cases where there was a finite background incidence of response (in controls, without additional toxicant exposure), this background response rate was included as a separate parameter to be estimated in the model, and upper and lower confidence limits for the Log(GSD) were estimated while allowing the estimate of the background rate to vary.
The pharmacodynamic data files are labelled by chemical name, and information regarding the study and the populations under investigation can be found on the first sheet (the maximum likelihood sheet) of the file.
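
The log probit fit described in this section can be illustrated with a short Python sketch. It is not the Haas spreadsheet system or our adaptation of it: a coarse-to-fine grid search stands in for the Excel optimizer, no background response rate is modelled, confidence limits are not computed, and base-10 logarithms are assumed. The function names and the example data are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def neg_log_likelihood(mu, sigma, doses, n_tested, n_responding):
    """Binomial negative log likelihood for a log probit model.

    mu is the log10 of the median threshold dose; sigma is the Log(GSD),
    i.e. the reciprocal of the probit slope.
    """
    total = 0.0
    for dose, n, r in zip(doses, n_tested, n_responding):
        p = normal_cdf((math.log10(dose) - mu) / sigma)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= r * math.log(p) + (n - r) * math.log(1.0 - p)
    return total


def fit_log_probit(doses, n_tested, n_responding, rounds=6, steps=40):
    """Return (mu, log_gsd) minimizing the negative log likelihood.

    Uses a grid search that is repeatedly narrowed around the best point;
    this replaces the Excel optimizer purely for illustration.
    """
    log_doses = [math.log10(d) for d in doses]
    mu_lo, mu_hi = min(log_doses) - 1.0, max(log_doses) + 1.0
    s_lo, s_hi = 0.01, 2.0  # assumed search range for Log(GSD)
    best = None
    for _ in range(rounds):
        for i in range(steps + 1):
            mu = mu_lo + (mu_hi - mu_lo) * i / steps
            for j in range(steps + 1):
                sigma = s_lo + (s_hi - s_lo) * j / steps
                nll = neg_log_likelihood(mu, sigma, doses,
                                         n_tested, n_responding)
                if best is None or nll < best[0]:
                    best = (nll, mu, sigma)
        # Narrow the search window around the current best estimate
        mu_half = (mu_hi - mu_lo) / 10.0
        s_half = (s_hi - s_lo) / 10.0
        mu_lo, mu_hi = best[1] - mu_half, best[1] + mu_half
        s_lo, s_hi = max(1e-4, best[2] - s_half), best[2] + s_half
    return best[1], best[2]


# Hypothetical dose response data: dose, number tested, number responding
doses = [1.0, 3.16, 10.0, 31.6, 100.0]
n_tested = [1000] * 5
n_responding = [23, 159, 500, 841, 977]
mu_hat, log_gsd_hat = fit_log_probit(doses, n_tested, n_responding)
print(mu_hat, log_gsd_hat)
```

The example counts were constructed from a model with a median log10 dose of 1.0 and a Log(GSD) of 0.5, so the fitted values should fall close to these.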

To facilitate access to the data for a particular chemical agent, a list of all the agents with corresponding file names and sheet numbers is available. This file is labelled Contents and indicates the location of the data sets and the authors of the original research articles.

Part 2: Creating the Database

History of the Development of the Database

Assembling and analyzing the database has been a collective effort over the past decade. The original paper documenting the results of the preliminary analysis was published in 1987. The database has grown to contain 202 entries for both pharmacokinetic and pharmacodynamic parameters. The pharmacokinetic parameters are principally from studies of pharmaceuticals in normal healthy adults.

Criteria for admission of data sets and explanation of the parameters.

Individual studies were first screened for populations that consisted solely of diseased sub-groups, for example patients with renal failure, chronic smokers, or cancer patients. Where the disease state significantly affected pharmacokinetic parameters, the studies were not included in this analysis. The minimum sample size for inclusion of a data set in the database was five individuals. Moreover, the individual data points had to be provided either in tables or in the form of a histogram. Papers that provided only summary statistics (means or standard deviations) were not included, because we later combine the data points of specific types from different studies in a normalized form to make comparisons with expectations derived from various theoretical distributions (e.g., lognormal, normal, log-logistic). Analysis for this paper was restricted to the pharmacokinetic parameters described in Table 2.

Table 2: Description of Pharmacokinetic Parameters
Parameter	Description
Half Life (T1/2), Clearance (CL)	T1/2: Amount of time required for the total amount of drug in the body or the plasma drug concentration to decrease by one half. Clearance: Intrinsic ability of the body or its organs (such as kidneys or liver) to remove drug from the blood or plasma.
Maximal Blood Concentrations (Cmax)	Maximum blood concentrations produced by a given dose of drug during the dosing interval.
Area Under the Curve (AUC)	Calculated area under the curve in a plasma concentration versus time after exposure plot
Volume of Distribution (Vd)	Size of a compartment necessary to account for the total amount of drug in the body if it were present throughout the body at the same concentration as found in plasma

Calculation for data sets where individual data points are available

In cases where individual data points were available, log GSDs for a given parameter for a given agent were simply calculated as the standard deviation of the log-transformed values. Individual data points were corrected for patient body weight and dose when these data were available.

Calculation for data sets from other sources

Data in articles that were available in the form of charts such as histograms or graphs were analyzed by extracting the points from a scanned copy using the program Data Thief. The extraction was conducted three times and the average of the three readings was added to the database.

Combining data for multiple studies of the same parameter for the same chemical

In some cases multiple studies measured the same parameter for the same chemical. Consolidating these data allowed a single entry in the database per chemical. To calculate a combined log GSD for a given parameter, the log variances for each set of data were calculated independently and weighted by the sample size (N). These were summed and the combined log variance was converted back into a combined log GSD.
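
As a minimal sketch of this combination step, the Python function below pools log variances weighted by sample size. Two details are assumptions not spelled out above: the weights are taken as N itself (rather than N - 1), and the weighted sum is divided by the total N before taking the square root. The function name `combine_log_gsds` and the example values are hypothetical.

```python
import math


def combine_log_gsds(log_gsds, sample_sizes):
    """Combine per-study log GSDs into one value.

    Each log GSD is squared to give a log variance, weighted by that
    study's sample size N, summed, normalized by the total N (assumed),
    and converted back to a log GSD by taking the square root.
    """
    total_n = sum(sample_sizes)
    pooled_variance = sum(
        n * s ** 2 for s, n in zip(log_gsds, sample_sizes)
    ) / total_n
    return math.sqrt(pooled_variance)


# Hypothetical example: two studies of the same parameter and chemical
combined = combine_log_gsds([0.10, 0.20], [10, 30])
print(round(combined, 4))
```

The N recorded for the combined database entry would then be the sum of the individual sample sizes.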

Assembly of the Database

For every chemical and its corresponding log GSD that is entered into the database, the following information is added to its entry:

Administration route. This included intravenous (i.v.), oral, intramuscular (i.m.) and others.
Parameter measured. This included the pharmacokinetic parameter (as outlined in Table 2) or pharmacodynamic parameter as a measure of response.
Type of population studied. This notes whether the population had any unique characteristics, such as diseased states.
Total number of individuals in the data set (N). In the case of more than one study, the Ns of the samples were summed.
Log P. The octanol-water partition coefficient is a measure of the lipophilicity of the chemical. The method used to calculate this is outlined in the lipophilicity analysis.
Age range of population if available.
Log GSD (body weight) if available. This is the standard deviation of the log-transformed values of body weight.
Gender distribution (male participants only, mixed gender, female participants only).
Weighting by N. This weighting parameter was added to apportion the weights of the log GSDs to accurately represent the sample size (N) when regression analysis was conducted.
Target organ affected.
Therapeutic use the agent is most commonly associated with. Taken from Merck Index and information provided by the authors of the papers.
Reference of data.
A series of dummy variables was included to facilitate regression analysis. These included the presence of a contact rate, an uptake or absorption rate, a dilution via distribution volume and general systemic availability net of first-pass elimination, availability/general systemic availability and, lastly, functional reserve capacity.