A Narrative Review of Patient-reported Outcomes in Overactive Bladder: What is the Way of the Future?

  • Christopher R. Chapple 1,
  • Con J. Kelleher 2,
  • Chris J. Evans 3,
  • Zoe Kopp 3,
  • Emad Siddiqui 4,
  • Nathan Johnson 3,
  • Morgan Mako 3
1 Department of Urology Research, University of Sheffield, Sheffield, England, UK
2 Guy’s and St. Thomas’ Hospitals, London, UK
3 Endpoint Outcomes, Boston, MA, USA
4 Astellas Pharma Europe Ltd, Chertsey, UK

Take home message

A brief overactive bladder symptom and health-related quality of life assessment with weekly recall has the potential to characterize disease burden more accurately than a diary alone, to improve and standardize efficacy detection in clinical trials, and to ease patient burden.

PII: S0302-2838(16)30143-9

DOI: 10.1016/j.eururo.2016.04.033

The International Continence Society defines overactive bladder (OAB) symptom complex as “urinary urgency, usually accompanied by frequency and nocturia, with or without urgency urinary incontinence (UUI), in the absence of urinary tract infection or other obvious pathology” [1]. This symptom-based definition is a useful starting point for diagnosing patients; however, for evaluating the impact of interventions, it fails to address what is most important to patients. Patients seek treatment because their symptoms affect their health-related quality of life (HRQoL) [2]. Given the heterogeneity of symptoms and the multifaceted impact of OAB, measurement of outcomes in clinical trials is complicated, and researchers are confronted with the problem of balancing basic assessment with obtaining a comprehensive picture of patient outcomes [3]. In a systematic review of OAB treatment endpoints, Goldman et al [4] highlighted the lack of formal guidance and the significant heterogeneity of both symptom-based and patient-reported outcome measure (PROM)-based definitions of treatment response and nonresponse. For example, while most studies defined UUI treatment response as a 50–100% reduction in UUI episodes [4], others included a reduction of ≥2 episodes/wk [5], a ≥50% reduction in incontinence pad weight [6], an increase of ≥1 continent d/wk [5], or 3–7 consecutive dry d [7]. The symptoms of urgency and frequency have also been used as endpoints, with similar heterogeneity in the criteria used to define success.
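The practical consequence of these competing definitions can be illustrated with a minimal sketch. The function below applies four of the UUI response criteria listed above to one patient's diary data. The function name, the use of a single 7-d diary week at baseline and at follow-up, and the example counts are all hypothetical and are not taken from any of the cited studies.

```python
def response_flags(baseline_days, followup_days):
    """Classify one patient under several UUI response definitions.

    Each argument is a list of daily UUI episode counts for one 7-d diary week
    (an illustrative assumption; trials differ in diary length).
    """
    base_total = sum(baseline_days)
    follow_total = sum(followup_days)

    # Percentage reduction in weekly episodes; set to 0 if there were no baseline episodes.
    pct_reduction = 100.0 * (base_total - follow_total) / base_total if base_total else 0.0

    # Continent (dry) days are days with zero recorded episodes.
    dry_base = sum(1 for d in baseline_days if d == 0)
    dry_follow = sum(1 for d in followup_days if d == 0)

    # Longest run of consecutive dry days during follow-up.
    longest_run = run = 0
    for d in followup_days:
        run = run + 1 if d == 0 else 0
        longest_run = max(longest_run, run)

    return {
        "pct_reduction_ge_50": pct_reduction >= 50,
        "reduction_ge_2_per_wk": (base_total - follow_total) >= 2,
        "gain_ge_1_continent_day": (dry_follow - dry_base) >= 1,
        "ge_3_consecutive_dry_days": longest_run >= 3,
    }


# Hypothetical patient: 10 episodes at baseline, 6 at follow-up.
baseline = [2, 1, 2, 1, 2, 1, 1]
followup = [1, 1, 0, 1, 1, 1, 1]
flags = response_flags(baseline, followup)
print(flags)
```

In this invented example the same patient counts as a responder under the episode-reduction and continent-day criteria but as a nonresponder under the 50% reduction and consecutive-dry-day criteria, which is the kind of inconsistency that hampers cross-trial comparison.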

As evidenced by the above discussion, the bladder diary, which records frequency, volume, and number of incontinence episodes, is at the core of every OAB assessment and represents the gold standard investigation [8]. Additional information may include the number of pads used and the quantity of fluid intake [9]. The diary is clearly a useful tool, not only in the initial patient evaluation, where it allows clinicians to diagnose appropriately and plan an intervention, but also in objectively defining response to therapy. See Figure 1 for an overview of recommended endpoints in OAB.


To capture the impact of symptoms on patients, several psychometrically validated PROMs exist [10]. These include the Overactive Bladder Symptom Score (OABSS) [11], the Overactive Bladder Questionnaire (OAB-q) [12], the King's Health Questionnaire [13], and the Patient Perception of Bladder Condition [14]. PROMs are routinely included as secondary endpoints in trials alongside diaries [15], and some trials rely solely on PROM-based primary endpoints rather than on the bladder diary [16]. Other frequently used PROMs include global assessments, satisfaction measures, and goal attainment scaling [17].

To understand, support, and inform the development of a new multidimensional PROM that could be used to replace bladder diaries as a primary or key secondary endpoint in clinical trials, we conducted a review of literature published within the past 10 yr on OAB treatment-response assessments. In particular, we addressed three key issues: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) whether PROMs should provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) whether measures of treatment satisfaction and goal achievement should be used. We believe that if a new multidimensional measure can be developed, then standardization of response definitions would allow for cross-trial comparisons and remove the confusion caused by individual symptom reporting while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. When an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed its abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures, and both researchers had to agree before an article was excluded. The search targeted articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall periods, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Articles were included if they were: (1) published between January 1, 2004 and January 22, 2016, (2) written in English, and (3) contained key search terms in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.
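For readers wishing to run a comparable search, the criteria above could be assembled into a single boolean query along the following lines. This is a sketch only: the exact query string used for this review is not reported, and the grouping of condition terms and outcome terms around the AND operator, the helper names, and the choice of PubMed field tags are our assumptions.

```python
# Condition terms and outcome/method terms as listed in the inclusion criteria.
CONDITION_TERMS = [
    "overactive bladder",
    "lower urinary tract dysfunction",
    "lower urinary tract symptoms",
    "urinary incontinence",
    "urge urinary incontinence",
]
OUTCOME_TERMS = [
    "randomized controlled trial",
    "bladder diary",
    "voiding diary",
    "urinary diary",
    "patient-reported outcomes",
    "patient satisfaction",
    "global assessment scale",
    "placebo-effect",
    "treatment response",
    "quality of life",
]


def or_group(terms, field="Title/Abstract"):
    """Join terms into a parenthesised OR group restricted to one search field."""
    return "(" + " OR ".join(f'"{t}"[{field}]' for t in terms) + ")"


def build_query(start="2004/01/01", end="2016/01/22"):
    """Assemble one query string (assumed grouping, not the authors' exact query)."""
    date_filter = f'("{start}"[Date - Publication] : "{end}"[Date - Publication])'
    return " AND ".join([
        or_group(CONDITION_TERMS),   # any condition term ...
        or_group(OUTCOME_TERMS),     # ... combined with any outcome/method term
        date_filter,                 # publication window from criterion (1)
        "English[Language]",         # language restriction from criterion (2)
    ])


query = build_query()
print(query)
```

The resulting string can be pasted into the PubMed search box. Abstract screening by two reviewers, as described above, would still be required after retrieval.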

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of antimuscarinic agents because of the unique contribution bladder diaries make to the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way intended to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21], which note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response is evident across clinical trials of OAB, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials [19].

Outcome | No. of studies | No. of patients given placebo | Mean change (SD) | p value
Incontinence episodes/d | 12 | 1847 | –1.12 (0.59) | <0.001
Micturition episodes/d | 11 | 1938 | –1.04 (0.8) | 0.0016
Urgency episodes/d | 3 | 928 | –1.15 (1.74) | 0.37
Mean micturition volume (ml) | 11 | 1854 | 10.61 (12.9) | 0.02
Maximum cystometric capacity (ml) | 6 | 208 | –16.87 (9.99) | 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely used PROMs and bladder diary endpoints exist within the OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual numbers of daytime and nighttime urinations were gathered, and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a study comparing the OABSS with a 3-d bladder diary [22], statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001) were found, with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales correlate significantly with the number of urinations per day (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q with a 3-d diary found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.
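As a worked check of the figures above, the positive likelihood ratio follows directly from sensitivity and specificity as sensitivity / (1 − specificity). The short sketch below applies the standard formulas to the rounded values reported for the QVD. The function names are ours, and the negative likelihood ratio is computed only for illustration, as it is not reported in the source.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity (not reported in the source)."""
    return (1.0 - sensitivity) / specificity


# Rounded values reported for the QVD in the diagnosis of UUI.
lr_pos = positive_likelihood_ratio(0.82, 0.79)
lr_neg = negative_likelihood_ratio(0.82, 0.79)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")  # LR+ = 3.90, LR- = 0.23
```

The value of about 3.9 obtained from the rounded inputs is consistent with the reported positive likelihood ratio of 4.0. The small difference presumably reflects rounding of the published sensitivity and specificity.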

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure / Correlations

OABSS [11] and [22]
• OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were r = 0.40 (p < 0.001) and r = 0.26 (p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables

OAB-q/V8 [23] and [24]
• OAB-q scores compared with urgency, daytime frequency, and nocturia recorded with a 1-wk bladder diary, and with urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis

QVD [25]
• Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and misestimation of symptom frequency using bladder diaries. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. In other studies, patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden
• Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]

Over/underestimation
• In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, in male-only and female populations, suggest that patients may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]

Recall period
• In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event/experience occurring and the reporting of it [34]

Lack of validation
• Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials, it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer recall periods, as rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were also reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d), substantial concordance was found, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Indeed, there is some evidence in pain research that a 7-d window may characterize a patient's condition more accurately than an assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society (ICI-RS) highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. The ICI-RS recommends that a comprehensive evaluation should encompass satisfaction, symptoms, HRQoL, and adverse events as the minimum elements of any outcome measurement. Notably, OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing the confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS score [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

The diagnosis and treatment of tension headache and migraine have historically relied on diaries. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new, appropriately developed measure with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses the impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly assessment, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical and that the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to characterize disease burden more accurately than a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test simulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193:559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trasnk, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

The International Continence Society defines overactive bladder (OAB) symptom complex as “urinary urgency, usually accompanied by frequency and nocturia, with or without urgency urinary incontinence (UUI), in the absence of urinary tract infection or other obvious pathology” [1]. This symptom-based definition is a useful starting point for diagnosing patients; however, for evaluating the impact of interventions, it fails to address what is most important to patients. Patients seek treatment because their symptoms affect their health-related quality of life (HRQoL) [2]. Given the heterogeneity of symptoms and the multifaceted impact of OAB, measuring outcomes in clinical trials is complicated, and researchers must balance basic assessment against obtaining a comprehensive picture of patient outcomes [3]. In a systematic review of OAB treatment endpoints, Goldman et al [4] highlighted the lack of formal guidance and the significant heterogeneity of both symptom-based and patient-reported outcome measure (PROM)-based definitions of treatment response and nonresponse. For example, while most studies defined UUI treatment response as a 50–100% reduction in UUI episodes [4], others included a reduction of ≥2 episodes/wk [5], ≥50% reduction in incontinence pad weight [6], an increase in ≥1 continent d/wk [5], or 3–7 consecutive dry d [7]. The symptoms of urgency and frequency have also been used as endpoints, with similar heterogeneity in the criteria used to define success.
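
The practical consequence of this heterogeneity can be sketched in a few lines of Python. The example is illustrative only: the `DiarySummary` fields, the function names, and the patient values are hypothetical and are not taken from any cited trial. It applies three of the response definitions quoted above to the same patient.

```python
from dataclasses import dataclass


@dataclass
class DiarySummary:
    """Hypothetical per-patient diary summary (field names are assumptions)."""
    uui_per_week_baseline: float
    uui_per_week_end: float
    continent_days_baseline: int  # continent days per week at baseline
    continent_days_end: int       # continent days per week at end of treatment


def pct_reduction(s: DiarySummary) -> float:
    """Percentage reduction in weekly UUI episodes relative to baseline."""
    if s.uui_per_week_baseline == 0:
        return 0.0  # no baseline episodes, so no reduction can be computed
    change = s.uui_per_week_baseline - s.uui_per_week_end
    return 100.0 * change / s.uui_per_week_baseline


def classify(s: DiarySummary) -> dict:
    """Apply three of the responder definitions quoted in the text."""
    return {
        # at least a 50% reduction in UUI episodes
        "reduced_50_percent": pct_reduction(s) >= 50.0,
        # a reduction of at least 2 UUI episodes per week
        "reduced_2_per_week": (s.uui_per_week_baseline - s.uui_per_week_end) >= 2,
        # a gain of at least 1 continent day per week
        "one_more_continent_day": (s.continent_days_end - s.continent_days_baseline) >= 1,
    }


# Illustrative values only: 10 -> 7 UUI episodes/wk, 2 -> 3 continent d/wk.
patient = DiarySummary(10, 7, 2, 3)
print(classify(patient))
```

With these assumed values the same patient is a nonresponder by the 50% criterion and a responder by the other two, which is the cross-trial comparability problem described above.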

As the above discussion shows, the bladder diary, which records frequency, volume, and number of incontinence episodes, is at the core of every OAB assessment and represents the gold standard investigation [8]. Additional information may include the number of pads used and the quantity of fluid intake [9]. The diary is a useful tool not only in the initial patient evaluation, where it allows clinicians to diagnose appropriately and plan an intervention, but also in objectively defining response to therapy. See Figure 1 for an overview of recommended endpoints in OAB.


To capture the impact of symptoms on patients, several psychometrically validated PROMs exist [10]. These include the Overactive Bladder Symptom Score (OABSS) [11], the Overactive Bladder Questionnaire (OAB-q) [12], the King's Health Questionnaire [13], and the Patient Perception of Bladder Condition [14]. PROMs are routinely included as secondary endpoints in trials alongside diaries [15]. Some trials rely solely on PROM-based primary endpoints rather than on bladder diaries [16]. Other frequently used PROMs include global assessments, satisfaction, and goal attainment scaling [17].

To understand, support, and inform the development of a new multidimensional PROM that could replace bladder diaries as a primary or key secondary endpoint in clinical trials, we conducted a review of literature published within the past 10 yr on OAB treatment-response assessments. In particular, we addressed three key issues: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) whether PROMs should provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) the use of measures of treatment satisfaction and goal achievement. We believe that if a new multidimensional measure can be developed, then standardization of response definitions would allow for cross-trial comparisons and remove the confusion caused by individual symptom reporting while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. When an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed its abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures, and both researchers had to agree before an article was excluded. The search targeted articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Inclusion criteria were: (1) publication between January 1, 2004 and January 22, 2016, (2) written in English, and (3) key search terms present in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.
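
The search logic described above can be sketched as a Boolean query string. This is a minimal sketch under stated assumptions, not the authors' actual query, which is not reported: the grouping (condition terms combined with OR, then AND with the outcome terms) is one reading of the term list, and the helper names `or_group` and `build_query` are hypothetical. The field tags follow standard PubMed search syntax.

```python
# Condition terms listed before "AND" in the text.
CONDITION_TERMS = [
    "overactive bladder",
    "lower urinary tract dysfunction",
    "lower urinary tract symptoms",
    "urinary incontinence",
    "urge urinary incontinence",
]

# Design and outcome terms listed after "AND" in the text.
OUTCOME_TERMS = [
    "randomized controlled trial",
    "bladder diary",
    "voiding diary",
    "urinary diary",
    "patient-reported outcomes",
    "patient satisfaction",
    "global assessment scale",
    "placebo-effect",
    "treatment response",
    "quality of life",
]


def or_group(terms):
    """Join terms with OR, restricting each to the title or abstract."""
    return "(" + " OR ".join(f'"{t}"[Title/Abstract]' for t in terms) + ")"


def build_query(start="2004/01/01", end="2016/01/22"):
    """Assemble the query; the OR/AND grouping is an assumption."""
    date_filter = f'("{start}"[Date - Publication] : "{end}"[Date - Publication])'
    parts = [
        or_group(CONDITION_TERMS),
        or_group(OUTCOME_TERMS),
        date_filter,
        "English[Language]",
    ]
    return " AND ".join(parts)


print(build_query())
```

Writing the query out in this form makes the three inclusion criteria (date range, language, and term location) explicit and reproducible.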

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of antimuscarinic agents because of the unique contribution bladder diaries make to the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21], which note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response is evident in clinical trials of OAB, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials [19].

 

Outcome | No. of studies | No. of patients given placebo | Mean change (SD) | p value
Incontinence episodes/d | 12 | 1847 | –1.12 (0.59) | <0.001
Micturition episodes/d | 11 | 1938 | –1.04 (0.8) | 0.0016
Urgency episodes/d | 3 | 928 | –1.15 (1.74) | 0.37
Mean micturition volume (ml) | 11 | 1854 | 10.61 (12.9) | 0.02
Maximum cystometric capacity (ml) | 6 | 208 | –16.87 (9.99) | 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely used PROMs and bladder diary endpoints exist within the OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual numbers of daytime and nighttime urinations were recorded, and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a comparison of the OABSS with a 3-d bladder diary [22], statistically significant improvements were found in all OABSS and corresponding bladder diary variables (p < 0.001), with high correlations (rs ≥ 0.5) between OABSS and diary score changes for nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales correlate significantly with the number of urinations per day (r = –0.20 and –0.23, respectively; p = 0.02). The sleep subscale and the number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q with a 3-d diary found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.
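
The positive likelihood ratio quoted above follows from the standard definition, sensitivity divided by (1 − specificity). The short Python check below recomputes it from the rounded sensitivity and specificity given in the text; the function name is hypothetical, and the calculation is a sketch rather than a reanalysis of the original data.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity); inputs are proportions."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


# QVD values for diagnosis of UUI as reported in the text.
lr_plus = positive_likelihood_ratio(0.82, 0.79)
print(round(lr_plus, 1))  # 3.9
```

The recomputed value of about 3.9 is close to the reported 4.0; a difference of this size would be expected if the published sensitivity and specificity were themselves rounded.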

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure: Correlations
OABSS [11] and [22]:
• OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were r = 0.40 (p < 0.001) and r = 0.26 (p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24]:
• OAB-q scores compared with urgency, daytime frequency, and nocturia from a 1-wk bladder diary and with urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25]:
• Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. In other studies, many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden:
• Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation:
• In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• In other studies of male-only and female populations, patients may overestimate or underreport the frequency of nocturia when using a bladder diary [30] and [31]
Recall period:
• In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event or experience and its reporting [34]
Lack of validation:
• Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer ones, as rating variance increases with the delay between an event and its reporting [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize assessment across treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation encompass, at a minimum, satisfaction, symptoms, HRQoL, and adverse events in any outcome measurement. Notably, OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing the confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106 Crossref
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403 Crossref
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526 Crossref
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807 Crossref
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398 Crossref
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44 Crossref
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142 Crossref
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142 Crossref
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16 Crossref
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323 Crossref
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574 Crossref
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379 Crossref
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086 Crossref
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23 Crossref
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946 Crossref
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350 Crossref
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503 Crossref
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463 Crossref
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64 Crossref
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563 Crossref
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394 Crossref
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;198:559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602 Crossref
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335 Crossref
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97 Crossref
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180 Crossref
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197 Crossref
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352 Crossref
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365 Crossref
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942 Crossref
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157 Crossref
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms?. J Pain Symptom Manage. 2010;40:191-199 Crossref
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174 Crossref
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82 Crossref
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926 Crossref
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631 Crossref
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266 Crossref
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52 Crossref
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977 Crossref

The International Continence Society defines overactive bladder (OAB) symptom complex as “urinary urgency, usually accompanied by frequency and nocturia, with or without urgency urinary incontinence (UUI), in the absence of urinary tract infection or other obvious pathology” [1]. This symptom-based definition is a useful starting point in terms of diagnosing patients; however, in terms of evaluating the impact of interventions, it fails to address what is most important to patients. Patients seek treatment because their symptoms affect their health-related quality of life (HRQoL) [2]. Given the heterogeneity of symptoms and multifaceted impact of OAB, measurement of outcomes in clinical trials is complicated, and researchers are confronted with the problem of balancing basic assessment with obtaining a comprehensive picture of patient outcomes [3]. Goldman et al [4] highlighted the lack of formal guidance and the significant heterogeneity of both response and nonresponse definitions in a systematic review of OAB treatment endpoints. Goldman et al [4] reports on the heterogeneity of symptom-based and patient-reported outcome measures (PROMs)-based definitions of treatment response/nonresponse. For example, while most studies defined UUI treatment response as a 50–100% reduction in UUI episodes [4], others included a reduction of ≥2 episodes/wk [5], ≥50% reduction in incontinence pad weight [6], an increase in ≥1 continent d/wk [5], or 3–7 consecutive dry d [7]. The symptoms of urgency and frequency have also been used as endpoints with similar heterogeneity in the criteria used for definitions of success.

As the definitions above illustrate, the bladder diary, which records frequency, volume, and number of incontinence episodes, is at the core of every OAB assessment and represents the gold standard investigation [8]. Additional information may include the number of pads used and the quantity of fluid intake [9]. The diary is a useful tool not only in the initial patient evaluation, where it allows clinicians to diagnose appropriately and plan an intervention, but also in objectively defining response to therapy. See Figure 1 for an overview of recommended endpoints in OAB.


Several psychometrically validated PROMs exist to capture the impact of symptoms on patients [10]. These include the Overactive Bladder Symptom Score (OABSS) [11], the Overactive Bladder Questionnaire (OAB-q) [12], the King's Health Questionnaire [13], and the Patient Perception of Bladder Condition [14]. PROMs are routinely included as secondary endpoints in trials alongside diaries [15], although some trials rely solely on PROMs rather than bladder diaries for their primary endpoints [16]. Other frequently used PROMs include global assessments, satisfaction, and goal attainment scaling [17].

To understand, support, and inform the development of a new multidimensional PROM that could replace bladder diaries as a primary or key secondary endpoint in clinical trials, we reviewed literature published within the past 10 yr on OAB treatment-response assessments. In particular, we addressed three key issues: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) whether PROMs should provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) the use of measures of treatment satisfaction and goal achievement. We believe that if a new multidimensional measure can be developed, standardization of response definitions would allow cross-trial comparisons and remove the confusion caused by individual symptom reporting, while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. When an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed its abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures, and both researchers had to agree before an article was excluded. The search targeted articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Inclusion criteria were: (1) publication between January 1, 2004 and January 22, 2016, (2) written in English, and (3) key search terms present in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.
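The exact search string is not reported. As a rough sketch only, the criteria above could be assembled into a PubMed-style Boolean query as follows. The grouping of terms into a condition set and an outcome set joined by AND is our reading of the term list, and the helper names are hypothetical; `[tiab]`, `[dp]`, and `[la]` are the standard PubMed field tags for title/abstract, publication date, and language.

```python
# Condition terms and outcome terms as listed in the inclusion criteria.
CONDITION_TERMS = [
    "overactive bladder",
    "lower urinary tract dysfunction",
    "lower urinary tract symptoms",
    "urinary incontinence",
    "urge urinary incontinence",
]
OUTCOME_TERMS = [
    "randomized controlled trial",
    "bladder diary",
    "voiding diary",
    "urinary diary",
    "patient-reported outcomes",
    "patient satisfaction",
    "global assessment scale",
    "placebo-effect",
    "treatment response",
    "quality of life",
]


def or_group(terms, field="tiab"):
    """Join quoted terms with OR, restricting each to one search field."""
    return "(" + " OR ".join(f'"{term}"[{field}]' for term in terms) + ")"


def build_query(conditions, outcomes, start_date, end_date):
    """Combine both term groups, a date range, and a language filter with AND."""
    date_filter = f"{start_date}:{end_date}[dp]"
    return " AND ".join(
        [or_group(conditions), or_group(outcomes), date_filter, "english[la]"]
    )


query = build_query(CONDITION_TERMS, OUTCOME_TERMS, "2004/01/01", "2016/01/22")
print(query)
```

Writing the query down in this form makes the assumed Boolean structure explicit and easy to rerun, but it should not be read as the string the authors actually used.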

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of antimuscarinic agents because of their unique contribution to the placebo effect. One issue is experimental subordination, in which a patient answers subjective questions in a way intended to please their physician [19]. In addition, because OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect becomes apparent when visual feedback on performance trains the patient to change their behavior [20]. This is recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21], which note that a self-monitoring effect may occur because a daily diary makes patients aware of their voiding habits. A placebo response is evident in clinical trials of OAB, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials [19].

| Outcome | No. of studies | No. of patients given placebo | Mean change (SD) | p value |
| --- | --- | --- | --- | --- |
| Incontinence episodes/d | 12 | 1847 | –1.12 (0.59) | <0.001 |
| Micturition episodes/d | 11 | 1938 | –1.04 (0.8) | 0.0016 |
| Urgency episodes/d | 3 | 928 | –1.15 (1.74) | 0.37 |
| Mean micturition volume (ml) | 11 | 1854 | 10.61 (12.9) | 0.02 |
| Maximum cystometric capacity (ml) | 6 | 208 | –16.87 (9.99) | 0.009 |

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely used PROMs and bladder diary endpoints are reported in the OAB literature. The OABSS, for example, is the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual numbers of daytime and nighttime urinations were recorded, and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a study comparing the OABSS with a 3-d bladder diary [22], statistically significant improvements were found in all OABSS and corresponding bladder diary variables (p < 0.001), with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales correlate significantly with the number of urinations per day (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and the number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q with a 3-d diary found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].
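For readers less familiar with the rank correlations (rs) quoted above, the sketch below shows one way a Spearman coefficient between questionnaire score changes and diary changes can be computed: each variable is converted to ranks (ties share the average rank) and the Pearson correlation of the ranks is taken. The data are invented for illustration and do not come from the cited validation studies.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical changes from baseline for five patients.
questionnaire_change = [-3, -1, 0, -2, -4]        # change in a symptom item score
diary_change = [-2.0, -1.0, 0.3, -0.5, -2.5]      # change in diary episodes/d

print(round(spearman_rho(questionnaire_change, diary_change), 2))
```

Because only ranks are used, the coefficient is insensitive to the different scales of a questionnaire item and a diary count, which is presumably why rank correlations are favored in these comparisons.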

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.
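The positive likelihood ratio quoted for the QVD follows from its sensitivity and specificity through the standard relation LR+ = sensitivity / (1 − specificity). The short sketch below recomputes it from the rounded figures given above; the function name is our own. With rounded inputs the result is about 3.9, consistent with the reported 4.0, the small gap presumably reflecting rounding of the published values.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity), the standard definition."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 <= specificity < 1.0:
        raise ValueError("specificity must lie in [0, 1) for LR+ to be defined")
    return sensitivity / (1.0 - specificity)


# Rounded values reported for the QVD in the diagnosis of UUI.
qvd_lr_plus = positive_likelihood_ratio(sensitivity=0.82, specificity=0.79)
print(f"QVD LR+ = {qvd_lr_plus:.2f}")
```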

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

OABSS [11] and [22]
• OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency showed low to moderate correlations with their corresponding bladder diary variables (r = 0.40, p < 0.001 and r = 0.26, p < 0.001, respectively)

OAB-q/V8 [23] and [24]
• OAB-q scores compared with urgency, daytime frequency, and nocturia from a 1-wk bladder diary and with urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of the OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis

QVD [25]
• Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries are a considerable inconvenience for patients [22] and [27]. In one study, compliance with diaries was high in the office setting, yet 52% of patients had difficulty adhering to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. Other studies found that patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden
• Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]

Over/underestimation
• In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, in male-only and in female patients, indicate that patients may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]

Recall period
• In general for PRO measures, shorter recall periods are considered better because rating variance increases with the delay between an event/experience and the reporting of it [34]

Lack of validation
• Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

For diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials, diaries are commonly completed for 3–7 d. In general, shorter recall periods are considered better than longer ones because rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation encompass satisfaction, symptoms, HRQoL, and adverse events as the minimum elements of any outcome measurement. OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing the confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was a total change in OABSS score [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread use of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

The diagnosis and treatment of tension headache and migraine have historically relied on diaries. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced because the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical and that the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to characterize disease burden more accurately than a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test simulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193:559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

The International Continence Society defines overactive bladder (OAB) symptom complex as “urinary urgency, usually accompanied by frequency and nocturia, with or without urgency urinary incontinence (UUI), in the absence of urinary tract infection or other obvious pathology” [1]. This symptom-based definition is a useful starting point in terms of diagnosing patients; however, in terms of evaluating the impact of interventions, it fails to address what is most important to patients. Patients seek treatment because their symptoms affect their health-related quality of life (HRQoL) [2]. Given the heterogeneity of symptoms and multifaceted impact of OAB, measurement of outcomes in clinical trials is complicated, and researchers are confronted with the problem of balancing basic assessment with obtaining a comprehensive picture of patient outcomes [3]. Goldman et al [4] highlighted the lack of formal guidance and the significant heterogeneity of both response and nonresponse definitions in a systematic review of OAB treatment endpoints. Goldman et al [4] reports on the heterogeneity of symptom-based and patient-reported outcome measures (PROMs)-based definitions of treatment response/nonresponse. For example, while most studies defined UUI treatment response as a 50–100% reduction in UUI episodes [4], others included a reduction of ≥2 episodes/wk [5], ≥50% reduction in incontinence pad weight [6], an increase in ≥1 continent d/wk [5], or 3–7 consecutive dry d [7]. The symptoms of urgency and frequency have also been used as endpoints with similar heterogeneity in the criteria used for definitions of success.

As evidenced by the above discussion, by recording frequency, volume, and number of incontinence episodes the bladder diary is at the core of every OAB assessment and represents the gold standard investigation [8]. Additional information may include the number of pads used and quantity of fluid intake [9]. The diary is clearly a useful tool not only in the initial patient evaluation as it allows clinicians to appropriately diagnose and plan an intervention, but also in objectively defining response to therapy. See Figure 1 for an overview of recommended endpoints in OAB.


To capture the impact of symptoms on patients, several psychometrically-validated PROMs exist [10]. These include the Overactive Bladder Symptom Score (OABSS) [11], the Overactive Bladder Questionnaire (OAB-q) [12], the King's Health Questionnaire [13], and the Patient Perception of Bladder Condition [14]. PROMs are routinely included as secondary endpoints in trials alongside diaries [15]. While some trials rely solely on primary nonbladder diary-based PROMs endpoints [16], other frequently used PROMs include global assessments, satisfaction, and goal attainment scaling [17].

To understand, support, and inform the development of a new multidimensional PROM that could replace bladder diaries as a primary or key secondary endpoint in clinical trials, we conducted a review of literature published within the past 10 yr on OAB treatment-response assessments. In particular, we addressed three key issues: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) whether PROMs should provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) the use of measures of treatment satisfaction and goal achievement. We believe that if a new multidimensional measure can be developed, then standardization of response definitions would allow for cross-trial comparisons and remove the confusion caused by individual symptom reporting while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. If an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed the article's abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures. The two researchers had to agree before an article was excluded. The search targeted articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Articles were included if they were: (1) published between January 1, 2004 and January 22, 2016, (2) written in English, and (3) contained key search terms in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.
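
As a rough illustration of how such a term list can be combined into a Boolean search string, the sketch below assumes that the condition terms listed before "AND" were joined with OR, that the remaining terms were likewise joined with OR, and that the two groups were then combined with AND. This grouping, the `[Title/Abstract]` field tag, and the function names are assumptions made for the example; the exact query syntax used in the review is not reported, and the date and language limits are left out of the sketch.

```python
# Condition terms (assumed to be the terms listed before "AND").
condition_terms = [
    "overactive bladder",
    "lower urinary tract dysfunction",
    "lower urinary tract symptoms",
    "urinary incontinence",
    "urge urinary incontinence",
]

# Outcome and design terms (assumed to be the terms listed after "AND").
outcome_terms = [
    "randomized controlled trial",
    "bladder diary",
    "voiding diary",
    "urinary diary",
    "patient-reported outcomes",
    "patient satisfaction",
    "global assessment scale",
    "placebo-effect",
    "treatment response",
    "quality of life",
]


def or_group(terms, field="Title/Abstract"):
    """Join quoted terms with OR, restricting each to the given search field."""
    return "(" + " OR ".join(f'"{term}"[{field}]' for term in terms) + ")"


def build_query(conditions, outcomes):
    """Combine the two OR groups with a single AND."""
    return f"{or_group(conditions)} AND {or_group(outcomes)}"


query = build_query(condition_terms, outcome_terms)
print(query)
```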

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of antimuscarinic agents because of the unique contribution bladder diaries make to the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21], which note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response is evident from this survey in clinical trials of OAB, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials [19].

 

Outcome | No. of studies | No. of patients given placebo | Mean change (SD) | p value
Incontinence episodes/d | 12 | 1847 | –1.12 (0.59) | <0.001
Micturition episodes/d | 11 | 1938 | –1.04 (0.8) | 0.0016
Urgency episodes/d | 3 | 928 | –1.15 (1.74) | 0.37
Mean micturition volume (ml) | 11 | 1854 | 10.61 (12.9) | 0.02
Maximum cystometric capacity (ml) | 6 | 208 | –16.87 (9.99) | 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely used PROMs and bladder diary endpoints exist within the OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual numbers of daytime and nighttime urinations were gathered, and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a study comparing the OABSS with a 3-d bladder diary [22], statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001) were found, with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales significantly correlate with the number of urinations per day (r = –0.20 and –0.23, respectively; p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q with a 3-d diary found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.
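
For readers less familiar with likelihood ratios, the short Python sketch below shows the standard relationship between sensitivity, specificity, and the positive likelihood ratio, LR+ = sensitivity / (1 − specificity). Applied to the rounded QVD values quoted above, it gives approximately 3.9, which is in line with the reported 4.0 if the original figure was computed from unrounded values (an assumption on our part). The function names are illustrative.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    return (1.0 - sensitivity) / specificity


# Rounded QVD values for the diagnosis of UUI, as quoted in the text.
lr_pos = positive_likelihood_ratio(0.82, 0.79)
lr_neg = negative_likelihood_ratio(0.82, 0.79)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```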

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure: OABSS [11] and [22]
• OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were r = 0.40 (p < 0.001) and r = 0.26 (p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables

Measure: OAB-q/V8 [23] and [24]
• OAB-q scores compared with urgency, daytime frequency, and nocturia from a 1-wk bladder diary and with urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively; p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis

Measure: QVD [25]
• Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. In other studies, many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden
• Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]

Over/underestimation
• In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, in male-only and in female patients, found that patients may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]

Recall period
• In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event/experience occurring and the reporting of it [34]

Lack of validation
• Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer recall periods, as rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were also reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d), substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence in pain that a 7-d window may characterize a patient's condition more accurately than an assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation should encompass, at a minimum, satisfaction, symptoms, HRQoL, and adverse events in any outcome measurement. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS score [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread use of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

Diagnosis and treatment of tension headache and migraine have historically relied on the use of diaries. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction measures, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193:559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

The International Continence Society defines overactive bladder (OAB) symptom complex as “urinary urgency, usually accompanied by frequency and nocturia, with or without urgency urinary incontinence (UUI), in the absence of urinary tract infection or other obvious pathology” [1]. This symptom-based definition is a useful starting point in terms of diagnosing patients; however, in terms of evaluating the impact of interventions, it fails to address what is most important to patients. Patients seek treatment because their symptoms affect their health-related quality of life (HRQoL) [2]. Given the heterogeneity of symptoms and multifaceted impact of OAB, measurement of outcomes in clinical trials is complicated, and researchers are confronted with the problem of balancing basic assessment with obtaining a comprehensive picture of patient outcomes [3]. Goldman et al [4] highlighted the lack of formal guidance and the significant heterogeneity of both response and nonresponse definitions in a systematic review of OAB treatment endpoints. Goldman et al [4] reports on the heterogeneity of symptom-based and patient-reported outcome measures (PROMs)-based definitions of treatment response/nonresponse. For example, while most studies defined UUI treatment response as a 50–100% reduction in UUI episodes [4], others included a reduction of ≥2 episodes/wk [5], ≥50% reduction in incontinence pad weight [6], an increase in ≥1 continent d/wk [5], or 3–7 consecutive dry d [7]. The symptoms of urgency and frequency have also been used as endpoints with similar heterogeneity in the criteria used for definitions of success.

As evidenced by the above discussion, by recording frequency, volume, and number of incontinence episodes the bladder diary is at the core of every OAB assessment and represents the gold standard investigation [8]. Additional information may include the number of pads used and quantity of fluid intake [9]. The diary is clearly a useful tool not only in the initial patient evaluation as it allows clinicians to appropriately diagnose and plan an intervention, but also in objectively defining response to therapy. See Figure 1 for an overview of recommended endpoints in OAB.


To capture the impact of symptoms on patients, several psychometrically-validated PROMs exist [10]. These include the Overactive Bladder Symptom Score (OABSS) [11], the Overactive Bladder Questionnaire (OAB-q) [12], the King's Health Questionnaire [13], and the Patient Perception of Bladder Condition [14]. PROMs are routinely included as secondary endpoints in trials alongside diaries [15]. While some trials rely solely on primary nonbladder diary-based PROMs endpoints [16], other frequently used PROMs include global assessments, satisfaction, and goal attainment scaling [17].

To understand, support, and inform the development of a new multidimensional PROMs that could be used to replace bladder diaries as a primary or key secondary endpoint in clinical trials, we conducted a review of literature published within the past 10 yr on OAB treatment-response assessments. In particular, addressing the key issues of: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) should PROMs provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) use of measures of treatment satisfaction and goal achievement. We believe that if a new multidimensional measure can be developed, then standardization of response definitions would allow for cross-trial comparisons and remove the confusion caused by individual symptom reporting while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. If an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed the article's abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures. The two researchers had to agree before an article was excluded. The search sought articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Inclusion criteria were: (1) published between January 1, 2004 and January 22, 2016, (2) written in English, and (3) containing key search terms in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.
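As an illustration only, the inclusion criteria and search terms above could be combined into a single PubMed query string along the following lines. The exact query used in the review is not reported; the split into condition terms and topic terms, the field tags ([tiab], [dp], [la]), and the helper names are assumptions made for this sketch.

```python
# Search terms as listed in the review. Splitting them into two groups joined
# by AND is an assumed reading of "urge urinary incontinence AND randomized
# controlled trial, ...", not the authors' documented query.
CONDITION_TERMS = [
    "overactive bladder",
    "lower urinary tract dysfunction",
    "lower urinary tract symptoms",
    "urinary incontinence",
    "urge urinary incontinence",
]
TOPIC_TERMS = [
    "randomized controlled trial",
    "bladder diary",
    "voiding diary",
    "urinary diary",
    "patient-reported outcomes",
    "patient satisfaction",
    "global assessment scale",
    "placebo-effect",
    "treatment response",
    "quality of life",
]


def or_group(terms, field="tiab"):
    """Join terms with OR, restricting each to a field (assumed: title/abstract)."""
    return "(" + " OR ".join(f'"{term}"[{field}]' for term in terms) + ")"


def build_query(conditions, topics, start="2004/01/01", end="2016/01/22"):
    """Combine both term groups with the date and language criteria."""
    date_filter = f"{start}:{end}[dp]"  # publication date range (assumed tag)
    language_filter = "english[la]"     # English-language filter (assumed tag)
    return " AND ".join(
        [or_group(conditions), or_group(topics), date_filter, language_filter]
    )


query = build_query(CONDITION_TERMS, TOPIC_TERMS)
print(query)
```

Keeping the two term groups as separate lists makes it easy to see which criterion a retrieved article satisfied, and to rerun the search with a different date window.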

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of treatment with antimuscarinic agents because of the unique contribution bladder diaries have toward the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21], which note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response in clinical trials of OAB is evident from this survey, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials [19].

 

Outcome | No. of studies | No. of patients given placebo | Mean change (SD) | p value
Incontinence episodes/d | 12 | 1847 | –1.12 (0.59) | <0.001
Micturition episodes/d | 11 | 1938 | –1.04 (0.8) | 0.0016
Urgency episodes/d | 3 | 928 | –1.15 (1.74) | 0.37
Mean micturition volume (ml) | 11 | 1854 | 10.61 (12.9) | 0.02
Maximum cystometric capacity (ml) | 6 | 208 | –16.87 (9.99) | 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely used PROMs and bladder diary endpoints exist within the OAB literature. The OABSS, for example, is the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual numbers of daytime and nighttime urinations were gathered, and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a study comparing the OABSS with a 3-d bladder diary [22], statistically significant improvements were found in all OABSS and corresponding bladder diary variables (p < 0.001), with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales correlate significantly with the number of urinations per day (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q with a 3-d diary found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].
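The rs values quoted above are Spearman rank correlations. As a minimal sketch of how such a coefficient is obtained from paired questionnaire and diary values, the following computes Spearman's rho as the Pearson correlation of average ranks. The function names and the paired values are hypothetical and chosen purely for illustration; they are not data from any of the cited studies.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired changes from baseline for six patients (made-up values):
# questionnaire item score change vs diary-recorded episodes/d change.
score_change = [-3, -1, 0, -2, -4, -1]
diary_change = [-2.0, -0.5, 0.3, -1.0, -2.5, -1.5]
print(f"Spearman rho = {spearman_rho(score_change, diary_change):.2f}")
```

Because the coefficient depends only on ranks, it suits ordinal questionnaire items whose scale steps are not equally spaced.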

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.
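The three QVD figures above are linked by the standard definition of the positive likelihood ratio, LR+ = sensitivity / (1 − specificity). The short sketch below applies that definition to the rounded values reported for the QVD; it gives roughly 3.9, in line with the reported 4.0 given rounding of the inputs. The function name is illustrative and not taken from the cited studies.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Return LR+ = sensitivity / (1 - specificity)."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 <= specificity < 1.0:
        raise ValueError("specificity must lie in [0, 1)")
    return sensitivity / (1.0 - specificity)


# Rounded values reported for the QVD in the diagnosis of UUI.
qvd_lr_plus = positive_likelihood_ratio(sensitivity=0.82, specificity=0.79)
print(f"QVD LR+ from rounded inputs: {qvd_lr_plus:.2f}")
```

An LR+ of about 4 means a positive QVD result is roughly four times as likely in a patient with UUI as in one without.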

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure Correlations
OABSS [11] and [22] • OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were (r = 0.40, p < 0.001) and (r = 0.26, p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24] • OAB-q scores compared with urgency, daytime frequency, and nocturia as assessed by both a 1-wk bladder diary and urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25] • Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries place a considerable inconvenience on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. In other studies, many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden • Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• In other studies of male-only and female populations, patients may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]
Recall period • In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event/experience occurring and the reporting of it [34]
Lack of validation • Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer recall periods, as rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. The International Consultation on Incontinence Research Society recommends that a comprehensive evaluation should encompass, at a minimum, satisfaction, symptoms, HRQoL, and adverse events in any outcome measurement. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators historically have relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was a total change in OABSS score [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on the use of PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROMs endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction measures, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

The International Continence Society defines overactive bladder (OAB) symptom complex as “urinary urgency, usually accompanied by frequency and nocturia, with or without urgency urinary incontinence (UUI), in the absence of urinary tract infection or other obvious pathology” [1]. This symptom-based definition is a useful starting point in terms of diagnosing patients; however, in terms of evaluating the impact of interventions, it fails to address what is most important to patients. Patients seek treatment because their symptoms affect their health-related quality of life (HRQoL) [2]. Given the heterogeneity of symptoms and multifaceted impact of OAB, measurement of outcomes in clinical trials is complicated, and researchers are confronted with the problem of balancing basic assessment with obtaining a comprehensive picture of patient outcomes [3]. Goldman et al [4] highlighted the lack of formal guidance and the significant heterogeneity of both response and nonresponse definitions in a systematic review of OAB treatment endpoints. Goldman et al [4] reports on the heterogeneity of symptom-based and patient-reported outcome measures (PROMs)-based definitions of treatment response/nonresponse. For example, while most studies defined UUI treatment response as a 50–100% reduction in UUI episodes [4], others included a reduction of ≥2 episodes/wk [5], ≥50% reduction in incontinence pad weight [6], an increase in ≥1 continent d/wk [5], or 3–7 consecutive dry d [7]. The symptoms of urgency and frequency have also been used as endpoints with similar heterogeneity in the criteria used for definitions of success.

As evidenced by the above discussion, by recording frequency, volume, and number of incontinence episodes the bladder diary is at the core of every OAB assessment and represents the gold standard investigation [8]. Additional information may include the number of pads used and quantity of fluid intake [9]. The diary is clearly a useful tool not only in the initial patient evaluation as it allows clinicians to appropriately diagnose and plan an intervention, but also in objectively defining response to therapy. See Figure 1 for an overview of recommended endpoints in OAB.


To capture the impact of symptoms on patients, several psychometrically-validated PROMs exist [10]. These include the Overactive Bladder Symptom Score (OABSS) [11], the Overactive Bladder Questionnaire (OAB-q) [12], the King's Health Questionnaire [13], and the Patient Perception of Bladder Condition [14]. PROMs are routinely included as secondary endpoints in trials alongside diaries [15]. While some trials rely solely on primary nonbladder diary-based PROMs endpoints [16], other frequently used PROMs include global assessments, satisfaction, and goal attainment scaling [17].

To understand, support, and inform the development of a new multidimensional PROMs that could be used to replace bladder diaries as a primary or key secondary endpoint in clinical trials, we conducted a review of literature published within the past 10 yr on OAB treatment-response assessments. In particular, addressing the key issues of: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) should PROMs provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) use of measures of treatment satisfaction and goal achievement. We believe that if a new multidimensional measure can be developed, then standardization of response definitions would allow for cross-trial comparisons and remove the confusion caused by individual symptom reporting while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. If an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed the article's abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures. The two researchers had to agree before an article was excluded. The goals of the search were articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Inclusion criteria were: (1) publication between January 1, 2004 and January 22, 2016, (2) written in English, and (3) key search terms present in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.
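As an illustration only, the three inclusion criteria can be expressed as a simple screening filter. The sketch below is not the procedure used in this review: the `Record` structure and the function name are hypothetical, and splitting the search terms into a condition group and an outcome/design group joined by AND is an assumption about how the listed terms were combined.

```python
from dataclasses import dataclass
from datetime import date

# Search terms as listed in the text. Grouping them into two lists joined
# by AND is an assumption, not a documented search strategy.
CONDITION_TERMS = [
    "overactive bladder",
    "lower urinary tract dysfunction",
    "lower urinary tract symptoms",
    "urinary incontinence",
    "urge urinary incontinence",
]
OUTCOME_TERMS = [
    "randomized controlled trial",
    "bladder diary",
    "voiding diary",
    "urinary diary",
    "patient-reported outcomes",
    "patient satisfaction",
    "global assessment scale",
    "placebo-effect",
    "treatment response",
    "quality of life",
]

# Publication window stated in the inclusion criteria.
WINDOW_START = date(2004, 1, 1)
WINDOW_END = date(2016, 1, 22)


@dataclass
class Record:
    """Hypothetical citation record; not a PubMed schema."""
    title: str
    abstract: str
    language: str
    published: date


def meets_inclusion_criteria(rec: Record) -> bool:
    """Return True if the record satisfies all three inclusion criteria."""
    text = f"{rec.title} {rec.abstract}".lower()
    in_window = WINDOW_START <= rec.published <= WINDOW_END
    in_english = rec.language.lower() == "english"
    has_condition = any(term in text for term in CONDITION_TERMS)
    has_outcome = any(term in text for term in OUTCOME_TERMS)
    return in_window and in_english and has_condition and has_outcome
```

A record passing this filter would still go through the two-reviewer abstract and full-text screening described above; the filter only mirrors the stated date, language, and search-term criteria.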

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of antimuscarinic agents because of the unique contribution bladder diaries make toward the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21], which note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. This review found evidence of a placebo response in clinical trials of OAB, as summarized in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials [19].

 

| Outcome | No. of studies | No. of patients given placebo | Mean change (SD) | p value |
| --- | --- | --- | --- | --- |
| Incontinence episodes/d | 12 | 1847 | –1.12 (0.59) | <0.001 |
| Micturition episodes/d | 11 | 1938 | –1.04 (0.8) | 0.0016 |
| Urgency episodes/d | 3 | 928 | –1.15 (1.74) | 0.37 |
| Mean micturition volume (ml) | 11 | 1854 | 10.61 (12.9) | 0.02 |
| Maximum cystometric capacity (ml) | 6 | 208 | –16.87 (9.99) | 0.009 |

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely used PROMs and bladder diary endpoints exist within OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual numbers of daytime and nighttime urinations were recorded, and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a study comparing the OABSS with a 3-d bladder diary [22], statistically significant improvements were found in all OABSS and corresponding bladder diary variables (p < 0.001), with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales significantly correlate with the number of urinations per day (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q with a 3-d diary found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.
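For readers less familiar with diagnostic accuracy statistics, the positive likelihood ratio follows from sensitivity and specificity by the standard definition LR+ = sensitivity / (1 − specificity). The short sketch below applies that definition to the rounded QVD figures quoted above; the function name is illustrative and the calculation is not taken from the cited study.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Standard definition: LR+ = sensitivity / (1 - specificity)."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


# Rounded QVD values for diagnosis of UUI as quoted in the text.
qvd_lr_plus = positive_likelihood_ratio(sensitivity=0.82, specificity=0.79)
print(f"{qvd_lr_plus:.2f}")  # prints 3.90
```

The rounded inputs give approximately 3.9, which agrees with the reported value of 4.0 once rounding of the published sensitivity and specificity is taken into account.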

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

| Measure | Correlations |
| --- | --- |
| OABSS [11] and [22] | • OABSS compared with a 3-d bladder diary |
| | • Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables) |
| | • High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence |
| | • Urgency and daytime frequency correlation coefficients were r = 0.40 (p < 0.001) and r = 0.26 (p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables |
| OAB-q/V8 [23] and [24] | • OAB-q scores compared with urgency, daytime frequency, and nocturia from a 1-wk bladder diary and with urogynecologist diagnosis |
| | • Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001) |
| | • OAB-V8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis |
| QVD [25] | • Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary |
| | • Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01) |
| | • High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01) |

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.
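Several of the coefficients summarized in this section are Spearman rank correlations (rs). As a reminder of what that statistic measures, the following minimal sketch computes rs as the Pearson correlation of average ranks, so that tied values share a rank. It is a generic textbook implementation, not the analysis code of any cited study, and the function names are illustrative.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))
```

In this setting, `x` might hold each patient's diary-recorded count (for example, nighttime voids) and `y` the corresponding questionnaire item score; because only ranks are used, the statistic suits ordinal questionnaire responses.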

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. Other studies found that many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

| Issue | Summary |
| --- | --- |
| Burden | • Patients must keep the diary for several consecutive days |
| | • In one study, 52% of patients had issues with adherence to instructions for proper use at home [28] |
| Over/underestimation | • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29] |
| | • Other studies, in male-only and in female populations, found that patient reports may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31] |
| Recall period | • In general for PRO measures, shorter recall periods are considered better because rating variance increases with the delay between an event/experience occurring and the reporting of it [34] |
| Lack of validation | • Diaries vary greatly in terms of content, format, and duration of recall period |
| | • Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32] |

PRO = patient reported outcome.

3.4. Recall periods

For diagnosing OAB, diary completion for 2–3 d has been recommended [33], although other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials, diaries are commonly completed for 3–7 d. In general, shorter recall periods are considered better than longer ones because rating variance increases with the delay between an event and its reporting [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation should encompass, at a minimum, satisfaction, symptoms, HRQoL, and adverse events in any outcome measurement. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators historically have relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS score [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on the use of PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test simulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trasnk, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms?. J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borrequero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977




To capture the impact of symptoms on patients, several psychometrically-validated PROMs exist [10]. These include the Overactive Bladder Symptom Score (OABSS) [11], the Overactive Bladder Questionnaire (OAB-q) [12], the King's Health Questionnaire [13], and the Patient Perception of Bladder Condition [14]. PROMs are routinely included as secondary endpoints in trials alongside diaries [15]. While some trials rely solely on primary nonbladder diary-based PROMs endpoints [16], other frequently used PROMs include global assessments, satisfaction, and goal attainment scaling [17].

To understand, support, and inform the development of a new multidimensional PROMs that could be used to replace bladder diaries as a primary or key secondary endpoint in clinical trials, we conducted a review of literature published within the past 10 yr on OAB treatment-response assessments. In particular, addressing the key issues of: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) should PROMs provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) use of measures of treatment satisfaction and goal achievement. We believe that if a new multidimensional measure can be developed, then standardization of response definitions would allow for cross-trial comparisons and remove the confusion caused by individual symptom reporting while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. If an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed the article's abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures. The two researchers had to agree before an article was excluded. The goals of the search were articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Inclusion criteria included: (1) published January 1, 2004 to January 22, 2016, (2) written in English, and (3) contain key search terms in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of antimuscarinic agents because of the unique contribution bladder diaries make to the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21], which note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response in clinical trials of OAB is evident from this review, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials [19].

 

Outcome | No. of studies | No. of patients given placebo | Mean change (SD) | p value
Incontinence episodes/d | 12 | 1847 | –1.12 (0.59) | <0.001
Micturition episodes/d | 11 | 1938 | –1.04 (0.8) | 0.0016
Urgency episodes/d | 3 | 928 | –1.15 (1.74) | 0.37
Mean micturition volume (ml) | 11 | 1854 | 10.61 (12.9) | 0.02
Maximum cystometric capacity (ml) | 6 | 208 | –16.87 (9.99) | 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely used PROMs and bladder diary endpoints exist within OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual numbers of daytime and nighttime urinations were recorded, and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (Spearman rs = 0.10–0.78). In a comparison study of the OABSS with a 3-d bladder diary [22], statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001) were found, with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales significantly correlate with the number of urinations per day (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q with a 3-d diary found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure | Correlations
OABSS [11] and [22] | • OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were r = 0.40 (p < 0.001) and r = 0.26 (p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24] | • OAB-q scores compared with urgency, daytime frequency, and nocturia recorded in a 1-wk bladder diary and with urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25] | • Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.
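The sensitivity, specificity, and positive likelihood ratio reported for the QVD in this section are linked by the standard relation LR+ = sensitivity / (1 − specificity). The sketch below is only a consistency check on the rounded published figures, not a reanalysis of the underlying data; the function name is illustrative.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Return LR+ = sensitivity / (1 - specificity)."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 <= specificity < 1.0:
        raise ValueError("specificity must lie in [0, 1)")
    return sensitivity / (1.0 - specificity)


# Rounded values reported for the QVD in the diagnosis of UUI.
lr_plus = positive_likelihood_ratio(0.82, 0.79)
print(f"LR+ = {lr_plus:.2f}")  # LR+ = 3.90
```

The result of about 3.9 is close to the reported 4.0; the small difference presumably reflects rounding of the published sensitivity and specificity.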

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. Other studies found that many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden | • Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation | • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, in male-only and female populations, found that patient reports may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]
Recall period | • In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event/experience and the reporting of it [34]
Lack of validation | • Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient-reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer recall periods, as rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d), substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence in pain research that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society (ICI-RS) highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. The ICI-RS recommends that a comprehensive evaluation should encompass satisfaction, symptoms, HRQoL, and adverse events as the minimum elements of any outcome measurement. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was a total change in OABSS score [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

The diagnosis and treatment of tension headache and migraine have historically relied on diaries. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction measures, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve on efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

To understand, support, and inform the development of a new multidimensional PROMs that could be used to replace bladder diaries as a primary or key secondary endpoint in clinical trials, we conducted a review of literature published within the past 10 yr on OAB treatment-response assessments. In particular, addressing the key issues of: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) should PROMs provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) use of measures of treatment satisfaction and goal achievement. We believe that if a new multidimensional measure can be developed, then standardization of response definitions would allow for cross-trial comparisons and remove the confusion caused by individual symptom reporting while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. If an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed the article's abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures. The two researchers had to agree before an article was excluded. The goals of the search were articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Inclusion criteria included: (1) published January 1, 2004 to January 22, 2016, (2) written in English, and (3) contain key search terms in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of treatment with antimuscarinic agents because of the unique contribution bladder diaries have toward the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21] that note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response is evident from this survey in clinical trials of OAB, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials [19].

 

Outcome | No. of studies | No. of patients given placebo | Mean change (SD) | p value
Incontinence episodes/d | 12 | 1847 | –1.12 (0.59) | <0.001
Micturition episodes/d | 11 | 1938 | –1.04 (0.8) | 0.0016
Urgency episodes/d | 3 | 928 | –1.15 (1.74) | 0.37
Mean micturition volume (ml) | 11 | 1854 | 10.61 (12.9) | 0.02
Maximum cystometric capacity (ml) | 6 | 208 | –16.87 (9.99) | 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely used PROMs and bladder diary endpoints are reported in the OAB literature. The OABSS, for example, is the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual numbers of daytime and nighttime urinations were recorded, and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a study comparing the OABSS with a 3-d bladder diary [22], statistically significant improvements were found in all OABSS and corresponding bladder diary variables (p < 0.001), with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS may serve as an alternative to a diary for assessment in clinical practice.

The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales correlate significantly with the number of urinations per day (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and the number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q with a 3-d diary found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.
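The relationship between the three diagnostic figures quoted for the QVD can be checked with the standard definition of the positive likelihood ratio, LR+ = sensitivity / (1 − specificity). The following is a minimal sketch for illustration only; the function names are hypothetical and are not taken from the cited studies. With the rounded published inputs (0.82 and 0.79) it gives roughly 3.9, which is in line with the reported value of 4.0; the small difference presumably reflects rounding of the inputs.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity).

    How much more likely a positive questionnaire result is in patients
    with the condition than in patients without it.
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    return (1.0 - sensitivity) / specificity


# Rounded values quoted in the text for QVD diagnosis of UUI.
qvd_sensitivity = 0.82
qvd_specificity = 0.79

lr_pos = positive_likelihood_ratio(qvd_sensitivity, qvd_specificity)
lr_neg = negative_likelihood_ratio(qvd_sensitivity, qvd_specificity)

print(f"LR+ = {lr_pos:.2f}")
print(f"LR- = {lr_neg:.2f}")
```

The negative likelihood ratio is not reported in the text; it is computed here only to show the companion formula.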

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure Correlations
OABSS [11] and [22] • OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were r = 0.40 (p < 0.001) and r = 0.26 (p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24] • OAB-q scores compared with urgency, daytime frequency, and nocturia from a 1-wk bladder diary and with urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of the OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25] • Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.
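Several of the coefficients summarized in Table 2 are Spearman rank correlations (rs) between a questionnaire item and the matching diary variable. As a rough illustration of how such a coefficient is obtained, the sketch below ranks both series (tied values share the mean rank) and then computes the Pearson correlation of the ranks. The function names and the small data set are hypothetical and invented purely for illustration; they do not come from any of the cited studies.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: a questionnaire nighttime-frequency item (0-3 scale)
# and the mean number of night voids from a 3-d diary, for eight patients.
questionnaire_item = [0, 1, 1, 2, 3, 3, 2, 0]
diary_night_voids = [0.3, 1.0, 1.7, 2.0, 3.3, 2.7, 1.3, 0.7]

rho = spearman_rho(questionnaire_item, diary_night_voids)
print(f"Spearman's rho = {rho:.2f}")
```

Working on ranks rather than raw values is what makes the coefficient suitable for ordinal questionnaire items, where the distance between response categories is not assumed to be equal.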

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and over- or underestimation of symptom frequency using bladder diaries. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. In other studies, many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden • Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, in male-only and in female populations, suggest that patients may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]
Recall period • In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event/experience occurring and the reporting of it [34]
Lack of validation • Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer recall periods, as rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation should encompass, as a minimum, satisfaction, symptoms, HRQoL, and adverse events in any outcome measurement. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing the confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS score [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROMs endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome could bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977



Inclusion criteria included: (1) published January 1, 2004 to January 22, 2016, (2) written in English, and (3) contain key search terms in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of treatment with antimuscarinic agents because of the unique contribution bladder diaries have toward the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21] that note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response is evident from this survey in clinical trials of OAB, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials Error! Bookmark not defined [19].

 

Outcome No. of studies No. of patients given placebo Mean change (SD) p value
Incontinence episodes/d 12 1847 –1.12 (0.59) <0.001
Micturition episodes/d 11 1938 –1.04 (0.8) 0.0016
Urgency episodes/d 3 928 –1.15 (1.74) 0.37
Mean micturition volume (ml) 11 1854 10.61 (12.9) 0.02
Maximum cystometric capacity (ml) 6 208 –16.87 (9.99) 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely-used PROMs and bladder diary endpoints exist within OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual number of daytime and nighttime urinations were gathered and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a comparison study of the OABSS to a 3-d bladder diary [22], statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001) were found with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales significantly correlate with the number of urinations per day (r = –0.20 and –0.23 respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q to a 3-d diary, found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.
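As a brief illustration of how the diagnostic accuracy figures above relate to one another, the sketch below applies the standard definition of the positive likelihood ratio, sensitivity / (1 − specificity). The function name is hypothetical. Using the rounded values reported for the QVD gives approximately 3.9; the small difference from the published 4.0 is presumably due to rounding of the inputs, although this is an assumption rather than something stated in the source.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity); both inputs are proportions."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


# Rounded QVD values for diagnosis of UUI, as quoted in the text.
lr_plus = positive_likelihood_ratio(0.82, 0.79)
print(round(lr_plus, 2))
```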

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure Correlations
OABSS [11] and [22] • OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were (r = 0.40, p < 0.001) and (r = 0.26, p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24] • OAB-q scores compared with urgency, daytime frequency, and nocturia as assessed by both a 1-wk bladder diary and urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25] • Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight the burden of bladder diaries, poor compliance with them, and misestimation of symptom frequency. Diaries impose considerable inconvenience on patients [22] and [27]. In one study, compliance with diaries was high in the office setting, yet 52% of patients had problems adhering to instructions at home [28]. In another study, only 47% of women (p = 0.01) accurately reported daytime frequency using a diary [29]. In other studies, many patients overestimated or underreported nighttime frequency in a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden • Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• In other studies, in male-only and female populations, patient reports may overestimate or underreport the frequency of nocturia when a bladder diary is used [30] and [31]
Recall period • In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event or experience and its reporting [34]
Lack of validation • Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials, diaries are commonly completed for 3–7 d. In general, shorter recall periods are considered better than longer ones, as rating variance increases with the delay between an event and its reporting [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize assessment across treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation should encompass satisfaction, symptoms, HRQoL, and adverse events as the minimum elements of any outcome measurement. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing the confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as primary endpoints. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread use of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROMs endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed before the current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates the key symptoms measured in a diary and assesses the impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment decreases. Secondly, the incorporation of a HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical and that the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to characterize disease burden more accurately than a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of treatment with antimuscarinic agents because of the unique contribution bladder diaries have toward the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21] that note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response is evident from this survey in clinical trials of OAB, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials Error! Bookmark not defined [19].

 

Outcome No. of studies No. of patients given placebo Mean change (SD) p value
Incontinence episodes/d 12 1847 –1.12 (0.59) <0.001
Micturition episodes/d 11 1938 –1.04 (0.8) 0.0016
Urgency episodes/d 3 928 –1.15 (1.74) 0.37
Mean micturition volume (ml) 11 1854 10.61 (12.9) 0.02
Maximum cystometric capacity (ml) 6 208 –16.87 (9.99) 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely-used PROMs and bladder diary endpoints exist within OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual number of daytime and nighttime urinations were gathered and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a comparison study of the OABSS to a 3-d bladder diary [22], statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001) were found with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales significantly correlate with the number of urinations per day (r = –0.20 and –0.23 respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q to a 3-d diary, found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure Correlations
OABSS [11] and [22]
• OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were r = 0.40 (p < 0.001) and r = 0.26 (p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24]
• OAB-q scores compared with urgency, daytime frequency, and nocturia recorded in a 1-wk bladder diary and with urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB-V8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25]
• Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.
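The diagnostic accuracy figures reported for the QVD and OAB-V8 are linked by a standard identity: the positive likelihood ratio equals sensitivity divided by (1 − specificity). The short Python sketch below applies that identity to the rounded values quoted in the text; the function name is an illustrative choice. With the rounded QVD inputs the result is about 3.9, close to the reported 4.0, and the small gap presumably reflects rounding of the published sensitivity and specificity (an assumption, since the unrounded values are not given here).

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity), with both inputs as proportions."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


# Rounded values as quoted in the text.
qvd_lr = positive_likelihood_ratio(0.82, 0.79)     # QVD for diagnosis of UUI
oab_v8_lr = positive_likelihood_ratio(0.96, 0.827)  # OAB-V8 validation

print(f"QVD LR+ (from rounded inputs): {qvd_lr:.2f}")
print(f"OAB-V8 LR+ (from rounded inputs): {oab_v8_lr:.2f}")
```

A likelihood ratio summarizes how much a positive questionnaire result shifts the odds of the condition, which makes it easier to compare instruments than sensitivity and specificity considered separately.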

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight the burden of bladder diaries, poor compliance with them, and the over- or underestimation of symptom frequency when they are used. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was high in the office setting, yet 52% of patients had problems adhering to instructions at home [28]. In another study, only 47% of women (p = 0.01) accurately reported daytime frequency using a diary [29]. In other studies, many patients overestimated or underreported nighttime frequency in a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden
• Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation
• In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, of male-only and female patients, suggest that patient reports may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]
Recall period
• In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event or experience and the reporting of it [34]
Lack of validation
• Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient-reported outcome.

3.4. Recall periods

For diagnosing OAB, completion of a diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials, diaries are commonly completed for 3–7 d. In general, shorter recall periods are considered better than longer ones, as rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were also reviewed. In pain and fatigue assessments, momentary reports showed substantial concordance with recalled reports (over 1–28 d), suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Moreover, there is some evidence in pain that a 7-d window may characterize a patient's condition more accurately than an assessment of current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability across treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation encompass satisfaction, symptoms, HRQoL, and adverse events as the minimum elements of any outcome measurement. Notably, OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing the confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined the literature on relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient-identified and that have relied on diaries to capture symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT of a treatment for interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as primary endpoints. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread use of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42].

RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

The diagnosis and treatment of tension headache and migraine have historically relied on diaries. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs that correlate with diaries include the Migraine Disability Assessment and the Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries produce a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed before the current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period should match the purpose of the study. A new, appropriately developed measure with a longer recall period could reduce patient burden and improve overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates the key symptoms recorded in a diary and assesses the impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. First, if the recall period is extended from momentary assessment to weekly recall, the training effect could be reduced because assessments are less frequent. Second, incorporating an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical and that the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment with weekly recall has the potential to characterize disease burden more accurately than a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977



Significant correlations between widely-used PROMs and bladder diary endpoints exist within OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual number of daytime and nighttime urinations were gathered and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a comparison study of the OABSS to a 3-d bladder diary [22], statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001) were found with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales significantly correlate with the number of urinations per day (r = –0.20 and –0.23 respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q to a 3-d diary, found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

Measure Correlations
OABSS [11] and [22] • OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were r = 0.40 (p < 0.001) and r = 0.26 (p < 0.001), respectively, indicating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24] • OAB-q scores compared with urgency, daytime frequency, and nocturia from a 1-wk bladder diary and with urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• The OAB-V8 is an 8-item version of the OAB-q; OAB-V8 bothersomeness scores were compared with bladder diary and clinician diagnosis
QVD [25] • Four QVD subscales (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. In other studies, many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

Burden • Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, in male-only and female populations, suggest that patients may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]
Recall period • In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event or experience and the reporting of it [34]
Lack of validation • Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient-reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials, diaries are commonly completed for 3–7 d. In general, shorter recall periods are considered better than longer ones, as rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were also reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d), substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Moreover, there is some evidence in pain research that a 7-d window may characterize a patient's condition more accurately than an assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. The Society recommends that a comprehensive evaluation should encompass satisfaction, symptoms, HRQoL, and adverse events as the minimum elements of any outcome measurement. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing the confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined the literature on related therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are identified by patient report and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT of treatment for interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as primary endpoints. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread use of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries produce a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates the key symptoms measured in a diary and assesses the impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to characterize disease burden more accurately than a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

Significant correlations between widely-used PROMs and bladder diary endpoints exist within OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual number of daytime and nighttime urinations were gathered and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a comparison study of the OABSS to a 3-d bladder diary [22], statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001) were found with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales significantly correlate with the number of urinations per day (r = –0.20 and –0.23 respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q to a 3-d diary, found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure Correlations
OABSS [11] and [22] • OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were (r = 0.40, p < 0.001) and (r = 0.26, p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24] • OAB-q scores compared with both urgency, daytime frequency, and nocturia with 1-wk bladder diary and urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB V-8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25] • Four QVD subscale (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q = Overactive Bladder Questionnaire; OAB-V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire-Based Voiding Diary.
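Several of the coefficients summarized in Table 2 are Spearman rank correlations. For readers less familiar with the statistic, the following Python sketch shows one way it can be computed: both variables are converted to ranks (ties receive the average rank) and a Pearson correlation is taken on the ranks. The data are hypothetical and chosen only for illustration; they are not drawn from any of the cited studies, and the function names are our own.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical changes from baseline for five patients (illustration only).
questionnaire_change = [-2, -1, 0, -3, -1]   # change in a questionnaire item score
diary_change = [-1.5, -1.0, 0.2, -2.0, -0.3]  # change in diary episodes/night

print(f"Spearman's rho = {spearman_rho(questionnaire_change, diary_change):.2f}")
```

Because the statistic depends only on rank order, it is suited to ordinal questionnaire scores, which is one plausible reason it is used when PROMs are compared with diary counts.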

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight the burden of bladder diaries, poor compliance with them, and the over- or underestimation of symptom frequency that can result. Diaries impose a considerable burden on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. Other studies found that many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden
• Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]

Over/underestimation
• In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, in male-only and female populations, suggest that patient reports may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]

Recall period
• In general for PRO measures, shorter recall periods are considered better, as rating variance increases with the delay between an event/experience occurring and the reporting of it [34]

Lack of validation
• Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer recall periods, as rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation should encompass, as a minimum, satisfaction, symptoms, HRQoL, and adverse events in any outcome measurement. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as primary endpoints. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

The diagnosis and treatment of tension headache and migraine have historically relied on diaries. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new measure appropriately developed with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193:559.e1-7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trasnk, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure Correlations
OABSS [11] and [22] • OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were (r = 0.40, p < 0.001) and (r = 0.26, p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24] • OAB-q scores compared with both urgency, daytime frequency, and nocturia with 1-wk bladder diary and urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB V-8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25] • Four QVD subscale (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q/V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries place a large inconvenience on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. Other studies of many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden • Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies, in male-only and in female samples, suggest that patients may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]
Recall period • In general for PRO measures, shorter recall periods are considered better because rating variance increases with the delay between an event/experience and the reporting of it [34]
Lack of validation • Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of the diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer recall periods because rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society (ICI-RS) highlighted the need for a standardized measure in all outcome evaluations to increase comparability between treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation encompass, at a minimum, satisfaction, symptoms, HRQoL, and adverse events. Notably, OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing the confusion caused by individual symptom reporting.
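
To make the idea of a composite endpoint concrete, the Python sketch below classifies a patient as a responder only when symptom, HRQoL, and satisfaction criteria are all met. It is a minimal illustration under stated assumptions: the ≥50% reduction in UUI episodes mirrors one response definition reported in the literature [4], whereas the HRQoL threshold, the scale direction, and all names (`PatientOutcome`, `composite_response`, `MIN_HRQOL_GAIN`) are hypothetical and not drawn from any validated instrument. Adverse events, which the recommendation also lists, are omitted for brevity.

```python
from dataclasses import dataclass

# Illustrative thresholds only; neither is a validated responder definition.
MIN_UUI_REDUCTION = 0.50  # >=50% fewer UUI episodes, one definition reported in [4]
MIN_HRQOL_GAIN = 10.0     # hypothetical minimally important change, in scale points


@dataclass
class PatientOutcome:
    uui_baseline: float    # UUI episodes/wk at baseline
    uui_followup: float    # UUI episodes/wk at follow-up
    hrqol_baseline: float  # HRQoL score; higher is assumed to mean better
    hrqol_followup: float
    satisfied: bool        # patient-reported treatment satisfaction


def symptom_response(p: PatientOutcome) -> bool:
    """True if the relative reduction in UUI episodes meets the threshold."""
    if p.uui_baseline <= 0:
        return False  # relative reduction is undefined without baseline episodes
    reduction = (p.uui_baseline - p.uui_followup) / p.uui_baseline
    return reduction >= MIN_UUI_REDUCTION


def composite_response(p: PatientOutcome) -> bool:
    """True only if the symptom, HRQoL, and satisfaction criteria are all met."""
    hrqol_gain = p.hrqol_followup - p.hrqol_baseline
    return symptom_response(p) and hrqol_gain >= MIN_HRQOL_GAIN and p.satisfied


# Two hypothetical patients with identical symptom improvement.
patient_a = PatientOutcome(14, 6, 55.0, 70.0, satisfied=True)
patient_b = PatientOutcome(14, 6, 55.0, 58.0, satisfied=True)
```

In this sketch both hypothetical patients count as symptom responders, but only `patient_a` meets the composite definition, which is the distinction the single-symptom approach cannot capture.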

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new, appropriately developed measure with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction measures, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trasnk, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms?. J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

The International Continence Society defines overactive bladder (OAB) symptom complex as “urinary urgency, usually accompanied by frequency and nocturia, with or without urgency urinary incontinence (UUI), in the absence of urinary tract infection or other obvious pathology” [1]. This symptom-based definition is a useful starting point in terms of diagnosing patients; however, in terms of evaluating the impact of interventions, it fails to address what is most important to patients. Patients seek treatment because their symptoms affect their health-related quality of life (HRQoL) [2]. Given the heterogeneity of symptoms and multifaceted impact of OAB, measurement of outcomes in clinical trials is complicated, and researchers are confronted with the problem of balancing basic assessment with obtaining a comprehensive picture of patient outcomes [3]. Goldman et al [4] highlighted the lack of formal guidance and the significant heterogeneity of both response and nonresponse definitions in a systematic review of OAB treatment endpoints. Goldman et al [4] reports on the heterogeneity of symptom-based and patient-reported outcome measures (PROMs)-based definitions of treatment response/nonresponse. For example, while most studies defined UUI treatment response as a 50–100% reduction in UUI episodes [4], others included a reduction of ≥2 episodes/wk [5], ≥50% reduction in incontinence pad weight [6], an increase in ≥1 continent d/wk [5], or 3–7 consecutive dry d [7]. The symptoms of urgency and frequency have also been used as endpoints with similar heterogeneity in the criteria used for definitions of success.

As evidenced by the above discussion, by recording frequency, volume, and number of incontinence episodes the bladder diary is at the core of every OAB assessment and represents the gold standard investigation [8]. Additional information may include the number of pads used and quantity of fluid intake [9]. The diary is clearly a useful tool not only in the initial patient evaluation as it allows clinicians to appropriately diagnose and plan an intervention, but also in objectively defining response to therapy. See Figure 1 for an overview of recommended endpoints in OAB.


To capture the impact of symptoms on patients, several psychometrically-validated PROMs exist [10]. These include the Overactive Bladder Symptom Score (OABSS) [11], the Overactive Bladder Questionnaire (OAB-q) [12], the King's Health Questionnaire [13], and the Patient Perception of Bladder Condition [14]. PROMs are routinely included as secondary endpoints in trials alongside diaries [15]. While some trials rely solely on primary nonbladder diary-based PROMs endpoints [16], other frequently used PROMs include global assessments, satisfaction, and goal attainment scaling [17].

To understand, support, and inform the development of a new multidimensional PROMs that could be used to replace bladder diaries as a primary or key secondary endpoint in clinical trials, we conducted a review of literature published within the past 10 yr on OAB treatment-response assessments. In particular, addressing the key issues of: (1) whether the definition of treatment response/nonresponse should include a symptom assessment, (2) should PROMs provide information about whether a reduction in symptoms actually improves patients’ lives, and (3) use of measures of treatment satisfaction and goal achievement. We believe that if a new multidimensional measure can be developed, then standardization of response definitions would allow for cross-trial comparisons and remove the confusion caused by individual symptom reporting while collecting data that are meaningful to both patients and practitioners.

We conducted a narrative review of OAB literature available in the PubMed database. If an article that satisfied the study inclusion criteria was identified, two members of the research team (Kopp and Evans) reviewed the article's abstract for inclusion. If the two authors agreed, the full-text article was retrieved for analysis. A full-text article was excluded if its focus was not related to OAB outcome measures. The two researchers had to agree before an article was excluded. The goals of the search were articles that examined bladder diary utility compared with other PROMs, the presence of placebo effects, patient burden in completing daily diaries, appropriate recall, recommendations for endpoints in OAB trials, and how other therapeutic areas utilize diaries and PROMs.

Inclusion criteria included: (1) published January 1, 2004 to January 22, 2016, (2) written in English, and (3) contain key search terms in the title or abstract. Key search terms included: overactive bladder, lower urinary tract dysfunction, lower urinary tract symptoms, urinary incontinence, urge urinary incontinence AND randomized controlled trial, bladder diary, voiding diary, urinary diary, patient-reported outcomes, patient satisfaction, global assessment scale, placebo-effect, treatment response, and quality of life. In addition, we examined literature in other chronic diseases in which treatment response has historically been determined by patient reporting via diaries. A systematic review of OAB literature was not completed, as we were specifically interested in the assessment of treatment response in clinical trials.

Figure 2 outlines the search results of the review. Ultimately, 80 articles were included in the review.


3.1. Placebo and training effects in OAB trials

Clinical trials for the treatment of OAB have noted a significant response in patients treated with placebo [18]. According to Mangera et al [19], bladder diaries may influence treatment outcomes in randomized controlled trials (RCTs) of treatment with antimuscarinic agents because of the unique contribution bladder diaries have toward the placebo effect. One issue is experimental subordination, where a patient answers subjective questions in a way that is seen to please their physician [19]. Also, as OAB constitutes a complex of symptoms, behaviors, and behavior modifications, a bladder training effect is apparent when visual feedback of performance trains the patient to change their behavior [20]. This has been recognized in the American Urological Association/Society of Urodynamics, Female Pelvic Medicine, and Urogenital Reconstruction OAB Diagnosis and Treatment Guidelines [21] that note that a self-monitoring effect may occur as a daily diary makes patients aware of their voiding habits. A placebo response is evident from this survey in clinical trials of OAB, as seen in Table 1.

Table 1

Placebo and training effects in overactive bladder randomized controlled trials Error! Bookmark not defined [19].

 

Outcome No. of studies No. of patients given placebo Mean change (SD) p value
Incontinence episodes/d 12 1847 –1.12 (0.59) <0.001
Micturition episodes/d 11 1938 –1.04 (0.8) 0.0016
Urgency episodes/d 3 928 –1.15 (1.74) 0.37
Mean micturition volume (ml) 11 1854 10.61 (12.9) 0.02
Maximum cystometric capacity (ml) 6 208 –16.87 (9.99) 0.009

SD = standard deviation.

3.2. Correlations between PRO measures and bladder diary endpoints

Significant correlations between widely-used PROMs and bladder diary endpoints exist within OAB literature. The OABSS, for example, consists of the sum score of four symptom items: daytime frequency, nighttime frequency, urgency, and UUI [11]. In the original validation, the actual number of daytime and nighttime urinations were gathered and urgency and UUI were assessed with a frequency scale. Each symptom score correlated positively with the OABSS (rs = 0.10–0.78). In a comparison study of the OABSS to a 3-d bladder diary [22], statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001) were found with high correlations (rs ≥ 0.5) between score changes in nighttime frequency and UUI. Consequently, the OABSS is an alternative to a diary for assessment in clinical practice. The OAB-q is a validated 33-item symptom bother and HRQoL questionnaire [12]. The coping and social interactions subscales significantly correlate with the number of urinations per day (r = –0.20 and –0.23 respectively, p = 0.02). The sleep subscale and number of urinations per night were highly correlated (r = –0.50, p < 0.0001). A validation study comparing the 1-wk and 4-wk versions of the OAB-q to a 3-d diary, found moderate to strong correlations between the OAB-q subscales and nearly all diary variables [23].

The Overactive Bladder Awareness Tool (OAB-V8) is a validated 8-item instrument [24]. In the validation of the OAB-V8, clinical variables of urgency, nocturia, and daytime frequency were collected with a bladder diary and compared with OAB-V8 scores; the OAB-V8 performed well with high sensitivity (0.96) and specificity (0.827).

The Questionnaire-Based Voiding Diary (QVD) is another validated instrument with a high correlation to a 48-h bladder diary [25] and [26]. The sensitivity, specificity, and positive likelihood ratio of the QVD for diagnosis of UUI were 0.82, 0.79, and 4.0, respectively. The authors conclude that the QVD is a useful alternative to the bladder diary. See Table 2 for a summary of correlations between PROMs and bladder diary endpoints.

Table 2

Correlations between patient-reported outcome measures and bladder diary endpoints

 

Measure Correlations
OABSS [11] and [22] • OABSS compared with a 3-d bladder diary
• Statistically significant improvements in all OABSS and corresponding bladder diary variables (p < 0.001 for all variables)
• High correlations (Spearman's rho ≥ 0.5) between score changes in nighttime frequency and urgency incontinence
• Urgency and daytime frequency correlation coefficients were (r = 0.40, p < 0.001) and (r = 0.26, p < 0.001), respectively, demonstrating low to moderate correlation with their corresponding bladder diary variables
OAB-q/V8 [23] and [24] • OAB-q scores compared with both urgency, daytime frequency, and nocturia with 1-wk bladder diary and urogynecologist diagnosis
• Coping and social interactions subscales were significantly correlated with the no. of urinations/d (r = –0.20 and –0.23, respectively, p = 0.02). The sleep subscale and no. of urinations per night were highly correlated (r = –0.50, p < 0.0001)
• OAB V-8 is an 8-item version of OAB-q; OAB-V8 bothersomeness scores compared with bladder diary and clinician diagnosis
QVD [25] • Four QVD subscale (type and amount of fluid intake, urinary output, urinary symptoms, and fluid intake behavior) demonstrated high correlations with a 48-h bladder diary
• Correlation between QVD fluid intake and bladder diary was high (r = 0.65–0.83, p < 0.01)
• High correlation between fluid intake behavior and urinary frequency (r = 0.82, p < 0.01), urgency (r = 0.77, p < 0.01), and urge incontinence (r = 0.71, p < 0.01)

OABSS = Overactive Bladder Symptom Score; OAB-q/V8 = Overactive Bladder Awareness Tool; QVD = Questionnaire Based Voiding Diary.

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries place a large inconvenience on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. Other studies of many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden • Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• In other studies of male-only and female patients, patient reports overestimated or underreported the frequency of nocturia using a bladder diary [30] and [31]
Recall period • In general for PRO measures, shorter recall periods are considered better because rating variance increases with the delay between an event/experience and the reporting of it [34]
Lack of validation • Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient-reported outcome.

3.4. Recall periods

In diagnosing OAB, completion of a diary for 2–3 d has been recommended [33]; other recommendations in the literature range from 24 h to 2 wk [9]. In clinical trials, diaries are commonly completed for 3–7 d. In general, shorter recall periods are considered better than longer ones because rating variance increases with the delay between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries, and a comparison of a 5-d diary with a 24-h diary found that the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were also reviewed. In pain and fatigue assessments, momentary reports showed substantial concordance with recalled reports (over 1–28 d), suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. There is also some evidence in pain research that a 7-d window may characterize a patient's condition more accurately than an assessment of current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. It recommends that a comprehensive evaluation should encompass, at a minimum, satisfaction, symptoms, HRQoL, and adverse events. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing confusion caused by individual symptom reporting.
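To make the distinction between a single-symptom endpoint and a composite endpoint concrete, the sketch below classifies a patient as a responder under each approach. It is only an illustration under stated assumptions: the 50% reduction in UUI episodes echoes one of the response definitions cited in the introduction [4], whereas the HRQoL threshold, the field names, and the rule that all components must be met are hypothetical choices and not a definition proposed by this review or by the cited guidance.

```python
from dataclasses import dataclass


@dataclass
class PatientOutcome:
    """Hypothetical per-patient trial record (field names are illustrative)."""
    baseline_uui_per_week: float
    followup_uui_per_week: float
    hrqol_change: float  # change in an HRQoL score; positive = improvement
    satisfied: bool      # patient-reported treatment satisfaction


def percent_reduction(baseline, followup):
    """Percentage reduction from baseline; baseline must be positive."""
    if baseline <= 0:
        raise ValueError("baseline must be greater than zero")
    return 100.0 * (baseline - followup) / baseline


def symptom_responder(p, min_reduction_pct=50.0):
    """Single-symptom definition: reduction in UUI episodes only."""
    reduction = percent_reduction(p.baseline_uui_per_week, p.followup_uui_per_week)
    return reduction >= min_reduction_pct


def composite_responder(p, min_reduction_pct=50.0, min_hrqol_gain=10.0):
    """Composite definition (assumed): symptoms AND HRQoL AND satisfaction."""
    return (
        symptom_responder(p, min_reduction_pct)
        and p.hrqol_change >= min_hrqol_gain  # hypothetical threshold
        and p.satisfied
    )


# Two hypothetical patients with the same symptom improvement (60% reduction).
improved_overall = PatientOutcome(10, 4, hrqol_change=12.0, satisfied=True)
symptoms_only = PatientOutcome(10, 4, hrqol_change=2.0, satisfied=False)

for patient in (improved_overall, symptoms_only):
    print(symptom_responder(patient), composite_responder(patient))
```

Both hypothetical patients count as responders on the symptom criterion alone, but only the first meets the composite definition, which is the kind of discrepancy that single-symptom reporting can hide.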

3.6. Endpoints in similar syndrome-defined conditions

We also examined literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as a primary endpoint. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread use of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new, appropriately developed measure with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction measures, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193:559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977

3.3. Burden, over/underestimation, and lack of validation

Several publications highlight issues regarding the burden of, lack of compliance with, and overestimation of symptom frequency using bladder diaries. Diaries place a large inconvenience on patients [22] and [27]. In one study, compliance with diaries was found to be high in the office setting, yet 52% of patients demonstrated issues with adherence to instructions at home [28]. In another study, only 47% of women (p = 0.01) were found to accurately report daytime frequency using a diary [29]. Other studies of many patients overestimated or underreported nighttime frequency using a diary when compared with a medical chart [30] and [31].

Although bladder diaries are considered to be the gold standard for OAB diagnosis and remain useful in clinical practice and research, they lack validation and vary greatly in terms of content, format, and duration of recall period. In 2011, Bright et al [32] conducted a review of 81 studies using bladder diaries and concluded that, at that time, no validated urinary diary existed. See Table 3 for a summary of burden, over/underestimation, and lack of validation in bladder diaries.

Table 3

Burden, over/underestimation, recall, and lack of validation in bladder diaries

 

Burden • Patients must keep the diary for several consecutive days
• In one study, 52% of patients had issues with adherence to instructions for proper use at home [28]
Over/underestimation • In one study, only 47% of women were found to accurately report daytime urinary frequency using a bladder diary [29]
• Other studies of male-only and female patient reports may overestimate or underreport the frequency of nocturia using a bladder diary [30] and [31]
Recall period • In general for PRO measures, shorter recall periods are considered better as rating variance increases the longer the delay there is between an event/experience occurring and the reporting of it [34]
Lack of validation • Diaries vary greatly in terms of content, format, and duration of recall period
• Only one bladder diary has been evaluated for criterion and construct validity, reliability, and responsiveness [32]

PRO = patient reported outcome.

3.4. Recall periods

In diagnosing OAB, patients’ completion of the diary for 2–3 d has been recommended [33], other recommendations in literature range from 24 h to 2 wk [9]. In clinical trials it is common to complete diaries for 3–7 d. In general, shorter recall periods are considered better than longer recall periods as rating variance increases the longer the delay there is between an event and the reporting of it [34]. However, researchers have found that 1-wk diaries are as reliable as 2-wk diaries and a comparison of a 5-d diary to a 24-h diary found the 24-h diary overestimated the maximum volume voided [35] and [36].

Recall periods in other chronic, symptomatic conditions were reviewed. In pain and fatigue assessments, when momentary reports were compared with recalled reports (over 1–28 d) substantial concordance was found between reports, suggesting that longer recall periods do not necessarily lead to substantially less accurate results [37]. Research in cancer pain confirms that 24-h recall and 7-d recall can be highly correlated [38]. Conversely, there is some evidence, in pain, that a 7-d window may more accurately characterize a patient's condition than the assessment of their current status [39]. See Table 3 for a summary of recall periods in bladder diaries.

The International Consultation on Incontinence Research Society highlighted the need for a standardized measure in all outcome evaluations to increase comparability and standardize the assessment between different treatment evaluations in different populations [3]. It recommends that, as a minimum, any outcome measurement should encompass satisfaction, symptoms, HRQoL, and adverse events. It is of note that OAB clinical trials have reported individual symptoms in isolation (eg, frequency) as primary outcomes; however, this approach may neither portray true therapeutic outcomes nor reflect what matters most to patients [2]. Instead, the use of composite endpoints may more accurately reflect the nature of OAB symptoms and correlate better with improved patient HRQoL, treatment satisfaction, and persistence, thereby harmonizing the reporting of trial data by removing confusion caused by individual symptom reporting.

3.6. Endpoints in similar syndrome-defined conditions

We also examined the literature in relevant therapeutic areas and syndrome-defined chronic conditions (eg, restless legs syndrome [RLS]) that are patient identified and that have relied on diaries to gather symptom response. In interstitial cystitis/bladder pain syndrome, where investigators have historically relied on diaries to assess treatment, our review reveals a change in endpoints. In a 2014 phase 3 RCT for the treatment of interstitial cystitis, investigators used the O’Leary-Sant questionnaire as the primary outcome measure instead of a diary [40].

Trials in benign prostatic hyperplasia (BPH) rely on PROMs as primary endpoints. In a recent RCT comparing monotherapy with combination therapy for OAB symptoms induced by BPH, the primary endpoint was the change in total OABSS [41]. Secondary endpoints included the change in both OABSS and total International Prostate Symptom Score. A systematic review of solifenacin/tamsulosin therapy for patients with BPH reveals widespread utilization of the International Prostate Symptom Score as a coprimary endpoint alongside diaries [42]. RCTs of treatments for RLS now routinely rely on PROMs to document treatment efficacy, tolerability, symptom severity, and improvement. Allen et al [43] compared treatments for RLS using PROMs instead of traditional diary outcomes. Similarly, other pharmacological trials have defined RLS treatment response in terms of PROM endpoints [44] and [45].

Tension headache and migraine have historically relied on the use of diaries for diagnosis and treatment. Clinical studies now incorporate PROMs as primary, coprimary, and secondary endpoints. Widely used PROMs with correlations to diaries include the Migraine Disability Assessment and Headache Impact Test [46].

This review emphasizes the limitations of the traditional use of bladder diaries as primary endpoints in OAB trials. While diaries play an important role in diagnosis, the results highlight that diaries allow for a unique bladder-training effect and contribute to the placebo effect seen in clinical trials. As there is a strong correlation between existing PROMs and diaries, the development of a new PROM as an alternative to existing measures and diaries for assessing treatment outcome will bring added value. Such a tool would provide a better understanding of OAB treatment efficacy. We acknowledge, however, that issues with current instruments exist. The commonly used questionnaires were developed prior to current European Medicines Agency, US Food and Drug Administration, and International Society for Pharmacoeconomics and Outcomes Research guidelines for the development and validation of PRO measures [47], [48], and [49]. Also, there is no standard recommendation for the most appropriate recall period to use in any study, although the recall period used should match the purpose of the study. A new, appropriately developed measure with a longer recall period could reduce patient burden and lead to better overall compliance with symptom recording.

Existing PROMs would serve as a starting point for the development of a new PROM that would correlate strongly with all aspects of a bladder diary, would quantify OAB symptoms, and incorporate evaluation of satisfaction and HRQoL.

A measure that incorporates key symptoms measured in a diary and assesses impact on the patient, such as HRQoL and satisfaction, would offer advantages over existing assessments. Firstly, if the recall period is extended from momentary assessment to weekly, the training effect could be reduced as the frequency of assessment is decreased. Secondly, the incorporation of an HRQoL assessment may reduce the placebo effect, as it may be more difficult to subconsciously change behavior to improve HRQoL outcomes. We recognize that this is theoretical, and the placebo effect will not completely disappear; however, a brief symptom and HRQoL assessment utilizing a weekly recall has the potential to more accurately characterize disease burden compared with a diary alone, improve efficacy detection in clinical trials, and provide a less burdensome method for patients to record their OAB complaints.


Author contributions: Christopher R. Chapple had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Acquisition of data: Evans, Kopp, Johnson, Mako.

Analysis and interpretation of data: Siddiqui, Chapple, Kelleher, Evans, Kopp, Johnson, Mako.

Drafting of the manuscript: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Critical revision of the manuscript for important intellectual content: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Statistical analysis: Evans, Kopp, Johnson, Mako.

Obtaining funding: Siddiqui.

Administrative, technical, or material support: Siddiqui, Evans, Kopp, Johnson, Mako.

Supervision: Chapple, Kelleher, Evans, Kopp, Siddiqui, Johnson, Mako.

Other: None.

Financial disclosures: Christopher R. Chapple certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.

Funding/Support and role of the sponsor: Astellas.

Acknowledgments: Bladder Assessment Tool Advisory Committee: Pamela Brandt, Chris Chapple, Chris Evans, Zalmai Hakimi, Yukio Homma, Con Kelleher, Kathleen Kobashi, Zoe Kopp, Chris Payne, and Emad Siddiqui.

  • [1] B.T. Haylen, D. de Ridder, R.M. Freeman, et al. An International Urogynecological Association (IUGA)/International Continence Society (ICS) joint report on the terminology for female pelvic floor dysfunction. Neurourol Urodyn. 2010;29:4-20
  • [2] C.K. Payne, C. Kelleher. Redefining response in overactive bladder syndrome. BJU Int. 2007;99:101-106
  • [3] N. Cotterill, H. Goldman, C. Kelleher, Z. Kopp, A. Tubaro, L. Brubaker. What are the best outcome measures when assessing treatment for LUTD?—Achieving the most out of outcome evaluation: ICI-RS 2011. Neurourol Urodyn. 2012;31:400-403
  • [4] H.B. Goldman, J.J. Wyndaele, S.A. Kaplan, J.T. Wang, F. Ntanios. Defining response and non-response to treatment in patients with overactive bladder: a systematic review. Curr Med Res Opin. 2014;30:509-526
  • [5] S. Colman, C. Chapple, V. Nitti, C. Haag-Molkenteller, C. Hastedt, U. Massow. Validation of Treatment Benefit Scale for assessing subjective outcomes in treatment of overactive bladder. Urology. 2008;72:803-807
  • [6] M.M. South, A.A. Romero, M.G. Jamison, D.G. Webster, C.L. Amundsen. Detrusor overactivity does not predict outcome of sacral neuromodulation test stimulation. Int Urogynecol J. 2007;18:1395-1398
  • [7] E.P. Armstrong, D.C. Malone, C.N. Bui. Cost-effectiveness analysis of anti-muscarinic agents for the treatment of overactive bladder. J Med Econ. 2012;15(Suppl1):35-44
  • [8] M.G. Lucas, R.J. Bosch, F.C. Burkhard, et al. EAU guidelines on assessment and nonsurgical management of urinary incontinence. Eur Urol. 2012;62:1130-1142
  • [9] R. Basra, C. Kelleher. Disease burden of overactive bladder: quality-of-life data using ICI-recommended instruments. Pharmacoeconomics. 2007;25:129-142
  • [10] K.S. Coyne, A. Tubaro, L. Brubaker, T. Bavendam. Development and validation of patient-reported outcomes measures for overactive bladder: a review of concepts. Urology. 2006;68(Suppl2A):9-16
  • [11] Y. Homma, M. Yoshida, N. Seki, et al. Symptom assessment tool for overactive bladder syndrome—overactive bladder symptom score. Urology. 2006;68:318-323
  • [12] K. Coyne, D. Revicki, T. Hunt, et al. Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res. 2002;11:563-574
  • [13] C.J. Kelleher, L.D. Cardozo, V. Khullar, S. Salvatore. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol. 1997;104:1374-1379
  • [14] K.S. Coyne, L.S. Matza, Z. Kopp, P. Abrams. The validation of the patient perception of bladder condition (PPBC): a single-item global measure for patients with overactive bladder. Eur Urol. 2006;49:1079-1086
  • [15] I. But, S. Oreskovic, D. Bratus, M. Šprem-Goldštajn, G. Hlebič. Patient-reported outcome of solifenacin treatment among women experiencing urinary urgency and urgency incontinence. Int J Gynecol Obstet. 2014;124:19-23
  • [16] A.D. Garely, J.M. Kaufman, P.K. Sand, N. Smith, M. Andoh. Symptom bother and health-related quality of life outcomes following solifenacin treatment for overactive bladder: the VESIcare Open-Label Trial (VOLT). Clin Ther. 2006;28:1935-1946
  • [17] L. Brubaker, E.C. Piault, S.E. Tully, et al. Validation study of the Self-Assessment Goal Achievement (SAGA) questionnaire for lower urinary tract symptoms. Int J Clin Pract. 2013;67:342-350
  • [18] S. Lee, B. Malhotra, D. Creanga, M. Carlsson, P. Glue. A meta-analysis of the placebo response in antimuscarinic drug trials for overactive bladder. BMC Med Res Methodol. 2009;9:55
  • [19] A. Mangera, C.R. Chapple, Z.S. Kopp, M. Plested. The placebo effect in overactive bladder syndrome. Nat Rev Urol. 2011;8:495-503
  • [20] K.L. Burgio. Current perspectives on management of urgency using bladder and behavioural training. J Am Acad Nurse Pract. 2004;16:4-7
  • [21] E.A. Gormley, D.J. Lightner, K.L. Burgio, et al. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline. J Urol. 2012;188(Suppl6):2455-2463
  • [22] Y. Homma, H. Kakizaki, O. Yamaguchi, et al. Assessment of overactive bladder symptoms: comparison of 3-day bladder diary and the overactive bladder symptoms score. Urology. 2011;77:60-64
  • [23] K. Coyne, H. Gelhorn, C. Thompson, Z. Kopp, Z. Guan. The psychometric validation of a 1-week recall period for the OAB-q. Int Urogynecol J. 2011;22:1555-1563
  • [24] K.S. Coyne, T. Zyczynski, M.K. Margolis, V. Elinoff, R.G. Roberts. Validation of an Overactive Bladder Awareness Tool for use in primary care settings. Adv Ther. 2005;22:381-394
  • [25] L.A. Arya, C. Banks, M. Gopal, G.M. Northington. Development and testing of a new instrument to measure fluid intake, output, and urinary symptoms: the questionnaire-based voiding diary. Am J Obstet Gynecol. 2008;193 559.e1–7
  • [26] L.A. Arya, H. Heidi, L. Cory, S. Segal, G.M. Northington. Construct validity of a questionnaire to measure the type of fluid intake and type of urinary incontinence. Neurourol Urodyn. 2011;30:1597-1602
  • [27] J.H. Ku, I.G. Jeong, D.J. Lim, S.S. Byun, J.S. Paick, S.J. Oh. Voiding diary for the evaluation of urinary incontinence and lower urinary tract symptoms: prospective assessment of patient compliance and burden. Neurourol Urodyn. 2004;23:331-335
  • [28] R.N. Pauls, E. Hanson, C.C. Crisp. Voiding diaries: adherence in the clinical setting. Int Urogynecol J. 2015;26:91-97
  • [29] K. Stav, P.L. Dwyer, A. Rosamilia. Women overestimate daytime urinary frequency: the importance of the bladder diary. J Urol. 2009;181:2176-2180
  • [30] I. Yalcin, R.C. Bump. The effect of previous treatment experience and incontinence severity on the placebo response of stress urinary incontinence. Am J Obstet Gynecol. 2004;191:194-197
  • [31] S.S. Robb. Urinary incontinence verification in elderly men. Nurs Res. 1985;34:278-282
  • [32] E. Bright, M.J. Drake, P. Abrams. Urinary diaries: evidence for the development and validation of diary content, format and duration. Neurourol Urodyn. 2011;30:348-352
  • [33] S.P. Marinkovic, R.M. Moldwin, S.L. Stanton, L.M. Gillen, C.M. Marinkovic. The management of overactive bladder syndrome. BMJ. 2012;344:e2365
  • [34] D.E. Stull, N.K. Leidy, B. Parasuraman, O. Chassany. Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin. 2009;25:929-942
  • [35] J.F. Wyman, S.C. Choi, S.W. Harkins, M.S. Wilson, J.A. Fantl. The urinary diary in evaluation of incontinent women: a test-retest analysis. Obstet Gynecol. 1988;71:812-817
  • [36] C. Barnick. Urogynecology: The Kings Approach. (Churchill Livingstone, New York, NY, 1977)
  • [37] J.E. Broderick, J.E. Schwartz, G. Vikingstad, M. Pribbernow, S. Grossman, A.A. Stone. The accuracy of pain and fatigue items across different reporting periods. Pain. 2008;139:146-157
  • [38] Q. Shi, P. Trask, X.S. Wang, et al. Does recall period have an effect on cancer patients’ rating of the severity of multiple symptoms? J Pain Symptom Manage. 2010;40:191-199
  • [39] Q. Shi, S. Wang, T.R. Mendoza, K.J. Pandya, C.S. Cleeland. Assessing persistent cancer pain: a comparison of current pain ratings and pain recalled from the past week. J Pain Symptom Manage. 2009;37:168-174
  • [40] P.C. Bosch. A randomized, double-blind, placebo controlled trial of adalimumab for interstitial cystitis/bladder pain syndrome. J Urol. 2014;191:77-82
  • [41] K. Ichihara, N. Masumori, F. Fukuta, T. Tsukamoto, A. Iwasawa, Y. Tanaka. A randomized controlled study of the efficacy of tamsulosin monotherapy and its combination with Mirabegron for overactive bladder induced by benign prostatic obstruction. J Urol. 2015;193:921-926
  • [42] K. Dimitropoulos, S. Gravas. Solifenacin/tamsulosin fixed-dose combination therapy to treat lower urinary tract symptoms in patients with benign prostatic hyperplasia. Drug Des Devel Ther. 2015;9:1707-1716
  • [43] R.P. Allen, C. Chen, D. Garcia-Borreguero, et al. Comparison of pregabalin with pramipexole for restless legs syndrome. N Engl J Med. 2014;370:621-631
  • [44] J. Zhang, B. Liu, Y. Zheng, T. Chu, Z. Yang. Pramipexole for Chinese people with primary restless legs syndrome: a 12-week multicenter, randomized, double-blind study. Sleep Med. 2015;16:181-185
  • [45] C.S. Lee, S.D. Lee, S.H. Kang, H.Y. Park, I.Y. Yoon. Comparison of the efficacies of oral iron and pramipexole for the treatment of restless legs syndrome patients with low serum ferritin. Eur J Neurol. 2014;21:260-266
  • [46] W.F. Stewart, R.B. Lipton, K.B. Kolodner, J. Sawyer, C. Lee, J.N. Liberman. Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain. 2000;88:41-52
  • [47] European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. (EMA, London, 2005)
  • [48] United States Food and Drug Administration. Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. (FDA, MD, 2009)
  • [49] D.L. Patrick, L.B. Burke, C.J. Gwaltney, et al. Content validity—establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1—eliciting concepts for a new PRO instrument. Value Health. 2011;14:967-977