Nowadays statistical jargon is found in almost all spheres of medicine. Hospitals and educational institutions now have ethical committees that do not accept written work without statistical data. A qualified physician is therefore required to understand at least the basics of a subject that he or she has never been formally taught, because without statistical data none of that written work is going to be accepted. We come from a time when the emphasis was on clinical examination and diagnosis, and statistical references were only of evidential value, citing past or anecdotal cases to justify a diagnosis. According to a quote, "statistics are like bikinis, concealing what is vital and revealing much that is occasionally interesting." Clinical examination and a diagnostic exercise can never be perfect, and a perceived quality cannot be expressed by a finite number. However, statistical representation can forecast a trend and be a pointer to the right diagnosis. In this era of dehumanised laboratory-based medical practice, statistical trends cut down unnecessary investigations. Statistics is specialised mathematics. The tests are good for verifying a clinical diagnosis, and graphs and charts help to indicate trends, justify the diagnosis, and give a good pictorial representation of the documentation. The clinical specialist is now advised to have a healthy scepticism towards statistics and to use statistical data, graphs, charts, and pictorial diagrams only when writing. Still, numbers and prognostic percentages are important, and today lay people can look at the internet at any time and gain a sense of gratification from the knowledge without understanding its full implications. Counselling near relatives has become different now.
Statistical methods help the clinician to find out many things --
1. The observed number in the local population,
2. Incidence in different sexes, ages, and ethnic groups,
3. Probability or percentage of incidence,
4. Distribution and the chi-squared test,
5. Meta-analysis of previous results,
6. Variations in the local population,
7. Variations in case-matched studies,
8. Estimate of an unknown parameter, or the confidence interval,
9. Significance and prognostication,
10. Survival, and,
11. Probability of relapse.
These have to be kept in mind, and a working knowledge of statistical methods helps in the diagnosis of unusual conditions. In many situations, we have found that awareness of a probability helped in arriving at the right diagnosis. Broadly speaking, medical statistics is either descriptive or inferential.
Number, distribution and variance:-
In simple medical terms, numbers and their distribution matter: the larger the number studied, the higher the likelihood of observing what one is trying to find and of seeing how it is distributed within a given population. A distribution can be continuous or discrete. The Gaussian or normal distribution is centred on the arithmetic mean, and because a fixed proportion of its values falls within a given number of standard deviations of the mean (about 95% within two), statistical inferences can be drawn from it. As a rule of thumb, the distribution is taken to be reasonably predictable if the standard deviation is less than half the mean. The binomial distribution deals with outcomes that fall into one of two categories within a number, such as the incidence in the two sexes, whether a feature is present or not, and similar two-way parameters. The Poisson distribution, for its part, helps in estimating the incidence of rare events: it gives the probability of a given number of cases occurring within a number and during a specified period. Similarly, and as the name suggests, a distribution can be logarithmic (log-normal) as well. Industry and the banking sector use statistical derivations extensively, and distributions may be of various other types that are not related to medical practice.
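The binomial and Poisson distributions described above can be illustrated with a short Python sketch. The function names and the two clinical scenarios (ten births, a clinic seeing two cases a year on average) are hypothetical and chosen only for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k 'successes' in n independent trials,
    each with success probability p (two possible outcomes per trial)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events in a period in which lam events
    are expected on average (suited to rare events)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical: 10 births, each assumed to have probability 0.5 of being a girl.
p_six_girls = binomial_pmf(6, 10, 0.5)

# Hypothetical: a rare condition seen on average twice a year in one clinic.
p_no_cases = poisson_pmf(0, 2.0)

print(f"P(exactly 6 girls in 10 births) = {p_six_girls:.4f}")
print(f"P(no cases in a year)           = {p_no_cases:.4f}")
```

The binomial function needs a fixed number of trials, whereas the Poisson function needs only the average rate, which is why the latter is the usual choice for rare events counted over time.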
A large sample size allows physicians to determine the confidence interval and limits of a condition or a procedure more precisely. A confidence interval provides an upper and a lower limit around the sample mean, and we can be confident, to a stated degree, that this interval captures the population mean. In other words, the lower and upper limits tell us the range of values within which the true population mean is likely to lie. Thus -
Confidence interval = sample mean ± margin of error
The confidence limits of a measurement are the limits within which the measurement error lies with a probability P. The probability P is the confidence level, and α = 1 - P is the risk level related to the confidence limits. The confidence level is chosen according to the application.
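As a minimal sketch, the interval "sample mean ± margin of error" can be computed as follows. The sketch assumes a sample large enough for the normal approximation to hold (for small samples a t-distribution would be the more usual choice); the function name and the blood pressure readings are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Return (lower, upper) limits around the sample mean.

    Assumption: the normal approximation is used, so the margin of
    error is z * standard error, where z depends on the risk level
    alpha = 1 - confidence.
    """
    n = len(values)
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    alpha = 1 - confidence
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * standard_error
    return mean - margin, mean + margin

# Hypothetical systolic blood pressure readings (mmHg).
readings = [118, 125, 130, 122, 135, 128, 121, 127, 133, 124]
lower, upper = mean_confidence_interval(readings, confidence=0.95)
print(f"95% confidence interval: {lower:.1f} to {upper:.1f} mmHg")
```

Raising the confidence level (for example from 0.95 to 0.99) widens the interval, while a larger sample narrows it.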
Sampling:-
In medical practice, a single case may be reported because of its unusual features. However, numbers are important, and the sample has to be sizeable to give a good statistical result. Apart from the distribution, the presentation and the signs and symptoms can be compared, and the variations analysed. The chi-squared (χ²) test is used to compare categorical variables in a dataset and to test whether two variables within a large sample are related or not. It is helpful mainly in epidemiological studies.
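The chi-squared test mentioned above can be sketched for the simplest case, a 2 x 2 table. This is an illustrative sketch, not a full implementation: it assumes no expected cell count is zero, applies no continuity correction, and uses hypothetical counts and a hypothetical function name.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2 x 2 table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Counts expected if the row and column variables were unrelated.
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: disease present / absent in exposed and unexposed groups.
stat, p = chi_squared_2x2(20, 80, 10, 90)
print(f"chi-squared = {stat:.3f}, p = {p:.4f}")
```

A small p value suggests that the two variables are related; with small expected counts an exact test would usually be preferred.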
From the distribution itself, the variability of occurrence can be derived by analysis of variance (ANOVA). Variance in medical terms implies deviation from the standard course, and ANOVA is most often used to compare the spread between data sets with the spread within them, and so to judge the possible range of deviation in the signs and symptoms of a disease or the effects of a drug. Variance can be of many types. Statistically, it is calculated by determining the mean, adding up the squares of the deviations of each value from that mean, and dividing the total by the degrees of freedom. To a lay physician, variance means a deviation from the middle value of the range between the highest and lowest numbers. Medical people are concerned mainly with the following aspects of variance -
i) Efficiency variance,
ii) Appropriateness of investigations,
iii) Treatment modalities, and,
iv) Prognosis.
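The statistical calculation of variance (squared deviations from the mean divided by the degrees of freedom) and the F ratio of a one-way ANOVA can be sketched as follows. The function names and the three treatment groups are hypothetical; the sketch stops at the F ratio and does not compute a p value.

```python
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by the
    degrees of freedom (n - 1)."""
    mean = statistics.fmean(values)
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (len(values) - 1)

def one_way_anova_f(groups):
    """F ratio: variability between group means relative to the
    variability within the groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((x - statistics.fmean(g)) ** 2
                    for g in groups for x in g)
    ms_between = ss_between / (k - 1)   # k - 1 degrees of freedom
    ms_within = ss_within / (n - k)     # n - k degrees of freedom
    return ms_between / ms_within

# Hypothetical recovery times (days) under three treatment modalities.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print("variance of first group:", sample_variance(groups[0]))
print("F ratio:", one_way_anova_f(groups))
```

A large F ratio indicates that the group means differ by more than the scatter within the groups would explain.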
Screening:-
Screening is done whenever possible, and the terms predictive value, sensitivity, and specificity are very dear to physicians. First, it is imperative to have an idea of sensitivity and specificity. The sensitivity of a test is the ratio of true positive cases to the sum of true positive and false negative cases, so if the sensitivity is high, a person who has the disease is very likely to test positive. Specificity, on the other hand, is the ratio of true negative cases to the sum of true negative and false positive cases, and here also a high value is desired. The appropriateness of a diagnostic test is thus assessed. The positive predictive value, i.e. the ratio of true positives to the sum of true positives and false positives, and/or the negative predictive value (the ratio of true negatives to the sum of true negatives and false negatives) further bolsters the decision. These are best applied to assessing the appropriateness of a diagnostic method in a group or population. The diagrammatic representation tries to simplify complex concepts.
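The four ratios can be worked out from the counts of a screening result, as in the following sketch. The function name and the counts are hypothetical and serve only to show the arithmetic.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute screening measures from the four cells of a 2 x 2 table.

    tp: true positives,  fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),  # diseased people who test positive
        "specificity": tn / (tn + fp),  # healthy people who test negative
        "ppv": tp / (tp + fp),          # positive tests that are correct
        "npv": tn / (tn + fn),          # negative tests that are correct
    }

# Hypothetical screening of 1000 people, 100 of whom have the disease.
metrics = diagnostic_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical example the sensitivity is high (0.9) but the positive predictive value is lower, because false positives from the much larger healthy group dilute the positive results.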
Studies and trials:-
One has to remember that each patient is different and needs individual management. This is where variations in investigations and treatment come in. A study or a trial can be retrospective or prospective. The physician tries to judge what might happen in future by extrapolating from the values of retrospective and prospective studies. Survival analysis also plays an important role.
A retrospective cohort study compares the risk of developing a disease in relation to some already known exposure factors, whereas a case-control study tries to determine the possible exposure factors after a known disease incidence. Case-control studies and matched trials are important because they are concerned with the frequency and amount of exposure in subjects with a specific disease (cases) and people without the disease (controls). The two designs are similar in some respects, and though the cohort study appears a bit superior, proper implementation is required to get the desired results.
The medical profession is constantly changing and upgrading itself. Studies and ongoing trials can be retrospective, prospective, or predictive. Then there is the meta-analysis, which is an extract from the publications on a matter to date. To be more precise, meta-analysis is a research process used to systematically synthesise or merge the findings of single, independent studies, using statistical methods to calculate an overall or 'absolute' effect. Retrospective studies are often meta-analytical and may predict the course from past experience.
A study is said to be blind when the participants do not know what treatment or intervention they are receiving until the trial is over, and only the person performing the study knows. In a double-blind study, neither the person performing the study nor the participants know who is receiving which treatment until the trial is over. Blind studies minimise bias and are hence preferred.
Sensitivity and specificity:-
Sensitivity and specificity are terms commonly used by medical people. Sensitivity is the ratio of the true positive results of a test or procedure to the sum of its true positives and false negatives. The specificity of the same test or procedure, similarly, is the ratio of its true negative results to the sum of its true negatives and false positives. Considered together, sensitivity and specificity tell the physician the appropriateness of the procedure or test.
Hypothesis testing:-
Hypothesis testing is used to assess the credibility of a hypothesis by using sample data. The null hypothesis is a statistical hypothesis proposing that no real effect or relationship exists in a set of given observations. Sometimes referred to simply as the "null," it is represented as H0. Hypothesis testing is exercised to conclude whether or not there is a relationship between two measured phenomena. The null hypothesis is assumed to be true, and strong data-based evidence in favour of the alternative hypothesis (H1) is required to set it aside, since the null hypothesis denies any significant relationship between particular parameters within the original dataset. The concept of the p value comes in here. The p value is a probability: a number, calculated from a statistical test, that describes how likely you are to have found a particular set of observations, or a more extreme one, if the null hypothesis were true. P values are used in hypothesis testing to help decide whether to reject the null hypothesis. A p value of 0.05 is conventionally used as a cut-off. Values lower than this lead to rejection of the null hypothesis, and the statistical outcome is then said to be significant.
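How a p value arises from a null hypothesis can be sketched with an exact binomial test. The scenario (16 of 20 patients improving, tested against a null hypothesis that improvement is as likely as not) and the function name are hypothetical; many other tests exist and the choice depends on the data.

```python
import math

def binomial_two_sided_p(successes, n, p_null=0.5):
    """Exact two-sided binomial test.

    Returns the probability, if the null hypothesis (success
    probability p_null) were true, of an outcome at least as unlikely
    as the one observed.
    """
    def pmf(k):
        return math.comb(n, k) * p_null**k * (1 - p_null)**(n - k)

    observed = pmf(successes)
    # Small relative tolerance so that ties are not lost to rounding.
    total = sum(pmf(k) for k in range(n + 1)
                if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical: 16 of 20 patients improved on a new treatment.
p_value = binomial_two_sided_p(16, 20, p_null=0.5)
print(f"p = {p_value:.4f}")
```

The smaller the p value, the harder it is to reconcile the observations with the null hypothesis.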
Survival statistics:-
Doctors use survival statistics to estimate a patient's prognosis. The survival rate is the percentage of people in a study or treatment group who are still alive for a certain period after they were diagnosed with, or started treatment for, a disease. Prognosis is the chance of recovery. Survival statistics also help doctors evaluate treatment options.
The Kaplan-Meier estimate is the simplest way of computing survival over time despite the difficulties associated with subjects or situations. For each time interval, survival probability is calculated as the number of subjects surviving divided by the number of patients at risk. Of the many models that can be used to analyse time-to-event data, four are most prominent: the Kaplan-Meier model, the exponential model, the Weibull model, and the Cox proportional hazards model. In medical statistics, the Kaplan-Meier and the Cox proportional hazards models are preferred.
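The Kaplan-Meier calculation (subjects surviving divided by subjects at risk, multiplied across time points) can be sketched as below. The function name and follow-up data are hypothetical, and the sketch assumes the usual convention that a subject censored at a given time is still counted as at risk at that time.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each subject
    events: True if death was observed at that time, False if the
            subject was censored (e.g. lost to follow-up)
    Returns a list of (time, survival_probability) pairs, one for
    each time at which at least one death occurred.
    """
    subjects = sorted(zip(times, events))
    at_risk = len(subjects)
    survival = 1.0
    curve = []
    i = 0
    while i < len(subjects):
        t = subjects[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(subjects) and subjects[i][0] == t:
            deaths += int(subjects[i][1])
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up (months) of seven patients.
times = [2, 3, 3, 5, 8, 8, 9]
events = [True, True, False, True, False, True, False]
for t, s in kaplan_meier(times, events):
    print(f"month {t}: estimated survival {s:.3f}")
```

Censored subjects lower the number at risk for later intervals without counting as deaths, which is how the method copes with incomplete follow-up.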
Statistical representation of data helps us to understand the relationship between the different study parameters. This is mainly done with graphics that display and summarise data and help us to understand the data's meaning. Graphics can take many forms; bar charts, histograms, pie charts, boxplots (box and whisker plots), etc., are usual.
Statistical models have to be chosen according to the result that is wanted. Examples include linear regression models, the Kaplan-Meier analysis, the Cox proportional hazards model, etc. Specific statistical tools, such as graphics of the statistical techniques and of the raw data, also have to be considered.
Statistical thinking is crucial for studies in medical and biomedical areas, as statistical analysis validates the clinical findings. Statistics is well established as a subject, with a multitude of models and mathematical formulas, yet medical statistics is still an emerging subject, and there are several pitfalls in using statistics in these areas, involving experimental design, data collection, data analysis, and data interpretation. A healthcare worker, apart from caring for patients, may also design a device or do some research, and validation of this work is necessary for publication. As of now, physicians and surgeons do the clinical work and document observations. The person dealing with medical statistics does the analytical mathematical work and designs the tables, graphs, and diagrams that validate the data or numbers to a statistical significance.