Talk:Confidence interval
This is the talk page for discussing improvements to the Confidence interval article. This is not a forum for general discussion of the article's subject.
With regards to the approachability of this article

Why not use the Simple English version of this complicated article (link below)? It seems more accessible for the average reader than the in-depth one here. https://simple.wikipedia.org/wiki/Confidence_interval DC (talk) 14:26, 30 March 2016 (UTC)
- Thank you for providing the link to the simple.wikipedia.org page. I found it to be more accessible, just as you said. Thank you! -Anon 14:54 UTC, 15 Nov 2020
The article contradicts itself
Due to this edit, the introduction is currently spreading precisely the misunderstanding that the article later warns about. The introduction says:
- Given observations x₁, …, xₙ and a confidence level γ, a valid confidence interval has a probability γ of containing the true underlying parameter.
In direct contradiction, the article later rightly warns:
- A 95% confidence level does not mean that for a given realized interval there is a 95% probability that the population parameter lies within the interval [...].
Joriki (talk) 11:23, 12 May 2020 (UTC)
Another incorrect statement:
- Therefore, there is a 5% probability that the true incidence ratio may lie out of the range of 1.4 to 2.6 values.
According to the textbook Introduction to Data Science, the statement in question is false: a confidence level of γ does not mean we can say the interval contains the underlying true parameter with probability γ. How many references do we need before we can remove that misleading claim from the introduction?
TheKenster (talk) 20:06, 3 November 2020 (UTC)
- I am a statistics expert. In light of Wikipedia:Be_bold, I have corrected the mistakes mentioned in this subsection, as well as a couple others that I caught. Stellaathena (talk) 23:30, 17 November 2020 (UTC)
Dtlfg (talk) 09:20, 3 August 2021 (UTC)
The section Examples - Medical Examples still contains that contradiction:
- Furthermore, it also means that we are 95% confident that the true incidence ratio in all the infertile female population lies in the range from 1.4 to 2.6.
It seems the formal definition is plainly wrong. The given definition is the definition of a credible interval. Unfortunately I do not find a source which gives a correct definition. Here is what I think it is; if I find a source someday I will correct it: Let x be a realisation (in general a statistic computed from a set of independent X₁, …, Xₙ following the same distribution) of a random variable X following a distribution P_θ, θ being a parameter we have to estimate. Let C be a function from 𝒳 to P(Θ) (the power set of Θ) which assigns to each x a set C(x) which verifies P_θ(θ ∈ C(X)) ≥ γ. The confidence interval is then defined as the set C(x). — Preceding unsigned comment added by Samuelboudet (talk • contribs) 15:15, 9 October 2022 (UTC)
- The correct definition can be found here:
An X% confidence interval for a parameter θ is an interval (L,U) generated by a procedure that in repeated sampling has an X% probability of containing the true value of θ, for all possible values of θ (Neyman 1937).
— Morey, Richard D.; Hoekstra, Rink; Rouder, Jeffrey N.; Lee, Michael D.; Wagenmakers, Eric-Jan (2016). "The fallacy of placing confidence in confidence intervals". Psychonomic Bulletin & Review. 23 (1): 103–123. doi:10.3758/s13423-015-0947-8. PMC 4742505. PMID 26450628.
- The key point being that the confidence level is the long-run frequency at which the procedure will produce intervals containing the true parameter. It does not say anything about the individual intervals that are produced – and indeed, that article as well as https://bayes.wustl.edu/etj/articles/confidence.pdf show that the post-data probability that the interval contains the true parameter can be very different from the confidence level.
- Spidermario (talk) 08:13, 11 October 2022 (UTC)
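The long-run reading of Neyman's definition quoted above can be checked with a short simulation. This is an illustrative sketch, not taken from the cited sources; the normal model, true mean, sample size, and 95% z-interval are arbitrary choices:

```python
# Repeatedly draw samples, build the known-variance 95% z-interval each
# time, and count how often the interval contains the true mean.
import random
import statistics

random.seed(0)
TRUE_MU, SIGMA, N = 10.0, 2.0, 30
Z95 = 1.96                              # 97.5th percentile of the standard normal
HALF_WIDTH = Z95 * SIGMA / N ** 0.5     # known-variance z-interval half-width

trials = 10_000
hits = 0
for _ in range(trials):
    sample = [random.gauss(TRUE_MU, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    # Does the interval produced by this run contain the true parameter?
    hits += mean - HALF_WIDTH <= TRUE_MU <= mean + HALF_WIDTH

coverage = hits / trials
print(f"long-run coverage: {coverage:.3f}")  # close to 0.95
```

The confidence level describes this long-run frequency of the procedure, not the status of any single computed interval.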
The third interpretation, using statistical significance or lack thereof, may be problematic to include. Statistical significance is a relational concept between a sample estimator and the population parameter suggested by the null hypothesis. When we calculate a test statistic to compare the difference between the sample estimate and the null's assertion of the truth, we take advantage of the assumption of the null's truth to ascertain a p-value. The simplicity of the confidence interval is that it is oblivious to the truth. Specifically, if there is a true θ and we repeat our study process many times, then we would expect 100(1−α)% of the intervals we generate to contain the true θ, regardless of the existence of a null hypothesis or whatever value it purports. Saying that a 95% interval suggests a range of values that would not be statistically significantly different from the sample estimator places the condition of truth on the sample and assumes that the parameter value may be a range of possible outcomes, conditions in opposition to statistical theory.
Confidence intervals are built in a way such that they almost always cover the truth. Since we never really know the truth we cannot make any kind of statements about whether the right answer is in any interval we observe. All we know is that there is a pretty good chance that our interval is one of the right ones. This upsets people because we did not know the truth before the study and now that we have an interval we still do not know the truth. At the end of the day, that will always be the problem when we have to use a sample to make inference about a population. Quantifying uncertainty is not the same as making it go away completely. — Preceding unsigned comment added by 99.116.222.7 (talk) 16:48, 15 May 2023 (UTC)
- Aren't those exactly the same thing?
- The statement "the estimated CI contains the true parameter" means the same as the statement "the true parameter is within the estimated CI"; i.e. if the estimated CI contains the true parameter then the true parameter is within the estimated CI, and if the true parameter is within the estimated CI then the estimated CI contains the true parameter.
- There isn't any scenario where the truth value of the two statements differs, therefore their probability is the same.
- I think the point about how "this upsets people because we did not know the truth before the study and now that we have an interval we still do not know the truth" is inaccurate: in both cases we're acknowledging we don't know the truth; if we did, we wouldn't need a confidence interval. 2001:818:DA5F:AF00:5DF0:7BCC:8C9B:3197 (talk) 23:30, 21 June 2024 (UTC)
- They are not the same thing.
- Having computed a specific interval, “the probability that the true parameter is within this specific confidence interval” is meaningless for a frequentist – since neither the parameter nor the interval is a “random variable”, there is no (frequentist) probability at play here: either the parameter is in the interval or it isn’t. You can calculate a Bayesian probability for that statement, conditional on all the data at hand, but it’s not necessarily going to be equal to the confidence level.
- The confidence level, instead, is the answer to “the probability that if we conduct a random trial, the random data we will get will lead to a confidence interval that contains the parameter”. In other words, the distinction is not just the order of the words, it’s the fact that the probabilistic statement now refers to all the random confidence intervals we could generate if we repeated the trial, rather than to the (fixed) parameter in relation to a (fixed) calculated interval.
- It’s kind of the same difference as that between the accuracy of a test and its predictive value. Imagine a disease with 90% prevalence, and a test with 90% sensitivity and 90% specificity. Its “accuracy” (analogue to the confidence level) is 90%: if we take a random patient and have them take the test, we are 90% likely to get a correct test result (“the estimated CI contains the true parameter”).
- But if we now conduct the test, get a negative result, and ask the probability that the true disease status matches the result of the test (“the true parameter is within the estimated CI”), it’s not 90%, it’s 50%. It doesn’t matter that before the experiment, we had a pretty good chance of producing a test result that would match the patient’s true disease status. We have now produced the test result, it’s negative, and we know that it means it’s less likely to be “one of the right ones”.
- This is discussed in the link I posted above: https://link.springer.com/article/10.3758/s13423-015-0947-8
- As well as in: https://bayes.wustl.edu/etj/articles/confidence.pdf
- tl;dr: “the probability that the parameter is in the estimated interval” means (say) “P(θ ∈ [12.3, 14.7] | data)”, whereas “the probability that the estimated interval contains the parameter” (really, “will contain”) is “P(data such that θ ∈ interval(data))”.
- Spidermario (talk) 08:38, 22 June 2024 (UTC)
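The disease-test figures in the thread above (90% prevalence, 90% sensitivity, 90% specificity) can be checked with a few lines of arithmetic; nothing beyond the numbers already stated in the comment is assumed:

```python
# Verify that a test with 90% accuracy can still have only a 50% chance of
# being right once we have seen a negative result (Bayes' rule).
prevalence, sensitivity, specificity = 0.9, 0.9, 0.9

# Probability of a correct result for a random patient ("accuracy",
# analogous to the confidence level):
p_correct = prevalence * sensitivity + (1 - prevalence) * specificity
# = 0.9 * 0.9 + 0.1 * 0.9 = 0.9

# Given a negative result, probability the patient is truly disease-free
# (negative predictive value, analogous to the post-data probability):
p_negative = prevalence * (1 - sensitivity) + (1 - prevalence) * specificity
npv = (1 - prevalence) * specificity / p_negative
# = 0.09 / 0.18 = 0.5

print(f"accuracy: {p_correct:.2f}, NPV: {npv:.2f}")  # accuracy: 0.90, NPV: 0.50
```

The pre-data probability of a correct result (90%) and the post-data probability given the observed result (50%) differ, which is exactly the distinction drawn between the confidence level and the probability that a specific computed interval contains the parameter.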
Definition does not make sense.
The definition provides an equation for every (θ, φ). But φ is not defined. Possibly it should be gamma, but I am not going to touch this. 88.98.89.58 (talk) 09:59, 2 January 2025 (UTC)
- φ, representing quantities that are not of immediate interest
- Spidermario (talk) 18:20, 2 January 2025 (UTC)
- You can remove φ entirely from the definition. This is better mathematical style. You could also clarify things more generally, such as making it clear that . 88.98.89.58 (talk) 14:54, 3 January 2025 (UTC)
- φ is needed in the definition in order to cover the case when the parameter to be estimated does not completely determine the distribution of the data. For instance: a confidence interval for a normal mean mu when the variance sigma squared is unknown. A remark should be added to explain this. Richard Gill (talk) 06:11, 15 February 2025 (UTC)
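The nuisance-parameter point can be illustrated numerically: the t-interval for a normal mean has roughly 95% coverage whatever the unknown variance is. This sketch is not from the discussion, and the true mean, sample size, and sigma values are arbitrary choices:

```python
# Check that the t-interval's coverage does not depend on the nuisance
# parameter sigma, by simulating at several very different sigma values.
import random
import statistics

T_CRIT_DF9 = 2.262  # 97.5th percentile of Student's t with 9 degrees of freedom

def t_interval_covers(true_mu, sigma, n=10):
    """Draw one sample and report whether its 95% t-interval covers true_mu."""
    sample = [random.gauss(true_mu, sigma) for _ in range(n)]
    mean = statistics.fmean(sample)
    half = T_CRIT_DF9 * statistics.stdev(sample) / n ** 0.5
    return mean - half <= true_mu <= mean + half

random.seed(0)
trials = 5000
for sigma in (0.1, 1.0, 50.0):      # vary the nuisance parameter freely
    hits = sum(t_interval_covers(5.0, sigma) for _ in range(trials))
    print(f"sigma={sigma}: coverage ≈ {hits / trials:.3f}")  # ~0.95 each time
```

This is why the definition quantifies over (θ, φ): the coverage guarantee must hold for every value of the nuisance parameter, not just one.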
"Conservative" confidence intervals are contradictory
[edit]Currently, the article defines:
> Confidence limits of the form P(u(X) < θ) ≥ γ and P(θ < v(X)) ≥ γ are called conservative
Typically, "conservative" means "erring on the safe side", or "even in the worst case, assuming all estimates are bad". So I would have expected a "conservative confidence interval with confidence level 95%" to be an overly large confidence interval, perhaps it actually has a confidence level of 99%, or something like that.
However, here's an example that shows how "un-conservative" a "conservative confidence interval" can be. Let's say we have a biased coin that shows heads with probability θ. Define the "crazy method" as:
- Step 1: Throw away the biased coin, i.e. the variable X. We're not going to use it at all.
- Step 2: Instead, throw an unbiased coin.
- Step 3a: If it shows heads, define (u, v) = (−∞, 0).
- Step 3b: If it shows tails, define (u, v) = (1, ∞).
I hope it's fairly obvious that P(u(X) < θ) = 0.5 and P(θ < v(X)) = 0.5, despite P(u(X) < θ < v(X)) = 0.
So according to the definition, this "crazy method" is a "conservative" confidence interval, despite being absolutely terrible. This goes contrary to everything else that mathematics calls "conservative". I'm sure that the "crazy method" isn't even the most pathological method.
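A simulation makes the pathology concrete. This sketch assumes one illustrative reading of the steps above: heads yields the interval (−∞, 0), tails yields (1, ∞), and the true θ is set to 0.7 (any value in (0, 1) behaves the same):

```python
# Simulate the "crazy method": each one-sided bound holds about half the
# time, yet the resulting interval never contains the true parameter.
import random

random.seed(1)
theta = 0.7          # true parameter; an assumption for this sketch
trials = 10_000

lower_ok = upper_ok = joint_ok = 0
for _ in range(trials):
    heads = random.random() < 0.5                        # the unbiased coin
    u, v = (float("-inf"), 0.0) if heads else (1.0, float("inf"))
    lower_ok += u < theta                                # event u(X) < theta
    upper_ok += theta < v                                # event theta < v(X)
    joint_ok += u < theta < v                            # two-sided coverage

print(lower_ok / trials, upper_ok / trials, joint_ok / trials)
# the two one-sided frequencies are each about 0.5; the joint coverage is 0
```

The two one-sided guarantees say nothing about the interval between them, which is the gap the "crazy method" exploits.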
I suggest removing the definition from the page, because:
- The term "conservative confidence interval" is not used anywhere else on this page.
- The term is used on Coverage probability, and defined in the sense that I would have expected: "When the actual coverage probability is greater than the nominal coverage probability, the interval is termed a conservative (confidence) interval"
- The page Standard error says "the Vysochanskiï–Petunin inequalities can be used to calculate a conservative confidence interval", but neither this nor the linked page defines it. Reading the definition, it seems to me that it is conservative in the sense I would have expected.
So in summary: This definition is contrary to the common meaning of "conservative", is not used anywhere else on this page, and contradicts the only two other places on Wikipedia where the phrase "conservative confidence interval" is used.
I'll wait a few days in case this turns out to be controversial, and then WP:Be bold and remove it myself.
--CommonTypoHunter (talk) 19:13, 1 March 2025 (UTC)
- “Conservative” does mean “erring on the safe side”, in mathematical statistics just as in ordinary speech. The problem with the text in the article at the time the previous comment was made is that it referred to one-sided intervals. It should say something like: “confidence intervals such that the probability that theta is inside the interval from lower_bound_computed_from(x) to upper_bound_computed_from(x) is at least gamma, whatever the value of theta, are called conservative”. This means that crazy intervals which nobody could ever dream of using in practice can also be termed conservative. That’s mathematics for you. Being conservative does not guarantee being sensible. (A bit like politics, in fact?) Richard Gill (talk) 06:36, 2 March 2025 (UTC)
- I’ve added the concept “conservative”. Not yet added links to where it is used elsewhere on Wikipedia, nor to the standard definition in the standard literature. Richard Gill (talk) 15:41, 2 March 2025 (UTC)
Maintenance flags
Soon it may be time to remove the maintenance flags. There is a lot that can still be improved, but the article is hopefully now more accessible. I'm not sure it's totally accurate, but at least it's less bloated now so there's less to check.