Document name: Computer Science - Cambridge University Press - Information Theory, Inference, and Learning Algorithms - on Estimation.pdf

Information Theory, Inference, and Learning Algorithms
David J.C. MacKay
3
More about Inference
It is not a controversial statement that Bayes’ theorem provides the correct
language for describing the inference of a message communicated over a noisy
channel, as we used it in Chapter 1 (). But strangely, when it comes to
other inference problems, the use of Bayes’ theorem is not so widespread.
A first inference problem
When I was an undergraduate in Cambridge, I was privileged to receive
supervisions from Steve Gull. Sitting at his desk in a dishevelled office in St.
John’s College, I asked him how one ought to answer an old Tripos question
(exercise ):
Unstable particles are emitted from a source and decay at a
distance $x$, a real number that has an exponential probability
distribution with characteristic length $\lambda$. Decay events can only be
observed if they occur in a window extending from $x = 1$ cm to
$x = 20$ cm. $N$ decays are observed at locations $\{x_1, \ldots, x_N\}$. What
is $\lambda$?
[Figure: observed decay locations marked by asterisks along the x axis.]
I had scratched my head over this for some time. My education had provided
me with a couple of approaches to solving such inference problems: constructing
‘estimators’ of the unknown parameters; or ‘fitting’ the model to the data, or
a processed version of the data.
Since the mean of an unconstrained exponential distribution is $\lambda$, it seemed
reasonable to examine the sample mean $\bar{x} = \sum_n x_n / N$ and see if an estimator
$\hat{\lambda}$ could be obtained from it. It was evident that the estimator $\hat{\lambda} = \bar{x} - 1$ would
be appropriate for $\lambda \ll 20$ cm, but not for cases where the truncation of the
distribution at the right-hand side is significant; with a little ingenuity and
the introduction of ad hoc bins, promising estimators for $\lambda \gg 20$ cm could be
constructed. But there was no obvious estimator that would work under all
conditions.
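
The behaviour of the sample-mean estimator can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes decays are simulated by inverting the cumulative distribution of an exponential restricted to the 1 cm to 20 cm window, and the names `sample_decay` and `naive_estimate`, the seed, and the sample size are all illustrative choices.

```python
import math
import random

A, B = 1.0, 20.0  # observation window in cm, taken from the problem statement


def sample_decay(lam, rng):
    """Draw one observed decay location by inverting the truncated CDF."""
    u = rng.random()  # uniform on [0, 1)
    # Solve F(x) = u for an exponential restricted to [A, B].
    return A - lam * math.log(1.0 - u * (1.0 - math.exp(-(B - A) / lam)))


def naive_estimate(lam, n, rng):
    """Return the estimator x_bar - 1 computed from n simulated decays."""
    xs = [sample_decay(lam, rng) for _ in range(n)]
    return sum(xs) / n - A


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
for true_lam in (2.0, 5.0, 20.0, 100.0):
    est = naive_estimate(true_lam, 100_000, rng)
    print(f"true lambda = {true_lam:6.1f} cm   x_bar - 1 = {est:6.2f} cm")
```

For small characteristic lengths the printed estimate should sit close to the true value. As the true length grows past the window size, the observed locations become nearly uniform over the window, so the estimate levels off below 9.5 cm (half the window width) no matter how large the true value is.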
Nor could I find a satisfactory approach based on fitting the density $P(x \mid \lambda)$
to a histogram derived from the data. I was stuck.
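
For reference, the density being fitted can be written out explicitly. This is a sketch under one assumption: that the problem statement is read as an exponential distribution restricted to the observation window and renormalized. The symbol $Z(\lambda)$ for the normalizing constant is introduced here for illustration.

```latex
P(x \mid \lambda) =
\begin{cases}
\dfrac{1}{\lambda}\, \dfrac{e^{-x/\lambda}}{Z(\lambda)} & 1 < x < 20 \\[1ex]
0 & \text{otherwise}
\end{cases}
\qquad
Z(\lambda) = \int_{1}^{20} \frac{1}{\lambda}\, e^{-x/\lambda}\, dx
           = e^{-1/\lambda} - e^{-20/\lambda}.
```

The dependence of $Z(\lambda)$ on $\lambda$ is what makes a simple closed-form estimator hard to find: the window changes the shape of the likelihood, not only its scale.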
What