研究の掃溜ノオト (Research Scrap-Heap Notes)

since 2011/2/13. I write down things that come to mind in between my intelligent-robotics research.


Principal component analysis (PCA) is a data compression technique. With PCA we can extract the subspace of a high-dimensional space to which the data are confined. Finding a basis for that subspace generally reduces to an eigenvalue problem for the covariance matrix, but solving the eigenvalue problem takes O(n^3) time, which is too slow for high-dimensional data. That cost, however, is for computing all of the eigenvalues and eigenvectors. In PCA we often need only the few largest eigenvalues and their eigenvectors, so solving the full eigenvalue problem is inefficient.
In this post I introduce an algorithm, which I came up with during a numerical computation lecture, for efficiently extracting just as many principal components as needed. Because it returns eigenvalues and their eigenvectors in order from the largest, as many as required, it can also be applied to KPCA, CCA, and KCCA. I found out afterwards that Google reportedly uses the same method to compute PageRank. Too bad ←
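The excerpt does not spell out the algorithm itself. Its description (largest eigenpairs first, only as many as needed, the same idea as PageRank) is consistent with power iteration combined with deflation, so the sketch below assumes that reading. The function name `top_k_principal_components` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def top_k_principal_components(X, k, n_iter=1000, tol=1e-10, seed=0):
    """Return the k largest eigenvalues/eigenvectors of the covariance of X.

    X has shape (n_samples, n_features). Assumed technique: power iteration
    plus deflation (not confirmed by the original post).
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                 # center the data
    C = Xc.T @ Xc / (X.shape[0] - 1)        # sample covariance matrix
    d = C.shape[0]
    eigvals = np.empty(k)
    eigvecs = np.empty((d, k))
    for i in range(k):
        v = rng.standard_normal(d)
        v /= np.linalg.norm(v)
        for _ in range(n_iter):
            w = C @ v                       # one matrix-vector product per step
            norm = np.linalg.norm(w)
            if norm == 0.0:                 # nothing left to extract
                break
            w /= norm
            converged = np.linalg.norm(w - v) < tol
            v = w
            if converged:
                break
        lam = v @ C @ v                     # Rayleigh quotient = eigenvalue estimate
        eigvals[i] = lam
        eigvecs[:, i] = v
        C = C - lam * np.outer(v, v)        # deflation: remove the found component
    return eigvals, eigvecs
```

Each step costs one O(n^2) matrix-vector product, so extracting a few components can be much cheaper than a full O(n^3) decomposition, although convergence slows down when neighbouring eigenvalues are close together.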

... Continued here

What is doubly special relativity…

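The superluminal neutrino result announced on September 23 still seems to be generating a lot of discussion, even now that a little over two weeks have passed.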

Asahi Shimbun Digital: Faster than light, really? Experts cautious about "neutrinos faster than light"

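When I first heard about it I was confused too and thought through all sorts of things,
but in the end I couldn't figure it out, so lately I've been sitting on the fence.
(・_・;)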

... Continued here

Sequential Estimation of a Gaussian Distribution (Numerical Results)

Key Words: Gaussian distribution, normal distribution, sequential estimation, sequential learning, online learning, parameter estimation, parameter inference, mean, variance

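Using the update formulas for the mean and variance in sequential estimation of a Gaussian distribution, which I derived in the previous post, I ran a numerical experiment to see whether they actually converge!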
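The update formulas from the previous post are not reproduced in this excerpt, so the following is only a minimal sketch that assumes the standard maximum-likelihood running updates: the mean moves by (x_N - mean)/N, and the variance is the biased (divide-by-N) estimate. The function name `sequential_gaussian_fit` and the sample parameters are hypothetical.

```python
import numpy as np


def sequential_gaussian_fit(samples):
    """Return an array of (mean, variance) estimates after each sample.

    Assumed updates (standard ML running estimates, not taken from the post):
        mean_N = mean_{N-1} + (x_N - mean_{N-1}) / N
        var_N  = var_{N-1} + ((x_N - mean_{N-1}) * (x_N - mean_N) - var_{N-1}) / N
    """
    mean, var = 0.0, 0.0
    history = []
    for n, x in enumerate(samples, start=1):
        delta = x - mean                       # x_N - mean_{N-1}
        mean += delta / n                      # updated mean_N
        var += (delta * (x - mean) - var) / n  # updated biased variance var_N
        history.append((mean, var))
    return np.array(history)


# Draw data from a known Gaussian and watch the estimates settle.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=3.0, size=10_000)
history = sequential_gaussian_fit(data)
print("final mean, variance:", history[-1])
```

Plotting the two columns of `history` against the sample index is one way to check convergence toward the true mean and variance without storing any past samples in the update itself.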

... Continued here

Sequential Estimation of a Gaussian Distribution (Update Formulas for the Mean and Variance)

Key Words: Gaussian distribution, normal distribution, sequential estimation, sequential learning, online learning, parameter estimation, parameter inference, mean, variance

Let us consider estimating the probability density function (PDF) that the data follow.
As for the setting,