学位論文要旨詳細

学位論文要旨


No		125008
著者（漢字）		韓,昇龍
著者（英字）
著者（カナ）		ハン,スンリョン
標題（和）		3次元映像の幾何情報圧縮
標題（洋）		Geometry Compression of Time-Varying Meshes
報告番号		125008
報告番号		甲25008
学位授与日		2009.03.23
学位種別		課程博士
学位種類		博士(科学)
学位記番号		博創域第426号
研究科		新領域創成科学研究科
専攻		基盤情報学専攻
論文審査委員		主査：　東京大学　教授　相澤,清晴　東京大学　教授　原島,博　東京大学　教授　池内,克史　東京大学　教授　近山,隆　東京大学　教授　相田,仁　東京大学　准教授　苗村,健
内容要旨		要旨を表示する This thesis proposes an algorithm for geometry compression of time-varying meshes (TVMs). We have made two major observations called temporal and spatial redundancy. It leads to the compression of TVMs. Temporal redundancy comes from the fact that each TVMs is a sequence of mesh models where the shape (geometry) of neighboring models are highly correlated with each other. Spatial redundancy means that vertices exist only on the surface of model. Building on these observations, this thesis proposes two different approaches to compress TVMs. The first algorithm is inspired by 2-D video compression, in which motion estimation plays an important role to achieve higher compression performance. As inter-frame coding of 2-D video compression removes temporal redundancy, we follow a similar framework for efficient compression of TVMs. Standard motion estimation uses a block matching algorithm (BMA). We advocate the use of BMA and extend it to TVMs.In our extend BMA (EBMA), a cubic block is used as a matching unit. Efficient motion compensation in the 3-D space is achieved by matching the mean normal vectors calculated from partial surfaces in cubic blocks. Our experiments show that mean normal vector is a suboptimal matching criterion. After motion compensation, residuals are transformed by the discrete cosine transform, uniformly quantized, and then encoded. The extracted motion vectors are also entropy coded after differential pulse code modulation. The second approach employs a two-step decomposition to decouple the global and local redundancy in TVMs. Each decomposition step consists of three subsequent processes: quantization, bit-streamization, and run-length encoding (RLE). This decomposition process is applied to TVMs hierarchically The former decomposition step converts the original each frame of TVMs into a set of blocks and the block positions are represented by binary sequences. The distribution of blocks is biased and sparse in the 3-D space. We denote this as global redundancy.This global redundancy is converted into long runs of zeros as a binary sequence.Therefore, it is efficiently encoded by RLE. Further compression is achieved by removing temporal redundancy. Removal of temporal redundancy can be done by simple Boolean operation which produces binary sequences that have longer runs of zeros.Therefore, further compression is obtained by RLE. In the latter decomposition step, vertices in each block are quantized and represented by binary sequences. As neighboring vertices in each block are highly correlated each other, therefore, the distribution is biased in each block. We call this as local spatial redundancy. Similar to the former decomposition, these also yield long runs of zeros and can be efficiently encoded by RLE. In order to quantify the distortion of our compression methods, we propose image-domain-based evaluation measure. This evaluation method uses a rendered image of each TVMs and manipulates PSNR. All virtual viewpoints of TVMs are considered for a reliable evaluation. This allows us to take into account the characteristics of human visual perception.
審査要旨		要旨を表示する本論文は,「Geometry Compression of Time-Varying Meshes(3次元映像の幾何情報圧縮)」と題し,英文で書かれており,5章よりなる.人の動きなどの動的に変化する実対象の3次元化を行う技術が進展しつつあり,それを基にして自由視点の映像メディアを実現する試みが現実味を帯びつつある,但し,その新しい映像メディアを享受するためには,利活用のための多くの技術開発が必要とされている.本論文で取り扱う3次元映像は,一コマずつメッシュの構造が変化する動的な3次元メッシュの系列(Time Varying Meshes, TVM)であり,そのデータ量は膨大である.本論文では,3次元映像の幾何情報の圧縮に関して論じている.具体的な圧縮手法として,ブロックマッチングに基づく手法,階層的量子化に基づく手法の2手法を提案するとともに,性能評価を行い,その性能評価の指標についても論じている. 第1章は,「Introduction (序論)」と題し,本論文の目的と背景,構成について論じている.3次元映像に関しての課題の整理を行い,その中で圧縮がその利活用にあたっての重要な技術要素であることを述べている. 第2章は,「Related Work(関連研究)」と題し,3次元モデルの圧縮に関しての関連研究のサーベイを行っており,まず,静止3次元メッシュの圧縮について概観し,その手法を紹介するとともに,動的な3次元メッシュ圧縮についての紹介を行っている.動的な3次元メッシュにおいては,過去の研究においては,頂点接続関係の保持される3Dアニメーションの圧縮が行われてきたことを述べ,その手法について紹介している.そして,本論文で対象とする動的な3次元メッシュ系列(TVM)の場合,一コマ毎に頂点数や接続関係までが変化するものであることを述べ,これまでの圧縮方式では対応できないことを論じている. 第3章は,「Extended Block Matching Algorithm (拡張ブロックマッチングアルゴリズム)」と題し,ブロックマッチングによる動き補償を用いるTVMの圧縮手法について論じている.ブロックマッチングは,通常の2次元動画の圧縮に多用される手法であり,それを拡張し,3次元映像にとってのブロックマッチングを考案している.提案手法では,3次元メッシュを局所的なパッチに切り分けるブロック化を行い,各局所パッチ毎に動き推定を行い,動きベクトルを算出している.さらに,動き補償の近似誤差に離散コサイン変換を適用している。性能評価においては,3次元形状としての誤差評価ばかりでなく,複数の代表的な視点からレンダリングした画像としての誤差の評価を提案し,両者による評価を行っている.提案手法により,おおよそ10分の1の圧縮が可能となることを示している. 第4章は,「Two-step Decomposition and RLE based algorithm (2段階量子化とランレングス符号化に基づくアルゴリズム)」と題する。前章での提案手法が,符号化データ中でのオーバーヘッドが大きく効率を落としていたことを踏まえて,さらなる高能率な圧縮手法について論じている.この提案手法では,効率のよい量子化を2段階で行い,ランレングス符号で有意情報の符号化を行う.量子化では,まず,粗な量子化において,データの所在領域を指定し,次に,密な量子化において,頂点座標を量子化代表点に割り当てる.さらに,粗な量子化においては,時間方向の相関を考慮することで効率を高めている.密な量子化結果については,3次元中に偏在する量子化代表点の位置をランレングス符号化で効率よく符号化している.比較的複雑さの低い方式ながら,30分の1程度まで効率を大きく改善している.本手法は,静止3次元モデルに対しても有効であり,既存のVQベースの手法と比較しても,優れた圧縮であることを確認している. 第5章は,「Conclusion(結論)」と題し,本論文で得られた知見をまとめると共に,今後の課題について言及している. 以上これを要するに,本論文は,時間的にコヒーレントでない動的3次元メッシュの圧縮という新しい課題を取り上げ,拡張ブロックマッチングに基づく手法と階層的量子化に基づく手法を提案し,評価を通じてその有効性を示したものであり,情報学の基盤に貢献するところが少なくない. 従って、博士(科学)の学位を授与できると認める。
UTokyo Repositoryリンク