School of Computer Science, Tel-Aviv University

Neural Computation & Signal Processing Lab (NCSP)

Advanced Research Seminar

Advanced Seminar in Computer Science

0368-5020-01, Autumn/Spring 2005-6

Prof. Nathan Intrator

____________________________________________________________________

Next Seminar

Robust Inference in Bayesian Networks with Applications

Omer Berkman

School of Computer Science, TAU

Wednesday, Jun 14, Dan David 107, 2:00pm-3:30pm

Abstract

We are concerned with the problem of inferring a genetic regulatory network from a collection of temporal observations. This is often done by estimating a Dynamic Bayesian Network (DBN) from short time series of gene expression data. We introduce ways to create a collection of “weak” Dynamic Bayesian networks (WDBN) and then fuse them together for more robust network estimation. Results are demonstrated on simulated gene expression data and show improved robustness with respect to the quality of inference, the ability to handle data with fewer time steps, and the ability to handle a larger number of examined genes.

 

 

             ____________________________________________________________________

 

Seminar Overview

The seminar this year will concentrate on computational methods which are used in the different projects in the lab. Specifically, we shall concentrate on machine learning, robust statistics and robust modeling in each of the presentations. The background for the specific projects was presented in talks in the 2004-2005 seminars and thus will not be repeated. There will be a few guest presentations as well.

Instructor:    Prof. Nathan Intrator, Schreiber 221, x7598, Office hours: Wednesday 4-5 or via email

NCSP Past seminars   2004-5 Sem I   2004-5 Sem II

Other relevant seminars:   PBC Seminar  Eshel Ben Yaakov  Eytan Ruppin  Ron Shamir 

Presentations

Date    Title                                                                 Speaker
Nov 02  Organizational meeting                                                Nathan Intrator
Nov 09  Meeting with students                                                 Nathan Intrator
Nov 10  Visions of language: through a mirage to an oasis                     Shimon Edelman
Nov 16  Seismic data analysis                                                 Talmor/Yariv
Nov 23  Mapping Mutations in the HIV RNA                                      Nimrod Bar
Nov 30  Results on hidden loop discovery and cursive handwriting recognition  Tal Steinherz
Dec 07  Music Coloring for High Dimensional Data Representation               Eyal Balla
Dec 14  Spectral ICA and Adaptive noise removal in Heart Sounds               Haim Appleboim
Dec 21  Functional Holography of Complex Network Activity                     Itai Baruchi
Jan 11  fMRI: Recent advances in DTI                                          Ofer Pasternak
Jan 25  Segmentation of Heart Sounds                                          Daniel Gill
Feb 01  Motion Estimation Improves Ultrasound Imaging                         Lian Yu
Mar 08  Finding Structure in Text                                             Ben Sandbank
Mar 15  Coresets for Weighted Facilities and Their Applications               Dan Feldman
Mar 22  Studying Brain Activity from Electrophysiological Signals             Andrey Zhdanov
Apr 26  Hidden Markov Models and Speech Recognition                           Ron Hecht
May 10  S3 and S4 Heart Sounds: Physiology and Applications                   Hadas Zur
May 17  Automatic segmentation of low-frequency heart signals                 Guy Amit
May 24  Thesis Defense: Cursive Word Recognition                              Tal Steinherz
May 31  Multi-Dimensional Feature Scoring For Gene Expression Data            Niv Efron
Jun 07  Structure Predicts Function: Anatomical Priors for fMRI Detection     Polina Golland
Jun 14  Robust Inference in Bayesian Networks with Applications               Omer Berkman

 

General instructions for seminar presenters

The presentation should have an introductory component that enables all students to understand the background of the seminar. It should then have a methodological component which explains at least a single method that can be used in a variety of applications. Finally, there should be some application results which demonstrate the usefulness of the proposed methods.

In contrast, a review presentation should describe several computational methods which are aimed at addressing a specific problem, together with a clear background of that problem. Preferably some comparison between the methods should be provided.

The abstract of the presentation should be sent to me no later than three days before the presentation, and the slides no later than one day before the presentation.

 

Abstracts

 

Visions of language: through a mirage to an oasis                                                      Shimon Edelman

Over the past five decades, the conception of language adopted as the overarching theoretical framework by most linguists has been increasingly considered by most of the other involved scientists as irrelevant to the understanding of cognition and the brain. This unfortunate trend appears now to be reversing, due to a series of developments in cognitive linguistics and psychology, and in computer science. In this talk, I shall briefly survey these developments, focusing on language acquisition -- an issue with respect to which the still dominant "innatist" stance in linguistics bears a curious resemblance to the obscurantist doctrine known as "intelligent design."

 

The algorithm for language acquisition that I shall mention is joint work with Zach Solan, Eytan Ruppin, and David Horn.

 

Seismic data analysis                                                                                                  Talmor/Yariv

The talk describes the final project in the course “Neural Computation”. Information and code for obtaining seismic data will be presented, together with the methodology and results which have led to a new global seismic network presentation that includes event location as well as the temporal structure of a collection of events.

 

Mapping Mutation Patterns in the HIV DNA                                                                 Nimrod Bar

Outline

·         HIV introduction

·         HIV DNA mutations

·         Retrieving and processing the DNA sequence

·         From DNA to Amino Acid Mutations

·         The importance and problem of finding mutation patterns

·         Recent biological research of mutations – Bayesian networks.

·         Applying Branch and Bound techniques for pattern finding

·         Future research – Bi-clustering of mutation data.

 

Recent results on hidden loop discovery and cursive handwriting recognition                Tal Steinherz

Methods for moving from offline cursive word recognition to a pseudo-online representation will be discussed. In particular, recent results on hidden loop recovery will be shown.

 

Music Coloring for high dimensional data representation                                            Eyal Balla

The talk will include an overview of the use of music in psychotherapy, and of the use of music in viewing high dimensional data in finance and other applications. An introduction to Max/MSP will be given, together with some demos. Finally, an introduction to VST universal music code will be given and music effects using VST plug-ins will be presented. The goal is to have colored music provide additional information to the user, for example about their medical condition, for the purpose of monitoring and bio-feedback.

 

Spectral ICA and Adaptive noise removal in Heart Sounds                                           Haim Appleboim

The talk will include an overview of ICA and spectral ICA and will demonstrate their usefulness for signal conditioning and enhancement from multiple sources, as well as for the removal of noise and artifacts.

                                   

Functional Holography of Complex Network Activity                                                    Itai Baruchi

A functional holography (FH) approach is introduced for analyzing the complex activity of biological networks in the space of functional correlations. Although the activity is often recorded from only some of the nodes, the goal is to decipher the activity of the whole network. The analysis is therefore guided by the "whole in every part" nature of a hologram: a small part of a hologram generates the whole picture, but with lower resolution. The analysis starts by constructing the space of functional correlations from the similarities between the activities of the network components, using a special collective normalization, or affinity transformation. Using dimension reduction algorithms such as PCA, a connectivity diagram is generated in the 3-dimensional space of the leading eigenvectors. The network components are positioned in this 3-dimensional space by projection onto the eigenvectors and are connected with colored lines that represent the similarities. Temporal (causal) information is superimposed by coloring the nodes' locations according to the temporal ordering of their activities. This analysis revealed hidden manifolds with simple yet characteristic geometrical and topological features in complex biological activity, in systems ranging from cultured networks to the human brain. These findings could be a consequence of the analysis being consistent with a new holographic principle by which biological networks regulate their complex activity.

 

Diffusion Weighted MRI - The Beltrami Flow                                                                Ofer Pasternak

Diffusion weighted MRI measures the self-diffusion of water molecules. The imaging techniques originate with the work of Stejskal and Tanner in 1965, and became popular with the Diffusion Tensor Imaging of Basser, 1994. In order to model the diffusion one has to solve the diffusion equations. The solutions to those equations are simple for homogeneous material and were recently extended to the general case using the Beltrami flow. All imaging techniques to date rely on the simple solution to the diffusion equations, which assumes homogeneity. This means that those models are prone to errors on heterogeneous materials, such as complex brain architecture. In this talk I will explain different existing diffusion models and their inability to model complex brain architecture. I will introduce the Beltrami flow solution to the diffusion equation as it was used for image processing, and will show how the Beltrami flow might be used in the diffusion imaging context.

 

Segmentation of Heart Sounds                                                                         Daniel Gill

Detection and Identification of Heart Sounds Using Homomorphic Envelogram and Self-Organizing Probabilistic Model

This work presents a novel method for automatic detection and identification of heart sounds. Homomorphic filtering is used to obtain a smooth envelogram of the phonocardiogram, which enables robust detection of events of interest in the heart sound signal. Sequences of features extracted from the detected events are used as observations of a hidden Markov model. It is demonstrated that the task of detection and identification of the major heart sounds can be learnt from unlabelled phonocardiograms by an unsupervised training process and without the assistance of any additional synchronizing channels.
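As a rough illustration of the homomorphic idea mentioned above, the sketch below takes the logarithm of the signal magnitude, smooths it, and exponentiates the result. It is not the speaker's implementation: the moving average stands in for a proper low-pass filter, and the sampling rate, window length, and synthetic test signal are assumptions chosen only for the example.

```python
import math

def moving_average(values, window):
    """Centered moving average; a crude low-pass filter used only for illustration."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def homomorphic_envelope(signal, window=51, eps=1e-10):
    """Estimate a smooth amplitude envelope as exp(lowpass(log|x|))."""
    # eps avoids log(0) on samples that are exactly zero.
    log_magnitude = [math.log(abs(x) + eps) for x in signal]
    return [math.exp(v) for v in moving_average(log_magnitude, window)]

# Synthetic stand-in for a heart sound: a 50 Hz tone that is loud
# only in the middle third of the recording (assumed values).
fs = 1000  # hypothetical sampling rate in Hz
signal = [
    (1.0 if 333 <= n < 666 else 0.05) * math.sin(2 * math.pi * 50 * n / fs + 0.3)
    for n in range(1000)
]
envelope = homomorphic_envelope(signal)
```

In this toy signal the envelope is much higher inside the loud segment than outside it, so a simple threshold on `envelope` would already mark the event boundaries.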

 

Motion Estimation Improves Ultrasound Imaging                                                        Lian Yu

 

Finding Structure in Text                                                                                             Ben Sandbank

The problem of inferring the structural units composing a symbolic sequence is fundamental to linguistics, bioinformatics, and certain other disciplines. For example, a newborn child is confronted with a stream of undifferentiated sound, and must learn to identify the underlying morphemes, words and collocations. Analogously, in bioinformatics, the structural elements in protein and DNA sequences may correspond to active sites, promoters, and so on. In this talk I will present MEX (Motif EXtraction), a novel method aimed at just this purpose, and demonstrate its applicability to the problem of finding words in unsegmented text and to a task of protein function classification. I will also cover some of the directions for further development that are currently under way.

 

Coresets for Weighted Facilities and Their Applications                                               Dan Feldman

We give nearly linear-time approximation schemes for several basic problems in geometric optimization. We do so through the use of coresets, designed for weighted facilities. As an example, we give nearly linear-time algorithms for the approximate k-median line and the k-mean line problems, in which we wish to approximate a set P of points in R^d  by k lines, so that the sum of the Euclidean distances (or of the squared distances) from P to these lines is minimized (to within a (1+\epsilon) factor). We also give nearly linear-time algorithms for generalizations of linear regression problems to multiple regression lines. Our coresets also generalize the SVD/PCA techniques, for finding a (1+\epsilon)-approximation to the best q-dimensional flat that fits P, under various distance measures (SVD/PCA only deals with the sum of squared distances). All these results significantly improve on previous work, which could deal efficiently only with very special cases.

Joint work with Amos Fiat and Micha Sharir.
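To make the objective in this abstract concrete, the sketch below evaluates the k-median line cost (sum of distances) and the k-mean line cost (sum of squared distances) of a point set with respect to a given set of lines. It only evaluates the cost; it does not construct coresets or approximation schemes. The function names, the line representation (a point on the line plus a direction vector), and the sample data are assumptions made for illustration.

```python
import math

def distance_to_line(point, origin, direction):
    """Euclidean distance from `point` to the line {origin + t * direction}."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    offset = [p - o for p, o in zip(point, origin)]
    # Component of the offset along the line, then the orthogonal residual.
    along = sum(o * u for o, u in zip(offset, unit))
    residual = [o - along * u for o, u in zip(offset, unit)]
    return math.sqrt(sum(r * r for r in residual))

def k_line_cost(points, lines, squared=False):
    """Sum over points of the (optionally squared) distance to the nearest line."""
    total = 0.0
    for p in points:
        nearest = min(distance_to_line(p, o, d) for o, d in lines)
        total += nearest ** 2 if squared else nearest
    return total

# Made-up points in R^2 and k = 2 candidate lines.
points = [(0.0, 1.0), (2.0, -1.0), (5.0, 3.0), (1.0, 4.0)]
lines = [((0.0, 0.0), (1.0, 0.0)),   # the x-axis
         ((1.0, 0.0), (0.0, 1.0))]   # the vertical line x = 1
median_cost = k_line_cost(points, lines)              # sum of distances
mean_cost = k_line_cost(points, lines, squared=True)  # sum of squared distances
```

A (1+\epsilon)-approximation in the sense of the abstract would be a set of k lines whose cost is at most (1+\epsilon) times the minimum of this function over all sets of k lines.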

 

Studying Brain Activity from Electrophysiological Signals                                          Andrey Zhdanov

Neuroelectrophysiology is concerned with the measurement and analysis of electrical and magnetic signals arising from the electrical activity of neurons inside the brain. In this talk we will introduce basic concepts of electrophysiology (EP) and describe the two most commonly used non-invasive EP measurement techniques, electroencephalography (EEG) and magnetoencephalography (MEG), with a particular focus on the latter. We will discuss the relationship between the electrical activity of neurons inside the brain and aggregate electrophysiological measurements on the surface of the head, and its implications for the source localization problem. We will also survey major research directions in the field of EP signal analysis and discuss the incorporation of prior biological knowledge into EP signal processing techniques.

 

Speech Recognition                                                                                                     Ron Hecht

The presentation will include some introduction to signal analysis, Markov models (including training the model) and other speech and signal related issues.
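As a small companion to the Markov model part of this talk, the sketch below computes the likelihood of an observation sequence under a discrete hidden Markov model using the forward algorithm. It covers evaluation only, not training, and the two-state model and its probabilities are invented for the example.

```python
def forward_likelihood(observations, initial, transition, emission):
    """Return P(observations) under a discrete HMM via the forward algorithm."""
    n_states = len(initial)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)

# Hypothetical two-state model with two observation symbols (0 and 1).
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]   # transition[i][j] = P(next = j | current = i)
emission = [[0.9, 0.1], [0.2, 0.8]]     # emission[s][o] = P(symbol o | state s)
likelihood = forward_likelihood([0, 1, 0], initial, transition, emission)
```

In a speech recognizer built this way, one such model would typically be kept per word or phoneme, and the model giving the highest likelihood for the observed feature sequence would be selected.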

 

S3 and S4 Heart Sounds: Physiology and Applications                                                Hadas Zur

S3 and S4 are transient sounds which are difficult to detect. Their existence might indicate a worsening of the heart condition. The talk will describe these signals and their origin, as well as some algorithms for their detection and some applications.

 

Automatic Segmentation of Low-frequency Heart Signals                                            Guy Amit

Mechanical and electrical heart signals received on the chest wall bear valuable information about the underlying cardiovascular processes. In this talk I will describe an automatic algorithm that detects distinct segmentation points in low-frequency heart signals. The algorithm uses signal processing and pattern recognition techniques to identify multi-scale extrema points having high repeatability and low variability, without using any prior knowledge of the signal's morphology. I will demonstrate the algorithm's ability to accurately detect points with known physiological meaning in electrocardiogram and carotid pulse signals recorded from multiple subjects, and propose a quantitative measure for evaluating the segmentation quality. Finally, I will describe some future work on time series similarity measures.

 

Multi-Dimensional Feature Scoring For Gene Expression Data                                     Niv Efron

The analysis of gene expression data presents researchers with the problem of finding optimal subsets of genes to focus on. This is a computational and statistical challenge, mostly due to the high dimensionality of the data and the small number of samples. Hence, an initial process of gene (feature) selection is usually performed.

This work discusses several methods that perform feature scoring and selection. It focuses on a comparison between common one-dimensional methods (scoring each gene using only its expression values) and our proposed multi-dimensional method (scoring each gene using also its correlation with other genes), based on linear discriminant analysis (LDA). We present several techniques of regularizing the multi-dimensional LDA, aiming to solve the inherent problems of high-dimensional feature space.

We compare the performance of these methods using simulations and real data, and specifically address how several parameters (such as sample size and dimensionality) affect the methods. The results show that the multi-dimensional methods outperform the one-dimensional methods, and we discuss the scenarios in which it is more appropriate to use them.
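For readers unfamiliar with one-dimensional scoring, the sketch below ranks genes by a Fisher-style score computed from each gene's own expression values in two classes. This is one common one-dimensional criterion chosen here as an assumption; the abstract does not say which one-dimensional scores were compared, and the multi-dimensional LDA-based method is not shown. The expression values are made up.

```python
from statistics import mean, pvariance

def fisher_score(class_a, class_b, eps=1e-12):
    """Squared mean difference over summed within-class variances for one gene."""
    separation = (mean(class_a) - mean(class_b)) ** 2
    spread = pvariance(class_a) + pvariance(class_b) + eps  # eps guards against 0/0
    return separation / spread

def rank_genes(expression_a, expression_b):
    """Return (gene indices from most to least discriminative, per-gene scores)."""
    scores = [fisher_score(a, b) for a, b in zip(expression_a, expression_b)]
    ranking = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return ranking, scores

# Rows are genes, columns are samples; one matrix per class (illustrative values).
expression_a = [[1.0, 1.2, 0.9, 1.1], [5.0, 3.0, 4.0, 6.0], [2.0, 2.1, 1.9, 2.0]]
expression_b = [[3.0, 3.1, 2.9, 3.2], [5.5, 3.5, 4.5, 5.0], [2.0, 2.2, 1.8, 2.1]]
ranking, scores = rank_genes(expression_a, expression_b)
```

Because each gene is scored in isolation, a score of this kind cannot reward a gene that is informative only in combination with others, which is the gap a multi-dimensional score is meant to address.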

Robust Inference in Bayesian Networks with Applications                                            Omer Berkman

We are concerned with the problem of inferring a genetic regulatory network from a collection of temporal observations. This is often done by estimating a Dynamic Bayesian Network (DBN) from short time series of gene expression data. We introduce ways to create a collection of “weak” Dynamic Bayesian networks (WDBN) and then fuse them together for more robust network estimation. Results are demonstrated on simulated gene expression data and show improved robustness with respect to the quality of inference, the ability to handle data with fewer time steps, and the ability to handle a larger number of examined genes.
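The abstract does not say how the weak networks are fused. Purely as an illustration of the general idea, the sketch below assumes each weak network is given as a set of directed (regulator, target) edges and keeps the edges that appear in at least a chosen fraction of them. The voting rule, the threshold, and the gene names are all assumptions, not the speaker's method.

```python
from collections import Counter

def fuse_networks(weak_networks, min_support=0.5):
    """Keep directed edges that appear in at least `min_support` of the weak networks."""
    # set(network) ensures each weak network votes at most once per edge.
    votes = Counter(edge for network in weak_networks for edge in set(network))
    threshold = min_support * len(weak_networks)
    return {edge for edge, count in votes.items() if count >= threshold}

# Each weak network is a set of (regulator, target) edges; gene names are hypothetical.
weak_networks = [
    {("g1", "g2"), ("g2", "g3")},
    {("g1", "g2"), ("g3", "g1")},
    {("g1", "g2"), ("g2", "g3"), ("g4", "g1")},
]
consensus = fuse_networks(weak_networks)
```

Here the edges found by only one weak network are discarded, which is the sense in which a fused estimate can be more robust than any single network learned from a short time series.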

 

 

 

 

 

           

Some reading material

 

Sound analysis

Auditory display of hyperspectral colon tissue images

Singing the Mind Listening

Sound features

Chris Raphael Rhythm changes

 

Biomedical signals and sensors

Robust measurement of Carotid Heart sound delay

Heart Mechanical and Electrical System

Segmentation of EKG signals

EKG Overview

Heart info and abnormalities (video)

 

Sensors

Cheap off-the-shelf TinyOS-operated robots

PicoRadio: Low power wireless node with sensors

Sensors Magazine

Xbow sensors

Machine learning and Statistics

Information theory T. Cover

Max Entropy Methods  R. Skiling

Pattern recognition and neural networks B. Ripley

Neural networks for pattern recognition Bishop

Digital Signal Analysis: A Computer Science Perspective J. Stein

Biomedical Signal Analysis R. M. Rangayyan  

Breath Sounds Methodology N. Gavriely

Introduction to Bayesian Networks K. Murphy

 

Software

Max/MSP Multimedia creation

TinyOS  operating system for wireless applications

 

 

The slides and other seminar events can be found at http://www.cs.tau.ac.il/~nin/Courses/AdvSem0506/AdvSem0506.htm