This thesis proposes an on-line method for understanding a human conductor's actions observed through a vision sensor.
A VME-bus-based vision system captures images of the conducting action and extracts the image coordinates of the baton's endpoint every 1/30 second. These coordinates are used to recognize patterns in the conducting action and to play the corresponding music score.
An algorithm based on expert knowledge of conducting is proposed for recognizing patterns in the conducting action. To detect the upper and lower corners of the beat trajectory without detection errors, the algorithm uses five consecutive sets of extracted baton coordinates. In extensive experiments, the algorithm detected the upper and lower corners without error. Complementary algorithms are also proposed for identifying the first beat, the static point, and the dynamics.
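The exact corner-detection rule is not spelled out in this abstract; the following is only a minimal sketch of the general idea, assuming the "five sets of coordinates" form a sliding window of five consecutive 30 Hz samples and that a corner is flagged when the middle sample is a strict extremum of its window. The function name and the extremum test are illustrative assumptions, not the thesis's actual algorithm.

```python
def detect_lower_corners(ys):
    """Illustrative sketch (not the thesis algorithm): flag a lower corner
    at index i when the middle sample of a five-sample window of vertical
    baton coordinates is the window minimum and strictly below the window
    endpoints. Using five samples instead of three helps reject the
    single-frame jitter that a noisy vision sensor produces."""
    corners = []
    for i in range(2, len(ys) - 2):
        window = ys[i - 2:i + 3]
        if ys[i] == min(window) and ys[i] < ys[i - 2] and ys[i] < ys[i + 2]:
            corners.append(i)
    return corners

# Hypothetical vertical-coordinate samples of the baton endpoint:
samples = [5, 3, 1, 3, 5, 4, 2, 4, 6]
print(detect_lower_corners(samples))  # indices of detected lower corners
```

Upper corners could be detected symmetrically with a maximum test over the same window.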
The above algorithms for understanding the conducting action are implemented on a 32-bit personal computer, and the corresponding music score is played through the computer's built-in speaker.