De Kai Professor, Musician
For pioneering contributions to machine learning of the cognitive relationships between different languages, De Kai is among only 17 scientists worldwide selected in 2011 by the Association for Computational Linguistics to be awarded the honor of Founding ACL Fellow. A native of St Louis, he worked and traveled extensively in San Francisco, New York, Germany, Spain, China, India and Canada before joining Hong Kong's ambitious creation of Asia's now top-ranked HKUST, where he developed the foundations of modern statistical machine translation technology, with broad applications in computer music and computational musicology as well as human language processing, and built the world's first public web translation service resulting in global coverage.
De Kai's cross-disciplinary work relating music, language, intelligence, and culture stems from a liberal arts perspective emphasizing creativity in both technical and humanistic dimensions.
A multi-instrumentalist songwriter classically trained at Northwestern University's School of Music, De Kai started piano and composing at age 4 while simultaneously immersed in the improvisational, rhythmic, conversational forms of Chicago's blues, soul and funk. At Berkeley he studied West African polyrhythms prior to years of intensive training on flamenco cajón with a wide range of top Spanish percussionists and dancers.
In parallel with music performance and theory, De Kai began building analog synths at the age of 12. Several years later, he began studying computer music at the UCSD Center for Music Experiment and Related Research (CME / CRCA). At Berkeley he subsequently designed digital additive synthesis chips. His recent work explores the potential of cognitive machines that learn the relationships between various different kinds of musical languages.
During his doctoral studies in cognitive science, artificial intelligence and computational linguistics at Berkeley, he worked on seminal projects on intelligent conversational dialog agents. His PhD dissertation employing maximum entropy to model human perception and interpretation of ambiguities was one of the first to spur the paradigm shift toward today's state-of-the-art statistical natural language processing technologies. De Kai also holds an executive MBA from Kellogg (Northwestern University) and HKUST. His undergraduate degree at UCSD was awarded cum laude, Phi Beta Kappa, and won the liberal arts oriented Revelle College's department award.
In 2015, Debrett's HK 100 named De Kai as one of the 100 most influential figures of Hong Kong.
ReOrientate performance Para Limes Institute "East of West, West of East", Singapore, 18 Oct 2016
ReOrientate performance + Do You Speak Pentatonic? The Multilinguality of Music TEDxWanChai, Aug 2014
ReOrientate performance TEDxHKUST, Mar 2012
De Kai performs across a broad range of music styles and is the creator of the unique Hong Kong based collective ReOrientate rapidly gaining acclaim for accessible, soulful Asian pop grounded in Buddhist, Hindu, Sufi, and Taoist folk traditions while re-assimilating highly Oriental idioms of Spanish flamenco gypsies who originated near northern India between western China and eastern Pakistan.
The gypsies' epic Silk Road and Mediterranean migrations anchor the axis of ReOrientate's exploration of cultural resonances from East to West. Featuring multilingual songs—Chinese, Hindi, English, Punjabi, Turkish, Spanish—ReOrientate seeks common languages among the diverse traditions' complex rhythms and scales.
The live performances anticipating ReOrientate's debut album earned rave reviews from Asia Society to Hard Rock Cafe to TEDx events, at the Hong Kong Cultural Centre and Hong Kong Arts Centre, and at all major Hong Kong music festivals including Clockenflap, Freespace, and Detour.
In addition to ReOrientate, De Kai also works on keys and percussion in live settings ranging from traditional theatrical and tablao flamenco, to funk and soul, to canto-pop fusion with stars such as HOCC (Denise Ho Wan Sze) and Chet Lam.
reviewer's pick Like telepathy, the magic of music... a United Nations force of traditional instruments like the guzheng, bamboo flute, hand drums, erhu, plus electronic loop workstations... surprisingly unexpected percussion rhythms, much dance of the imagination, thick with the Latin flavor of flamenco guitar... makes you plant yourself in the festival party on the dance floor with the chorus cheering “dance!”... in the space of 6 minutes 35 seconds ReOrientate takes the crowd along the treasure trip from Brazilian samba carnival to Indian Bollywood — Re:spect Music Magazine
A truly mighty voice with both soul and an edge. Great drums and rhythms... surprising and arresting flamenco-influenced dancing... an ambitious project altogether... always entertaining — The Underground HK
brilliantly recorded — RTHK Radio 3
Regenerating the Creative Impulse: Technology, Community, Art and Space Bosphorus Summit, Istanbul, 1 Dec 2016
How AIs Are and Aren't Kids: Language, Music and Society Volumetric Society of NYC, Brooklyn Experimental Media Center at NYU, 28 Jul 2016
Generalizing Transduction Grammars to Model Continuous Valued Musical Events International Society for Music Information Retrieval Conference, New York, 11 Aug 2016
Can an A.I. Really Relate? What's Universal in Language and Music TEDxBeijing, Jan 2016
Music in Translation TEDxElsaHighSchool, Apr 2014
Music in Translation: Artificial Intelligence and the Languages of World Music World Music Expo (WOMEX), Cardiff, UK, Oct 2013
ICMA BEST PRESENTATION AWARD Neural Versus Symbolic Rap Battle Bots International Computer Music Conference (ICMC), Denton, Texas, Sep 2015
Translating Music Radio and Television Hong Kong, The Sound of Art and Science, May 2015
Language, Music and Reorientation: The Keys to Artificial Intelligence Café Scientifique, Hong Kong, Mar 2016
Translating Music: How Computational Learning Explains the Way We Appreciate Music and Language Raising The Bar, Hong Kong, Mar 2015
Reorientate Musical Frames of Reference across Cultures Detour 2014 @ PMQ, Hong Kong, Nov 2014
How Music Can Reorientate Our Cultures Sebasi x Pan-Asian Network, Jeju, South Korea, Nov 2014
• “Neural Versus Symbolic Inversion Transduction Grammars in Language and Music”. NLP Seminar, Columbia University. New York, Jul 2016
• “How Blue Can You Get? Learning Structural Relationships for Microtones via Continuous Stochastic Transduction Grammars”. International Conference on Computational Creativity (ICCC). Paris, Jun 2016
• “Learning Musical Creativity via Stochastic Transduction Grammars: Combination, Exploration and Transformation”. International Workshop on Musical Metacreation (MUME). Paris, Jun 2016
• “Music is a Relationship between Languages”. Music Colloquium. Chinese University of Hong Kong, Apr 2016
• “Language, Music, and Creativity”. ESSCaSS. Estonia, Aug 2015
• “Learning to Rap Battle with Bilingual Recursive Neural Networks”. AI and the Arts, International Joint Conference on Artificial Intelligence (IJCAI). Buenos Aires, Jul 2015
• “Compositional bilingual artificial neural networks for predicting hypermetrical structure among interacting flamenco parts”. Society for Music Perception and Cognition (SMPC). Nashville, Aug 2015
• “Learning Music and Language with Stochastic Transduction Grammars”. EvoMus Workshop on The Evolution of Music and Language in a Comparative Perspective. Vienna, Apr 2014
• “Learning to Freestyle: Hip Hop Challenge-Response Induction via Transduction Rule Chunking and Segmentation”. Conference on Empirical Methods in Natural Language Processing (EMNLP). Seattle, Oct 2013
• “Simultaneous Unsupervised Learning of Flamenco Metrical Structure, Hypermetrical Structure, and Multipart Structural Relations”. 14th International Society for Music Information Retrieval Conference (ISMIR). Curitiba, Brazil, Nov 2013
• “The Magic Number 4: Evolutionary Pressures on Semantic Frame Structure”. Evolution of Language. Cortona, Italy, Sep 2013
• “Unsupervised Rhyme Scheme Identification in Hip Hop Lyrics Using Hidden Markov Models”. International Conference on Statistical Language and Speech Processing (SLSP). Tarragona, Spain, Jul 2013
Music and language define humanity. Music and language, the only capabilities where humans outshine all other species, have been inextricably bound to each other since the dawn of humankind. Our prehistoric ancestors were probably singing before talking—animal songs are the likely evolutionary precursor of music and language. Out of the evolutionary refinement of these abilities, human intelligence emerged.
De Kai's work asks some fundamental questions about music and language, introducing a new computational musicology theoretical paradigm of stochastic transduction models to provide explanations.
How did evolutionary conditions drive humans to develop music and language as they did? Music and language share a common set of neural and psychological resources. These arm us with fundamental cognitive abilities including being able to semantically associate signals with contexts, and to syntactically segment song strings that are statistically noteworthy. Yet the theoretical space of possible musics and languages is far too vast for human music and language to have converged as they have, without other strong factors driving the evolutionary pressures. De Kai's transduction models naturally imply cognitive load constraints that explain the remarkable convergence of music and language characteristics even across distant cultures.
How do we learn the languages that music is built out of? Music is full of different kinds of languages. Just as lyrics are sequences of words, melodies are sequences of notes, rhythms are sequences of percussive hits, cadences and progressions are sequences of chords, and song structures are sequences of verses, choruses, bridges, and the like. And just as with spoken languages, all these different musical languages have many subtle complexities when it comes to what does and doesn't sound good. De Kai's work implements cognitive models of our ability to absorb the right patterns.
How do we learn the relationships between multiple musical languages? It is the relationships between multiple different musical languages, simultaneously being played in parallel, that differentiate aesthetically pleasing music from jarring noise. Learning the contextual relationships between languages, formally called “transductions”, can easily become far more difficult than learning the individual musical languages. De Kai's models are the first to show how we depend on a virtuous cycle to crack the code: partial knowledge of a musical language helps us learn its relationships to other kinds of musical languages, while conversely, partial knowledge of relationships between musical languages helps us learn more about specific musical languages.
How does our ability for creative expression in music arise? The true test of music learning lies in musical communication, improvisation, and composition. Not only is it important to be able to express personal sentiments and attitudes, but a musician should also complement what is being played by other musicians, and respond appropriately to subtle cues from them. De Kai's transduction models explain how the musical relationships that are learned can naturally be used creatively to improvise, accompany, or compose in context.
Language AIs will broker the Virtual Silk Road Hack The Future, Shanghai, 10 Dec 2016
Artificial Children TEDxBlackRockCity, 31 Aug 2016
Surprise! You already have kids and they're AIs TEDxXi'an, Jun 2016
Augmenting Human Communication: What Doesn't Translate, and What is the Cost of Not Translating? World Economic Forum, Summer Davos, Sep 2014
Language, Music and Reorientation: The Keys to Artificial Intelligence Café Scientifique, Hong Kong, Mar 2016
De Kai's work on machine learning of human languages and the cognitive relationships between them—in formal terms, induction of transductions—has produced over 120 scientific papers. His PhD and Masters students come from cultures all over the world including China, India, United States, Canada, France, Sweden, Algeria and Hong Kong. His laboratory is globally funded by Asian, US, and European research grants.
Milestones in statistical machine translation pioneered by his laboratory include
Some of this is surveyed in De Kai's book chapters
His new book, Introduction to Text Alignment: Statistical Machine Translation Models from Bitexts to Bigrammars, will be published this year by Springer.
• “How AIs Are and Aren't Kids: Language, Music and Society”. Volumetric Society of NYC, Brooklyn Experimental Media Center at NYU. New York, Jul 2016
• “Neural Versus Symbolic Inversion Transduction Grammars”. NLP Seminar, Columbia University. New York, Jul 2016
• “Surprise! You already have kids and they're AIs”. TEDxXi'an. Xi'an, China, Jun 2016
• “Music is a Relationship between Languages”. Music Colloquium. Chinese University of Hong Kong, Apr 2016
• “Language, Music and Reorientation: The Keys to Artificial Intelligence”. Café Scientifique. Hong Kong, Mar 2016
• “Can an A.I. Really Relate? What's Universal in Language and Music”. TEDxBeijing. Beijing, Jan 2016
• “AI = Learning to Translate”. 14th Estonian Summer School on Computer and Systems Science (ESSCaSS). Nelijärve, Estonia, Aug 2015
• “Translating Music: How Computational Learning Explains the Way We Appreciate Music and Language”. Raising The Bar. Hong Kong, Mar 2015
• “Why Structural Relationships Between Human Representation Languages Are Efficiently Learnable: The Magic Number 4”. HKU Spring Symposium, Science of Learning. Hong Kong, Feb 2015
• “Augmenting Human Communication: What Doesn't Translate, and What is the Cost of Not Translating?”. SummerDavos, World Economic Forum—Annual Meeting of the New Champions. Tianjin, China, Sep 2014
• “The GAGO Principle”. QTLeap. Lisbon, Portugal, Mar 2014
• “Language Structures Thought! Learning Relationships in Big Data”. HKUST Science-for-Lunch, Institutional Advancement and Outreach Committee of the University Council. Hong Kong, Dec 2013
• “Translation Memories or Machine Learning? The Science of Statistical Machine Translation”. U-STAR Workshop. Gurgaon, India, Nov 2013
• “Music in Translation: Artificial Intelligence and the Languages of World Music”. Conference of the World Music Expo (WOMEX). Cardiff, UK, Oct 2013
• “Semantic SMT Without Hacks”. 4th Workshop on South and Southeast Asian NLP (WSSANLP). Nagoya, Japan, Oct 2013
• “Re-Architecting The Core: What SMT Should Be Teaching Us About Machine Learning”. Recent Advances in Natural Language Processing (RANLP). Hissar, Bulgaria, Sep 2013
• “Tutorial on Deeply Integrated Semantic Statistical Machine Translation”. Recent Advances in Natural Language Processing (RANLP). Hissar, Bulgaria, Sep 2013
• “Tutorial on Tree-structured, Syntactic, and Semantic SMT”. Machine Translation Summit XIII. Xiamen, China, Sep 2011
• “Master seminar on Syntax, Semantics, and Structure in Statistical Machine Translation”. Universitat Politècnica de Catalunya (UPC). Barcelona, May 2011
• “Meaningful Statistical Machine Translation: Semantic MT and Semantic MT Evaluation”. National Natural Language Processing Research Symposium (NNLPRS). De La Salle University, Manila, Philippines, Nov 2010
• “Inversion Transduction Grammars, Language Universals, and Tree-Based Statistical Machine Translation”. National Natural Language Processing Research Symposium (NNLPRS). De La Salle University, Manila, Philippines, Nov 2010
• “The Future of Machine Translation: Statistics + Syntax + Semantics”. Asian Applied Natural Language Processing for Linguistics Diversity and Language Resource Development (ADD-5). Bangkok, Feb 2010
• “Toward Machine Translation with Statistics and Syntax and Semantics”. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Merano, Italy, Dec 2009
• “SMT with Semantic Roles”. Japan-China Joint Conference on NLP (JCNLP). Okinawa, Nov 2009
• Panel talk. Third Linguistic Annotation Workshop (LAW III) at ACL/IJCNLP. Singapore, Aug 2009
• “Is There a Future for Semantics?”. Workshop on Semantic Evaluations (SemEval 2010) at NAACL. Boulder, Colorado, Jun 2009
• “Structured Models in Statistical Machine Translation”. CASIA-HKUST Workshop. Beijing, Apr 2009
• “WSD for Semantic SMT: Phrase Sense Disambiguation”. Second Symposium on Innovations in Machine Translation Technologies (IMTT). Tokyo, Mar 2008
• “Syntax and Semantics in Statistical Machine Translation”. 4th Young Scholar Symposium on Natural Language Processing (YSSNLP). Suzhou, Oct 2007
• Panel talk. Machine Translation Summit XI. Copenhagen, Sep 2007
• Panel talk. 11th Conference on Theoretical and Methodological Issues in Machine Translation (TMI). Skövde, Sweden, Sep 2007
• “Tutorial on Inversion Transduction Grammars and the ITG Hypothesis: Tree-Structured Statistical Machine Translation”. TC-STAR OpenLab on Speech Translation. Trento, Italy, Mar 2006
• Keynote. Nokia Academic Summit. Beijing, Dec 2005
• “Statistical vs. Compositional vs. Example-Based Machine Translation”. EBMT-II: 2nd EBMT Workshop at Machine Translation Summit X. Phuket, Thailand, Sep 2005
• “Directions in Tree Structured Statistical Machine Translation”. JCNLP 2005: 5th Japan-China Natural Language Processing Joint Research Promotion Conference X. Nov 2005
• “Tutorial on Inversion Transduction Grammars and the ITG Hypothesis: Tree-Based Statistical Machine Translation”. Johns Hopkins Summer Language Workshop 2005. Baltimore, Jul 2005
• “Overcoming Disambiguation Accuracy Plateaus”. MEANING-2005. Trento, Italy, Feb 2005
• Invited talk. DFKI / SJTU Workshop on Promising Language Technology and Real-World Applications. Shanghai, Nov 2004
• Invited talk. 4th China-Japan Joint Conference to Promote Cooperation in Natural Language Processing (CJNLP). Hong Kong, Nov 2004
• “True or False? Every Serious Multilingual Application Needs a Parallel or Comparable Corpus”. LREC Workshop on the Amazing Utility of Parallel Text. Lisbon, Portugal, May 2004
• “The HKUST Leading Question Translation System”. Have We Found the Holy Grail? Machine Translation Summit IX. New Orleans, Sep 2003
• “Managing Multilingual Information: Theory in the Real World, and the Real World in Theory”. RIDE/MLIM: 13th International Workshop on Research Issues on Data Engineering: Multilingual Information Management, at ICDE’03. Hyderabad, India, Mar 2003
• Panel talk. MT-03: DARPA/NIST Machine Translation Workshop. Washington DC, Jul 2003
• “A Position Statement on Chinese Segmentation”. Chinese Language Processing Workshop, University of Pennsylvania, Jul 1998
• “Bilingual Parsing of Parallel Corpora”. Japan Society for the Promotion of Science / HCRL Workshop on New Challenges in Natural Language Processing, Tokyo, May 1998
• “Making Machines Work at Child’s Play: The Decade of Linguistic Computing”. 9th International Computer Expo and Conference (Computer-93). Hong Kong, May 1993
• “Approximate Maximum-Entropy Integration of Syntactic and Semantic Constraints”. ROCLING-IV. Taipei, Sep 1992
South China Morning Post, 8 Jan 2012
Imagine learning how to translate from Chinese to English by reading millions of sentences from Hong Kong's bilingual Legislative Council transcripts.
You look at the Chinese. You look at the English below.
Actually, you don't know either language: you are a cluster of 75 computers in Professor De Kai's computational linguistics and musicology lab at the University of Science and Technology.
But as a machine, you are not only looking at the unfamiliar sentences two at a time, as a person would. Instead, you are using statistics to relate huge heaps of data to one another simultaneously.
You notice that in thousands of instances, the English “government building” appears in the same chunk of text as the Chinese phrase for it, so it is highly probable that these chunks mean the same thing.
You study these bilingual patterns, billions of them, cranking away at your algorithms.
Mostly, you work unguided, making your own dictionary as you go along based on the multitude of connections you detect between groups of words. When you make a mistake, a human researcher may correct you with a few programming keystrokes, and over time, you learn to make the right associations in the right context.
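The co-occurrence idea described above can be sketched in a few lines of Python. This is only a toy illustration, not the lab's actual system: it assumes whitespace-separated words, uses the Dice coefficient as a stand-in association score, and the romanized sentence pairs and the function names `cooccurrence_scores` and `best_translation` are invented for the example.

```python
from collections import Counter
from itertools import product


def cooccurrence_scores(sentence_pairs):
    """Score every (source word, target word) pair with the Dice coefficient."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        # Count each word at most once per sentence pair.
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))
    # Dice = 2 * (times seen together) / (times each was seen at all).
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in pair_count.items()
    }


def best_translation(scores, source_word):
    """Pick the target word most strongly associated with source_word."""
    candidates = {t: v for (s, t), v in scores.items() if s == source_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy corpus (romanized Chinese), made up for illustration.
pairs = [
    ("government building", "zhengfu dalou"),
    ("government policy", "zhengfu zhengce"),
    ("office building", "bangong dalou"),
]
scores = cooccurrence_scores(pairs)
print(best_translation(scores, "government"))
print(best_translation(scores, "building"))
```

With only three sentence pairs the scores are crude, but words that keep turning up together already outscore words that merely share one sentence, which is the effect the article describes at a scale of millions of sentences.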
This is the world of computational linguistics: a field that strives to model natural human language on the computer. Google Translate and Siri are some of the recent products of these hi-tech linguists.
But even a decade ago (which in the cyberage is more like a century), De Kai was already achieving accuracy rates of 86 to 96 per cent in translation between English and Chinese from a set of computers that, yes, really did read Legco documents.
His program did not just operate with individual words, but used chunks of other, smaller chunks in relation to each other, a mathematical model called inversion transduction grammars which enormously sped up the process of learning to translate.
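One way to picture the "chunks of smaller chunks" idea is that an inversion transduction grammar combines two adjacent chunks either in the same order in both languages (straight) or in swapped order (inverted). The sketch below is an illustration under that assumption, not De Kai's implementation: it takes a word reordering given as a permutation of 0..n-1 and checks, shift-reduce style, whether it can be built purely from such straight and inverted combinations. The function name `itg_reorderable` is invented for the example.

```python
def itg_reorderable(permutation):
    """Return True if the reordering can be built by recursively combining
    adjacent chunks in straight or inverted order.

    Assumes `permutation` is a rearrangement of 0..n-1, where entry i is the
    target-side position of the i-th source word.
    """
    stack = []  # each entry is a (low, high) range of target positions
    for position in permutation:
        stack.append((position, position))
        # Keep merging the top two chunks while their ranges are adjacent.
        while len(stack) >= 2:
            (lo1, hi1), (lo2, hi2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2:    # straight: same order in both languages
                merged = (lo1, hi2)
            elif hi2 + 1 == lo1:  # inverted: order swapped in the other language
                merged = (lo2, hi1)
            else:
                break
            stack[-2:] = [merged]
    # Success means everything collapsed into a single chunk.
    return len(stack) == 1


print(itg_reorderable([1, 0, 3, 2]))  # two local swaps
print(itg_reorderable([1, 3, 0, 2]))  # an "inside-out" reordering
```

The first reordering collapses into a single chunk, while the second leaves four chunks that never become adjacent. Ruling out such inside-out reorderings narrows the search space considerably, which is one plausible reading of how the model sped up learning.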
For this work, he was last month honoured as one of only 17 founding fellows around the world of the Association for Computational Linguistics, and the only one from China.
He pioneered the computational study of English-Chinese language pairs, which no one else was doing at the time; the US first put funding into Chinese translations around 1999, years after De Kai started his work. In 1995 he launched Silc, a multilingual engine that handled the first web translations from English to Chinese.
“First-mover technology takes time,” he said, pointing out that Google Translate is still not making money. He said developers in Asia had to be patient investing in new fields.
But computational linguistics is gaining ground: De Kai is just about to close multimillion-dollar translation technology research projects with the European Union and the Defense Advanced Research Projects Agency, the US military arm that funds most US computer science research.
Decades ago, machines would try to learn English grammar by simply processing millions of sentences and trying to find a common structure.
“That's like tying a child to a chair, blindfolding the child, and making them hear millions of sentences of English,” De Kai said.
His computers are not trying to “learn” English or Chinese, per se, at least not separately. For translation to work, what the machines need to do is figure out the relationships between the two languages and then match them. That's how humans learn language, too: a child from birth to six or seven years old is constantly matching relationships between what they sense in their environment to the spoken language they hear.
They learn that the round thing on the ground is a ball because they hear their parents say “ball” repeatedly around the object, even if it is mixed up with other words, and so they eventually associate the image with the sound. They, like the computers, are making correlations, not between Chinese and English, but between the language of their environment and the language of words.
In this sense, we are all translators. A child translates an action into a meaning. De Kai's machines translate from one language to another. A newspaper reader translates the text on the page into a narrative.
“All your thinking and cognition is taking the world as you see it and translating to an internal language,” De Kai said.
The original meaning of translate is to move or transform between one place or form and another. To be translatable, then, means to be able to be shifted, to be transformed.
De Kai is used to translating. He grew up in the US Midwest, but from the age of seven went back to China, including Hong Kong, in the summer. He remembers seeing the disparity between the US and post-Cultural Revolution China, and how much of it had to do with language and culture.
“You see the cultural disconnects, that English speakers aren't understanding something about the Chinese situation and vice versa.
“This is where we can make a difference,” De Kai said.
Unconventional thinker sees the science in song
Jaws dropped when Professor De Kai first stepped into the classroom in combat boots and unevenly shaved red hair—a get-up common in Berkeley but shocking in early 1990s Hong Kong.
“This was before people even dyed their hair,” he recalls.
Today De Kai sports a thick goatee and just a hint of the wild hair he sported years ago. But the rocker in him hasn't changed, according to his linguistics students, who find him at concerts, where he performs as the percussionist and keyboardist for fusion electronica band ReOrientate.
The local group, formed in 2010, combines styles from all over the world, including China (they've got guzheng and erhu players) and India (their singer is Nigerian-Indian).
All of their music has a base in flamenco, a musical form now associated with Spain but which De Kai has traced back to gypsies in northern India along the Chinese border. These nomads eventually travelled to Europe on the Silk Road.
De Kai's musical passion began early. At age three, he was playing with radios, learning the piano and building synthesisers from scratch.
To De Kai, music-making is not separate from his day job. Music is already recognised as a universal language of rhythms, cadences, melodies and grooves, he says. And just as languages borrow words and phrases across borders and cultures, so can music.
He describes his work as “a search for the cross-cultural beauty of language, music, and cognition”.
“Understanding what is universal among humans is what lets us understand and listen to each other,” he said.
— Kanglei Wang