• Alibek Jakupov

Evolution of Machine Learning and Music : Grace and Beauty

Updated: Nov 19



It all started with...


It all started with statistical methods that were first discovered and then refined. Then, in the 1950s, the world saw pioneering machine learning research conducted using simple algorithms. Later, Bayesian methods were introduced for probabilistic inference, followed by the so-called 'AI Winter' in the 1970s, as people were not entirely convinced of AI's effectiveness. But as the proverb says, where there is a will there is a way, and the rediscovery of backpropagation caused a new wave of AI research. Shortly afterwards, in the 1990s, a dramatic change shifted the whole field from a knowledge-based to a data-driven approach. Scientists started creating software to analyze large amounts of data. The main goal was to rediscover the natural laws underlying the observations, in other words, to learn from the initial data by drawing logical conclusions. In this period of machine learning history, algorithms such as support vector machines (SVMs) and recurrent neural networks (RNNs) came into common use. It also saw the start of the fields of computational complexity via neural networks and super-Turing computation. The early 2000s saw the rise of support vector clustering and other kernel methods, as well as unsupervised algorithms. From the 2010s onward, deep learning became feasible, which led to a wide range of applications based on machine learning algorithms.


And what about other spheres of life, such as music? How did music change over time as AI steadily became part of our everyday lives? Here we are going to discuss the evolution of machine learning alongside the most popular songs of each period. Up we go.



1763 : Mozart family grand tour, William Boyce's "At length, th’imperious Lord of War" and Bayes' Theorem


On January 1, 1763 the world saw the very first performance of William Boyce's "At length, th’imperious Lord of War". Later that year, the family of Wolfgang Amadeus Mozart set out on a European grand tour, reaching Paris before the year was out.


Thomas Bayes's work "An Essay towards solving a Problem in the Doctrine of Chances" was published in 1763, two years after his death. Bayes's friend Richard Price edited and amended the work before publishing it. This outstanding work underpins the famous Bayes' theorem that we still apply in machine learning tasks.
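
To make the theorem concrete, here is a minimal sketch in Python. The spam-filter numbers and the function name posterior are invented purely for illustration and come from neither Bayes nor Price.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H after observing evidence E."""
    # Total probability of seeing the evidence, whether or not H is true.
    evidence = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E).
    return true_positive_rate * prior / evidence


# Hypothetical numbers: 1% of emails are spam; a filter flags 90% of spam
# and 5% of legitimate mail.
p_spam_given_flag = posterior(prior=0.01, true_positive_rate=0.90, false_positive_rate=0.05)
print(f"P(spam | flagged) = {p_spam_given_flag:.3f}")
```

With these made-up rates, a flagged email is spam only about 15% of the time, which shows how strongly the prior matters.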



1800s : Beethoven's Symphony No. 3, Niccolò Paganini's Europe tour and Adrien-Marie Legendre's "méthode des moindres carrés"



On the 7th of April, 1805 Beethoven publicly presented his Symphony No. 3, Eroica, at the Theater an der Wien in Vienna. This event marked the beginning of his middle period. At the same time Niccolò Paganini started touring in Europe.


In 1805 Adrien-Marie Legendre described the "méthode des moindres carrés" (the least squares method), which is now commonly applied in data fitting.
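
As an illustration of what least squares does in data fitting, the sketch below fits a straight line to a handful of points using the closed-form solution. The function name fit_line and the sample data are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```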



1812 : Franz Schubert, Antonio Salieri and Pierre-Simon Laplace



On the 26th of July the fifteen-year-old Franz Schubert made his last appearance as a chorister at the Imperial Chapel in Vienna. In the same year Antonio Salieri presented to the world his Gesù al limbo für Soli.


The same year Pierre-Simon Laplace published his outstanding work "Théorie Analytique des Probabilités", which defined what is now known as Bayes' Theorem. In it he expanded upon the work of Bayes, which had been published in 1763 by Bayes's friend Richard Price.



1913: Irving Berlin, Otto Harbach and Andrey Markov



A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.

(Quote from Wikipedia)


In 1913 Andrey Markov first presented the techniques he had used to analyze a poem. We now know these techniques as Markov chains.
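
Markov's study is usually described as counting how vowels and consonants follow one another in Pushkin's "Eugene Onegin". The sketch below estimates such transition probabilities for any text. The function name, the English vowel set, and the treatment of "y" as a consonant are simplifying assumptions, not a reconstruction of his method.

```python
from collections import Counter, defaultdict

VOWELS = set("aeiou")  # assumption: plain English vowels, "y" counted as a consonant


def transition_probabilities(text):
    """Estimate P(next state | current state) over vowel (V) / consonant (C) states."""
    states = ["V" if ch in VOWELS else "C" for ch in text.lower() if ch.isalpha()]
    counts = defaultdict(Counter)
    # Count each observed transition between consecutive letters.
    for prev, nxt in zip(states, states[1:]):
        counts[prev][nxt] += 1
    # Normalize each row so that the probabilities sum to one.
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


probs = transition_probabilities("banana")
print(probs)
```

In "banana" every consonant is followed by a vowel and every vowel by a consonant, so both estimated probabilities are 1.0. A longer text gives a less trivial table.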


At the same time, "Daddy, Come Home" by Irving Berlin was first published. This humorous song told the story from the point of view of a young boy calling his father on the telephone to ask him to leave work and deal with an assortment of family problems at home.

Also in 1913 Otto Harbach, an American lyricist and librettist of about 50 musical comedies, presented "The Bubble".



1950: "Goodnight, Irene", Pink Champagne and Learning Machine



In 1950 Alan Turing formulated the principle of a so-called 'learning machine'. This principle stated that the 'learning machine' was able to learn and become artificially intelligent. This specific proposal prefigured the family of genetic algorithms. In one of the previous articles we covered an implementation of the genetic algorithm applied to the traveling salesman problem.


"Goodnight, Irene" (or "Irene, Goodnight"), a song in 3/4 time, became a 20th-century American folk standard. Although it was first recorded by American blues musician Huddie 'Lead Belly' Ledbetter in 1933, it was the 1950 version by Gordon Jenkins and The Weavers that became the year-end number-one single. In the same year "Pink Champagne" by Joe Liggins was released and became the most popular song in the R&B/Soul/Hip-hop category.



1951: "Cold, Cold Heart", "Too Young" and the first Neural Network



1951 saw the rise of SNARC, the first neural network machine able to learn. It was developed by Marvin Minsky and Dean Edmonds. SNARC stands for Stochastic Neural Analog Reinforcement Calculator; it was a randomly connected network of approximately 40 Hebb synapses.


The most popular pop single of that period was "Too Young" by Nat King Cole. The music was written by Sidney Lippman and the lyrics by Sylvia Dee, and Nat King Cole recorded the most popular version. The same year Hank Williams recorded "Cold, Cold Heart", a country music and pop song.



1953: Machines Playing Checkers, "Kaw-Liga" and "Song from Moulin Rouge"



The most popular pop song of 1953 was the "Song from Moulin Rouge" by Percy Faith. The music was composed by Georges Auric, and William Engvick wrote an English version of the original French lyrics by Jacques Larue. The most popular country single was "Kaw-Liga" by Hank Williams.


Around this time Arthur Samuel joined IBM's Poughkeepsie Laboratory, where he began work on some of the very first machine learning programs, which played checkers.



1957: Perceptron, "All Shook Up" and My Fair Lady


In 1957 Frank Rosenblatt, then working at the Cornell Aeronautical Laboratory, invented the perceptron. The invention was a true breakthrough: it generated a great deal of excitement and was widely covered in the media.


This year also saw "All Shook Up" by Elvis Presley become the most popular pop single, while the most popular pop album was the original cast recording of My Fair Lady. Quote from Wikipedia:

My Fair Lady is a musical based on George Bernard Shaw's Pygmalion, with book and lyrics by Alan Jay Lerner and music by Frederick Loewe. The story concerns Eliza Doolittle, a Cockney flower girl who takes speech lessons from professor Henry Higgins, a phoneticist, so that she may pass as a lady. The original Broadway and London shows starred Rex Harrison and Julie Andrews. The musical's 1956 Broadway production was a notable critical and popular success. It set a record for the longest run of any show on Broadway up to that time. It was followed by a hit London production, a popular film version, and many revivals. My Fair Lady has been called "the perfect musical".


1963: Tic-Tac-Toe, West Side Story and Little Johnny Taylor



Donald Michie was a British researcher in machine learning who worked for the Government Code and Cypher School at Bletchley Park during World War II, where he worked on breaking a German teleprinter cipher called "Tunny". In 1963 he created a 'machine' made of 304 matchboxes and beads that played tic-tac-toe. In this research he applied reinforcement learning.


In 1963, as in the year before, the most popular pop album was the West Side Story soundtrack. Released in 1961, the soundtrack spent 54 weeks at No. 1 on Billboard's album charts, giving it the longest run at No. 1 of any album in history.


The most popular soul single was "Part Time Love" by Little Johnny Taylor, an American blues and soul singer who made recordings throughout the 1960s and 1970s.



1967: Nearest Neighbors, Aretha Franklin and Greatest Hits



Everyone studying machine learning is familiar with the nearest neighbor algorithm, which was created in 1967. It was one of the first approaches to solving the famous traveling salesman problem.


These are the steps of the algorithm:
  1. Initialize all vertices as unvisited.

  2. Select an arbitrary vertex, set it as the current vertex u. Mark u as visited.

  3. Find out the shortest edge connecting the current vertex u and an unvisited vertex v.

  4. Set v as the current vertex u. Mark v as visited.

  5. If all the vertices in the domain are visited, then terminate. Else, go to step 3.

Quote from Wikipedia
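
The steps quoted above can be sketched in a few lines of Python. The function name nearest_neighbor_tour, the use of 2D points with Euclidean distance, and the sample coordinates are assumptions made for illustration. Note that this greedy heuristic produces a tour quickly but not necessarily the shortest one.

```python
import math


def nearest_neighbor_tour(points, start=0):
    """Greedy tour: always move to the closest vertex that is still unvisited."""
    unvisited = set(range(len(points)))  # step 1: all vertices start unvisited
    current = start                      # step 2: pick a starting vertex
    unvisited.remove(current)
    tour = [current]
    while unvisited:                     # step 5: stop once everything is visited
        # Step 3: find the shortest edge from the current vertex.
        nxt = min(unvisited, key=lambda v: math.dist(points[current], points[v]))
        # Step 4: move there and mark it as visited.
        unvisited.remove(nxt)
        tour.append(nxt)
        current = nxt
    return tour


# Hypothetical cities on a line, listed out of order.
cities = [(0, 0), (5, 0), (1, 0), (2, 0)]
tour = nearest_neighbor_tour(cities)
print(tour)
```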


The creation of this algorithm marked the start of basic pattern recognition.


The most popular soul composition of this year was "Respect" by Aretha Franklin. The song was originally written and recorded by Otis Redding in 1965, and it is important to mention that the music in the two versions is significantly different. The most popular album of the same genre was Greatest Hits by The Temptations. This album was released on the Gordy (Motown) label and peaked at #5 on the Billboard 200 album chart.



1969: Marvin Minsky, Seymour Papert and Sugar, Sugar



"Sugar, Sugar", a song written by Jeff Barry and Andy Kim, was originally recorded by the virtual band the Archies. The single reached number one in the US on the Billboard Hot 100 chart in 1969 and remained there for four weeks. The most popular country album was Wichita Lineman by Glen Campbell.


This year also marked the beginning of pessimism about the performance of AI. In 1969 Marvin Minsky and Seymour Papert published their book Perceptrons, which described some of the limitations of perceptrons and neural networks. The view that the book had shown neural networks to be fundamentally limited became a hindrance to later research into neural networks.



1970: Backpropagation, the Jacksons and Bridge over Troubled Water



The Jackson 5 (stylized as the Jackson 5ive), later known as the Jacksons, were an American pop band composed of members of the Jackson family. The group was founded in 1964 in Gary, Indiana by brothers Jackie, Tito, and Jermaine, with younger brothers Marlon and Michael Jackson joining soon after.

quote from Wikipedia


In 1970 "I'll Be There", a soul song written by Berry Gordy, Hal Davis, Bob West, and Willie Hutch and recorded by The Jackson 5, became the year-end number-one single. The most popular pop single was "Bridge over Troubled Water" by Simon & Garfunkel.


In 1970 Seppo Linnainmaa, a Finnish mathematician and computer scientist, published the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions. Data scientists may recognize it: this work described the approach that we know today as backpropagation.



1979: Students at Stanford University, Anne Murray and C'est Chic



In 1979 students at Stanford University carried out an interesting piece of research that aimed to create a cart able to navigate and avoid obstacles in a room. It was a significant step in the history of machine learning.


American R&B band Chic released their second studio album on Atlantic Records in 1978. This album was called C'est Chic and became the most popular R&B album of 1979. The most popular country single was "I Just Fall in Love Again" by Anne Murray.



1980: Call Me, ANN and Let's Get Serious



Who nowadays doesn't know convolutional neural networks (CNNs)? They are used everywhere, especially in computer vision tasks. But did you know that they were mainly inspired by the neocognitron? Kunihiko Fukushima, a Japanese computer scientist most noted for his work on artificial neural networks and deep learning, first published his work on the neocognitron in 1980. The neocognitron is a hierarchical, multilayered artificial neural network (ANN). It has been used for handwritten character recognition and other pattern recognition tasks.


Released in the US in early 1980 as a single, "Call Me" by Blondie was No. 1 for six consecutive weeks on the Billboard Hot 100 chart, where it became the band's biggest single and second No. 1, making it the most popular pop single of the year.


Jermaine La Jaune Jackson is an American singer, songwriter, bass guitarist, and member of The Jackson 5. In 1980 the title track from his album Let's Get Serious on Motown Records became the most popular song in the R&B/Soul/Hip-hop category. The song was written by Lee Garrett and Stevie Wonder.



1981: Explanation Based Learning, Endless Love and Kim Carnes



In 1981 Gerald DeJong introduced the principles of Explanation Based Learning. The main idea is a computer algorithm that analyzes data and derives a general rule from it. Moreover, the algorithm is able to discard unimportant data.


"Bette Davis Eyes", a song written and composed by Donna Weiss and Jackie DeShannon, became the hit of 1981. Although it was first recorded in 1974 by DeShannon, it was made popular by American singer Kim Carnes. The 1981 version spent nine weeks at No. 1 on the Billboard Hot 100 and was Billboard's biggest hit of 1981.


"Endless Love", a song written by Lionel Richie and originally recorded as a duet between Richie and fellow R&B singer Diana Ross, was the most popular R&B track of 1981.



1982 : RNN, Stevie Wonder and Olivia Newton-John



In 1982 John Hopfield, an American scientist, popularized associative neural networks, now mostly known as Hopfield networks. A Hopfield network is a type of recurrent neural network that serves as a content-addressable memory system with binary threshold nodes. However, it is important to mention that the principle had been described earlier by Little in 1974.
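
To show what "content-addressable memory with binary threshold nodes" means in practice, here is a minimal sketch that stores one pattern with a Hebbian rule and recovers it from a corrupted copy. The function names, the +1/-1 encoding and the fixed number of update sweeps are illustrative assumptions, not Hopfield's original formulation.

```python
def train(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, sweeps=5):
    """Update nodes one at a time with a binary threshold until the sweeps run out."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            field = sum(weights[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if field >= 0 else -1
    return state


stored = [1, -1, 1, -1, 1, -1]
noisy = [-1, -1, 1, -1, 1, -1]  # the first bit has been flipped
restored = recall(train([stored]), noisy)
print(restored)
```

The network is addressed by content: a partial or noisy version of a memory is enough to retrieve the whole stored pattern.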


In 1982 Stevland Hardaway Morris, better known by his stage name Stevie Wonder, an American singer, songwriter, musician and record producer, released "That Girl", the leading single from his 1982 greatest-hits compilation. The most popular pop single of 1982 was "Physical", a song by British-born Australian singer Olivia Newton-John from her twelfth studio album Physical.



1985: NetTalk, Careless Whisper and Lost in the Fifties Tonight



In 1985 Terry Sejnowski developed NetTalk. NetTalk was a program that learned to pronounce words the same way a baby does.


"Careless Whisper", a pop ballad written by George Michael and Andrew Ridgeley of Wham!, was released on 24 July 1984 on the Wham! album Make It Big and became the most popular single in 1985.


The song features a prominent saxophone riff, and has been covered by a number of artists since its first release. It was released as a single and became a huge commercial success around the world. It reached number one in nearly 25 countries, selling about 6 million copies worldwide—2 million of them in the United States.

quote from Wikipedia


The same year "Lost in the Fifties Tonight", the title track of the seventeenth studio album by country music artist Ronnie Milsap, became the most popular country single.



1986: Backpropagation, Whitney Houston and On My Own


1986 was definitely one of the most outstanding periods for Whitney Houston.

In 1986, at the 28th Grammy Awards, Whitney Houston received four nominations, including Album of the Year, and won one: Best Pop Vocal Performance, Female for "Saving All My Love for You".

Her album, called "Whitney Houston", became the best-selling album in both the Pop and RnB categories. This album eventually topped the Billboard 200 for 14 weeks in 1986, generating three number-one singles.


The most popular RnB single was "On My Own" by Patti Labelle and Michael McDonald.


And what about Machine Learning? In 1986 Seppo Linnainmaa's reverse mode of automatic differentiation was used in experiments by David Rumelhart, Geoff Hinton and Ronald J. Williams to learn internal representations.

Knowledge representation and reasoning (KR², KR&R) is the field of artificial intelligence (AI) dedicated to representing information about the world in a form that a computer system can utilize to solve complex tasks such as diagnosing a medical condition or having a dialog in a natural language. Knowledge representation incorporates findings from psychology about how humans solve problems and represent knowledge in order to design formalisms that will make complex systems easier to design and build. Knowledge representation and reasoning also incorporates findings from logic to automate various kinds of reasoning, such as the application of rules or the relations of sets and subsets. Examples of knowledge representation formalisms include semantic nets, systems architecture, frames, rules, and ontologies. Examples of automated reasoning engines include inference engines, theorem provers, and classifiers. The KR conference series was established to share ideas and progress on this challenging field.

quote from Wikipedia


However, it is important to mention that this mode of automatic differentiation was first applied to neural networks by Paul Werbos, an American social scientist and machine learning pioneer.



1989 : Q-learning, Bobby Brown and Superwoman



In 1989 Christopher Watkins developed Q-learning, a model-free reinforcement learning algorithm that greatly improved the practicality and feasibility of reinforcement learning. The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances. It does not require a model of the environment, which is why it is called model-free. Q-learning can handle problems with stochastic transitions and rewards without requiring adaptations.
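
A minimal sketch of tabular Q-learning on a toy problem may help here. The environment (a five-cell corridor with a reward at the right end), the hyperparameters and the function name are all assumptions chosen for illustration. The agent never sees a model of the corridor, only the states and rewards it experiences.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values for a corridor where only the last cell gives a reward."""
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 = step left, action 1 = step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])       # exploit, breaking ties at random
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            nxt = min(max(state + moves[action], 0), goal)
            reward = 1.0 if nxt == goal else 0.0
            # Q-learning update: move the estimate towards
            # reward + discounted value of the best next action.
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q_table = q_learning()
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

The learned policy is read off the table by taking the action with the highest value in each state. In this corridor that should be "right" everywhere.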


Another huge breakthrough of that period was the start of the commercialization of artificial intelligence on personal computers. In 1989 Axcelis, Inc. released Evolver, a software package that allowed users to solve a wide variety of optimization problems. It was the first software package to commercialize the use of genetic algorithms on personal computers, which marked the beginning of a new era in machine learning.


Don't Be Cruel, the second studio album by American singer Bobby Brown, was released in the United States on June 20, 1988 by MCA Records. It became the most popular pop album of 1989.


The most popular RnB single of 1989 was "Superwoman" by Karyn White. The single appeared on her debut album.



1992 : TD-Gammon, Jodeci and Boyz II Men



Backgammon is one of the oldest known board games. Its history can be traced back nearly 5,000 years to archeological discoveries in Mesopotamia. It is a two player game where each player has fifteen pieces (checkers) which move between twenty-four triangles (points) according to the roll of two dice. The objective of the game is to be first to bear off, i.e. move all fifteen checkers off the board. Backgammon is a member of the tables family, one of the oldest classes of board games.

quote from Wikipedia


In 1992 Gerald Tesauro at IBM's Thomas J. Watson Research Center developed TD-Gammon, a computer backgammon program. This program used an artificial neural network trained using temporal-difference learning, specifically TD-lambda. TD-Gammon was able to rival the abilities of top human backgammon players, but it could not consistently surpass them.


"End of the Road", a single recorded by American R&B group Boyz II Men for the Boomerang soundtrack, was released in 1992. It was written and produced by Kenneth "Babyface" Edmonds, L.A. Reid and Daryl Simmons, and became the most popular pop song of the year.

The most popular RnB album of the year was Forever My Lady, the debut studio album by American R&B quartet Jodeci.



1995 : Random Forest, SVM, Gangsta's Paradise and Mary J. Blige



Everyone working with machine learning algorithms is familiar with the random forest algorithm.

Random forests or random decision forests are an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees. Random decision forests correct for decision trees' habit of overfitting to their training set. The first algorithm for random decision forests was created by Tin Kam Ho using the random subspace method, which, in Ho's formulation, is a way to implement the "stochastic discrimination" approach to classification proposed by Eugene Kleinberg. An extension of the algorithm was developed by Leo Breiman and Adele Cutler, who registered "Random Forests" as a trademark (as of 2019, owned by Minitab, Inc.). The extension combines Breiman's "bagging" idea and random selection of features, introduced first by Ho and later independently by Amit and Geman in order to construct a collection of decision trees with controlled variance.

quote from Wikipedia


But did you know that the algorithm was first described in a research paper by Tin Kam Ho published in 1995? Extensions of this algorithm are used nowadays as non-parametric methods and make it possible to build powerful and efficient machine learning solutions.


Another great breakthrough was the discovery of Support Vector Machines when Corinna Cortes and Vladimir Vapnik published their work.

In machine learning, support-vector machines (SVMs, also support-vector networks) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier (although methods such as Platt scaling exist to use SVM in a probabilistic classification setting). An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall. In addition to performing linear classification, SVMs can efficiently perform a non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.

quote from Wikipedia

Undoubtedly these algorithms had a great impact on the industrial Machine Learning as we know it today.


My Life, the second studio album by American RnB recording artist Mary J. Blige, was released in 1994 by Uptown Records and became the most popular RnB album of 1995.

The most popular pop song was "Gangsta's Paradise", a single by American rapper Coolio featuring singer L.V.

The song was listed at number 85 on Billboard's Greatest Songs of All-Time and number one biggest selling single of 1995 on U.S. Billboard. In 2008, it was ranked number 38 on VH1's 100 Greatest Songs of Hip Hop. Coolio was awarded a Grammy for Best Rap Solo Performance, two MTV Video Music Award's for Best Rap Video and for Best Video from a Film and a Billboard Music Award for the song/album. The song was voted as the best single of the year in The Village Voice Pazz & Jop critics poll. The song has sold over 5 million copies in the United States, United Kingdom and Germany alone, and at least 6 million worldwide, making it one of the best-selling singles of all time. Coolio has performed this song live at the 1995 Billboard Music Awards with L.V. and Wonder, at the 38th Annual Grammy Awards with L.V., and also with Dutch singer Trijntje Oosterhuis.

quote from Wikipedia



1997: Kasparov, LSTM, Spice Girls and Elton John



In 1997 machine learning reached a great milestone when IBM's Deep Blue beat the world chess champion, Garry Kasparov, whom many consider to be the greatest chess player of all time.

Deep Blue was a chess-playing computer developed by IBM. It is known for being the first computer chess-playing system to win both a chess game and a chess match against a reigning world champion under regular time controls. Deep Blue won its first game against a world champion on 10 February 1996, when it defeated Garry Kasparov in game one of a six-game match. However, Kasparov won three and drew two of the following five games, defeating Deep Blue by a score of 4–2. Deep Blue was then heavily upgraded, and played Kasparov again in May 1997. Deep Blue won game six, therefore winning the six-game rematch 3½–2½ and becoming the first computer system to defeat a reigning world champion in a match under standard chess tournament time controls. Kasparov accused IBM of cheating and demanded a rematch. IBM refused and dismantled Deep Blue. Development for Deep Blue began in 1985 with the ChipTest project at Carnegie Mellon University. This project eventually evolved into Deep Thought, at which point the development team was hired by IBM. The project evolved once more with the new name Deep Blue in 1989. Grandmaster Joel Benjamin was also part of the development team.

quote from Wikipedia


Another great achievement was the invention of long short-term memory recurrent neural networks (LSTM) by Sepp Hochreiter and Jürgen Schmidhuber. This algorithm greatly improved the efficiency and practical value of recurrent neural networks.

Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the field of deep learning. Unlike standard feedforward neural networks, LSTM has feedback connections that make it a "general purpose computer" (that is, it can compute anything that a Turing machine can). It can not only process single data points (such as images), but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition. Bloomberg Business Week wrote: "These powers make LSTM arguably the most commercial AI achievement, used for everything from predicting diseases to composing music." A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.

quote from Wikipedia


The most popular pop song of 1997 was "Something About the Way You Look Tonight", a song by Elton John, released in 1997 as the first single from his 26th studio album The Big Picture.

Billboard said the song is "a grandly executed ballad that washes John's larger-than-life performance in cinematic strings and whooping, choir-styled backing vocals. An instant fave for die-hards, this single will bring kids at top 40 to the table after a few spins."

quote from Wikipedia

The most popular album of the year was Spice by Spice Girls.



1998 : MNIST, Too Close and Titanic



If you have ever worked with Computer Vision you are definitely familiar with the MNIST database. A team led by Yann LeCun released the MNIST database in 1998. MNIST was a dataset comprising a mix of handwritten digits from American Census Bureau employees and American high school students. This database is commonly known as a benchmark for evaluating handwriting recognition.

The most popular single of 1998 was "Too Close", by American R&B group Next, featuring uncredited vocals from Vee of Koffee Brown. This single became the most popular in both the pop and RnB categories.

Not surprisingly, the most popular album of 1998 was the Titanic soundtrack as the movie was a true breakthrough.



2002: Torch, Nickelback and The Eminem Show



Everyone working with modern AI applications is familiar with the Torch machine learning library (Python users will know PyTorch). It was first released in 2002. Today it is not only an open-source machine learning library but also a scientific computing framework and a scripting language based on the Lua programming language. It offers a wide range of algorithms for deep learning. Torch uses the scripting language LuaJIT and an underlying C implementation.

The core package of Torch is torch. It provides a flexible N-dimensional array or Tensor, which supports basic routines for indexing, slicing, transposing, type-casting, resizing, sharing storage and cloning. This object is used by most other packages and thus forms the core object of the library. The Tensor also supports mathematical operations like max, min, sum, statistical distributions like uniform, normal and multinomial, and BLAS operations like dot product, matrix-vector multiplication, matrix-matrix multiplication, matrix-vector product and matrix product.

quote from Wikipedia

Nevertheless it is important to mention that as of 2018, Torch is no longer in active development.


The most popular album of 2002 was The Eminem Show, the fourth studio album by American rapper Eminem, released on May 26, 2002 by Aftermath Entertainment, Shady Records, and Interscope Records. This album included the commercially successful singles "Without Me", "Cleanin' Out My Closet", "Superman", and "Sing for the Moment".

The Eminem Show reached number one in nineteen countries, including Australia, Canada, the United Kingdom and the United States, and was the best-selling album of 2002 in the United States, with 7,600,000 copies sold. Since its release in 2002, the album has sold 10,600,000 copies in the United States and over 30 million copies worldwide. At the 2003 Grammy Awards, it was nominated for Album of the Year and became Eminem's third album in four years to win the award for Best Rap Album. On March 7, 2011, the album was certified 10× Platinum (Diamond) by the RIAA, making it Eminem's second album to go Diamond in the United States.

quote from Wikipedia.


The most popular pop single of 2002 was "How You Remind Me", a song by Canadian rock band Nickelback.



2006: The Netflix Prize, Daniel Powter and Some Hearts



The most popular single of 2006 was "Bad Day", a pop song from Canadian singer Daniel Powter's self-titled second studio album.

Although "Bad Day" received mixed critical reviews, with some music critics praising its "universal appeal" while others felt it lacked depth in its lyrics, it was a commercial success. In 2005, the single charted in the top five in more than ten countries worldwide and became the most played song on European radio. After its European success, it was released in the United States where it topped the Billboard Hot 100, Pop 100, Adult Top 40, and Adult Contemporary charts. In 2006, it became the first song ever to sell two million digital copies in the United States. After another million were sold, it was certified three-times platinum by the Recording Industry Association of America (RIAA) in 2009. It was certified platinum in Australia, Canada, and the United Kingdom, gold in Denmark and Germany, and also received a certification in France and Japan.

quote from Wikipedia

The most popular album was Some Hearts, the debut studio album by American singer and songwriter Carrie Underwood, released in the United States on November 15, 2005 by Arista Nashville.


2006 saw a huge event called the Netflix Prize, a competition launched by Netflix. The main goal of the competition was to use machine learning to beat the accuracy of Netflix's own recommendation software, which predicted a user's rating for a film given their ratings for previous films. The aim was to beat the software's accuracy by at least 10%.



2009: ImageNet, Alicia Keys and Low



The Netflix Prize, described in the previous section, was finally won in 2009. Another huge breakthrough was the creation of ImageNet, a large visual database envisioned by Fei-Fei Li of Stanford University. Li recognized that even the best machine learning algorithms would not work well if the data did not reflect the real world. Many developers regard ImageNet as the catalyst for the machine learning boom of the 21st century.


In 2009 the audience chose "Like You'll Never See Me Again", a song by American singer-songwriter Alicia Keys from her third studio album As I Am, as their favorite RnB song.

Upon its release, the song peaked at number twelve on the Billboard Hot 100 and became Keys' second consecutive R&B chart-topper, remaining atop the Hot R&B/Hip-Hop Songs chart for seven weeks. It was ranked number forty-seven on Billboard's Top Hot 100 Hits of 2008. The song went on to replace her single "No One" at number one on the US R&B/Hip-Hop charts. In 2008, the song won two NAACP Image Awards for Outstanding Music Video and Outstanding Song, and an ASCAP Rhythm & Soul Music Award for Top R&B/Hip-Hop Song.

quote from Wikipedia

The most-listened-to pop single was "Low" by Flo Rida featuring T-Pain.



2010: Kaggle, Unthinkable and Susan Boyle



Kaggle is one of the most important, and definitely the most popular, online machine learning competition platforms. It allows passionate engineers, data scientists, data analysts and data engineers to test their skills on real-world industrial cases. It was launched in 2010 as a website hosting machine learning competitions.

Kaggle is an online community of data scientists and machine learners, owned by Google LLC. Kaggle allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges. Kaggle got its start by offering machine learning competitions and now also offers a public data platform, a cloud-based workbench for data science, and short form AI education. On 8 March 2017, Google announced that they were acquiring Kaggle.

quote from Wikipedia


Another remarkable event was I Dreamed a Dream, the debut studio album by Scottish singer Susan Boyle. What is special about this album is the fact that Susan Boyle rose to fame after appearing as a contestant on the third series of Britain's Got Talent, singing "I Dreamed a Dream" from Les Misérables.

The album entered the UK album chart at number 1 and became the fastest-selling debut album ever in the UK, selling 411,820 copies, beating the record previously set by Spirit by Leona Lewis and outselling the rest of the top five albums combined during its first week on sale. The album remained at the top spot for four weeks, becoming the biggest selling album in the UK in 2009. In the U.S., I Dreamed a Dream debuted at No. 1 on the Billboard 200, with 701,000 copies sold in its first week, breaking the record for the highest debut by a new solo female artist in the SoundScan era (post 1991). I Dreamed a Dream became the biggest opening sales week of 2009 in the U.S., beating out Eminem's Relapse which sold 608,000. It was the second-biggest selling album of 2009 in the U.S., with 3.1 million copies sold, right behind Taylor Swift's Fearless at 3.2 million copies. In only six weeks of sales, it became the biggest selling album in the world for 2009.

quote from Wikipedia


As in 2009, the most popular RnB singer was still Alicia Keys, whose single "Un-Thinkable (I'm Ready)" became the most popular RnB song.


In the next chapter we are going to cover the 2010s, the period that we are actually living in.



2011: Beating humans in Jeopardy, Adele and Sure Thing



Jeopardy! is an American television game show created by Merv Griffin. The show features a quiz competition in which contestants are presented with general knowledge clues in the form of answers, and must phrase their responses in the form of questions. The original daytime version debuted on NBC on March 30, 1964, and aired until January 3, 1975. A weekly nighttime syndicated edition aired from September 1974 to September 1975, and a revival, The All-New Jeopardy!, ran on NBC from October 1978 to March 1979. The current version, a daily syndicated show produced by Sony Pictures Television, premiered on September 10, 1984. Both NBC versions and the weekly syndicated version were hosted by Art Fleming. Don Pardo served as announcer until 1975, and John Harlan announced for the 1978–1979 show. Since its inception, the daily syndicated version has featured Alex Trebek as host and Johnny Gilbert as announcer. With over 8,000 episodes aired, the daily syndicated version of Jeopardy! has won a record 33 Daytime Emmy Awards as well as a Peabody Award. In 2013, the program was ranked No. 45 on TV Guide's list of the 60 greatest shows in American television history. Jeopardy! has also gained a worldwide following with regional adaptations in many other countries. The daily syndicated series' 35th season premiered on September 10, 2018.

quote from Wikipedia


In 2011 IBM's Watson beat two human champions in a Jeopardy! competition. This artificial intelligence applied machine learning, natural language processing and information retrieval techniques.

Watson is a question-answering computer system capable of answering questions posed in natural language, developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's first CEO, industrialist Thomas J. Watson. The computer system was initially developed to answer questions on the quiz show Jeopardy! and, in 2011, the Watson computer system competed on Jeopardy! against legendary champions Brad Rutter and Ken Jennings winning the first place prize of $1 million. In February 2013, IBM announced that Watson software system's first commercial application would be for utilization management decisions in lung cancer treatment at Memorial Sloan Kettering Cancer Center, New York City, in conjunction with WellPoint (now Anthem). IBM Watson's former business chief, Manoj Saxena, says that 90% of nurses in the field who use Watson now follow its guidance.

quote from Wikipedia


This event marked an important step in the evolution of machine learning, because artificial intelligence had long struggled to understand the context of the clues. As a result, human players had been able to generate responses faster than a machine, especially to short clues.


The year when AI started surpassing human performance was also the year of Adele's "Rolling in the Deep", which became the most popular song of 2011.

"Rolling in the Deep" is a song recorded by English singer-songwriter Adele for her second studio album, 21. It is the lead single and opening track on the album. The song was written by Adele and Paul Epworth. The singer herself describes it as a "dark blues-y gospel disco tune". The largest crossover hit in the United States from the past 25 years, "Rolling in the Deep" gained radio airplay from many different radio formats. It was first released in 2010 as the lead single from 21 in digital download format. The lyrics describe the emotions of a scorned lover.

quote from Wikipedia


The most popular RnB song of 2011 was Sure Thing by Miguel.



2012 : Cats on YouTube, Gotye and 21



The most popular song of 2012 was "Somebody That I Used to Know", written by Belgian-Australian singer-songwriter Gotye, featuring New Zealander singer Kimbra.

"Somebody That I Used to Know" is a mid-tempo ballad. It samples Luiz Bonfá's instrumental "Seville" from his 1967 album Luiz Bonfa Plays Great Songs. The song received a positive reception from critics, who noted the similarities between the song and works by Sting, Peter Gabriel, and American folk band Bon Iver. In Australia, the song won the Triple J Hottest 100 poll at the end of 2011, as well as ARIA Awards for song of the year and best video, while Kimbra was voted best female artist and Gotye was named best male artist and producer of the year. The song came ninth in the Triple J Hottest 100 of the Past 20 Years, 2013. In 2013, the song won two Grammy Awards for Best Pop Duo/Group Performance and Record of the Year.

quote from Wikipedia

The most popular album of 2012 was 21 by Adele.


In 2012, the Google Brain team, a deep learning artificial intelligence research team at Google led by Andrew Ng and Jeff Dean, created a neural network that learnt to recognize cats by analyzing unlabeled images taken from frames of YouTube videos.

Formed in the early 2010s, Google Brain combines open-ended machine learning research with systems engineering and Google-scale computing resources.

quote from Wikipedia

The project was particularly interesting because Google's brain simulator taught itself to recognize objects without being given labeled examples.



2014 : Facebook, Happy and Frozen



In 2014 Facebook researchers published their work on DeepFace, a deep learning facial recognition system. The system used neural networks that identified faces with 97.35% accuracy. The 2014 results reduced the error of previous systems by more than 27% and rivaled human performance.

DeepFace is a deep learning facial recognition system created by a research group at Facebook. It identifies human faces in digital images. It employs a nine-layer neural net with over 120 million connection weights, organized as a siamese network, and was trained on four million images uploaded by Facebook users. The system is said to be 97% accurate, compared to 85% for the FBI's Next Generation Identification system. One of the creators of the software, Yaniv Taigman, came to Facebook via their 2007 acquisition of Face.com.

quote from Wikipedia


The most popular song of 2014 was Happy by Pharrell Williams.

"Happy" is a song written, produced, and performed by American singer Pharrell Williams, from the Despicable Me 2 soundtrack album. It also served as the lead single from Williams' second studio album, Girl (2014). It was first released on November 21, 2013, alongside a long-form music video. The song was reissued on December 16, 2013, by Back Lot Music under exclusive license to Columbia Records, a division of Sony Music. "Happy" is an uptempo soul and neo soul song on which Williams's falsetto voice has been compared to Curtis Mayfield by critics. The song has been highly successful, peaking at No. 1 in the United States, United Kingdom, Canada, Ireland, New Zealand, and 19 other countries. It was the best-selling song of 2014 in the United States with 6.45 million copies sold for the year, as well as in the United Kingdom with 1.5 million copies sold for the year. It reached No. 1 in the UK on a record-setting three separate occasions and became the most downloaded song of all time in the UK in September 2014; it is the eighth highest-selling single of all time in the country. It was nominated for an Academy Award for Best Original Song. A live rendition of the song won the Grammy Award for Best Pop Solo Performance at the 57th Annual Grammy Awards.

quote from Wikipedia

And the most popular album was the Frozen soundtrack.



2016 : H.O.L.Y., 25 and Beating Humans in Go


The audience chose "H.O.L.Y." as the most popular country single of 2016. The most-listened-to pop album was 25, again by Adele.

25 is the third studio album by English singer-songwriter Adele, released on 20 November 2015 by XL Recordings and Columbia Records. Issued nearly five years after her previous album, the internationally successful 21 (2011), the album is titled as a reflection of her life and frame of mind at 25 years old and is termed a "make-up record". Its lyrical content features themes of Adele "yearning for her old self, her nostalgia", and "melancholia about the passage of time" according to an interview with the singer by Rolling Stone, as well as themes of motherhood and regret. In contrast to Adele's previous works, the production of 25 incorporated the use of electronic elements and creative rhythmic patterns, with elements of 1980s R&B and organs. Like 21, Adele worked with producer and songwriter Paul Epworth and Ryan Tedder, along with new collaborations with Max Martin and Shellback, Greg Kurstin, Danger Mouse, the Smeezingtons, Samuel Dixon, and Tobias Jesso Jr.

quote from Wikipedia


The most important achievement of 2016 was the victory of Google's AlphaGo program over an unhandicapped professional human player. It was the first computer Go program to achieve such a result. The solution combined machine learning and tree search techniques. It was later improved as AlphaGo Zero and then, in 2017, generalized to chess and other two-player games as AlphaZero.

AlphaGo is a computer program that plays the board game Go. It was developed by Alphabet Inc.'s Google DeepMind in London. AlphaGo had three far more powerful successors, called AlphaGo Master, AlphaGo Zero and AlphaZero. In October 2015, the original AlphaGo became the first computer Go program to beat a human professional Go player without handicaps on a full-sized 19×19 board. In March 2016, it beat Lee Sedol in a five-game match, the first time a computer Go program has beaten a 9-dan professional without handicaps. Although it lost to Lee Sedol in the fourth game, Lee resigned in the final game, giving a final score of 4 games to 1 in favour of AlphaGo. In recognition of the victory, AlphaGo was awarded an honorary 9-dan by the Korea Baduk Association. The lead up and the challenge match with Lee Sedol were documented in a documentary film also titled AlphaGo, directed by Greg Kohs. It was chosen by Science as one of the Breakthrough of the Year runners-up on 22 December 2016. At the 2017 Future of Go Summit, its successor AlphaGo Master beat Ke Jie, the world No.1 ranked player at the time, in a three-game match (the even more powerful AlphaGo Zero already existed but was not yet announced). After this, AlphaGo was awarded professional 9-dan by the Chinese Weiqi Association. AlphaGo and its successors use a Monte Carlo tree search algorithm to find its moves based on knowledge previously "learned" by machine learning, specifically by an artificial neural network (a deep learning method) by extensive training, both from human and computer play. A neural network is trained to predict AlphaGo's own move selections and also the winner's games. This neural net improves the strength of tree search, resulting in higher quality of move selection and stronger self-play in the next iteration.

quote from Wikipedia

So why had computer programs been unable to beat human players at Go for so long? The answer lies in the complexity of the game:

Despite its relatively simple rules, Go is very complex. Compared to chess, Go has both a larger board with more scope for play and longer games, and, on average, many more alternatives to consider per move. The lower bound on the number of legal board positions in Go has been estimated to be 2 x 10^170
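As a rough back-of-the-envelope check on that figure, the sketch below compares the quoted estimate with a simple upper bound. It assumes only that each of the 361 points on a 19×19 board is empty, black or white, which gives 3^361 possible colorings. The helper name `order_of_magnitude` is made up for this illustration.

```python
BOARD_POINTS = 19 * 19  # 361 intersections on a full-sized board

# Each point is empty, black, or white, so 3**361 bounds the number of
# board configurations from above (many of these colorings are illegal).
upper_bound = 3 ** BOARD_POINTS

# Estimate quoted in the text for the number of legal positions.
legal_estimate = 2 * 10 ** 170

def order_of_magnitude(n):
    """Return floor(log10(n)) for a positive integer n."""
    return len(str(n)) - 1

print(order_of_magnitude(upper_bound))     # 172
print(order_of_magnitude(legal_estimate))  # 170

# Share of all colorings that the quoted estimate corresponds to.
legal_fraction = legal_estimate / upper_bound
```

Under these assumptions the quoted estimate corresponds to a little over 1% of all possible colorings of the board, a number far too large to enumerate by brute force.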


2020: DeepSpeed, Turing-NLG and Rain on Me



In February 2020 Microsoft introduced DeepSpeed, its deep learning optimization library for PyTorch. The library was used to train Turing-NLG, the "largest language model ever published at 17 billion parameters".


The same year, in May 2020, OpenAI introduced GPT-3, a state-of-the-art autoregressive language model, which was in beta testing by June 2020. The model used deep learning to produce computer code, poetry and other text that was exceptionally similar to, and almost indistinguishable from, text written by humans. With 175 billion parameters, its capacity was roughly ten times that of T-NLG.


The most popular song of 2020 was "Rain on Me" by Lady Gaga and Ariana Grande. Grande became the first solo artist to premiere four songs at No. 1; Gaga became the artist with the greatest gap between No. 1 debuts; and “Rain on Me” became the first all-female collaboration to replace another all-female collaboration (Megan Thee Stallion & Beyoncé's “Savage” remix) at the Hot 100's summit.



Conclusion


It has been a long and fascinating journey, and together we have tried to track the evolution of machine learning against the timeline of music.


So, what is next? No one knows. And this is what is really thrilling about machine learning. It is now our turn to make history, so up we go!


References

  1. Solomonoff, Ray J. "A formal theory of inductive inference. Part II." Information and control 7.2 (1964): 224–254.

  2. Marr, Bernard. "A Short History of Machine Learning – Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.

  3. Siegelmann, Hava; Sontag, Eduardo (1995). "Computational Power of Neural Networks". Journal of Computer and System Sciences. 50 (1): 132–150.

  4. Siegelmann, Hava (1995). "Computation Beyond the Turing Limit". Journal of Computer and System Sciences. 238 (28): 632–637.

  5. Ben-Hur, Asa; Horn, David; Siegelmann, Hava; Vapnik, Vladimir (2001). "Support vector clustering". Journal of Machine Learning Research. 2: 51–86.

  6. Hofmann, Thomas; Schölkopf, Bernhard; Smola, Alexander J. (2008). "Kernel methods in machine learning". The Annals of Statistics. 36 (3): 1171–1220. JSTOR 25464664.

  7. Bennett, James; Lanning, Stan (2007). "The netflix prize" (PDF). Proceedings of KDD Cup and Workshop 2007.

  8. Bayes, Thomas (1 January 1763). "An Essay towards solving a Problem in the Doctrine of Chance" (PDF). Philosophical Transactions. 53: 370–418. doi:10.1098/rstl.1763.0053. JSTOR 105741. Retrieved 15 June 2016.

  9. Legendre, Adrien-Marie (1805). Nouvelles méthodes pour la détermination des orbites des comètes (in French). Paris: Firmin Didot. p. viii. Retrieved 13 June 2016.

  10. O'Connor, J J; Robertson, E F. "Pierre-Simon Laplace". School of Mathematics and Statistics, University of St Andrews, Scotland. Retrieved 15 June 2016.

  11. Hayes, Brian. "First Links in the Markov Chain". American Scientist. Sigma Xi, The Scientific Research Society. 101 (March–April 2013): 92. doi:10.1511/2013.101.1. Retrieved 15 June 2016. "Delving into the text of Alexander Pushkin's novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin's poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction."

  12. Turing, Alan (October 1950). "Computing Machinery and Intelligence". Mind. 59 (236): 433–460. doi:10.1093/mind/LIX.236.433. Retrieved 8 June 2016.

  13. Crevier 1993, pp. 34–35 and Russell & Norvig 2003, p. 17

  14. McCarthy, John; Feigenbaum, Ed. "Arthur Samuel: Pioneer in Machine Learning". AI Magazine (3). Association for the Advancement of Artificial Intelligence. p. 10. Retrieved 5 June 2016.

  15. Rosenblatt, Frank (1958). "The perceptron: A probabilistic model for information storage and organization in the brain" (PDF). Psychological Review. 65 (6): 386–408. doi:10.1037/h0042519.

  16. Mason, Harding; Stewart, D; Gill, Brendan (6 December 1958). "Rival". The New Yorker. Retrieved 5 June 2016.

  17. Child, Oliver. "Menace: the Machine Educable Noughts And Crosses Engine Read". Chalkdust Magazine. Retrieved 16 Jan 2018.

  18. Cohen, Harvey. "The Perceptron". Retrieved 5 June 2016.

  19. Colner, Robert. "A brief history of machine learning". SlideShare. Retrieved 5 June 2016.

  20. Linnainmaa, Seppo (1970). "The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors." Master's Thesis (in Finnish), Univ. Helsinki, 6–7.

  21. Linnainmaa, Seppo (1976). "Taylor expansion of the accumulated rounding error". BIT Numerical Mathematics. 16 (2): 146–160. doi:10.1007/BF01931367.

  22. Griewank, Andreas (2012). "Who Invented the Reverse Mode of Differentiation?". Documenta Mathematica, Extra Volume ISMP: 389–400.

  23. Griewank, Andreas and Walther, A. Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008.

  24. Schmidhuber, Jürgen (2015). "Deep learning in neural networks: An overview". Neural Networks. 61: 85–117. arXiv:1404.7828. Bibcode:2014arXiv1404.7828S.

  25. Schmidhuber, Jürgen (2015). Deep Learning. Scholarpedia, 10(11):32832. Section on Backpropagation

  26. Hopfield, John (April 1982). "Neural networks and physical systems with emergent collective computational abilities" (PDF). Proceedings of the National Academy of Sciences of the United States of America. 79: 2554–2558. Bibcode:1982PNAS...79.2554H. doi:10.1073/pnas.79.8.2554. PMC 346238. PMID 6953413. Retrieved 8 June 2016.

  27. Rumelhart, David; Hinton, Geoffrey; Williams, Ronald (9 October 1986). "Learning representations by back-propagating errors" (PDF). Nature. 323: 533–536. Bibcode:1986Natur.323..533R. doi:10.1038/323533a0. Retrieved 5 June 2016.

  28. Watkins, Christopher (1 May 1989). "Learning from Delayed Rewards" (PDF).

  29. Markoff, John (29 August 1990). "BUSINESS TECHNOLOGY; What's the Best Answer? It's Survival of the Fittest". New York Times. Retrieved 8 June 2016.

  30. Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position" (PDF). Biological Cybernetics. 36: 193–202. doi:10.1007/bf00344251. PMID 7370364. Retrieved 5 June 2016.

  31. Le Cun, Yann. "Deep Learning". CiteSeerX 10.1.1.297.6176

  32. Tesauro, Gerald (March 1995). "Temporal Difference Learning and TD-Gammon". Communications of the ACM. 38 (3). doi:10.1145/203330.203343.

  33. Ho, Tin Kam (August 1995). "Random Decision Forests" (PDF). Proceedings of the Third International Conference on Document Analysis and Recognition. Montreal, Quebec: IEEE. 1: 278–282. doi:10.1109/ICDAR.1995.598994. ISBN 0-8186-7128-9. Retrieved 5 June 2016.

  34. Golge, Eren. "BRIEF HISTORY OF MACHINE LEARNING". A Blog From a Human-engineer-being. Retrieved 5 June 2016.

  35. Cortes, Corinna; Vapnik, Vladimir (September 1995). "Support-vector networks". Machine Learning. Kluwer Academic Publishers. 20 (3): 273–297. doi:10.1007/BF00994018. ISSN 0885-6125.

  36. Hochreiter, Sepp; Schmidhuber, Jürgen (1997). "LONG SHORT-TERM MEMORY" (PDF). Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID 9377276. Archived from the original (PDF) on 2015-05-26.

  37. LeCun, Yann; Cortes, Corinna; Burges, Christopher. "THE MNIST DATABASE of handwritten digits". Retrieved 16 June 2016.

  38. Markoff, John (17 February 2011). "Computer Wins on 'Jeopardy!': Trivial, It's Not". New York Times. p. A1. Retrieved 5 June 2016.

  39. Le, Quoc V.; Ranzato, Marc'Aurelio; Monga, Rajat; Devin, Matthieu; Corrado, Greg; Chen, Kai; Dean, Jeffrey; Ng, Andrew Y. (2012). "Building high-level features using large scale unsupervised learning" (PDF). Proceedings of the 29th International Conference on Machine Learning, ICML 2012, Edinburgh, Scotland, UK, June 26 - July 1, 2012. icml.cc / Omnipress. arXiv:1112.6209. Bibcode:2011arXiv1112.6209L.

  40. Markoff, John (26 June 2012). "How Many Computers to Identify a Cat? 16,000". New York Times. p. B1. Retrieved 5 June 2016.

  41. Taigman, Yaniv; Yang, Ming; Ranzato, Marc'Aurelio; Wolf, Lior (24 June 2014). "DeepFace: Closing the Gap to Human-Level Performance in Face Verification". Conference on Computer Vision and Pattern Recognition. Retrieved 8 June 2016.

  42. Canini, Kevin; Chandra, Tushar; Ie, Eugene; McFadden, Jim; Goldman, Ken; Gunter, Mike; Harmsen, Jeremiah; LeFevre, Kristen; Lepikhin, Dmitry; Llinares, Tomas Lloret; Mukherjee, Indraneel; Pereira, Fernando; Redstone, Josh; Shaked, Tal; Singer, Yoram. "Sibyl: A system for large scale supervised machine learning" (PDF). Jack Baskin School of Engineering. UC Santa Cruz. Retrieved 8 June 2016.

  43. Woodie, Alex (17 July 2014). "Inside Sibyl, Google's Massively Parallel Machine Learning Platform". Datanami. Tabor Communications. Retrieved 8 June 2016.

  44. "Google achieves AI 'breakthrough' by beating Go champion". BBC News. BBC. 27 January 2016. Retrieved 5 June 2016.

  45. "AlphaGo". Google DeepMind. Google Inc. Retrieved 5 June 2016.
