In recent years, the rapid rise of artificial intelligence, and especially the human-machine match between AlphaGo and the Korean Go player Lee Sedol, has shown us the great potential of artificial intelligence technology. Data is the carrier, intelligence is the goal, and machine learning is the technical path from data to intelligence. Machine learning is therefore the core of data science and the essence of modern artificial intelligence.
In layman's terms, machine learning is the mining of valuable information from data. Data itself is inert and does not automatically reveal useful information. How, then, do we find what is valuable? The first step is to give the data an abstract representation; the second is to build a model on that representation; the third is to estimate the parameters of the model, that is, computation. To cope with the problems caused by large-scale data, we also need to design efficient means of implementation, at both the hardware level and the algorithm level. Statistics is the main tool and approach for modeling, and solving a model is mostly formulated as an optimization problem. In particular, frequentist methods in practice reduce to optimization problems, while Bayesian models often rely on Monte Carlo random sampling. Machine learning is therefore an interdisciplinary subject spanning computer science and statistics.
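To make the contrast between the two routes concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the model (Gaussian data with unit variance and unknown mean), the wide Gaussian prior, the step sizes, and the function names `fit_mean_by_optimization` and `posterior_mean_by_sampling` are all hypothetical choices, not anything taken from the article. The frequentist route minimizes the negative log-likelihood by gradient descent; the Bayesian route approximates the posterior mean with a random-walk Metropolis sampler, one simple form of Monte Carlo sampling.

```python
import math
import random


def fit_mean_by_optimization(data, lr=0.1, steps=500):
    """Frequentist route: minimize the average negative log-likelihood of a
    unit-variance Gaussian with respect to its mean, by gradient descent."""
    mu = 0.0
    for _ in range(steps):
        # Gradient of (1/n) * sum(0.5 * (x - mu)^2) with respect to mu.
        grad = sum(mu - x for x in data) / len(data)
        mu -= lr * grad
    return mu


def log_posterior(mu, data, prior_std=10.0):
    """Unnormalized log posterior: Gaussian prior on mu (assumed, wide)
    plus unit-variance Gaussian likelihood."""
    log_prior = -0.5 * (mu / prior_std) ** 2
    log_lik = -0.5 * sum((x - mu) ** 2 for x in data)
    return log_prior + log_lik


def posterior_mean_by_sampling(data, n_samples=20000, burn_in=1000,
                               step=0.5, seed=0):
    """Bayesian route: estimate the posterior mean of mu by averaging
    samples from a random-walk Metropolis chain."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    total = 0.0
    for i in range(n_samples + burn_in):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        if i >= burn_in:
            total += mu
    return total / n_samples


if __name__ == "__main__":
    observations = [1.2, 2.5, 1.9, 2.8, 2.1, 1.6, 2.4, 2.0]
    print("optimization estimate:", fit_mean_by_optimization(observations))
    print("sampling estimate:", posterior_mean_by_sampling(observations))
```

With a wide prior and this toy model, the two estimates should land close to the sample mean; the point is only that one route is an optimization loop and the other is a sampling loop.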
Drawing on the three-level theory of vision proposed by David Marr, the founder of computational vision theory, I divide machine learning into three levels: elementary, intermediate, and advanced. The elementary stage is data acquisition and feature extraction. The intermediate stage is data processing and analysis, and it has three aspects. First, it is oriented toward application problems: simply put, it applies existing models and methods to solve practical problems, which we can understand as data mining. Second, driven by the needs of application problems, it develops models, methods, and algorithms and studies the mathematical principles or theoretical foundations that support them; this is the core of the machine learning discipline. Third, it achieves a degree of intelligence through reasoning. The advanced stage is intelligence and cognition, that is, the goal of achieving intelligence. Data mining and machine learning are essentially the same; the difference is that data mining sits closer to the data end, while machine learning sits closer to the intelligence end.
Statistics and computation
Larry Wasserman, a professor of statistics at Carnegie Mellon University who was elected a member of the National Academy of Sciences this year, wrote a book with a very bold title: All of Statistics. The introduction to this book has a very interesting description of statistics and machine learning. Wasserman observes that statistics used to live in statistics departments and computing in computer science departments; the two did not talk to each other and did not recognize each other's value. Computer scientists believed that statistical theory was useless and solved no real problems, while statisticians believed that computer scientists were only "reinventing the wheel" with no new ideas. However, he believes the situation has changed: statisticians now recognize the contributions computer scientists are making, and computer scientists recognize the general significance of statistical theory and methodology. That is why Wasserman wrote this book. It can be described as a computer science book written for statisticians and a statistics book written for computer scientists.
There is now a consensus: using a machine learning method without understanding its basic principles is a dangerous thing to do. It is for this reason that the academic community remains somewhat skeptical about deep learning. Although deep learning has demonstrated powerful capabilities in practical applications, its underlying principles are not yet well understood.
Computer scientists usually have strong computational skills and good intuition for solving problems, while statisticians are good at theoretical analysis and problem modeling, so the two are highly complementary. Boosting, the Support Vector Machine (SVM), ensemble learning, and sparse learning have been the most active directions in machine learning over the past decade or nearly two decades. These achievements are the result of joint efforts by the statistics community and the computer science community. For example, the mathematician Vapnik and colleagues proposed the theory behind support vector machines as early as the 1960s, but it was not until the end of the 1990s that the computer science community devised very efficient algorithms for it. With the excellent open-source implementations that followed, the support vector machine is now a benchmark model for classification algorithms. As another example, Kernel Principal Component Analysis (KPCA) is a nonlinear dimensionality reduction method proposed by computer scientists. It is essentially equivalent to classical Multi-Dimensional Scaling (MDS). The latter has existed in the statistics world for a long time, but without rediscovery by the computer science community, some good ideas might have remained buried.
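The relationship between KPCA and MDS mentioned above can be checked numerically. The sketch below is an illustration under assumptions: the RBF kernel, its `gamma` value, the toy points, and all function names are hypothetical choices made here. KPCA takes eigenvectors of the double-centered kernel matrix. Classical MDS takes eigenvectors of the double-centered matrix of squared distances, scaled by -1/2. If the distances are those induced by the kernel in feature space, d²(i, j) = k(i, i) + k(j, j) - 2k(i, j), the two matrices coincide, so both methods produce the same embedding.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an arbitrary illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, kernel):
    """Kernel (Gram) matrix K with K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def double_center(m):
    """Subtract row and column means and add back the grand mean (J M J)."""
    n = len(m)
    row_means = [sum(row) / n for row in m]
    col_means = [sum(m[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[m[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def kpca_matrix(points, kernel):
    """Matrix whose eigenvectors KPCA uses: the centered kernel matrix."""
    return double_center(gram_matrix(points, kernel))


def mds_matrix(points, kernel):
    """Matrix whose eigenvectors classical MDS uses, built from the
    squared distances that the kernel induces in feature space."""
    k = gram_matrix(points, kernel)
    n = len(k)
    sq_dists = [[k[i][i] + k[j][j] - 2.0 * k[i][j] for j in range(n)]
                for i in range(n)]
    return [[-0.5 * v for v in row] for row in double_center(sq_dists)]


def max_abs_difference(a, b):
    """Largest entrywise gap between two equally sized square matrices."""
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a)))


if __name__ == "__main__":
    toy_points = [(0.0, 0.0), (1.0, 0.5), (2.0, -1.0), (0.5, 2.0), (-1.0, 1.5)]
    gap = max_abs_difference(kpca_matrix(toy_points, rbf_kernel),
                             mds_matrix(toy_points, rbf_kernel))
    print("largest entrywise gap:", gap)
```

The gap should be at the level of floating-point rounding. The sketch stops at the shared matrix and leaves out the eigendecomposition itself, which any linear algebra library can supply.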
The world's two best statistics departments are at the University of California, Berkeley and Stanford University. Berkeley is one of the birthplaces of American statistics and is arguably today's center of statistics and machine learning. Its machine learning professors usually hold formal appointments in both the computer science and statistics departments. The late Professor Leo Breiman was a principal founder of statistical machine learning and a major contributor to many statistical learning methods, such as bagging, classification and regression trees (CART), random forests, and the non-negative garrote sparse model. Breiman was the one who recognized the talent of Professor Michael Jordan and led the effort to bring him from the Massachusetts Institute of Technology to Berkeley. It can be said that Berkeley's statistics department made Jordan, and that he in turn brought new vitality to the development of statistics at Berkeley. He has trained a large number of outstanding scholars in machine learning, an irreplaceable contribution to the field.
One of the main directions of the Stanford University Department of Statistics is statistical learning. For example, the book "The Elements of Statistical Learning" was written by several well-known professors in that department. Artificial intelligence research in Stanford's Department of Computer Science has long been among the world's leaders, especially in areas such as reasoning under uncertainty, probabilistic graphical models, and probabilistic robotics. Their open online courses "Machine Learning", "Probabilistic Graphical Models", and "Artificial Intelligence" have benefited learners around the world. Interestingly, Stanford and Berkeley have an enviable relationship of competition and cooperation. The annual Joint Statistics Day is an exchange platform for the statistics departments of the two universities. Berkeley professor Leo Breiman and Stanford professor Jerome Friedman collaborated to create many important statistical learning models. In addition, the book "Artificial Intelligence: A Modern Approach", co-authored by Stuart Russell and Peter Norvig, who are associated with the two schools, is a comprehensive compendium of artificial intelligence.
Carnegie Mellon University is a unique school; it is not one of the traditional American Ivy League universities. It can be said to be built around computer science, and it was the first school in the world to establish a machine learning department. Professor Tom Mitchell is one of the early founders and guardians of machine learning, and he has taught the "Machine Learning" course to undergraduates. The school's statistics department is also first-rate and is a world research center for Bayesian statistics.
In the field of machine learning, the University of Toronto holds a pivotal position. Its machine learning research group has gathered a number of world-class scholars and has published several groundbreaking papers in Science and Nature, which is rare. Professor Geoffrey Hinton is both a great thinker and a practitioner. He is one of the founders of neural networks and a major contributor to the error backpropagation (BP) algorithm and to deep learning. It is thanks to his unremitting efforts that neural networks have seen their explosive resurgence. Professor Radford Neal was a student of Hinton. He has done a series of important work in Bayesian statistics, especially on Markov chain Monte Carlo (MCMC) methods, has released many open-source Bayesian statistical software packages, and has devoted effort to optimizing the R language.